Grill Me Is the Code Review You Need but Won't Ask Your Teammates For
Most code review is polite. Grill Me is not. It finds every assumption you made and every edge case you didn't handle — and the reason it works is there's no relationship to protect.
I've been the reviewer who was too polite.
The code wasn't great. There were assumptions baked into the data model that I knew were going to cause problems. But the engineer who wrote it had been working late, and the PR was already overdue, and I knew they were stressed about the deadline. I left some comments about minor things and approved it.
The problems I noticed showed up in production three months later. The fix took a week.
This is not an unusual story. Most code review is shaped by relationship management as much as technical evaluation. We soften feedback. We pick battles. We decide not to be the person who blocks every PR with a wall of comments. The result is that the review process, which exists to catch problems, is systematically optimized away from catching problems.
The Architecture of Polite Review
Politeness in code review isn't a character flaw. It's rational behavior given the social constraints of working with people you'll see again tomorrow.
The reviewer who blocks every PR is a difficult teammate. The reviewer who leaves harsh feedback creates defensiveness, which makes the author less likely to act on that feedback and damages working relationships. These are real costs. Most experienced engineers navigate them implicitly, moderating their review feedback to preserve the collaboration.
The cost of this moderation is that the review surface area shrinks. Reviewers stop flagging things they're not confident enough about to fight for. They phrase real concerns as optional suggestions that are easy to wave off, which is the right disposition for maintaining relationships and the wrong disposition for finding bugs.
Grill Me from mattpocock/skills — 49K installs — removes that constraint entirely.
What "No Relationship to Protect" Changes
The skill is explicitly adversarial. It's designed to find every assumption in your code, every edge case you didn't handle, every decision you didn't defend. It asks the questions a polite reviewer wouldn't ask: "Why does this function assume the input will never be null?" "What happens when this API call times out?" "Have you tested this with the empty array case?"
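To make those three questions concrete, here is a minimal TypeScript sketch of the code that survives them. Everything in it is hypothetical: the `Order` type, `averageOrderTotal`, and `withTimeout` are names I made up for illustration. None of it comes from Grill Me or reflects its actual output.

```typescript
// Hypothetical example: these names are illustrative, not part of Grill Me.
type Order = { id: string; total: number };

// "Why does this function assume the input will never be null?"
// "Have you tested this with the empty array case?"
function averageOrderTotal(orders: Order[] | null | undefined): number | null {
  if (orders == null || orders.length === 0) {
    // Without this guard, an empty array produces 0 / 0, which is NaN.
    return null;
  }
  const sum = orders.reduce((acc, order) => acc + order.total, 0);
  return sum / orders.length;
}

// "What happens when this API call times out?"
// Rejects if `work` has not settled within `ms` milliseconds.
// Note: this only stops waiting; it does not cancel the underlying work.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`Timed out after ${ms} ms`)),
      ms,
    );
    work.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (error) => {
        clearTimeout(timer);
        reject(error);
      },
    );
  });
}
```

The version I would have written first is one line of `reduce` and a bare `await`. Each guard above exists only because someone asked the uncomfortable question.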
The reason this works is structural. Grill Me has no working relationship with you to protect. It doesn't know your deadline pressures or how late you stayed up last night. It doesn't have to see you at standup tomorrow. The constraints that make human reviewers moderate their feedback don't apply.
The 49K install count is interesting for that reason. These probably aren't engineers who couldn't get code review. More likely, they're engineers who wanted a different kind of code review: one specifically optimized for finding problems rather than for maintaining working relationships.
The Use Case That Makes Sense
I think there are two situations where Grill Me is most valuable.
The first is self-review before the PR. You've written code that you think is good. You want to know what a hostile reviewer would say before your actual reviewers see it. Grill Me surfaces the problems you rationalized away — the "it's probably fine" assumptions, the "I'll handle that edge case later" decisions that are about to become someone else's problem.
The second is the code you're most confident about. That's counterintuitive. The code you're least confident about, you already know is risky. The dangerous code is the code you're sure about — the elegant solution that has one hidden assumption about input shape, the refactored function that breaks a downstream consumer you forgot about. Adversarial review is most useful on the code that feels solid.
The Feedback Is Different in Kind
What Grill Me produces is structurally different from normal review feedback.
Normal code review surfaces the things your reviewer happened to notice, filtered through their judgment about what's worth flagging. The output is shaped by the reviewer's expertise, their context, and their social calibration.
Adversarial review asks systematically: what could go wrong here? What does this code assume that could turn out to be wrong? Where is the contract between this function and its callers implicit rather than enforced? The questions aren't mediated by what it's comfortable to say.
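The difference between an implicit and an enforced contract is easier to see in code. This is a sketch under my own assumptions: the `User` shape and both parse functions are hypothetical, chosen only to illustrate the distinction.

```typescript
// Hypothetical example: the User shape and function names are illustrative.
type User = { id: number; email: string };

// Implicit contract: the cast claims a shape that nothing ever checks.
// A malformed payload flows through silently and fails somewhere downstream.
function parseUserImplicit(json: string): User {
  return JSON.parse(json) as User;
}

// Type guard that checks the shape at runtime.
function isUser(value: unknown): value is User {
  if (typeof value !== "object" || value === null) {
    return false;
  }
  const candidate = value as Record<string, unknown>;
  return (
    typeof candidate["id"] === "number" &&
    typeof candidate["email"] === "string"
  );
}

// Enforced contract: a malformed payload fails here, at the boundary.
function parseUserEnforced(json: string): User {
  const parsed: unknown = JSON.parse(json);
  if (!isUser(parsed)) {
    throw new Error("Payload does not match the User shape");
  }
  return parsed;
}
```

Both functions have the same signature, and both pass the type checker. Only the second one holds up when a caller sends `{"id": "1"}`.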
The result is feedback that's often harder to receive and more useful to implement. Not because adversarial review is smarter than human review, but because it operates without the dampening effect of the relationship.
I still think human code review is valuable for things Grill Me can't do: catching architectural problems, maintaining codebase conventions, transferring context between engineers. But for finding the problems in code you've already convinced yourself is good, the 49K installs suggest that plenty of developers have found an adversarial reviewer more useful than a polite one.
Part of the Matt Pocock TypeScript Skills — adversarial code review that asks the questions polite reviewers don't.