Claude Code /ultrareview: Cloud Code Review Goes Public Preview
Claude Code's /ultrareview command opened as a public research preview in Week 17 (April 2026). A fleet of bug-hunting agents runs in the cloud, and the findings land back in your CLI. Here's how to use it well.
For most of 2026 you could review your branch with Claude Code by reading the diff into a session and asking. It worked. But it ran one model with one perspective, and it had to share context with whatever else you were doing. In Week 17, Anthropic shipped /ultrareview as a public research preview — a fleet of cloud-side agents that review your branch in parallel and land their findings back in your CLI.
This piece is about how to use /ultrareview well, not just how to invoke it.
Key Takeaways
- /ultrareview runs in the cloud, in parallel, with multiple specialized review agents.
- Public research preview shipped in Week 17 (April 20–24, 2026), v2.1.114–v2.1.119.
- Each agent has a different lens: silent failures, type design, comment accuracy, test coverage, security, performance, etc.
- Findings land back in your CLI or Desktop automatically — you do not have to babysit it.
- Week 18 added a CI-friendly entry point: claude ultrareview runs from scripts and CI without an interactive session.
- A no-argument run reviews the current branch; pass a PR number to review an existing GitHub PR.
- Billing is metered (the cloud agents consume real reasoning); plan for it in your budget.
- It does not replace a human reviewer for design and intent, but it catches a lot of what humans miss on a diff scan.
How /ultrareview Differs From "Just Ask Claude"
A single Claude session reviewing your diff is one model with one prompt looking at one thing. /ultrareview is a fleet:
- A "silent failure hunter" looks specifically for empty catch blocks, inappropriate fallbacks, and error suppression.
- A "test coverage analyzer" looks at the diff and asks whether the tests cover the new branches and edge cases.
- A "type design analyzer" assesses encapsulation, invariants, and enforcement of new types.
- A "comment analyzer" verifies that new comments accurately describe the new code.
- A "code reviewer" looks at the overall quality, conventions, and security.
Each runs in parallel. Each has its own context budget. The combined output is much harder to produce in a single sequential session, because the specialized prompts, the diff, and the accumulating findings cannot all fit in one context window.
The fleet approach also bypasses the single-agent failure mode where one early observation biases the rest of the review.
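For concreteness, here is the pattern the silent-failure hunter is tuned for, shown in a hypothetical JWT middleware. The jsonwebtoken call is real; the payload type and secret handling are illustrative:

```typescript
import jwt from "jsonwebtoken";

// Illustrative types; not from a real codebase.
interface TokenPayload { sub: string; exp: number }
const SECRET = process.env.JWT_SECRET ?? "";

// The flagged pattern: the catch swallows the error, so an expired token,
// a bad signature, and malformed input all collapse into the same null
// that callers read as "no token supplied".
function decodeToken(raw: string): TokenPayload | null {
  try {
    return jwt.verify(raw, SECRET) as TokenPayload;
  } catch {
    return null;
  }
}
```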
Invoking It
Three entry points:
# Interactive: review current branch
/ultrareview
# Interactive: review a specific GitHub PR
/ultrareview 1234
# Non-interactive: from CI or a script (added in Week 18)
claude ultrareview --pr 1234 --output json
The interactive form streams findings into your session as agents finish. The CI form returns structured JSON you can pipe into your pipeline or post as a PR comment.
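The exact JSON schema is not documented in the digests this post draws on. As a working assumption, a pipeline script would consume a report shaped roughly like this; every field name here is illustrative, not a documented contract:

```typescript
// Hypothetical shape of the `claude ultrareview --output json` report.
// Field names are assumptions for illustration, not a documented schema.
interface UltrareviewFinding {
  agent: string;                           // e.g. "silent-failure-hunter"
  file: string;                            // e.g. "auth/middleware.ts"
  line: number;
  confidence: "high" | "medium" | "low";
  message: string;
  recommendation?: string;
}

interface UltrareviewReport {
  branch: string;
  findings: UltrareviewFinding[];
}
```

The later CI sketches in this post assume this shape; adjust them once you have a real report in hand.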
For local-only reviews (no GitHub remote), /ultrareview bundles the current branch as a diff against the merge base (the same range a three-dot git diff main...HEAD covers, substituting your default branch) and ships that to the cloud.
Reading the Output Well
A typical interactive run produces output like:
🔍 /ultrareview started — 6 agents reviewing branch feature/auth-refactor
[silent-failure-hunter] Found 2 high-confidence issues
- auth/middleware.ts:42 — catch block silently returns null on JWT decode failure
- auth/session.ts:88 — fallback to anonymous user when DB is unreachable
[test-coverage-analyzer] 1 critical gap
- No test exercises the "expired refresh token" path in auth/session.ts:130
[type-design-analyzer] 1 medium-priority concern
- UserSession type allows tokens.refresh to be null and expiresAt to be 0 simultaneously
- Recommendation: encode the invariant or split into discriminated union
[code-reviewer] No blocking issues
[comment-analyzer] 1 stale comment
[security-reviewer] No blocking issues
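The type-design finding above is the kind that maps directly to a code change. A minimal sketch of the recommended discriminated union, assuming a UserSession roughly like the one the agent describes (the types are illustrative):

```typescript
// Before (illustrative): the type admits impossible states, such as a
// null refresh token alongside expiresAt === 0.
interface LooseUserSession {
  tokens: { access: string; refresh: string | null };
  expiresAt: number;
}

// After: a discriminated union encodes the invariant, so the compiler
// rejects the inconsistent combination instead of a reviewer catching it.
type UserSession =
  | { kind: "authenticated"; tokens: { access: string; refresh: string }; expiresAt: number }
  | { kind: "anonymous" };

function isExpired(session: UserSession, now: number): boolean {
  // The anonymous case has no expiry to check; the compiler makes that explicit.
  return session.kind === "authenticated" && session.expiresAt <= now;
}
```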
Three habits make this output more useful:
- Prioritize "high-confidence" findings. Each agent rates its confidence. Low-confidence findings are often noise on first review.
- Triage by lens, not by file. Walk the silent-failure findings first across all files; then type design; then tests. This is more efficient than going file-by-file (a grouping sketch follows this list).
- Push back when you disagree. /ultrareview followup opens a session where you can challenge specific findings. The agent that produced the finding responds with reasoning.
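Lens-first triage is easy to script against the assumed report shape from earlier. This sketch groups findings by the agent that produced them so you can walk one lens at a time (field names remain assumptions):

```typescript
import { readFileSync } from "node:fs";

// Group findings by the agent (lens) that produced them, so review
// proceeds lens-by-lens rather than file-by-file.
const report = JSON.parse(readFileSync("review.json", "utf8"));
const byLens = new Map<string, any[]>();
for (const f of report.findings) {
  const bucket = byLens.get(f.agent) ?? [];
  bucket.push(f);
  byLens.set(f.agent, bucket);
}

for (const [lens, findings] of byLens) {
  console.log(`\n## ${lens} (${findings.length})`);
  for (const f of findings) console.log(`  ${f.file}:${f.line}  ${f.message}`);
}
```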
CI Integration
Week 18 made /ultrareview available outside the interactive CLI. A simple GitHub Action:
name: ultrareview
on:
pull_request:
types: [opened, synchronize]
jobs:
review:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Run ultrareview
env:
ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
run: |
claude ultrareview --pr ${{ github.event.pull_request.number }} \
--output json > review.json
- name: Post findings
run: |
./scripts/post-review-comment.sh review.json
This pattern works well when paired with the auto-fix PR command (/autofix-pr) introduced in Week 15 — /ultrareview finds the issues, /autofix-pr opens a follow-up PR proposing fixes for the ones it can address.
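The Action leaves post-review-comment.sh to you. A minimal sketch of what it might wrap, assuming the report shape from earlier and the standard GitHub REST endpoint for PR comments (PR_NUMBER is a hypothetical env var you would pass in; Node 18+ with built-in fetch, run as an ES module):

```typescript
import { readFileSync } from "node:fs";

// Summarize high-confidence findings and post them as a single PR comment.
// The report shape is the assumed one from earlier; adjust to the real schema.
const report = JSON.parse(readFileSync("review.json", "utf8"));
const high = report.findings.filter((f: any) => f.confidence === "high");

const body =
  high.length === 0
    ? "ultrareview: no high-confidence findings."
    : "### ultrareview findings\n" +
      high.map((f: any) => `- **${f.agent}** \`${f.file}:${f.line}\` ${f.message}`).join("\n");

// PR comments are issue comments in the GitHub REST API.
const [owner, repo] = process.env.GITHUB_REPOSITORY!.split("/");
await fetch(
  `https://api.github.com/repos/${owner}/${repo}/issues/${process.env.PR_NUMBER}/comments`,
  {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.GITHUB_TOKEN}`,
      Accept: "application/vnd.github+json",
    },
    body: JSON.stringify({ body }),
  },
);
```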
When To Use It (And When Not To)
/ultrareview earns its cost when:
- The branch is non-trivial (>50 lines of meaningful change).
- You are about to merge to a protected branch.
- The change touches security-sensitive code paths.
- You want a check before a human reviewer's time is spent.
It is overkill when:
- The change is a trivial typo, copy edit, or config bump.
- You are iterating quickly during exploratory work and not ready for review.
- You are on a free tier and the metering matters.
It does not replace human review for design intent, alignment with project direction, or product judgment. It catches mechanical, structural, and stylistic issues — the parts of review that humans get tired of doing.
Comparison to Local Review
A single Claude Code session at xhigh effort can produce a good review. The trade-offs vs /ultrareview:
| Property | Local Claude session | /ultrareview (cloud) |
|---|---|---|
| Latency | Seconds | Minutes (parallel) |
| Cost | Counts against current session | Metered separately |
| Specialization | One generalist agent | Fleet of specialists |
| Context budget | Limited by session | Each agent gets its own |
| Best for | Quick sanity check | Pre-merge thorough review |
We tend to do both: local Claude review during iteration; /ultrareview before merge.
Comparison to OpenClaw and Hermes
A natural question: does /ultrareview make Hermes or OpenClaw review skills redundant?
No, and the reason is architectural:
- /ultrareview is a managed cloud service. It is the right call when you want maintained, opinionated review without infrastructure.
- A self-hosted Hermes or OpenClaw setup lets you build review pipelines that fit your stack: your security rules, your codebase conventions, your post-review actions.
For most teams in 2026, /ultrareview is the default and Hermes/OpenClaw are how you customize. See OpenClaw vs Hermes vs Claude Code for the broader picture.
Practical Tips
- Run /ultrareview early, not just at the end. Catching a silent failure halfway through a feature is cheaper than catching it at PR review.
- Tag findings into your tracker. A scripted post-review step that opens issues for high-confidence findings keeps the loop closed; see the sketch after this list.
- Tune the "noise floor." Configure which lenses are mandatory and which are advisory for your team's PR template.
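For the tracker step, here is a compact sketch that opens one GitHub issue per high-confidence finding via the standard REST endpoint. The report shape and the label name are assumptions; Node 18+ ESM as before:

```typescript
import { readFileSync } from "node:fs";

const report = JSON.parse(readFileSync("review.json", "utf8"));
const [owner, repo] = process.env.GITHUB_REPOSITORY!.split("/");

for (const f of report.findings.filter((f: any) => f.confidence === "high")) {
  // POST /repos/{owner}/{repo}/issues is the standard GitHub REST endpoint.
  await fetch(`https://api.github.com/repos/${owner}/${repo}/issues`, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.GITHUB_TOKEN}`,
      Accept: "application/vnd.github+json",
    },
    body: JSON.stringify({
      title: `[ultrareview] ${f.file}:${f.line} ${f.message}`.slice(0, 120),
      body: `Found by ${f.agent}.\n\n${f.recommendation ?? ""}`,
      labels: ["ultrareview"], // assumed label; create it in the repo first
    }),
  });
}
```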
Bringing It Back Together
/ultrareview is the cloud-side counterpart to everything Claude Code has been building locally. Single-agent review is fast and useful; multi-agent review is thorough and slower. Use both, at the moments where each pays off.
For the broader release context, see Claude Code Plugins from .zip and URLs and Claude Opus 4.7 in Claude Code.
Sources
- Claude Code "What's New" — https://code.claude.com/docs/en/whats-new
- Claude Code Week 17 digest — https://code.claude.com/docs/en/whats-new/2026-w17
- Claude Code Week 18 digest — https://code.claude.com/docs/en/whats-new/2026-w18
- Related: Claude Opus 4.7 in Claude Code
- Related: Claude Code Plugins from .zip and URLs
- Related: OpenClaw vs Hermes vs Claude Code