AI is writing more code than ever. Anthropic’s own engineers saw their code output jump roughly 200% year over year. But here’s the catch: someone still has to review all that code. Human reviewers can’t scale at the same rate, and the result is a growing gap between code produced and code properly checked.
That’s the problem Anthropic is trying to solve with Code Review, a new multi-agent system built directly into Claude Code. It shipped on March 9, 2026, and it’s already generating serious discussion across the developer community.
The Problem No One Wants to Talk About
AI coding tools have gotten remarkably good at generating code. GitHub reports that Copilot users accept roughly 30% of suggestions. Claude Code, Cursor, and other tools are helping developers ship faster than ever.
But faster output creates a new bottleneck: review. When your team’s code volume doubles, your review capacity doesn’t magically double with it. PRs pile up. Reviews get superficial. Bugs slip through.
Anthropic experienced this firsthand. Before launching Code Review internally, only about 16% of their pull requests received what they’d call “substantive” feedback — comments that actually caught meaningful issues. The rest got rubber-stamped or lightly skimmed.
How Anthropic’s Multi-Agent System Actually Works
Code Review doesn’t work like a typical linter or static analysis tool. When a developer opens a pull request, the system dispatches multiple AI agents that run in parallel. Each agent independently searches for different types of errors across the codebase.
Here’s what makes it different from simpler approaches:
- Parallel analysis: Multiple agents examine the PR simultaneously, each looking for different categories of issues
- Cross-verification: After individual analysis, agents compare findings and cross-check each other’s conclusions to filter out false positives
- Codebase-aware: The agents don’t just look at the diff — they consider the entire codebase to catch cases where a change in one file breaks something in another
- Severity ranking: Remaining issues get sorted by severity, so developers see the most critical problems first
- Adaptive depth: Simple PRs get a lightweight pass; complex ones engage more agents for a deeper review
The output shows up as a single overview comment on the PR plus inline annotations for specific bugs. If it finds issues, it also suggests fixes that Claude Code can implement on request.
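In rough code terms, that flow looks something like the sketch below. Everything in it (the agent categories, the `Finding` fields, the severity scale, the consensus rule) is an illustrative assumption for the sake of the example, not Anthropic's actual implementation:

```python
# Illustrative sketch of a parallel, cross-verifying review pipeline.
# Agent categories, the Finding shape, and the consensus rule are all
# assumptions -- not Anthropic's real internals.
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass

@dataclass(frozen=True)
class Finding:
    file: str
    line: int
    issue: str
    severity: int  # 1 = nitpick ... 5 = critical (assumed scale)

def run_agent(category: str, pr_diff: str, codebase: dict) -> list[Finding]:
    """Each agent independently scans the PR (and the wider codebase)
    for one category of issue. Stubbed with canned findings here."""
    checks = {
        "logic": [Finding("api.py", 42, "off-by-one in pagination", 4)],
        "cross-file": [Finding("models.py", 7, "renamed field still read in jobs.py", 5)],
        "security": [],
    }
    return checks.get(category, [])

def review(pr_diff: str, codebase: dict, complex_pr: bool) -> list[Finding]:
    # Adaptive depth: simple PRs engage fewer agents.
    categories = ["logic", "cross-file", "security"] if complex_pr else ["logic"]
    with ThreadPoolExecutor() as pool:  # parallel analysis
        results = pool.map(lambda c: run_agent(c, pr_diff, codebase), categories)
    findings = [f for agent_findings in results for f in agent_findings]
    # Stand-in for cross-verification: discard findings too weak for a
    # second agent to corroborate, then rank what survives by severity.
    verified = [f for f in findings if f.severity >= 3]
    return sorted(verified, key=lambda f: f.severity, reverse=True)
```

Running `review` on a complex PR returns the cross-file break ranked above the off-by-one, which mirrors the severity-first ordering described above.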
According to Anthropic’s Wu, the team deliberately focused on logical errors rather than style nitpicks. The reasoning: what developers actually want from an AI review are the logic errors. Nobody needs an AI to flag missing semicolons; that’s what linters are for.
The Numbers Behind the Launch
The internal results at Anthropic tell a clear story:
- Before Code Review: 16% of PRs received substantive feedback
- After Code Review: 54% of PRs received substantive feedback
- Average review time: ~20 minutes per PR
- Cost per review: $15–$25, depending on complexity
That 16% to 54% jump is significant. It means more than three times as many pull requests are getting meaningful review comments. Anthropic says their developers have come to expect Code Review comments on their PRs — and “get a little nervous” when they don’t see them.
The feature is launching as a research preview for Claude for Teams and Claude for Enterprise customers. Pricing is token-based, so simpler PRs cost less and complex ones cost more.
How It Stacks Up Against GitHub Copilot Code Review
GitHub Copilot’s code review feature hit general availability in April 2025 and reached 1 million users within a month. It’s fast, widely adopted, and integrated directly into GitHub’s UI. But it has a fundamental limitation: it’s diff-based.
That means Copilot reviews the changes in isolation. It catches typos, null checks, and simple logic errors effectively. But it misses architectural problems and cross-file dependencies because it doesn’t have the broader codebase context.
Anthropic’s approach trades speed for depth. A 20-minute review is significantly slower than Copilot’s near-instant feedback. But the multi-agent, codebase-aware approach catches categories of bugs that diff-based tools simply can’t see.
Other players in this space include:
- Graphite Agent: Shopify reported a 33% increase in PRs merged per developer after adoption, and engineers at Asana report saving about 7 hours a week
- CodeRabbit: The only option that works across GitHub, GitLab, Bitbucket, and Azure DevOps
- Greptile: Indexes your entire codebase upfront for deeper analysis
The $15–$25 per review price point is notably higher than alternatives. For a team pushing 50 PRs a day, that’s $750–$1,250 daily. Whether the depth justifies the cost depends entirely on what kind of code you’re shipping and what bugs cost you in production.
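Since the pricing scales linearly with PR volume, the back-of-envelope math above generalizes to a quick estimator (the $15–$25 range is the one quoted earlier; team sizes and workdays are your own inputs):

```python
# Rough cost estimator for token-based, per-review pricing.
def monthly_review_cost(prs_per_day: int, workdays: int = 22,
                        low: float = 15.0, high: float = 25.0) -> tuple[float, float]:
    """Return the (low, high) monthly cost band for a given PR volume,
    using the quoted $15-$25 per-review range by default."""
    reviews = prs_per_day * workdays
    return (reviews * low, reviews * high)

# 50 PRs/day over a 22-workday month lands between $16,500 and $27,500.
```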
Who Should Care About This
Code Review makes the most sense for teams where:
- AI-generated code is a large percentage of output — if your developers are using Claude Code, Copilot, or Cursor heavily, the volume of code that needs review is growing faster than your team
- Bugs in production are expensive — for fintech, healthcare, infrastructure, and enterprise SaaS, a bug that slips through review can cost orders of magnitude more than $25
- You’re already in the Anthropic ecosystem — if your team uses Claude for Teams or Enterprise, Code Review slots in without additional vendor management
It’s less compelling for small teams with low PR volume, open-source projects with tight budgets, or teams that primarily write code manually and have solid existing review processes.
If you’re interested in how Anthropic has been building out its developer tooling ecosystem, check out related coverage on Claude Code’s security features, Claude Code Remote Control, and the Anthropic vs Pentagon standoff that’s been happening in parallel with this launch.
FAQ
How much does Anthropic Code Review cost?
Pricing is token-based and varies by PR complexity. Anthropic estimates $15–$25 per review on average. It’s available to Claude for Teams and Claude for Enterprise customers during the research preview.
Can I use Code Review with GitLab or Bitbucket?
Currently, Code Review integrates with GitHub. There’s no announced support for GitLab, Bitbucket, or Azure DevOps yet. If you need multi-platform support, CodeRabbit is the main alternative that covers all four.
How does Anthropic Code Review compare to GitHub Copilot code review?
Copilot is faster (near-instant) and cheaper, but it’s diff-based and misses cross-file issues. Anthropic’s system takes ~20 minutes but analyzes the full codebase context using multiple agents. The trade-off is speed and cost vs. depth and accuracy.
Does Code Review work on code not written by AI?
Yes. While it’s positioned as a solution for the surge in AI-generated code, it reviews any code in a pull request regardless of how it was written. The multi-agent system looks for logical errors, not AI-specific patterns.
Will Code Review replace human code reviewers?
Anthropic isn’t positioning it as a replacement. The 54% substantive feedback rate means nearly half of PRs still don’t trigger meaningful comments. It’s designed to augment human review by catching issues that might be missed under time pressure, not to eliminate the need for human judgment on architecture, design patterns, and business logic.