Nicholas Carlini wrote a simple script. He pointed Claude Code at the Linux kernel source, one file at a time, with a prompt that basically said “find vulnerabilities, treat this like a CTF challenge.” No fancy tooling, no custom pipeline, no months of fine-tuning. Just a loop, an LLM, and the entire Linux kernel.
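Carlini's actual script isn't reproduced in the talk, but the shape he describes is easy to picture: walk the tree one source file at a time and hand each file to Claude Code's non-interactive print mode with a CTF-style prompt. A minimal sketch, assuming the `claude -p` CLI flag and with the prompt wording invented for illustration:

```python
# Sketch of the one-file-at-a-time audit loop the article describes.
# The prompt text and target directory are assumptions, not Carlini's actual script.
import subprocess
from pathlib import Path

PROMPT = (
    "You are playing a CTF. Audit the following Linux kernel source file "
    "and report any memory-safety vulnerabilities you find:\n\n{source}"
)

def build_prompt(path: Path) -> str:
    """Embed one file's contents in the CTF-style prompt."""
    return PROMPT.format(source=path.read_text(errors="replace"))

def audit_tree(root: Path):
    """Yield (path, model report) for every C file under root."""
    for path in sorted(root.rglob("*.c")):
        # `claude -p` runs a single prompt non-interactively and prints the reply.
        result = subprocess.run(
            ["claude", "-p", build_prompt(path)],
            capture_output=True, text=True,
        )
        yield path, result.stdout

if __name__ == "__main__":
    for path, report in audit_tree(Path("linux/fs/nfsd")):
        print(f"=== {path} ===\n{report}")
```

That really is the whole pipeline: no ranking, no triage stage, just a loop and a prompt.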
What came back was something Carlini, a research scientist at Anthropic, says he had never managed in his career: multiple remotely exploitable heap buffer overflows in the Linux kernel. Five vulnerabilities in total, each either fixed directly or reported to kernel maintainers. One of them had been sitting in the codebase since March 2003: before Git existed, before the first iPhone, before Facebook.
Carlini presented these findings at the [un]prompted AI security conference in San Francisco. The talk hit Hacker News on April 4 with 328 points and 205 comments, and for good reason. This isn’t another “AI helps with code review” story. This is an AI finding kernel-level security bugs that survived 23 years of expert human review, fuzzing tools, and static analysis.
1,056 Bytes Into a 112-Byte Buffer
The flagship vulnerability is in Linux’s NFSv4.0 LOCK replay cache. Here’s what makes it interesting.
When an NFS client requests a file lock, the server stores the response in a 112-byte buffer for replay purposes. Normal operations fit fine. But there’s an edge case involving two cooperating clients.
Client A acquires a lock using a 1,024-byte owner ID — perfectly valid under the NFS spec. Client B then requests the same lock. The server needs to deny Client B and include Client A’s owner ID in the denial message. That denial message balloons to 1,056 bytes. The buffer is 112 bytes.
That’s a heap buffer overflow. An attacker with two cooperating NFS clients can overwrite kernel memory over the network. No authentication bypass needed, no privilege escalation chain — just a protocol-level mismatch between what the spec allows and what the code allocated.
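The arithmetic behind that 1,056-byte figure falls out of the NFSv4.0 wire format. In XDR encoding (per RFC 7530), the lock-denial response carries 32 bytes of fixed fields plus the variable-length owner ID, which the spec caps at 1,024 bytes. A small sketch of that size calculation (the 112-byte replay buffer is the figure reported in the talk):

```python
def xdr_opaque_size(n: int) -> int:
    # XDR variable-length opaque: 4-byte length prefix + data padded to a
    # 4-byte boundary.
    return 4 + ((n + 3) // 4) * 4

def lock_denied_size(owner_len: int) -> int:
    # Field layout per RFC 7530's LOCK4denied: offset (8) + length (8)
    # + locktype enum (4) + lock_owner4 { clientid (8) + opaque owner }.
    return 8 + 8 + 4 + 8 + xdr_opaque_size(owner_len)

REPLAY_CACHE_BYTES = 112  # fixed replay buffer size reported in the talk

print(lock_denied_size(16))    # typical owner ID: fits easily
print(lock_denied_size(1024))  # spec-maximum owner ID: 1056 bytes, overflow
```

A 16-byte owner ID yields a 48-byte response, which is why normal traffic never trips the bug; only the spec-maximum owner ID pushes the response to 1,056 bytes against a 112-byte buffer.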
The reason this bug survived so long is revealing. It’s not a simple coding mistake a linter could catch. It’s a semantic gap between the NFS protocol specification and the implementation. You need to understand the protocol well enough to realize that the owner ID field can be up to 1,024 bytes, then trace that through the code to find where a fixed-size buffer was assumed to be sufficient. Traditional fuzzers couldn’t catch it because they don’t understand protocol semantics. Static analyzers couldn’t catch it because the overflow only happens under a specific multi-client interaction pattern.
Claude Code understood both the protocol and the code. It generated a complete vulnerability report with ASCII protocol diagrams showing exactly how the overflow happens. Carlini’s own words: “I have never found one of these in my life before. This is very, very, very hard to do.”
Five Bugs Across Four Subsystems
The NFS bug wasn’t a lucky one-off. The remaining four vulnerabilities include an out-of-bounds read in io_uring and two separate bugs in ksmbd, the in-kernel SMB server implementation. These are different subsystems, different codebases, different vulnerability classes. That breadth matters.
If Claude Code had found five NFS bugs, you could argue it stumbled onto a poorly written module. Finding vulnerabilities across multiple kernel subsystems suggests something more fundamental: LLMs can do, at scale, the kind of cross-referencing between specifications and implementations that humans find tedious and error-prone.
Antirez, the creator of Redis, made this point in the Hacker News discussion. LLMs work through what he called “recursive generation of hypotheticals interprocedurally” — they follow a variable through multiple function calls, consider what values it could take, and check whether the receiving code handles all possible cases. That’s not magic. That’s exactly what a human security researcher does, just at a speed and patience level humans can’t sustain.
Security researcher Thomas Ptacek added another angle: almost all vulnerabilities are either direct applications of known patterns, incremental extensions of those patterns, or chains of multiple pattern applications. LLMs are trained on millions of examples of exactly these patterns. The surprise isn’t that they can find bugs — it’s that we waited this long to point them at production code in a systematic way.
500 Zero-Days, 22 Firefox Bugs, and a Quality Shift Nobody Can Explain
Carlini’s Linux kernel work is part of something much bigger. Anthropic has disclosed that Claude Opus 4.6, the model powering Claude Code, has found over 500 previously unknown zero-day vulnerabilities across open-source codebases. Every single one was validated by either an Anthropic team member or an outside security researcher.
The most controlled experiment was with Mozilla. Anthropic partnered with Mozilla’s security team and pointed Claude at Firefox’s codebase. In two weeks, it found 22 vulnerabilities — more than were reported in any single month throughout 2025. These weren’t theoretical issues or style complaints. They were real, exploitable bugs in one of the most heavily audited browsers in the world.
The economics deserve attention. One Hacker News commenter estimated that finding a complex privilege escalation bug costs about $750 in compute. A human security consultant doing the same work charges tens of thousands and takes weeks. Another commenter put it bluntly: tokens are “insanely cheap” compared to human security expertise.
But the raw numbers aren’t the real story. Greg Kroah-Hartman, the Linux kernel’s most prolific maintainer, told The Register in late March that AI-generated bug reports underwent a sudden quality shift. His words: they went “from junk to legit overnight.” He couldn’t explain it. “We don’t know. Nobody seems to know why.” The timing, though, lines up with the release of Claude Opus 4.6 and similar capability jumps in other frontier models.
Google’s Sashiko project has been working on a related problem — an agentic AI system that reviews Linux kernel patches and caught 53.6% of bugs that passed human review. But Sashiko operates on individual patches. Claude Code’s approach works at a different scale entirely, auditing complete codebases and cross-referencing protocol specifications against implementation code. Same direction, different altitude.
The Uncomfortable Dual-Use Question
The Hacker News discussion surfaced a tension that anyone in security has been circling. If Claude Code can find these bugs, so can anyone running the same model. The defender/attacker asymmetry has always been brutal — defenders need to find every bug, attackers only need one. AI tips that balance further toward attackers.
Carlini’s presentation was explicitly framed as a warning. He demonstrated a live zero-day discovery in Ghost CMS — 50,000 GitHub stars, no prior critical vulnerability in its history — in about 90 minutes. Then the Linux kernel findings. The message was clear: LLMs have crossed a threshold where they can autonomously discover and exploit zero-day vulnerabilities in major, heavily audited software.
Some in the discussion were skeptical about false positives. One commenter claimed Claude produces thousands of garbage reports requiring months of cleanup. Michael Lynch, who wrote the detailed technical breakdown, pushed back: the false positive rate with Opus 4.6 is well below 20%. A security researcher sifting through five reports where four are real and one is noise will happily take that ratio.
The more interesting question is what this means for the thousands of critical open-source libraries maintained by one or two people who can barely keep up with feature PRs, let alone security audits. The Linux kernel has resources. Firefox has Mozilla. Most open-source software has nothing. A tool that costs $750 to audit an entire codebase could change that — if someone organizes the effort.
Why Opus 4.6, Not Opus 4.1
The capability jump is real and worth understanding. Carlini specifically tested older models — Claude Opus 4.1 and Sonnet 4.5 — against the same kernel code. They performed significantly worse. The improvement from Opus 4.1 to Opus 4.6 wasn’t incremental. Something about the newer model made it substantially better at the specific reasoning security auditing requires: understanding specifications, tracing data flow across function boundaries, and recognizing when valid inputs create invalid states.
This tracks with what developers have been seeing elsewhere. Claude Code has been steadily evolving from a coding assistant into something closer to an autonomous engineering toolkit. The same reasoning capabilities that let it plan multi-step code changes across large repositories are what make it effective at security auditing — following complex logic chains spanning thousands of lines.
Carlini’s approach was deliberately low-effort. No custom training data, no specialized fine-tuning, no vulnerability-specific prompting beyond “treat this like a CTF.” The implication is uncomfortable: this is what an off-the-shelf tool can do today. Dedicated attackers with resources to build custom pipelines will do more.
The five Linux kernel bugs are fixed or being patched. Hundreds more potential vulnerabilities Claude flagged remain unvalidated. The bottleneck is now human review capacity, not detection. That inversion — from “we can’t find the bugs” to “we can’t process them fast enough” — might be the most significant shift in software security this year.