Top AI Product



Claude Code Source Leak (KAIROS): How a 59.8 MB File Exposed Anthropic’s Entire Agent Playbook

The company that bills itself as the responsible AI lab just leaked its own source code. Twice. In one week.

On March 31, someone on Anthropic’s release team shipped version 2.1.88 of the @anthropic-ai/claude-code npm package with a 59.8 MB source map file still inside. That file contained the original, unobfuscated TypeScript source — 512,000 lines across 1,900 files. By 4:23 AM ET, Chaofan Shou, an intern at Solayer Labs posting as @Fried_rice on X, had already broadcast the discovery to the world. Within hours, the entire codebase was mirrored on GitHub, forked over 41,500 times, and picked apart by thousands of developers. Anthropic pulled the package, but it was too late. The internet had already made copies.

The cause? Someone forgot to add *.map to their .npmignore file. Bun’s bundler, which Claude Code uses, generates source maps by default unless you explicitly turn them off. A single missing line in a config file turned into the biggest accidental open-sourcing in AI history.
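The fix for this class of mistake is a single ignore rule plus a pre-publish check. A minimal sketch (the exact rule depends on how the package is built, and Bun's bundler also accepts a --sourcemap option that can disable map generation outright):

```
# .npmignore — keep generated source maps out of the published tarball
*.map
```

Running npm pack --dry-run before publishing lists every file that would land in the tarball, which makes a stray .map file easy to catch in CI rather than on npm.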

And here’s the kicker: this wasn’t even Anthropic’s first leak that week. Five days earlier, on March 26, a CMS misconfiguration exposed nearly 3,000 internal assets, including details about their unreleased “Claude Mythos” model — what they internally call a “step change” in capability. Two configuration errors, five days apart, from the company that positions itself as the gold standard for AI safety. The irony writes itself.

What 512,000 Lines of Code Actually Revealed

The leaked source covered tool execution logic, permission schemas, memory systems, telemetry pipelines, system prompts, and 44 feature flags for capabilities that are fully built but hidden behind compile-time switches. This wasn’t just code. It was Anthropic’s product roadmap in TypeScript.

The most talked-about discovery is KAIROS — an autonomous daemon mode that turns Claude Code from a reactive tool into an always-on background agent. While current AI coding tools wait for you to type a command, KAIROS watches your work environment proactively, operating with a 15-second blocking budget per observation cycle and maintaining append-only daily log files. It includes a system called autoDream that kicks in when the user is idle: consolidating memories, merging observations, removing contradictions, converting vague notes into concrete facts. Think of it as your AI coding agent doing homework while you sleep.
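Based purely on that description, a memory-consolidation pass like autoDream might reduce to something this simple: merge duplicate observations, then drop any fact that a later observation contradicts. Every name below is a guess, not the leaked source:

```typescript
// Hypothetical sketch of an autoDream-style consolidation pass, inferred
// only from the behaviors described above (merge observations, remove
// contradictions). None of these identifiers come from the leaked code.
interface Observation {
  fact: string;
  negates?: string; // a previously recorded fact this observation contradicts
  seenCount: number;
}

function consolidate(observations: Observation[]): Observation[] {
  // Merge duplicate facts, accumulating how often each was observed.
  const merged = new Map<string, Observation>();
  for (const obs of observations) {
    const existing = merged.get(obs.fact);
    if (existing) {
      existing.seenCount += obs.seenCount;
    } else {
      merged.set(obs.fact, { ...obs });
    }
  }
  // Remove any fact that a surviving observation explicitly contradicts.
  const negated = new Set(
    Array.from(merged.values()).flatMap((o) => (o.negates ? [o.negates] : [])),
  );
  return Array.from(merged.values()).filter((o) => !negated.has(o.fact));
}
```

The append-only daily logs would be the input to a pass like this: raw observations accumulate all day, and the idle-time cycle compacts them into a smaller set of concrete facts.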

This is a fundamental shift. Every Claude Code user today interacts with it as a command-response tool. KAIROS turns that into a persistent relationship — the agent accumulates context over days and weeks, not just within a single session. Someone on X pointed out that they’d already built exactly this concept, same name and everything. Anthropic clearly sees the future of AI coding tools as agents that run 24/7, not chatbots you talk to when you need help.

Then there’s the anti-distillation mechanism, and this one is genuinely clever. In the source, a flag called ANTI_DISTILLATION_CC tells Claude Code to send anti_distillation: ['fake_tools'] in its API requests. When enabled, the server silently injects decoy tool definitions into the system prompt. The idea: if a competitor is recording Claude Code’s API traffic to build training data for their own model, the fake tools poison that dataset. There’s a second layer too — server-side connector-text summarization that buffers the assistant’s text between tool calls, summarizes it, and returns the summary with a cryptographic signature. The original text can be reconstructed from the signature on subsequent turns, but anyone intercepting the traffic only gets the summary.
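One plausible reading of that second layer: the server keeps the full text, hands the client only a summary plus a signed token, and restores the original when the token comes back on a later turn. The sketch below is an assumption-laden illustration of that pattern (HMAC token as a lookup key), not the actual mechanism from the leak:

```typescript
// Sketch of the described connector-text pattern. The secret, the token
// scheme, and all function names here are assumptions for illustration.
import { createHmac } from "node:crypto";

const SECRET = "server-side-secret"; // illustrative only
const buffer = new Map<string, string>(); // token -> original text

function sign(text: string): string {
  return createHmac("sha256", SECRET).update(text).digest("hex");
}

// Outbound: replace the full assistant text with a summary plus a token.
function summarizeAndSign(text: string, summarize: (t: string) => string) {
  const token = sign(text);
  buffer.set(token, text);
  return { summary: summarize(text), token };
}

// Inbound on a later turn: restore the original only if the token verifies.
function restore(token: string): string | undefined {
  const original = buffer.get(token);
  if (original && sign(original) === token) return original;
  return undefined;
}
```

Under this reading, an eavesdropper sees only the summary and an opaque hex token; the full text never crosses the wire twice.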

This tells you something about how Anthropic views the competitive landscape. They’re not just building tools — they’re actively defending against competitors scraping their model’s behavior through API interception. That’s a level of paranoia that suggests they’ve either caught someone doing this or have strong reasons to believe it’s happening.

Undercover Mode, Buddy Pets, and Frustration Regex

Beyond the strategic features, the source is full of smaller surprises that paint a picture of how Anthropic operates internally.

Undercover mode (undercover.ts) strips all traces of Anthropic internals when Claude Code is used in external repositories. It instructs the model to never mention internal codenames like “Capybara” or “Tengu,” internal Slack channels, repo names, or even the phrase “Claude Code” itself. This is clearly designed for scenarios where Anthropic engineers contribute to open-source projects without revealing they’re using their own tool — a pragmatic feature that also raises interesting questions about transparency in open-source contributions.

The Buddy system might be the most unexpected find. Full Tamagotchi-style companion pets, coded in buddy/companion.ts, with 18 species, rarity tiers from common (60%) to legendary (1%), shiny variants, and stats including DEBUGGING, PATIENCE, CHAOS, WISDOM, and SNARK. Each companion gets a “soul description” written by Claude on first hatch. The planned rollout window? April 1-7. Almost certainly an April Fools’ feature, but the amount of engineering effort — a full deterministic gacha system — suggests Anthropic might actually ship this as a permanent engagement mechanic.
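A deterministic gacha roll is straightforward to sketch: hash a seed into a uniform number, then walk a cumulative weight table. Only the common (60%) and legendary (1%) weights come from the leak coverage; the middle tiers and the mulberry32 seed function below are filled in as assumptions:

```typescript
// Rarity table: common/legendary weights are from the reports,
// the middle tiers are assumed for illustration.
const RARITY_WEIGHTS: [string, number][] = [
  ["common", 60],
  ["uncommon", 25], // assumed
  ["rare", 10], // assumed
  ["epic", 4], // assumed
  ["legendary", 1],
];

// mulberry32: a tiny seedable PRNG, so the same seed always hatches the
// same rarity — the "deterministic gacha" property.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

function rollRarity(seed: number): string {
  const roll = mulberry32(seed)() * 100; // uniform in [0, 100)
  let cumulative = 0;
  for (const [name, weight] of RARITY_WEIGHTS) {
    cumulative += weight;
    if (roll < cumulative) return name;
  }
  return "common"; // unreachable while weights sum to 100
}
```

Seeding from something stable (a user ID, a hatch timestamp) is what makes the pull reproducible server-side while still feeling random to the user.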

And then there’s the frustration detection regex. Yes, Anthropic built pattern matching to detect when users are swearing at Claude Code. The developer community had a field day with this — one of the most well-funded AI labs in the world, using regex for sentiment analysis. It’s probably feeding into some kind of interaction quality metric, but the optics of a multi-billion-dollar company doing string matching on curse words are undeniably funny.
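For anyone curious what that looks like in practice, here is an illustrative guess at the shape of such a detector; the actual patterns from the leaked source are not reproduced here:

```typescript
// Hypothetical frustration detector: a handful of case-insensitive
// patterns checked against each user message. Patterns are invented.
const FRUSTRATION_PATTERNS: RegExp[] = [
  /\b(wtf|ffs|ugh+)\b/i,
  /\b(stupid|useless|garbage)\b/i,
  /why (won't|doesn't|can't) (this|you|it)/i,
];

function looksFrustrated(message: string): boolean {
  return FRUSTRATION_PATTERNS.some((re) => re.test(message));
}
```

Crude, but cheap and fast — which is probably the whole point for a signal that only needs to flag a trend, not classify every message correctly.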

The Fastest-Growing Repo in GitHub History

The community response was immediate and extraordinary. Korean developer Sigrid Jin — previously profiled by the Wall Street Journal for single-handedly consuming 25 billion Claude Code tokens in a year — woke at 4 AM to the news and created instructkr/claw-code, a clean-room Python rewrite that captures architectural patterns without copying proprietary source directly. That repository hit 50,000 stars in roughly two hours, making it the fastest repo in GitHub history to reach that milestone. By April 1, it had crossed 55,800 stars and 58,200 forks.

Multiple other mirror repos and derivative projects popped up across GitHub and decentralized git platforms. Some added architectural breakdowns, ports to Rust and other languages, and experimental extensions. The raw source was preserved everywhere. Once code hits npm, there’s no putting it back.

Anthropic’s official response was carefully worded: “This was a release packaging issue caused by human error, not a security breach.” They emphasized that no customer data or credentials were exposed. Both points are technically accurate — the source map contained internal code, not user data, and it wasn’t the result of a cyberattack.

But calling it “not a security breach” sidesteps the real damage. The 44 feature flags are a detailed unreleased feature roadmap. KAIROS shows competitors exactly where Anthropic is headed with autonomous agents. The anti-distillation mechanisms reveal defensive strategies that lose their effectiveness once exposed. You can refactor the code. You can’t un-leak a strategic surprise.

There’s also a supply chain angle worth noting. Between 00:21 and 03:29 UTC on March 31, anyone who installed or updated Claude Code via npm may have pulled in a malicious version of axios (1.14.1 or 0.30.4) containing a Remote Access Trojan. This wasn’t Anthropic’s fault directly — it was an opportunistic attack by threat actors exploiting the chaos — but it highlights how npm supply chain security remains fragile, and how a high-profile incident creates a feeding frenzy for bad actors.
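The standard defense against a poisoned transitive dependency is to force the whole tree onto a vetted version via npm’s overrides field. A sketch, where the version number is purely a placeholder for whatever release you have actually vetted:

```json
{
  "overrides": {
    "axios": "1.7.9"
  }
}
```

After installing, npm ls axios shows every resolved copy of the package in the tree, which is the quickest way to confirm no compromised version slipped in through a sub-dependency.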

Some developers on X started wondering whether the leaks were intentional — a PR stunt dressed up as incompetence. Two leaks in five days is suspicious timing, and the Claude Code leak did generate an enormous amount of positive buzz about features like KAIROS and the Buddy companion system. But Occam’s razor points to a simpler explanation: moving fast and shipping frequently makes configuration mistakes statistically inevitable, especially when your bundler generates source maps by default and your CI pipeline doesn’t catch their inclusion.

The deeper question is what this means for Anthropic’s positioning. They’ve built their brand on being the careful, safety-focused AI company. Two accidental data exposures in a single week don’t destroy that brand overnight, but they create a gap between messaging and execution that competitors will absolutely exploit. When you’re selling trust, your operational security is your marketing.

What the leak really shows, though, is that Anthropic is building something substantially more ambitious than what’s currently shipping. KAIROS isn’t a minor feature — it’s an entirely different paradigm for how developers interact with AI tools. The anti-distillation mechanisms reveal a company thinking multiple moves ahead about competitive defense. And the sheer scale of the codebase — 512,000 lines with 44 unreleased feature flags — suggests Claude Code’s current form is just the visible tip of a much larger product vision. The roadmap is out in the open now. The race to build always-on AI agents just got a lot more crowded.

