

A 20-Year-Old Dropout Built Supermemory — Now It Has 18K GitHub Stars and Google’s Jeff Dean as an Investor

Every AI agent today has the same problem: amnesia. End the conversation, and the context vanishes. Start a new session, and you’re re-explaining everything from scratch. Supermemory is a bet that persistent, time-aware memory will become as essential to AI infrastructure as databases are to web apps — and the bet is attracting serious attention. The project hit 18,000+ GitHub stars in March 2026, is climbing GitHub Trending charts, and counts Google AI chief Jeff Dean among its backers.

From a College Side Project to $3M in Funding

Supermemory’s origin story reads like a Silicon Valley screenplay. Dhravya Shah, originally from Mumbai, skipped India’s IIT entrance exams to attend Arizona State University. There, he challenged himself to build something new every week for 40 weeks. During one of those weeks, he created a project called “Any Context” — a tool to help AI remember things across sessions. He put it on GitHub, and the community responded.

That side project became Supermemory. Shah dropped out of college, moved to San Francisco, and raised $3 million in pre-seed funding led by Susa Ventures, Browder Capital, and SF1.vc. The investor list is stacked: Jeff Dean (Google AI chief), Logan Kilpatrick (DeepMind), David Cramer (Sentry founder), plus executives from OpenAI, Meta, and Cloudflare.

At 20, Shah still operates largely solo. “I still would like a co-founder,” he has said publicly. “I’m neither great at sales nor engineering. I do everything myself.” The honesty is unusual for a VC-backed founder — and apparently, investors don’t mind.

What Supermemory Actually Does

Supermemory is a memory API for AI applications. Developers send it raw data — documents, chat histories, user profiles — and Supermemory handles the pipeline: ingestion, vector embedding, distributed indexing, and semantic retrieval. Queries return in under 300 milliseconds, and the platform processes over 100 billion tokens per month at scale.
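To make the ingest-and-retrieve pattern concrete, here is a minimal in-process toy, not Supermemory's actual SDK or API: the class name, bag-of-words "embeddings," and cosine ranking are all stand-ins for the real neural embedding and distributed index described above.

```python
from collections import Counter
import math

class MemoryStore:
    """Toy stand-in for a memory API: ingest raw text, retrieve by similarity.
    Real systems embed with a neural model; word-count vectors stand in here."""

    def __init__(self):
        self.entries = []  # list of (original text, bag-of-words vector)

    def ingest(self, text):
        self.entries.append((text, Counter(text.lower().split())))

    def search(self, query, k=1):
        q = Counter(query.lower().split())

        def cosine(a, b):
            dot = sum(a[w] * b[w] for w in a)
            na = math.sqrt(sum(v * v for v in a.values()))
            nb = math.sqrt(sum(v * v for v in b.values()))
            return dot / (na * nb) if na and nb else 0.0

        ranked = sorted(self.entries, key=lambda e: cosine(q, e[1]), reverse=True)
        return [text for text, _ in ranked[:k]]

store = MemoryStore()
store.ingest("User prefers dark mode in the editor")
store.ingest("Deployment runs on Cloudflare Workers")
print(store.search("what editor theme does the user like?"))
```

The production pipeline replaces each piece with heavier machinery (neural embeddings, a distributed vector index, sub-300 ms retrieval), but the ingest-then-semantic-search shape is the same.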

What separates it from a generic vector database is time-awareness. Supermemory treats memories as timestamped semantic trajectories, not just static embeddings. This means it can handle temporal reasoning — understanding not only what was said, but when, and how information has changed over time. It automatically extracts facts from conversations, builds user profiles, handles knowledge updates and contradictions, and even forgets expired information.

The product ships as a REST API with SDKs, but it also integrates directly into developer tools. The Claude Code integration (claude-supermemory) has racked up 2,300+ stars on its own — it gives Claude persistent memory across coding sessions, auto-capturing file edits, bash commands, and task context. Similar plugins exist for OpenCode and other AI coding tools.

Benchmark Numbers: Strong but Contested

Supermemory claims the #1 position on three major AI memory benchmarks: LongMemEval, LoCoMo, and ConvoMem. The production engine scores 85.2% on LongMemEval, with particular strength in multi-session recall (71.43%) and temporal reasoning (76.69%). An experimental agentic retrieval system called ASMR (Agentic Search and Memory Retrieval) pushes that to ~99% on the LongMemEval subset, though this is not yet in production.

However, independent benchmarks tell a more nuanced story. A DEV Community comparison of five AI memory systems in 2026 placed Supermemory’s LoCoMo score at approximately 70%, behind Zep (~85%) and Letta/MemGPT (~83.2%). The gap between self-reported and independent numbers is worth noting — it’s a pattern seen across the memory infrastructure space, where Mem0’s independent scores also diverge significantly from its self-reported ones.

The takeaway: Supermemory performs well, especially on temporal reasoning tasks that play to its architectural strengths. But the benchmarking landscape for AI memory is still young and fragmented, and no single leaderboard tells the full story.

How Supermemory Stacks Up Against Competitors

The AI memory infrastructure space has grown crowded in 2026. Here’s where the major players differ:

Mem0 focuses on managed infrastructure with a graph-based memory layer. It’s the fastest path to production if you want someone else to handle scaling, compliance, and infrastructure. Mem0 is strongest for teams that need shared, cross-device memory. However, its independent benchmark scores (~58%) lag behind self-reported numbers (~66%), raising questions about real-world retrieval accuracy.

Zep uses a temporal knowledge graph architecture — graph-based reasoning plus semantic search. It scores well on benchmarks (~85% on LoCoMo) and handles complex enterprise scenarios where facts and relationships change over time. Zep is the strongest choice for enterprise teams dealing with evolving business data.

Letta (formerly MemGPT) takes a different approach entirely, using LLM-driven memory management with tiered storage (core context, recall, and archival memory). It’s open-source and scores around 83% on LoCoMo.

Supermemory positions itself between these options: open-source like Letta, API-first like Mem0, and time-aware like Zep — but with a lighter-weight architecture that avoids the complexity of full knowledge graphs. Its sweet spot is developers building AI agents that need persistent, personalized memory without enterprise-grade graph infrastructure.

The March 6 Outage — And What It Revealed

On March 6, 2026, Supermemory experienced its first major downtime — a 4-hour-and-44-minute service degradation between 4:34 AM and 9:18 AM PST. API response times spiked dramatically, though most requests eventually succeeded rather than failing outright.

The root cause was almost comically simple: an UPDATE query on the API key table was running on every single request, and as traffic grew, the database buckled under the write load. Zero data was lost — requests queued instead of dropping — but the latency was, in the team’s own words, “extremely longer.”

Shah published a transparent postmortem on the Supermemory blog, detailing the timeline, root cause, and remediation steps. The fix: moving rate-limit counters out of the hot path of the primary database. The incident sparked broader discussion on Hacker News and Twitter about the reliability challenges facing AI memory infrastructure — a category where downtime doesn’t just break an app, it can cause an AI agent to lose its entire context history.
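The fix described in the postmortem — moving per-request counter writes out of the primary database's hot path — follows a common pattern: accumulate counts in memory and flush them in batches. The sketch below is an assumption-laden illustration of that pattern, not Supermemory's actual code; the flush threshold and database interface are invented for the example.

```python
from collections import Counter

class BatchedUsageCounter:
    """Sketch of batching rate-limit counters out of the request hot path:
    count API-key usage in memory and flush periodically, instead of issuing
    a database UPDATE on every request (the outage's root cause)."""

    def __init__(self, db_writes, flush_every=100):
        self.pending = Counter()      # in-memory counts, cheap to update
        self.total_seen = 0
        self.db_writes = db_writes    # stand-in for the primary database
        self.flush_every = flush_every

    def record(self, api_key):
        self.pending[api_key] += 1
        self.total_seen += 1
        if self.total_seen % self.flush_every == 0:
            self.flush()

    def flush(self):
        for key, n in self.pending.items():
            self.db_writes.append((key, n))  # one batched write per key
        self.pending.clear()

db = []
counter = BatchedUsageCounter(db, flush_every=100)
for _ in range(250):
    counter.record("key_abc")
print(len(db))  # 2 batched writes instead of 250 per-request UPDATEs
```

The trade-off is bounded staleness: counts lag the database by at most one flush interval, which is usually acceptable for rate limiting and billing, and it removes the write amplification that buckled the database on March 6.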

For a solo-founder startup processing billions of tokens, the incident was a growing pain. The transparent response, however, earned goodwill in the developer community.

Pricing: Free Tier to Enterprise

Supermemory offers four tiers:

  • Free: 1M processed tokens, 10K search queries/month, email support
  • Pro ($19/month): 3M tokens, 100K queries, priority support, advanced analytics
  • Scale ($399/month): 80M tokens, 20M queries, dedicated support, Slack channel
  • Enterprise: Custom volumes, SLA guarantees, dedicated engineer

Overage on paid plans runs $0.01 per 1,000 tokens and $0.10 per 1,000 queries. There’s also a startup program offering $1,000 in Pro credits for six months.
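The pricing arithmetic is straightforward to model. This helper mirrors the tier numbers and overage rates quoted above; it is an illustration only, and actual billing details (proration, rounding, current rates) should be checked against Supermemory's pricing page.

```python
def monthly_cost(tokens, queries, plan="pro"):
    """Estimate a monthly bill from the published tiers and overage rates
    ($0.01 per 1,000 tokens and $0.10 per 1,000 queries over the allowance)."""
    plans = {
        "pro":   {"base": 19,  "tokens": 3_000_000,  "queries": 100_000},
        "scale": {"base": 399, "tokens": 80_000_000, "queries": 20_000_000},
    }
    p = plans[plan]
    extra_tokens = max(0, tokens - p["tokens"])
    extra_queries = max(0, queries - p["queries"])
    return (p["base"]
            + extra_tokens / 1_000 * 0.01    # token overage
            + extra_queries / 1_000 * 0.10)  # query overage

# Pro plan at 5M tokens and 150K queries:
# $19 base + 2,000 x $0.01 + 50 x $0.10 = $19 + $20 + $5 = $44
print(monthly_cost(5_000_000, 150_000))
```

Worked this way, an indie developer modestly over the Pro limits pays tens of dollars, while sustained high volume makes the Scale tier's larger allowances the cheaper option.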

Compared to Mem0’s pricing (which starts free but scales quickly for high-volume use) and Zep’s enterprise-focused model, Supermemory’s free tier is competitive for prototyping, and the Pro plan at $19/month is accessible for indie developers and small teams.

FAQ

What is Supermemory used for?
Supermemory provides persistent memory for AI applications. Common use cases include AI coding assistants that remember project context across sessions, customer support bots that recall past interactions, and personalized AI agents that learn user preferences over time.

Is Supermemory open source?
Yes. The core engine is open-source under the MIT license on GitHub (18,000+ stars). The managed API service adds hosted infrastructure, scaling, and support on top of the open-source foundation.

How does Supermemory compare to Mem0?
Supermemory is open-source and emphasizes time-aware semantic retrieval; Mem0 is a managed service focused on graph-based memory with faster enterprise deployment. Supermemory scores higher on some benchmarks (particularly temporal reasoning), while Mem0 offers a more turnkey production experience.

Does Supermemory work with Claude, GPT, and other LLMs?
Yes. Supermemory is model-agnostic — it works with any LLM via its REST API. It also has dedicated integrations for Claude Code, OpenCode, and other AI development tools through MCP (Model Context Protocol) plugins.

What happened during the March 2026 outage?
A database bottleneck caused elevated API latency for 4 hours and 44 minutes on March 6, 2026. No data was lost. The team published a detailed postmortem and implemented fixes to prevent recurrence. It was the service’s first significant downtime event.

