Here’s a question I didn’t know I cared about until this week: what’s the absolute smallest transformer model that can add two 10-digit numbers correctly? Not approximately, not most of the time — at least 99% accuracy on a proper test set. That’s the challenge behind [AdderBoard](https://github.com/anadim/AdderBoard), an open-source competitive leaderboard that’s been quietly consuming the attention of ML researchers and tinkerers alike.
The project was kicked off by Dimitris Papailiopoulos as an experiment he called “Addition Under Pressure.” He had Claude Code and Codex each try to build the smallest transformer they could for 10-digit addition. Claude Code came back with a 6,080-parameter model. Codex did it in 1,644. Both respectable, but then the community got involved, and things got wild. The current trained-weights record sits at just 311 parameters (by rezabyt, using rank-3 factorization and a grokking trick), while the hand-coded category has been pushed down to an absurd [36 parameters with 100% accuracy](https://github.com/anadim/AdderBoard). Let that sink in — 36 parameters to perfectly add any two 10-digit numbers.
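If "rank-3 factorization" sounds mysterious, the parameter savings are easy to see: a full m×n weight matrix costs m·n parameters, while replacing it with a product of an m×r and an r×n matrix costs only r·(m+n). Here is a quick sketch of the arithmetic — the dimensions are made up for illustration and are not the actual AdderBoard model shapes:

```python
# Parameter cost of a dense weight matrix vs. a low-rank factorization
# W ~ A @ B, where A is m x r and B is r x n. Shapes are hypothetical,
# chosen only to show the scaling; they are not rezabyt's real config.

def full_params(m: int, n: int) -> int:
    return m * n                # dense matrix: every entry is a parameter

def low_rank_params(m: int, n: int, r: int) -> int:
    return r * (m + n)          # two thin factors instead of one dense matrix

m, n, r = 16, 16, 3
print(full_params(m, n))        # 256 parameters dense
print(low_rank_params(m, n, r)) # 96 parameters at rank 3
```

The win grows with matrix size: at rank 3, the factorized cost scales linearly in m + n rather than quadratically, which is exactly the kind of trick you need when the whole budget is a few hundred parameters.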
What makes this more than just a fun code golf exercise is what it reveals about transformer internals. There's a fascinating "parameter cliff" around 800 parameters: models above it work, models below it mostly don't. Researchers also found that single-layer decoders beat two-layer models at the same parameter budget, which is counterintuitive if you assume more layers always means more expressiveness. The whole thing forces you to think about what a transformer actually *needs* to do addition: digit alignment via attention, per-digit arithmetic via MLPs, and carry propagation via autoregressive generation.
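Those three sub-tasks are easy to state in plain code. The sketch below is not a transformer — it's ordinary Python — but it makes explicit what a model has to learn: line up the digits, add each pair, and thread the carry forward while emitting output digits one at a time, least-significant first, the order an autoregressive decoder handles most naturally:

```python
# A plain-Python sketch of the computation a transformer must learn for
# 10-digit addition. Each comment marks the sub-task a model component
# would have to implement. This is an illustration, not any leaderboard
# entry's actual algorithm.

def add_autoregressive(a: str, b: str) -> str:
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)    # digit alignment (attention's job)
    carry, out = 0, []
    for i in range(width - 1, -1, -1):       # emit LSB-first, one digit per step
        s = int(a[i]) + int(b[i]) + carry    # per-digit arithmetic (the MLP's job)
        out.append(str(s % 10))
        carry = s // 10                      # carry propagation (autoregression's job)
    if carry:
        out.append(str(carry))
    return "".join(reversed(out))

print(add_autoregressive("9999999999", "1"))  # "10000000000"
```

Seen this way, the hard part for a tiny model isn't the arithmetic table (only 100 digit pairs exist); it's wiring attention so each output position reads exactly the right pair of input digits.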
The project [hit Hacker News today](https://news.ycombinator.com/item?id=47170030) and there’s active debate about whether training a transformer for basic arithmetic is even meaningful. Fair point, but I think that misses what makes AdderBoard compelling. It’s not about building a practical calculator. It’s about understanding the minimal computational structure required for a specific task — and that’s a question with real implications for model compression and efficiency research. Ziming Liu even wrote up a [deep-dive on 181-parameter models](https://kindxiaoming.github.io/blog/2026/digit-addition/) that’s worth reading if you want the math behind it.
If you enjoy competitive optimization puzzles or want a hands-on way to understand how transformers actually work under the hood, this is a great rabbit hole to fall into.