There’s a new model out from [Standard Intelligence](https://si.inc/posts/fdm1/) called FDM-1, and it’s doing something genuinely different in the computer-use agent space. Instead of training on labeled screenshots — the way pretty much everyone else does it — they threw 11 million hours of raw screen recording video at the problem and let the model figure out how humans actually interact with software.
The trick is an inverse dynamics model that watches video frames and reconstructs what the user must have done between them — keystrokes, mouse movements, scrolls, drags. No human annotation needed. They trained that IDM on about 40,000 hours of contractor-labeled data, then turned it loose on the full 11-million-hour corpus to auto-generate action labels at scale. It’s a clever workaround for what has always been the biggest bottleneck in this field: you can’t hire enough people to manually label every click in millions of hours of footage.
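The shape of that pipeline is worth making concrete: train a small labeler on expensive human annotations, then use it to pseudo-label a vastly larger raw corpus. Here's a minimal toy sketch of the idea — every name and the frequency-count "model" are my illustrative stand-ins, not anything from Standard Intelligence's actual system:

```python
# Toy sketch of an inverse-dynamics pseudo-labeling pipeline.
# The real IDM is a neural net over video frames; here "frames" are
# strings and the "model" is just a frequency lookup, to show the flow.
from collections import Counter, defaultdict

def train_idm(labeled_pairs):
    """Fit a stand-in IDM: for each (frame_t, frame_t+1) pair,
    remember the action humans most often labeled between them."""
    counts = defaultdict(Counter)
    for frame_a, frame_b, action in labeled_pairs:
        counts[(frame_a, frame_b)][action] += 1
    return {k: c.most_common(1)[0][0] for k, c in counts.items()}

def pseudo_label(idm, unlabeled_pairs):
    """Auto-generate action labels for raw, unannotated frame pairs."""
    return [(a, b, idm.get((a, b), "noop")) for a, b in unlabeled_pairs]

# Small, expensive human-labeled set (the ~40k-hour analogue)...
labeled = [
    ("menu", "file_open", "click:File"),
    ("menu", "file_open", "click:File"),
    ("doc", "doc_scrolled", "scroll:down"),
]
idm = train_idm(labeled)

# ...turned loose on the big raw corpus (the 11M-hour analogue).
labels = pseudo_label(idm, [("menu", "file_open"), ("doc", "doc_scrolled")])
```

The payoff is that human labeling cost scales with the small seed set, while the training corpus scales with however much raw video you can collect.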
What caught my attention is the context window. FDM-1 compresses nearly two hours of 30 FPS video into about 1 million tokens — roughly 50x more efficient than previous approaches. That means the model can follow along with extended workflows, not just isolated clicks. They showed it extruding a gear in Blender, navigating a car through San Francisco streets (with less than an hour of driving-specific finetuning), and fuzzing web apps to find bugs. The driving demo in particular is wild for a model trained primarily on screen recordings.
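The compression claim is easy to sanity-check with back-of-the-envelope arithmetic (mine, not a figure from the announcement): at 30 FPS, two hours is 216,000 frames, so a 1-million-token budget leaves under five tokens per frame on average.

```python
# Back-of-the-envelope token budget for FDM-1's stated context window.
fps = 30
hours = 2
frames = fps * 3600 * hours      # 216,000 frames in two hours of video
token_budget = 1_000_000         # ~1M tokens per the announcement
tokens_per_frame = token_budget / frames
print(frames, round(tokens_per_frame, 2))  # 216000 4.63
```

For comparison, typical vision-language models spend hundreds of tokens per image, which is roughly where the "50x more efficient" framing comes from.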
The whole thing [blew up on Hacker News](https://news.ycombinator.com/item?id=47125014) on Feb 26, pulling in over 200 points, and [Metaverse Post](https://mpost.io/standard-intelligence-launches-fdm-1-ai-system-capable-of-learning-complex-computer-tasks-from-video/) covered it the same day. The [Standard Intelligence GitHub org](https://github.com/Standard-Intelligence) exists but is still pretty sparse on public repos — so this is very much an announcement, not an open-source drop.
The real thesis here is a shift from data-constrained to compute-constrained. Once you solve the labeling problem with IDM, the ceiling becomes how much compute you can throw at training. The team says performance keeps scaling up, which, if it holds, means FDM-1 is the small version of something much bigger coming. Worth keeping an eye on this one.
