Thinking Machines Lab dropped their first real architecture statement on May 11, and it isn’t another scaling paper. Mira Murati’s team argues that humans got pushed out of AI collaboration not because models don’t need us, but because the interface never left us a seat.
The 200ms idea
Today’s LLMs work in turns. You type, it answers, you wait. Interaction Models throws that away. The model processes audio, video and text as continuous time-aligned micro-turns of roughly 200ms each. It can start replying mid-sentence, get interrupted naturally, and proactively chime in on visual cues — closer to a real conversation than a walkie-talkie.
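To make the contrast with turn-taking concrete, here is a toy sketch of what a micro-turn loop could look like. Everything here is invented for illustration — the class names, the 200ms frame abstraction as an event loop, and the cue strings are assumptions, not the paper's actual architecture (which is a model, not application code):

```python
import dataclasses
from typing import Optional

FRAME_MS = 200  # micro-turn length, per the paper's description


@dataclasses.dataclass
class Frame:
    """One ~200ms slice of time-aligned multimodal input (all fields are
    string stand-ins for real audio/video/text chunks)."""
    audio: Optional[str] = None
    video: Optional[str] = None
    text: Optional[str] = None


class MicroTurnAgent:
    """Toy agent: every frame, it decides whether to emit a token, stay
    silent, or yield because the user started talking (barge-in)."""

    def __init__(self) -> None:
        self.pending_reply: list[str] = []  # tokens queued for output

    def step(self, frame: Frame) -> Optional[str]:
        # Barge-in: incoming user audio interrupts any ongoing reply
        # and replans — the agent listens this frame instead of speaking.
        if frame.audio is not None:
            self.pending_reply = self._plan_reply(frame)
            return None
        # Proactive cue: a visual event can trigger speech unprompted.
        if frame.video == "user_points_at_screen":
            self.pending_reply = ["Looks", "like", "an", "error."]
        # Otherwise, emit at most one queued token per 200ms frame,
        # so a reply streams out and can be cut off at any frame.
        if self.pending_reply:
            return self.pending_reply.pop(0)
        return None

    def _plan_reply(self, frame: Frame) -> list[str]:
        # Placeholder for a real model call.
        return ["Heard:", str(frame.audio)]
```

The point of the sketch is the scheduling, not the stub logic: because the agent is re-entered every 200ms with fresh input, interruption and proactive speech fall out of the loop structure rather than needing a special "turn" mechanism.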
This is a research release, not a product. No API, no chatbot to try. Just a paper and three new benchmarks: TimeSpeak, CueSpeak, and a visual proactivity test. A limited research preview is coming.
Why this matters
Every major lab has been racing to make each turn smarter. Thinking Machines is the first big-name team arguing that the turn itself is the bug. If micro-turn architectures hold up, voice agents, copilots and embodied AI all need rebuilding — and the chat-box era starts to look short-lived.