MiniMax M3, out June 1 from the Shanghai lab, makes a loud claim: the first open-weight model to put frontier coding, a 1-million-token context window, and native multimodal understanding in one architecture. On SWE-Bench Pro it scores 59.0%, edging out OpenAI’s GPT-5.5 and Google’s Gemini 3.1 Pro.
## The benchmark sweep
It’s not just SWE-Bench. M3 posts 66.0 on Terminal-Bench 2.1 (command-line agent tasks), 74.2 on MCP Atlas (tool use), and 83.5 on BrowseComp (web browsing), the last of which puts it ahead of Claude Opus 4.7’s 79.3. The context window is five times that of its predecessor, M2.7.
## How it scales context
The architectural piece is MiniMax Sparse Attention (MSA). A lightweight index branch scans incoming tokens and picks which blocks of past context need attention, so the model runs full attention only on those blocks instead of the entire history. The API is live now; the weights and technical report are slated for open release within 10 days, which is what makes the frontier-versus-open framing testable rather than a press-release claim.
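
To make the block-selection idea concrete, here is a minimal Python sketch of attention restricted to a few selected blocks. It is not MiniMax's implementation: the technical report is not out yet, so how the index branch actually scores blocks is unknown. The mean-key scoring rule, the function names (`select_blocks`, `sparse_attend`), and the `block_size` and `top_k` parameters are all assumptions made for illustration.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(q, keys, values):
    """Standard scaled dot-product attention for a single query vector."""
    scale = math.sqrt(len(q))
    weights = softmax([dot(q, k) / scale for k in keys])
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]

def select_blocks(q, keys, block_size, top_k):
    """Hypothetical index branch: score each block by its mean key, keep top_k."""
    n_blocks = math.ceil(len(keys) / block_size)
    scores = []
    for b in range(n_blocks):
        block = keys[b * block_size:(b + 1) * block_size]
        mean_key = [sum(col) / len(block) for col in zip(*block)]
        scores.append(dot(q, mean_key))
    ranked = sorted(range(n_blocks), key=lambda b: scores[b], reverse=True)
    return sorted(ranked[:top_k])  # keep the chosen blocks in positional order

def sparse_attend(q, keys, values, block_size=4, top_k=2):
    """Run full attention only over the tokens inside the selected blocks."""
    chosen = select_blocks(q, keys, block_size, top_k)
    idx = [i for b in chosen
           for i in range(b * block_size, min((b + 1) * block_size, len(keys)))]
    out = attend(q, [keys[i] for i in idx], [values[i] for i in idx])
    return out, chosen
```

In this sketch the selection step costs one dot product per block rather than per token, and the full attention pass only touches `top_k * block_size` tokens. That is the general shape of the saving the article describes, whatever scoring rule MSA actually uses.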
