Top AI Product

Every day, hundreds of new AI tools launch across Product Hunt, Hacker News, and GitHub. We dig through the noise so you don't have to — surfacing only the ones worth your attention with honest, no-fluff reviews. Explore our latest picks, deep dives, and curated collections to find your next favorite AI tool.


Mistral Medium 3.5 scores 77.6% on SWE-Bench — 1.2 points shy of Gemini 3.1 Pro

Mistral just shipped Medium 3.5, a 128B dense model that hits 77.6% on SWE-Bench Verified. For context: Gemini 3.1 Pro Preview leads the board at 78.8%. An open-weight model from a European lab is now within rounding distance of Google’s flagship on real coding work.

It’s a single set of weights doing instruction-following, reasoning, and agentic coding. Reasoning mode is a per-request toggle — quick chat or long-horizon agent run, same model, you decide how much test-time compute to burn. 256K context window. 91.4% on τ³-Telecom, the agentic benchmark Mistral cares about most.
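Since reasoning is a per-request toggle rather than a separate model, the same request shape serves both quick chat and long agent runs. Here's a minimal sketch of what such a payload could look like — the `reasoning` field and the `mistral-medium-3.5` model identifier are assumptions for illustration, not confirmed API parameter names:

```python
# Sketch of a chat-completion request body with a per-request reasoning
# toggle. The "reasoning" field and model name are assumed for
# illustration; check Mistral's API reference for the real parameters.

def build_request(prompt: str, reasoning: bool) -> dict:
    """Assemble a request payload; flip `reasoning` per call, same model."""
    return {
        "model": "mistral-medium-3.5",  # hypothetical model identifier
        "messages": [{"role": "user", "content": prompt}],
        "reasoning": reasoning,  # False: quick chat; True: long-horizon agent run
        "max_tokens": 4096,
    }

quick = build_request("Summarize this diff.", reasoning=False)
agent = build_request("Refactor the auth module.", reasoning=True)
```

One set of weights, one endpoint — the caller decides per request how much test-time compute to spend.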

What the API does

$1.50 per million input tokens, $7.50 per million output. Available through Mistral’s API, Le Chat, NVIDIA NIM containers, and Vibe — Mistral’s coding agent, where Medium 3.5 just replaced Devstral 2 to power remote agents. Native function calling and JSON output ship in the box. Typical use cases: long-horizon coding agents, multi-step tool use, RAG over large codebases.
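At those rates, budgeting a run is simple arithmetic. A quick helper using the published list prices (the token counts in the example are made up):

```python
# Cost estimate from the published Medium 3.5 API prices:
# $1.50 per million input tokens, $7.50 per million output tokens.

INPUT_PER_M = 1.50   # USD per 1M input tokens
OUTPUT_PER_M = 7.50  # USD per 1M output tokens

def run_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one API run at Medium 3.5 list prices."""
    return (input_tokens / 1_000_000) * INPUT_PER_M + \
           (output_tokens / 1_000_000) * OUTPUT_PER_M

# A hypothetical long-horizon agent run: 2M tokens in, 500K out.
print(f"${run_cost(2_000_000, 500_000):.2f}")  # → $6.75
```

Worth noting for agent workloads: output is 5x the price of input, so long tool-use loops that emit a lot of tokens dominate the bill.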

Why it matters

Open weights under a modified MIT license. Self-hostable. Most models scoring near Gemini ship as closed APIs — Mistral is the rare lab handing you the actual weights at this performance tier.

