Top AI Product


Qwen3.7-Max ran 35 hours and called 1,000+ tools to write a kernel 10x faster than the vendor code

Alibaba released Qwen3.7-Max on May 19, unveiling it at the 2026 Alibaba Cloud Summit. It’s a reasoning model engineered for long, multi-stage agentic projects rather than short chat, and the headline demo backs that up: it ran for 35 hours uninterrupted, called over 1,000 different tools, and wrote an optimized compute kernel that ran 10x faster than the manufacturer’s official code.

## The benchmarks

On GPQA Diamond, Qwen3.7-Max scores 92.4, edging Claude Opus 4.6’s 91.3. On HLE it scores 41.4 versus Opus 4.6’s 40.0. It ranks #3 of 117 on coding benchmarks (average 92.7) and #2 overall on BenchLM’s provisional leaderboard, and it ships with a 1M-token context window. This is the first Chinese model credibly trading blows with frontier Western models on the hardest agentic and reasoning tasks.

## The open-source split

The Plus variant will be open source; the Max flagship will not. Alibaba continues its shift toward monetizing its best model while giving developers the tier below — open enough to build a community, closed enough to capture the premium.

## Why it matters

The “35-hour autonomous run” framing is the real signal. The frontier labs are all converging on long-horizon agentic competence as the next battleground, and Alibaba just planted a flag with a concrete, verifiable demo rather than a benchmark table.

