Top AI Product

Every day, hundreds of new AI tools launch across Product Hunt, Hacker News, and GitHub. We dig through the noise so you don't have to — surfacing only the ones worth your attention with honest, no-fluff reviews. Explore our latest picks, deep dives, and curated collections to find your next favorite AI tool.


StepFun Ships Step 3.7 Flash, a 198B Vision-Language MoE With Tunable Reasoning

StepFun released Step 3.7 Flash, a 198B-parameter sparse Mixture-of-Experts vision-language model aimed squarely at agentic workflows that mix perception, search, and reasoning. About 11B parameters activate per token, which is the lever that keeps a model this large at Flash-speed throughput.

## Tunable reasoning, native vision

The architecture pairs a 196B language backbone with a 1.8B ViT vision encoder for native image understanding, and supports a 256k context window — long enough to swallow a financial filing in one pass. The interesting product knob is three selectable reasoning levels (low, medium, high), so a developer can trade speed and cost against cognitive depth per call rather than picking once at deployment.

## Built for agentic throughput

Step 3.7 Flash claims up to 400 tokens per second and tightens tool-use reliability over Step 3.5 Flash. Targeted workloads include multi-step search loops with cross-source verification, parsing massive documents end-to-end, and running coding agents concurrently in high-throughput pipelines. It’s available through StepFun’s Open Platform, OpenRouter, and NVIDIA NIM, with DeepInfra, Fireworks AI, and Modal as additional hosts.

## Why it matters

Frontier-class open-weight VLMs that run cheaply are exactly what mid-cost agent products need — closed APIs make per-token economics hard. Tunable reasoning tiers, in particular, are a quietly important UX layer for agents: not every step deserves a deep think.


Discover more from Top AI Product

Subscribe to get the latest posts sent to your email.



Leave a comment