QUEST is a family of open deep research agents, spanning 2B to 35B parameters, trained to do something most agents still rely on closed APIs for: long-horizon search, fact-seeking, citation grounding, and report synthesis. The headline is the training data — there isn’t any human-labeled set behind it.
## Synthetic tasks, verifiable rewards
QUEST is trained on just 8,000 fully synthesized tasks. The pipeline builds them from “unified rubric trees” that generate verifiable rewards without human annotation, so the same recipe scales across fact-finding, multi-hop reasoning, and report writing. The training stack layers mid-training, supervised fine-tuning, and reinforcement learning, plus a built-in context manager for long-horizon reasoning.
## Why it matters
Across eight deep research benchmarks, QUEST approaches or surpasses frontier closed-source agents and posts the best results among recent open-weight models — on 8K synthetic tasks, not a giant scraped corpus. That’s the interesting part for anyone building research agents: the bottleneck has been getting verifiable, long-horizon training data at all. If a rubric-driven synthesis pipeline can produce it cheaply, the moat around closed research agents gets a lot thinner.

Leave a comment