AI Research & Analytics
-
Honen turns your company material into AI-built courses that teach themselves
Corporate training usually means someone manually building slide decks that go stale the moment a process changes. Honen replaces that with automated teaching and learning infrastructure: you hand it your material and it produces a full course on its own.

## How Honen builds a course

Honen reads your material, understands the subject, and assembles…
-
Veridive Searches the Spoken Web and Jumps to the Exact Second
Most video search still means scrubbing a timeline or skimming a transcript. Veridive, billed as a discovery platform for the spoken world, does something more precise: ask a question in plain language and it returns a cited answer pinpointed to the exact second it’s spoken, across YouTube, podcasts, lectures, and interviews.

## What Veridive…
-
Nemotron 3 Nano Omni Unifies Vision, Audio, and Language in One Open Model
NVIDIA’s Nemotron 3 family is its push into open, agent-ready models, and the newest member, Nemotron 3 Nano Omni, is the multimodal one. It unifies vision, audio, and language in a single model, and NVIDIA says it runs agents up to 9x more efficiently than comparable setups.

## What Nemotron 3 Nano Omni does…
-
OrchestraML Takes an ML Project From English Prompt to Deployed Model
Building a machine learning model is still a long chain of specialized steps: finding data, cleaning it, engineering features, training, and shipping. OrchestraML compresses that whole lifecycle into a single described goal. You tell it what you want in plain English, and its agents handle the rest, with a human signing off at every…
-
Microsoft ASSERT Turns Plain-English Specs Into AI Agent Tests
Testing whether an AI agent actually behaves the way you intended has been one of the messier parts of shipping with LLMs. Microsoft ASSERT (Adaptive Spec-driven Scoring for Evaluation and Regression Testing), open-sourced in early June, tries to make that automatic. It’s an MIT-licensed framework that reads a plain-language description of how an…
-
LoomVideo does unified video generation and editing at 5B parameters, not 13B
Most “unified” video models (ones that both generate and edit from mixed text, image, and video inputs) are heavy, at 13B parameters or more. They handle editing by concatenating the source video’s tokens, which doubles the sequence length and, because self-attention cost grows quadratically with sequence length, quadruples attention cost. LoomVideo, a new arXiv release from Peking University, aims for the…
-
ArcANE tests whether role-playing AI characters evolve with the story, not just stay in character
Role-playing AI is usually judged on consistency: does it remember the character’s facts and stay on persona? ArcANE, a new arXiv benchmark, argues that’s the wrong bar. A good character isn’t fixed: its values and behavior should evolve as the story moves, and the real question is whether the AI tracks that arc at…
-
VideoKR builds the first large training corpus for knowledge- and reasoning-intensive video understanding
Most video AI gets graded on shallow recall: what’s on screen at minute three. VideoKR, a new arXiv release, targets the harder thing: video questions that need outside knowledge and multi-step reasoning, not a textual shortcut. It’s billed as the first large-scale training corpus built specifically for that.

## What’s in it

The dataset…
-
Code2LoRA generates a repo-specific adapter so code models keep up as the codebase changes
Code models need to know your repository (its imports, APIs, and conventions) to be useful. The usual fixes are clumsy: stuff all that context into a long prompt (RAG, dependency crawls), or fine-tune a LoRA per repo, which is expensive and goes stale the moment the code changes. Code2LoRA, a new arXiv paper,…
-
MLEvolve is a self-evolving agent that hit #1 on MLE-bench in 12 hours
MLEvolve is an open-source multi-agent system that designs machine-learning solutions end to end: planning, coding, validating, and iterating the way a human ML engineer would on a Kaggle problem. Built by Shanghai AI Laboratory and East China Normal University, it reached #1 on the MLE-bench leaderboard with 12 hours of runtime.

## How it…
