AI Models & APIs
-
Microsoft MAI-Image-2.5 edits images while keeping faces and brands consistent
Microsoft keeps filling out its in-house model stack, and image generation is the latest piece. MAI-Image-2.5, announced at Build 2026, is the company’s strongest image model yet: it launched at number 2 for image editing on the Arena leaderboard, ahead of Nano Banana 2.

## What MAI-Image-2.5 does

The headline capability is identity-preserving image-to-image…
-
Microsoft MAI-Voice-2 brings cloning and emotional speech to Azure Copilot in 15 languages
Microsoft is leaning harder on its own models, and voice is a clear example. At Build 2026 on June 2, the company introduced MAI-Voice-2, the second generation of its in-house text-to-speech model, built to make speech a native interface for Azure Copilot.

## What MAI-Voice-2 does

The model delivers expressive speech synthesis across 15 languages,…
-
xAI Grok Voice Agent API builds voice assistants that speak 100+ languages at $0.05 a minute
xAI has opened up the voice stack behind Grok. The Grok Voice Agent API lets developers build real-time voice assistants that speak dozens of languages, call tools, and pull in live data, priced at $0.05 per audio minute.

## What the API does

The headline is multilingual range: support for over 100 languages, including…
-
Respan unifies LLM observability, evals, and gateway for 100+ AI startups
Teams shipping LLM features usually bolt together three or four separate tools: one for logging, another for evaluations, a third to route between providers. Respan, a Y Combinator-backed platform formerly known as Keywords AI, folds all of that into a single LLM engineering layer for observability, evals, prompt optimization, and an AI gateway.

## What…
-
Claude Fable 5 Is Anthropic’s First Public Mythos-Class Model
Anthropic has released Claude Fable 5, the first publicly available model in its new “Mythos” class, a capability tier that sits above the Opus line. It launched June 9, and the headline number is coding: Fable 5 scores 80.3% on SWE-Bench Pro, up from Opus 4.8’s 69.2% and well ahead of GPT-5.5 at 58.6%.…
-
Nemotron 3 Nano Omni Unifies Vision, Audio, and Language in One Open Model
NVIDIA’s Nemotron 3 family is its push into open, agent-ready models, and the newest member, Nemotron 3 Nano Omni, is the multimodal one. It unifies vision, audio, and language in a single model, and NVIDIA says it runs agents up to 9x more efficiently than comparable setups.

## What Nemotron 3 Nano Omni does…
-
ZeroGPU Routes AI Inference to Idle Edge Devices Instead of Renting GPUs
A lot of production AI work is unglamorous, high-volume stuff: summarizing documents, classifying pages, extracting fields, detecting PII, moderating messages. ZeroGPU’s bet is that you shouldn’t pay centralized-GPU prices to run it. It’s a compute-efficiency layer that routes inference across a distributed network of edge devices using small “nano language models” tuned for the job.…
-
Krisp Voice Translation API Brings Real-Time Speech-to-Speech to Developers
Krisp, best known for AI noise cancellation, has opened its enterprise voice translation engine to developers. The new Krisp Voice Translation API, launched alongside Voice Translation v3, does real-time, bidirectional speech-to-speech translation across 60+ languages, using the same engine that hit 96% accuracy in a live healthcare deployment.

## What the Krisp Voice Translation…
-
ChatGPT Dreaming Gives It Memory That Updates Itself Over Time
OpenAI is rolling out a rebuilt memory system for ChatGPT, called Dreaming, that changes how the assistant remembers you. Instead of just storing facts as you mention them, Dreaming runs a background process that synthesizes memory over time and keeps it current.

## How ChatGPT Dreaming works

The standout behavior is that memories age. If…
-
Claude Opus 4.8 Runs Up to 1,000 Subagents in a Single Session
Anthropic released Claude Opus 4.8 on May 28, just 41 days after 4.7, and the headline isn’t a benchmark bump: it’s a new way of working called dynamic workflows. In Claude Code, the model can now plan a large task, write a JavaScript orchestration script, and spin up as many as 1,000 subagents running…
