OpenAI shipped three Realtime API models on May 7. Read the spec sheet and you can hear a half-dozen voice startups quietly rewriting their decks.
What actually launched
GPT-Realtime-2 is the first voice model with GPT-5-class reasoning baked in. 128K context (up from 32K), five-level reasoning effort, tone control, parallel tool calls, clean recovery from interruptions. It can think mid-conversation without going silent.
GPT-Realtime-Translate handles live speech-to-speech translation from 70+ input languages into 13 output languages, keeping pace with the speaker with no lag and no chunking. $0.034/min.
GPT-Realtime-Whisper streams transcription with controllable latency at $0.017/min. Captions and call transcripts are now commodities.
What you can build
All three sit behind the Realtime API, which exited beta the same day. Endpoints: gpt-realtime-2, gpt-realtime-translate, gpt-realtime-whisper. Realtime-2 input pricing is $32/M tokens.
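To put those prices in perspective, here is a rough back-of-the-envelope sketch. It uses only the three figures quoted in this post; the function and constant names are made up for illustration, and it assumes flat per-minute and per-token billing with no output-token, caching, or audio-versus-text distinctions, which real invoices will almost certainly include.

```python
from decimal import Decimal

# Prices as quoted in this post; treat them as a snapshot, not a rate card.
TRANSLATE_PER_MIN = Decimal("0.034")          # GPT-Realtime-Translate, USD per minute
WHISPER_PER_MIN = Decimal("0.017")            # GPT-Realtime-Whisper, USD per minute
REALTIME2_INPUT_PER_M_TOKENS = Decimal("32")  # GPT-Realtime-2 input, USD per 1M tokens


def per_minute_cost(minutes, rate_per_min):
    """Cost in USD of a per-minute-billed stream (hypothetical helper)."""
    return Decimal(str(minutes)) * rate_per_min


def input_token_cost(tokens, rate_per_million=REALTIME2_INPUT_PER_M_TOKENS):
    """Cost in USD of input tokens only; output tokens are not modeled here."""
    return Decimal(tokens) * rate_per_million / Decimal(1_000_000)


# One hour of live translation, one hour of transcription,
# and a single fully packed 128K-token context sent as input.
hour_of_translation = per_minute_cost(60, TRANSLATE_PER_MIN)
hour_of_transcription = per_minute_cost(60, WHISPER_PER_MIN)
full_context_input = input_token_cost(128_000)
```

Under those assumptions, an hour of live translation comes to $2.04, an hour of streaming transcription to $1.02, and filling the whole 128K context once as input to about $4.10.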
Typical builds: agentic phone reps that handle interruptions cleanly, real-time meeting translators, multilingual customer support, live caption overlays. A full layer of voice-agent and AI-translation SaaS just collapsed into three endpoints.