Transcription is one of the least flashy but most heavily used AI services, and Microsoft just pushed its in-house model forward. MAI-Transcribe-1.5, announced at Build 2026, is a speech-to-text model that the company says holds the #1 spot on the FLEURS benchmark with a best-in-class word error rate (WER).
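For readers unfamiliar with the metric, word error rate is the number of word-level substitutions, deletions, and insertions needed to turn a transcript into the reference text, divided by the number of words in the reference. The snippet below is a minimal sketch of that standard definition, not Microsoft's or the benchmark's evaluation code; the function name `word_error_rate` and the plain whitespace tokenization are assumptions made for illustration (real evaluations usually normalize casing and punctuation first).

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One dropped word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Lower is better, and because insertions count as errors, the value can exceed 1.0 when a transcript contains many extra words.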
## What MAI-Transcribe-1.5 does
The headline is breadth without losing accuracy. The model now covers 43 languages, 18 more than its predecessor, including a large expansion into Indian languages like Bengali, Tamil, Telugu, and Marathi, plus European additions such as Greek, Ukrainian, and Catalan. Microsoft says it beats Gemini and OpenAI models on accuracy and produces transcripts up to 5x faster than rivals, at roughly $0.36 per hour of audio. It also adds content and keyword biasing, which nudges the model toward domain-specific terms and names so they are more likely to be transcribed correctly.
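To put the quoted price in context, here is a back-of-the-envelope cost estimate. It is a rough sketch that assumes the roughly $0.36 per audio hour figure applies as a flat rate; actual billing (rounding, tiers, regional pricing) is not described here, and the function name `estimate_cost_usd` is made up for illustration.

```python
# Approximate per-hour price quoted above; treated as a flat rate (assumption).
PRICE_PER_AUDIO_HOUR_USD = 0.36


def estimate_cost_usd(
    audio_minutes: float,
    price_per_hour: float = PRICE_PER_AUDIO_HOUR_USD,
) -> float:
    """Estimate transcription cost for a given amount of audio, in USD."""
    if audio_minutes < 0:
        raise ValueError("audio_minutes must be non-negative")
    return audio_minutes / 60 * price_per_hour


if __name__ == "__main__":
    # Hypothetical workload: 1,000 hours of recorded calls per month.
    monthly_minutes = 1_000 * 60
    print(f"${estimate_cost_usd(monthly_minutes):.2f}")
```

At that rate, a hypothetical contact center recording 1,000 hours of calls a month would pay on the order of $360 for transcription.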
## Where it shows up
Rather than living only as an API, MAI-Transcribe-1.5 is being wired into Copilot, Teams, GitHub, and Dynamics 365 Contact Center, and is available to developers through Azure AI Foundry. For teams building voice agents, meeting tools, or contact-center automation, an accurate, fast, cheap multilingual transcription layer is the unglamorous piece everything else depends on.
