Microsoft AI's Superintelligence team has unveiled MAI-Voice-2, a text-to-speech model that goes far beyond a routine upgrade. It is positioned as a significant leap from its predecessor across every dimension that matters to production voice experiences: fidelity, language coverage, speaker consistency, and emotional range. The headline trick? In blind listening tests, people often can't tell its output apart from a real human recording.

The model launched at Build 2026 alongside six other MAI releases, and it's already wired into Microsoft Foundry, VS Code, and the Dynamics 365 Contact Center. For developers building voice agents, audiobooks, or accessibility tools, this is one of the more substantial TTS shipments of the year.

From English-only to fifteen languages, without losing the magic

The original MAI-Voice-1, released in April, spoke only English. MAI-Voice-2 expands to 15 languages, and Microsoft says it keeps the same naturalness and expressiveness it delivers in English. The supported set covers English (US and Australia), Italian, French, German, Hindi, Spanish (Spain and Mexico), Portuguese (Brazil and Portugal), Korean, Simplified Chinese, Turkish, Russian, Thai, Dutch, Romanian, and Hungarian.

Microsoft says it deliberately chose depth over breadth, supporting a spectrum of expressive capabilities spanning tonal, pitch-accent, stress-timed, and syllable-timed language systems. Translation: the model isn't just covering more locales, it's adapting prosody to the linguistic structure of each one.

The more interesting capability for global products is code-switching. The model supports it for select language pairs such as Hindi-English and Spanish-English, matching the way users naturally mix languages in everyday speech. In Microsoft's demo clips, the model flips between Hindi and English mid-sentence, or weaves Spanish food terms into English narration, all while holding the same speaker identity and rhythm.
