Microsoft Debuts Three Fast AI Media Models

NEW AI MODEL LAUNCH April 02, 2026 siliconangle

Microsoft launched three new AI models, MAI-Image-2, MAI-Transcribe-1, and MAI-Voice-1, optimized respectively for image generation, speech transcription, and synthetic voice output. All three are available through Azure's Microsoft Foundry. MAI-Image-2 is twice as fast as its predecessor, while MAI-Transcribe-1 transcribes speech 2.5 times faster and achieves a 3.9% word error rate across 25 languages, beating rivals from Google and OpenAI. MAI-Voice-1 generates synthetic speech from user-supplied scripts with customizable voices. Pricing starts at $5 per million input tokens for MAI-Image-2, $0.36 per transcription hour for MAI-Transcribe-1, and $22 per million characters for MAI-Voice-1. Microsoft is rolling out the models across Bing, PowerPoint, and Copilot Audio Expressions.
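To put the quoted starting prices in perspective, here is a rough cost estimate in Python. It is only a sketch: the constant and function names are made up for illustration and are not part of any Microsoft or Azure API, and it assumes the three "starts at" rates above are the only charges, which may not hold (the summary does not mention output-token pricing for MAI-Image-2, for example).

```python
# Starting rates as quoted in the summary above (USD).
# All names here are hypothetical, chosen for illustration only.
IMAGE_USD_PER_MILLION_INPUT_TOKENS = 5.00   # MAI-Image-2
TRANSCRIBE_USD_PER_HOUR = 0.36              # MAI-Transcribe-1
VOICE_USD_PER_MILLION_CHARS = 22.00         # MAI-Voice-1


def estimate_cost(image_input_tokens=0, transcription_hours=0.0, voice_characters=0):
    """Return an estimated bill in USD, rounded to cents.

    Assumes the quoted starting rates are the only charges.
    """
    image = image_input_tokens / 1_000_000 * IMAGE_USD_PER_MILLION_INPUT_TOKENS
    transcribe = transcription_hours * TRANSCRIBE_USD_PER_HOUR
    voice = voice_characters / 1_000_000 * VOICE_USD_PER_MILLION_CHARS
    return round(image + transcribe + voice, 2)


if __name__ == "__main__":
    # Example workload: 2M image input tokens, 10 hours of audio, 500k characters of speech.
    print(estimate_cost(image_input_tokens=2_000_000,
                        transcription_hours=10,
                        voice_characters=500_000))
```

For that example workload the three components come to $10.00, $3.60, and $11.00, or about $24.60 in total at the quoted starting rates.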
