Particle News: Mistral Launches Voxtral, Its First Open-Source Audio AI Family

Overview

Voxtral is offered in three variants: Voxtral Small, with 24 billion parameters, for production-scale deployments; Voxtral Mini, with 3 billion parameters, for edge and local use; and Voxtral Mini Transcribe, optimized for cost-effective transcription.
The models can transcribe up to 30 minutes of audio and, thanks to an LLM backbone, can understand up to 40 minutes of audio for summarization, question answering and speech-triggered function calls.
According to Mistral’s own benchmarks, Voxtral outperforms OpenAI’s Whisper large-v3, GPT-4o-mini-transcribe, Google’s Gemini 2.5 Flash and ElevenLabs Scribe across multilingual transcription and comprehension tasks.
API pricing ranges from $0.001 to $0.004 per minute, which undercuts comparable services by more than half while maintaining lower word error rates.
All Voxtral models are available under an Apache 2.0 license on Hugging Face and can be tested for free through Mistral’s Le Chat chatbot.
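
To put the per-minute pricing above in concrete terms, here is a minimal sketch of the arithmetic in Python. It assumes flat per-minute billing with no minimum charge or rounding, which the summary does not confirm, and the function names are hypothetical. The summary also does not say which model is billed at which rate, so the two figures are treated only as a low and a high bound.

```python
from decimal import Decimal

# Per-minute rates quoted in the summary (USD). Which Voxtral model maps to
# which rate is not stated there, so they are labeled only as low and high.
LOW_RATE = Decimal("0.001")
HIGH_RATE = Decimal("0.004")


def transcription_cost(minutes, rate_per_minute):
    """Return the cost in USD for `minutes` of audio at a flat per-minute rate.

    Assumption: billing is strictly proportional to duration.
    """
    if minutes < 0:
        raise ValueError("minutes must be non-negative")
    # Go through str() so float inputs such as 12.5 convert exactly.
    return Decimal(str(minutes)) * rate_per_minute


def cost_range(minutes):
    """Return (low, high) cost bounds in USD for `minutes` of audio."""
    return (
        transcription_cost(minutes, LOW_RATE),
        transcription_cost(minutes, HIGH_RATE),
    )


if __name__ == "__main__":
    low, high = cost_range(60_000)  # 1,000 hours of audio
    print(f"1,000 hours: ${low} to ${high}")
```

Under these assumptions, 1,000 hours of audio (60,000 minutes) would cost between $60 and $240, and a single 30-minute recording between $0.03 and $0.12.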