Overview
- Mistral’s Voxtral, launched July 15, 2025, is its first open-source audio model family and is available under an Apache 2.0 license via API and Hugging Face.
- The family includes a 24B-parameter Voxtral Small for cloud and enterprise deployments and a 3B-parameter Voxtral Mini, which is also offered as an ultra-light, transcription-only API variant optimized for edge use.
- Built on the Mistral Small 3.1 backbone, Voxtral can transcribe up to 30 minutes of audio and handle clips of up to 40 minutes for understanding tasks, enabling features such as summarization, Q&A and voice-triggered function calls.
- Starting at $0.001 per minute, Voxtral undercuts comparable proprietary APIs by more than 50% while reportedly outperforming leading open-source and closed speech models on ASR benchmarks.
- The models natively support transcription and understanding across major languages including English, Spanish, French, Portuguese, Hindi, German, Dutch and Italian.
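
To make the pricing and length limits above concrete, here is a minimal sketch of the arithmetic. It assumes billing scales linearly at the quoted $0.001 starting price with no rounding or minimum charge, and that a recording longer than the 30-minute transcription limit would be split into chunks by the caller. Neither assumption is confirmed by the source, and `plan_transcription` is a hypothetical helper, not part of any Mistral SDK.

```python
import math
from decimal import Decimal

# Figures quoted in the overview above.
PRICE_PER_MINUTE_USD = Decimal("0.001")  # starting price per minute of audio
MAX_TRANSCRIBE_MINUTES = 30              # transcription length limit


def plan_transcription(total_minutes: float) -> tuple[int, Decimal]:
    """Return (number of chunks of at most 30 minutes, estimated cost in USD).

    Assumes linear per-minute billing with no rounding (an assumption).
    """
    if total_minutes < 0:
        raise ValueError("total_minutes must be non-negative")
    chunks = math.ceil(total_minutes / MAX_TRANSCRIBE_MINUTES)
    # Decimal avoids binary floating-point noise in the currency amount.
    cost = Decimal(str(total_minutes)) * PRICE_PER_MINUTE_USD
    return chunks, cost


chunks, cost = plan_transcription(95)
print(chunks, cost)  # prints "4 0.095"
```

Under these assumptions, a 95-minute recording would need four chunks and cost about $0.095 at the starting price; actual billing rules and request limits should be checked against Mistral's current documentation.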