Overview
- Mistral released the Mistral 3 family under Apache 2.0, pairing the frontier Mistral Large 3 with nine smaller Ministral 3 models spanning the 14B, 8B, and 3B parameter sizes.
- Mistral Large 3 uses a sparse Mixture‑of‑Experts design with 41B active and 675B total parameters, supports multimodal inputs and a 256k context window, and was trained on 3,000 NVIDIA H200 GPUs.
- Ministral 3 targets edge and on‑prem deployments, runs on a single GPU with as little as 4GB of VRAM under 4‑bit quantization, and ships in base, instruct, and reasoning variants with vision support.
- The models are available on Mistral AI Studio, Hugging Face, Azure Foundry, and Amazon Bedrock, with NVIDIA NIM and AWS SageMaker listed as coming soon.
- NVIDIA and Microsoft detailed enterprise integrations, including NVFP4 checkpoints, TensorRT‑LLM and vLLM optimizations, and a reported 10x throughput gain on NVL72 systems, while Mistral signs enterprise customers such as HSBC.
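The headline numbers above are easy to sanity-check with back-of-envelope arithmetic: 4-bit weights take roughly half a byte per parameter, and a Mixture-of-Experts model only activates a fraction of its parameters per token. A minimal sketch (ignoring activations, KV cache, and runtime overhead, so real VRAM needs are somewhat higher):

```python
def weight_memory_gb(params_billions: float, bits_per_weight: int) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes)."""
    return params_billions * 1e9 * bits_per_weight / 8 / 1e9

# Ministral 3 8B at 4-bit: about 4 GB of weights, consistent with the
# single-GPU, 4GB-VRAM claim (weights only; cache/activations extra).
print(round(weight_memory_gb(8, 4), 1))  # 4.0

# Mistral Large 3: 41B of 675B parameters active per token, so per-token
# compute scales with roughly 6% of the total parameter count.
print(round(41 / 675 * 100, 1))  # 6.1
```

This is why the sparse MoE design matters: inference cost tracks the 41B active parameters, while total capacity sits at 675B.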