Overview
- Mistral OCR 4 introduces paragraph-level bounding boxes, block-level classification labels, and per-word and per-page confidence scores to deliver structured outputs for downstream workflows.
- The company says the model supports 170 languages and reports a fuzzy-match accuracy above 94.89%, with particular strength on rare and low-resource languages.
- Mistral reported a 72% average win rate in human preference tests, an OlmOCRBench score of 85.20, and processing throughput of about 2,000 pages per minute. These figures come from the company and press reporting and have not been independently verified.
- API pricing is advertised at $4 per 1,000 pages for standard jobs and $2 per 1,000 pages for batch processing. The model can also be deployed as a single container on premises or in a sovereign cloud, which keeps documents under the operator's control.
- Mistral frames OCR 4 as a challenger to incumbents such as Google Document AI and Azure OCR, competing on accuracy, speed, and deployment flexibility, and positioning itself as a conventional, investor-friendly AI product rather than one built around crypto integrations.
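
To make the advertised pricing and reported throughput concrete, the sketch below estimates cost and processing time for a job of a given size. It assumes strictly linear pricing (no minimum charge, volume discount, or rounding to whole thousands) and sustained throughput at the reported rate, neither of which the summary confirms. The names `estimate_job` and `JobEstimate` are hypothetical helpers for illustration and are not part of any Mistral SDK.

```python
from dataclasses import dataclass

# Figures as advertised or reported in the summary above; not independently verified.
STANDARD_USD_PER_1000_PAGES = 4.0
BATCH_USD_PER_1000_PAGES = 2.0
REPORTED_PAGES_PER_MINUTE = 2000


@dataclass
class JobEstimate:
    pages: int
    standard_cost_usd: float
    batch_cost_usd: float
    minutes_at_reported_throughput: float


def estimate_job(pages: int) -> JobEstimate:
    """Estimate cost and duration, assuming linear pricing and constant throughput."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    thousands = pages / 1000
    return JobEstimate(
        pages=pages,
        standard_cost_usd=thousands * STANDARD_USD_PER_1000_PAGES,
        batch_cost_usd=thousands * BATCH_USD_PER_1000_PAGES,
        minutes_at_reported_throughput=pages / REPORTED_PAGES_PER_MINUTE,
    )


if __name__ == "__main__":
    est = estimate_job(250_000)
    print(f"{est.pages} pages: ${est.standard_cost_usd:.2f} standard, "
          f"${est.batch_cost_usd:.2f} batch, "
          f"~{est.minutes_at_reported_throughput:.0f} min at reported throughput")
```

Under these assumptions, a 250,000-page archive would cost $1,000 at the standard rate or $500 in batch mode, and would take roughly 125 minutes at the reported throughput. Actual billing and speed may differ.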