Mistral AI has announced OCR 4, a document extraction model that goes beyond converting pages into plain text. The new model returns structured output that includes bounding boxes, block classification, and inline confidence scores for every extracted element.
What OCR 4 actually does
Traditional OCR tools extract text. OCR 4 also identifies where each piece of text sits on the page, what role it plays (title, table, equation, signature, and more), and how confident the model is in each region. This structured output makes documents more useful for automated workflows and search systems.
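To make the idea concrete, here is a minimal Python sketch of how a workflow might consume this kind of output. The `Block` class, its field names, the 0.8 threshold, and the sample values are all hypothetical illustrations, not Mistral's actual response schema or real model output.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Block:
    """One extracted region. Field names are hypothetical, not Mistral's schema."""
    text: str
    kind: str                                 # e.g. "title", "table", "signature"
    bbox: Tuple[float, float, float, float]   # (x0, y0, x1, y1), assumed page units
    confidence: float                         # assumed to range from 0.0 to 1.0


def split_by_confidence(blocks: List[Block], threshold: float = 0.8):
    """Separate blocks that can be used directly from those needing human review."""
    accepted = [b for b in blocks if b.confidence >= threshold]
    needs_review = [b for b in blocks if b.confidence < threshold]
    return accepted, needs_review


# Made-up sample data standing in for one parsed page.
sample_blocks = [
    Block("Master Services Agreement", "title", (72, 40, 540, 70), 0.98),
    Block("Total due: 1,250.00", "table", (72, 300, 540, 420), 0.91),
    Block("J. Doe", "signature", (380, 700, 540, 740), 0.55),
]

accepted, needs_review = split_by_confidence(sample_blocks)
print([b.kind for b in accepted])       # block types safe to index automatically
print([b.kind for b in needs_review])   # block types routed to a reviewer
```

Because each block carries a type and a confidence value, a pipeline can route only the uncertain regions to a person instead of re-checking whole pages.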
The model supports 170 languages across 10 language groups, with Mistral claiming particular improvements for rare and low-resource languages where competing systems often struggle. It accepts common formats including PDF, DOC, PPT, and OpenDocument files.
On the public OlmOCRBench, OCR 4 achieved a top score of 85.20. In a separate human evaluation, independent annotators preferred OCR 4 output over competing systems in an average of 72% of cases. Mistral notes that automated benchmark scores carry known limitations and should be treated as directional rather than definitive.
Pricing and deployment options
- OCR API: $4 per 1,000 pages, or $2 per 1,000 pages via the Batch API
- Document AI (adds structured JSON output and custom prompts): $5 per 1,000 pages
- Self-hosted deployment available for enterprise customers with data-sovereignty requirements
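As a rough illustration of the list prices above, the following Python sketch estimates cost for a given page volume. It assumes pricing scales linearly per page with no minimums, volume discounts, or taxes, which the announcement does not confirm; the tier names are hypothetical labels.

```python
# List prices in USD per 1,000 pages, taken from the figures above.
# The dictionary keys are hypothetical labels, not official product identifiers.
PRICE_PER_1000_PAGES = {
    "ocr": 4.00,
    "ocr_batch": 2.00,
    "document_ai": 5.00,
}


def estimate_cost(pages: int, tier: str) -> float:
    """Estimate cost in USD, assuming simple pro-rata pricing per page."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    return pages / 1000 * PRICE_PER_1000_PAGES[tier]


for tier in PRICE_PER_1000_PAGES:
    print(f"{tier}: ${estimate_cost(250_000, tier):,.2f} for 250,000 pages")
```

Under these assumptions, a 250,000-page archive would cost about $1,000 through the standard OCR API, $500 through the Batch API, and $1,250 through Document AI.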
The model is available through Mistral Studio, Amazon SageMaker, and Microsoft Foundry, with Snowflake support announced as coming soon. OCR 4 also serves as an ingestion component for Mistral’s open-source Search Toolkit, which targets enterprise search and retrieval-augmented generation (RAG) pipelines. Mistral positions the model for use cases in legal, financial services, and healthcare, where structured document processing is critical.