Particle: DeepSeek-AI Open-Sources 3B DeepSeek-OCR to Compress Visual Context for Long-Document OCR

Overview

DeepSeek-AI released the DeepSeek-OCR paper and open-sourced the code and weights on GitHub and Hugging Face.
The 3B-parameter system pairs a DeepEncoder for visual compression with a DeepSeek3B-MoE-A570M decoder.
Team-reported results show about 97% OCR accuracy at compression ratios up to 10× and about 60% at 20×.
On OmniDocBench, the model is reported to outperform GOT-OCR2.0 using 100 visual tokens and exceed MinerU2.0 with fewer than 800 tokens.
The authors also claim that, in production, a single A100-40G GPU can process more than 200,000 pages per day.
All performance figures are reported by the authors and await independent verification.
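
To put the reported numbers in perspective, the following is a minimal arithmetic sketch, not code from the DeepSeek-OCR repository. It assumes that "compression ratio" means the number of text tokens on a page divided by the number of visual tokens used to represent it (this summary does not define the term), and that the 200,000 pages-per-day figure is spread evenly over a 24-hour day. The function names are hypothetical and chosen only for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # assumes continuous, evenly spread operation


def compression_ratio(text_tokens: int, visual_tokens: int) -> float:
    """Assumed definition: text tokens represented per visual token."""
    if visual_tokens <= 0:
        raise ValueError("visual_tokens must be positive")
    return text_tokens / visual_tokens


def pages_per_second(pages_per_day: int) -> float:
    """Convert a daily page count into an average per-second rate."""
    return pages_per_day / SECONDS_PER_DAY


if __name__ == "__main__":
    # A page holding 1,000 text tokens encoded with 100 visual tokens
    # sits at the 10x ratio where about 97% accuracy is reported.
    print(f"ratio: {compression_ratio(1000, 100):.1f}x")

    # The claimed single-GPU throughput expressed as an average rate.
    print(f"throughput: {pages_per_second(200_000):.2f} pages/s")
```

Under these assumptions, the 200,000 pages-per-day claim works out to a little over two pages per second on average, and a 100-visual-token budget at 10× corresponds to roughly 1,000 text tokens per page.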