Particle.news

DeepSeek-AI Open-Sources 3B DeepSeek-OCR to Compress Visual Context for Long-Document OCR

A new encoder condenses high-resolution pages into a small set of visual tokens to ease long-context limits for OCR workflows.

Overview

  • DeepSeek-AI released the DeepSeek-OCR paper and open-sourced code and weights on GitHub and Hugging Face.
  • The 3B-parameter system pairs a DeepEncoder for visual compression with a DeepSeek3B-MoE-A570M decoder.
  • Team-reported results show about 97% OCR accuracy at compression ratios up to 10× and about 60% at 20×.
  • On OmniDocBench, the model is reported to outperform GOT-OCR2.0 using 100 visual tokens and exceed MinerU2.0 with fewer than 800 tokens.
  • The authors claim that, in production, a single A100-40G GPU can process over 200,000 pages per day.
  • All performance figures are reported by the authors and are pending independent verification.
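
The reported figures can be put in perspective with a little arithmetic. The sketch below is illustrative only: it assumes the compression ratio is the number of text tokens on a page divided by the number of visual tokens used to represent it, and it uses a hypothetical 1,000-token page. The function names are invented for this example and are not part of the DeepSeek-OCR code.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def compression_ratio(text_tokens: int, visual_tokens: int) -> float:
    """Assumed definition: text tokens per visual token."""
    if visual_tokens <= 0:
        raise ValueError("visual_tokens must be positive")
    return text_tokens / visual_tokens


def pages_per_second(pages_per_day: int) -> float:
    """Convert a daily page count into an average per-second rate."""
    return pages_per_day / SECONDS_PER_DAY


if __name__ == "__main__":
    # Hypothetical page: 1,000 text tokens represented by 100 visual tokens.
    print(compression_ratio(1000, 100))          # 10.0
    # The claimed 200,000 pages per day on one A100-40G, as an average rate.
    print(round(pages_per_second(200_000), 2))   # 2.31
```

Under these assumptions, a page at the 10× level would need only a tenth as many visual tokens as it has text tokens, and the claimed daily throughput averages a little over two pages per second.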