Particle News: Nature Publishes DeepSeek R1 Paper, Detailing $294,000 Training and H800 Cluster

Overview

The paper confirms that R1 was trained primarily with automated reinforcement learning that rewards correct answers, yielding strong reasoning performance, including a reported 86.7% on AIME 2024.
DeepSeek reports a training cost of about $294,000, with the final run completed in 80 hours on 512 Nvidia H800 GPUs after preparatory experiments on A100s.
The authors state that R1 did not learn by copying other models’ reasoning examples, while acknowledging that the web‑trained base model likely absorbed some AI‑generated content.
Nature’s review process led to added sections on safety evaluation and contamination mitigations, and it published the reviewer reports with author responses.
R1 is distributed as open weights and has been widely downloaded on Hugging Face. The newly disclosed hardware details renew scrutiny tied to U.S. export controls.
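
For a sense of scale, the final-run figures reported above (512 GPUs for 80 hours) can be converted into GPU-hours with simple arithmetic. The short Python sketch below does only that; the variable and function names are illustrative and do not come from the paper.

```python
# Figures as reported in the summary above.
GPUS = 512   # Nvidia H800 GPUs used in the final run
HOURS = 80   # reported duration of the final run, in hours


def gpu_hours(gpus: int, hours: int) -> int:
    """Total GPU-hours for a run: number of GPUs times wall-clock hours."""
    return gpus * hours


final_run_gpu_hours = gpu_hours(GPUS, HOURS)
print(f"Final run: {final_run_gpu_hours:,} GPU-hours")  # Final run: 40,960 GPU-hours
```

The summary does not say how the reported $294,000 is split between the final run and earlier stages, so dividing the cost by this figure to get a per-GPU-hour rate would rest on an assumption rather than on a reported number.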