Particle.news

Nature Publishes DeepSeek R1 Paper, Detailing $294,000 Training and H800 Cluster

Peer review prompted new disclosures on safety and data contamination.

Overview

  • The paper confirms R1 was trained primarily with automated reinforcement learning that rewards correct answers, yielding strong reasoning performance including a reported 86.7% on AIME 2024.
  • DeepSeek reports a training cost of about $294,000, with the final run completed in 80 hours on 512 Nvidia H800 GPUs after preparatory experiments on A100s.
  • Authors state R1 did not learn by copying other models’ reasoning examples, while acknowledging the web‑trained base likely absorbed some AI‑generated content.
  • Nature’s review process led to added sections on safety evaluation and contamination mitigations, and it published the reviewer reports with author responses.
  • R1 is distributed as open weights and has been widely downloaded on Hugging Face. The newly disclosed hardware details renew scrutiny tied to U.S. export controls.
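
To make the first point concrete, the idea of reinforcement learning that "rewards correct answers" can be sketched as a rule-based scoring function. This is only an illustration, not DeepSeek's implementation: the function names `extract_final_answer` and `accuracy_reward`, the `\boxed{...}` answer convention, and the exact-match comparison are all assumptions made for the example.

```python
import re
from typing import Optional


def extract_final_answer(completion: str) -> Optional[str]:
    """Return the text inside the last \\boxed{...} in the completion.

    Assumption for this sketch: the model is asked to put its final
    answer in \\boxed{...}. Returns None when no boxed answer is found.
    """
    matches = re.findall(r"\\boxed\{([^{}]*)\}", completion)
    return matches[-1].strip() if matches else None


def accuracy_reward(completion: str, reference: str) -> float:
    """Hypothetical rule-based reward: 1.0 for a matching answer, else 0.0.

    No human grader or learned reward model is involved; the score
    depends only on whether the extracted answer equals the reference.
    """
    answer = extract_final_answer(completion)
    if answer is None:
        return 0.0
    return 1.0 if answer == reference.strip() else 0.0


if __name__ == "__main__":
    sample = "Adding the cases gives 204, so the answer is \\boxed{204}."
    print(accuracy_reward(sample, "204"))
```

A reward of this kind can be computed automatically for any problem with a checkable answer, such as competition math, which is what allows training to proceed without hand-labeled reasoning examples.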