Particle.news

DeepSeek Advances Self-Improving AI Models With Tsinghua Collaboration

The Chinese AI startup is developing DeepSeek-GRM, a family of next-generation models designed to enhance reasoning and efficiency through a novel feedback loop mechanism.

  • DeepSeek is collaborating with Tsinghua University to develop self-improving AI models using a new reinforcement learning approach called self-principled critique tuning (SPCT).
  • The upcoming DeepSeek-GRM models aim to improve reasoning and efficiency by incorporating a feedback loop that rewards better performance.
  • Preliminary benchmarks suggest these models could outperform competitors like Google’s Gemini, Meta’s Llama, and OpenAI’s GPT-4 models, though independent verification is limited.
  • DeepSeek plans to release DeepSeek-GRM as open-source technology, continuing its strategy of disrupting the AI market by prioritizing accessibility and innovation.
  • Experts have raised ethical and technical concerns about self-improving AI, including risks such as “model collapse” and the potential need for safeguards like a “kill switch.”
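The critique-and-reward loop described above can be illustrated with a toy sketch: a generative reward model proposes evaluation principles for a prompt, critiques each candidate answer against them, and emits a numeric score; sampling several critiques and averaging approximates the voting step reported for inference-time scaling. All function names and the scoring rule below are illustrative assumptions, not DeepSeek's published implementation.

```python
import statistics

# Hypothetical sketch of an SPCT-style scoring loop. The helpers are
# stand-ins (not DeepSeek's actual API): in the real system each one
# would be a call to a generative reward model.

def generate_principles(prompt: str) -> list[str]:
    # Stand-in for the model proposing evaluation principles for this prompt.
    return ["answers the question directly", "is factually grounded"]

def critique_and_score(answer: str, principles: list[str], seed: int) -> float:
    # Stand-in for one sampled critique: score = fraction of principles the
    # answer satisfies, perturbed per sample to mimic stochastic decoding.
    satisfied = sum(1 for _ in principles if len(answer) > 10)  # toy check
    noise = ((seed * 37) % 5) / 100  # deterministic pseudo-noise per sample
    return satisfied / len(principles) + noise

def select_best(prompt: str, candidates: list[str], samples: int = 4) -> str:
    # Score each candidate with several sampled critiques, average the
    # scores (the "voting" step), and keep the highest-rated answer.
    principles = generate_principles(prompt)

    def avg_score(answer: str) -> float:
        return statistics.mean(
            critique_and_score(answer, principles, s) for s in range(samples)
        )

    return max(candidates, key=avg_score)

best = select_best("What is SPCT?", ["short", "a longer, more complete answer"])
print(best)  # the longer candidate wins under the toy scoring rule
```

The averaged score plays the role of the reward signal: during training it would steer the policy toward better-rated outputs, and at inference time drawing more critique samples buys a more reliable ranking at the cost of extra compute.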