Particle.news

DeepMind’s Gemini Deep Think Earns Official IMO Gold as OpenAI LLM Awaits Validation

IMO certification of Google DeepMind’s model highlights diverging verification standards, alongside embargo-compliance concerns over OpenAI’s gold-level claim.

Figurines with computers and smartphones are seen in front of the Alphabet logo in this illustration taken February 19, 2024. REUTERS/Dado Ruvic/Illustration
OpenAI’s latest AI model has achieved gold medal-level performance at the IMO, solving five of six of the world's toughest math problems. (AI-generated image)

Overview

  • The IMO has officially certified Google DeepMind’s Gemini Deep Think for solving five of six problems under human testing conditions, awarding it a gold-medal score of 35 out of 42 points.
  • OpenAI’s experimental general-purpose LLM also solved five problems under the same exam constraints, but its results remain self-graded and are pending formal IMO verification.
  • OpenAI’s weekend announcement reportedly violated an embargo that asked companies to wait until July 28 to release results, drawing criticism for overshadowing student competitors.
  • The contrast between DeepMind’s certified result and OpenAI’s self-assessment highlights diverging verification approaches in AI-driven formal reasoning.
  • Neither gold-capable model is publicly available; OpenAI plans to hold back its advanced LLM for months, while Google is providing Deep Think to select testers ahead of a wider rollout.