Overview
- The IMO has officially certified that Google DeepMind’s Gemini Deep Think solved five of the six problems under human testing conditions, earning it a gold-medal score of 35 out of 42 points.
- OpenAI’s experimental general-purpose LLM also solved five problems under the same exam constraints, but its result is self-graded and still awaits formal IMO verification.
- OpenAI’s weekend announcement reportedly broke an embargo under which companies were asked to wait until July 28 to release results, drawing criticism for overshadowing the student competitors.
- The contrast between DeepMind’s certified score and OpenAI’s self-assessment highlights diverging approaches to verifying AI mathematical reasoning.
- Neither gold-capable model is publicly available; OpenAI plans to withhold its advanced LLM for several months, while Google is offering Deep Think to select testers ahead of a wider rollout.