Google and OpenAI Claim Unofficial Gold at Math Olympiad, Prompting Benchmarking Debate

IMO organizers stress that AI’s gold-level performance is unofficial, with no formal endorsement or benchmark status.

[Image: OpenAI and Gemini master the math competition]

Overview

  • Google’s Gemini with Deep Think and an experimental OpenAI reasoning model each solved five of six International Math Olympiad problems within the standard 4.5-hour session limit, reaching the unofficial gold-medal threshold.
  • The latest models worked on the problems directly in natural language, without the machine-readable preprocessing earlier systems required, improving on last year’s silver-level results.
  • OpenAI announced its gold-level result ahead of official validation, drawing public criticism from Google DeepMind CEO Demis Hassabis for preempting the student awards and expert review.
  • IMO President Gregor Dolinar confirmed that correct mathematical proofs are valid regardless of authorship but emphasized the contest is not an AI benchmark.
  • The episode has reignited industry debate over appropriate benchmarking standards and the ethics of early disclosure in AI performance reporting.