Particle.news

OpenAI’s o3 Model Defeats Grok 4 to Claim Kaggle AI Chess Title

Magnus Carlsen likened LLM play to “kids’ games,” underscoring the persistent reasoning and consistency gaps revealed by general-purpose models.

Overview

  • The three-day exhibition on Google’s Kaggle Game Arena concluded on August 7 with OpenAI’s o3 emerging unbeaten after a 4–0 final sweep of xAI’s Grok 4.
  • Eight general-purpose LLMs from OpenAI, xAI, Google, Anthropic, DeepSeek and Moonshot AI competed without any specialized chess training.
  • Observers estimated that the models played at an Elo rating of roughly 800–1200, oscillating between solid moves and inexplicable blunders such as repeated queen losses.
  • Five-time world champion Magnus Carlsen provided live commentary, quipping that the matches resembled “kids’ games” and sharing personal anecdotes about playing with tech leaders.
  • The tournament highlighted both the potential and the limitations of general-purpose AI in structured, turn-based tasks, as consistency gaps persisted.