Overview
- At ChatGPT’s request, Citrix engineer Robert Caruso set up a 90-minute chess match between ChatGPT and the primitive Atari 2600 Video Chess engine.
- ChatGPT repeatedly confused rooks for bishops, overlooked pawn forks, and lost track of the board, requiring manual corrections to its board state before it finally conceded defeat.
- The Atari engine, which evaluates only one move ahead within a 4KB memory constraint, secured victory by adhering strictly to chess rules.
- Large language models predict text tokens and lack persistent memory, preventing them from maintaining a consistent game state or enforcing rule logic.
- The outcome underscores fundamental differences between LLMs and rule-based algorithms and highlights LLM limitations in precise logical tasks.
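
To make the contrast concrete, here is a minimal Python sketch of what "maintaining a consistent game state and enforcing rule logic" means for a rule-based program. It is not the Atari engine's code. The `Board` class and its helper functions are hypothetical names invented for illustration, and the sketch assumes a heavily simplified game with only rooks and bishops and no captures. The point is that the program stores the board explicitly and checks every move against piece geometry, so it cannot mistake a rook for a bishop or forget where a piece stands.

```python
FILES = "abcdefgh"


def to_xy(square):
    """Convert algebraic notation such as 'a1' to zero-based (file, rank)."""
    return FILES.index(square[0]), int(square[1]) - 1


def step(a, b):
    """Return -1, 0, or 1: the direction of travel from a to b."""
    return (b > a) - (b < a)


class Board:
    """Hypothetical, simplified board: rooks ('R') and bishops ('B') only."""

    def __init__(self, pieces):
        # The explicit game state: maps a square such as 'a1' to a piece letter.
        self.pieces = dict(pieces)

    def is_legal(self, src, dst):
        piece = self.pieces.get(src)
        # Assumption for brevity: captures are not modelled, so an occupied
        # destination is always rejected.
        if piece not in ("R", "B") or src == dst or dst in self.pieces:
            return False
        (x0, y0), (x1, y1) = to_xy(src), to_xy(dst)
        dx, dy = x1 - x0, y1 - y0
        if piece == "R" and not (dx == 0 or dy == 0):
            return False  # rooks move along ranks and files only
        if piece == "B" and abs(dx) != abs(dy):
            return False  # bishops move along diagonals only
        # Walk the squares strictly between src and dst; any piece blocks.
        sx, sy = step(x0, x1), step(y0, y1)
        x, y = x0 + sx, y0 + sy
        while (x, y) != (x1, y1):
            if FILES[x] + str(y + 1) in self.pieces:
                return False
            x, y = x + sx, y + sy
        return True

    def move(self, src, dst):
        """Apply a move, refusing anything the rules do not allow."""
        if not self.is_legal(src, dst):
            raise ValueError(f"illegal move: {src} -> {dst}")
        self.pieces[dst] = self.pieces.pop(src)


board = Board({"a1": "R", "c1": "B"})
board.move("a1", "a8")  # legal: the rook slides up the a-file
```

Because every move passes through `move`, the stored state and the rules can never drift apart. A text-prediction model has no such structure unless one is supplied from outside.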