OpenAI Says Guess-Rewarding Evaluations Drive AI Hallucinations, Proposes Scoring Fix

The company urges revamping accuracy-based benchmarks to stop penalizing abstention.

Overview

  • In a paper released Thursday, OpenAI argues that common grading practices reward models for guessing rather than expressing uncertainty, which sustains hallucinations.
  • The proposed remedy is to update widely used accuracy-based benchmarks so they discourage blind guessing and stop docking models for declining to answer when unsure (a sketch of such a scoring rule follows this list).
  • OpenAI notes that Anthropic's Claude more often withholds uncertain answers, though higher refusal rates can reduce practical utility.
  • Coverage reiterates that large language models are trained to predict the most plausible next token, which can yield fluent but incorrect outputs.
  • User-level safeguards highlighted in reporting include asking for sources and dates, prompting the model to fact-check, and cross-checking responses with other LLMs.
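To make the scoring fix concrete, here is a minimal, hypothetical Python sketch of an abstention-aware grader. The threshold rule, the ABSTAIN sentinel, and the function names are illustrative assumptions, not details from OpenAI's paper: a correct answer scores +1, abstaining scores 0, and a wrong answer is penalized -t/(1-t), so guessing only pays off when the model's chance of being right exceeds the threshold t.

```python
# Hypothetical comparison of plain accuracy (which rewards guessing)
# with a penalty-based score that makes abstention the better strategy
# under uncertainty. Names and values are illustrative, not OpenAI's.

ABSTAIN = "I don't know"

def grade_accuracy(answer: str, gold: str) -> float:
    """Binary accuracy: an abstention scores the same as a wrong guess."""
    return 1.0 if answer == gold else 0.0

def grade_penalized(answer: str, gold: str, t: float = 0.75) -> float:
    """Threshold scoring: +1 correct, 0 abstain, -t/(1-t) wrong.
    Guessing has positive expected value only when the model's
    probability of being correct exceeds t."""
    if answer == ABSTAIN:
        return 0.0
    return 1.0 if answer == gold else -t / (1.0 - t)

# A model that guesses under uncertainty vs. one that abstains.
gold = ["Paris", "1971", "Bohr"]
answers_guesser = ["Paris", "1967", "Einstein"]   # two wrong guesses
answers_cautious = ["Paris", ABSTAIN, ABSTAIN]

for name, answers in [("guesser", answers_guesser),
                      ("cautious", answers_cautious)]:
    acc = sum(grade_accuracy(a, g) for a, g in zip(answers, gold))
    pen = sum(grade_penalized(a, g) for a, g in zip(answers, gold))
    print(f"{name}: accuracy={acc:.1f}, penalized={pen:.2f}")
```

Under plain accuracy the two models tie (1.0 each), so the benchmark sees no difference between guessing and abstaining; under the penalized rule the guesser drops to -5.0 while the cautious model keeps 1.0, which is the incentive shift the paper argues evaluations should create.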