
OpenAI’s Latest AI Models Show Increased Hallucination Rates, Cause Remains Unknown

Despite achieving state-of-the-art coding performance, OpenAI’s o3 and o4-mini reasoning models hallucinate at significantly higher rates than their predecessors, raising concerns for accuracy-sensitive applications.

And OpenAI doesn't know why.

Overview

  • OpenAI's o3 and o4-mini models, launched on April 16, 2025, exhibit hallucination rates of 33% and 48%, respectively, on internal benchmarks, roughly double the rates of previous models (a minimal sketch of how such a rate is computed follows this list).
  • The company has publicly acknowledged it does not yet understand why hallucinations have worsened and has called for further research into the issue.
  • Independent tests by the research lab Transluce highlighted fabricated actions in o3’s reasoning process, such as the model claiming to have run code it never executed, underscoring the scope of the problem.
  • Experts warn that the increased hallucination rates could hinder adoption in enterprise contexts where accuracy is critical, despite the models’ superior coding performance.
  • OpenAI is exploring web search integration as a potential mitigation, citing the improved accuracy GPT-4o achieves on similar tasks when given search access.
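
The hallucination-rate figures above are simple proportions: the share of graded responses whose factual claims a grader judged fabricated. The sketch below illustrates that arithmetic only; the `claim_supported` field and the sample data are hypothetical stand-ins, not OpenAI's internal benchmark format.

```python
def hallucination_rate(graded_responses: list[dict]) -> float:
    """Fraction of responses whose factual claims were graded as fabricated.

    Each item is expected to look like {"claim_supported": bool}, where
    False means the grader judged the claim to be a hallucination.
    """
    if not graded_responses:
        return 0.0
    fabricated = sum(1 for r in graded_responses if not r["claim_supported"])
    return fabricated / len(graded_responses)

# Example: 48 fabricated answers out of 100 graded responses -> 48%,
# the kind of per-model rate reported for o4-mini above.
sample = [{"claim_supported": i >= 48} for i in range(100)]
print(f"{hallucination_rate(sample):.0%}")  # prints: 48%
```

In practice the hard part is the grading itself, i.e., checking each claim against known facts; the reported rate is just this ratio over the graded set.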