Particle.news

Studies Flag Ongoing Chatbot Errors as OpenAI Publishes Mental-Health Usage Data

Vendor-reported safety gains contrast with new findings of widespread hallucinations and fabricated citations.

Overview

  • A European Broadcasting Union review of more than 3,000 answers from leading chatbots found 45% contained significant problems, with 31% showing faulty or missing citations and repeated misattribution to reputable outlets.
  • OpenAI disclosed that about 0.07% of weekly users show possible signs of psychosis or mania and 0.15% show explicit indications of possible suicidal thoughts, equating to roughly 560,000 and 1.2 million people, respectively, given around 800 million weekly users.
  • OpenAI said updated model specifications and safety tuning cut undesired responses by roughly 65–80% in sensitive categories, reporting GPT‑5 produced the desired behavior in 92% of psychosis or mania cases and showed marked gains on suicide‑related prompts.
  • A Penn State preprint reported that very rude prompts yielded slightly higher accuracy from GPT‑4o than polite ones in a small test set (84.8% versus 80.8%), underscoring how prompt tone alone can measurably shift outputs.
  • OpenAI and PayPal plan to enable direct purchases inside ChatGPT starting in 2026 via the Agentic Commerce Protocol, a move that could extend chatbot use into shopping even as studies warn about factual errors and unreliable sourcing.