Overview
- Journalists from 22 public service media organizations in 18 countries blind-assessed roughly 3,000 answers from ChatGPT, Copilot, Gemini and Perplexity across 14 languages.
- Overall, 45% of responses contained at least one significant issue, and 81% had some form of problem, spanning sourcing, accuracy and missing context.
- Serious sourcing faults appeared in about a third of outputs; Gemini showed by far the highest rate, with around three quarters of its responses flagged, compared with far lower rates for the other assistants.
- Major accuracy issues affected roughly 20% of answers, with errors including naming the wrong current leaders, reporting satire as fact and relying on outdated information.
- Broadcasters launched the 'Facts In: Facts Out' campaign, urging regulators to enforce existing rules and AI firms to disclose performance by language and market and to improve their models, as more people turn to AI for news: 7% of online users overall, and 15% of those under 25.