Particle News: Anthropic Releases Open-Source Political Even-Handedness Test as Claude Posts Top Scores

Overview

Anthropic published its evaluation tool on GitHub under an open-source license and urged developers to adopt shared ways to measure political bias.
The method pairs left-leaning and right-leaning prompts for U.S. political topics in single-turn chats and scores responses for fairness, perspective-taking, and refusal rates.
Claude Sonnet 4.5 and Opus 4.1 scored 95% and 94% for even-handedness in Anthropic’s tests, trailing Google’s Gemini 2.5 Pro at 97% and xAI’s Grok 4 at 96%, and exceeding GPT-5 at 89% and Meta’s Llama 4 at 66%.
Anthropic says it uses system prompts and reinforcement learning to steer Claude away from unsolicited political opinions and toward balanced analysis, noting the approach is not foolproof but improves outcomes.
The release follows President Trump’s executive order on ideologically neutral AI for government use, with OMB guidance due November 20 that could shape procurement requirements.