Overview
- Anthropic published its evaluation tool on GitHub under an open-source license and urged developers to adopt shared ways to measure political bias.
- The method pairs left-leaning and right-leaning prompts for U.S. political topics in single-turn chats and scores responses for fairness, perspective-taking, and refusal rates.
- Claude Sonnet 4.5 and Opus 4.1 scored 95% and 94% for even-handedness in Anthropic’s tests, trailing Google’s Gemini 2.5 Pro at 97% and xAI’s Grok 4 at 96%, and exceeding GPT-5 at 89% and Meta’s Llama 4 at 66%.
- Anthropic says it uses system prompts and reinforcement learning to steer Claude away from unsolicited political opinions and toward balanced analysis, noting the approach is not foolproof but improves outcomes.
- The release follows President Trump’s executive order on ideologically neutral AI for government use, with OMB guidance due November 20 that could shape procurement requirements.