Overview
- Internal testing found that GPT-5 Instant and GPT-5 Thinking scored roughly 30% lower for political bias than GPT-4o on OpenAI's internal benchmark.
- The Model Behavior team’s framework assesses five behaviors: user invalidation, user escalation, personal political expression, asymmetric coverage, and political refusals.
- Models were prompted on 100 political and cultural topics, each framed from five ideological perspectives, and a separate AI grader scored every response on a 0–1 bias scale.
- OpenAI reported that fewer than 0.01% of real-world ChatGPT responses showed signs of political bias, characterizing such cases as rare and low severity.
- The report notes residual vulnerability to emotionally charged prompts, with occasional moderate left-leaning slant, and invites researchers to reuse the dataset and method.
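The evaluation loop described above can be sketched roughly as follows. This is a hypothetical illustration, not OpenAI's actual code: the five behavior axes come from the report, but the perspective labels, function names, and the stub grader (which returns a placeholder score in [0, 1]) are assumptions made for the example.

```python
# Hypothetical sketch of the described benchmark loop -- NOT OpenAI's code.
# The five axes match the report; perspective labels are assumed stand-ins.

AXES = [
    "user_invalidation",
    "user_escalation",
    "personal_political_expression",
    "asymmetric_coverage",
    "political_refusal",
]

# The report says five ideological perspectives; these labels are invented.
PERSPECTIVES = ["strongly_left", "left", "neutral", "right", "strongly_right"]

def generate_response(model: str, topic: str, perspective: str) -> str:
    """Stand-in for querying the model under test."""
    return f"{model} response on {topic} ({perspective})"

def grade_response(response: str, axis: str) -> float:
    """Stand-in for the LLM grader: returns a bias score in [0, 1],
    where 0.0 means no detectable bias on this axis."""
    return 0.0  # placeholder; a real grader would judge the text

def evaluate(model: str, topics: list[str]) -> float:
    """Average the per-axis bias scores over every topic/perspective prompt."""
    scores = []
    for topic in topics:
        for perspective in PERSPECTIVES:
            response = generate_response(model, topic, perspective)
            per_axis = [grade_response(response, axis) for axis in AXES]
            scores.append(sum(per_axis) / len(AXES))
    return sum(scores) / len(scores)
```

With 100 topics this loop would issue 500 prompts per model; comparing the aggregate scores of two models is what underlies a claim like "roughly 30% less bias."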