Particle News: OpenAI Study Finds Frontier AI Beating Humans on Tasks Across 44 Occupations

Overview

Researchers used GDPval 2025 to pit advanced models against professionals in nine major industries, with judges blinded to whether outputs came from humans or AI.
AI win rates peaked at 81% for counter and rental clerks, 79% for sales managers, 76% for shipping and receiving clerks, and 75% for editors.
Performance varied by model, with OpenAI’s GPT-5 high averaging 48.8% wins, Anthropic’s Claude Opus 4.1 at 47.6%, and GPT-4o at 12.4%.
Sector averages showed AI beating humans on retail tasks 56% of the time, on wholesale tasks 53%, and in certain government roles 52%, while win rates for information-sector roles topped out at 39%.
OpenAI said top models are approaching expert-level quality but emphasized that most jobs involve more than written tasks. CEO Sam Altman, meanwhile, warned of likely job losses in customer support.