Overview
- A METR-led randomized controlled trial with 16 professional developers and 246 real-world tasks found that AI-assisted sessions using Cursor Pro and Claude 3.5/3.7 Sonnet increased task completion times by 19%.
- Before the study, participants forecast roughly 24% time savings and external experts predicted closer to 40%, but friction points that emerged in practice produced an unexpected slowdown instead.
- Researchers analyzed over 140 hours of screen recordings to pinpoint key contributors to the slowdown, including time spent crafting prompts, waiting on AI generations, and reviewing and integrating AI-generated code.
- The authors caution that these results reflect specific conditions in large, mature open-source codebases and should not be generalized to all development contexts.
- The study’s authors note that improvements in prompting techniques, agent scaffolding, and domain-specific fine-tuning may be required to unlock genuine productivity gains from AI tools.