Overview
- NYU Stern and GoodFin evaluated 23 AI systems on a mock CFA Level III exam that combines multiple-choice and essay questions.
- OpenAI’s o4-mini scored 79.1% and Google’s Gemini 2.5 scored about 77%, both above the 63% pass threshold, with Anthropic’s Claude Opus also passing.
- Models clustered around 71%–75% on multiple-choice items, but essay scores varied widely, separating reasoning-focused systems from the rest.
- Researchers used chain-of-thought prompting to elicit step-by-step explanations; some models finished the exam in minutes (see the sketch after this list).
- Industry voices cautioned that exam success does not equal client-ready judgment, urging hybrid workflows with human oversight; for comparison, the human pass rate in February stood at 49%.
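
The chain-of-thought technique mentioned above amounts to instructing the model to reason step by step before committing to an answer. Below is a minimal sketch of that style of prompting against a single exam item, assuming the OpenAI Python SDK; the study's actual harness, prompts, exam questions, and grading rubric are not public, so the system prompt and sample question here are illustrative only.

```python
# Minimal chain-of-thought prompting sketch (illustrative; not the study's harness).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical essay-style item standing in for a real exam question.
QUESTION = (
    "A client with a 20-year horizon and moderate risk tolerance asks whether "
    "to shift 10% of equities into long-duration bonds. Recommend and justify."
)

response = client.chat.completions.create(
    model="o4-mini",  # one of the models named in the study
    messages=[
        {
            # Chain-of-thought instruction: ask for step-by-step reasoning
            # before the final answer, mirroring the prompting approach described.
            "role": "system",
            "content": (
                "You are sitting a CFA Level III mock exam. Think through each "
                "question step by step, then state your final recommendation clearly."
            ),
        },
        {"role": "user", "content": QUESTION},
    ],
)

print(response.choices[0].message.content)
```

Eliciting the reasoning trace serves two purposes in a setup like this: it tends to improve accuracy on multi-step essay questions, and it gives human graders intermediate steps to score rather than a bare answer.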