Overview
- Anthropic says Claude Sonnet 4.5 can build production-ready applications and calls it the world's best programming model.
- The company reports leading scores on programming benchmarks, including SWE-Bench Verified, though the coverage cites little independent verification of those results.
- An Anthropic researcher told TechCrunch that early enterprise trials saw autonomous coding runs of up to 30 hours.
- Jared Kaplan told CNBC the model completed complex tasks on its own, including app development, database setup, domain purchase, and steps toward a SOC 2 audit.
- Alongside the model, Anthropic released a Claude Agent SDK for building custom agents and highlighted partner support from Cursor, with API access available to coding tools such as Cursor and Windsurf.
- Reports note that GPT-5 recently outperformed Claude on several programming tests and that Apple and Meta use Claude models internally.