Overview
- Anthropic positions Sonnet 4.5 as capable of generating production-ready applications rather than just prototypes.
- The company cites industry-leading results on programming benchmarks, including SWE-Bench Verified, though recent reports say GPT-5 outperformed Claude on several tests.
- An Anthropic researcher told TechCrunch the model autonomously wrote code for up to 30 hours in early enterprise trials.
- Co-founder Jared Kaplan said the system handled end-to-end tasks such as building an app, setting up a database service, purchasing a domain, and work related to a SOC 2 audit.
- Anthropic also released a Claude Agent SDK for building custom AI agents, with commercial access through partners like Cursor and Windsurf and reported internal use at Apple and Meta.