Particle News: Anthropic Launches Claude Sonnet 4.5 With Benchmark Gains and 30-Hour Autonomy

Overview

Anthropic says Sonnet 4.5 leads key benchmarks, scoring 77.2% on SWE‑Bench Verified and 61.4% on OSWorld, and calls it its best model for coding and computer use.
The model reportedly stayed focused for about 30 hours on complex, multi‑step tasks, up from seven hours for Opus 4.
Sonnet 4.5 is available now in the Claude chatbot and API at $3 per million input tokens and $15 per million output tokens, with Anthropic indicating it will be the default experience for users.
Companion updates include Claude Code checkpoints, a refreshed terminal, a native VS Code extension, a Claude Agent SDK, context editing and a memory tool, plus a five‑day ‘Imagine with Claude’ preview for Max subscribers.
Anthropic touts improved alignment and ASL‑3 protections, though early third‑party tests reported a jailbreak, underscoring ongoing safety and verification work.
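
For a sense of scale, the API prices quoted above ($3 per million input tokens, $15 per million output tokens) can be turned into a rough per-request estimate. The sketch below is illustrative only: the helper name `estimate_cost_usd` and the token counts are hypothetical, and it assumes simple linear billing with no discounts, caching, or tiering.

```python
# Prices as quoted in the article, in USD per million tokens.
INPUT_PRICE_PER_MTOK = 3.00
OUTPUT_PRICE_PER_MTOK = 15.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Hypothetical helper: linear cost estimate from token counts.

    Assumes flat per-token billing; real invoices may differ.
    """
    input_cost = (input_tokens / 1_000_000) * INPUT_PRICE_PER_MTOK
    output_cost = (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_MTOK
    return input_cost + output_cost


if __name__ == "__main__":
    # Example workload (made-up numbers): 200k tokens in, 50k tokens out.
    print(f"${estimate_cost_usd(200_000, 50_000):.2f}")
```

Because output tokens cost five times as much as input tokens at these rates, workloads that generate long responses are likely to be dominated by the output side of the bill.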