Overview
- OpenAI’s internal testing shows a rapid jump in capture‑the‑flag benchmark performance, from 27% with GPT‑5 in August 2025 to 76% with GPT‑5.1‑Codex‑Max in November 2025.
- Under OpenAI’s Preparedness Framework, a “high” cybersecurity capability could include developing working zero‑day exploits or assisting real‑world intrusion operations.
- OpenAI is deploying a defense‑in‑depth stack that includes access controls, hardened infrastructure, egress restrictions, comprehensive monitoring, model training to refuse harmful requests, and end‑to‑end red teaming.
- Aardvark, an agentic security researcher now in private beta, scans full codebases for vulnerabilities, has already found critical or novel CVEs, and will be offered free to select non‑commercial open‑source projects.
- Governance steps include a Frontier Risk Council and a trusted, tiered access program for vetted defenders. Related industry moves include Google’s Chrome security upgrades and Anthropic’s disclosure of a disrupted AI‑assisted espionage attempt.