Overview
- OpenAI says advancing models could reach a high-risk threshold, with the potential to develop zero‑day exploits or assist in complex intrusions against enterprise and industrial targets.
- The company reports rapid capability gains, citing capture‑the‑flag performance rising from 27% on GPT‑5 in August to 76% on GPT‑5.1‑Codex‑Max in November.
- Planned and existing mitigations include strict access controls, hardened infrastructure, egress restrictions, continuous monitoring, targeted training refusals, detection‑and‑response layers, and end‑to‑end red teaming.
- OpenAI is investing in defensive tools and workflows, including testing Aardvark, an agentic security researcher that scans codebases, suggests patches, and has surfaced novel CVEs; free support is planned for select open‑source projects.
- Governance steps in development include a trusted tiered‑access program for qualified cyberdefense users and the creation of a Frontier Risk Council, alongside work through the Frontier Model Forum to build a shared threat model.