
OpenAI Warns Next Models Could Reach 'High' Cyber Risk, Rolls Out New Safeguards

The company outlines layered safeguards intended to steer advancing capabilities toward cyberdefense through controlled access.

Overview

  • Internal testing shows a rapid jump on capture‑the‑flag benchmarks from 27% with GPT‑5 in August 2025 to 76% with GPT‑5.1‑Codex‑Max in November.
  • Under OpenAI’s Preparedness Framework, a “high” cybersecurity capability could include developing working zero‑day exploits or assisting real‑world intrusion operations.
  • OpenAI is deploying a defense‑in‑depth stack that includes access controls, hardened infrastructure, egress restrictions, comprehensive monitoring, model training to refuse harmful requests, and end‑to‑end red teaming; a minimal sketch of how such layers compose follows this list.
  • Aardvark, an agentic security researcher now in private beta, scans full codebases for vulnerabilities, has already surfaced flaws assigned critical or novel CVEs, and will be offered free to select non‑commercial open‑source projects.
  • Governance steps include a Frontier Risk Council and a trusted, tiered access program for vetted defenders; related industry moves span Google’s Chrome security upgrades and Anthropic’s disclosure of a disrupted AI‑assisted espionage attempt.
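
How such layers compose is easiest to see in miniature. The sketch below is purely illustrative: OpenAI has not published its implementation, and every name here (the Request record, the check functions, the allowlisted hosts) is hypothetical. It shows the defense‑in‑depth idea from the list above: independent checks run in sequence, any one of which can block a request, with each decision recorded for monitoring.

# Hypothetical sketch of a defense-in-depth request pipeline.
# Not OpenAI's implementation; all names and hosts are invented.
from dataclasses import dataclass, field

@dataclass
class Request:
    user_tier: str                 # e.g. "public" or "vetted_defender"
    destination: str               # network egress target
    prompt: str
    audit_log: list = field(default_factory=list)

def access_control(req: Request) -> bool:
    # Layer 1: only vetted tiers reach the most capable endpoints.
    return req.user_tier in {"vetted_defender", "internal"}

def egress_restriction(req: Request) -> bool:
    # Layer 2: outbound connections limited to an allowlist.
    allowed_hosts = {"api.internal.example", "telemetry.internal.example"}
    return req.destination in allowed_hosts

def refusal_filter(req: Request) -> bool:
    # Layer 3: crude stand-in for model-side refusal training.
    banned_terms = ("zero-day exploit", "build an exploit")
    return not any(term in req.prompt.lower() for term in banned_terms)

LAYERS = [access_control, egress_restriction, refusal_filter]

def handle(req: Request) -> bool:
    """Run every layer in order; log each decision, deny on first failure."""
    for layer in LAYERS:
        passed = layer(req)
        req.audit_log.append((layer.__name__, passed))  # Layer 4: monitoring
        if not passed:
            return False
    return True

if __name__ == "__main__":
    blocked = Request("public", "api.internal.example", "summarize this CVE advisory")
    allowed = Request("vetted_defender", "api.internal.example", "summarize this CVE advisory")
    print(handle(blocked), blocked.audit_log)  # False: stops at the access layer
    print(handle(allowed), allowed.audit_log)  # True: passes every layer

The point of the layering is redundancy: even if one check fails, say a jailbroken refusal filter, egress restrictions and monitoring still stand between the model and a harmful outcome.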