OpenAI GPT-5.5-Cyber beats Anthropic Mythos on cyber benchmark
OpenAI released GPT-5.5-Cyber, an updated Codex Security plugin, and a Daybreak partner network with more than 25 security firms.
TL;DR
- OpenAI released GPT-5.5-Cyber, an updated Codex Security plugin, and a Daybreak partner network with more than 25 security firms.
- The company says the model leads across major cybersecurity benchmarks and the plugin now covers the workflow up to patch generation.
- The company also launched an open-source patching initiative called Patch the Planet with Trail of Bits, HackerOne, and Calif and says more than 30 open-source projects have signed on.
OpenAI released the full GPT-5.5-Cyber model, an updated Codex Security plugin, and a Daybreak partner program on June 23, 2026, extending its push from finding vulnerabilities to fixing them automatically. The company says the model leads on the cybersecurity benchmarks it cites and that the plugin now covers the workflow through patch generation.
What did OpenAI announce?
OpenAI unveiled three linked moves: the full release of GPT-5.5-Cyber; an updated Codex Security plugin that closes the loop from discovery to patch generation; and a Daybreak Cyber Partner Program that includes more than 25 security firms and several governments. The company also launched Patch the Planet, an open-source patching initiative run with Trail of Bits, HackerOne, and Calif, and says more than 30 open-source projects have signed on.
The announcement follows a March research preview of the Codex Security plugin. OpenAI says the plugin scanned over 30 million commits across more than 30,000 codebases, automatically flagged over 500,000 findings as fixed, and had human reviewers confirm another 70,000 fixes.
How does the updated Codex Security plugin work?
The updated Codex Security plugin now analyzes entire codebases, builds targeted patches, verifies the results, and exports findings to existing vulnerability management systems. It performs deep scans and attack path analysis, exports results as SARIF files or CodeQL queries, and can triage findings from other scanners or bug bounty reports.
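For readers unfamiliar with SARIF, the interchange format mentioned above, the following is a minimal sketch of what a SARIF 2.1.0 log looks like when built by hand in Python. It is not the plugin's actual output: the tool name `example-scanner`, the rule ID, the file path, and the finding itself are all hypothetical and were invented for illustration. Only the overall shape (`version`, `runs`, `tool.driver.name`, `results`) follows the published SARIF structure.

```python
import json


def make_sarif(tool_name, findings):
    """Build a minimal SARIF 2.1.0 log from a list of finding dicts."""
    results = []
    for f in findings:
        results.append({
            "ruleId": f["rule_id"],
            # SARIF result levels are "none", "note", "warning", or "error".
            "level": f["level"],
            "message": {"text": f["message"]},
            "locations": [{
                "physicalLocation": {
                    "artifactLocation": {"uri": f["path"]},
                    "region": {"startLine": f["line"]},
                }
            }],
        })
    return {
        "$schema": "https://json.schemastore.org/sarif-2.1.0.json",
        "version": "2.1.0",
        "runs": [{
            "tool": {"driver": {"name": tool_name}},
            "results": results,
        }],
    }


# Hypothetical finding, invented for illustration only.
example_findings = [{
    "rule_id": "EXAMPLE-001",
    "level": "error",
    "message": "Unchecked length passed to memcpy",
    "path": "src/parser.c",
    "line": 42,
}]

sarif_log = make_sarif("example-scanner", example_findings)
print(json.dumps(sarif_log, indent=2))
```

Because the log is plain JSON with a fixed layout, any vulnerability management system that accepts SARIF can ingest findings without a scanner-specific adapter, which is presumably why an export path like this matters for the workflow described here.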
OpenAI positions the plugin as a hands-on assistant to developers: it checks reachability of affected code, generates patches in batch mode, and verifies outcomes, while human reviewers still sign off on every change. The Patch the Planet effort pairs security researchers with maintainers to validate, deduplicate, and merge fixes; an initial five-day sprint found hundreds of issues and produced dozens of merged patches.
How does GPT-5.5-Cyber compare on benchmarks?
GPT-5.5-Cyber posts the highest scores on the three benchmarks OpenAI cites: CyberGym, ExploitGym, and SEC-bench Pro. OpenAI published a comparison showing GPT-5.5-Cyber at 85.6 percent on CyberGym, 39.5 percent on ExploitGym, and 69.8 percent on SEC-bench Pro.
The same table lists Mythos 5 at 83.8 percent on CyberGym. Other models score lower: GPT-5.5 reaches 81.8 percent on CyberGym, 25.95 percent on ExploitGym, and 63.1 percent on SEC-bench Pro; GPT-5.4 reaches 79.0 percent on CyberGym; and Claude Opus 4 reaches 73.1 percent on CyberGym. OpenAI says the Cyber version is deliberately more permissive and refuses fewer requests, but access is limited to verified defenders and is subject to monitoring and guardrails. OpenAI recommends that most users stick with GPT-5.5 paired with Trusted Access for Cyber and Codex Security.
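To put the benchmark figures quoted above in perspective, here is a small sketch that computes the percentage-point gap between GPT-5.5-Cyber and each other model. The scores are copied from this article. The `scores` layout and the `gaps_to_leader` helper are illustrative names, not anything OpenAI published, and models with no quoted score for a benchmark are left out.

```python
# Scores in percent, as quoted in the article. A model is absent from a
# benchmark when the article gives no figure for it.
scores = {
    "CyberGym": {
        "GPT-5.5-Cyber": 85.6,
        "Mythos 5": 83.8,
        "GPT-5.5": 81.8,
        "GPT-5.4": 79.0,
        "Claude Opus 4": 73.1,
    },
    "ExploitGym": {"GPT-5.5-Cyber": 39.5, "GPT-5.5": 25.95},
    "SEC-bench Pro": {"GPT-5.5-Cyber": 69.8, "GPT-5.5": 63.1},
}


def gaps_to_leader(benchmark, leader="GPT-5.5-Cyber"):
    """Return each model's gap to the leader, in percentage points."""
    table = scores[benchmark]
    return {
        model: round(table[leader] - score, 2)
        for model, score in table.items()
        if model != leader
    }


for benchmark in scores:
    print(benchmark, gaps_to_leader(benchmark))
```

On these numbers the lead over Mythos 5 on CyberGym is 1.8 points, while the lead over GPT-5.5 on ExploitGym is 13.55 points, so the size of the advantage varies considerably by benchmark.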
Why it matters
Shifting the focus from discovery to remediation addresses what OpenAI and others identify as the main bottleneck in cybersecurity: patches that never ship. Automating triage, patch generation, and verification at scale could shorten the time between vulnerability discovery and remediation, and feeding those outputs into existing vulnerability management systems could reduce operational friction. Partnering with established security vendors and governments could speed enterprise adoption while limiting misuse through vetted access.
What to watch
Monitor uptake in the Daybreak Cyber Partner Program, whose listed partners include Cisco, CrowdStrike, Cloudflare, Palo Alto Networks, IBM, Fortinet, Wiz, SentinelOne, Darktrace, Palantir, Accenture, PwC, and KPMG. Also watch the expansion of Trusted Access agreements with governments and agencies, including Australia, Canada, France, Germany, Japan, South Korea, ENISA, and the UK, and the results from Patch the Planet as its signed projects, including cURL, Go, Python, Sigstore, and pyca/cryptography, move through validation and merging.
OpenAI calls GPT-5.5-Cyber "the most capable single model for finding and patching software flaws," and the next signals to watch are broader partner integrations, real-world remediation rates, and whether automated patches drive measurable reductions in exploitability.
Written by The Brieftide · Source: The Decoder