
Codex 5.3
OpenAI's coding-focused model; used as a benchmark comparison in the AISI evaluation of Claude Mythos in April 2026.
Last refreshed: 15 May 2026 · Appears in 1 active topic
How close is OpenAI's Codex to Anthropic's restricted Mythos model on security tasks?
Timeline for Codex 5.3
Mentioned in: GPT-5.5 clears 32-step attack chain; two models in five days
AI: Jobs, Power & Money
Mentioned in: AISI confirms Mythos 20-hour attack chain
AI: Jobs, Power & Money
- How does Codex 5.3 compare to Claude Mythos on cybersecurity benchmarks?
- AISI's April 2026 evaluation found Codex 5.3 within 5 to 10 percentage points of Mythos on isolated CTF tasks. Mythos scored above 85%; Codex is estimated at 75–80%. The gap opens on multi-step autonomous chains, where Mythos was evaluated alone. Source: UK AI Security Institute
- What is Codex 5.3 and who made it?
- Codex 5.3 is a coding-focused AI model made by OpenAI. It is designed for software development tasks and was used by AISI as one of three comparison models in its April 2026 evaluation of Claude Mythos Preview on cybersecurity benchmarks. Source: AISI evaluation, 15 April 2026
- How did Codex 5.3 perform against Claude Mythos in AISI's evaluation?
- AISI found Codex 5.3 within 5-10 percentage points of Mythos on isolated CTF cybersecurity tasks, with Mythos scoring above 85% and Codex 5.3 estimated at 75-80%. The evaluation did not test Codex 5.3 on the 32-step autonomous attack chain where Mythos demonstrated its most significant capability advantage. Source: AISI evaluation, 15 April 2026
- Is Codex 5.3 more capable than GPT-5.5?
- No. GPT-5.5 is OpenAI's more capable successor model. On 6 May 2026, GPT-5.5 became the second model (after Claude Mythos) to complete AISI's 32-step autonomous attack chain, achieving 71.4% on the expert cyber suite. Codex 5.3 was not evaluated on that benchmark. Source: AISI Frontier AI Trends Report
- What AI coding models are being used for cybersecurity tasks?
- AISI's April 2026 evaluation used Claude Opus 4.6, GPT-5.4, and Codex 5.3 as comparison models against Claude Mythos. All three scored within 5-10 percentage points of Mythos on discrete CTF tasks. By May 2026, GPT-5.5 had cleared the more challenging 32-step autonomous attack chain, on which Codex 5.3 was not assessed. Source: AISI evaluation
- How often does AI cyber capability improve?
- AISI's Frontier AI Trends Report found frontier cyber capability doubling roughly every four months as of May 2026. GPT-5.5 cleared the TLO autonomous attack chain just five days after Mythos, suggesting the benchmark gap between models is closing rapidly. Source: AISI Frontier AI Trends Report
Background
Codex 5.3 is a coding-specialised AI model developed by OpenAI, positioned primarily as a software development and coding assistant. In the UK AI Security Institute's independent evaluation of Claude Mythos Preview on 15 April 2026, Codex 5.3 was one of three comparison models (alongside Claude Opus 4.6 and GPT-5.4) used to benchmark Mythos on isolated capture-the-flag (CTF) cybersecurity tasks. Mythos scored above 85%; Codex 5.3 fell within 5 to 10 percentage points of that score, establishing it as competitive on discrete, single-task security benchmarks.
Codex models have historically been the standard benchmark for coding-task comparisons in AI safety and capability evaluation, which makes Codex 5.3's inclusion in the AISI CTF battery a routine methodological choice. The evaluation did not assess Codex 5.3 on the 32-step 'The Last Ones' (TLO) autonomous attack chain, where Mythos demonstrated a confirmed long-horizon capability. By 6 May 2026, OpenAI's more capable GPT-5.5 had become the second model to complete TLO, succeeding in 2 of 10 attempts, which suggests the successor generation has moved well beyond Codex 5.3's positioning.
For the AI beat, the significance is context: Codex 5.3 scoring within 5 to 10 points of a restricted, government-evaluated model confirms that the public frontier was very close to the restricted frontier on discrete tasks as of April 2026. With AISI reporting frontier cyber capability doubling roughly every four months, the benchmark landscape Codex 5.3 occupies is changing fast.
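To put the four-month doubling figure in perspective, the short Python sketch below compounds it over longer periods. It is illustrative arithmetic only: the function name `growth_factor` and the idea of treating "capability" as a single abstract index are assumptions made here, not AISI's published methodology or metric.

```python
def growth_factor(months: float, doubling_months: float = 4.0) -> float:
    """Return the multiplier after `months` for an index that doubles
    every `doubling_months`. Hypothetical helper for illustration; the
    4-month default is the doubling time quoted from AISI's report."""
    if doubling_months <= 0:
        raise ValueError("doubling_months must be positive")
    # Exponential growth: one doubling per `doubling_months` elapsed.
    return 2.0 ** (months / doubling_months)


if __name__ == "__main__":
    # Show how the multiplier compounds over one, two and three doublings.
    for months in (4, 8, 12):
        print(f"{months:>2} months: x{growth_factor(months):.1f}")
```

Under that assumption, a four-month doubling time compounds to an eightfold increase over 12 months, which is why a benchmark position such as Codex 5.3's in April 2026 can date quickly.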