Product

Claude Opus 4.6

Anthropic's publicly released frontier model; scored within 5–10 percentage points of Claude Mythos on AISI's CTF benchmarks.

Last refreshed: 15 May 2026

Key Question

How close is Claude Opus 4.6 to the restricted Mythos model that spooked the US Treasury?

Timeline for Claude Opus 4.6

#9, 6 May

Mentioned in: GPT-5.5 clears 32-step attack chain; two models in five days

AI: Jobs, Power & Money

#6, 15 Apr

Mentioned in: AISI confirms Mythos 20-hour attack chain

AI: Jobs, Power & Money


Common Questions

How does Claude Opus 4.6 compare to Mythos on cybersecurity benchmarks?

AISI's April 2026 evaluation found Claude Opus 4.6 within 5 to 10 percentage points of Mythos on isolated CTF tasks. The gap is larger on multi-step autonomous operations: Mythos completed a 32-step attack chain estimated at 20 hours of human work, a test Opus 4.6 was not evaluated on in the same setting. Source: UK AI Security Institute

What is Claude Opus 4.6 used for?

Opus 4.6 is Anthropic's most capable publicly available model, used by API developers, enterprise customers, and Claude subscribers. It was the basis for comparison in AISI's April 2026 evaluation of the restricted Mythos Preview. Source: Anthropic

How does Claude Opus 4.6 compare to Claude Mythos?

On isolated CTF cybersecurity tasks, AISI found Opus 4.6 within 5 to 10 percentage points of Mythos (both above 75-80%). The meaningful gap is in multi-step autonomous operation: Mythos completed AISI's 32-step attack chain in the equivalent of a 20-hour human operation; Opus 4.6 was not evaluated on that benchmark. Source: AISI evaluation, 15 April 2026

Background

Claude Opus 4.6 is the most capable publicly available model in Anthropic's Claude line-up, released before the restricted Claude Mythos Preview. On 15 April 2026, the UK AI Security Institute (AISI) published a comparative evaluation of Mythos in which Claude Opus 4.6 was one of three comparison models, alongside GPT-5.4 and Codex 5.3, used to benchmark Mythos on isolated capture-the-flag (CTF) cybersecurity tasks. Mythos scored above 85% on those tasks and Opus 4.6 fell within 5 to 10 percentage points of it, so the evaluation showed no decisive single-task advantage for Mythos over public frontier models.

Opus 4.6 is available to API customers and consumer subscribers and marks Anthropic's public capability ceiling. Unlike Mythos, which was withheld from general release, Opus 4.6 is the model enterprises and developers deploy at scale. By 6 May 2026, GPT-5.5 had joined Mythos as the second model to complete the 32-step 'The Last Ones' (TLO) autonomous attack chain. It cleared the chain in 2 of 10 attempts and scored 71.4% on the expert cyber suite. Opus 4.6 has not been publicly evaluated on TLO. AISI's Frontier AI Trends Report found frontier cyber capability doubling every four months.

For this beat, the significance is that JPMorgan Chase and other Project Glasswing partners have privileged access to Mythos while the rest of the market uses Opus 4.6 and its equivalents. The AISI evaluation confirmed that the meaningful advantage lies in 20-hour autonomous operation chains rather than single tasks, and GPT-5.5's TLO clearance shows that this capability frontier is no longer held by a single model.