Claude Opus 4.6

Anthropic's publicly released frontier model; scored within 5–10 percentage points of Claude Mythos on AISI's CTF benchmarks.

Last refreshed: 15 May 2026

Key Question

How close is Claude Opus 4.6 to the restricted Mythos model that spooked the US Treasury?

Common Questions
How does Claude Opus 4.6 compare to Mythos on cybersecurity benchmarks?
AISI's April 2026 evaluation found Claude Opus 4.6 within 5 to 10 percentage points of Mythos on isolated CTF tasks. The gap is larger on multi-step autonomous operations: Mythos completed a 32-step chain estimated at 20 human hours, a test Opus 4.6 was not evaluated against in the same setting. Source: UK AI Security Institute
What is Claude Opus 4.6 used for?
Opus 4.6 is Anthropic's most capable publicly available model, used by API developers, enterprise customers, and Claude subscribers. It was the basis for comparison in AISI's April 2026 evaluation of the restricted Mythos Preview. Source: Anthropic
How does Claude Opus 4.6 compare to Claude Mythos?
On isolated CTF cybersecurity tasks, AISI found Opus 4.6 within 5 to 10 percentage points of Mythos, which scored above 85%. The meaningful gap is in multi-step autonomous operation: Mythos completed AISI's 32-step attack chain, estimated at the equivalent of 20 hours of human work; Opus 4.6 was not evaluated on that benchmark. Source: AISI evaluation, 15 April 2026
Is Claude Opus 4.6 available to the public?
Yes. Opus 4.6 is available via Anthropic's API and to consumer subscribers. It is the most capable publicly available Claude model. Claude Mythos Preview is restricted and available only to Project Glasswing partners such as JPMorgan Chase. Source: Lowdown
Can Claude Opus 4.6 autonomously complete cyberattacks?
AISI's evaluation did not test Opus 4.6 on the 32-step 'The Last Ones' (TLO) autonomous attack chain, on which Mythos demonstrated a confirmed long-horizon capability, so Opus 4.6 remains untested on that benchmark. On discrete CTF tasks, Opus 4.6 scored within 5 to 10 percentage points of Mythos. GPT-5.5 cleared TLO in 2 of 10 attempts on 6 May 2026. Source: AISI Frontier AI Trends Report
What is the difference between Claude Opus 4.6 and GPT-5.5?
Both score within 5 to 10 percentage points of Mythos on discrete CTF tasks. On 6 May 2026, GPT-5.5 completed AISI's 32-step autonomous attack chain (the TLO benchmark) in 2 of 10 attempts and scored 71.4% on the expert cyber suite. Claude Opus 4.6 has not been publicly evaluated on TLO. Source: AISI Frontier AI Trends Report

Background

Claude Opus 4.6 is the most capable publicly available model in Anthropic's Claude line-up, released before the restricted Claude Mythos Preview. On 15 April 2026, the UK AI Security Institute (AISI) published a comparative evaluation of Mythos in which Claude Opus 4.6 was one of three comparison models, alongside GPT-5.4 and Codex 5.3, used to benchmark Mythos on isolated capture-the-flag (CTF) cybersecurity tasks. Mythos scored above 85% on those tasks; Opus 4.6 fell within 5 to 10 percentage points, suggesting that Mythos holds no decisive single-task advantage over public frontier models.

Opus 4.6 is available to API and consumer subscribers as Anthropic's public capability ceiling. Unlike Mythos, which was withheld from general release, Opus 4.6 is the model enterprises and developers deploy at scale. By 6 May 2026, GPT-5.5 had joined Mythos as the second model to complete the 32-step 'The Last Ones' (TLO) autonomous attack chain, succeeding in 2 of 10 attempts and scoring 71.4% on the expert cyber suite. Opus 4.6 has not been publicly evaluated on TLO. AISI's Frontier AI Trends Report found frontier cyber capability doubling every four months.

For this beat, the significance is that JPMorgan Chase and other Project Glasswing partners have privileged access to Mythos while the rest of the market uses Opus 4.6 and its equivalents. The AISI evaluation confirmed that the meaningful advantage lies in long autonomous operation chains, equivalent to about 20 hours of human work, rather than in single tasks. GPT-5.5's TLO clearance shows that this capability frontier is no longer held by a single model.