AI: Jobs, Power & Money
10 APR

AISI confirms Mythos 20-hour attack chain

3 min read
16:54 UTC

The UK AI Security Institute's independent evaluation of Claude Mythos Preview found no single-task superiority over rival models, but confirmed a genuine autonomous capability: a 32-step attack chain equivalent to 20 hours of trained-human work.

Economic · Developing
Key takeaway

AISI confirmed Mythos can run 20 hours of trained-human work autonomously, the capability that most directly substitutes for salaried labour.

The UK AI Security Institute (AISI) published an independent evaluation of Anthropic's Claude Mythos Preview on 15 April 2026. On isolated capture-the-flag (CTF) tasks, Mythos scored above 85%, but the rival frontier models GPT-5.4, Claude Opus 4.6 and Codex 5.3 fell within 5 to 10 percentage points of it, so Mythos showed no single-task superiority. In AISI's 32-step "The Last Ones" benchmark, however, Mythos autonomously completed a sequence the Institute estimates would take a trained human roughly 20 hours, without human prompting between steps.

AISI is the UK government body established to evaluate the safety of frontier AI models; its evaluation is the first external assessment of Mythos since Anthropic distributed restricted access to twelve founding partners under Project Glasswing on 8 April. Anthropic's marketing had emphasised thousands of zero-day vulnerabilities discovered by the model; Tom's Hardware reported on 9 April that those claims rested on only 198 manual reviews. AISI's CTF findings partly vindicate that critique: Mythos is not dramatically more capable than competitors at short, bounded tasks.

The attack-chaining result is the capability that matters. Sustained autonomous execution over 32 steps and roughly 20 hours is the operational profile a trained human analyst, paralegal or junior engineer currently provides inside a bank, law firm or software team. It is also the profile that the emergency meeting of Wall Street CEOs, convened by Scott Bessent and Jerome Powell at Treasury on 8 April, was called to assess. Treasury and the Fed acted promptly on a capability that federal agencies could not themselves verify; AISI's 20-hour-human-equivalent figure is the first external confirmation that the convening was warranted on substance.

For the workforce implication, the relevant dimension is not Mythos's cybersecurity reach but its ability to replace trained-human throughput at chain-of-task scale. That capability is what JPMorgan CEO Jamie Dimon described in February when he told the bank's investor meeting that AI has led to internal redeployment, covered elsewhere in this update. Every original Glasswing partner, and the additional five named in Anthropic's 7 April system card, will have to integrate the attack-chain profile into internal risk frameworks during live deployment.

The evaluation was accessed via a third-party summary from Results Sense rather than AISI's primary publication, so specific scores should be verified against the Institute's direct release when it becomes available. The methodology point, however, is solidly established: Mythos's material advantage is durability, not speed, and durability is the AI capability that most directly substitutes for salaried human labour.

Deep Analysis

In plain English

A UK government body called the AI Security Institute tested Anthropic's most advanced AI model, Mythos, and found that it can independently complete a complex cybersecurity attack across 32 separate steps, work that would take a trained human about 20 hours. This confirms a capability distinct from the headline claims: chaining together a full attack sequence autonomously, rather than finding a single flaw. This matters for jobs because the same autonomous multi-step capability that can conduct a security attack can also carry out many complex knowledge-work tasks without human oversight.

Root Causes

The attack-chaining capability that AISI confirmed falls outside prior evaluation frameworks because it is an emergent property of model scale rather than a designed feature.

Existing regulatory frameworks (including the EU AI Act's high-risk classification system and the reporting requirements of US Executive Order 14110) were designed around discrete capabilities such as facial recognition accuracy and loan decision bias. They have no measurement category for "sustained multi-step autonomous execution" as a risk dimension.

The ASL abandonment in Anthropic's own system card (event index 6) formalises this: capability thresholds cannot capture emergent attack-chaining because the capability arises from combining individually non-dangerous steps. This is the same structural challenge that makes nuclear non-proliferation frameworks inadequate for dual-use biotechnology: the dangerous capability is not in any single component.

First Reported In

Update #6 · Three federal surveys, one 34-to-1 gap

UK AI Security Institute (via Results Sense) · 16 Apr 2026
Different Perspectives
Directors Guild of America
The DGA opened AMPTP talks on 12 May seeking AI training-use royalties that SAG-AFTRA and the WGA both settled without winning. France's SACD and European creative unions watch the DGA outcome as the US template for their own pending AI-training royalty negotiations with streaming platforms.
German IG Metall and European trade unions
German unions led by IG Metall have pushed for binding co-determination rights on AI deployment since 2024; the Digital Omnibus literacy-duty weakening directly undercuts their model, which depends on a statutory information floor before works councils can challenge AI systems affecting members.
Chinese Ministry of Human Resources (MOHRSS)
China's MOHRSS recognised 42 new AI occupations in April 2026 while Hangzhou courts upheld bans on AI-driven dismissal without retraining under the Labour Contract Law. Beijing's regulatory posture contrasts directly with Colorado's retreat: Chinese courts are adding employment liability for AI-driven redundancy while US courts remove state-level AI worker protection.
UK workers and Bank of England
The ONS May 2026 bulletin showed payrolled employment down 210,000 year on year with no AI-specific breakdown, while the Bank of England's stress scenario used 500,000 additional unemployed as its AI-displacement worst case. UK workers are approaching that threshold through a dataset that cannot name its own cause.
India's IT sector workforce and NASSCOM
NASSCOM's FY2026 data shows India's sector at 5.9 million while entry-level hiring fell 20 to 25%. GCC expansion by JPMorgan, Goldman Sachs and Apple benefits mid-career workers while closing the graduate entry pathway, replicating the under-25 displacement the NY Fed documented in US AI-exposed occupations.
European Parliament and Council (Digital Omnibus)
The Digital Omnibus trilogue concession on AI-literacy duties reflects the Draghi report's argument that compliance overhead suppresses EU AI adoption. The Council traded the binding literacy mechanism for employer flexibility, leaving the December 2027 high-risk employment deadline without the worker-facing transparency layer Parliament had built around it.