AI: Jobs, Power & Money
4APR

GPT-5.5 clears 32-step attack chain; two models in five days

4 min read
20:44UTC

The UK AI Safety Institute confirmed on 6 May 2026 that GPT-5.5 cleared the 32-step autonomous cyber attack chain benchmark, becoming the second model to do so after Claude Mythos, with AISI's Frontier AI Trends Report recording frontier cyber capability doubling every four months.

Economic · Developing
Key takeaway

GPT-5.5 and Claude Mythos cleared the 32-step attack chain within five days of each other, making it a class-level capability rather than a single-model one.

The UK AI Safety Institute (AISI), the UK government body established in November 2023 to evaluate frontier AI capabilities, published its Frontier AI Trends Report on 6 May 2026. The report confirms that GPT-5.5, OpenAI's frontier language model, cleared the 32-step autonomous cyber attack chain benchmark known as "The Last Ones" (TLO). GPT-5.5 scored 71.4% on the expert cyber suite and solved the TLO benchmark end-to-end in 2 of 10 attempts. It is the second model to clear the benchmark; Anthropic's Claude Mythos cleared the same threshold on 1 May 2026.

AISI's report assesses frontier cyber capability as doubling every four months. The 32-step attack chain benchmark triggered an emergency convening of five Wall Street bank CEOs by Treasury Secretary Bessent and Fed Chair Powell in April 2026, after Claude Mythos became the first model to clear it. GPT-5.5 repeated the result on 6 May, five days after Mythos cleared the threshold on 1 May.

Two separate frontier models clearing the same benchmark within five days changes the nature of the risk assessment. A single-model capability that is deliberately kept under restricted access (as Mythos Preview was, through Project Glasswing with access limited to twelve partners) can be managed through deployment controls. A benchmark cleared by two models from different organisations within five days of each other is no longer a single-deployment governance question; it is a class of capability that has arrived across the frontier.

AISI's four-month doubling rate, if it holds, means the next iteration of the same benchmark class will be cleared by a broader set of models within months. AISI's evaluation function, confirming which models have crossed which thresholds, is doing work that no other institution with formal standing has yet stepped in to perform. The question AISI's report leaves open is whether that evaluation will inform regulatory action in any jurisdiction quickly enough to keep pace with the doubling rate.
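As a rough illustration of what a four-month doubling period compounds to, the sketch below computes the implied multiplier over a given horizon. It assumes smooth exponential growth on a single scalar measure, which is a simplification for illustration and not something the report is stated to claim; the function name and default parameter are hypothetical.

```python
def capability_multiplier(months: float, doubling_period_months: float = 4.0) -> float:
    """Growth factor after `months`, assuming one doubling per period.

    Assumption: growth is smoothly exponential. The four-month period is the
    figure quoted from AISI; the scalar "capability" measure is illustrative.
    """
    return 2.0 ** (months / doubling_period_months)


if __name__ == "__main__":
    # Print the implied multiplier at a few horizons.
    for months in (4, 8, 12, 18):
        print(f"{months:>2} months: x{capability_multiplier(months):.1f}")
```

Under these assumptions, 12 months works out to a factor of 8 and 18 months to roughly 22.6, which is why a sustained doubling rate shortens planning horizons so sharply.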

Deep Analysis

In plain English

On 6 May 2026, the UK AI Safety Institute published a report confirming that GPT-5.5, an AI model from OpenAI, had passed a difficult cyber security test. The test involves completing 32 consecutive steps in a simulated computer attack without human help. Only one other model, Anthropic's Claude Mythos, had passed this test before, and it did so just five days earlier. The report also said that frontier AI models are getting twice as capable at cyber attacks every four months. This matters because cybersecurity has been one of the sectors least affected by AI job cuts: the technology was not yet reliable enough to replace human security analysts. A confirmed four-month capability doubling, if it continues, would overturn that assumption within 12-18 months.

Root Causes

The relevance to the AI-jobs topic lies in a labour market implication of the four-month capability doubling, one that the body does not name explicitly. Cybersecurity is one of the few technology employment sectors that has remained at or near full employment throughout the 2025-2026 restructuring cycle: demand for human cyber analysts has been resilient because autonomous cyber capability has stayed below the operational reliability threshold.

The AISI confirmation that two models have now cleared the 32-step benchmark changes the supply side of the cybersecurity labour market: if frontier AI models can autonomously execute 32-step attack chains at a 20% success rate today, security operations centres will face pressure to reduce analyst headcount as that rate improves toward operational reliability.
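To make the gap between a 20% per-attempt rate and "operational reliability" concrete, the sketch below computes the chance of at least one end-to-end success across repeated attempts. It takes the 2-of-10 result as a 20% per-attempt probability and assumes attempts are independent; both are illustrative assumptions rather than figures or methods from the AISI report, and the function name is hypothetical.

```python
def p_at_least_one_success(per_attempt: float, attempts: int) -> float:
    """Probability of one or more successes in `attempts` tries.

    Assumption: attempts are independent with a fixed success probability.
    """
    if not 0.0 <= per_attempt <= 1.0:
        raise ValueError("per_attempt must be between 0 and 1")
    if attempts < 0:
        raise ValueError("attempts must be non-negative")
    return 1.0 - (1.0 - per_attempt) ** attempts


if __name__ == "__main__":
    # 0.2 is the 2-of-10 rate quoted above, treated as a per-attempt probability.
    for n in (1, 5, 10):
        print(f"{n:>2} attempts: {p_at_least_one_success(0.2, n):.1%}")
```

Under these assumptions, ten attempts give roughly an 89% chance of at least one success, so a low per-attempt rate matters less when retries are cheap.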

Lloyd's of London cyber underwriters base pricing partly on the probability of a successful autonomous attack on insured infrastructure, which makes the AISI benchmark confirmation an underwriting input. A confirmed four-month doubling of frontier cyber capability pushes coverage prices upward, affecting every enterprise that carries cyber insurance.

What could happen next?
  • Risk

    Security operations centre headcount justifications at major banks and critical infrastructure operators begin to face the same AI displacement pressure as software engineering if the four-month doubling sustains through 2026.

    Medium term · 0.6
  • Consequence

    Cyber insurance pricing must incorporate the AISI-confirmed capability doubling as a loss probability input; Lloyd's and specialist cyber underwriters will reprice 2027 coverage upward.

    Short term · 0.7
  • Risk

    The absence of a second interagency convening following GPT-5.5's clearance, five days after the first model cleared the same benchmark, suggests the institutional alarm response is not scaling with the capability curve.

    Immediate · 0.65
First Reported In

Update #9 · GitLab signs the manifesto, Brussels backs out

Bloomberg / CFO Dive· 15 May 2026
Causes and effects
This Event
GPT-5.5 clears 32-step attack chain; two models in five days
Two separate frontier models clearing the same 32-step attack chain within five days of each other confirms the benchmark is no longer a single-model capability threshold, removing the assumption that frontier cyber risk remains isolated to a single controlled deployment.
Different Perspectives
TSMC and Taiwan chip supply chain
Nvidia's 17% headcount growth to 42,000 on $81.6 billion in quarterly revenue depends on TSMC's CoWoS advanced packaging capacity constraining H100 and B200 supply, sustaining margins above 70%. The AI build-out's sole headcount-growth story runs through a Taiwan supply chain that has no parallel in downstream software.
Displaced tech workers globally
CrowdStrike's SEC disclosure puts AI attribution on a material regulatory record for the first time, but Oracle's Massachusetts WARN clock expired unfiled after up to 14 workers were logged as remote despite office proximity. The legal apparatus cannot enforce what it cannot see: hybrid reclassification, GCC transfers, and hires never made.
UK workforce and policymakers
ONS recorded UK vacancies at 705,000, below the pre-pandemic baseline for the first time, as payrolled employment fell 210,000 year on year with real wage growth at 0.1%. The Bank of England's AI worst case assumed 500,000 additional unemployed from a baseline above 730,000; the UK is already below that floor, and ONS still publishes no AI-exposure breakdown.
India IT workforce and graduates
NASSCOM's FY2026 data shows net sector growth of 140,000, but entry-level hiring fell 20-25% as the growth concentrated in in-house GCC offices requiring mid-career specialists. Indian graduates who previously entered through TCS, Infosys and Wipro fresher programmes find that channel closing at both ends: outsourcers cutting and GCCs not hiring at the junior level.
IG Metall and European trade unions
European labour bodies see the market reward pattern, cuts on record revenue, as investor preference for short-term margin extraction over validated AI productivity. They note the EU Digital Omnibus provisional deal has dropped binding employer AI-literacy obligations at the precise moment the ILO-NASK index has quantified that 3.3% of global workers are in the highest AI exposure category.
Federal Reserve Board
Governor Cook told Stanford's SIEPR on 27 May that speculative-grade software bond spreads have widened on AI-disruption concern, moving AI displacement from a labour observation into the Fed's financial-stability mandate. The Fed cannot resolve structural labour transformation through rate policy, so Cook routed the concern through the one channel the Fed does control.