Cybersecurity: Threats and Defences
20 MAY

GTIG names the first LLM-written working zero-day

3 min read
09:58 UTC

Google's Threat Intelligence Group documented the first criminal use of a Large Language Model to write a working zero-day, a Python 2FA bypass intercepted before mass deployment, alongside four AI-augmented threat clusters spanning DPRK-, PRC- and Russia-nexus operators.

Technology · Developing
Key takeaway

A regulator can now name a working LLM-written exploit by file, by actor, and by interception date.

Google's Threat Intelligence Group (GTIG) published an 11 May 2026 report documenting the first confirmed criminal-actor case of a working zero-day exploit written by a Large Language Model: a Python two-factor authentication bypass targeting a widely deployed web administration tool, intercepted before mass deployment [1][2]. Mandiant, the incident-response firm that Google acquired for $5.4 billion in 2022 and that now publishes attribution work under GTIG, co-authored the analysis.

The same report names four state-actor clusters by tradecraft. PROMPTSPY, an Android backdoor first surfaced by ESET in February 2026, is confirmed to use Google's Gemini API for autonomous device navigation, biometric capture, and on-device user-interface automation. UNC2814, a People's Republic of China (PRC)-nexus cluster, runs Gemini under a 'senior security auditor' persona for embedded-device code review. APT45, a Democratic People's Republic of Korea (DPRK)-nexus cluster, sends thousands of recursive prompts per session to validate proof-of-concept exploits against known CVEs. The Russia-nexus malware families CANFAIL and LONGSTREAM wrap their payloads in 32 or more LLM-generated benign queries to obscure malicious logic from static analysis.

The defensive track ran on the same date. Google's autonomous vulnerability-discovery agent Big Sleep found its first real-world unknown bug, and CodeMender began auto-patching critical code paths. The AI-augmented threat picture now sits alongside the multi-vector supply-chain pressure documented across the SAP, OpenVSX and PyPI compromises and the UNC1069 Axios npm intrusion. For regulators drafting AI-misuse provisions, the evidence base changes from hypothetical to concrete: GTIG's intercept gives them a named Python artefact, a named target tool, and a named LLM-generation event on which to anchor policy text.

Deep Analysis

In plain English

For the first time, security researchers at Google confirmed that a criminal group used an AI chatbot to write a working piece of malware from scratch, a computer program designed to bypass two-step login verification. Previous cases of AI being used in hacking had been assistive; this is the first confirmed case of the AI producing the working attack itself.

Root Causes

The convergence of three structural conditions enabled this threshold crossing: freely available frontier LLM access at zero marginal cost per query; open-source model fine-tuning that removes safety mitigations without requiring significant compute budget; and the absence of any vendor liability framework that would penalise an LLM provider for outputs used in downstream criminal activity.

The same GTIG report that documents offensive AI use also documents Google's defensive AI tools finding their first real-world vulnerability. Both tracks share the same underlying model capability. The structural asymmetry is that defenders operate within institutional constraints (responsible disclosure, patch timelines, and legal review) that attackers do not.

First Reported In

Update #4 · AI joins the breach column on both sides

Google Threat Intelligence Group · 20 May 2026
Different Perspectives
Tsinghua University Institute for International Strategic Studies
Beijing-aligned commentary rejects US attribution of PRC-nexus clusters (UNC2814, APT45, UAT-8616) as politically motivated framing, characterising the April sixteen-agency joint advisory as coordinated Western pressure rather than independent technical assessment.
Google Threat Intelligence Group
GTIG's 11 May report establishes AI-assisted offence and AI-infrastructure targeting as concurrent named-incident categories, not theoretical ones: UNC6780 attacked LiteLLM and Cisco AI Defense in parallel; state actors used Gemini operationally; CANFAIL and LONGSTREAM used LLM-generated queries to evade static analysis.
Cisco
Cisco has not confirmed the UNC6780 breach scope beyond the named AI Defense and AI Assistant projects; GitHub confirmed an investigation. CVE-2026-20182 is the sixth Cisco SD-WAN KEV entry in 2026, reaching that milestone the same week UNC6780's source-code visibility into the portfolio became public.
NCSC
The ICO's South Staffs Water fine applies NCSC PAM and monitoring guidance as the GDPR Article 32 enforcement baseline against a water-sector CNI operator, extending the Capita precedent before the CS&R Bill has reached Royal Assent. NCSC guidance now carries enforceable weight inside the existing statutory framework for CNI sectors processing personal data.
Microsoft Security Response Center
The Exchange Emergency Mitigation Service URL rewrite is the sole available mitigation for CVE-2026-42897; MSRC has not signalled an out-of-band patch timeline. The workaround breaks OWA calendar print, inline images, and Light mode, forcing CISOs to choose between user-experience breakage and active-exploitation exposure.
CISA
CISA's Exchange CVE-2026-42897 deadline of 29 May, set before Microsoft published a patch, repeats the PAN-OS posture from 6 May: exploitation velocity now overrides vendor release timelines. BOD 22-01 compliance against an unpatched flaw leaves federal CISOs with only mitigation documentation and mailbox-rule monitoring.