Google's Threat Intelligence Group (GTIG) published an 11 May 2026 report documenting the first confirmed criminal-actor case of a working zero-day exploit written by a Large Language Model: a Python two-factor authentication bypass targeting a widely deployed web administration tool, intercepted before mass deployment 1 2. Mandiant, the incident-response firm that Google acquired for $5.4 billion in 2022 and that now publishes its attribution work under GTIG, co-authored the analysis.
The same report names four state-actor clusters by tradecraft. PROMPTSPY, an Android backdoor first surfaced by ESET in February 2026, is confirmed to use Google's Gemini API for autonomous device navigation, biometric capture, and on-device user-interface automation. UNC2814, a People's Republic of China-nexus cluster, runs Gemini as a 'senior security auditor' persona for embedded-device code review. APT45, a North Korea-nexus cluster, sends thousands of recursive prompts per session to validate proof-of-concept exploits against known CVEs. Russia-nexus malware families CANFAIL and LONGSTREAM wrap their payloads in 32 or more LLM-generated benign queries to obscure malicious logic from static analysis.
The defensive track ran on the same date. Google's autonomous vulnerability-discovery agent Big Sleep found its first real-world unknown bug, and CodeMender began auto-patching critical code paths. The AI-augmented threat picture now sits alongside the multi-vector supply-chain pressure documented across the SAP, OpenVSX and PyPI compromises and the UNC1069 Axios npm intrusion. For regulators drafting AI-misuse provisions, the analytic shape changes: GTIG's intercept gives them a named Python artefact, a named target tool, and a named LLM-generation event on which to anchor policy text.
