
Big Sleep

Google's AI-driven vulnerability-discovery framework that identifies flaws in software code.

Last refreshed: 20 May 2026

Key Question

Big Sleep found its first real-world bug the same week criminals used AI to write a working zero-day; who is winning?

Timeline for Big Sleep

#4 · 11 May
Common Questions
What is Google Big Sleep and what has it found so far?
Big Sleep is Google's AI-driven autonomous vulnerability-discovery framework. On 11 May 2026, GTIG confirmed that it had found its first previously unknown real-world bug, marking the transition from research tool to operational defensive asset. Source: Google Threat Intelligence Group
How does Google Big Sleep find vulnerabilities differently from traditional fuzzing?
Big Sleep uses an LLM agent to reason about code semantics and generate targeted hypotheses about failure modes, then synthesises test cases based on that reasoning. Traditional fuzzers generate large volumes of random inputs; Big Sleep directs its search using model-generated understanding of the code's intended behaviour. Source: Google Project Zero
How does Big Sleep relate to the AI-generated zero-day that GTIG documented in May 2026?
GTIG's 11 May 2026 report documented both: the first criminal AI-generated working zero-day on the offensive side, and Big Sleep's first real-world bug find on the defensive side. Google framed them as two sides of the same AI-enabled capability threshold, with Big Sleep as the defender's tool. Source: Google Threat Intelligence Group
What is the difference between Google Big Sleep and CodeMender?
Big Sleep is the discovery layer: it finds unknown vulnerabilities. CodeMender is the remediation layer: it automatically generates patches for discovered flaws. Together they form Google's two-part AI-driven defensive stack. Source: Google Threat Intelligence Group

Background

Big Sleep is Google's autonomous vulnerability-discovery framework, an AI agent built on top of Project Zero's tooling that uses large language models to identify previously unknown security flaws in real-world software. The Google Threat Intelligence Group (GTIG) report published on 11 May 2026 confirmed that Big Sleep had found its first previously unknown real-world bug, moving it from research proof of concept to operational defensive asset. The report explicitly framed Big Sleep as the defender's countermove against the AI-enabled offensive activity documented in the same publication: criminal clusters using LLMs to write working zero-days, and state actors using Gemini for exploit validation.

Google's Project Zero team introduced Big Sleep in late 2024 as a research vehicle for AI-directed fuzzing and vulnerability analysis. The framework runs iterative sessions in which an LLM agent examines code paths, generates hypotheses about failure modes, synthesises test cases, and evaluates crash output to determine exploitability. Unlike traditional fuzzing tools that emit high volumes of test inputs blindly, Big Sleep directs its search using model-generated reasoning about the code's intended semantics versus its actual behaviour.
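
The session loop described above (examine code, form hypotheses, synthesise test cases, evaluate the result) can be sketched in miniature. This is an illustrative toy under stated assumptions, not Big Sleep's implementation: `parse_record`, `propose_hypotheses` and `run_session` are hypothetical names, the target is a deliberately buggy function written for the example, and the LLM step is replaced by a hard-coded stub.

```python
from dataclasses import dataclass
from typing import Callable, List


def parse_record(data: bytes) -> bytes:
    """Hypothetical target: byte 0 declares the payload length."""
    if not data:
        return b""
    declared, payload = data[0], data[1:]
    out = bytearray()
    for i in range(declared):
        # Deliberate bug: the declared length is trusted and never checked
        # against len(payload), so an oversized value indexes out of range.
        out.append(payload[i])
    return bytes(out)


@dataclass
class Hypothesis:
    description: str         # the suspected failure mode, in plain language
    test_cases: List[bytes]  # inputs synthesised to confirm or refute it


@dataclass
class Finding:
    hypothesis: str
    test_case: bytes
    error: str


def propose_hypotheses() -> List[Hypothesis]:
    """Stand-in for the LLM step (assumption: a real agent would derive
    these from reading the target's source, not from a fixed list)."""
    return [
        Hypothesis("empty input is mishandled", [b""]),
        Hypothesis("declared length exceeds actual payload",
                   [bytes([4]) + b"ab", bytes([255])]),
    ]


def run_session(target: Callable[[bytes], bytes],
                hypotheses: List[Hypothesis]) -> List[Finding]:
    """Run every synthesised test case and record the ones that fail."""
    findings = []
    for hyp in hypotheses:
        for case in hyp.test_cases:
            try:
                target(case)
            except Exception as exc:  # an exception stands in for a crash
                findings.append(
                    Finding(hyp.description, case, type(exc).__name__))
    return findings


if __name__ == "__main__":
    for f in run_session(parse_record, propose_hypotheses()):
        print(f"{f.error}: {f.hypothesis} (input={f.test_case!r})")
```

The point of the sketch is the contrast with blind fuzzing: only a handful of inputs are run, each chosen because a hypothesis about the code predicts it should fail. Judging whether a confirmed failure is actually exploitable is left out entirely.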

Big Sleep operates as the discovery layer of a two-part defensive stack alongside CodeMender, Google's automated patching companion. Together they represent Google's attempt to close the window between vulnerability discovery and patch deployment, the same window that attackers measure in hours (LiteLLM CVE-2026-42208 was exploited within 36 hours of its addition to the Known Exploited Vulnerabilities (KEV) catalog). The 11 May GTIG report marks the first time Google has publicly positioned Big Sleep in an operational context against a named AI-assisted threat landscape, rather than presenting it as a standalone research tool.

Source Material