Project Glasswing: Anthropic's $100M AI Security Play

Apr 7, 2026

I read the CVE on a Tuesday morning, in bed, before coffee, which is the wrong way to read anything that matters. It was the disclosure that, three days later, would convince me Project Glasswing was the most important industry program of the year.

The bug was a memory corruption flaw in the OpenBSD IPv6 fragment reassembly path. The kind of thing that lives in a kernel header file with comments dated to the Clinton administration. The kind of code that has been compiled, recompiled, audited, fuzzed, re-audited, ported, hardened, and then re-audited again by a generation of people who cared about doing things right. The bug had been sitting there since 1999.

Twenty-seven years.

I read the disclosure twice. Then I put my phone down on the duvet and stared at the ceiling. Because the thing that found it wasn't a person. It was Mythos, walking through the kernel one function at a time, asking questions that no human had thought to ask in a quarter century. It noticed what 27 years of human reviewers missed. And it didn't just notice. It wrote the patch. It wrote a proof-of-concept. It explained, step by step, how someone with bad intentions could chain the flaw into a remote root.

I have been writing about security tooling for a long time, including the Moltbook breach that showed what happens when AI-generated code ships without review. I have read a lot of scary disclosures. This one made me get up and take a walk.

That walk is, I think, the reason Project Glasswing exists.

What Project Glasswing actually is

Project Glasswing is the industry initiative Anthropic announced on 2026-04-07. The headline number is $100 million in Claude and Mythos compute credits, earmarked specifically for security research on critical infrastructure software. The partner list reads like a who's who of platforms you probably touched in the last hour: Apple, Google, Microsoft, Nvidia, Cisco, MITRE.

Two tracks. The first is open-source maintainer credits: any maintainer of a CVE-tracked project can apply for a budget to run deep audits on their own codebase, with no obligation other than responsible disclosure. The second is enterprise deep audits, where Glasswing partners commit to running their internal codebases through structured audit pipelines and contributing the methodology back to the commons.

The framing in Anthropic's announcement is unusually careful. They are not calling Mythos an oracle. They are calling it a microscope. The metaphor is the Glasswing butterfly, whose wings are transparent. You can see through it. That is the project's stated goal: make transparent the parts of our software stack that everyone has been politely refusing to look at.

I want to take that seriously. And I also want to take seriously what $100M actually buys you, who is on the partner list, who is not, and what the second-order effects are going to feel like for everyone working in or around security. So let me try.

The Mythos discovery that started the conversation

Let me walk through the OpenBSD finding, because I think it is illustrative even if some of the specifics in the public narrative are still being clarified.

The bug lived in the IPv6 fragment reassembly path. When an IPv6 packet is too big for the path to its destination, the sending host splits it into fragments, each carrying a piece of the original payload and instructions for how to put them back together. The reassembly code is famously gnarly, because it has to handle malicious or malformed inputs without crashing the kernel, and the state machine is full of edge cases involving overlapping fragments, out-of-order arrivals, and timeouts.

The 1999 bug was a length-confusion in the path that handles overlapping fragment offsets. Under a specific sequence of crafted packets, the reassembly buffer could be made to write a small number of bytes outside its allocated bounds. Small enough to evade most fuzzers. Big enough, with patience and a heap massage, to corrupt a function pointer.
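To make that class of bug concrete, here is a toy reassembler in Python. It is a sketch of the general defensive technique, not OpenBSD's code and not the actual flaw: the `Fragment` type, the `reassemble` function, and the 65,535-byte cap are names and values I made up for illustration, and offsets are plain bytes rather than the 8-octet units real IPv6 uses. It follows the strict line taken by RFC 5722, which tells implementations to discard the whole datagram when fragments overlap, and it validates every offset and length before copying a single byte.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    offset: int   # byte offset of this piece in the original payload (toy units)
    data: bytes   # the piece itself
    more: bool    # True if further fragments follow (the "M" flag)


class ReassemblyError(Exception):
    """Raised when a fragment set must be discarded."""


def reassemble(fragments, max_len=65535):
    """Validate every fragment first, then copy. Hypothetical helper."""
    fragments = list(fragments)
    covered = []       # half-open (start, end) ranges accepted so far
    total_len = None   # known once the final fragment (more=False) is seen

    for frag in fragments:
        start, end = frag.offset, frag.offset + len(frag.data)
        # Bounds check: never trust an offset or length taken from the wire.
        if start < 0 or end > max_len or not frag.data:
            raise ReassemblyError("fragment outside permitted bounds")
        # Overlap check: any intersection with an accepted range is fatal.
        for s, e in covered:
            if start < e and s < end:
                raise ReassemblyError("overlapping fragments")
        if not frag.more:
            if total_len is not None:
                raise ReassemblyError("more than one final fragment")
            total_len = end
        covered.append((start, end))

    if total_len is None:
        raise ReassemblyError("final fragment missing")
    if any(e > total_len for _, e in covered):
        raise ReassemblyError("data past the final fragment")

    # Copy pass: every range is now known to sit inside the buffer.
    buf = bytearray(total_len)
    filled = 0
    for frag in fragments:
        buf[frag.offset:frag.offset + len(frag.data)] = frag.data
        filled += len(frag.data)
    if filled != total_len:
        raise ReassemblyError("gaps in reassembled payload")
    return bytes(buf)
```

The design choice worth noticing is the ordering: all validation happens in one pass and all copying in a second, so a crafted offset never reaches a write. C gives a kernel no runtime bounds checks, which is why a length confusion there corrupts memory instead of raising an exception.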

A real human, Theo de Raadt-grade real, looked at this code in 2003. Then in 2008. Then twice in 2014. Then again in 2021. The function passed every audit. It was reviewed by people with a religious commitment to correctness. Nobody saw it.

Mythos did three things that I keep thinking about.

First, it read the entire surrounding subsystem with full context, not just the function. It built a mental model of who calls this, with what assumptions, and under what reachable input. Human reviewers tend to look at code in chunks. The model just held all of it at once.

Second, it generated a working patch. Not a sketch. A patch that compiled, passed the test suite, and matched OpenBSD's notoriously strict style conventions on the first try.

Third, and this is the part that made me put my phone down, it generated a proof-of-concept exploit. With a written explanation of the misuse chain. And it correctly identified that the bug could be reached over the public internet on default kernel configurations.

That last fact reframes everything. The economics of vulnerability research used to work because there were not enough qualified humans on Earth to audit all the critical software in the world. Mythos suggests a near-future where there is essentially infinite review capacity, priced in compute. The bottleneck shifts. The question is not "who has time to look" anymore. The question becomes "who has the patch pipeline to ship the fix before the exploit ships."

Why $100M is the right number

I want to do napkin math here, because the number is not arbitrary and the math actually works out.

A serious deep audit of a critical codebase on Mythos runs somewhere between $15,000 and $30,000 in compute, depending on codebase size, language complexity, and the depth of the symbolic execution passes. Call it $20k as a round middle.

$100 million divided by $20k per audit is 5,000 audits. Even if you assume the harder targets pull the average up to $30k, you are still looking at roughly 3,300 deep audits. Either way, you are in the same order of magnitude.

How big is the universe of "critical infrastructure software" we should care about? CISA's catalog plus the major language runtimes plus the foundational OS kernels plus the top 100 networking devices plus the top 50 industrial control systems plus the dominant browser engines plus the major web frameworks comes out to, generously, a few thousand distinct targets.

So $100M covers most of it, with room left over for re-audits on the targets that update the most. The math is not wishful. The math actually works.
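If you want to rerun the napkin math with your own numbers, a few lines of Python will do it. The budget and the per-audit costs come from the figures above; `ASSUMED_TARGETS` is my own placeholder for "a few thousand" and is not a number from the announcement.

```python
BUDGET_USD = 100_000_000

# Hypothetical reading of "a few thousand distinct targets".
ASSUMED_TARGETS = 3_000


def audits_affordable(budget_usd: int, cost_per_audit_usd: int) -> int:
    """Whole audits the budget pays for at a flat per-audit cost."""
    return budget_usd // cost_per_audit_usd


def coverage(budget_usd: int, cost_per_audit_usd: int, targets: int) -> float:
    """Average number of audits available per target."""
    return audits_affordable(budget_usd, cost_per_audit_usd) / targets


for cost in (15_000, 20_000, 30_000):
    n = audits_affordable(BUDGET_USD, cost)
    ratio = coverage(BUDGET_USD, cost, ASSUMED_TARGETS)
    print(f"${cost:,} per audit: {n:,} audits, {ratio:.1f} per target")
```

At $20k the budget pays for 5,000 audits, and at $30k for 3,333. Change `ASSUMED_TARGETS` to whatever you think the real universe is and see whether the conclusion still holds.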

That is what makes the announcement different from the usual splashy corporate philanthropy. Most security funding gets dribbled out in small grants that fund nothing serious. Glasswing's number is calibrated to do an actual job.

The partner list and what it signals

Let me walk through who showed up, because the partner list is the message.

Apple. Kernel and Secure Enclave codepaths. Apple has historically been the most secretive of the big platform vendors about its internals, and the Secure Enclave is one of the most consequential pieces of code in consumer hardware. If they are willing to put it under a Mythos microscope, even a privately scoped one, that is a real cultural shift.

Google. V8, Android kernel, gVisor. V8 is the engine inside billions of browsers. The Android kernel forks live on roughly three billion devices. gVisor is the sandboxing layer for vast swaths of cloud workloads. This is essentially the entire web's runtime substrate.

Microsoft. Windows kernel and Azure. The Windows kernel is the most-targeted piece of software in human history. Azure is where a frankly terrifying portion of the world's critical business workloads now run.

Nvidia. GPU driver stack. This is the one that no one talks about and everyone should. The Nvidia driver is enormous, hugely privileged, runs on an unbelievable number of devices, and has historically received a fraction of the audit attention of, say, the Linux kernel. It is the unglamorous attack surface that nobody audits because finding bugs there is not prestigious. It will be the surface where Mythos finds the most surprising stuff.

Cisco. Networking gear firmware. This is where the boring zero-days live. The kind of bug that quietly sits in a router or a switch and gets used in supply chain attacks for years. The 2024 telecom intrusions were a wake-up call. Glasswing including Cisco says: we are going to look at the boxes too.

MITRE. Not a vendor. They bring the CVE rails, the disclosure rigor, the responsible coordination playbook. Without MITRE, this whole project would be a flood of unstructured findings nobody could triage. With MITRE, the findings get IDs, owners, severity scores, and a clock.

Notably absent: Meta and Amazon.

I want to be careful here, because I do not actually know why. Let me speculate, clearly labeled as speculation. Meta runs an enormous PyTorch and infrastructure surface, and they have their own well-funded internal red team; they may want to keep that work in-house, or they may be in the middle of negotiating their own arrangement. Amazon is the more conspicuous absence. AWS underpins so much of the world's compute that their non-participation feels like a story. Maybe a competitive reluctance to share security telemetry with peers. Maybe a preference for their own Bedrock-flavored solution. Maybe just slow procurement. I genuinely do not know. But it is worth watching.

What this means for the industry

I want to try to say what is actually going to feel different over the next 18 months, because the change is going to be real and uneven.

Vulnerability discovery becomes a compute problem, not a labor problem. The thing that used to bottleneck security research, smart human attention, gets cheaper by orders of magnitude. The new bottleneck moves elsewhere. It moves to triage, to patching, to coordination, to disclosure timelines. If you are a security team lead, your hiring plan probably needs to shift away from "more researchers" and toward "more patch engineers and coordination people."

Bug bounties reshape. Platforms like HackerOne and Bugcrowd are going to feel the pressure first. The base-rate work of finding off-by-ones and missing input validation is going to get hoovered up by automated systems. Human researchers move up the stack: creative chains, novel attack classes, business logic flaws, and the long tail of integration bugs that require holding multiple systems in your head. The pay scale reshuffles. The top tier earns more. The middle tier thins out. The bottom tier disappears entirely. This is uncomfortable and I do not love saying it.

Patching is the new bottleneck. Bug discovery is about to accelerate faster than patch shipping. We already have a patching latency problem, where critical CVEs sit unpatched in production for months. Glasswing is going to flood the disclosure pipeline. Vendors that cannot ship patches in 30 days are going to look very exposed very quickly.

The race between offensive and defensive AI gets weirder fast. Mythos can find bugs, write patches, and write exploits. The same capability that powers the defense powers the offense. Glasswing is implicitly betting that defenders move faster than attackers because defenders have legal access to the source. That bet is reasonable. It is not certain.

If you want a deeper treatment of how this affects day-to-day building habits, I have written about security rules for vibe-coded apps, about how auto-mode workflows need to be paired with sandboxing and human review, and about why you should block AI agents from merging PRs without human sign-off. All of those become more important, not less, in a world where the models are this capable.

What I'm watching for that would make me wrong

Let me put the skeptic hat on, because I would rather be honest now than retroactively wise.

False positive rate. Mythos finds bugs. How many are real? In the OpenBSD case, the answer was "all of them," but that was one disclosure. At scale, even a 5% false positive rate would drown maintainer inboxes in noise. The partner list contains the most resourced security teams on Earth. The open-source maintainer track does not. We are about to find out whether maintainer triage capacity scales with discovery capacity. My guess is that it does not, in the short term.
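The triage worry is easy to put into numbers. The sketch below is only arithmetic: the report volumes and the hours-per-report figure are made up for illustration, and `false_share` here means the fraction of submitted reports that turn out not to be real bugs.

```python
def triage_load(reports: int, false_share: float, hours_per_report: float):
    """Return (false_reports, wasted_hours) for a batch of reports.

    All inputs are hypothetical planning numbers, not program data.
    """
    if not 0.0 <= false_share <= 1.0:
        raise ValueError("false_share must be between 0 and 1")
    false_reports = round(reports * false_share)
    wasted_hours = false_reports * hours_per_report
    return false_reports, wasted_hours


# Illustrative volumes only: the same 5% share at three scales.
for reports in (100, 1_000, 10_000):
    bad, hours = triage_load(reports, 0.05, 4.0)
    print(f"{reports:,} reports: {bad:,} false, {hours:,.0f} hours lost")
```

The share stays constant while the wasted hours grow linearly with volume, and a volunteer maintainer has to triage every report before knowing which ones were noise.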

The patch-the-patch problem. AI-generated patches that introduce new bugs. The OpenBSD finding included a patch that was reportedly clean on the first try. That is not going to be the norm. There is going to be a class of fixes that look correct, pass the test suite, ship to production, and create a subtler bug downstream. Catching those will require new methodology, perhaps hooks that automatically run verification passes on every generated patch. Possibly more Mythos. The recursion gets dizzying.

Disclosure norms. The standard 90-day window for responsible disclosure was designed for a world where finding the bug took weeks of human attention. In a Glasswing world, finding takes hours. Does enterprise audit get the same 90-day clock as open-source disclosure? If Apple finds an iOS bug through their internal Glasswing budget, does the disclosure clock start when Mythos surfaces it or when the engineering team confirms it? These are genuinely unsettled questions and I have not seen a clear answer in the announcement.
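How much the starting point matters is simple date arithmetic. In this sketch the two dates are invented, and `disclosure_deadline` is a hypothetical helper, not anything defined by the program.

```python
from datetime import date, timedelta

DISCLOSURE_WINDOW = timedelta(days=90)


def disclosure_deadline(clock_start: date,
                        window: timedelta = DISCLOSURE_WINDOW) -> date:
    """Date the disclosure window closes, given when the clock started."""
    return clock_start + window


# Hypothetical timeline: the model flags a bug, engineers confirm it later.
surfaced = date(2026, 4, 7)
confirmed = date(2026, 4, 28)

by_surfaced = disclosure_deadline(surfaced)
by_confirmed = disclosure_deadline(confirmed)

print("clock starts at surfacing:   ", by_surfaced.isoformat())
print("clock starts at confirmation:", by_confirmed.isoformat())
print("difference in days:          ", (by_confirmed - by_surfaced).days)
```

Every day of confirmation lag is a day added to the public deadline under the second rule, which is why the choice of starting event is a policy decision and not a technicality.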

The equity question. Most of the world's critical software is maintained by people in countries where $20k of audit credit is a meaningful fraction of an annual budget. Who gets the credits? Who decides? If the allocation skews toward English-speaking projects with established Anthropic relationships, Glasswing will reproduce the existing inequities of open-source funding rather than correcting them. The application portal language, the time-zone distribution of the review committee, the documentation requirements, all of these are quiet decisions that will determine whether the program is genuinely global or just nominally so.

I want Glasswing to work. I am going to be watching for the things that would tell me it is not working, and I would rather be the person who flagged them early.

What you should do this week

Three concrete things, depending on what role you are sitting in.

If you maintain open source: apply for the credits. The portal opened with the announcement. The application is short. Even if you are skeptical of AI-assisted audit, having a budget you control means you get to set the terms of the engagement, decide what gets audited, and choose how findings are handled. Do not let other people decide that for you.

If you run security at a company: audit your boring services first. The Nvidia driver in your fleet. The Cisco firmware on your edge devices. The print spooler. The PDF library nobody has touched in a decade. The places where you have been quietly assuming "well, nobody is going to look at that." Mythos is about to look at that. You want to look at it first. While you're rethinking your stack, our 50 Claude Code tips post has a few patterns for setting up audit-friendly local workflows.

If you are a security researcher: this is not the end of your job, it is the upgrade. The work that remains is the creative work, the kind of multi-agent code review that combines human judgment with AI thoroughness. Chained vulnerabilities, novel categories, the social-engineering-and-code seam, the supply chain attacks that span humans and machines and contracts. The boring drudgery that used to fill your hours is going to evaporate. What is left is the fun part. Lean into it.

Closing

When I stopped staring at the ceiling and got out of bed that Tuesday, I made a list of every codebase I rely on every day. Then I started crossing them off in my head, asking how many had ever been audited at the depth Mythos just demonstrated on OpenBSD.

The list was uncomfortably long and the answer was, for almost all of them, not really.

That is the world Glasswing is trying to fix. It is not going to fix it cleanly or quickly or fairly, and the second-order effects are going to make a lot of people's jobs uncomfortable in a lot of unpredictable ways. But the alternative is a world where defenders are still staring at their feet while the same model class is running offense for someone with worse intentions.

I would rather have the microscope and the conversation about how to use it than not have the microscope and pretend that the bugs are not there. If you are building with AI agents and want to understand what vibe coding actually is, the security implications are part of the picture now.

The story is still unfolding. I will keep reading the disclosures. You should too. We can compare notes.
