Analysis: How dangerous is Claude Mythos really?

Anthropic’s newest and most capable AI model, Claude Mythos, can autonomously find and exploit security vulnerabilities in virtually all major software systems. As I wrote previously, Mythos is currently only available to a select group of technology companies through Project Glasswing, Anthropic’s initiative to patch critical software before the capabilities become more widely known. Zvi Mowshowitz reports in his newsletter “Don’t Worry About The Vase” on what makes these capabilities genuinely unprecedented and why criticism dismissing the threat as marketing hype misses the point.

The core claim is stark. Using only a basic automated setup and minimal human direction, Mythos can find previously unknown security flaws, called zero-days, in every major operating system and every major web browser. It can then turn those flaws into working attack tools. Engineers at Anthropic with no security background reportedly asked the model to find vulnerabilities overnight and woke up to complete, functional exploits.

One documented case involves a 17-year-old flaw in the FreeBSD operating system. Mythos identified and exploited it entirely without human involvement, after receiving only a general instruction to find bugs. The exploit granted full administrative control over affected servers to anyone on the internet. Another example involved chaining together four separate browser vulnerabilities to allow an attacker to read data across websites and ultimately write directly to an operating system’s core.

The figures in Anthropic’s internal testing are striking. When searching for and then exploiting discovered vulnerabilities, the older Claude Opus model succeeded less than one percent of the time. Mythos succeeded 72 percent of the time. That is not an incremental improvement. It is a functional change in what the technology can do.

Critics have pushed back, arguing that smaller and cheaper AI models can find the same vulnerabilities. Some tests appeared to show models with only a few billion parameters detecting the same flaws Mythos had found. Mowshowitz and others argue this framing is misleading. Those smaller models were handed the relevant code, often narrowed down to around 20 lines, and asked whether a bug existed. That is a fundamentally different task from scanning an entire software project from scratch, identifying where to look, and then building a working exploit with no guidance. The smaller models also produced large numbers of false positives, making them impractical for any broad search.

Anthropic itself notes that older models like Opus can find many serious vulnerabilities when pointed in the right direction. The difference with Mythos lies in what happens next. It can chain vulnerabilities into complex, functional attacks. That capability did not meaningfully exist before.

The question of whether the danger is real has effectively been answered by the companies involved. Microsoft, Google, and other major technology firms have joined Project Glasswing and publicly confirmed they are working with Anthropic to patch flaws Mythos has found. For those asking whether Anthropic might be exaggerating, Mowshowitz puts it plainly: that would require its direct competitors to participate in the deception.

Stay up to date

Related posts: