Anthropic's Most Powerful AI Is Also Its Most Dangerous — So It's Keeping It Under Lock and Key

Anthropic has unveiled what it describes as its most capable AI model ever built, but rather than putting it on the market, the company has quietly handed it to a closed coalition of technology giants and security researchers with strict instructions: use it to find and fix vulnerabilities before someone else uses it to cause serious harm.

The model is called Claude Mythos Preview, and it represents something genuinely new in the AI industry — not a product launch, but a controlled risk event.

Announced on April 7th, Mythos has benchmarks that dwarf anything Anthropic has previously released. It scores 93.9% on SWE-bench Verified, a demanding software engineering evaluation, 97.6% on the USAMO mathematics olympiad, and 83.1% on CyberGym, a dedicated cybersecurity capability assessment. Internally, before the announcement was accidentally leaked in a data security incident last month, Anthropic employees were calling it "by far the most powerful AI model we've ever developed" — a tier above its existing Opus models, which were until now the company's most sophisticated.

The reason Anthropic isn't selling it is straightforward: Mythos is extraordinarily good at breaking into computer systems.

During internal testing, the model demonstrated the ability to identify and exploit zero-day vulnerabilities across every major operating system and every major web browser. It found bugs that human security researchers had missed for years — in some cases, for decades. The oldest vulnerability Mythos uncovered was a 27-year-old flaw in OpenBSD, an operating system built from the ground up with security as its defining principle. The model did not just flag these issues. It wrote working exploits for them, including a browser attack that chained together four separate vulnerabilities and executed a complex heap spray capable of escaping both the renderer and operating system sandboxes.

"We've seen Mythos Preview accomplish things that a senior security researcher would be able to accomplish," said Logan Graham, Anthropic's frontier red team lead. "This has very big implications for how capabilities like this should be released. Done not carefully, this could be a meaningful accelerant for attackers."

In response, Anthropic built what it is calling Project Glasswing — a coordinated industry consortium modelled loosely on coordinated vulnerability disclosure, the practice where security researchers privately notify software developers about flaws and give them time to patch before going public. The logic here is the same, scaled to an entire industry. Partner organisations given access to Mythos Preview include Amazon, Apple, Broadcom, Cisco, CrowdStrike, the Linux Foundation, Microsoft and Palo Alto Networks. An additional forty organisations will receive controlled access. None of this will be publicly available.

The participation of Microsoft and Google in the consortium is itself notable. Both companies are Anthropic's rivals in the frontier AI race, yet both signed on as partners and offered public statements endorsing the approach. Google's vice president of security engineering, Heather Adkins, praised the "cross-industry cybersecurity initiative," while Microsoft's global CISO, Igor Tsyganskiy, called the moment "unprecedented" in its opportunity to reduce risk at scale.

The framing from Anthropic is deliberately forward-looking. The company is not claiming Mythos is uniquely dangerous in ways no future model will be. It is claiming the opposite: that models with these capabilities will eventually be everywhere, and that the industry has a narrow window now to build the defensive infrastructure before that happens.

"The real message is that this is not about the model or Anthropic," Graham told WIRED. "We need to prepare now for a world where these capabilities are broadly available in 6, 12, 24 months. Many things would be different about security. Many of the assumptions that we've built the modern security paradigms on might break."

Anthropic says Mythos has already identified thousands of zero-day vulnerabilities across critical open-source software during the weeks of controlled testing, the overwhelming majority of which remain unpatched. The company is disclosing none of the specifics publicly — doing so before patches are available would be precisely the kind of irresponsible disclosure Project Glasswing was designed to prevent.

The story of Mythos began, somewhat ironically, with a security failure. In late March, a draft blog post about the model — then internally codenamed "Capybara" — was left in an unsecured cache of documents on a publicly accessible data lake. Security researchers found it before Anthropic did. The company attributed the leak to human error, but the incident cast an uncomfortable shadow over a company now positioning itself as a leader in responsible AI security practices. It also came just days after Anthropic accidentally exposed nearly two thousand source code files and more than half a million lines of code through a mistake made during the rollout of a Claude Code software update.

Despite those stumbles, the ambition behind Project Glasswing is real and the stakes it implies are significant. For years the debate around advanced AI and cybersecurity has been largely theoretical — could a powerful model lower the barrier to exploitation? Mythos appears to answer that question empirically. It can. The question now is whether the industry can move fast enough to reinforce its defences before models like this become as ordinary as a search engine.

Anthropic says it expects the Glasswing partners will eventually share what they have learned with the broader technology industry. A wider public release of Mythos itself, however, is not being planned.

// LATEST INTELLIGENCE