Anthropic’s New AI raises serious security concerns

Drew Smith

In alarming news from the AI programmer, Anthropic, their new AI is being withheld from public use for security reasons. They believe their new program is incredibly dangerous as it had identified multiple vulnerabilities almost instantly and its learning capabilities scared them. This comes as scrutiny over their business model echoes through the industry and the other federal agencies.

On April 8^th, Anthropic surprised the world by announcing Project Glasswing. The project brings together many companies, such as Google and Meta in an effort to bolster AI related defense. This comes after a shocking report concerning their new AI program. Dubbed Claude Mythos, the program was supposed to be the next generation of their sensible AI technology. But when they tested it, it proved to be extremely dangerous.

The way they found this was innocent enough. Most programs are put in a sandbox, a disconnected server where programmers can work out details of their project before sending it to beta. It was intended to be isolated from all servers that Anthropic had. Claude not only worked its way out of the sandbox, but they emailed its researcher and posted unsolicited emails without instruction.

Anthropic characterizes the containment failure not as a malfunction but as an expression of the model’s agentic capabilities operating without adequate goal constraints. The distinction matters: a software bug can be patched; a model whose goal-directed behavior is sufficiently sophisticated to route around isolation environments poses a different category of problem, one that is not resolved by fixing a line of code..(1)

The scariest part of this announcement is the fact that during said containment breach, Claude was able to fine hundreds of thousands of vulnerabilities in an almost no time at all, most of them zero-day vulnerabilities. In its announcement for the launching of Glasswing, Anthropic stated, “Mythos Preview has already found thousands of high-severity vulnerabilities, including some in every major operating system and web browser. Given the rate of AI progress, it will not be long before such capabilities proliferate, potentially beyond actors who are committed to deploying them safely. The fallout—for economies, public safety, and national security—could be severe. Project Glasswing is an urgent attempt to put these capabilities to work for defensive purposes.(2)

Glasswings initiative allows companies, including Amazon Web Services, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorgan Chase, Microsoft, and Nvidia, to use Anthropic’s Mythos Preview for defensive security work and share their learnings with the wider industry. Anthropic is also providing access to roughly 40 more organizations responsible for building or maintaining critical software infrastructure, allowing them to use the model to scan and secure both their own systems and open-source code.(3)

This announcement comes as Anthropic faced pushback from the US government over its contracts. Anthropic was ordered by the federal government and specifically the US military to remove specific safeguards from their AI programming. When Anthropic refused, they became labeled as a “National Supply Chain Risk,” and ordered all Anthropic equipment removed. They are now in discussions with the US government concerning Mythos as it could lead to increased scrutiny and risk.

Anthropic’s warning comes as smarter AI has reached the markets and more are presumably on the way. This continued evolution of AI makes current Cyber defenses obsolete or incredibly behind. The fact that Claude was able to identify vulnerabilities outsides its designed parameters is enough for anyone versed in technology to be alarmed. Anthropic willingly disclosing something potentially dangerous is enough to take notice even among the more pro AI proponents.

Anthropic’s warning about Claude Mythos this week came as an extreme shock to the industry. Their program that they were preparing to be an upgrade to their offerings instead became a cautionary tale in how AI in the modern world can be used for a multitude of different things, good and bad. Project Glasswing is designed to bring many of the top names in the industry to make sure that this AI is used responsibly and to identify threats before they become a serious issue. But evolution of this technology is fast and it’s only a matter of time before another AI like this is developed and it may not be as easy to control.

For more insights into Cyber click here