The most “moral” of all moral companies, Anthropic, may not be able to stop building capabilities that look very much like weapons. This time it is cyber weapons.
In the latest AI Realist article, I go through the Mythos system card. At 245 pages, it inevitably contains a bit too much information. When you read between the lines as an AI expert, you can start to infer what might actually be happening.
In this article, I discuss the kind of environment Mythos may have been trained in, and why I believe its “too dangerous to release” capabilities are not unexpected emergent properties but appear to have been shaped by that training environment.
I argue that Mythos could be seen as a cyber weapon, one that the vast majority of cybersecurity departments are not prepared for. Many are still busy cataloguing AI agents and writing agentic governance guidelines instead of building real defenses.
If this interpretation is even partially correct, then restricting access to the model may not improve safety. If I can form a view on how such a training environment might be set up from reading their own document, others are likely already building Mythos-class systems.
Disclaimer: This analysis is based on publicly available materials and reflects my interpretation, not confirmed internal details.