AI attack agents are accelerators, not autonomous weapons: the Anthropic attack

Pierluigi Paganini November 24, 2025

Why today’s AI attack agents boost human attackers but still fall far short of being truly autonomous weapons.

Anthropic recently published a report that sparked a lively debate about what AI agents can actually do during a cyberattack. The study shows an AI system, trained specifically for offensive tasks, handling 80–90% of the tactical workload in simulated operations. At first glance, this sounds like a giant leap toward autonomous cyber weapons, but the real story is more nuanced, and far less dramatic.

Anthropic’s agent excelled at one thing: speed. It generated scripts in seconds, tested known exploits with no fatigue, scanned configurations at scale, and built basic infrastructure faster than any analyst could. These tasks normally take hours or days, and the AI completed them almost instantly. It automated the “grunt work” that fills so much of an attacker’s time.

But the report also shows what the AI didn’t do. Human operators designed the attack, set objectives, structured the campaign, monitored results, and made every strategic decision. The model never decided whom to target, how far to escalate, or how to respond to unexpected defenses. It didn’t reason about risk, attribution, timing, or geopolitical consequences. Humans handled all of that.

So the attack was not autonomous. It was hybrid. The agent boosted human capability and made operations faster and more scalable, but it never acted as a weapon on its own. It amplified expertise; it did not replace it.

This distinction matters because public conversation often confuses “advanced automation” with “self-directed intelligence.” Training an AI system capable of automating a piece of an attack demands massive human and computational effort. Nothing about this process produces a model that “thinks” or “wants” anything. These systems operate through statistical pattern-matching on curated datasets, not through intention or understanding.

To train an agent like the one Anthropic describes, teams must first gather huge amounts of specialized data: attack logs, exploitation patterns, command sequences, infrastructure templates, configuration examples, and entire workflows. Then they need to clean, label, and structure all of it, a task that can consume months of expert work. Models do not know what matters; humans must teach them.

Only after this comes the expensive part: training runs on clusters of GPUs or TPUs, ongoing tuning, reinforcement learning from human feedback, and extensive safety evaluation. Engineers decide which behaviors to encourage or forbid, which outputs count as successes, and how the model should correct itself. Every step is guided by humans.

When the model finally runs, it can automate repetitive tasks, but it lacks the strategic intelligence needed to plan a campaign. It doesn’t pick targets, doesn’t weigh consequences, and doesn’t adapt its intent when the environment changes. All the creative and contextual elements of an operation remain outside its reach.

This gap explains why experts remain skeptical about calling these systems “weapons.” A technology becomes weapon-like when it delivers harm in a targeted, scalable, and repeatable way without requiring substantial additional expertise or judgment from the user. Getting a technology to that point demands engineering maturity, clear offensive intent, and deep human involvement in its design and deployment. Today’s AI agents do not meet these criteria.

AI currently acts as a force multiplier and an accelerant, not as a fully autonomous offensive platform. Attackers still need to conduct analysis, understand complex targets, manage infrastructure, adapt strategy, coordinate operations, and handle sensitive decisions like escalation or data exfiltration. Nothing in today’s models substitutes for experience, creativity, or responsibility.

The path ahead is clear: AI will continue to expand the speed and volume of technical operations. More tasks that once required skilled labor will become automatable. But full automation, from planning to exploitation to decision‑making, remains far beyond current capabilities.

For now, AI amplifies human attackers. It does not replace them, and it does not operate as a self-sufficient weapon.


Pierluigi Paganini

(SecurityAffairs – hacking, autonomous weapons)
