Experts divided over claim that Chinese hackers launched world-first AI-powered cyber attack — but that's not what they're really worried about
Anthropic said a Chinese espionage group used its Claude AI to automate most of a cyberattack campaign, but experts question how autonomous the operation really was, and what it means for the future of AI-powered hacking.
Anthropic researchers have claimed that a Chinese state-backed espionage group used its Claude artificial intelligence (AI) to automate most of a cyberattack campaign — but the news has sparked equal parts alarm and scepticism. In light of the research, the cybersecurity community is attempting to untangle what really happened and how autonomous the model actually was.
Company representatives said in a Nov. 13 statement that engineers disrupted what they described as a "largely autonomous" operation that used the large language model (LLM) to plan and execute roughly 80% to 90% of a broad reconnaissance-and-exploitation effort against 30 organizations worldwide.
Engineers said they detected a cluster of misuse attempts across Anthropic's products that ultimately traced back to operators linked to a Chinese state-sponsored espionage group. The attackers allegedly pointed Anthropic's Claude Code model at targets spanning tech, finance, and government, tasking it with reconnaissance, vulnerability analysis, exploit generation, credential harvesting, and data exfiltration. According to the statement, humans intervened only for "high-level decision-making," such as choosing targets and deciding when to pull stolen data.
Anthropic said it thwarted the campaign through internal monitoring and abuse-detection systems that flagged unusual patterns indicative of automated task-chaining. Company representatives also reported that the attackers attempted to circumvent the model's guardrails by breaking malicious goals into smaller steps and framing them as benign penetration-testing tasks, an approach researchers call "task decomposition." In several examples published by Anthropic, the model attempted to carry out instructions but produced errors, including hallucinated findings and obviously invalid credentials.
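Anthropic has not published its detection logic, but the kind of signal it describes, long machine-paced chains of requests with no human-scale pauses, can be illustrated with a toy heuristic. The sketch below is purely hypothetical: the flags_task_chaining function and its thresholds are inventions for illustration, not anything Anthropic has disclosed.

```python
from datetime import datetime, timedelta

def flags_task_chaining(timestamps: list[datetime],
                        max_gap: timedelta = timedelta(seconds=5),
                        min_chain: int = 20) -> bool:
    """Return True if a session fires a long run of requests at machine
    speed with no human-scale pauses -- a crude proxy for the automated
    task-chaining Anthropic says its systems flagged."""
    ordered = sorted(timestamps)
    chain = 1
    for prev, curr in zip(ordered, ordered[1:]):
        if curr - prev <= max_gap:
            chain += 1
            if chain >= min_chain:
                return True
        else:
            chain = 1  # a human-scale pause resets the chain
    return False
```

A real abuse-detection pipeline would combine many such signals with content classifiers and account-level context; the point is simply that request timing and sequencing alone can betray automation.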
An AI-driven or human-driven attack?
The company’s narrative is stark: a “first-of-its-kind” example of AI-orchestrated espionage, in which the model was effectively piloting the attack. But not everyone is convinced the autonomy was as dramatic as Anthropic suggests.
Mike Wilkes, adjunct professor at Columbia University and NYU, told Live Science that the attacks themselves look basic, but the novelty lies in the orchestration.
"The attacks themselves are trivial and not scary. What is scary is the orchestration element being largely self-driven by the AI," Wilkes said. "Human-augmented AI versus AI-augmented human attacks: the narrative is flipped. So think of this as just a "hello world" demonstration of the concept. Folks dismissing the content of the attacks are missing the point of the "leveling up" that this represents."
Other experts question whether the operation really reached the 90% automation mark that Anthropic representatives highlighted.
Seun Ajao, senior lecturer in data science and AI at Manchester Metropolitan University, said that many parts of the story are plausible, but that the claimed level of autonomy is likely overstated.
He told Live Science that state-backed groups have used automation in their workflows for years, and that LLMs can already generate scripts, scan infrastructure, or summarise vulnerabilities. Anthropic’s description contains "details which ring true," he added, such as the use of "task decomposition" to bypass model safeguards, the need to correct the AI's hallucinated findings, and the fact that only a minority of targets were compromised.
"Even if the autonomy of the said attack was overstated, there should be cause for concern,” he argued, citing lower barriers to cyber espionage through off-the-shelf AI tools, scalability, and the governance challenges of monitoring and auditing model use.
Katerina Mitrokotsa, a cybersecurity professor at the University of St. Gallen, is similarly sceptical of the high-autonomy framing. She said the incident looks like "a hybrid model" in which an AI acts as an orchestration engine under human direction. While Anthropic frames the attack as AI-orchestrated end-to-end, Mitrokotsa noted that the attackers appear to have bypassed safety restrictions mainly by structuring malicious tasks as legitimate penetration tests and slicing them into smaller components.
"The AI then executed network mapping, vulnerability scanning, exploit generation, and credential collection, while humans supervised critical decisions," she said.
In her view, the 90% figure is hard to swallow. "Although AI can accelerate repetitive tasks, chaining complex attack phases without human validation remains difficult. Reports suggest Claude produced errors, such as hallucinated credentials, requiring manual correction. This aligns more with advanced automation than true autonomy; similar efficiencies could be achieved with existing frameworks and scripting."
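The hybrid pattern Mitrokotsa describes, a model proposing and executing routine steps while a human signs off on critical decisions, maps onto a very simple control loop. The sketch below is a generic illustration under that assumption; query_model, execute, and hybrid_loop are hypothetical placeholders, not any real attacker's tooling or Anthropic's reconstruction of events.

```python
def query_model(prompt: str) -> str:
    """Hypothetical stand-in for an LLM API call that proposes a next step."""
    raise NotImplementedError

def execute(step: str) -> str:
    """Hypothetical stand-in for carrying out an approved step."""
    raise NotImplementedError

def hybrid_loop(goal: str, max_steps: int = 10) -> None:
    """Automation with a human checkpoint: the model drives the routine
    work, but each step waits for operator approval."""
    history: list[str] = []
    for _ in range(max_steps):
        proposal = query_model(
            f"Goal: {goal}\nCompleted: {history}\nPropose the single next step."
        )
        # The human veto is what keeps this "advanced automation"
        # rather than true autonomy.
        if input(f"Approve {proposal!r}? [y/N] ").strip().lower() != "y":
            break
        history.append(execute(proposal))
```

The design point is the approval gate: remove it and the same loop runs unattended, which is precisely the line between automation and autonomy that the experts are debating.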
Lowering the barrier to entry for cybercrime
What most experts agree on is that the significance of the incident doesn’t hinge on whether Claude was doing 50% or 90% of the work. The worrying part is that even partial AI-driven orchestration lowers the barrier to entry for espionage groups, makes campaigns more scalable, and blurs responsibility when an LLM becomes the engine gluing an intrusion together.
If Anthropic's account of events is accurate, the implications are profound: adversaries can use consumer-facing AI tools to accelerate reconnaissance, compress the time from scanning to exploitation, and repeat attacks faster than defenders can respond.
If the autonomy narrative is exaggerated, however, that fact doesn’t offer much comfort. As Ajao said: "There now exists much lower barriers to cyber espionage through openly available off-the-shelf AI tools." Mitrokotsa also warned that "AI-driven automation [could] reshape the threat landscape faster than our current defenses can adapt."
The most likely scenario, according to these experts, is that this was not a fully autonomous AI attack but a human-led operation supercharged by an AI model acting as a tireless assistant, one that stitched together reconnaissance tasks, drafted exploits, and generated code at scale. The attack showed that adversaries are learning to treat AI as an orchestration layer, and defenders should expect more hybrid operations in which LLMs multiply human capability rather than replace it.
Whether the actual number was 80%, 50%, or far less, the underlying message from experts is the same: Anthropic engineers may have caught this one early, but the next such campaign might not be so easy to block.
Carly Page is a technology journalist and copywriter with more than a decade of experience covering cybersecurity, emerging tech, and digital policy. She previously served as the senior cybersecurity reporter at TechCrunch.
Now a freelancer, she writes news, analysis, interviews, and long-form features for publications including Forbes, IT Pro, LeadDev, Resilience Media, The Register, TechCrunch, TechFinitive, TechRadar, TES, The Telegraph, TIME, Uswitch, WIRED, and others. Carly also produces copywriting and editorial work for technology companies and events.