Anthropic is blowing the whistle on stolen AI.
On Monday, the AI firm said that it identified three major Chinese AI labs — DeepSeek, Moonshot, and MiniMax — carrying out “industrial-scale” model distillation campaigns, attempting to exfiltrate Claude’s capabilities to enhance their own models.
Anthropic detected around 24,000 fraudulent accounts across the three companies, which generated more than 16 million illicit exchanges in an attempt to harvest Claude’s outputs and use them to train their own models.
- While distillation, or training a less capable model on the outputs of a more capable one, is a common machine learning technique, doing so under the table extracts the model’s abilities without the “necessary safeguards,” Anthropic said.
- Those safeguards prevent the models from being used maliciously, Anthropic said, such as to develop bioweapons or carry out cyberattacks.
- “If distilled models are open-sourced, this risk multiplies as these capabilities spread freely beyond any single government's control,” the company said.
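The technique at the center of this is standard machine learning. In classic distillation, a student model is trained to match the teacher's output probability distribution rather than hard labels. A minimal sketch of that soft-label matching, with illustrative function names and a temperature parameter as commonly used (this is a generic illustration, not any company's actual pipeline):

```python
import math

def softmax(logits, temperature=1.0):
    # Convert raw logits to probabilities; temperature > 1 softens
    # the distribution so the student sees richer "dark knowledge".
    exps = [math.exp(x / temperature) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    # KL divergence between the softened teacher and student
    # distributions: the student is penalized for diverging from
    # the teacher's output probabilities.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# A student that exactly reproduces the teacher incurs zero loss;
# training drives this loss down across many teacher outputs.
zero = distillation_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
gap = distillation_loss([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
```

The "16 million illicit exchanges" Anthropic describes map onto the data-collection side of this: each exchange yields teacher outputs that can serve as training targets for a student model.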
Anthropic laid out several tactics it’s using to stop these attacks, including detection techniques, intelligence sharing between AI labs, access controls and model-level countermeasures. However, the company said that there is a “narrow” window to act on this problem. “Addressing it will require rapid, coordinated action among industry players, policymakers, and the global AI community.”
“No company can solve this alone,” Anthropic added.
The firm is the third major AI company to call out this kind of attack this month: In mid-February, OpenAI and Google both highlighted the growing prevalence of model distillation attacks, and both likewise flagged the risks of unauthorized distillation carried out without proper safeguards.
OpenAI specifically called out DeepSeek using these techniques to “free-ride” on its models while circumventing its safety restrictions, while Google said it observed “private sector entities all over the world and researchers seeking to clone proprietary logic.”
These warnings also come as open-source Chinese AI skyrockets in popularity as an affordable alternative to proprietary model providers. Amid the growing demand, Chinese firms are taking a leading role, with DeepSeek preparing its next model release and Alibaba, Moonshot and MiniMax unveiling their own new models in recent weeks.
Our Deeper View
Unauthorized model distillation presents a double-barreled attack for Anthropic. Of all the major AI firms, Anthropic is the most closed-off, not offering any open-source versions of its flagship models. It’s also the most focused on safety, having made itself the poster child for doing AI responsibly and ethically. These techniques dilute Claude’s secret sauce by spreading its capabilities far and wide, and, if put to malicious, unethical or dangerous use, they threaten the foundation Anthropic was built on. The big question is how effectively fierce rivals Anthropic, OpenAI, and Google can collaborate to stop model distillation attacks.

