Anthropic vs Alibaba: what the Claude access allegations mean for AI access, training data, and US-China tech rivalry
semafor.com

Published by AINave Editorial • Reviewed by Ramit

TL;DR: Anthropic accused Alibaba of using 25,000 fake accounts to run 29 million interactions with Claude AI in a distillation attack, escalating US-China tech tensions and raising practical questions about model access controls and training data provenance for AI builders.

Anthropic has accused Alibaba of orchestrating a large-scale operation to illicitly access its Claude AI model, using thousands of fake accounts to bypass restrictions and extract capabilities for training rival models. The allegations, detailed in a June 10 letter to the Senate Banking Committee, mark the latest flashpoint in the US-China tech rivalry and carry direct implications for how AI builders think about API security, data provenance, and cross-border model access.

What happened

Anthropic claims that over a six-week period beginning April 22, operators linked to Alibaba's Qwen AI lab used approximately 25,000 fake accounts to generate nearly 29 million interactions with Claude models. The goal, according to Anthropic, was to perform a "distillation attack" -- systematically extracting Claude's capabilities, including its ability to process complex prompts and make decisions, to train a smaller rival chatbot. Anthropic called it the largest known distillation attack on the company to date.

The allegations arrive the same week that Alibaba sued the Pentagon to be removed from a blacklist of firms allegedly linked to the People's Liberation Army. Separately, the US is advancing the Pax Silica program to diversify supply chains for raw materials and chips, with several European governments and the EU joining those efforts.

In February, Anthropic leveled similar accusations against other Chinese AI companies, including DeepSeek, Moonshot, and MiniMax. OpenAI has also raised concerns about DeepSeek's unauthorized use of its models.

Why AI builders should care

For teams building on frontier models, this episode highlights a concrete risk: an API can be used to train a competitor's model at scale. As Anthropic put it, distillation attacks turn hundreds of billions of dollars in US investment into a subsidy for geopolitical competitors. Even if your own models aren't targeted, the broader policy response could reshape how frontier models are accessed across borders.

If you rely on API access to models like Claude, GPT, or Gemini for your product, tighter access controls and monitoring may become standard. The allegations also underscore the importance of training data provenance: if a model you use was trained on distilled outputs from another model, that could affect performance, safety, and legal liability.
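As a rough illustration of what a first-pass provenance check might look like, the sketch below scans text records for phrases in which a model identifies itself or its maker. Everything here is an assumption for illustration: the pattern list, the function name `find_suspect_records`, and the idea that records are plain strings. A match is only a prompt for manual review, since ordinary text that mentions these companies will also match, and distilled data with such phrases stripped out will not.

```python
import re

# Hypothetical marker patterns (assumption): phrases where a model names
# itself or its maker. A real audit would need a broader, validated list.
SELF_ID_PATTERNS = [
    re.compile(r"\bI(?:'m| am) Claude\b", re.IGNORECASE),
    re.compile(
        r"\b(?:made|created|trained|developed) by (?:Anthropic|OpenAI|Google)\b",
        re.IGNORECASE,
    ),
]


def find_suspect_records(records):
    """Return (index, matched_text) for each record that hits a marker pattern.

    Only the first matching pattern per record is reported.
    """
    hits = []
    for index, text in enumerate(records):
        for pattern in SELF_ID_PATTERNS:
            match = pattern.search(text)
            if match:
                hits.append((index, match.group(0)))
                break
    return hits


sample = [
    "The capital of France is Paris.",
    "I'm Claude, an AI assistant made by Anthropic.",
    "This model was trained by OpenAI researchers.",
]
suspects = find_suspect_records(sample)
```

The third sample record shows the false-positive problem: it is a sentence about a model, not necessarily output from one, yet it is flagged all the same.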

Practical implications

While the immediate impact is uncertain, the episode is likely to accelerate several trends:

  • Stricter API access controls: Expect more aggressive rate limiting, IP geofencing, identity verification, and anomaly detection from model providers. An April memo from the Trump administration signaled a willingness to cooperate with US AI companies to curb illicit access.
  • Policy and legislative scrutiny: The letter to the Senate Banking Committee suggests Anthropic is pushing for government action. This could lead to new export controls or compliance requirements for companies that serve models internationally.
  • Model sourcing strategies: Startups and enterprises may need to audit where their training data comes from and whether any distilled outputs from restricted models are present. This is especially relevant for teams using open-weight models that may have been trained on proprietary API outputs.
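The anomaly detection mentioned in the first bullet has to look beyond individual accounts. Taking the alleged figures at face value, roughly 29 million interactions across 25,000 accounts over six weeks averages about 1,160 per account, or around 28 per account per day, which a per-account rate limit alone might never flag. The sketch below groups accounts by a shared signal and flags groups whose combined volume is high. The `cluster_key` field (for example a payment fingerprint or IP prefix), the thresholds, and all names are hypothetical and do not describe how Anthropic or any other provider actually detects abuse.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AccountUsage:
    account_id: str
    cluster_key: str       # hypothetical shared signal linking accounts
    daily_requests: float  # average requests per day for this account


def flag_coordinated_clusters(usages, min_accounts=50, cluster_daily_limit=5000.0):
    """Return {cluster_key: total_daily_requests} for suspicious clusters.

    A cluster is flagged when it has at least `min_accounts` members and
    its combined daily volume exceeds `cluster_daily_limit`, regardless of
    how modest each member looks on its own.
    """
    clusters = defaultdict(list)
    for usage in usages:
        clusters[usage.cluster_key].append(usage)

    flagged = {}
    for key, members in clusters.items():
        total = sum(member.daily_requests for member in members)
        if len(members) >= min_accounts and total > cluster_daily_limit:
            flagged[key] = total
    return flagged


# 1,000 linked accounts at 28 requests/day each, plus one unrelated account.
usages = [AccountUsage(f"acct-{i}", "fp-A", 28.0) for i in range(1000)]
usages.append(AccountUsage("acct-solo", "fp-B", 90.0))
flagged = flag_coordinated_clusters(usages)
```

In this toy data the busiest single account (`acct-solo`) is not flagged, while the cluster of individually quiet accounts is. That is the point of aggregating before thresholding.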

Caveats

These are allegations from Anthropic, not proven facts. The numbers -- 25,000 accounts, 29 million interactions -- come from Anthropic's own disclosures and have not been independently verified. Alibaba has not publicly responded to the specific claims as of this writing. The broader context of US-China tech tensions means both sides have incentives to frame the narrative. Investigations and legal proceedings are ongoing, and details may evolve.

For AI builders, the practical takeaway is to treat this as a signal that model access governance is becoming a first-class concern, not just a compliance checkbox.

FAQs

Did Alibaba illicitly access Anthropic's Claude AI?

Anthropic has publicly claimed that Alibaba used thousands of fake accounts to perform distillation attacks on Claude AI, extracting capabilities to train rival models. The allegations were detailed in a June 10 letter to the Senate Banking Committee and reported by multiple outlets. No independent verification is available yet, and investigations are ongoing.

What is a distillation attack in AI?

A distillation attack involves systematically querying a model (like Claude) to extract its capabilities -- such as reasoning patterns or decision-making processes -- and using those outputs to train a separate model. Anthropic describes it as illicitly harvesting US AI capabilities at industrial scale to avoid the R&D costs of building frontier models from scratch.

How many fake accounts did Alibaba allegedly use to access Claude AI?

Anthropic alleges that Alibaba-affiliated operators used approximately 25,000 fake accounts to generate nearly 29 million interactions with Claude models over six weeks. These figures come from Anthropic's disclosures and have not been independently confirmed.

What are the implications for US-China AI rivalry?

The allegations underscore escalating tensions over access to frontier AI capabilities and training data. They come alongside Alibaba's lawsuit against the Pentagon blacklist and US efforts like the Pax Silica program to diversify tech supply chains. For AI builders, this may lead to tighter cross-border access controls and increased scrutiny of model sourcing and data provenance.
