OpenAI Investigates Potential Use of Its Models by China's DeepSeek for AI Training

As Chinese artificial intelligence (AI) company DeepSeek continues to shake the technology world amid the US-China trade war, OpenAI reportedly suspects that DeepSeek used ChatGPT data to train its low-cost AI models.

Sam Altman-run OpenAI told the Financial Times that it found evidence linking DeepSeek to the use of “distillation,” a common technique in which developers train a smaller AI model on the outputs of a larger, more capable large language model (LLM).

OpenAI and Microsoft are now investigating whether the Chinese rival used their APIs to train DeepSeek’s own models. OpenAI reportedly spent $100 million to train its GPT-4 model.

According to David Sacks, US President Donald Trump’s artificial intelligence czar, “it is possible” that intellectual property theft occurred in the case of DeepSeek.

“There’s substantial evidence that what DeepSeek did here is they distilled knowledge from OpenAI models, and I don’t think OpenAI is very happy about this,” Sacks told Fox News.

In a statement, OpenAI said, “We know that PRC (China)-based companies — and others — are constantly trying to distill the models of leading US AI companies.”

Meanwhile, Euroconsumers, a coalition of consumer groups in Europe, has filed a complaint with the Italian Data Protection Authority regarding how DeepSeek handles personal data in relation to GDPR.

The Italian DPA stated that “the data of millions of Italians is at risk” and has given DeepSeek 20 days to respond.