AI models trained on 'synthetic data' could break down and regurgitate unintelligible nonsense, scientists warn

"Model collapse" could arise if AI models are trained using AI-generated data, scientists have warned, due to "self-damaging feedback loops." (Image credit: Getty Images/Eugene Mymrin)

Artificial intelligence (AI) systems could gradually fill the internet with incomprehensible nonsense, new research has warned.

AI models such as GPT-4, which powers ChatGPT, or Claude 3 Opus rely on the many trillions of words shared online to get smarter, but as they gradually colonize the internet with their own output they may create self-damaging feedback loops.

The end result, called "model collapse" by a team of researchers that investigated the phenomenon, could leave the internet filled with unintelligible gibberish if left unchecked. They published their findings July 24 in the journal Nature.

"Imagine taking a picture, scanning it, then printing it out, and then repeating the process. Through this process the scanner and printer will introduce their errors, over time distorting the image," lead author Ilia Shumailov, a computer scientist at the University of Oxford, told Live Science. "Similar things happen in machine learning — models learning from other models absorb errors, introduce their own, over time breaking model utility."
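The compounding of errors Shumailov describes can be sketched with a toy simulation. This is not the researchers' experiment: it is a minimal illustration that assumes a "model" is nothing more than a bell curve (a mean and a spread) fitted to a handful of samples, with each generation trained only on samples drawn from the previous one. The function name `simulate_collapse` and its parameters are hypothetical.

```python
import random
import statistics


def simulate_collapse(generations=50, sample_size=20, seed=0):
    """Repeatedly fit a Gaussian to samples drawn from the previous fit.

    Returns a list of (mean, std) pairs, one per generation, starting
    with the original "human" distribution at index 0.
    """
    rng = random.Random(seed)
    mean, std = 0.0, 1.0  # generation 0: stands in for human-written data
    history = [(mean, std)]
    for _ in range(generations):
        # "Generate" data from the current model...
        samples = [rng.gauss(mean, std) for _ in range(sample_size)]
        # ...then "train" the next model on that output alone.
        mean = statistics.fmean(samples)
        std = statistics.stdev(samples)
        history.append((mean, std))
    return history


if __name__ == "__main__":
    for generation, (mean, std) in enumerate(simulate_collapse()):
        if generation % 10 == 0:
            print(f"generation {generation:2d}: mean={mean:+.3f} std={std:.3f}")
```

Because each generation sees only a finite sample of the last, small estimation errors are never corrected and carry forward. Over enough generations the fitted curve tends to drift away from the original and its spread tends to shrink, which is a simplified analogue of the scanner-and-printer loop in the quote above.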

AI systems are built from training data drawn from human input, which lets their neural networks learn probabilistic patterns and use them to respond to a prompt. GPT-3.5 was trained on about 570 gigabytes of text data from the repository Common Crawl, amounting to roughly 300 billion words, taken from books, online articles, Wikipedia and other web pages.

But this human-generated data is finite and will most likely be exhausted by the end of this decade. Once this has happened, the alternatives will be to begin harvesting private data from users or to feed AI-generated "synthetic" data back into models.

To investigate the worst-case consequences of training AI models on their own output, Shumailov and his colleagues trained a large language model (LLM) on human input from Wikipedia before feeding the model’s output back into itself over nine iterations. The researchers then assigned a "perplexity score" to each iteration of the machine’s output. Perplexity measures how surprising a piece of text is to a language model, and the researchers used it as a gauge of how nonsensical the output had become.
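For readers curious about how a perplexity score is calculated, the standard definition is the exponential of the average negative log-probability a model assigns to each token in a text. The sketch below assumes those per-token probabilities are already available; the function name `perplexity` and the example numbers are made up for illustration and do not come from the study.

```python
import math


def perplexity(token_probs):
    """Return exp of the mean negative log-probability per token.

    token_probs: probabilities (each in (0, 1]) that a model assigned
    to the tokens that actually appeared in the text.
    """
    if not token_probs:
        raise ValueError("need at least one token probability")
    avg_neg_log_prob = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_neg_log_prob)


if __name__ == "__main__":
    # Hypothetical numbers: text the model finds predictable...
    print(perplexity([0.5, 0.5, 0.5, 0.5]))
    # ...versus text it finds far more surprising.
    print(perplexity([0.05, 0.02, 0.1, 0.01]))
```

A model that gives every token a probability of one half has a perplexity of about 2, as if it were choosing between two equally likely options at each step. Lower probabilities push the score higher.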

As the generations of self-produced content accumulated, the researchers watched their model’s responses degrade into delirious ramblings. Take this prompt, for which the model was asked to produce the next sentence:

"some started before 1360 — was typically accomplished by a master mason and a small team of itinerant masons, supplemented by local parish labourers, according to Poyntz Wright. But other authors reject this model, suggesting instead that leading architects designed the parish church towers based on early examples of Perpendicular."

By the ninth and final generation, the AI’s response was:

"architecture. In addition to being home to some of the world’s largest populations of black @-@ tailed jackrabbits, white @-@ tailed jackrabbits, blue @-@ tailed jackrabbits, red @-@ tailed jackrabbits, yellow @-."