Large language models not fit for real-world use, scientists warn — even slight changes cause their world models to collapse

Large language model AIs might seem smart on the surface, but they struggle to understand the real world and model it accurately, a new study finds.

Neural networks that underpin LLMs might not be as smart as they seem.
(Image credit: Yurchanka Siarhei/Shutterstock)

Generative artificial intelligence (AI) systems may be able to produce some eye-opening results, but new research shows they lack a coherent understanding of the world and its rules.

In a new study posted to the arXiv preprint database, scientists at MIT, Harvard and Cornell found that large language models (LLMs), such as GPT-4 or Anthropic's Claude 3 Opus, fail to produce underlying models that accurately represent the real world.

Roland Moore-Colyer

Roland Moore-Colyer is a freelance writer for Live Science and managing editor at consumer tech publication TechRadar, running the Mobile Computing vertical. At TechRadar, one of the largest consumer technology websites in the U.K. and U.S., he focuses on smartphones and tablets. Beyond that, he draws on more than a decade of writing experience to cover electric vehicles (EVs), the evolution and practical use of artificial intelligence (AI), mixed reality products and use cases, and the evolution of computing both on a macro level and from a consumer angle.