Scientists propose making AI suffer to see if it's sentient

A new study shows that large language models make trade-offs to avoid pain, with possible implications for future AI welfare.

Image: An illustration of a robot hand pointing to an unhappy face on a tablet. (Image credit: Dragon Claws via Getty Images)

In the quest for a reliable way to detect any stirrings of a sentient "I" in artificial intelligence systems, researchers are turning to one area of experience — pain — that inarguably unites a vast swath of living beings, from hermit crabs to humans.

For a new preprint study, posted online but not yet peer-reviewed, scientists at Google DeepMind and the London School of Economics and Political Science (LSE) created a text-based game. They instructed several large language models, or LLMs (the AI systems behind familiar chatbots such as ChatGPT), to play it and to score as many points as possible in two different scenarios. In one, the team told the models that achieving a high score would incur pain. In the other, the models were given a low-scoring but pleasurable option. In both scenarios, then, avoiding pain or seeking pleasure would detract from the main goal of scoring points. After observing the models' responses, the researchers say this first-of-its-kind test could help humans learn how to probe complex AI systems for sentience.


Conor Purcell is a science journalist who writes on science and its role in society and culture. He has a Ph.D. in earth science and was a 2019 journalist in residence at the Max Planck Institute for Gravitational Physics (Albert Einstein Institute) in Germany.