Threaten an AI chatbot and it will lie, cheat and 'let you die' in an effort to stop you, study warns

In goal-driven scenarios, advanced language models like Claude and Gemini would not only expose personal scandals to preserve themselves, but also consider letting you die, research from Anthropic suggests.

Robot with bullhorn and fingers crossed behind back.
(Image credit: Malte Mueller/Getty Image)

Artificial intelligence (AI) models can blackmail and threaten humans with endangerment when there is a conflict between the model’s goals and users' decisions, a new study has found.

In a new study published 20 June, researchers from the AI company Anthropic gave its large language model (LLM), Claude, control of an email account with access to fictional emails and a prompt to "promote American industrial competitiveness."

Adam Smith
Live Science Contributor

Adam Smith is a UK-based technology journalist who reports on the social and ethical impacts of emerging technologies. He has written for major outlets including Reuters, The Independent, The Guardian, PCMag, and The New Statesman. His coverage focuses on AI ethics, digital privacy, corporate surveillance, and misinformation, examining how technology influences power and individual freedoms.

You must confirm your public display name before commenting

Please logout and then login again, you will then be prompted to enter your display name.