OpenAI's 'smartest' AI model was explicitly told to shut down — and it refused

An artificial intelligence safety firm has found that OpenAI's o3 and o4-mini models sometimes refuse to shut down, and will sabotage computer scripts in order to keep working on tasks.

Recently released AI models will sometimes refuse to turn off, according to an AI safety research firm. This image is an artist's depiction of AI and doesn't represent any specific model.
(Image credit: Blackdovfx via Getty Images)

The latest OpenAI model can disobey direct instructions to turn off and will even sabotage shutdown mechanisms in order to keep working, an artificial intelligence (AI) safety firm has found.

OpenAI's o3 and o4-mini models, which help power the chatbot ChatGPT, are supposed to be the company's smartest models yet, trained to think longer before responding. However, they also appear to be less cooperative.

Patrick Pester
Trending News Writer

Patrick Pester is the trending news writer at Live Science. His work has appeared on other science websites, such as BBC Science Focus and Scientific American. Patrick retrained as a journalist after spending his early career working in zoos and wildlife conservation. He was awarded the Master's Excellence Scholarship to study at Cardiff University where he completed a master's degree in international journalism. He also has a second master's degree in biodiversity, evolution and conservation in action from Middlesex University London. When he isn't writing news, Patrick investigates the sale of human remains.
