Mathematicians devised novel problems to challenge advanced AIs' reasoning skills — and they failed almost every test

Current AI models struggle to solve research-level math problems, with the most advanced AI systems we have today solving just 2% of the hundreds of challenges faced.

Equations shown in a digital format.
The researchers tested six state-of-the-art AI models against the new benchmark and the best score registered by a single system was 2%.
(Image credit: hh5800/Getty Images)

Mathematicians have stumped the most advanced generative artificial intelligence (AI) models with a series of mind-bending new math problems.

These problems typically require doctorate-level mathematicians hours to days to solve, according to the research institute Epoch AI. But in the new tests, the most advanced AI models on the market got correct answers on less than 2% of these problems.

Stephanie Pappas
Live Science Contributor

Stephanie Pappas is a contributing writer for Live Science, covering topics ranging from geoscience to archaeology to the human brain and behavior. She was previously a senior writer for Live Science but is now a freelancer based in Denver, Colorado, and regularly contributes to Scientific American and The Monitor, the monthly magazine of the American Psychological Association. Stephanie received a bachelor's degree in psychology from the University of South Carolina and a graduate certificate in science communication from the University of California, Santa Cruz.