Scientists just developed a new AI modeled on the human brain — it's outperforming LLMs like ChatGPT at reasoning tasks
The hierarchical reasoning model (HRM) system is modeled on the way the human brain processes complex information, and it outperformed leading LLMs in a notoriously hard-to-beat benchmark.

Scientists have developed a new type of artificial intelligence (AI) model that can reason differently from most large language models (LLMs) like ChatGPT, resulting in much better performance in key benchmarks.
The new reasoning AI, called a hierarchical reasoning model (HRM), is inspired by the hierarchical and multi-timescale processing in the human brain — the way different brain regions integrate information over varying durations (from milliseconds to minutes).
Scientists at Sapient, an AI company in Singapore, say the reasoning model achieves better performance while working more efficiently, because it requires far fewer parameters and training examples.
The HRM has just 27 million parameters and was trained on roughly 1,000 examples, the scientists said in a study uploaded June 26 to the preprint arXiv database (which has yet to be peer-reviewed). By comparison, most advanced LLMs have billions or even trillions of parameters. Although an exact figure has not been made public, some estimates suggest that the newly released GPT-5 has between 3 trillion and 5 trillion parameters.
A new way of thinking for AI
When the researchers tested HRM in the ARC-AGI benchmark — a notoriously tough examination that aims to test how close models are to achieving artificial general intelligence (AGI) — the system achieved impressive results, according to the study.
HRM scored 40.3% in ARC-AGI-1, compared with 34.5% for OpenAI's o3-mini-high, 21.2% for Anthropic's Claude 3.7 and 15.8% for DeepSeek R1. In the tougher ARC-AGI-2 test, HRM scored 5% versus o3-mini-high's 3%, DeepSeek R1's 1.3% and Claude 3.7's 0.9%.
Most advanced LLMs use chain-of-thought (CoT) reasoning, in which a complex problem is broken down into multiple, much simpler intermediate steps expressed in natural language. This emulates how humans tackle elaborate problems by working through them in digestible chunks.
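For illustration, here is a minimal sketch of what a chain-of-thought prompt can look like in practice. The wording and the worked problem are hypothetical examples, not drawn from the study or from any particular model's API.

```python
# Illustrative chain-of-thought prompt (hypothetical example): the model is
# asked to spell out intermediate steps in natural language before answering.
problem = "A train travels 120 km in 1.5 hours. What is its average speed?"

cot_prompt = (
    f"{problem}\n"
    "Let's think step by step:\n"
    "1. Identify the distance and the time.\n"
    "2. Divide distance by time to get speed.\n"
    "3. State the final answer."
)

# A typical CoT-style response walks through each step, e.g.:
# "Distance = 120 km, time = 1.5 h, so speed = 120 / 1.5 = 80 km/h."
print(cot_prompt)
```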
But the Sapient scientists argue in the study that CoT has key shortcomings — namely "brittle task decomposition, extensive data requirements, and high latency."
Instead, HRM executes sequential reasoning tasks in a single forward pass, without any explicit supervision of the intermediate steps, through two modules. One high-level module is responsible for slow, abstract planning, while a low-level module handles rapid and detailed computations. This is similar to the way in which the human brain processes information in different regions.
It operates by applying iterative refinement, a computing technique that improves the accuracy of a solution by repeatedly refining an initial approximation, over several short bursts of "thinking." At the end of each burst, the model decides whether to keep refining or to submit its current result as the "final" answer to the initial prompt.
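For readers who want a concrete picture, the sketch below shows one way a two-timescale loop with a halting check could be wired up. It is a simplified illustration of the general idea described above, not Sapient's actual architecture; the class name, layer sizes, burst counts and the 0.5 halting threshold are all assumptions made for the example.

```python
# Minimal sketch of a two-timescale recurrent reasoner with a halting check.
# Illustrative only; names and sizes are assumptions, not the HRM's real design.
import torch
import torch.nn as nn

class TwoTimescaleReasoner(nn.Module):
    def __init__(self, input_dim=32, hidden_dim=64, max_bursts=8, low_steps=4):
        super().__init__()
        self.low = nn.GRUCell(input_dim + hidden_dim, hidden_dim)  # fast, detailed module
        self.high = nn.GRUCell(hidden_dim, hidden_dim)             # slow, abstract module
        self.halt_head = nn.Linear(hidden_dim, 1)                  # "keep thinking?" score
        self.out_head = nn.Linear(hidden_dim, input_dim)           # candidate answer
        self.max_bursts = max_bursts
        self.low_steps = low_steps

    def forward(self, x):
        batch = x.size(0)
        h_low = x.new_zeros(batch, self.low.hidden_size)
        h_high = x.new_zeros(batch, self.high.hidden_size)
        for _ in range(self.max_bursts):
            # Low-level module: several rapid refinement steps per burst,
            # conditioned on the current high-level "plan" state.
            for _ in range(self.low_steps):
                h_low = self.low(torch.cat([x, h_high], dim=-1), h_low)
            # High-level module: one slow update integrating the burst.
            h_high = self.high(h_low, h_high)
            # Halting check: stop early if the model judges the answer ready.
            p_halt = torch.sigmoid(self.halt_head(h_high))
            if bool((p_halt > 0.5).all()):
                break
        return self.out_head(h_high)

# Example: a single forward pass over a batch of two toy inputs.
model = TwoTimescaleReasoner()
answer = model(torch.randn(2, 32))
print(answer.shape)  # torch.Size([2, 32])
```

The key design choice the sketch tries to convey is the separation of timescales: the low-level module updates several times for every single update of the high-level module, while the halting head lets the model spend fewer bursts on easy inputs and more on hard ones.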
HRM achieved near-perfect performance on challenging tasks such as complex Sudoku puzzles, which conventional LLMs could not solve, and also excelled at finding optimal paths through mazes.
The paper has not been peer-reviewed, but the organizers of the ARC-AGI benchmark attempted to reproduce the results for themselves after the study's authors open-sourced the model on GitHub.
They reproduced the headline numbers, representatives said in a blog post, but they also made some surprising findings: the hierarchical architecture itself had minimal impact on performance. Instead, an under-documented refinement process applied during training drove most of the gains.
