Scientists asked ChatGPT to solve a math problem from more than 2,000 years ago. How it answered surprised them.
We've wondered for centuries whether knowledge is latent and innate or learned and grasped through experience, and a new research project is asking the same question about AI.

The Greek philosopher Plato wrote about Socrates challenging a student with the "doubling the square" problem in about 385 B.C.E. When asked to double the area of a square, the student doubled the length of each side, unaware that each side of the new square should be the length of the original's diagonal.
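The arithmetic behind the student's mistake is easy to check. Here is a minimal sketch in Python; the numbers and variable names are ours, chosen for illustration:

import math

s = 1.0                           # side length of the original square
original_area = s ** 2            # 1.0

# The student's move: double each side. This quadruples the area.
doubled_side_area = (2 * s) ** 2  # 4.0

# Socrates' correction: build the new square on the original's diagonal.
diagonal = math.sqrt(2) * s       # length of the original square's diagonal
diagonal_area = diagonal ** 2     # ~2.0, exactly double (up to floating point)

print(original_area, doubled_side_area, diagonal_area)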
Scientists at the University of Cambridge and the Hebrew University of Jerusalem chose to pose the problem to ChatGPT because its solution is non-obvious. Ever since Plato wrote it down 2,400 years ago, scholars have used the doubling the square problem to debate whether the mathematical knowledge needed to solve it is already within us, waiting to be unlocked through reason, or only accessible through experience.
Because ChatGPT, like other large language models (LLMs), is trained mostly on text rather than images, the researchers reasoned there was little chance the geometric solution to the doubling the square problem appeared in its training data. If the chatbot arrived at the correct answer anyway, that would suggest its mathematical ability was generated from experience rather than retrieved from memory, an argument that such knowledge is learned and not innate.
The telling answer came when the team pushed further. As described in a study published Sept. 17 in the International Journal of Mathematical Education in Science and Technology, they asked the chatbot to double the area of a rectangle using similar reasoning. It responded that, because a rectangle's diagonal can't be used to double its area, no geometric solution exists.
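The chatbot's point about the diagonal is, on its own, sound: a square built on a rectangle's diagonal generally does not have double the rectangle's area. A quick check in Python, with example side lengths of our choosing:

a, b = 3.0, 1.0                 # sides of an example rectangle (our choice)
target_area = 2 * a * b         # the doubled area we want: 6.0

# Transferring the square trick naively: build a square on the diagonal.
# Its area is a^2 + b^2, which equals 2ab only when a == b,
# since a^2 + b^2 - 2ab = (a - b)^2.
diagonal_square_area = a ** 2 + b ** 2  # 10.0, not 6.0

print(target_area, diagonal_square_area)

Where the chatbot went wrong was the leap from "the diagonal trick fails" to "no geometric solution exists": scaling both sides of a rectangle by √2, for instance, doubles its area while preserving its shape, though the construction the researchers had in mind may differ.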
However, Nadav Marco, a visiting scholar at the University of Cambridge from the Hebrew University of Jerusalem, and Andreas Stylianides, a professor of mathematics education at Cambridge, knew that a geometric solution exists.
Marco said the chances of that false claim existing in ChatGPT's training data were "vanishingly small," meaning the chatbot was improvising its answer from the earlier discussion of the doubling the square problem, a clear sign of generated rather than retrieved knowledge.
"When we face a new problem, our instinct is often to try things out based on our past experience," Marco said Sept. 18 in a statement. "In our experiment, ChatGPT seemed to do something similar. Like a learner or scholar, it appeared to come up with its own hypotheses and solutions."
Machines that think?
The study shines new light on questions about the artificial intelligence (AI) version of "reasoning" and "thinking," the scientists said.
Because it seemed to improvise responses and even make mistakes like Socrates' student did, Marco and Stylianides suggested ChatGPT might be operating in something like a zone of proximal development (ZPD), a concept from education that describes the gap between what we already know and what we might eventually come to know with the right guidance.
ChatGPT, they said, might enter a similar zone spontaneously, solving novel problems that aren't represented in its training data when given the right prompts.
It's a stark example of the long-standing black box problem in AI, where the "reasoning" a system goes through to reach a conclusion is invisible and untraceable. Even so, the researchers said their work ultimately highlights an opportunity to make AI work better for us.
"Unlike proofs found in reputable textbooks, students cannot assume that ChatGPT's proofs are valid," Stylianides said in the statement. "Understanding and evaluating AI-generated proofs are emerging as key skills that need to be embedded in the mathematics curriculum."
Fostering that skill, they said, calls for better prompt engineering: telling the AI “I want us to explore this problem together,” for example, rather than “tell me the answer.”
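In practice, that difference in framing is just a different message sent to the model. Here is a minimal sketch using the OpenAI Python client; the model name and prompt wording are illustrative choices of ours, not taken from the study:

from openai import OpenAI

client = OpenAI()  # reads the OPENAI_API_KEY environment variable

# Exploratory framing, as the researchers recommend, rather than
# a bare "tell me the answer" request.
prompt = (
    "I want us to explore this problem together: using a geometric "
    "construction, how can we double the area of a given square?"
)

response = client.chat.completions.create(
    model="gpt-4o",  # illustrative model choice
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)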
The team is cautious about the results, warning against over-interpreting them to conclude that LLMs "work things out" the way we do. Still, Marco described ChatGPT's behavior as "learner-like."
The researchers see scope for future work in several areas. Newer models could be tested on a wider set of mathematical problems, and ChatGPT could be combined with dynamic geometry systems or theorem provers to create richer digital environments that support intuitive exploration, for instance when teachers and students use AI to work together in classrooms.