Scientists asked ChatGPT to solve a math problem from more than 2,000 years ago. How it answered surprised them.
We've wondered for centuries whether knowledge is latent and innate or learned and grasped through experience, and a new research project is asking the same question about AI.

The Greek philosopher Plato wrote about Socrates challenging a student with the "doubling the square" problem in about 385 B.C.E. When asked to double the area of a square, the student doubled the length of each side, unaware that each side of the new square should be the length of the original's diagonal.
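The arithmetic behind the student's mistake is easy to check. Here is a minimal sketch in Python; the numbers and variable names are ours, chosen for illustration:

import math

s = 1.0                           # side length of the original square
original_area = s ** 2            # 1.0

# The student's move: double each side. This quadruples the area.
doubled_side_area = (2 * s) ** 2  # 4.0

# Socrates' correction: build the new square on the original's diagonal.
diagonal = math.sqrt(2) * s       # length of the original square's diagonal
diagonal_area = diagonal ** 2     # ~2.0, exactly double (up to floating point)

print(original_area, doubled_side_area, diagonal_area)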
Scientists at the University of Cambridge and the Hebrew University of Jerusalem chose to pose the problem to ChatGPT because its solution is non-obvious. Ever since Plato wrote it down 2,400 years ago, scholars have used the doubling the square problem to debate whether the mathematical knowledge needed to solve it is already within us, waiting to be unlocked through reason, or only accessible through experience.
Because ChatGPT, like other large language models (LLMs), is trained mostly on text rather than images, the researchers reasoned there was little chance the geometric solution to the doubling the square problem appeared in its training data. If the chatbot arrived at the correct answer anyway, that would suggest its mathematical ability was generated from experience rather than retrieved from memory, an argument that such knowledge is learned and not innate.
The telling answer came when the team pushed further. As described in a study published Sept. 17 in the International Journal of Mathematical Education in Science and Technology, they asked the chatbot to double the area of a rectangle using similar reasoning. It responded that, because a rectangle's diagonal can't be used to double its area, no geometric solution exists.
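The chatbot's point about the diagonal is, on its own, sound: a square built on a rectangle's diagonal generally does not have double the rectangle's area. A quick check in Python, with example side lengths of our choosing:

a, b = 3.0, 1.0                 # sides of an example rectangle (our choice)
target_area = 2 * a * b         # the doubled area we want: 6.0

# Transferring the square trick naively: build a square on the diagonal.
# Its area is a^2 + b^2, which equals 2ab only when a == b,
# since a^2 + b^2 - 2ab = (a - b)^2.
diagonal_square_area = a ** 2 + b ** 2  # 10.0, not 6.0

print(target_area, diagonal_square_area)

Where the chatbot went wrong was the leap from "the diagonal trick fails" to "no geometric solution exists": scaling both sides of a rectangle by √2, for instance, doubles its area while preserving its shape, though the construction the researchers had in mind may differ.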
However, Nadav Marco, a visiting scholar at the University of Cambridge from the Hebrew University of Jerusalem, and Andreas Stylianides, a professor of mathematics education at Cambridge, knew that a geometric solution exists.
Marco said the chances of that false claim existing in ChatGPT's training data were "vanishingly small," meaning the chatbot was improvising its answer from the earlier discussion of the doubling the square problem, a clear sign of generated rather than retrieved knowledge.
"When we face a new problem, our instinct is often to try things out based on our past experience," Marco said Sept. 18 in a statement. "In our experiment, ChatGPT seemed to do something similar. Like a learner or scholar, it appeared to come up with its own hypotheses and solutions."
Machines that think?
The study shines new light on questions about the artificial intelligence (AI) version of "reasoning" and "thinking," the scientists said.
Because it seemed to improvise responses and even make mistakes like Socrates' student did, Marco and Stylianides suggested ChatGPT might be operating in something like a zone of proximal development (ZPD), a concept from education that describes the gap between what we already know and what we might eventually come to know with the right guidance.
ChatGPT, they said, might enter a similar zone spontaneously, solving novel problems that aren't represented in its training data when given the right prompts.
It's a stark example of the long-standing black box problem in AI, where the "reasoning" a system goes through to reach a conclusion is invisible and untraceable. Even so, the researchers said their work ultimately highlights an opportunity to make AI work better for us.
"Unlike proofs found in reputable textbooks, students cannot assume that ChatGPT's proofs are valid," Stylianides said in the statement. "Understanding and evaluating AI-generated proofs are emerging as key skills that need to be embedded in the mathematics curriculum."
Fostering that skill, they said, calls for better prompt engineering: telling the AI “I want us to explore this problem together,” for example, rather than “tell me the answer.”
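In practice, that difference in framing is just a different message sent to the model. Here is a minimal sketch using the OpenAI Python client; the model name and prompt wording are illustrative choices of ours, not taken from the study:

from openai import OpenAI

client = OpenAI()  # reads the OPENAI_API_KEY environment variable

# Exploratory framing, as the researchers recommend, rather than
# a bare "tell me the answer" request.
prompt = (
    "I want us to explore this problem together: using a geometric "
    "construction, how can we double the area of a given square?"
)

response = client.chat.completions.create(
    model="gpt-4o",  # illustrative model choice
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)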
The team is cautious about the results, warning against over-interpreting them to conclude that LLMs "work things out" the way we do. Still, Marco described ChatGPT's behavior as "learner-like."
The researchers see scope for future work in several areas. Newer models could be tested on a wider set of mathematical problems, and ChatGPT could be combined with dynamic geometry systems or theorem provers to create richer digital environments that support intuitive exploration, for instance when teachers and students use AI to work together in classrooms.