
Most of us have likely experienced artificial intelligence (AI) voices through personal assistants like Siri or Alexa, with their flat intonation and mechanical delivery giving us the impression that we could easily distinguish between an AI-generated voice and a real person. But scientists now say the average listener can no longer tell the difference between real people and "deepfake" voices.
In a new study published Sept. 24 in the journal PLoS One, researchers showed that when people listen to human voices — alongside AI-generated versions of the same voices — they cannot accurately identify which are real and which are fake.
"AI-generated voices are all around us now. We’ve all spoken to Alexa or Siri, or had our calls taken by automated customer service systems," said lead author of the study Nadine Lavan, senior lecturer in psychology at Queen Mary University of London, in a statement. "Those things don’t quite sound like real human voices, but it was only a matter of time until AI technology began to produce naturalistic, human-sounding speech."
The study suggested that, while generic voices created from scratch were not deemed to be realistic, voice clones trained on the voices of real people — deepfake audio — were found to be just as believable as their real-life counterparts.
The scientists gave study participants samples of 80 different voices (40 AI-generated voices and 40 real human voices) and asked them to label each one as real or AI-generated. On average, only 41% of the from-scratch AI voices were misclassified as human, which suggested it is still possible, in most cases, to tell them apart from real people.
However, for AI voices cloned from humans, the majority (58%) were misclassified as human. Only slightly more (62%) of the human voices were correctly classified as human, leading the researchers to conclude that there was no statistical difference in our capacity to tell the voices of real people apart from their deepfake clones.
The results have potentially profound implications for ethics, copyright and security, Lavan said. Should criminals use AI to clone your voice, it becomes that much easier to bypass voice authentication protocols at the bank or to trick your loved ones into transferring money.
We've already seen several incidents play out. On July 9, for example, Sharon Brightwell was tricked out of $15,000. Brightwell listened to what she thought was her daughter crying down the phone, telling her that she had been in an accident and that she needed money for legal representation to keep her out of jail. "There is nobody that could convince me that it wasn’t her," Brightwell said of the realistic AI fabrication at the time.
Lifelike AI voices can also be used to fabricate statements by, and interviews with, politicians or celebrities. Fake audio might be used to discredit individuals or to incite unrest, sowing social division and conflict. Con artists recently built an AI clone of the voice of Queensland Premier Steven Miles, using his profile to try to get people to invest in a Bitcoin scam, for instance.
The researchers emphasized that the voice clones they used in the study were not even particularly sophisticated. They made them with commercially available software and trained them with as little as four minutes of human speech recordings.
"The process required minimal expertise, only a few minutes of voice recordings, and almost no money," Lavan said in the statement. "It just shows how accessible and sophisticated AI voice technology has become."
While deepfakes present a multitude of opportunities for malign actors, it isn't all bad news; there may be more positive opportunities that come with the power to generate AI voices at scale. "There might be applications for improved accessibility, education, and communication, where bespoke high-quality synthetic voices can enhance user experience," Lavan said.

Kit Yates is an Association of British Science Writers media fellow at Live Science. At his day job, he is a professor of mathematical biology and public engagement at the University of Bath in the U.K. He reports on mathematics and health stories. His work has appeared in The Guardian, The Independent, New Statesman, BBC Futures and Scientific American among others. His science journalism has won awards from the Royal Statistical Society and The Conversation. Kit holds a BA in mathematics, an MSc in mathematical modeling and a PhD in Systems Biology all from the University of Oxford. He has written two popular science books, The Math(s) of Life and Death and How to Expect the Unexpected.