Speech recognition technology has come a long way in recent years, and one of the fastest areas of growth is the cellphone market.
Now, the availability of 3G-enabled mobile devices with fast, always-on Internet connections and the ability to train voice modeling software with millions of phone users – a process called crowd sourcing – is helping fuel a new breed of mobile speech-recognition apps that work quickly and are amazingly accurate.
Speech recognition software has been around for years, but it was often frustrating to use because it typically required users to "train" it for optimal word recognition or to speak slowly.
"In the early days, the capabilities of the technology combined with the computing power of the various devices required that you have training so that [the software] would have data about the specific user ... and not use up too much computer power," explained Mike Thompson, senior vice president and general manager of Nuance Mobile, which makes the Dragon Dictation and Dragon Search apps for the iPhone and iPad.
But the computing power of today's smartphones is such that voice training is no longer required. The digital voice models that form the basis of today's speech recognition software are sophisticated enough that they can learn — on their own — their users' verbal quirks.
They're also fast: Dragon Dictation, for example, can transcribe words spoken at normal speed.
The power of the masses
Mobile voice-recognition apps also have other advantages over their older desktop counterparts.
One is the ability to communicate with powerful central computers, or servers, that can combine information from millions of users and then make broad generalizations that help improve the apps' overall ability to recognize words.
"The first time you speak to the phone, we put a cookie" (a kind of digital tag) "on your device and when you say something we call up your personal language model from our servers and use it to get better accuracy," said Dave Grannan, president and CEO of speech recognition software maker Vlingo, which also has an app for the iPhone.
An individual's voice model contains information about that person's accent and unique way of pronouncing certain words, among other things.
The servers can combine the voice models of several speakers who have similar accents to improve the accuracy for that population.
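As a rough illustration of the pooling idea, the toy Python sketch below keys each personal model to a device tag and merges the models of speakers who share an accent label. The names (`personal_models`, `pool_by_accent`), the accent labels and the word counts are all hypothetical stand-ins; neither company has described its servers at this level of detail.

```python
from collections import Counter, defaultdict

# Hypothetical per-device models: device tag -> (accent label, word counts).
# Real voice models hold acoustic data; simple word counts stand in for it here.
personal_models = {
    "device-a": ("accent-1", Counter({"schedule": 3, "data": 2})),
    "device-b": ("accent-1", Counter({"schedule": 1, "route": 4})),
    "device-c": ("accent-2", Counter({"data": 5})),
}


def pool_by_accent(models):
    """Merge the word counts of every speaker who shares an accent label."""
    pooled = defaultdict(Counter)
    for accent, counts in models.values():
        pooled[accent].update(counts)  # Counter.update adds counts together
    return dict(pooled)


pooled_models = pool_by_accent(personal_models)
print(pooled_models["accent-1"]["schedule"])  # 3 + 1 = 4
```

In this sketch, an accent group with more contributors ends up with more data in its pooled model.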
"If you're from India and speaking English as a second language on Vlingo, we work pretty darned well. If you're from Germany speaking English, it doesn't work so well," Grannan told TechNewsDaily.
The reason? Vlingo has many more users from India than from Germany, so its voice model for Indian speakers of English is generally better than its model for German speakers.
Smart apps
Today's speech-recognition apps for smartphones can also learn from their mistakes. If an app misspells a word, users can use the keyboards on their devices to correct the mistake, and the correction is noted on the server so it is less likely to recur.
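The feedback loop described above could look something like the following minimal Python sketch. Everything in it is an assumption made for illustration, including the function names and the `min_votes` threshold; the article does not say how either company stores or applies corrections.

```python
from collections import Counter, defaultdict

# Hypothetical server-side store: misheard text -> counts of user corrections.
corrections = defaultdict(Counter)


def record_correction(heard, corrected):
    """Note that a user replaced the transcription `heard` with `corrected`."""
    corrections[heard][corrected] += 1


def best_guess(heard, min_votes=2):
    """Return the most common correction once enough users agree on it.

    `min_votes` is an assumed threshold, not a documented setting.
    """
    seen = corrections.get(heard)
    if seen:
        word, votes = seen.most_common(1)[0]
        if votes >= min_votes:
            return word
    return heard  # no trusted correction yet, keep the original text


record_correction("wreck a nice beach", "recognize speech")
record_correction("wreck a nice beach", "recognize speech")
print(best_guess("wreck a nice beach"))  # recognize speech
print(best_guess("hello"))               # hello
```

Requiring more than one vote before applying a fix is one simple way to keep a single user's typo from changing results for everyone.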
Dragon Dictation and Dragon Search also pay attention to where a speaker is talking and can take steps to reduce background noise so a person's words are more understandable.
"If you're driving down the road in your car, you might have the window partway down, or the radio is on, or there's another person in the car with you. All of those kinds of sounds are predictable and can be eliminated through something called acoustic echo cancellation," said Dragon Dictation's Thompson.
Acoustic echo cancellation is a server-side process and also benefits from crowd sourcing. The more people who use the apps in similarly noisy environments, the better the software gets at ignoring background noise.
"Just like many forms of software, as you collect more data and expertise, you're continually pouring that back into the products," Thompson said in a telephone interview.
'Getting mainstream'
Vlingo's Grannan notes that it's only been in recent years, as fast 3G-enabled cellphones have become ubiquitous, that crowd sourcing and server-side voice analysis have really taken off.
"Before we had 3G, it was hard to do this," Grannan said.
In the future, speech recognition software will be more deeply integrated into a variety of devices, Thompson predicts.
"You're going to see a large number of devices roll out with speech recognition baked into the device," he said. "It will be built into messaging systems and the search functionality and all the apps on a phone."
This trend is already happening. Apple's iPhone 3GS, for example, includes native speech recognition capabilities that allow users to voice-dial people in their address books.
Speech recognition "is getting mainstream attention, and that's driving our business in a very positive way," Thompson said.
