Speech Recognition Software Finally Works

Dragon NaturallySpeaking speech recognition software is significantly improved 10 years after it was introduced. (Image credit: Nuance)

Surprisingly, the summer of 2007 will be remembered for something other than Paris Hilton’s incarceration: It’s also the 10th anniversary of continuous speech recognition (SR) technology for the PC. Dragon NaturallySpeaking 1.0 came out in the summer of 1997, and those who wanted to dictate to their computers no longer had to pause … between … words.

Originally, the user had to “train” the software for about 45 minutes by reading it a canned text, and the resulting accuracy of about 75 percent meant you couldn’t finish a short sentence without several glaring errors. Today, with the software having changed hands twice before arriving at version 9.5, training takes only minutes and out-of-the-box accuracy is about 95 percent, meaning you can expect about one error per run-on sentence. Dragon’s current vendor, Nuance Communications Inc. of Burlington, MA, reports that sales are booming.

Chris Strammiello, a Nuance spokesman, told LiveScience that Dragon did not catch on with the mass market until Version 8.0 came out in June 2004, offering enough accuracy (thanks to improved algorithms and faster computers) to be truly useful. Sales have been increasing by 30 percent per year since then, he said. (Strammiello would not break out Dragon’s contribution to Nuance’s bottom line, but the firm’s gross sales rose from $130.9 million in 2004 to $232.4 million in 2005 to $388.5 million in 2006.)

Up from 95 percent

Actually, my extensive personal use shows that 95 percent is about as accurate as typing, with the software’s chief advantage being that it can keep up with a conversational speed of 140 words per minute, which is easily three times faster than most people can type.

Proofreading is a strange experience, since you are seeing the text for the first time, and you can be confused between what you meant to say, what you really said, and what the computer heard. Long words are almost invariably correct, while short words sometimes seem interchangeable.

Getting to 99 percent accuracy is possible in several weeks using the software’s correction facilities, by which it gradually adjusts itself to your voice. But speaking clearly and consistently is all-important. The personal version of Dragon retails for about $200, while the professional version costs about $765.
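To put the accuracy figures in perspective, here is a rough back-of-the-envelope sketch. It assumes that "accuracy" means the share of words recognized correctly and that errors are spread evenly; the 20-word sentence length and the function name are illustrative assumptions, not figures from Nuance.

```python
def expected_errors(accuracy: float, words: int) -> float:
    """Expected number of misrecognized words, treating accuracy as a per-word rate."""
    return (1.0 - accuracy) * words


DICTATION_WPM = 140   # conversational speed quoted in the article
SENTENCE_WORDS = 20   # assumed length of a long sentence (not from the article)

# Accuracy levels mentioned in the article: 1997, out of the box today, after weeks of correction
for accuracy in (0.75, 0.95, 0.99):
    per_sentence = expected_errors(accuracy, SENTENCE_WORDS)
    per_minute = expected_errors(accuracy, DICTATION_WPM)
    print(f"{accuracy:.0%} accurate: {per_sentence:.1f} errors per sentence, "
          f"{per_minute:.1f} per minute of dictation")
```

Under these assumptions, 95 percent accuracy works out to about one error per 20-word sentence and about seven per minute of continuous dictation, while 99 percent cuts that to roughly one or two per minute.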

Painful decade

The history of SR, over the past decade and earlier, has not been a continuous series of triumphs, as the technology was nearly sunk twice by rampant hucksterism. One of the pioneers in the SR field was Kurzweil Applied Intelligence, two of whose executives were sentenced to prison in 1993 for inventing sales. The remains of that firm were bought in 1997 by a Belgium-based SR firm, Lernout & Hauspie (L&H), which was then reporting steady sales growth.

Dragon’s original vendor, Dragon Systems, was not reporting much growth after releasing NaturallySpeaking in 1997, and in 2000 L&H stepped forward and bought the struggling firm in a stock deal. A few months later, L&H’s sales growth was exposed as fakery, and it collapsed.

ScanSoft Inc. bought the Dragon SR technology at a bankruptcy auction in late 2001 and has continued development through three upgrades since then, meanwhile changing its name to Nuance Communications.

SR elsewhere

SR facilities are also included in Microsoft Office XP, although the fact is apparently not known to most users. Industry observers considered it a test version, as it required a mouse for navigation and correction, unlike Dragon.

Microsoft’s Windows Vista has an enhanced version of SR that, like Dragon, does not need a mouse.

IBM ViaVoice was also once a competitor of Dragon, but IBM has licensed the software to Nuance, which uses it as an entry-level product. No other large-vocabulary desktop SR products are being marketed in the United States.