Nuance Communications, the developer of the popular speech recognition software Dragon, has bet on deep learning and neural networks to improve its speech recognition engine. Nils Lenke, senior director of corporate research at Nuance, explained in a blog post yesterday that the company’s Dragon Professional Individual and Dragon Legal Individual offerings are the first to implement this technology.
Over the years, Nuance has developed what Lenke calls “speaker-independent” speech recognition technology that works reliably for nearly every user. However, he added that because the company’s Dragon Professional Individual and Dragon Legal Individual products are usually used by only one speaker at a time, Nuance has chosen to “go beyond speaker-independent speech recognition by adapting to each user in a speaker-dependent way.” Concretely, the software can learn the words and phrases a speaker typically uses and adapt to how a voice sounds in different environments. You can also opt in to let the software work in the background and learn more about your voice when you are not using it. Lenke added:
Dragon uses Deep Neural Networks end-to-end both at the level of the language model — capturing the frequency of words and in which combinations they typically occur — and of the acoustic model, deciphering the smallest spoken units, or phonemes of a language.
These models are quite large and before they leave our labs, they have already been trained on lots and lots of data. Adapting those Deep Neural Networks that make up the acoustic model to the speech coming from the user is similar to training them, and we want to make that happen on the user’s PC, Mac or laptop – and we want it to be fast. Packaging this process in a way that allows the individual to run it on their desktop or laptop is the culmination of many years of innovation in speech recognition and machine learning R&D.
The latest versions of Dragon Professional Individual and Dragon Legal Individual will be available as digital downloads on September 1, with physical versions following two weeks later. Dragon Professional will cost you $300, or just $150 if you upgrade from the previous version, and you can learn more about the products on the official website. If you have already used speech recognition software, let us know in the comments whether you think deep learning can improve accuracy, speed and efficiency.