What are the current Voice Recognition word error rates for Google, Amazon, Microsoft, IBM, Apple, Baidu and other leading competitors?


Hello! Thanks for your question about voice recognition word error rates. The most useful sources I found to answer your question are Cornell University Library, VentureBeat, and TechCrunch. The short version is that the most recent publicly reported word error rates are Google (8%), Microsoft (5.9%), IBM (5.5%), Apple (5%), Baidu (16%) and Hound (5%). These figures were reported in different years and on different test sets, so they are not directly comparable. Even after extensive research, I found no exact word error rate for Amazon. Below you will find a deep dive into my findings.

DEEP-DIVE

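Word Error Rate (WER) is a metric for measuring the performance of voice recognition systems. It is the number of word substitutions, deletions, and insertions needed to turn the system's transcript into the reference transcript, divided by the number of words in the reference. A lower WER is better.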
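To make the metric concrete, here is a minimal sketch of a WER calculation in Python. It is an illustration only, not the scoring tool used by any of the companies below: the function name `word_error_rate` and the sample sentences are assumptions, and it splits on whitespace without the text normalization (casing, punctuation, number formats) that real evaluations typically apply.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / reference word count.

    Hypothetical helper for illustration; words are split on whitespace only.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the word-level edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # reference word missing from hypothesis (deletion)
                curr[j - 1] + 1,     # extra word in hypothesis (insertion)
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one deleted word ("to") and one substituted word ("a" -> "the")
# against a six-word reference.
wer = word_error_rate("i want to buy a car", "i want buy the car")
print(f"WER: {wer:.1%}")
```

Because insertions count as errors, WER can exceed 100% when the transcript contains many extra words.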

*Google - At the company’s annual I/O developer conference in 2015, Sundar Pichai (CEO at Google) announced that Google had a word error rate of 8 percent. He credited investments in deep learning companies such as DeepMind, DNNresearch, and Jetpac for this advancement.

*Amazon - At VentureBeat’s 2016 MobileBeat conference, Rohit Prasad (vice president of Alexa Machine Learning and Speech at Amazon) said that Alexa's speech recognition error rate, and hence its goal completion error rate, had gone down by a factor of 2. He did not give an absolute figure.

*Microsoft - According to a study titled "The Microsoft 2016 Conversational Speech Recognition System" (revised in January 2017), Microsoft's combined system has a word error rate of 6.2%, an improvement on the previous 6.3%. In 2017, the company achieved a word error rate of 5.9% in conversational speech recognition.

*IBM - In March 2017, IBM announced that its word error rate is 5.5 percent. This was measured on recorded conversations between humans discussing day-to-day topics like “buying a car.” IBM combined its LSTM (Long Short-Term Memory) and WaveNet language models with other acoustic models to achieve this result.

*Apple - At the Worldwide Developers Conference (WWDC) in 2015, Craig Federighi (Apple’s senior vice president of software engineering) announced that the word error rate of Siri (Apple's conversational speech recognition technology) is 5 percent.

*Baidu - Deep Speech is the Baidu software for English and Deep Speech 2 is the Baidu software for Mandarin. In a research paper published in 2015, Baidu stated that its Deep Speech software has a 16% word error rate on the full Switchboard corpus data set. On a noisy test set developed by Baidu, the system had a word error rate of 19.1%.

*Hound - Hound is the flagship product of SoundHound. It has a word error rate of 5%. According to its founder, Keyvan Mohajer, SoundHound entered the voice recognition industry earlier than others and hence was able to get a head start.

*Samsung - In 2016, Samsung acquired Viv, a machine-learning virtual assistant company. Viv was started by Siri co-founder Dag Kittlaus.

CONCLUSION
To wrap it up, the current Voice Recognition error rates of various companies are Google (8%), Microsoft (5.9%), IBM (5.5%), Apple (5%), Baidu (16%) and Hound (5%). Even after extensive research, no data regarding the exact percentage of word error rate for Amazon was found. Thanks for using Wonder! Please let us know if we can help with anything else!
