Presentation 60
Technical Seminar on
Speech Recognition
Guided by: Dr. A. A. Khodaskar
Presented by: Mrunal Pradeep Tambakhe (19BE0463)
Department of Computer Science and Engineering
CONTENT
Introduction
History
Working
Advantages
Disadvantages
Applications
Future scope
Reference
INTRODUCTION
The first speech recognition systems focused on numbers, not words. In 1952, Bell Laboratories
designed the “Audrey” system, which could recognize a single voice speaking digits aloud.
Ten years later, IBM introduced “Shoebox”, which understood and responded to 16 words in English.
By 2001, speech recognition technology had reached close to 80% accuracy. Progress then slowed for
most of the decade until Google launched Google Voice Search. Because it was an app, it put speech
recognition into the hands of millions of people.
In 2011, Apple launched Siri, which was similar to Google’s Voice Search. The early 2010s saw an
explosion of other voice recognition apps, and with Amazon’s Alexa and Google Home, consumers have
become more and more comfortable talking to machines.
Today, some of the largest tech companies are competing for the speech accuracy title, measured by
word error rate (the share of words transcribed incorrectly). In 2016, IBM achieved a word error
rate of 6.9 percent. Later that year, Microsoft overtook IBM with a 5.9 percent claim. In 2017, IBM
improved its rate to 5.5 percent. However, it is Google that claims the lowest rate, at 4.9 percent.
WORKING
The first component of speech recognition is, of course, speech. Speech must be
converted from physical sound to an electrical signal with a microphone, and then to
digital data with an analog-to-digital converter. Once digitized, several models can be
used to transcribe the audio to text.
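The digitization step described above can be illustrated with a minimal sketch. It assumes a pure sine tone in place of a real microphone signal, a 16 kHz sample rate, and 16-bit samples. The function names are hypothetical and not part of any library.

```python
import math
import struct


def sample_and_quantize(freq_hz, duration_s, sample_rate=16000, bits=16):
    """Sample a sine tone and quantize it to signed integers, as an ADC would."""
    max_level = 2 ** (bits - 1) - 1          # 32767 for 16-bit audio
    n_samples = round(duration_s * sample_rate)
    samples = []
    for n in range(n_samples):
        t = n / sample_rate                  # time of the n-th sample, in seconds
        amplitude = math.sin(2 * math.pi * freq_hz * t)  # "analog" value in [-1, 1]
        samples.append(round(amplitude * max_level))     # nearest integer level
    return samples


def to_pcm_bytes(samples):
    """Pack 16-bit samples as little-endian PCM, the layout used in WAV files."""
    return struct.pack("<%dh" % len(samples), *samples)


# Assumed example input: a 440 Hz tone lasting 10 milliseconds.
pcm = sample_and_quantize(freq_hz=440.0, duration_s=0.01)
raw = to_pcm_bytes(pcm)
print(len(pcm), "samples,", len(raw), "bytes")
```

A higher sample rate captures more of the signal per second, and more bits per sample give finer amplitude steps. Both increase the amount of digital data a recognizer has to process.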
Installing SpeechRecognition: $ pip install SpeechRecognition
Working with a microphone: install the PyAudio package, which SpeechRecognition uses for microphone input.
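With both packages installed (PyAudio is typically installed with `$ pip install PyAudio`), transcription might look like the sketch below. The function names `transcribe_file` and `transcribe_microphone` are hypothetical. The calls on `Recognizer`, `AudioFile`, and `Microphone` come from the SpeechRecognition library, and `recognize_google` sends the audio to Google's web speech API, so it needs an internet connection.

```python
def _recognize(sr, recognizer, audio):
    """Send captured audio to the Google web speech API and return the text."""
    try:
        return recognizer.recognize_google(audio)
    except sr.UnknownValueError:
        # The service could not make out any words in the audio.
        return None
    except sr.RequestError as exc:
        # The service was unreachable or rejected the request.
        raise RuntimeError(f"Speech API request failed: {exc}") from exc


def transcribe_file(path):
    """Transcribe a WAV, AIFF, or FLAC file; returns None if nothing is understood."""
    # Imported here so this sketch loads even where the package is missing.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.AudioFile(path) as source:
        audio = recognizer.record(source)    # read the whole file into memory
    return _recognize(sr, recognizer, audio)


def transcribe_microphone():
    """Listen for one phrase on the default microphone (requires PyAudio)."""
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        # Calibrate the energy threshold against background noise first.
        recognizer.adjust_for_ambient_noise(source, duration=0.5)
        audio = recognizer.listen(source)    # blocks until a phrase ends
    return _recognize(sr, recognizer, audio)
```

Calling `transcribe_file("sample.wav")` with a hypothetical recording, or `transcribe_microphone()` with a microphone attached, returns the recognized text as a string.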
ADVANTAGES
Microsoft Cortana
FUTURE SCOPE