
Project Title: Speech Recognition

Project Members:

Aparna Sharma - 2000910100029
Ashish Katiyar - 2000910100036
Daksh Garg - 2000910100055

Supervisor Name: Dr. Pradeep Kumar

• Speech plays a pivotal role in conveying thoughts, ideas, and intentions.

• Speech recognition is a field of artificial intelligence and computer science that focuses on
the development of systems and algorithms capable of converting spoken language into
written text.

• It has garnered increasing significance in recent years due to its diverse applications across
various domains.

• It is transforming the way we interact with technology and each other, from enabling
hands-free interactions with devices and facilitating accessibility for differently-abled
individuals to revolutionizing customer service and transcription services.

The objectives of speech recognition technology vary with the specific application and
context. Some of them are as follows:

• Transcription and Documentation: One of the primary objectives of speech recognition is to
transcribe spoken language into written text efficiently and accurately.

• Automation: Speech recognition aims to automate tasks that traditionally required manual input.

• Voice Search: Enabling users to search for information, services, or products by speaking their

• Human-Machine Interaction: The objective is to make human-machine interactions more natural
and intuitive.

• Multilingual and Accent Support: The objective is to make speech recognition technology
capable of understanding multiple languages and accents.

A literature survey of speech recognition would typically involve a comprehensive review of
research and studies in the field, summarizing the key findings, developments, and trends.

• The history of speech recognition dates back to the 1950s. Early efforts primarily involved
pattern matching techniques and simple acoustic models.

• The field later transitioned to deep learning techniques, such as Deep Neural Networks (DNNs)
and Convolutional Neural Networks (CNNs).

• The advent of Recurrent Neural Networks (RNNs) and Long Short-Term Memory (LSTM)
networks improved language modeling, leading to better contextual understanding.

• Prior work on Automatic Speech Recognition (ASR) has aimed to develop effective ASR for
different languages and to show the technological perspective of ASR in different countries.
These studies used artificial neural networks (ANNs), mathematical models of the low-level
circuits in the human brain, to improve speech-recognition performance through a model known
as the ANN-Hidden Markov Model (ANN-HMM), which has shown improvements in large-vocabulary
speech recognition.
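To make the HMM half of an ANN-HMM hybrid more concrete, the sketch below runs Viterbi decoding on a toy two-state model. It is only an illustration under stated assumptions: the states ("sil", "speech"), the observation symbols ("low", "high") and all probabilities are hypothetical, and in a real hybrid system the emission scores would come from a neural network rather than a fixed table.

```python
import math

def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (log-probability, state path) of the most likely state sequence.

    Assumes every probability used is non-zero, since the scores are
    accumulated in log space to avoid numerical underflow.
    """
    # Scores for the first observation.
    trellis = [{s: math.log(start_p[s]) + math.log(emit_p[s][observations[0]])
                for s in states}]
    backptr = [{}]

    # Fill the trellis one time step at a time.
    for t in range(1, len(observations)):
        trellis.append({})
        backptr.append({})
        for s in states:
            best_prev, best_score = max(
                ((p, trellis[t - 1][p] + math.log(trans_p[p][s])) for p in states),
                key=lambda pair: pair[1],
            )
            trellis[t][s] = best_score + math.log(emit_p[s][observations[t]])
            backptr[t][s] = best_prev

    # Trace back from the best final state.
    last = max(states, key=lambda s: trellis[-1][s])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(backptr[t][path[-1]])
    path.reverse()
    return trellis[-1][last], path

# Hypothetical toy model: silence vs. speech, observed via frame energy level.
STATES = ["sil", "speech"]
START_P = {"sil": 0.8, "speech": 0.2}
TRANS_P = {"sil": {"sil": 0.7, "speech": 0.3},
           "speech": {"sil": 0.2, "speech": 0.8}}
EMIT_P = {"sil": {"low": 0.9, "high": 0.1},
          "speech": {"low": 0.2, "high": 0.8}}

log_prob, best_path = viterbi(["low", "high", "high"], STATES, START_P, TRANS_P, EMIT_P)
print(best_path)
```

With these made-up numbers the decoder labels the first frame as silence and the next two as speech.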


• Processor
• Memory (RAM)
• Storage
• Graphics Processing Unit (GPU)


• Operating System
• Programming Language (Python)
• Machine Learning Frameworks (TensorFlow or PyTorch, Keras)
• Speech Recognition Libraries (SpeechRecognition, Librosa, NLP)
• Development Environment (Jupyter Notebook or JupyterLab)
• Data Preprocessing Tools (Pandas and NumPy, Scikit-learn)
• Documentation and Collaboration (LaTeX or Overleaf, Git and GitHub)
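As a rough idea of the preprocessing work these tools would support, here is a minimal sketch of two common front-end steps, pre-emphasis and framing. It is written in plain Python only so that it runs without dependencies; a real pipeline would more likely use NumPy or Librosa. The function names, the 0.97 coefficient and the frame sizes are illustrative assumptions, not values taken from the project.

```python
def pre_emphasis(samples, coeff=0.97):
    """Apply y[n] = x[n] - coeff * x[n-1], which boosts higher frequencies."""
    if not samples:
        return []
    return [samples[0]] + [
        samples[n] - coeff * samples[n - 1] for n in range(1, len(samples))
    ]

def frame_signal(samples, frame_len, hop_len):
    """Split a signal into overlapping frames, dropping any trailing partial frame."""
    return [
        samples[start:start + frame_len]
        for start in range(0, len(samples) - frame_len + 1, hop_len)
    ]

def frame_energy(frame):
    """Mean squared amplitude of one frame, a simple loudness measure."""
    return sum(x * x for x in frame) / len(frame)

# Hypothetical toy signal: four quiet samples followed by four louder ones.
signal = [0.0, 0.1, 0.0, -0.1, 1.0, -1.0, 1.0, -1.0]
frames = frame_signal(pre_emphasis(signal), frame_len=4, hop_len=4)
energies = [frame_energy(f) for f in frames]
print(energies)
```

Per-frame energies like these are one simple way to separate quiet from loud regions before features are extracted for a model.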

• Problem Identification (Weeks 0-1)

• Literature Review (Weeks 1-2)
• Data Collection and Preprocessing (Weeks 2-3)
• Model Analysis (Weeks 3-5)
• Experiments and Evaluation (Weeks 5-7)
• Results, Analysis, and Conclusion (Weeks 7-8)
• Paper Writing and Revision (Weeks 8-11)
• Submission and Review (Weeks 11-12)
• Finalization and Publication (Weeks 12-14)
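For the Experiments and Evaluation step, the slides do not name a metric. Word error rate (WER) is a common choice for transcription accuracy and is assumed here purely for illustration. The sketch computes WER as the word-level edit distance between a reference transcript and a system output, divided by the number of reference words; the function name and example sentences are hypothetical.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # Levenshtein distance over words, keeping only the previous DP row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)

# One substituted word and one dropped word out of six reference words.
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
print(round(wer, 3))
```

A lower value means a closer match to the reference. WER can exceed 1.0 when the output contains many inserted words.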

• In conclusion, this research endeavors to advance the field of speech recognition technology,
offering a deeper understanding of its models, challenges, and potential enhancements.

• The identified challenges, including diverse accents, ambient noise, mispronunciation and
response time, shed light on the limitations of current speech recognition systems.

• The proposed enhancements, encompassing integration of deep learning techniques,
reinforcement learning, and optimized preprocessing methods, are poised to elevate model
performance, making strides towards real-time capabilities and reduced response times.

• This research aspires to contribute to a future where speech recognition technology seamlessly
integrates into our lives, making interactions with technology intuitive, efficient, and



