MINI PROJECT REPORT
Submitted by
Arunava Sikder (21BSD7004)
Thaddi Mojesh (21BSD7050)
M. Vinay (21BSD7030)
G. Uday Kiran (21BSD7018)
Speech to Text Recognition and Language Translation
Technology
ABSTRACT
Speech Recognition and Language Translation: Bridging Communication Gaps

In our increasingly interconnected world, effective communication is paramount. Speech recognition and language translation technologies have emerged as pivotal tools in overcoming linguistic barriers and fostering seamless interactions. This abstract delves into the core concepts of these fields, exploring their synergies and impact on global communication.

Speech Recognition: Advancements in speech recognition technology have revolutionized the way we interact with devices and systems. From virtual assistants to transcription services, the ability of machines to accurately convert spoken language into text has far-reaching implications. The abstract highlights the underlying principles of speech recognition, such as acoustic modeling and language modeling, and examines the challenges and breakthroughs that have shaped its evolution.

Language Translation: Language translation goes beyond mere word substitution; it encompasses the nuanced understanding of cultural context and idiomatic expressions. Modern machine translation systems leverage artificial intelligence and neural networks to achieve more accurate and context-aware translations. This abstract discusses the methodologies behind machine translation, including statistical and neural approaches, shedding light on their strengths and limitations.
Interplay Between Speech Recognition and Language Translation: The synergy between speech recognition and language translation is pivotal in developing systems that seamlessly convert spoken words in one language into written or spoken words in another. Integrating these technologies enhances accessibility and fosters cross-cultural communication. The abstract explores how these two domains complement each other and the challenges posed by real-time, multilingual communication scenarios.
Challenges and Future Directions: While significant progress has been made, challenges persist in achieving perfect accuracy and addressing language nuances.
The abstract concludes by discussing potential avenues for future research, such as the
incorporation of context-aware models and the development of more inclusive and diverse datasets
to further improve the robustness and effectiveness of speech recognition and language translation
technologies.
In summary, this abstract provides a comprehensive overview of the advancements, challenges,
and future directions in speech recognition and language translation, emphasizing their pivotal role
in shaping the way we communicate in our interconnected global society.
Keywords:
Speech Recognition, Language Translation, Communication Gaps, Interconnected World, Virtual
Assistants, Acoustic Modeling, Language Modeling, Machine Translation, Artificial Intelligence,
Neural Networks, Context-aware Translations, Synergy, Cross-cultural Communication, Challenges,
Breakthroughs, Future Directions, Real-time Communication, Multilingual Communication, Ethical
Considerations, Global Impact
ACKNOWLEDGEMENT
This project has been a collaborative endeavor, and its successful completion is the result of the dedication, support, and expertise of several individuals and groups. We would like to express our sincere gratitude to those who have contributed to the realization of this speech-to-text and language translation mini project.

First and foremost, we extend our deepest appreciation to our professor Dr. Manimaran Aridoss, whose guidance and insights were invaluable throughout the development process. Your mentorship provided a steady direction, fostering an environment of learning and growth.

Our gratitude extends to our team members who worked tirelessly to overcome challenges, share ideas, and contribute their unique skills to make this project a success. Each member played a crucial role, and the collaborative spirit within the team has been a driving force behind the project's achievements.
Special thanks to the participants who volunteered their time for testing and providing valuable
feedback. Your input significantly enhanced the functionality and usability of the speech-to-text
and language translation system.
Lastly, we acknowledge the broader community and the wealth of knowledge available through
open-source contributions. The collaborative nature of the tech community has been a constant
source of inspiration and assistance throughout our project.
This project marks not only a technological achievement but also a journey of personal and
collective growth. The support and collaboration of each individual and entity mentioned above
have been crucial, and we sincerely appreciate the role each one has played in the success of this
endeavor.
CONTRIBUTIONS BY THE TEAM:
TABLE OF CONTENTS
1. Introduction 7
2. Overview 9
   2.1 Intersecting Realms 9
   2.2 Challenges
3. Proposed Approach: Speech to Text Technology 10
4. Language Translation 13
5. APPENDIX 15
   A.1 Speech Recognition 15
   A.1.1 Install the required libraries 15
1. INTRODUCTION
In many areas of natural language processing research, tools such as Google's speech services, Siri, and other built-in utilities are used to convert natural language to text. In our project, we performed these conversions with reference to the existing speech recognition tools. To explore the working mechanism of speech recognition, we carried out the entire project in Python, which makes it straightforward to convert spoken natural language into multilingual text. As we discussed earlier, language acts as a bridge between people; with this in mind, we have built a model that takes spoken input from the user. The recognized speech data is recorded in the database and translated into the language the user selects for display. This model was developed by adding multilingual features to the existing Google Speech Recognition model, based on natural language processing principles.
To create a model that supports speech recognition, you need to import packages such as SpeechRecognition, PyAudio, and Tkinter. These make it easy for our model to recognize the user's voice. SpeechRecognition handles the recognition itself; its accuracy depends on the utterances and the echo of the source, so one must speak clearly to be recognized by the computer system. Once the speech is recognized, the audio is broken down and converted into the desired text. We used the Tkinter package to display the output in a pop-up window; Tkinter ships with the standard Python distribution, while the optional tkintertable add-on can be installed at the prompt with the command pip install tkintertable.

So, when we take input from the user, the model converts all of the speech data into the desired text. Natural language processing enables the computer system to recognize the speech, and the imported packages help build the pop-up message box that contains the text output.
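As a minimal illustration of the pop-up output described above, the sketch below shows how a recognized string could be displayed in a Tkinter message box. The helper name show_transcript and the sample usage are our own illustrative assumptions, not the project's actual code.

```python
import tkinter as tk
from tkinter import messagebox

def show_transcript(text: str) -> None:
    """Display recognized speech text in a pop-up message box."""
    root = tk.Tk()
    root.withdraw()  # hide the empty main window; we only want the pop-up
    messagebox.showinfo("Recognized Speech", text)
    root.destroy()

# Usage (requires a display):
# show_transcript("hello world")
```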
Speech Recognition: Speech recognition, also known as Automatic Speech Recognition
(ASR), is a pivotal technology that enables machines to interpret and comprehend spoken
language. The evolution of ASR has witnessed remarkable progress, from early systems
based on pattern matching to sophisticated models driven by machine learning algorithms.
Today, speech recognition is not only a cornerstone of virtual assistants and voice-activated
devices but also a key player in transcription services, voice biometrics, and hands-free
operation of various applications.

Language Translation: Language translation, the art of rendering spoken or written content
from one language into another, has experienced a paradigm shift with the advent of
machine translation. Traditional rule-based systems have given way to statistical and neural
machine translation models, leveraging the power of artificial intelligence to grasp linguistic
nuances and context. Modern translation technologies not only break down language
barriers but also contribute to cross-cultural understanding by preserving the cultural and
contextual richness of the original content.
2.Overview
Intersecting Realms:
While speech recognition and language translation address distinct
aspects of communication, their synergy is increasingly evident in
applications that aim to facilitate seamless multilingual interactions. The
integration of these technologies allows for real-time translation of spoken
words, opening new frontiers in international collaboration, travel, and
accessibility. As these fields continue to advance, the promise of achieving
fluid, natural communication across diverse languages becomes more
tangible.
3. Proposed Approach: Speech to Text Recognition
Bringing the Tools into Play: Once the essential libraries are installed, we must import them into our programming environment. We'll import speech_recognition as sr, google_trans_new, and pyttsx3 to harness the power of speech recognition, translation, and text-to-speech.
Translating the Recognized Words:
With our translator in hand, we can now translate the recognized text into
Spanish using translator.translate(result, dest='es'). This will transform the
English words into their Spanish counterparts, enabling seamless
communication across languages.
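The translation call above can be sketched as follows. This assumes the googletrans package is installed (for example, pip install googletrans==4.0.0rc1); the import guard and the translate_to_spanish helper are our own illustrative additions, and result stands for the recognized text from the previous step.

```python
try:
    from googletrans import Translator  # pip install googletrans==4.0.0rc1
    HAVE_GOOGLETRANS = True
except ImportError:
    HAVE_GOOGLETRANS = False

def translate_to_spanish(result: str) -> str:
    """Translate recognized English text into Spanish ('es')."""
    if not HAVE_GOOGLETRANS:
        raise RuntimeError("googletrans is not installed")
    translator = Translator()
    return translator.translate(result, dest='es').text
```

Calling translate_to_spanish requires a network connection, since googletrans talks to Google's translation service.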
Language Translation
Language translation is the conversion of spoken or written
content from one language to another, preserving its original
meaning and intent, facilitating communication and
understanding across diverse linguistic communities.
Setting the Stage for Language Translation:
• Installing the Essential Components: To embark on our language translation adventure, we must first gather the necessary tools. Using pip, the package manager for Python, we install the mtranslate, google_trans_new, and googletrans libraries. These libraries provide the foundation for our translation and convert the recognized text into another language (from English to any other language available in googletrans).
• Importing the libraries:
1. from mtranslate import translate
2. from googletrans import Translator, LANGUAGES
APPENDIX
A.1 Speech Recognition
A.1.1 Install the required libraries
# Install the required libraries using the pip command, for example:
pip install SpeechRecognition
pip install PyAudio
pip install googletrans
pip install google_trans_new
pip install pyttsx3
pip install mtranslate
Define the languages:
#code
from googletrans import LANGUAGES
LANGUAGES
{'af': 'afrikaans',
'sq': 'albanian',
'am': 'amharic',
'ar': 'arabic',
'hy': 'armenian',
'az': 'azerbaijani',
'eu': 'basque',
'be': 'belarusian',
'bn': 'bengali',
'bs': 'bosnian',
'bg': 'bulgarian',
'ca': 'catalan',
'ceb': 'cebuano',
'ny': 'chichewa',
'zh-cn': 'chinese (simplified)',
'zh-tw': 'chinese (traditional)',
'co': 'corsican',
'hr': 'croatian',
'cs': 'czech',
'da': 'danish',
'nl': 'dutch',
'en': 'english',
'eo': 'esperanto',
'et': 'estonian',
'tl': 'filipino', ............
'cy': 'welsh',
'xh': 'xhosa',
'yi': 'yiddish',
'yo': 'yoruba',
'zu': 'zulu'}  # many more languages are available in googletrans
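A common convenience on top of this mapping is a reverse lookup from language name to code. The snippet below is a self-contained sketch using a small local subset of the table, so it runs without googletrans installed; with the real package you would build lang_to_code from googletrans.LANGUAGES instead.

```python
# Small local subset of the googletrans LANGUAGES mapping, for illustration.
LANGUAGES = {
    'af': 'afrikaans',
    'en': 'english',
    'es': 'spanish',
    'zh-cn': 'chinese (simplified)',
    'zu': 'zulu',
}

# Invert the mapping so users can select a language by name instead of code.
lang_to_code = {name: code for code, name in LANGUAGES.items()}

print(lang_to_code['english'])  # -> en
print(lang_to_code['spanish'])  # -> es
```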
#import the libraries
import speech_recognition as sr
from google_trans_new import google_translator
import pyttsx3
This block adjusts the recognizer for ambient noise. It
helps in minimizing the impact of background noise on
the accuracy of speech recognition.
The adjust_for_ambient_noise method is used to
measure the ambient noise level for one second and
adjust the recognizer accordingly.
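A hedged sketch of that calibration step is shown below, assuming the SpeechRecognition and PyAudio packages are installed. The import guard and the capture_speech wrapper are our own additions; a working microphone is required at call time.

```python
try:
    import speech_recognition as sr  # pip install SpeechRecognition PyAudio
    HAVE_SR = True
except ImportError:
    HAVE_SR = False

def capture_speech(duration: float = 1.0):
    """Calibrate for ambient noise, then record one utterance."""
    if not HAVE_SR:
        raise RuntimeError("SpeechRecognition is not installed")
    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        # Measure ambient noise for `duration` seconds and adjust the
        # recognizer's energy threshold accordingly.
        recognizer.adjust_for_ambient_noise(source, duration=duration)
        return recognizer.listen(source)
```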
This block attempts to recognize the speech using
Google's Web Speech API. The recognize_google
method is called with the recorded audio data (audio).
The language parameter is set to 'en' for English. If
speech is successfully recognized, the result is printed.
If an exception occurs (for example, due to no speech
being detected or a network error), the exception is
caught and printed.
Finally, the pyttsx3 library is used to convert the recognized text into audible speech. That step is not shown in the block above; to read out the recognized text with pyttsx3, you would typically add something like:
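The following sketch shows the typical pyttsx3 read-out step. The import guard and the speak helper are our own illustrative additions; in the project's script, you would pass the text returned by recognize_google.

```python
try:
    import pyttsx3  # pip install pyttsx3
    HAVE_TTS = True
except ImportError:
    HAVE_TTS = False

def speak(text: str) -> None:
    """Read the recognized text aloud using the default system voice."""
    if not HAVE_TTS:
        raise RuntimeError("pyttsx3 is not installed")
    engine = pyttsx3.init()
    engine.say(text)
    engine.runAndWait()
```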
Creating the Interface
3. The specified font is 'Arial 13 bold'.
4. The background color (bg) is set to 'yellow'.
5. The place() method is used to specify the exact coordinates (x=165, y=40) where the label will be placed within the window.
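Steps 3-5 above correspond to a Tkinter call like the sketch below. The font, background color, and coordinates come from the listed steps, while the label text and the build_title_label helper are assumed names for illustration.

```python
import tkinter as tk

def build_title_label(root: tk.Misc) -> tk.Label:
    """Create the heading label described in steps 3-5 above."""
    label = tk.Label(root, text="Language Translator",  # hypothetical text
                     font='Arial 13 bold', bg='yellow')
    label.place(x=165, y=40)  # exact coordinates within the window
    return label

# Usage (requires a display):
# root = tk.Tk(); build_title_label(root); root.mainloop()
```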
Placing the Entry Widget in the Window:
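This step could look like the following sketch. The widget name Input_text matches the description of the interface later in this report, while the width and coordinates are assumed values for illustration.

```python
import tkinter as tk

def build_input_entry(root: tk.Misc) -> tk.Entry:
    """Create the Input_text Entry widget and place it in the window."""
    Input_text = tk.Entry(root, width=30)  # width is an assumed value
    Input_text.place(x=30, y=100)          # assumed coordinates
    return Input_text
```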
Key steps:
Initialization:
Create a Translator object from the googletrans library.
Input Retrieval:
Retrieve the input text from the Input_text Entry widget.
Destination Language Retrieval:
Retrieve the selected destination language from the dest_lang
Combobox.
Translation:
Check if both the input text and destination language are
provided.
If provided, use the translator to translate the input text to the
selected language.
Clear the existing content in the output_text Text widget.
Insert the translated text into the output_text Text widget.
Error Handling:
If there is an exception during the translation process, print an
error message.
Missing Input or Language:
If either the input text or the destination language is missing, print an error message.
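The key steps above can be sketched as a single function. To keep the sketch testable without a network connection, the translator object is passed in as a parameter; with googletrans you would pass Translator(). The widget handling (reading Input_text and dest_lang, writing output_text) is omitted here, and translate_input is our own illustrative name. The sketch raises exceptions where the description says to print error messages, which the GUI code can catch and display.

```python
def translate_input(text: str, dest: str, translator) -> str:
    """Validate the inputs, then translate `text` into language `dest`."""
    if not text or not text.strip():
        raise ValueError("no input text provided")          # missing input
    if not dest:
        raise ValueError("no destination language selected")  # missing language
    try:
        return translator.translate(text, dest=dest).text
    except Exception as exc:
        raise RuntimeError(f"translation failed: {exc}") from exc
```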
Output:
CONCLUSION
REFERENCES: