
VIRTUAL INTELLIGENCE VOICE ASSISTANT SYSTEM

A MINI PROJECT REPORT

Submitted by

BALASUBRAMANIAM G [REGISTER NO: 2114120205026]

In partial fulfillment for the award of the degree

of

BACHELOR OF TECHNOLOGY

IN

INFORMATION TECHNOLOGY

PANIMALAR ENGINEERING COLLEGE, CHENNAI 600 123

(An Autonomous Institution)

ANNA UNIVERSITY: CHENNAI 600 025

MAY 2023
ANNA UNIVERSITY: CHENNAI 600 025

BONAFIDE CERTIFICATE

Certified that this project report “VIRTUAL INTELLIGENCE VOICE ASSISTANT SYSTEM” is the bonafide work of BALASUBRAMANIAM G (211421205026), who carried out the project under my supervision.

SIGNATURE SIGNATURE

Dr. M. HELDA MERCY, M.E., Ph.D., Dr. B. KARTHIKEYAN, M.TECH., Ph.D.,


HEAD OF THE DEPARTMENT SUPERVISOR
Department of Information Technology Department of Information Technology

Panimalar Engineering College Panimalar Engineering College

Poonamallee, Chennai - 600 123 Poonamallee, Chennai - 600 123

Submitted for the project and viva-voce examination held on

SIGNATURE SIGNATURE

INTERNAL EXAMINER EXTERNAL EXAMINER

DECLARATION

I hereby declare that the project report entitled “VIRTUAL INTELLIGENCE
VOICE ASSISTANT SYSTEM”, which is being submitted in partial fulfilment of the
requirements of the course leading to the award of the ‘Bachelor of Technology in
Information Technology’ at Panimalar Engineering College, an Autonomous
Institution affiliated to Anna University, Chennai, is the result of the project carried
out by me under the guidance of Dr. B. KARTHIKEYAN, M.TECH., Ph.D.,
Professor in the Department of Information Technology. I further declare that
neither I nor any other person has previously submitted this project report to any
other institution/university for any other degree/diploma.

BALASUBRAMANIAM G

Date:
Place: Chennai

It is certified that this project has been prepared and submitted under my guidance.

Date: (Dr. B. KARTHIKEYAN, M.TECH., Ph.D.,)


Place: Chennai (PROFESSOR/ IT)

ACKNOWLEDGEMENT

A project of this magnitude and nature requires kind co-operation and
support from many for successful completion. We wish to express our sincere
thanks to all those who were involved in the completion of this project.

Our sincere thanks to our Honorable Secretary and Correspondent,
Dr. P. CHINNADURAI, M.A., Ph.D., for his sincere endeavor in educating us in
his premier institution. We would like to express our deep gratitude to our
dynamic Directors, Mrs. C. VIJAYA RAJESHWARI, Dr. C. SAKTHI
KUMAR, M.E., M.B.A., Ph.D., and Dr. SARANYA SREE SAKTHIKUMAR,
B.E., M.B.A., Ph.D., for providing us with the necessary facilities for the
completion of this project.

We also express our appreciation and gratefulness to our Principal, Dr. K.
MANI, M.E., Ph.D., who helped us in the completion of the project. We wish
to convey our thanks and gratitude to our Head of the Department, Dr. M. HELDA
MERCY, M.E., Ph.D., Department of Information Technology, for her support and
for providing us ample time to complete our project.

We also express sincere thanks to our supervisor, Dr. B. KARTHIKEYAN,
M.TECH., Ph.D., Professor, Department of Information Technology, for providing
the support to carry out the project successfully. Lastly, we thank our parents and
friends for providing their extensive moral support and encouragement during the
course of the project.

ABSTRACT

A voice assistant is an artificial intelligence (AI) application that can understand
and respond to voice commands. It uses speech recognition, natural language
processing, and machine learning techniques to interact with users and provide
personalized and efficient services. The objective of a voice assistant is to provide a
natural and intuitive way for users to interact with technology, improve the user
experience, and automate tasks. Developing a voice assistant can be a complex and
challenging task that requires expertise in several areas, including speech recognition,
natural language processing, and machine learning. The scope of a voice assistant
project can vary depending on the specific application and purpose of the project, but
it typically includes designing a user-friendly interface, developing a speech
recognition system, implementing natural language processing techniques, managing
dialog flow, and integrating with external systems. Overall, a voice assistant can be a
valuable tool for enhancing the user experience and improving efficiency in various
applications, including home automation, personal assistance, and business processes.

Today there is enormous advancement in the technical field, and it is growing day
by day. Initially, only a few jobs could be done by computers, but with the
development of new technologies like machine learning, artificial intelligence, and
deep learning, computers have advanced to the point where they can now complete
almost any task. Artificial intelligence has made significant strides recently, and its
capabilities are growing every day. Natural Language Processing (NLP) is a branch of
AI with many applications; it enables humans to speak to computers in their own
language. Voice assistants are one such application: several have been created, and
their capabilities are still being expanded to improve efficiency and to overcome the
difficulty people have interacting with their machines. To enable users to complete
any job without using a keyboard, we are working to create a voice assistant in
Python. This report's goal is to investigate the intelligent behavior of voice assistants
and how they can be applied to both instructional and practical tasks.

TABLE OF CONTENTS

CHAPTER NO.   TITLE

              ABSTRACT
              LIST OF TABLES
              LIST OF FIGURES
              LIST OF ABBREVIATIONS

1.            INTRODUCTION
              1.1 OVERVIEW OF THE PROJECT
              1.2 NEED FOR THE PROJECT
              1.3 OBJECTIVE OF THE PROJECT
              1.4 SCOPE OF THE PROJECT

2.            LITERATURE SURVEY
              2.1 USERS OF VOICE ASSISTANTS CAN TAKE ADVANTAGE OF A
                  NUMBER OF FASCINATING SERVICES, INCLUDING
              2.2 FEASIBILITY STUDY

3.            SYSTEM DESIGN
              3.1 PROPOSED SYSTEM ARCHITECTURE DESIGN
                  3.1.1 CONCEPTUAL ARCHITECTURE
                  3.1.2 FEATURES
                  3.1.3 BLOCK DIAGRAM FOR PROPOSED SYSTEM
              3.2 SYSTEM ARCHITECTURE
              3.3 MODULE DESIGN
              3.4 IMPLEMENTATION METHODOLOGY

4.            REQUIREMENT SPECIFICATION
              4.1 HARDWARE REQUIREMENT
              4.2 SOFTWARE REQUIREMENT

5.            SYSTEM IMPLEMENTATION
              5.1 CODING
              5.2 SAMPLE SCREENSHOTS

6.            SYSTEM TESTING
              6.1 TEST CASES AND REPORTS
              6.2 SOFTWARE TESTING
                  6.2.1 UNIT TESTING
                        6.2.1.1 SPEECH RECOGNITION
                        6.2.1.2 VOICE QUALITY
                  6.2.2 INTEGRATION TESTING
                        6.2.2.1 TEXT-TO-SPEECH INTEGRATION
                  6.2.3 SYSTEM TESTING
                        6.2.3.1 USABILITY TESTING
                        6.2.3.2 PERFORMANCE TESTING
              6.3 MAINTENANCE

7.            CONCLUSION
              7.1 CONCLUSION
              7.2 FUTURE ENHANCEMENT

8.            REFERENCES
LIST OF TABLES

TABLE NO   TITLE DESCRIPTION

3.3.1      IMPORTED MODULES
6.1        TEST CASES AND REPORT TABLE FOR VIRTUAL ASSISTANT
LIST OF FIGURES

FIG NO   FIGURE DESCRIPTION

3.1.3    BLOCK DIAGRAM FOR PROPOSED SYSTEM
3.2.1    LEVEL 0 DATAFLOW DIAGRAM FOR VIRTUAL ASSISTANT
3.2.2    LEVEL 1 DATAFLOW DIAGRAM FOR VIRTUAL ASSISTANT
3.2.3    LEVEL 2 DATAFLOW DIAGRAM FOR VIRTUAL ASSISTANT
5.2.1    SPEECH RECOGNITION MODULE
5.2.2    SEARCHING YOUTUBE
5.2.3    SEARCHING GOOGLE
5.2.4    SEARCHING AMAZON
5.2.5    PLAY VIDEOS
5.2.6    PERFORMS ARITHMETIC CALCULATION
5.2.7    PLAY MUSIC
LIST OF ABBREVIATIONS

NLP   Natural Language Processing
ASR   Automatic Speech Recognition
TTS   Text-to-Speech
AI    Artificial Intelligence
IoT   Internet of Things
STT   Speech-to-Text
CHAPTER 1

INTRODUCTION

1.1 OVERVIEW OF THE PROJECT

Virtual reality, augmented reality, voice interfaces, IoT, and other emerging
technologies are altering people's interactions with the world and redefining digital
experiences. Voice control is an essential leap in the human-machine interface made
feasible by advances in artificial intelligence. Today, we can get our machines to
carry out tasks for us: we can converse with them through virtual assistants, or train
them to think like people, using technologies such as Artificial Intelligence, Machine
Learning, and Neural Networks. Voice has just made a big comeback. Apple's Siri,
Google Assistant, Microsoft's Cortana, and Amazon's Alexa are instances of personal
assistants, and this recognition is largely due to the widespread use of smartphones.
Voice assistants make use of technologies such as voice recognition and speech
synthesis.

Voice assistant capabilities and upgrades are always evolving to deliver improved
performance to users. We created our desktop-based voice assistant using Python
modules and libraries so that our personal voice assistant could function easily and
smoothly on the desktop. The essential idea behind our project is that the user makes
a request to the voice assistant through the device's microphone, and the command is
then converted into text. The text request is then routed to processing, which produces
a text response as well as the requested work. Together with fundamental day-to-day
functionality, we are attempting to incorporate the concept of face detection for
security purposes in our voice assistant, in order to make it more flexible and personal.
Our application uses minimal system resources, lowering the cost of system
requirements while also reducing the threat to your system, because it does not
directly connect with servers.

In the current scenario, advancements in technology are such that machines can
perform many tasks as effectively as, or even more effectively than, we can. By
making this project, I realized that the concept of AI in every field is decreasing
human effort and saving time. As the voice assistant uses Artificial Intelligence, the
results it provides are highly accurate and efficient. The assistant can help reduce
human effort and save time while performing any task; it removes the concept of
typing completely and behaves as another individual to whom we are talking and
asking to perform tasks. The assistant is no less than a human assistant, and we can
say that it is more effective and efficient at performing any task. The libraries and
packages used to make this assistant focus on time complexity and reduce execution
time.

1.2 NEED FOR THE PROJECT


To develop a voice assistant, you will need several components and resources. Here
are some of the essential things you will need:

Programming language: You will need to choose a programming language to develop
your voice assistant. Popular choices include Python, Java, and JavaScript.

Speech recognition engine: A speech recognition engine is necessary to convert
spoken words into text that the computer can understand. Some popular speech
recognition engines include the Google Speech API, Amazon Transcribe, and
Microsoft Azure Speech Services.

Natural language processing (NLP) library: An NLP library is used to understand the
meaning of the user's spoken words. Popular NLP libraries include NLTK, SpaCy,
and Stanford CoreNLP.

Dialog management system: A dialog management system is used to manage the
conversation flow between the user and the voice assistant. It can help the voice
assistant determine what the user wants and how to respond appropriately.

Text-to-speech engine: A text-to-speech engine is used to convert the voice assistant's
responses into audible speech. Some popular text-to-speech engines include Google
Text-to-Speech, Amazon Polly, and Microsoft Speech Services.

Development tools: You will need a set of development tools to help you create and
test your voice assistant. This may include an integrated development environment
(IDE) such as PyCharm, a version control system such as Git, and a testing framework
such as pytest.

Hardware: Depending on the type of voice assistant you are developing, you may need
specific hardware components such as a microphone and a speaker. If you are
developing a voice assistant for a specific device, you will need to ensure that it is
compatible with the hardware.

Developing a voice assistant can be a complex and challenging task, but with the right
resources and tools, it can be a rewarding experience.
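
To make these pieces concrete, the following is a minimal sketch in Python, assuming the SpeechRecognition and pyttsx3 packages that the implementation in Chapter 5 also uses. It captures one command from the microphone, transcribes it with the Google Web Speech engine, and speaks a reply:

import speech_recognition as sr
import pyttsx3

recognizer = sr.Recognizer()
engine = pyttsx3.init()

with sr.Microphone() as source:        # hardware: the default microphone
    print("listening...")
    audio = recognizer.listen(source)

try:
    text = recognizer.recognize_google(audio)   # speech recognition engine
    print("user said:", text)
    engine.say("you said " + text)              # text-to-speech engine
    engine.runAndWait()
except sr.UnknownValueError:                    # raised for unintelligible audio
    engine.say("say that again please")
    engine.runAndWait()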

We are familiar with many existing voice assistants like Alexa, Siri, Google
Assistant, and Cortana, which use the concepts of language processing and voice
recognition. They listen to the command given by the user and perform the requested
function in a very efficient and effective manner. As these voice assistants use
Artificial Intelligence, the results they provide are highly accurate and efficient. These
assistants can help reduce human effort and save time while performing any task; they
remove the concept of typing completely and behave as another individual to whom
we are talking and asking to perform tasks. These assistants are no less than a human
assistant, and we can say that they are more effective and efficient at performing any
task. The algorithms used to make these assistants focus on time complexity and
reduce execution time. However, to use these assistants one must have an account
(like a Google account for Google Assistant, or a Microsoft account for Cortana) and
an internet connection, because these assistants work only with internet connectivity.
They are integrated with many devices like phones, laptops, and speakers.

1.3 OBJECTIVE OF THE PROJECT

The objectives of developing a voice assistant can vary depending on the specific
application and purpose of the project. However, some common objectives may
include:

Improving user experience: A primary objective of developing a voice assistant is to
improve the user experience by providing a natural and intuitive way for users to
interact with technology.

Personalization: Voice assistants can use machine learning techniques to learn from a
user's behavior, preferences, and previous interactions to provide personalized and
efficient services.

Automation: Voice assistants can automate tasks, such as scheduling appointments,
playing music, and controlling smart home devices, which can save users time and
effort.

Accessibility: Voice assistants can be helpful for people with mobility or vision
impairments who may have difficulty using traditional interfaces.

Innovation: Voice assistants are becoming increasingly popular and are being
integrated into more devices and applications. Developing a voice assistant project
can be a way to explore new use cases and create innovative solutions.

Learning opportunity: Developing a voice assistant project can be a great way to learn
about natural language processing, machine learning, and artificial intelligence. It can
also be an opportunity to work with different libraries and APIs.

Overall, the objectives of developing a voice assistant can vary depending on the
specific application and purpose of the project. However, the common goal is to
provide a useful and innovative tool that enhances the user's experience and improves
efficiency.

Currently, the project aims to provide Windows users with a virtual assistant that
not only aids in their daily routine tasks, like searching the web, extracting weather
data, and vocabulary help, but also helps in the automation of various activities. In
the long run, we aim to develop a complete server assistant by automating the entire
server management process (deployment, backups, auto-scaling, logging, and
monitoring) and making it smart enough to act as a replacement for a general server
administrator. As a personal assistant, the aide assists the end user with day-to-day
activities like general human conversation, searching queries in various search
engines like Google, Bing, or Yahoo, searching for videos, retrieving images, live
weather conditions, word meanings, searching for medicine details, health
recommendations based on symptoms, and reminding the user about scheduled
events and tasks. The user's statements and commands are analysed with the help of
machine learning to give an optimal solution.
1.4 SCOPE OF THE PROJECT
The scope of a voice assistant project can vary depending on the specific application
and purpose of the project. However, some common areas of scope for a voice
assistant project may include

1. User interface design: This includes designing a user-friendly interface that allows
users to interact with the voice assistant using natural language commands.

2. Speech recognition: Developing a speech recognition system that can accurately


transcribe spoken words into text is a critical part of a voice assistant project.

3. Natural language processing: The voice assistant must be able to understand the
meaning of the user's spoken words, which requires the use of natural language
processing techniques.

4. Dialog management: The voice assistant must be able to manage the conversation
flow between the user and the system to provide a seamless and efficient experience.

5. Personalization: A voice assistant can use machine learning techniques to learn from
a user's behavior, preferences, and previous interactions to provide personalized and
efficient services.

6. Integration with external systems: A voice assistant project may require integration
with external systems, such as smart home devices, calendars, and email systems.

7. Text-to-speech: The voice assistant must be able to convert its responses into
audible speech using a text-to-speech engine.

Overall, the scope of a voice assistant project can be quite broad, requiring expertise
in several areas, including speech recognition, natural language processing, and
machine learning. The scope may also be influenced by the specific application and
purpose of the project, such as home automation or personal assistance. Presently, the
assistant is being developed as an automation tool and virtual assistant. Among the
various roles played by the voice assistant are:

1. Doing a web search
2. Playing music or a video
3. Setting a reminder and an alarm
4. Launching any application or utility
5. Obtaining weather forecasts
6. Sending email messages, for example

Proper documentation shall be available to make further development easy, and we
aim to release our virtual assistant as open-source software, where modifications and
contributions by the community are warmly welcomed.
CHAPTER 2

LITERATURE SURVEY

2.1 USERS OF VOICE ASSISTANTS CAN TAKE ADVANTAGE OF A
NUMBER OF FASCINATING SERVICES, INCLUDING
The field of voice-based assistants has seen major advancements and
innovations. The main reason behind such rapid growth in this field is the demand for
voice assistants in devices like smartwatches and fitness bands, speakers, Bluetooth
earphones, mobile phones, laptops and desktops, televisions, etc. Most of the smart
devices being brought to market today have built-in voice assistants. The amount of
data generated nowadays is huge, and in order to make our assistant good enough to
tackle these enormous amounts of data and give better results, we should incorporate
machine learning into our assistants and train our devices according to their uses.
Along with machine learning, other technologies that are equally important are IoT,
NLP, and big data access management. The use of voice assistants can ease a lot of
tasks for us: just give a voice command to the system, and all tasks will be completed
by the assistant, starting from converting your speech command to a text command,
then extracting the keywords from the command and executing queries based on those
keywords. In the paper "Speech recognition using flat models" by Patrick Nguyen et
al., a novel direct modelling approach for speech recognition is brought forward,
which eases the measure of consistency in the sentences spoken. They termed this
approach the Flat Direct Model (FDM). They did not follow the conventional Markov
model, and their model is not sequential. Using their approach, a key problem of
defining features has been solved. Moreover, the template-based features improved
the sentence error rate by 3% absolute over the baseline.

Nowadays, we teach our machines to think like humans and do tasks
independently, replacing human labour with machinery. From this circumstance, the
idea of a voice assistant emerges, capable of performing a variety of tasks for people
based solely on their speech. Virtual assistants are capable of filtering out certain
user commands and returning information that is pertinent to the command [1].

1. Play YouTube videos and music
2. Set alarms or timers
3. Send email correspondence
4. Give weather-related information
5. Manage smart devices

Voice assistants' abilities are continually expanding because of user demand [2].
Alongside the assistants from Microsoft and Google, the recently released intelligent
assistant known as "AIVA" (2018) aimed to create a voice-controlled personal
assistant capable of doing various tasks, including conducting web searches. It
contains several brand-new features, such as the ability to submit comments on
social media platforms like Facebook and Twitter using a few straightforward
commands. It can also provide information about the local climate and the weather
[3].

Tulshan explained that the user's fingers may suffer injuries as a result of constant
typing. To prevent these issues, we must create a system that enables us to complete
tasks using voice commands. The system will hear the voice, recognise the words,
and synthesise them; if they make sense or are appropriate, they will be printed on
the screen. The program will then be assembled and run by identifying the precise
keywords [4].

The review led by Dr. Kshama V. Kulhalli analyzed the most well-known voice
assistants, including Google Assistant, Apple's Siri, and Cortana from Microsoft. This
study reached the conclusion that Google Assistant's responses are the most
dependable, and it was very good at understanding voice fluctuations [5].

2.2 FEASIBILITY STUDY
A feasibility study is an important initial step in any project to determine whether
the project is viable and feasible to undertake. Here are some key factors to consider
for a feasibility study for a voice assistant project:

1. Technical feasibility: The first aspect to consider is whether the technology exists
or can be developed to implement the voice assistant project. The project may require
expertise in speech recognition, natural language processing, and machine learning
techniques.
2. Market feasibility: It is important to assess the market demand for a voice assistant
and whether the project will meet the needs and expectations of the target market.
Researching the competition and potential user base can help determine the potential
success of the project.

3. Financial feasibility: A voice assistant project can require significant investment in


terms of development costs, software licenses, and hardware requirements. A
feasibility study should evaluate the costs and potential revenue streams to determine
whether the project can be financially viable.

4. Legal feasibility: It is important to consider legal requirements such as privacy laws,


data security, and intellectual property rights to ensure compliance with legal and
regulatory frameworks.

5. Operational feasibility: It is important to consider the project's operational


requirements, such as technical support, maintenance, and ongoing development, to
ensure the project's sustainability over time.

Overall, a feasibility study is an essential step in determining the viability and potential
success of a voice assistant project. Conducting a thorough feasibility study can help
identify potential challenges, risks, and opportunities associated with the project and
help make informed decisions about whether to proceed with the project.

This study showed how the voice recognition system worked in an integrated voice-based delivery system for the purpose of delivering instruction. An added importance
of the study was that the voice system was an independent speech recognition system.
At the time this study was conducted, there did not exist a reasonably priced speech
recognition system that interfaced with both graphics and authoring software which
allowed any student to speak to the system without training the system to recognize
the individual student's voice. This feature increased the usefulness and flexibility of
the system.

CHAPTER 3

SYSTEM DESIGN

3.1 PROPOSED SYSTEM ARCHITECTURE DESIGN

The Speech Recognition library has many built-in components that enable the
assistant to understand the user's command, and the response is communicated back
to the user as voice using text-to-speech capabilities. In the proposed approach to
implementing a personal voice assistant, the underlying algorithms transcribe the
user's vocal instruction into text as soon as the assistant hears it.
3.1.1 Conceptual Architecture

1. Using the microphone to capture speech patterns.
2. Converting the recognized audio input into text.
3. Evaluating the input against pre-defined commands.
4. Delivering the expected results.

The data is first gathered via the microphone as speech patterns. The gathered
data is processed in the subsequent stage using NLP and converted into textual data.
The required output is produced in the next stage by manipulating the resulting string
in a Python script. The last step is presenting the result, which may either be written
or converted to speech through TTS.
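
These four steps can be sketched as a single Python loop (a simplified sketch; the command phrases are illustrative, and the full dispatch loop appears in section 5.1):

import speech_recognition as sr
import pyttsx3
import webbrowser

engine = pyttsx3.init()

def listen():
    r = sr.Recognizer()
    with sr.Microphone() as source:               # step 1: capture speech patterns
        audio = r.listen(source)
    try:
        return r.recognize_google(audio).lower()  # step 2: audio to text
    except sr.UnknownValueError:
        return ""

while True:
    command = listen()
    if "open google" in command:                  # step 3: match pre-defined commands
        webbrowser.open("google.com")             # step 4: deliver the expected result
        engine.say("opening google")
        engine.runAndWait()
    elif "quit" in command:
        break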
3.1.2 Features

1. It listens passively and continuously, and springs into action when a specific
defined functionality is invoked.

2. It searches the web based on the user's spoken parameters, gives the desired result
through audio, and simultaneously prints the result on the screen.
3.1.3 BLOCK DIAGRAM FOR PROPOSED SYSTEM
A voice assistant system can be developed using various programming languages
and tools depending on the specific platform and features required. Here's a general
overview of the components and architecture of a voice assistant system:
1. Wake word detection: The system listens for a wake word, such as "Hey Siri" or
"Alexa", to activate the voice assistant.
2. Speech recognition: The system converts the user's spoken command into text using
automatic speech recognition (ASR) technology.
3. Natural language processing: The system processes the text to identify the user's
intent and extract relevant information from the command.
4. Action execution: The system performs the requested action, which could involve
sending a message, playing music, setting a reminder, or controlling a smart home
device.
5. Response generation: The system generates a response, either by speaking the
response aloud or displaying it on a screen.
6. Integration with third-party services: The system can integrate with third-party
services to perform more complex actions, such as ordering food or making a
purchase.
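
As a rough sketch of step 1, wake word detection can be approximated by keyword spotting on transcribed speech, rather than by a dedicated wake word engine; the wake word "bruce" below is an assumption matching the assistant's name used in Chapter 5:

import speech_recognition as sr

WAKE_WORD = "bruce"    # assumption: the assistant's name used in Chapter 5

def wait_for_wake_word():
    r = sr.Recognizer()
    while True:                                    # listen passively in a loop
        with sr.Microphone() as source:
            audio = r.listen(source, phrase_time_limit=3)
        try:
            heard = r.recognize_google(audio).lower()
        except sr.UnknownValueError:
            continue                               # unintelligible: keep waiting
        if WAKE_WORD in heard:                     # wake word spotted
            return                                 # activate command handling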
The development of a voice assistant system typically involves the use of machine
learning techniques for the speech recognition and natural language processing
components. Additionally, the system needs to be optimized for the specific platform
it will run on, such as a smart speaker or a smartphone, and the user experience should
be carefully designed to ensure that the system is easy to use and understand.

Overall, developing a voice assistant system requires expertise in multiple areas,


including machine learning, natural language processing, and software development.
However, with the right tools and expertise, a voice assistant system can be an
extremely useful and powerful tool for a wide range of applications.

3.2 SYSTEM ARCHITECTURE

3.2.1 LEVEL-0 DFD

A Level 0 Data Flow Diagram (DFD) for a voice assistant system shows the overall
flow of data between the system and the user. An example of a Level 0 DFD for a
voice assistant system is shown in Figure 3.2.1.
In this DFD, the user inputs a voice command or request, which is processed by the
voice assistant system. The system then generates an appropriate output, which is
presented to the user. This Level 0 DFD does not show the internal components of the
voice assistant system, such as speech recognition, natural language processing, or
external system integration. It provides a high-level view of the overall flow of data
between the user and the voice assistant system.

User Input: This is the starting point of the process, where the user speaks a command
or request to the voice assistant system. The user input may come from a variety of
sources, such as a smartphone, smart speaker, or other voice-enabled device.

Voice Assistant: The voice assistant component processes the user input using various
technologies, such as speech recognition, natural language processing, and machine
learning. The voice assistant system interprets the user's intent and generates an
appropriate response or action.

Output: The output component generates a response or action based on the user's input
and the system's interpretation of the request. The output may take various forms, such
as a spoken response, a text message, or a command to an external system.

3.2.2 LEVEL-1 DFD

In this Level 1 DFD, the system is broken down into more detailed components and
processes that work together to provide the voice assistant functionality. Here is a brief
explanation of each component

User Interface: This is the interface between the user and the voice assistant system,
where the user interacts with the system by speaking voice commands.

Speech Input: The speech input component captures the user's voice commands and
converts them into digital audio signals that can be processed by the system.

Speech Recognition: The speech recognition component analyzes the audio signals to
recognize the user's words and convert them into text format.

Natural Language Processing: The natural language processing component analyzes
the transcribed text to understand the user's intent and extract relevant information,
such as the action the user wants to perform or the context of the user's request.

Weather Service, News Service, Music Service: These are examples of external
services that the voice assistant system may integrate with to provide the user with the
requested information or action.

Text-to-Speech: The text-to-speech component generates an audible response to the
user's request, which is played back through an audio output device.

Audio Output: This is the final output stage, where the system plays back the generated
response to the user through an audio output device, such as a speaker or headphones.
Overall, the Level 1 DFD provides a more detailed view of the voice assistant system
and its components than the Level 0 DFD. It shows how the different components of
the system work together to process user requests and generate appropriate responses.

3.2.3 LEVEL-2 DFD

In this Level 2 DFD, the system is further broken down into more detailed processes
and components. Here's a brief explanation of each component

User Interface: This is the same as in the Level 1 DFD, where the user interacts with
the voice assistant system by speaking voice commands.

Speech Input: This is the same as in the Level 1 DFD, where the system captures the
user's voice commands and converts them into digital audio signals.

Speech Recognition: This is the same as in the Level 1 DFD, where the system
analyzes the audio signals to recognize the user's words and convert them into text
format.

Natural Language Processing: This is the same as in the Level 1 DFD, where the
system analyzes the transcribed text to understand the user's intent and extract relevant
information.
Weather Service, News Service, Music Service, Reminders Service: These are
examples of external services that the voice assistant system may integrate with to
provide the user with the requested information or action.
Response Generation: This component generates an appropriate response to the user's
request based on the user's intent and the information retrieved from external services,
if any.
Text-to-Speech Generation: This component generates an audible response to the
user's request, which is played back through an audio output device.

Audio Output: This is the same as in the Level 1 DFD, where the system plays back
the generated response to the user through an audio output device.
External Services: This component represents any external services that the voice
assistant system may integrate with, such as APIs for weather, news, or music services.

Overall, the Level 2 DFD provides an even more detailed view of the voice assistant
system and its components than the Level 1 DFD. It shows how the different
components of the system work together to process user requests and generate
appropriate responses, and how the system may integrate with external services to
provide additional functionality.

3.3 MODULE DESIGN

3.3.1 IMPORTED MODULES

Speech Recognition Module

Recognizing the user's voice is one of the most important parts of any voice assistant
software, which is why this component is one of the first to incorporate. Installing this
module requires running the following instruction in the shell (typically: pip install
SpeechRecognition).

Datetime module
Used to show the date and time. The datetime package is already included with
Python.

Wikipedia
Wikipedia is a fabulous and broad source of information, much like GeeksforGeeks
and other sources. We have used the Wikipedia module in our project to get extra
information from Wikipedia or to conduct a Wikipedia search.

Web browser
Used for conducting an online search. Python already includes this package by
default.

OS
The Python os package offers methods for interacting with the operating system. os
is included in the basic utility tools for Python and provides a way of using
operating-system-dependent features.

PyAudio
PyAudio is a collection of Python bindings for PortAudio, a cross-platform library
that communicates with audio drivers.

Text-to-Speech feature
The term "text-to-speech" (TTS) describes a feature that allows machines to read text
audibly. A TTS engine converts written text into a phonemic representation, which is
then converted into waveforms that can be output as sound. TTS systems with
different languages, accents, and specialist vocabularies are available through
third-party publishers.

Speech-to-text conversion
Speech recognition software is used to translate spoken input into written output. It
decodes the speech and converts it to text in an understandable way.

Context Extraction
Context extraction is the process of automatically extracting structured information
from unstructured and/or semi-structured machine-readable texts. This activity
ordinarily involves using natural language processing to analyse texts written in
human languages. Recent advancements in multimodal document processing, such as
automated annotation and content extraction from images, audio, and video, can be
seen as results of context extraction work.

Written Output
The assistant interprets the voice command, executes the action, and then shows the
voice request as written output in the terminal.
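
A short sketch of how several of these modules are used together, with calls drawn from the code in section 5.1 (the summary topic and search query are illustrative):

import datetime
import webbrowser
import wikipedia

# datetime: announce the current time
now = datetime.datetime.now().strftime("%H:%M")
print("the time is", now)

# wikipedia: fetch a two-sentence summary of a topic
print(wikipedia.summary("Python (programming language)", sentences=2))

# webbrowser: conduct an online search in the default browser
webbrowser.open("https://www.google.com/search?q=voice+assistant")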

3.4 IMPLEMENTATION METHODOLOGY

Using SAPI5 and pyttsx3, we enable our application to use system speech. pyttsx3
is a text-to-speech conversion library in Python. It is compatible with Python 2 and 3
and works offline, unlike competing tools. The Speech Application Programming
Interface, also known as SAPI, is an API created by Microsoft to enable the use of
voice synthesis and recognition within Windows applications. All of the program's
capabilities are then defined in the main function. The assistant requests input from
the user and keeps listening for instructions. The listening duration can be adjusted
based on the needs of the user. If the assistant is unable to understand an instruction
properly, it will repeatedly request the same instruction from the user. The user can
pick either a male or a female voice for this assistant, depending on their preference.
The assistant's latest release includes capabilities for checking weather forecasts,
sending and receiving messages, searching Wikipedia, opening applications,
checking the time, taking notes, and showing notes. Google, YouTube, and other
applications can be opened and closed.
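
The male or female voice choice mentioned above is made through pyttsx3's voice property, as in the sketch below; the ordering of installed SAPI5 voices (index 0 male, index 1 female on a typical Windows install) is an assumption that varies by machine:

import pyttsx3

engine = pyttsx3.init('sapi5')             # SAPI5 is the Windows speech API
voices = engine.getProperty('voices')      # voices installed on this machine

engine.setProperty('voice', voices[0].id)  # assumed male voice
# engine.setProperty('voice', voices[1].id)  # assumed female voice

engine.say("systems are now fully operational")
engine.runAndWait()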
Voice assistant implementation typically involves a combination of software
development methodologies, such as agile or iterative methodologies, and machine
learning techniques. Here is a general methodology that can be used for implementing
a voice assistant system:

1. Requirements gathering: Identify the requirements and objectives of the voice


assistant system, including the types of interactions it should support and the data
sources it should use.

2. Data collection and preprocessing: Collect and preprocess data for machine
learning, such as speech data and text data, to train and test the natural language
processing and speech recognition models.

3. Model selection and training: Select appropriate machine learning models, such as
deep neural networks, and train them using the preprocessed data.
4. Integration with external APIs: Integrate the voice assistant system with external
APIs and services to enhance its functionality and provide access to additional data
sources.
5. Speech recognition and natural language processing: Develop algorithms and
models for speech recognition and natural language processing, which convert
spoken or written input into structured data that can be used to generate responses.

6. Response generation and text-to-speech conversion: Develop algorithms and


models for generating appropriate responses to user requests, and convert those
responses into audible speech.
7. User interface design and development: Design and develop a user interface that
enables users to interact with the voice assistant system.
8. Testing and validation: Test the voice assistant system to ensure it meets the
requirements and objectives, and validate the accuracy of the models and
algorithms.
9. Deployment: Deploy the voice assistant system to the desired platform, such as a
mobile device or desktop computer.
10. Maintenance and updates: Monitor the voice assistant system for errors and update
it regularly to ensure it remains up to date and functional.

The above methodology can be adapted and modified based on the specific
requirements and constraints of the voice assistant system being developed. It is
important to involve users in the development process and gather feedback throughout
the development lifecycle to ensure the system meets their needs and expectations.

CHAPTER 4

REQUIREMENT SPECIFICATION

A requirements specification is a key component of a project report that outlines


the detailed description of the project's objectives, scope, and specific requirements.
It serves as a reference point for all stakeholders involved in the project, including
project managers, developers, designers, and clients, to ensure that the project is
developed according to the stated goals and objectives.

The requirements specification document typically includes several sections,


including a description of the project's background and purpose, a detailed overview
of the project's scope and objectives, functional requirements, nonfunctional
requirements, and any constraints or limitations that must be considered during the
project development.

4.1 HARDWARE REQUIREMENT


Hardware requirements specification is a crucial component of a project report,
particularly in software development projects. It outlines the necessary hardware
resources and specifications required to support the software application or system
being developed. A well-documented hardware requirements specification ensures
that the system is designed to meet the necessary hardware capabilities and can
perform optimally.

Processor: Intel Core i3 10th Gen 2.4 GHz or AMD equivalent

Memory (RAM): 8 GB

Internet connectivity

Built-in or external microphone

Speaker (output device)

4.2 SOFTWARE REQUIREMENT
Software requirements refer to the specific functional and non-functional needs
of a software system or application. In a project report, the section on software
requirements should provide a clear and comprehensive description of the
requirements that must be met by the software being developed.

Windows 10 or higher

Visual Studio Code

Pycharm

Python

Pyautogui
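
For reference, the third-party packages imported by the code in section 5.1 can be installed with pip in one command (a sketch; versions are unpinned, and the names assume the public PyPI distributions):

pip install pyttsx3 SpeechRecognition PyAudio wikipedia psutil pywhatkit pyautogui pyjokes wolframalpha requests beautifulsoup4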

CHAPTER 5

SYSTEM IMPLEMENTATION

In a project report, an implementation description would typically provide a


detailed account of how the project was executed and what steps were taken to achieve
its goals. This section would typically include information on the specific tools,
techniques, and methodologies used to complete the project. The implementation
description should begin by providing an overview of the project's objectives and how
they were translated into concrete actions. It should then provide a step-by-step
account of the various stages of the project, from planning and design to development,
testing, and deployment.

The following will describe the implementation of the proposed voice cloning
system, including the data collection, model training, and evaluation processes.

Data Collection: The first step in building a voice cloning model is to collect data
from the target speaker. The data collection process involves recording a large amount
of high-quality audio samples of the person's voice. The samples must be diverse and
cover various speaking styles, emotions, and accents. The data must also be recorded
in a quiet environment with minimal background noise. The collected data is then
cleaned and preprocessed to prepare it for the next step.

Model Training: The next step is to use the collected data to train a deep learning
model. In this project, we used a neural network-based approach, which is capable of
learning the complex patterns and nuances of the target speaker's voice. The model
consists of an encoder and a decoder, which work together to learn the speaker's voice
characteristics and generate synthetic speech. During the training process, the model
is optimized to minimize the difference between the generated speech and the original
recordings.

Deployment: Once the model has been trained and evaluated, it can be deployed to
produce synthetic speech. The model can be integrated into various applications, such
as virtual assistants, text-to-speech engines, and personalized voice assistants. The
deployment process involves optimizing the model for real-time performance and
ensuring its compatibility with the target platform.

5.1 CODING

import pyttsx3
import datetime
import speech_recognition as sr
import wikipedia
import webbrowser
import os
import psutil
import pywhatkit
import pyautogui
import smtplib
import pyjokes
import wolframalpha
import requests                     # imported in the original listing; unused below
from bs4 import BeautifulSoup       # imported in the original listing; unused below

engine = pyttsx3.init('sapi5')
voices = engine.getProperty('voices')
engine.setProperty('voice', voices[0].id)   # property name is 'voice', not 'voices'


def speak(audio):
    """Speak the given text aloud through the TTS engine."""
    engine.say(audio)
    engine.runAndWait()


def wishme():
    """Greet the user according to the current time of day."""
    hour = int(datetime.datetime.now().hour)
    if hour >= 0 and hour < 12:
        speak("good morning sir")
    elif hour >= 12 and hour < 18:
        speak("good afternoon sir")
    else:
        speak("good evening sir")
    speak("I Am Bruce ,, how may I help you")


def takecommand():
    """Listen on the microphone and return the recognized text."""
    r = sr.Recognizer()
    with sr.Microphone() as source:
        print("listening...")
        r.pause_threshold = 1
        audio = r.listen(source)
    try:
        print("Recognizing...")
        query = r.recognize_google(audio, language='en-in')
        print("user said", query)
    except Exception as e:
        print(e)
        speak("say that again please")
        query = "nothing"
    return query


def sendemail(to, content):
    # The original signature had an unused 'subject' parameter; it is removed
    # here to match the call below. Host fixed from 'smpt.gmail.com'.
    server = smtplib.SMTP('smtp.gmail.com', 587)
    server.ehlo()                   # was 'server.ehlo' (missing call parentheses)
    server.starttls()
    server.login("bkking3328@gamil.com", "123test")
    server.sendmail("balajisarask@gmail.com", to, content)
    server.close()


if __name__ == "__main__":
    wishme()
    while True:
        query = takecommand().lower()

        if "turn on" in query:
            speak("I am on,, please tell me what can i do for you sir")
            while True:
                query = takecommand().lower()

                if 'wikipedia' in query:
                    speak("searching in wikipedia")
                    query = query.replace("wikipedia", "")
                    results = wikipedia.summary(query, sentences=2)
                    speak("according to wikipedia")
                    speak(results)
                    print(results)

                elif 'open youtube' in query:
                    speak("opening youtube sir")
                    webbrowser.open("youtube.com")

                elif 'open google' in query:
                    speak("on the way sir")
                    webbrowser.open("google.com")

                elif 'open amazon' in query:
                    speak("opening sir")
                    webbrowser.open("amazon.com")

                elif 'open flipkart' in query:
                    speak("opening sir")
                    webbrowser.open("flipkart.com")

                elif 'open panimalar' in query:
                    speak("opening website sir")
                    webbrowser.open("panimalar.ac.in")

                elif 'open instagram' in query:
                    speak("As per your wish sir")
                    webbrowser.open("instagram.com")

                elif 'open mail' in query:
                    speak("As per your wish sir")
                    webbrowser.open("gmail.com")

                elif 'play music' in query:
                    speak("playing music sir")
                    musicdir = "A:\\MUSIC"
                    songs = os.listdir(musicdir)
                    print(songs)
                    os.startfile(os.path.join(musicdir, songs[0]))

                elif 'play video' in query:
                    speak("playing video sir")
                    musicdir = "A:\\MOVIES\\VIDEOS"
                    songs = os.listdir(musicdir)
                    print(songs)
                    os.startfile(os.path.join(musicdir, songs[19]))

                elif 'play batman movie' in query:
                    speak("playing movie sir")
                    musicdir = "A:\\MOVIES\\THE BATMAN 2022"
                    songs = os.listdir(musicdir)
                    print(songs)
                    os.startfile(os.path.join(musicdir, songs[1]))

                elif 'play series' in query:
                    speak("playing video sir")
                    musicdir = "A:\\MOVIES\\SERIES"
                    songs = os.listdir(musicdir)
                    print(songs)
                    os.startfile(os.path.join(musicdir, songs[5]))

                elif 'open game' in query:
                    speak("welcome back sir,, all system for gaming will be prepared in few minutes")
                    codepath = "B:\\GAMES\\Batman - Arkham Knight\\Binaries\\Win64\\BatmanAK.exe"
                    os.startfile(codepath)

                elif 'open devour' in query:
                    speak("welcome back sir,, all system for gaming will be prepared in few minutes")
                    codepath1 = "B:\\GAMES\\DEVOUR.v3.2.2\\DEVOUR.v3.2.2\\DEVOUR.exe"
                    os.startfile(codepath1)

                elif 'open fps' in query:
                    codepath2 = "C:\\ProgramData\\Microsoft\\Windows\\Start Menu\\Programs\\Fraps\\Fraps.lnk"
                    os.startfile(codepath2)

                elif 'the time' in query:
                    time = datetime.datetime.now().strftime("%H:%M")
                    speak(time)

                elif 'battery percentage' in query:
                    battery = psutil.sensors_battery()
                    percent = battery.percent
                    speak("the Battery percentage is " + str(percent))

                elif 'how are you' in query:
                    speak("i am fine sir,, thank you ")
                    speak(" how are you sir")

                elif 'who are you' in query:
                    speak("let me introduce my self,, im Bruce,, a virtual Artificial intelligence,, "
                          "im here to assist you with the variety of tasks as best as i can 24 hours a "
                          "day,, seven days a week,, systems are now fully operational")

                elif 'good' in query or 'fine' in query:
                    speak("its good to know that you are fine")

                elif 'search youtube' in query:
                    speak("searching youtube sir")
                    result = query.replace("search youtube for", "")
                    webbrowser.open("https://www.youtube.com/results?search_query=" + result)
                    speak("found,, what you have searched for")
                    pywhatkit.playonyt(result)       # plays the top matching video
                    speak("playing video sir")

                elif 'search google' in query:
                    import wikipedia as googleScrap
                    speak("searching google sir")
                    result = query.replace("search google for", "")
                    webbrowser.open("https://www.bing.com/search?q=" + result)
                    speak("I got the information sir")
                    result = googleScrap.summary(result, 2)
                    speak(result)
                    print(result)

                elif 'search amazon' in query:
                    speak("searching amazon sir")
                    result = query.replace("search amazon for", "")  # was "search googlefor"
                    webbrowser.open("https://www.amazon.in/s?k=" + result)
                    speak(" sir, I found what you have searched for ")

                elif 'stop' in query:
                    pyautogui.press("k")             # "k" toggles play/pause in YouTube
                    speak("video paused")

                elif 'play' in query:
                    pyautogui.press("k")
                    speak("video played")

                elif 'mute video' in query:
                    pyautogui.press("m")
                    speak("video muted")

                elif 'volume up' in query:
                    speak("increasing volume sir")
                    pyautogui.press("volumeup")

                elif 'volume down' in query:
                    speak("decreasing volume sir")
                    pyautogui.press("volumedown")

                elif 'jokes' in query:
                    results = pyjokes.get_joke()
                    speak(results)
                    print(results)

                elif "temperature" in query:
                    app = wolframalpha.Client("LPPQ5A-U9WPHJA6RY")
                    result = app.query(query)
                    speak(next(result.results).text)
                    print(result)

                elif 'calculate' in query:
                    app = wolframalpha.Client("LPPQ5A-U9WPHJA6RY")
                    speak("what should I calculate?")
                    gh = takecommand().lower()
                    result = app.query(gh)
                    speak(next(result.results).text)
                    print(next(result.results).text)

                elif "take screenshot" in query:
                    speak("please sir hold the screen for few seconds, I am taking screenshot")
                    img = pyautogui.screenshot()
                    img.save("screenshot.png")
                    speak(" I am done sir, the screenshot is saved in our main folder")

                elif "open" in query:
                    speak("opening sir")
                    query = query.replace("open", "")
                    query = query.replace("app", "")
                    pyautogui.press("super")         # opens the Start menu
                    pyautogui.typewrite(query)
                    pyautogui.sleep(1)
                    pyautogui.press("enter")

                elif "close notepad" in query:
                    speak("okay sir, closing notepad")
                    os.system("taskkill /f /im notepad.exe")

                elif "close discord" in query:
                    speak("okay sir, closing")
                    os.system("taskkill /f /im discord.exe")

                elif "close fps" in query:
                    speak("okay sir, closing")
                    os.system("taskkill /f /im fraps.exe")

                elif "close game" in query:
                    speak("okay sir, closing")
                    os.system("taskkill /f /im BatmanAK.exe")

                elif "close telegram" in query:
                    speak("okay sir, closing")
                    os.system("taskkill /f /im telegram.exe")

                elif "click my picture" in query:
                    speak(" wait, I am opening camera")
                    pyautogui.press("super")
                    pyautogui.typewrite("camera")
                    pyautogui.press("enter")
                    pyautogui.sleep(2)
                    speak("smile sir, i am taking picture")
                    pyautogui.press("enter")

                elif "shutdown" in query or "shutdown system" in query:
                    speak("shutting down system sir")
                    os.system("shutdown /s /t 1")

                elif "logout" in query or "logout system" in query:
                    speak("logging out system sir")
                    os.system("shutdown /l")         # was "shutdown /1" (digit one)

                elif "restart" in query or "restart system" in query:
                    speak("restarting system sir")
                    os.system("shutdown /r")

                elif 'quit' in query:
                    speak("quitting sir")
                    break

                elif 'exit' in query:
                    speak("exiting sir,, have a nice day")
                    exit()

                elif "send email" in query:
                    try:
                        speak("what should I say?")
                        content = takecommand()
                        to = "balajisarask@gmail.com"
                        sendemail(to, content)
                        speak(" sir, Email has been sent successfully")
                    except Exception as e:
                        speak("sorry email cannot be sent")
                        print(e)

5.2 SAMPLE SCREENSHOTS

5.2.1 SPEECH RECOGNITION MODULE

5.2.2 SEARCHING YOUTUBE

5.2.3 SEARCHING GOOGLE

5.2.4 SEARCHING AMAZON

5.2.5 PLAY VIDEOS

5.2.6 PERFORMS ARITHMETIC CALCULATION

5.2.7 PLAY MUSIC

CHAPTER 6

SYSTEM TESTING

6.1 TEST CASES AND REPORTS

TEST CASE ID   TEST CASE / ACTION PERFORMED   EXPECTED RESULT                    ACTUAL RESULT          PASS/FAIL

1              Speech recognition             It should recognize speech         Recognized             PASS
2              Text to speech                 It should convert text to speech   Converted              PASS
3              Noisy background               It should detect voice             Not detected           FAIL
4              "Turn on" command              It should detect the command       Detected               PASS
5              Search Google                  It should search Google            Action performed       PASS
6              No internet network            It should have network access      Action not performed   FAIL
7              Open any application           It should open apps                Detected               PASS
6.2 SOFTWARE TESTING

Software testing is an essential part of the software development life cycle (SDLC).
It involves evaluating the functionality and performance of a software application to
identify any defects or bugs that may hinder its smooth operation. The testing process
helps in ensuring that the software meets the requirements and specifications outlined
in the project plan.

Software testing is a critical component of any software development project. It


helps in identifying defects or bugs that could lead to the failure of the software
application, which could be costly and time-consuming to fix. Proper planning and
execution of the testing process can help ensure that the software application meets
the required specifications and delivers the desired outcomes.

6.2.1 Unit Testing


Unit testing involves the testing of each unit or an individual component of the
software application. It is the first level of functional testing. The aim behind unit
testing is to validate unit components with their performance. A unit is a single testable
part of a software system and is tested during the development phase of the application
software.

6.2.1.1 Speech recognition


Voice assistant speech recognition refers to the process of identifying and
interpreting spoken words and phrases by a voice assistant system. Speech recognition
is a critical component of a voice assistant system since it allows the system to
understand and interpret spoken commands and respond accordingly.
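
A sketch of such a unit test using pytest (the testing framework mentioned in section 1.2) and unittest.mock, so it runs without a microphone; the recognize() wrapper is a hypothetical helper written for the test, not part of the Chapter 5 code:

from unittest import mock

import speech_recognition as sr

def recognize(recognizer, audio):
    # Thin wrapper over the Google Web Speech call the assistant relies on.
    try:
        return recognizer.recognize_google(audio, language='en-in')
    except sr.UnknownValueError:
        return "nothing"

def test_recognize_returns_transcribed_text():
    recognizer = mock.Mock()
    recognizer.recognize_google.return_value = "open google"
    assert recognize(recognizer, audio=None) == "open google"

def test_recognize_handles_unintelligible_audio():
    recognizer = mock.Mock()
    recognizer.recognize_google.side_effect = sr.UnknownValueError()
    assert recognize(recognizer, audio=None) == "nothing"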

6.2.1.2 Voice quality


This test involves inputting text and verifying that the synthesized voice output is
of high quality and accurately reflects the input text. This can be done by playing the
synthesized voice output and comparing it to the expected result.

6.2.2 Integration Testing
Integration testing is a type of software testing that verifies that different
components of a system work together correctly. This type of testing is typically
conducted after unit testing and before system testing. The goal of integration testing
is to identify any issues or defects that arise from the interaction between different
components of the system.

6.2.2.1 Text-to-speech integration:


This test involves inputting text and verifying that the text-to-speech model
produces accurate and high-quality speech output. This test would ensure that the
model is properly integrated with the rest of the system.
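
A minimal integration-test sketch along these lines, assuming pyttsx3's save_to_file API and pytest's tmp_path fixture; it checks only that synthesis produces a non-empty audio file, while perceptual quality still has to be judged by a listener:

import pyttsx3

def test_text_to_speech_produces_audio(tmp_path):
    engine = pyttsx3.init()
    out_file = tmp_path / "reply.wav"
    engine.save_to_file("hello sir, systems are operational", str(out_file))
    engine.runAndWait()                    # block until synthesis completes
    assert out_file.exists() and out_file.stat().st_size > 0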

6.2.3 System Testing


System testing is a type of software testing that evaluates the entire system or
application as a whole, rather than individual components or functions. It is performed
to ensure that the system meets its specified requirements and that it functions
correctly and reliably under various conditions.

6.2.3.1 Usability testing


Usability testing is an important step in the development of any voice assistant
system. It involves testing the system's usability by evaluating its effectiveness,
efficiency, and user satisfaction; these are the key considerations for conducting
usability testing of a voice assistant system.

6.2.3.2 Performance Testing


Performance testing is an important aspect of voice assistant system development
to ensure that the system can handle the load and provide a satisfactory user
experience. Here are some key considerations for conducting performance testing of
a voice assistant system.

6.3 MAINTENANCE
Maintaining a voice assistant system is critical for ensuring its ongoing performance
and reliability. Here are some key aspects of voice assistant maintenance:

1. Monitoring: Regular monitoring of the system is important to ensure that it is


running smoothly and to identify any issues or errors that may arise. This includes
monitoring the system's hardware and software components, network connectivity,
and user feedback.

2. Updates and upgrades: Keeping the system up-to-date with the latest software
updates and upgrades is critical for maintaining its performance and security. This
includes updating the voice recognition software, operating system, and any third-
party components.

3. Data management: Managing the data used by the voice assistant system is
important for maintaining its accuracy and efficiency. This includes regularly
cleaning up old or outdated data, and ensuring that the system is properly trained
on new data.

4. Testing: Regular testing of the voice assistant system is important to ensure that it
is functioning properly and to identify any issues or errors that may arise. This
includes testing the system's response time, accuracy, and reliability.

5. User feedback: Gathering feedback from users is important for identifying areas
where the system can be improved. This can include feedback on the system's
accuracy, speed, and overall user experience.
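As an illustration of the monitoring aspect above, the following minimal sketch assumes each command handler is wrapped in a logging decorator; the log file name assistant.log is illustrative. Persistent logs of outcomes and durations make it easier to spot recurring errors and slow responses during maintenance reviews.

# Monitoring sketch: a decorator that logs the outcome and duration
# of each command handler. The log file name is illustrative.
import logging
import time

logging.basicConfig(
    filename="assistant.log",
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(message)s",
)


def monitored(handler):
    """Log the outcome and duration of a command handler."""
    def wrapper(command):
        start = time.perf_counter()
        try:
            result = handler(command)
            logging.info("command=%r handled in %.3fs",
                         command, time.perf_counter() - start)
            return result
        except Exception:
            logging.exception("command=%r failed", command)
            raise
    return wrapper


@monitored
def handle_command(command):
    """Hypothetical handler; real logic would dispatch on the command."""
    return "ok"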

CHAPTER 7

CONCLUSION

7.1 CONCLUSION
The personal desktop-based voice assistant, built using the open-source Visual Studio Code editor as the implementation tool, has been presented in terms of its purpose, methodology, and implementation details. The project will benefit users of all generations, as well as those with certain disabilities or special needs. The desktop voice assistant is simple to use and reduces the manual effort required to perform a variety of activities.

Moreover, the assistant can perform many tasks with just a voice command, including sending messages to the user's mobile phone, automating YouTube, and retrieving information from Google and Wikipedia. The current voice assistant system's capabilities are limited to operating online and on desktops. The voice assistant system is modular in design, making it possible to add new skills without affecting existing system functionality.

In conclusion, a voice assistant is a powerful technology that has revolutionized the way people interact with their devices and access information. From simple tasks like setting reminders and playing music to more complex tasks like managing finances and controlling smart home devices, voice assistants have become an integral part of our daily lives. The development of a voice assistant system requires careful planning, implementation, and maintenance. The system must be designed to accurately recognize and interpret natural language commands, respond in a timely and efficient manner, and continually improve over time through feedback and data analysis.

Usability and performance testing are critical steps in the development process to
ensure that the system is functioning as intended and providing a satisfactory user
experience. Ongoing maintenance is also essential for ensuring the system's continued
performance and reliability. As voice assistant technology continues to evolve, there
are endless possibilities for its use in various industries and applications. It will be
exciting to see how this technology continues to develop and enhance our lives in the
years to come.

7.2 FUTURE ENHANCEMENTS


As voice assistant technology continues to evolve, there are many potential areas
for future enhancement and improvement. Here are a few possibilities.

• Enable the voice assistant to learn on its own and develop new skills over time.

• An Android app version of the voice assistant can also be developed.

• Add support for additional voice terminals.

• Voice commands can be encrypted to maintain security (a minimal sketch follows this list).
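As a speculative sketch of the encryption enhancement above, the following example uses the cryptography package's Fernet recipe to encrypt recognized command text before it is stored or transmitted; in a real deployment the key would be provisioned and stored securely rather than generated at run time.

# Speculative sketch of encrypted voice commands, using the
# cryptography package's Fernet recipe.
from cryptography.fernet import Fernet

key = Fernet.generate_key()   # in practice the key would be stored securely
cipher = Fernet(key)

command = "open my banking application"
token = cipher.encrypt(command.encode("utf-8"))   # safe to store or transmit
restored = cipher.decrypt(token).decode("utf-8")  # only key holders can read it
assert restored == command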

1. Natural language processing: While current voice assistant systems have made
significant strides in natural language processing, there is still room for
improvement. Future enhancements could include more advanced natural language
processing capabilities, such as better understanding of context and tone, and the
ability to process more complex commands.

2. Multimodal interfaces: Current voice assistant systems primarily rely on voice
   input, but future enhancements could include the ability to recognize and respond
   to other forms of input, such as gestures, facial expressions, or even brain signals.

3. Personalization: As voice assistant technology continues to develop, there is
   potential for systems to become even more personalized to individual users. This
   could include better understanding of user preferences and habits, and tailoring
   responses and recommendations accordingly.

4. Integration with other technologies: Voice assistant systems could potentially be
integrated with other emerging technologies, such as augmented reality or virtual
reality, to create more immersive and interactive experiences.

5. Improved security and privacy: With the increasing use of voice assistants for
sensitive tasks such as online banking and personal communication, there will be
a need for even stronger security and privacy protections to prevent unauthorized
access to user data.

Overall, the future of voice assistant technology is exciting and full of possibilities.
As the technology continues to advance, we can expect to see even more innovative
and intuitive voice assistant systems that enhance our daily lives in meaningful ways.

CHAPTER 8

REFERENCES

1. Harshit Agrawal, Nivedita Singh, Gaurav Kumar, Dr. Diwakar Yagyasen, Mr. Surya Vikram Singh, "Voice Assistant Using Python", an international open-access, peer-reviewed, refereed journal, Unique Paper ID: 152099, Volume 8, Issue 2, pp. 419-423.

2. George Terzopoulos, Maya Satratzemi, "Voice Assistants and Smart Speakers in Everyday Life and in Education", Department of Applied Informatics, University of Macedonia, Thessaloniki, Greece.

3. Deepak Shende, Ria Umabiya, Monika Raghorte, Aishwarya Bhisikar, Anup Bhange, "AI Based Voice Assistant Using Python", International Journal of Emerging Technologies and Innovative Research (www.jetir.org), ISSN 2349-5162, Vol. 6, Issue 2, pp. 506-509, February 2019.

4. Tulshan, Amrita and Dhage, Sudhir (2019). "Survey on Virtual Assistant: Google Assistant, Siri, Cortana, Alexa", 4th International Symposium SIRS 2018, Bangalore, India, September 19-22, 2018, Revised Selected Papers. 10.1007/978-981-13-5758-9_17.

5. Dr. Kshama V. Kulhalli, Dr. Kotrappa Sirbi, Mr. Abhijit J. Patankar, "Personal Assistant with Voice Recognition Intelligence", International Journal of Engineering Research and Technology, ISSN 0974-3154, Volume 10, Number 1 (2017).
6. RAJA, K. D. P. R. A. (2020). "Jarvis AI using Python."

7. Sangpal, R., Gawand, T., Vaykar, S., and Madhavi, N. (2019). "JARVIS: An interpretation of AIML with integration of gTTS and Python." 2019 2nd International Conference on Intelligent Computing, Instrumentation and Control Technologies (ICICICT), Vol. 1, pp. 486-489.

8. Steen, J. and Wilroth, M. (2021). "Adaptive voice control system using AI."

9. Terzopoulos, G. and Satratzemi, M. (2019). "Voice assistants and artificial intelligence in education." Proceedings of the 9th Balkan Conference on Informatics, pp. 1-6.

10. Tibola, L. R. and Tarouco, L. M. R. (2013). "Interoperability in virtual world." XVIII Congreso Argentino de Ciencias de la Computación.

11. Vora, J., Yadav, D., Jain, R., and Gupta, J. (2021). "Jarvis: A PC voice assistant."

12. Keerthana S, Meghana H, Priyanka K, Sahana V. Rao, Ashwini B, "Smart Home Using Internet of Things", Proceedings of Perspectives in Communication, Embedded Systems and Signal Processing, 2017.

13. Sutar Shekhar, P. Sameer, Kamad Neha, Prof. Devkate Laxman, "An Intelligent Voice Assistant Using Android Platform", IJARCSMS, ISSN: 2321-7782, 2017.

14. Rishabh Shah, Siddhant Lahoti, Prof. Lavanya K, "An Intelligent Chatbot using Natural Language Processing", International Journal of Engineering Research, Vol. 6, pp. 281-286, 2017.

15. Luis Javier Rodríguez-Fuentes, Mikel Peñagarikano, Amparo Varona, Germán Bordel, "GTTS-EHU Systems for the Albayzin 2018 Search on Speech Evaluation", Proceedings of IberSPEECH, Barcelona, Spain, 2018.

