
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

PROJECT WORK PHASE I


Preliminary Synopsis Presentation Report

Batch No: Date: 17/12/2021


Title of the Project: Artificial Intelligence Based Voice Assistant

Team Members:
USN NAME
1MV18CS080 Rohit Jumnani
1MV18CS091 Saurav Singh
1MV18CS097 Shiva Singh

Guide
Table of Contents

I. Abstract

II. Introduction

III. Literature Survey

IV. System Requirements


● Hardware Requirements
● Software Requirements

V. Proposed Methodology

VI. Plan of Action

VII. Conclusion

VIII. References

Signature of Student

1. Rohit Jumnani(1MV18CS080)
2. Saurav Singh (1MV18CS091)
3. Shiva Singh(1MV18CS097)
I. ABSTRACT

In this modern era, day-to-day life has become smarter and interlinked with technology. We already
know voice assistants such as Google Assistant and Siri. Our voice assistant system can act as a
basic medical prescriber, daily schedule reminder, note writer, calculator and search tool. The
project works on voice input, gives output through voice and displays the corresponding text on the
screen. The main agenda of our voice assistant is to help people work smarter by giving instant,
computed results. The assistant takes voice input through a microphone (Bluetooth or wired),
converts the speech into a computer-understandable form, and returns the solutions and answers
the user asked for. The assistant connects to the World Wide Web to provide the results the user
has requested. Natural Language Processing (NLP) algorithms help machines engage in
communication using natural human language in its many forms.
II. INTRODUCTION

Today, the development of artificial intelligence (AI) systems that can support natural human-
machine interaction (through voice, communication, gestures, facial expressions, etc.) is gaining
popularity. One of the most studied and popular directions of interaction is based on the machine's
understanding of natural human language. It is no longer the human who learns to communicate
with the machine; the machine learns to communicate with the human, studying their actions,
habits and behaviour and trying to become their personalized assistant.

Virtual assistants are software programs that help ease your day-to-day tasks, such as showing
weather reports, creating reminders and making shopping lists. They can take commands via text
(online chatbots) or by voice. Voice-based intelligent assistants need an invoking word or wake
word to activate the listener, followed by the command. There are many virtual assistants, such
as Apple's Siri, Amazon's Alexa and Microsoft's Cortana.

This system is designed to be used efficiently on desktops. Personal assistant software improves
user productivity by managing the user's routine tasks and by providing information from online
sources to the user.

This project was started on the premise that there is a sufficient amount of openly available data
and information on the web that can be utilized to build a virtual assistant capable of making
intelligent decisions for routine user activities.
III. LITERATURE SURVEY

● In order to select a proper speech recognition system for a voice assistant, there are various
APIs, such as the Microsoft Speech API and the Google Speech API, along with open-source
speech recognition systems such as Sphinx-4. The author of paper [1] compared automatic
speech recognition systems in different environments, using audio recordings selected from
different sources and calculating the word error rate (WER). Although the WERs of the three
aforementioned systems were acceptable, the Google API was observed to be superior:
Sphinx-4 achieved 37% WER, the Microsoft API achieved 18% WER and the Google API
achieved 9% WER. Therefore, it can be stated that Google's acoustic model and language
model are superior.
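
The WER metric used in this comparison can be reproduced with a short function. The sketch below is a standard Levenshtein-distance formulation over words (a general illustration, not code from paper [1]):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance divided by reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between the first i reference words
    # and the first j hypothesis words
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i  # deleting i words
    for j in range(len(hyp) + 1):
        dp[0][j] = j  # inserting j words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1  # substitution cost
            dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                           dp[i][j - 1] + 1,        # insertion
                           dp[i - 1][j - 1] + cost) # match/substitution
    return dp[len(ref)][len(hyp)] / len(ref)

print(wer("turn on the light", "turn off the light"))  # 0.25 (one substitution)
```

A 9% WER, as reported for the Google API, means roughly one word in eleven is misrecognized.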

● In recent years, Artificial Intelligence (AI) has shown significant progress, and its potential is
still growing. One application area of AI is Natural Language Processing (NLP). Voice
assistants incorporate AI by using cloud computing and can communicate with users in natural
language. Voice assistants are easy to use, and thus millions of devices that incorporate them
are found in households nowadays.

● Paper [3] suggests a statistical approach for continuous speech recognition, with different
approaches for speech recognition systems using the perceptual linear prediction (PLP) of
speech, for example:
- dimensionality reduction for a pathological voice quality assessment system,
- speech recognition using the speech signals of spoken words.

● In [4], a scalable, user-friendly and highly interactive system has been developed that replies
to each and every query of an end user. It uses both a local database and a web database for
this purpose. Techniques such as machine learning, pattern matching, NLP and data processing
algorithms have been implemented to enhance the performance of the system.

● In [5], the author made a comparative study between human-human conversations and human-
chatbot conversations and listed the notable differences in duration and profanity of such
conversations. Background noise also needs to be handled while taking speech input for
efficient recognition of the voice sample.

● In [6], several measurement metrics have been proposed to evaluate the overall quality,
performance and usability of conversational agents. Commercial applications of chatbots have
been studied. The authors also analyzed existing English-speaking commercial chatbots
used in the B2C sector, which has the widest range of users.
IV. SYSTEM REQUIREMENTS

Hardware Requirements -
• Processor - Intel Pentium or higher
• RAM - minimum 2 GB
• Hard disk - minimum 50 GB

Software Requirements -
• VS Code, Anaconda Navigator
• Supported devices - Windows
• Important libraries - SpeechRecognition, pyttsx3, os, PyAudio
V. PROPOSED METHODOLOGY

Speech Recognition module


The system uses Google's online speech recognition service to convert speech input to text. The
speech input from the microphone is temporarily stored in the system and then sent to the Google
cloud for speech recognition. The equivalent text is then received and fed to the central processor.
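
A minimal sketch of this step, assuming the third-party SpeechRecognition and PyAudio packages listed in the software requirements. The microphone function is defined but not invoked, so the sketch runs without audio hardware; the `strip_wake_word` helper and the wake phrase "hey assistant" are illustrative assumptions, not part of the original design:

```python
def strip_wake_word(text: str, wake: str = "hey assistant") -> str:
    """Drop the wake-word prefix so only the user's command remains."""
    text = text.strip().lower()
    if text.startswith(wake):
        text = text[len(wake):].lstrip(" ,")
    return text

def listen_once(timeout: float = 5.0) -> str:
    """Capture one utterance and send it to Google's free web recognizer.
    Import is deferred so the sketch loads without the library installed."""
    import speech_recognition as sr
    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        recognizer.adjust_for_ambient_noise(source, duration=0.5)
        audio = recognizer.listen(source, timeout=timeout)
    # Raises sr.UnknownValueError if the speech is unintelligible
    return recognizer.recognize_google(audio)

print(strip_wake_word("Hey assistant, what is the time"))  # what is the time
```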

Python backend:
The Python backend receives the output from the speech recognition module and identifies
whether the command requires an API call or context extraction. The result is then sent back to the
Python backend, which delivers the required output to the user.
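
One way this dispatch step can be sketched is a keyword-to-handler table; the handler names and keywords below are illustrative assumptions, and a fuller backend might replace the keyword match with NLP intent classification:

```python
import datetime

def handle_time(_command: str) -> str:
    """Answer a 'what time is it' style query."""
    return datetime.datetime.now().strftime("It is %H:%M.")

def handle_greet(_command: str) -> str:
    """Respond to a greeting."""
    return "Hello! How can I help?"

# Keyword -> handler lookup table for recognized commands
HANDLERS = {"time": handle_time, "hello": handle_greet}

def route(command: str) -> str:
    """Dispatch a recognized command to the first matching handler."""
    command = command.lower()
    for keyword, handler in HANDLERS.items():
        if keyword in command:
            return handler(command)
    return "Sorry, I did not understand that."

print(route("hello there"))  # Hello! How can I help?
```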

API calls
API stands for Application Programming Interface. An API is a software intermediary that allows
two applications to talk to each other. In other words, an API is a messenger that delivers your
request to the provider that you’re requesting it from and then delivers the response back to you.
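
The request/response pattern can be sketched as follows. The endpoint URL and JSON fields here are hypothetical, so a canned response stands in for the live network call:

```python
import json
# from urllib.request import urlopen  # a live assistant would fetch real data

# Canned reply standing in for a hypothetical weather endpoint, e.g.
# reply = json.load(urlopen("https://api.example.com/weather?q=Bengaluru"))
SAMPLE_RESPONSE = '{"location": "Bengaluru", "temp_c": 24, "condition": "Cloudy"}'

reply = json.loads(SAMPLE_RESPONSE)
sentence = f"It is {reply['temp_c']} degrees and {reply['condition']} in {reply['location']}."
print(sentence)  # It is 24 degrees and Cloudy in Bengaluru.
```

The assistant only has to format the parsed fields into a sentence for the text-to-speech module; the API provider does the heavy lifting.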

Text-to-speech module
Text-to-Speech (TTS) refers to the ability of computers to read text aloud. A TTS engine converts
written text to a phonemic representation, then converts the phonemic representation to waveforms
that can be output as sound. TTS engines with different languages, dialects and specialized
vocabularies are available through third-party publishers.
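
A minimal sketch using pyttsx3, the offline TTS library named in the software requirements. The import is deferred and the speak call is commented out because it needs an audio device; the `clean_for_speech` helper is an illustrative addition, not part of the original design:

```python
def clean_for_speech(text: str) -> str:
    """Collapse runs of whitespace so the engine reads the text smoothly."""
    return " ".join(text.split())

def speak(text: str, rate: int = 175) -> None:
    """Read text aloud with pyttsx3 (requires the pyttsx3 package)."""
    import pyttsx3
    engine = pyttsx3.init()
    engine.setProperty("rate", rate)  # words per minute
    engine.say(clean_for_speech(text))
    engine.runAndWait()  # blocks until speech finishes

# speak("Hello, I am your assistant.")  # uncomment on a machine with audio out
print(clean_for_speech("Hello,   I am\nyour assistant."))
```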
Context extraction
Context extraction (CE) is the task of automatically extracting structured information from
unstructured and/or semi-structured machine-readable documents. In most cases, this activity
involves processing human-language texts using natural language processing (NLP). Recent
activities in multimedia document processing, such as automatic annotation and content extraction
from images, audio and video, can also be seen as context extraction.
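
A toy version of this extraction step, using simple pattern matching as a stand-in for full NLP. The "remind me to X at Y" command shape and the returned field names are illustrative assumptions:

```python
import re

def extract_reminder(command: str):
    """Pull the task and time slots out of a 'remind me to X at Y' command.
    Returns None when the command does not match the pattern."""
    m = re.search(
        r"remind me to (.+?) at (\d{1,2}(?::\d{2})?\s*(?:am|pm)?)",
        command,
        re.IGNORECASE,
    )
    if not m:
        return None
    return {"task": m.group(1).strip(), "time": m.group(2).strip()}

print(extract_reminder("Remind me to call mom at 6 pm"))
# {'task': 'call mom', 'time': '6 pm'}
```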
VI. PLAN OF ACTION

Detailed Plan of Action:

Task: Idea formation and discussion with team and guide teacher
Due date: 2nd November 2021
Outcome: Finished successfully
Comments: Focused on finding the scope and objective of the idea; chalked out the plan after a
discussion with the entire team.

Task: Finding resources and defining the problem statement clearly
Due date: 15th November 2021
Outcome: Finished successfully
Comments: Started looking for relevant reference materials and published research work regarding
the idea.

Task: Literature Review I, division of respective work and discussion of the implementation
procedure with guide teacher
Due date: 1st December 2021
Outcome: Finished successfully
Comments: Read about 6-8 papers as a team; discussed and brainstormed implementation
procedures that could be followed.

Task: Literature Review II, synopsis preparation and submission with guide teacher
Due date: 24th December 2021
Outcome: In progress
Comments: Read more papers, bringing the review to around 12-15 papers; prepared a synopsis
containing a brief description of the idea, including the hardware and software requirements and
applications of the proposed solution.

Task: Presentation of idea - Phase 2
Due date: Late December 2021
Outcome: To be done

Task: Submission of draft review/survey paper with guide teacher
Due date: Early to mid January 2022
Outcome: To be done

Task: Final paper submission for publishing
Due date: Late January 2022 to February 2022
Outcome: To be done
VII. CONCLUSION

In this paper, "Virtual Assistant Using Python", we discussed the design and implementation of a
digital assistant. The project is built using open-source software modules with Python community
backing, so it can readily accommodate future updates. The modular nature of this project makes
it flexible and easy to extend with additional features without disturbing current system
functionality.

The assistant not only acts on human commands but also responds to the user based on the query
asked or the words spoken, such as opening tasks and performing operations. It greets the user in
a way that makes the user feel comfortable and free to interact with the voice assistant. The
application also eliminates unnecessary manual work from the user's everyday tasks. The entire
system works on verbal input rather than typed input.
VIII. REFERENCES

[1] V. Këpuska and G. Bohouta, Electrical & Computer Engineering Department, Florida
Institute of Technology, Melbourne, FL, USA, March 2017.

[3] Department of Cybernetics and Biomedical Engineering, Faculty of Electrical Engineering
and Computer Science, VSB-Technical University of Ostrava, 23 October 2020.

[4] A. Hajare, P. Bhosale, R. Nanaware and G. Hiremath, "Chatbot for Education System",
vol. 3, no. 2, April 2018.

[5] Hill, W. Randolph Ford and I. G. Farreras, "Real conversations with artificial intelligence: a
comparison between human-human online conversations and human-chatbot conversations",
Computers in Human Behaviour, 2018.

[6] K. Kuligowska, "Commercial Chatbot: Performance Evaluation, Usability Metrics, and
Quality Standards of ECA", 2021.
