End To End Voice Assisted Inner Website Navigating Model For Handless and Blind
I. INTRODUCTION
The evolution of natural language processing has made
speech-driven actions possible. Such methods are used in
both hardware, such as robotics, and software, such as voice
assistants. Speech recognition and synthesis can enable smooth
web browsing for both people with physical disabilities and
blind people. Such an implementation can be designed using
natural language processing and web technologies to help
handless people. The solution presented here allows browsing
the web without hands, keyboard, or mouse.
II. RELATED WORK
Voice assistant software [1] provides capabilities for
voice-based interaction with an application. Through it, the
user can hear a conversational AI [2] and respond to it. The
assistant browses the web and gives the answer in the form of
human speech. It opens the web pages the user names, can
schedule tasks and work, and delivers many other services in
the form of speech. However, it cannot click a link, fill in
a form, or play media on a web page by voice. These
website-level operations are the focus of this research.
III. PROPOSED WORK
This solution makes use of the HTML5 Web Speech API and
JavaScript. Since the browser executes only front-end
languages, the solution relies on JavaScript for its operation.
This implementation offers voice-based web browsing
functionality.
Fig. 1. System Architecture
Fig. 1 describes the architecture of this solution. A user who
is handless, or anyone who is at a distance from the device,
speaks the commands to perform. These commands are listed in
Table 1 below. The application provides an interface that shows
the progress of execution, so that the user can follow the
voice-based operation.
TABLE 1. COMMANDS
Command   Description
Link      Opens links and clicks the desired one
Scroll    Automatically scrolls the webpage from top to bottom and vice versa
Search    Searches the given content on the web
Play      Plays the video
Submit    Submits the form
Zoom      Zooms the current webpage
Copy      Copies the text of the selected element
V. MODULES
The solution is divided into multiple modules, each with
separate functionality. Fig. 2 shows the module description.
1. LINK MODULE:
When the user says "link", the application searches for
all the links in the web page of the current tab and
assigns a unique ID to each of them. The user can then
choose which link to click by saying the required link
ID, after which the application performs a click
operation. The result is navigation to the chosen link
by voice.
2. AUTO SCROLL MODULE:
This module provides auto scrolling, enabling the user
to scroll automatically just by saying the command
"scroll". It also gives the option to customize the
scrolling speed.
3. MEDIA MODULE:
This module is capable of opening media sites such as
youtube.com and others, with options to play and pause
the video. It also offers options to forward the video
and to enable full-screen display.
4. TAB MODULE:
This module provides the features of opening a new tab,
closing an existing tab, and going back and forth
between pages. It also enables voice-based web search,
similar to Google's search by voice.
Fig. 2. Modules
VI. USER INTERFACE
The application logic is packaged as an extension: it is to be
developed as a browser extension and published in the Google
Chrome extension store, which makes it easy for everyone to
download, install, and use the application.
The user interface for this application is designed so that
users can easily use and manipulate its features. Visually, it
adds another layer on top of the current tabs, creating
interfaces dynamically based on user speech by use of HTML, CSS
and JS.
The session for each tab is maintained by the session handler,
so that when you move to another tab and come back to the
previous one, you see it as you left it.
Fig. 3 illustrates an example interface which depicts a
successful match of a command spoken by the user.
Fig. 4. User Interface for unmatched command
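The command flow described in the preceding sections can be illustrated with a minimal JavaScript sketch. It is not the implementation of this work. It assumes that each command of Table 1 is triggered by its keyword spoken as the first word, and that link IDs are consecutive integers starting at 1. The names COMMANDS, parseCommand, labelLinks and resolveLink are hypothetical and chosen only for illustration. The Web Speech API wiring at the end runs only inside a browser that exposes SpeechRecognition.

```javascript
// Keywords from Table 1 (assumption: the keyword is spoken as the first word).
const COMMANDS = ["link", "scroll", "search", "play", "submit", "zoom", "copy"];

// Hypothetical helper: split a recognized transcript into a command and its
// argument. Returns null when the first word is not a known command, which
// corresponds to the "unmatched command" interface.
function parseCommand(transcript) {
  const words = transcript.trim().toLowerCase().split(/\s+/).filter(Boolean);
  if (words.length === 0) return null;
  const [first, ...rest] = words;
  if (!COMMANDS.includes(first)) return null;
  return { command: first, argument: rest.join(" ") };
}

// Hypothetical helper: give every link a numeric ID (1, 2, 3, ...), as the
// link module does before the user chooses one.
function labelLinks(links) {
  const byId = new Map();
  links.forEach((link, index) => byId.set(index + 1, link));
  return byId;
}

// Hypothetical helper: map a spoken ID such as "3" back to its link.
// Spelled-out numbers ("three") are not handled in this sketch.
function resolveLink(byId, spokenId) {
  const id = Number.parseInt(spokenId, 10);
  return byId.has(id) ? byId.get(id) : null;
}

// Browser-only wiring with the Web Speech API; skipped elsewhere.
if (typeof window !== "undefined") {
  const Recognition = window.SpeechRecognition || window.webkitSpeechRecognition;
  if (Recognition) {
    const recognizer = new Recognition();
    recognizer.continuous = true; // keep listening between commands
    let labelled = null;          // link IDs waiting for the user's choice

    recognizer.onresult = (event) => {
      const latest = event.results[event.results.length - 1];
      if (!latest.isFinal) return;
      const text = latest[0].transcript;
      const parsed = parseCommand(text);
      if (parsed && parsed.command === "link") {
        // First utterance: collect and number the links of the page.
        labelled = labelLinks(Array.from(document.querySelectorAll("a[href]")));
      } else if (labelled) {
        // Second utterance: the link ID; click the link if the ID is known.
        const target = resolveLink(labelled, text);
        if (target) target.click();
        labelled = null;
      }
    };
    recognizer.start();
  }
}
```

In this sketch a transcript such as "link" first numbers the links of the page, and a following transcript such as "3" clicks the third one. Keeping the parsing and ID lookup in plain functions, separate from the recognition callback, is a design choice of the sketch that lets them be checked without a microphone.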
VII. TEXT TO SPEECH ANALYSIS FOR BLIND
Since blind people cannot see the webpage, a new methodology
can be created to read out the webpage text, thus enabling
blind users to hear the webpage content through speech. Screen
reader or text-to-speech software can do this job but has one
problem. Since such software reads all of the webpage content,
listening to it is time consuming and much of what is read is
irrelevant. For example, if a webpage has 1000 words, the blind
person has to listen to all of that text until the software
finishes reading. What the blind person needs is to know what
is available on the webpage and which commands should be said
to access it. So the content has to be mapped to commands, and
this mapping should be read out to the user by text-to-speech
software. The blind person then knows what he/she needs from
that webpage and how to access it, and follows up by saying the
respective command. This can be accomplished by two methods:
1. Using Name Attribute In HTML Tags:
By giving a name attribute to each HTML element, we can
make the text-to-speech software read the values of the
name attribute, from which we get to know the purpose
of each tag. In this way the blind person can know
about the content. But this is not feasible in general,
since we cannot assume that every website has name
attributes in its tags.
2. XML Based Approach with AI:
An XML-based structure for each webpage can be
developed using AI, so that a screen reader can read
the XML data, from which the person can know the
purpose of the content.
CONCLUSION
Access to information via the web is indispensable for
everyone, yet it is difficult for people with physical
disabilities and blind people. Considering this problem, we
have built this model using existing technology. Technology is
growing faster than before and can bring new methods of
browsing the web that make it more comfortable for blind and
handless people.
REFERENCES
[2] Conversation of AI: https://ai.googleblog.com/2018/05/duplex-ai-system-for-natural-conversation.html
[3] “Web Speech API”, Draft Community Group Report, 21 January 2020.
[4] Google Password Manager: https://passwords.google.com.