
Major Project Report

on
Text and Facial Recognition for Visually Impaired Persons
Submitted to
Vishveshwarya Group of Institutions, GB Nagar

In partial fulfilment of the requirement for the award of the degree of


Bachelor of Technology
In
Computer Science and Engineering
By
SOURAV SINGH (1609610092)
SUDHANSHU SINGH (1609610093)
PANKAJ KR DWIVEDI (1609610058)
PINKI KUMARI (1609610059)

Under the guidance of
Lakshman Singhhave
Asst. Prof, Computer Science & Engineering
Vishveshwarya Group Of Institutions,
Dr. A.P.J. Abdul Kalam Technical University, UP.

DECLARATION

I, PANKAJ KUMAR DWIVEDI, student of B.Tech (CSE), hereby declare that the
project titled “TEXT AND FACIAL RECOGNITION FOR VISUALLY IMPAIRED
PERSONS”, which is submitted by me to the Department of Computer Science and
Engineering, Vishveshwarya Group of Institutions, AKTU, Uttar Pradesh, in partial
fulfilment of the requirement for the award of the degree of Bachelor of Technology
in Computer Science and Engineering, has not previously formed the basis for the
award of any degree, diploma or other similar title or recognition.

Date: Pankaj Kumar Dwivedi


(1609610058)

CERTIFICATE

On the basis of the declaration submitted by PANKAJ KUMAR DWIVEDI, student of
B.Tech CSE, I hereby certify that the project titled “TEXT AND FACIAL RECOGNITION
FOR VISUALLY IMPAIRED PERSONS”, submitted to the Department of Computer
Science and Engineering, Vishveshwarya Group of Institutions, AKTU, Uttar Pradesh, in
partial fulfilment of the requirement for the award of the degree of Bachelor of Technology
in Computer Science and Engineering, is an original contribution to existing knowledge
and a faithful record of work carried out by him under my guidance and supervision.

Date: 11-08-2020
Lakshman Singhhave
Department of Computer Science and Engineering
Vishveshwarya Group of Institutions, Dadri
Greater Noida, Uttar Pradesh

ACKNOWLEDGEMENT

I have put a great deal of effort into this project. However, it would not have been possible
without the kind support and help of many individuals and organizations. I would like to
extend my sincere thanks to all of them.
I am highly indebted to Lakshman Singhhave, Assistant Professor, Vishveshwarya Group of
Institutions, Greater Noida, for his guidance and constant supervision as well as for providing
the necessary information regarding the project. His guidance will always remain in my memory.

I am also grateful to our respected HOD, Prof. Dharmesh Niranjan, Department of Computer
Science and Engineering, for his regular encouragement and endeavours.

I am very fortunate to have the unconditional support of my family. I thank my parents, who
gave me the courage to get my education and supported me in all achievements throughout my
life. Without their encouragement, this work would indeed have been very difficult for me to
tackle. Above all, I pay my reverence to the almighty God.

Pankaj Kumar Dwivedi


(1609610058)

TABLE OF CONTENTS

CHAPTERS PAGE NO.

I. DECLARATION i
II. CERTIFICATE ii
III. ACKNOWLEDGEMENT iii
IV. TABLE OF CONTENTS iv
V. TABLE OF FIGURES v
1. ABSTRACT 1
2. INTRODUCTION 2
3. LITERATURE SURVEY 3-6
4. PROPOSED WORK 7-9
5. IMPLEMENTATION 10-22
5.1 REQUIRED HARDWARE AND SOFTWARE 10-14
5.1.1 Raspberry Pi 3 Model B 10
5.1.2 Pi CAMERA 11
5.1.3 EARPHONE 11
5.1.4 MICRO USB CABLE 12
5.1.5 MICRO SD CARD 12
5.1.6 RASPBIAN 13
5.1.7 OpenCV 13
5.1.8 VNC VIEWER 14
5.1.9 Thonny Python IDE 14
5.2 SYSTEM SETUP 15-16
5.3 CONNECTING TO RASPBERRY PI USING VNC VIEWER
AND OPENING PYTHON IDE 17-18
5.3.1 ENTER RASPBERRY PI’S PRIVATE IP ADDRESS 17
5.3.2 PROVIDE USERNAME (pi) FOR AUTHENTICATION
TO VNC SERVER 17
5.3.3 YOU NEED TO ENTER YOUR CREATED PASSWORD
FOR SECURE LOGIN TO THE SERVER 18
5.3.4 MENU>>Programming>>Thonny Python IDE 18
5.4 OCR TEXT READER FOR VISUALLY IMPAIRED PERSONS 19-20
5.4.1 RUN THE CODE FOR TEXT READER 19
5.4.2 TEXTS TO BE READ 19
5.4.3 TEXTS RECOGNIZED FROM THE IMAGE 20
5.5 FACE RECOGNITION FOR VISUALLY IMPAIRED PERSONS 20-22
5.5.1 RUN THE FACE RECOGNITION CODE 20-21
5.5.2 RESULT OF RECOGNIZED FACE 22
6. ANALYSIS 23
7. CONCLUSION 24
8. FUTURE WORK 24
9. REFERENCES 25-26

TABLE OF FIGURES

TITLE PAGE NO.


Table 1. Literature Review Comparison 5-6
Figure 1. Flowchart for Face recognition 7
Figure 2. Flowchart for Text Recognition 8
Figure 3. Block Diagram of the Proposed Method 8
Figure 4. Raspberry pi 3 model B 10
Figure 5. Pi Camera 11
Figure 6. Earphone 11
Figure 7. Micro USB Cable 12
Figure 8. Micro Sd Card 12
Figure 9. Raspbian 13
Figure 10. OpenCV 13
Figure 11. VNC Viewer 14
Figure 12. Thonny Python IDE 14
Figure 13. Raspberry pi connections 15
Figure 14. Pi Camera Connection 16
Figure 15. VNC Viewer Window 17
Figure 16. Username for Authentication 17
Figure 17. Password for Authentication 18
Figure 18. Open Thonny Python IDE 18
Figure 19. OCR Reader Code 19
Figure 20. Text to be read 19
Figure 21. Text recognized output 20
Figure 22. Run Face Recognition code 20
Figure 23. Recognized face 22
Figure 24. Results of text and face recognition 23

1. ABSTRACT

Blindness is a visual impairment that affects about 0.7% of the world's population. According
to the latest estimates, almost one million people in Spain suffer from visual disabilities, and
about 70,000 of them have total blindness due to retinal diseases. According to estimates of
the World Health Organization (WHO), around 285 million people suffer from some sort of
visual impairment, of which 39 million are blind, which amounts to about 0.7% of the world's
population.
Speech and text are the main media of human communication. A person needs vision to
access the information in a text, but those who have poor vision can gather information
from voice. This project proposes a camera-based assistive text reader and facial recognition
system to help visually impaired persons read the text present in a captured image and
recognise their family members or friends based on saved data. Faces can be detected when a
person enters the frame.
The proposed idea also involves text extraction from the scanned image using Tesseract Optical
Character Recognition (OCR) and conversion of the text to speech by the eSpeak tool, a process
which enables visually impaired persons to read the text. This is a prototype that lets blind
people recognise products in the real world by extracting the text on an image and converting it
into speech. The proposed method is carried out using a Raspberry Pi, and portability is
achieved by using a battery backup. Thus the user can carry the device anywhere and use it at
any time. Upon entering the camera view, previously stored faces are identified and announced,
which can be extended as a future technology.
An individual is also identified by his or her face. The face is the most important part of the
human body for distinguishing people from one another. Each face has different features and
characteristics, so face recognition plays a vital role in human interaction.
Through this project, blind people can themselves recognise individuals using a
face-recognition tool: an audio alert states the name of the individual, so a blind person can
speak to them without having to wait for the individual in front of them to come and talk,
only having to recognise the individual (provided the information of the person is recorded
in the program repository). When the individual's image is not found in the repository, a
voice alert stating "Unknown Person" is played so that the user is informed of it. A fresh face
may also be added to the repository. Basically, the aim is to bring the standard of sight of the
impaired a little closer to the ordinary. Users will also be able to read and understand texts:
they can interpret the name of a drug, street signs, the title of a film, etc. from the captured
video. This technology can help millions of people in the world who experience a significant
loss of vision.

Keywords: Tesseract Optical Character Recognition (OCR), OpenCV, text detection and
recognition, face recognition, text-to-speech conversion, visually challenged persons.

2. INTRODUCTION

Due to eye diseases, age-related causes, uncontrolled diabetes, accidents and other reasons,
the number of visually impaired persons increases every year. One of the most important
problems for a visually impaired person is reading. According to estimates of the World
Health Organization (WHO), around 285 million people suffer from some form of visual
impairment, of which 39 million are blind, which amounts to about 0.7% of the world's
population.
Recent developments in mobile phones and computers, and the availability of digital
cameras, make it possible to help blind users by developing camera-based applications that
combine computer vision tools with other existing useful products such as Optical Character
Recognition (OCR) systems or biometric identification.
This work is undertaken with the free, open-source machine vision library OpenCV, running
on a Raspberry Pi with Python. OpenCV was developed for processing efficiency, with a
strong emphasis on real-time applications.
The face is the most important part of a person's body for distinguishing people from each
other. Every face has different features and characteristics, so face recognition plays a
significant role in human interaction.
In this project, blind individuals themselves are able to identify people using a face
recognition technique: they receive an audio message stating the person's name, and can
therefore talk to that person without having to wait for the person in front of them to
approach them, provided the person's details are saved in the system database.
If the face of the person is not present in the database, an audio message stating "Unknown
Person" is played, so the user will be aware of them.
New faces can also be added to the database.
The proposed project also involves text extraction from a scanned image using Tesseract
Optical Character Recognition (OCR) and conversion of the text to speech by the eSpeak
tool, a method that enables visually impaired persons to read the text.
This is a prototype that lets blind people recognise products in the real world by extracting
the text on an image and converting the extracted text into speech, so that they can read sign
boards, medicine names, name plates, etc.
The proposed method is carried out using a Raspberry Pi, and portability is achieved by
using a power bank, so the user can carry the device anywhere and use it at any time.
Upon entering the camera view, previously stored faces are identified and announced, which
can be extended as a future technology.
This technology can help countless people in the world who experience a major loss of
vision. Basically, the idea is to bring the vision level of the blind a little closer to that of
sighted people.

3. LITERATURE SURVEY

After searching digital libraries for papers that address the problems of persons with low
vision, an overview of the works found is presented in this section.

1. Krishna et al. (2005) [2] developed a pair of sunglasses with a pinhole camera, which
uses the Principal Component Analysis (PCA) algorithm (Kistler and Wightman,
1992)[3] for face recognition. The idea is to be able to later evolve the system from
face to emotion, gesture and facial expressions recognition. The sunglasses system
was validated with a highly controlled dataset, which uses a precisely calibrated
mechanism to provide robust face recognition.

2. Pun et al. (2007) present a survey on assistive devices for visually impaired persons.
The survey covers works that use video processing for converting visual data into an
alternative rendering modality, such as auditory. Most of these studies focus on
daily tasks such as navigation and object detection, but not on people recognition.

3. Astler et al. (2011) [4] used a camera atop a standard white cane to perform face
recognition using the Luxand Face SDK (Luxand, 2013), and to identify six kinds of
facial expressions using the Seeing Machines Face API.

4. Tanveer et al. (2012) developed a system called FEPS, which uses Constrained Local
Model algorithm for facial expressions recognition providing audible feedback, and
Fusco et al. (2012) proposed a method which combines face matching and identity
verification modules in feedback.

5. Porzi et al. (2013) developed a gesture recognition system for a smartwatch that
increases its usability and accessibility to assist people with visual disabilities. The
user presses the display of smartwatch to start the gesture input. Then, the user
performs a gesture and the signals generated by the integrated accelerometers of
smartwatch are sent via Bluetooth to a smartphone. These signals are processed and
then the system recognizes the gesture and activates the corresponding function.
When the task is completed, the user receives vibration feedback. Moreover, the
system has two modules: one for identifying wet floor signs and one for automatic
recognition of predefined logos. A downside of it is that the smartwatch cannot be
directly programmed.

6. Kramer et al. [5] have implemented a client application for a smartphone that acquires
images using the phone’s built-in camera, wirelessly transmits them to a remote
server for identification, receives the recognition results and then transmits them to
the user via the phone’s speech interface. The server application utilizes VeriLook [6],
a commercially available face recognition package, with the ability to detect and
recognize multiple faces per frame. To determine the robustness of the VeriLook
technology to changes in viewpoint, images of 10 subjects were taken from 15
different positions with different head orientations. To make the test more realistic,
images of 78 additional people were also downloaded from the CalTech and
GeorgiaTech face databases and added to the database of known faces. Experiments
showed that VeriLook could tolerate up to 40° and 20° changes in viewpoint and
head tilt angles, respectively. The system has been reported to have high recognition
accuracy based on initial tests conducted with 10 known users.

7. Blind Assistant (2012) [7] is a software platform that integrates many different
functionalities for the visually impaired, namely, place recognition, e-mail (reading
and dictating), colour recognition and bar code reading. We will focus our discussion
on the face recognition module of this system. This solution utilizes the Nano
desktop, a freely available, open-source software aimed at developing computer
vision applications on embedded systems (F. Battaglia et al., 2009) [8]. The system
consists of a handheld console equipped with a pair of RISC microprocessors, a video
accelerator, a wireless connection, a USB port and a slot for flash memory cards. A
web cam connected to the console is used to acquire images of the scene in front of
the user. The images are normalized with respect to luminosity, the faces within them
are detected using the Viola-Jones algorithm (P. Viola and M. Jones, 2001)[9] and
recognition is performed based on the PCA algorithm.

8. Balduzzi et al. (2010) [10] have developed a prototype for a compact PC that acquires
a video stream from a small form-factor video camera and
analyzes it to detect human faces in the scene (by detecting skin-colored regions and
finding faces among them using a cascade of Support Vector Machine (SVM)
classifiers (V. N. Vapnik, 1998); eye and nose detection is then applied to the face
regions to select the faces in which these features are unoccluded). The face
recognition module, which is based on Local Binary Patterns (LBP) (T. Ahonen et al.,
pp. 2037-2041, 2006)[12], attempts to recognize the detected faces. To avoid audio
spamming, this module aggregates the results over N consecutive frames and provides
feedback only if the last N frames have provided some concrete results. If the person
is identified or an unknown person is detected, in either case, an audio feedback is
provided to the user via a speaker set. LBP descriptions were selected based on some
initial tests that demonstrated their superiority over Local Ternary Patterns (X. Tan
and B. Triggs, 2007, pp. 168-182) and Histograms of Oriented Gradients (N. Dalal and
B. Triggs, 2005, pp. 886-893). The system was found to be robust to viewpoint changes of up
to 30 degrees. Interviews conducted with prospective users of this prototype revealed
that though most people were satisfied with the face detection and feedback speeds,
the I/O interface and the face recognition capabilities need to be
substantially improved to meet the users’ expectations.

The concept behind this smart cap is different from any work conducted so far.
Unlike others, this model is focused on building an array of known face encodings:
the dataset of faces is made available to the model, after which the array of known face
encodings is created and used to match features for identifying faces. Recognised texts
can also be heard by visually impaired persons through earphones. There are other
innovations such as the smart button, smart goggles, and so on, but the concept of this
smart cap is unique, and would be really useful to people lacking vision.
Literature Review Comparison:

1. Krishna et al. (2005)
   Description: Developed a pair of sunglasses with a pinhole camera, which uses the
   Principal Component Analysis (PCA) algorithm for face recognition.
   Advantage: The sunglasses system was validated with a highly controlled dataset,
   which uses a precisely calibrated mechanism to provide robust face recognition.
   Disadvantage: Output given by this system is not very accurate.

2. Pun et al. (2007)
   Description: Present a survey on assistive devices for visually impaired persons.
   Advantage: The survey covers works that use video processing for converting visual
   data into an alternative rendering modality, such as auditory.
   Disadvantage: Most of these studies focus on daily tasks such as navigation and
   object detection, but not on people recognition.

3. Astler et al. (2011)
   Description: Used a camera atop a standard white cane to perform face recognition
   using the Luxand Face SDK.
   Advantage: User-friendly and easy to use.
   Disadvantage: Identifies only six kinds of facial expressions.

4. Tanveer et al. (2012)
   Description: Developed a system called FEPS, which uses the Constrained Local Model
   algorithm for facial expression recognition providing audible feedback, and proposed
   a method which combines face matching and identity verification modules in feedback.
   Advantage: Face matching and identity verification modules in feedback.
   Disadvantage: Unable to recognize every expression of faces.

5. Porzi et al. (2013)
   Description: Developed a gesture recognition system for a smartwatch that increases
   its usability and accessibility to assist people with visual disabilities.
   Advantage: Gesture-controlled smartwatch for identifying wet floor signs and for
   automatic recognition of predefined logos.
   Disadvantage: The smartwatch cannot be directly programmed.

6. Kramer et al.
   Description: Implemented a client application for a smartphone that acquires images
   using the phone’s built-in camera, wirelessly transmits them to a remote server for
   identification, receives the recognition results and then transmits them to the user
   via the phone’s speech interface.
   Advantage: The server application utilizes VeriLook, a commercially available face
   recognition package, with the ability to detect and recognize multiple faces per frame.
   Disadvantage: Blind people are unable to use smartphones.

7. Blind Assistant (2012)
   Description: A software platform that integrates many different functionalities for
   the visually impaired, such as place recognition, e-mail (reading and dictating),
   colour recognition and barcode reading.
   Advantage: Place recognition, e-mail (reading and dictating), colour recognition and
   barcode reading facilities are present in the device.
   Disadvantage: The system consists of a handheld console.

8. Balduzzi et al. (2010)
   Description: Developed a prototype for a compact PC that acquires a video stream from
   a small video camera and analyses it to detect human faces in the scene, by detecting
   skin-coloured regions and finding faces among them using a cascade of SVM classifiers.
   Advantage: The system was found to be robust to viewpoint changes of up to 30 degrees.
   Disadvantage: The face recognition capabilities need to be substantially improved to
   meet the users’ expectations.

Table 1. Literature Review Comparison

4. PROPOSED WORK

The proposed system is mainly designed to overcome the problems that a visually impaired
person faces when moving in public places.
The system is built around the idea of detecting and recognising faces.

A methodology for face recognition based on an information-theory approach of coding and
decoding the face image is discussed in [Sarala A.].
The proposed method is a combination of two stages -
First Stage - Face detection using the created array of known face encodings and
their names.
Second Stage - The face in the camera video is matched against the array of known
faces.
If the face matches, the eSpeak tool speaks the name of the matched known face.
Various face detection and recognition methods have been evaluated [Faizan Ahmad et al.,
2013], and a solution for image detection and recognition was proposed as an initial step.
An implementation of face recognition using principal component analysis with
four distance classifiers is proposed in [Hussein Rady, 2011].
Figure 1. Flowchart for Face recognition


From Figure 1, we can see that the person in the picture frame is identified first. Then
comes the feature extraction process, where the features of the detected face are extracted. In
the next step, the extracted features are matched with the saved data. If the data matches, we
get a message saying the person’s name; else, we get a message saying "Unknown Person".
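The matching step described above can be sketched in pure Python. This is a minimal illustration, not the project's actual code: it assumes face encodings are already available as fixed-length lists of floats (in practice they would come from a face-encoding library), and the 0.6 threshold and sample names are illustrative.

```python
import math

def match_face(known_encodings, known_names, candidate, threshold=0.6):
    """Compare a candidate encoding against the array of known face
    encodings and return the matched name, or "Unknown Person"."""
    best_name, best_dist = "Unknown Person", float("inf")
    for name, enc in zip(known_names, known_encodings):
        # Euclidean distance between the two encoding vectors
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(enc, candidate)))
        if dist < best_dist:
            best_name, best_dist = name, dist
    # Accept the closest match only if it is within the threshold
    return best_name if best_dist <= threshold else "Unknown Person"
```

In the full system, the returned name (or "Unknown Person") would then be passed to the speech stage.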

Figure 2. Flowchart for Text Recognition

For the text recognition software, the Raspberry Pi camera module captures a picture from a
live frame, as shown in Figure 2. Once the picture is processed, the Tesseract OCR system
extracts the text from the image, and the extracted text is converted to speech using the
eSpeak tool. This makes it easy for a person without vision to interpret text using this smart cap.
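The capture-OCR-speech pipeline just described could be sketched as below. This is a hedged example, not the project's listing: it assumes the tesseract and espeak binaries, plus the pytesseract and Pillow packages, are installed on the Pi, and the image path is illustrative.

```python
import subprocess

def build_espeak_command(text, words_per_minute=140):
    """Build the argument list for the eSpeak tool; -s sets speech rate."""
    return ["espeak", "-s", str(words_per_minute), text]

def read_image_aloud(image_path):
    """Extract text with Tesseract OCR, then speak it with eSpeak."""
    # pytesseract is a thin wrapper around the tesseract binary;
    # both must be installed on the Raspberry Pi for this to work.
    import pytesseract
    from PIL import Image
    text = pytesseract.image_to_string(Image.open(image_path))
    if text.strip():
        subprocess.run(build_espeak_command(text.strip()))
```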

Figure 3. Block Diagram of the Proposed Method

The block diagram of the proposed method (Figure 3) shows the working model of the
software.
The Raspberry Pi detects the face or text and converts the recognised faces or texts into
audio form using appropriate techniques.
OpenCV is used for face recognition and Tesseract OCR for text recognition.

This project provides a real-time application of face recognition which will be very helpful
for blind people. Several face recognition algorithms and various techniques have been used
in different processes, and face recognition is considered a very powerful method. The
existing face recognition system runs on the MATLAB platform [18], which is not
open-source software and is less portable. The PCA technique used with the Eigenface
algorithm is widely used; the disadvantages that occur with the usage of the PCA technique
have been overcome by the Haar cascade classifier. OpenCV is open-source software that is
used to run this project efficiently.

The Raspberry Pi 3 B is the prototype board used to execute the algorithm. The proposed
model implements an image processing system using OpenCV on a Raspberry Pi 3, which
runs a Linux-based operating system named Raspbian OS. The handy device, developed for
visually impaired people, is connected to the internet via the Raspberry Pi 3's inbuilt Wi-Fi,
which links the device to fetch data from the path provided. We chose this platform because
it is cost-effective and the size of the prototype is also very small compared to alternative
prototypes.

The OpenCV software has pre-defined algorithms in it. Those algorithms can perform
ordinary feature detection as well as colour detection; however, when we use the Haar
cascade algorithm within OpenCV, we get more accurate results in a shorter run time.

For reading texts, the Tesseract Optical Character Recognition (OCR) system is used. OCR
systems offer people who are blind or visually impaired the ability to scan printed text and
then have it converted into spoken speech. The elements of OCR technology are scanning,
recognition, and reading text. Current OCR systems provide excellent accuracy and
formatting capabilities. The audio output is fed to the ear through an earphone.

The Raspberry Pi 3 B module is used to run the OpenCV software. It can be charged using
an ordinary portable charger. This module is preferred because it is portable.
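The Haar-cascade detection step mentioned in this section might look like the following sketch. It assumes the opencv-python package; the cascade file is the frontal-face model that ships with OpenCV, and the largest_detection helper is an illustrative addition, not part of the original project.

```python
def largest_detection(rects):
    """Pick the largest (x, y, w, h) rectangle returned by
    detectMultiScale, i.e. the face closest to the camera."""
    return max(rects, key=lambda r: r[2] * r[3]) if rects else None

def detect_faces(image_path):
    """Detect faces with OpenCV's Haar cascade classifier.
    Assumes opencv-python is installed on the device."""
    import cv2
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    gray = cv2.cvtColor(cv2.imread(image_path), cv2.COLOR_BGR2GRAY)
    # scaleFactor and minNeighbors here are common defaults, chosen
    # for reasonable speed on the Pi
    return list(cascade.detectMultiScale(gray, scaleFactor=1.1,
                                         minNeighbors=5))
```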

Face recognition has many practical applications; a few common areas are listed below:

 Enterprise Security - Computer Access Control
 Government Events - Terrorist Screening
 Immigration - Illegal immigration detection
 Casino - Filtering VIPs
 Toy - Intelligent robotics
 Vehicle - Safety alert system

5. IMPLEMENTATION
5.1 REQUIRED HARDWARE AND SOFTWARE:
5.1.1 Raspberry Pi 3 Model B:
 Raspberry Pi is a series of small flat-board computers produced in the United
Kingdom by the Raspberry Pi Organization to facilitate simple computer science
instruction in schools and developing countries.
 The initial model was even more common than planned, selling for purposes such as
robotics beyond its target demographic.
 The Raspberry Pi 3 Model B is the first version of the Raspberry Pi third edition.
 In February 2016, the Raspberry Pi 2 Model B was discontinued.
 Raspberry Pi 3 Model B+ is the newest version in the Raspberry Pi 3 series, but more 
costly than Model B.

Specifications:
o Processor: Broadcom BCM2387 chipset, 1.2GHz Quad-Core ARM Cortex-A53, 802.11 b/g/n
Wireless LAN
o GPU: Dual Core VideoCore IV® Multimedia Co-Processor. Provides OpenGL ES 2.0,
hardware-accelerated OpenVG, and 1080p30 H.264 high-profile decode. Capable of 1Gpixel/s,
1.5Gtexel/s or 24 GFLOPs with texture filtering and DMA infrastructure
o Memory: 1GB LPDDR2
o OS: Boots from Micro SD card, running a version of the Linux operating system or
Windows 10 IoT
o Dimensions: 85 x 56 x 17mm
o Power: Micro USB socket, 5V, 2.5A
o Connectors: 10/100 BaseT Ethernet socket
o Video Output: HDMI (rev 1.3 & 1.4), Composite RCA (PAL and NTSC)
o Audio Output: 3.5mm jack, HDMI
o USB: 4 x USB 2.0 connectors
o GPIO Connector: 40-pin 2.54 mm (100 mil) expansion header (2x20 strip) providing 27 GPIO pins as
well as +3.3 V, +5 V and GND supply lines
o Camera Connector: 15-pin MIPI Camera Serial Interface (CSI-2)
o Display Connector: Display Serial Interface (DSI), 15-way flat flex cable connector with two data lanes
and a clock lane
o Memory Card Slot: Push/pull Micro SDIO

Figure 4. Raspberry pi 3 model B
5.1.2 Pi CAMERA:
 The Raspberry Pi Camera Module is a 5MP CMOS camera with a fixed-focus lens,
good enough to capture high-quality still photographs as well as video.
 Stills are captured at a resolution of 2592 x 1944.
 Video is supported at 1080p at 30 FPS, 720p at 60 FPS and 640x480 at 60 or 90 FPS.
 The camera is enabled in the current edition of Raspbian, the popular operating
system for the Raspberry Pi.

Figure 5. Pi Camera
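A still capture at the full 2592 x 1944 resolution quoted above could be scripted as follows. This is a sketch assuming the legacy picamera library available on Raspbian; the output filename and the small helper that validates a requested mode are illustrative additions.

```python
# Modes taken from the resolutions listed in this section
SUPPORTED_MODES = {(2592, 1944), (1920, 1080), (1280, 720), (640, 480)}

def is_supported(resolution):
    """Check a requested (width, height) against the modes listed above."""
    return tuple(resolution) in SUPPORTED_MODES

def capture_still(output_path="capture.jpg", resolution=(2592, 1944)):
    """Capture one still image at the 5MP sensor's full resolution.
    Assumes the legacy picamera library is available on Raspbian."""
    import time
    from picamera import PiCamera
    camera = PiCamera()
    try:
        camera.resolution = resolution
        camera.start_preview()
        time.sleep(2)  # let the sensor settle and auto-expose
        camera.capture(output_path)
    finally:
        camera.close()
```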

5.1.3 EARPHONE:
• Delivers an audio output to support users who are visually impaired.
• The audio source is the 3.5mm jack.
• It plays the audio output for the recognized faces or texts.

Figure 6. Earphone

5.1.4 MICRO USB CABLE:


 The Micro USB cable is the standard connector for charging the Raspberry Pi and for
data transfer.
 It can be used not only to power the Pi, but also as a way to view it in a single
window.
 Typical micro-USB cables have 28-gauge wires for both data and power; really cheap
ones might even go as low as 32 gauge.
 While thinner wires have no effect on data transmission (data wires can be thin
because they carry little current), they significantly limit power delivery to
something like a Raspberry Pi 3, which requires more power than the previous
versions.
 The best USB cables have 24-gauge power wires, and that should be enough for
most people.

Figure 7. Micro USB Cable
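The gauge figures above can be made concrete with a quick voltage-drop estimate using Ohm's law (V = I × R). The per-metre resistance values are approximate standard figures for copper AWG wire, not taken from this report.

```python
# Approximate resistance of copper wire (ohms per metre)
RESISTANCE_PER_M = {24: 0.0842, 28: 0.2129}

def voltage_drop(gauge_awg, current_a, length_m):
    """Voltage lost in a USB cable's power pair: current flows out on
    VBUS and back on GND, so the effective wire length is doubled."""
    return current_a * RESISTANCE_PER_M[gauge_awg] * 2 * length_m

# A Raspberry Pi 3 can draw about 2.5 A. Over a 1 m cable, a 28-gauge
# pair drops roughly 1.06 V versus roughly 0.42 V for 24 gauge --
# enough to push a nominal 5 V supply below the Pi's under-voltage
# warning threshold.
```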


5.1.5 MICRO SD CARD:
 A1-rated memory card with a modern design; the device runs smoother.
 Film, save and share more than before.
 Capture life at its best with the capabilities of this A1 microSD card.
 A1-rated for better device performance, opening and loading applications faster
than ever.
 Standard transfer speeds of up to 98 MB/s move content extremely quickly.
 You can move up to 1,200 photos in only one minute.

Figure 8. Micro Sd Card


5.1.6 RASPBIAN
 Raspbian is the official operating system supported by the Raspberry Pi Foundation.
 You can install it with NOOBS or download the image for free from
"raspberrypi.org", following the setup guide on the website.
 Raspbian comes pre-installed with plenty of software for education, programming
and general use.
 It has Python, Scratch, Sonic Pi, Java and more.
 The Raspbian disk image in the ZIP archive is over 4 GB in size, which means the
archive uses features that are not supported by outdated unzip tools on some
platforms.
 If you find that the archive appears to be corrupt or the file does not unzip
properly, please try using 7-Zip (Windows). It is available for free and has been
verified to unzip the image properly.

Figure 9. Raspbian
5.1.7 OpenCV:
• OpenCV (Open Source Computer Vision) is a library of programming functions mainly
aimed at real-time computer vision (Pulli et al., 2012) [19].
• Originally developed by Intel, it was later supported by Willow Garage and then Itseez
(which was later acquired by Intel [20]).
• The library is cross-platform and free for use under the open-source BSD license.
• OpenCV supports some models from deep learning frameworks such as TensorFlow,
Torch, PyTorch (after conversion to an ONNX model) and Caffe, according to a defined list of
supported layers [21].
• It promotes OpenVisionCapsules [22], a portable format compatible with all
other formats.

Figure 10. OpenCV
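As a small illustration of the pixel-level work OpenCV performs in this project, the grayscale conversion applied before face detection follows the standard luminance weighting; cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY) applies the same weights over the whole frame in optimized C. The per-pixel version below is purely illustrative.

```python
def bgr_to_gray(pixel):
    """Convert one (B, G, R) pixel to grayscale using the same
    luminance weights OpenCV applies: Y = 0.299 R + 0.587 G + 0.114 B."""
    b, g, r = pixel
    return 0.114 * b + 0.587 * g + 0.299 * r

# In the real pipeline this is a single call on the whole frame:
#   gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
```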


5.1.8 VNC VIEWER:
 Often it is not convenient to work directly on the Raspberry Pi. You may want to
work on it from another device by remote control.

 VNC is a graphical desktop-sharing system that lets you remotely control
the desktop interface of one computer (running VNC Server) from another
computer or mobile device (running VNC Viewer).

 VNC Viewer transmits keyboard and mouse or touch events to the VNC
Server, and receives screen updates in return.

 RealVNC's VNC Connect is included with Raspbian. It consists of both VNC Server,
which allows you to control your Raspberry Pi remotely, and VNC Viewer, which
allows you to control desktop computers remotely from your Raspberry Pi.

 You must enable VNC Server before it can be used.

 You can then display the Raspberry Pi screen in a browser, on your monitor, or on a
phone or tablet.

Figure 11. VNC Viewer


5.1.9 Thonny Python IDE:

 Raspberry Pi (specifically, Raspbian with desktop) comes with a few built-in
integrated development environments (IDEs) for writing, running and debugging
Python scripts.

 Thonny was explicitly designed with a particular focus: to be a Python IDE for
beginners.

 This becomes apparent as soon as you launch the software - only the editor and the
shell are shown in the window.

 At the top there are big, easy-to-use buttons for creating, loading, saving,
running, stopping and debugging scripts (Figure 12).

Figure 12. Thonny Python IDE

5.2 SYSTEM SETUP

Figure 13. Raspberry pi connections

 Insert a microSD card:

Your Raspberry Pi requires an SD card to hold all of its data and the Raspbian
operating system. You need a microSD card with a capacity of at least 8 GB.

 Link your Raspberry Pi to a television or computer screen:

To access the Raspbian desktop environment, you need a screen and a cable to
connect it to the Pi. A TV, a computer display or even your phone will work. If the
display has built-in speakers, the Pi can use them to play sound.

 Connect Headphones or speakers:

Full-size Raspberry Pi models (but not the Pi Zero / Zero W) have a standard audio port
like the one on your phone or MP3 player. You can attach headphones or speakers to it
for audio playback. If your Raspberry Pi display has built-in speakers, the Pi can play
sound through them, and you can also attach speakers via wireless Bluetooth devices.
 Connect a Pi Camera:

The Pi Camera is linked to the Raspberry Pi through a ribbon connector. It has to be
activated from raspi-config.

 Connect to a power socket:

Every Raspberry Pi model is powered through a USB port (the same kind used on many
smartphones): either USB-C for the Raspberry Pi 4, or micro USB for the Raspberry Pi 3,
2 and 1.

Setting up Pi Camera:

Figure 14. Pi Camera Connection

As shown in figure 14, the Pi Camera cable is placed into the CSI bus connector. The top
plate of the connector must be raised to allow the cable to be inserted. The cable is inserted
with the metallic contacts pointing to the left, i.e. towards the HDMI connector. Make sure
the cable is pushed well down, then push the connector plate back down.

The next step is to update the Raspbian operating system and to make sure that the Pi
Camera is configured correctly. Configuring the Pi Camera software requires the LX
Terminal application on the Raspberry Pi. Start the LX Terminal and then type:
 
sudo apt-get update
sudo apt-get upgrade
 
These commands update the installed packages and ensure the latest version of the Raspbian
operating system is installed. The next step is to make sure that the Pi Camera is enabled.
Type:
 
sudo raspi-config
 

Whatever method you use, find the Interfacing Options > Camera option and select Yes.
You'll be prompted to reboot your Raspberry Pi; do so and wait for it to restart. Enabling
this option ensures that on reboot the correct GPU firmware is running for the Pi Camera.

Now you are good to go: your device setup is ready. Next, you need to connect to VNC
Server to configure your device.

5.3 CONNECTING TO RASPBERRY PI USING VNC VIEWER AND OPENING PYTHON IDE

5.3.1 ENTER RASPBERRY PI’S PRIVATE IP ADDRESS

Figure 15. VNC Viewer Window

If you have Raspbian, VNC Server is included with your Raspberry Pi. It’s completely free
for non-commercial use. Just enable it.

You have to enter the private IP address of your Raspberry Pi in VNC Viewer to access
the Raspbian desktop environment (figure 15).

5.3.2 PROVIDE USERNAME (pi) FOR AUTHENTICATION TO VNC SERVER

You need to authenticate to VNC Server to complete either a direct or a cloud connection.
If you are connecting from a RealVNC-compatible VNC Viewer app, type the username and
password that you usually use to sign in to your Raspberry Pi user account. If you haven't
changed your Pi's username and password, the defaults are:

 Username: pi
 Password: raspberry

Figure 16. Username for Authentication

5.3.3 ENTER YOUR CREATED PASSWORD FOR SECURE LOGIN TO THE SERVER

Figure 17. Password for Authentication

For secure login and to prevent misuse of the system, you need to log in with a username
and password.

Note: it is strongly encouraged that you update your password! Anyone with access to
your network can easily reach your Pi using the default login details (figure 16).

5.3.4 MENU>>Programming>>Thonny Python IDE

Figure 18. Open Thonny Python IDE

Raspberry Pi comes with a preinstalled Python development environment called Thonny
Python IDE. You can write and run your Python code in this IDE. You can open it in
Raspbian by following these steps: MENU>>Programming>>Thonny Python IDE. First go
to the start menu and then click on Programming; there you will see Thonny Python IDE
(as shown in figure 18).

5.4 OCR TEXT READER FOR VISUALLY IMPAIRED PERSONS

5.4.1 RUN THE CODE FOR TEXT READER

Figure 19. OCR Reader Code

After pressing the "Run" icon, our program begins to run; within a second the Pi Camera
starts working, with a red light glowing on it. The red light shows that the camera is ready
to detect text. To read some text, simply bring it into the frame.

5.4.2 TEXTS TO BE READ

Figure 20. Text to be read

When the text is in the frame, you just need to press the "s" key on your keyboard. The
program then captures the image and detects the text in the picture. The system is able to
read noisy data as well; by noisy data we mean text with a noisy background. In the picture
above, the system reads text from a smartphone screen, which shows that it can read
emails, contact details and a caller's name from a smartphone.
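As a rough sketch (not the project's exact code), the press-"s"-to-capture loop can be written with OpenCV and pytesseract roughly as follows; the espeak call for the audio output is an assumption about how the speech step is wired up:

```python
def clean_ocr(raw):
    # Collapse stray line breaks in Tesseract output into one readable line.
    return " ".join(raw.split())

def capture_and_read():
    # Lazy imports so clean_ocr() is usable without the OCR stack installed.
    import os
    import cv2
    import pytesseract
    cam = cv2.VideoCapture(0)          # Pi Camera exposed as a video device
    while True:
        ok, frame = cam.read()
        if not ok:
            break
        cv2.imshow("frame", frame)
        key = cv2.waitKey(1) & 0xFF
        if key == ord("s"):            # "s" captures and reads the frame
            text = clean_ocr(pytesseract.image_to_string(frame))
            os.system('espeak "{}"'.format(text))   # speak via earphones
        elif key == ord("q"):          # "q" quits the loop
            break
    cam.release()
    cv2.destroyAllWindows()
```

The whitespace cleanup matters because Tesseract returns text with line breaks that make the spoken output choppy.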

5.4.3 TEXTS RECOGNIZED FROM THE IMAGE

Figure 21. Text recognized output

Figure 21 shows that the system recognizes the text with good accuracy. With the help of
the system, a visually impaired person can read text from street signs, documents, medical
prescriptions, poetry, the title of a novel, and much more, like a person with normal vision.
We just have to press the "s" key on the keyboard, and we get the text in the form of audio
via earphones or speakers. Tesseract works best when the source data has fairly simple
segmentation.

5.5 FACE RECOGNITION FOR VISUALLY IMPAIRED PERSONS

5.5.1 RUN CODE FOR FACE RECOGNITION

Figure 22. Run Face Recognition code

After pressing the "Run" icon, our program begins to run, but it takes a few seconds to
start the video. This is because every time we run the code, the arrays of known face
encodings and their names are generated. Once the arrays are built, the red light stays on
over the camera, which means it is ready to recognize faces.
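The encoding arrays built at start-up can be sketched like this; the known_faces/ folder layout (one image per person, named after them) is an assumption for illustration, not necessarily the project's exact layout:

```python
import os

def name_from_filename(filename):
    # "sourav.jpg" -> "sourav": the label spoken when that face matches.
    return os.path.splitext(os.path.basename(filename))[0]

def load_known_faces(folder):
    # Lazy import: face_recognition is only needed on the Pi itself.
    import face_recognition
    known_encodings, known_names = [], []
    for fn in sorted(os.listdir(folder)):
        image = face_recognition.load_image_file(os.path.join(folder, fn))
        encodings = face_recognition.face_encodings(image)
        if encodings:                  # skip images with no detectable face
            known_encodings.append(encodings[0])
            known_names.append(name_from_filename(fn))
    return known_encodings, known_names
```

Building the encodings once at start-up, rather than per frame, is what causes the few-second delay before the video appears.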

The following libraries have been used for making this application:

1. numpy
2. opencv
3. face_recognition
4. os
5. pytesseract

1. Numpy – A general-purpose array-processing library which provides high-level tools
for working on arrays. All the elements in an array are of the same type, indexed by a
tuple of positive integers. Here we use numpy's resize function, which returns an array
of a size specified by the user.
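For example, np.resize fills a new array of the requested shape, repeating the source elements when the target is larger:

```python
import numpy as np

a = np.array([1, 2, 3, 4])
b = np.resize(a, (2, 3))   # new 2x3 array; elements repeat to fill it
print(b)
# [[1 2 3]
#  [4 1 2]]
```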

2. cv2 – cv2 is the Python module of OpenCV, which stands for Open Source Computer
Vision. As the name suggests, it helps with various image operations, including reading,
writing and other image-processing operations used in digital image-processing
applications. OpenCV lets us perform many operations on an image matrix, but in this
program we use one of the most basic functions of cv2, imread(). As the name suggests,
it is used to read an image into matrix form. The image we want to read must exist at a
path in some directory. imread() takes two parameters: the path, and an integer flag from
{1, 0, -1} which selects the format in which the image is read – colour, greyscale and
unchanged, respectively.
3. Face Recognition- Recognize and manipulate faces from Python or from the command
line with the world’s simplest face recognition library. The face_recognition library, created
by Adam Geitgey, wraps around dlib’s facial recognition functionality, making it easier to
work with.
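Under the hood, matching boils down to Euclidean distances between 128-dimensional face encodings. A sketch of that step in plain numpy, mirroring compare_faces' default 0.6 tolerance (the helper name is ours, not the library's):

```python
import numpy as np

TOLERANCE = 0.6   # face_recognition's default matching threshold

def best_match(known_encodings, known_names, candidate):
    # Return the name of the nearest known encoding, or "Unknown Person"
    # when no distance falls within the tolerance.
    if not known_encodings:
        return "Unknown Person"
    distances = np.linalg.norm(np.array(known_encodings) - candidate, axis=1)
    idx = int(np.argmin(distances))
    return known_names[idx] if distances[idx] <= TOLERANCE else "Unknown Person"
```

This is why an unregistered face is announced as "Unknown Person": its encoding is far from every stored one.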

4. os – This module makes it possible to perform many operating-system tasks
automatically. It provides a portable way of using operating-system-dependent
functionality: functions for creating and removing a directory (folder), fetching its
contents, changing and identifying the current directory, and so on. The os module
provides a wide range of useful methods to manipulate files and directories. Here, we
have used the os library for converting text to speech.
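The exact wiring of the speech step is an assumption here: a common approach on the Pi is to shell out to the espeak utility via os.system, e.g.:

```python
import os
import shlex

def speak_command(text):
    # Build the shell command that voices a message through espeak.
    # shlex.quote protects against quoting problems in recognized text.
    return "espeak {}".format(shlex.quote(text))

def speak(text):
    # Runs only where espeak is installed (as on our Raspberry Pi).
    os.system(speak_command(text))
```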

5. pytesseract – Pytesseract is an optical character recognition (OCR) tool for Python. That is,
it will recognize and “read” the text embedded in images. Python-tesseract is a wrapper
for Google’s Tesseract-OCR Engine. It is also useful as a stand-alone invocation script to
tesseract, as it can read all image types supported by the Pillow and Leptonica imaging
libraries, including jpeg, png, gif, bmp, tiff, and others. Additionally, if used as a script,
Python-tesseract will print the recognized text instead of writing it to a file.
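A minimal sketch of how pytesseract is typically invoked (the greyscale conversion is our own preprocessing choice, not required by the library):

```python
def read_text(image_path):
    # Imports are inside the function so this sketch loads even where the
    # OCR stack is not installed; on the Pi both packages are present.
    import cv2
    import pytesseract
    image = cv2.imread(image_path)
    grey = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)  # cleaner input for OCR
    return pytesseract.image_to_string(grey)
```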

5.5.2 RESULT OF RECOGNIZED FACE

Figure 23. Recognized face

Through this research project, blind people can themselves recognize individuals: the
face-recognition tool gives an audio alert that speaks the name of the individual, so a blind
person can talk to them without having to wait for the person in front of them to come and
speak first – provided the person's information is recorded in the program's repository.
When the individual's image is not found in the repository, a voice alert says "Unknown
Person".

In this way, a visually handicapped person (a person lacking vision) is able to identify
individuals known to them by name. A visually impaired person is also alerted to unknown
people and can ask about their identity before interacting with them. It gives a kind of
sight to people lacking vision and also serves their safety.

In figure 23 you can see that the system has recognized my face: it shows my name on the
screen and I can also hear my name through the earphones connected to the device. If
another person whose face is not in the database is present in the frame with me, we hear
a voice calling my name and also "unknown person". This shows that the system is smart
enough to recognize two or more people at a time.
This smart cap is very useful for visually impaired persons. They are also able to read
and understand text: they can interpret the name of a drug, street signs, the title of a film,
and so on from the captured video. This technology can help millions of people in the
world who experience a significant loss of vision.

6. ANALYSIS

With efficient programming, the module recognizes faces and text and gives audio output
through the earphone. The main purpose of this model is to help blind persons recognize
people and read text using this system design. Known persons are announced by name,
and unknown persons are identified as such using the face-recognition features. It delivers
the scanned and recognized images in the form of audio output to help and guide the blind
person. It is specially designed to assist blind or visually impaired persons. The proposed
system was tested with several prepared sample texts and faces. The experimental results
of reading text and recognizing faces with the proposed smart cap are shown. The text may
be relatively simple, but it proves the basic concept of the proposed system design.

Figure 24. Results of text and face recognition


From the above figure we can see the results of the sample tests. The left one shows the
result of face recognition and the right one shows the result of text recognition. Both
applications work properly with good accuracy. The detected text and the name of the
person whose face is recognized are conveyed through earphones.

7. CONCLUSION

1. The facial recognition system shows that the system can assist the blind in many
functions. It uses a Raspberry Pi kit to execute this method. The Raspberry Pi is a
tiny board which can run software such as Raspbian, Linux, FreeBSD, NetBSD and
RISC OS. It helps visually impaired persons to recognize people or read texts. The
proposed system does not use any prior knowledge about the position of the people.
The experimental analysis was performed on a large dataset. The system takes input
from the camera and extracts the edges of the images. This work presents an entirely
new concept of a smart cap designed for visually impaired people using the low-cost
single-board computer Raspberry Pi 3 Model B and a Pi Camera. For demonstration
purposes, the cap is designed to perform text recognition. The system, however, can
easily be extended to multiple tasks by adding more modules to the core program;
each module represents a specific task or mode. The system design, working
mechanism and principles were discussed together with some experiment results.
This new concept is expected to improve visually impaired students' lives regardless
of their economic circumstances.

8. FUTURE WORK

Immediate future work includes assessing the user-friendliness and optimizing the power
management of the computing unit. Future work on text recognition includes: (a) the
original image, (b) the image after enhancement, (c) real-time text reading without any use
of a key. For face recognition: distinguishing general human emotions, such as happiness
and sadness. Recognizing human emotions would require accurate detection and analysis
of the various parts of a person's face, such as the brow and the mouth, to determine an
individual's current expression. The expression can then be compared to what are
considered the basic signs of an emotion in all people.

9. REFERENCES

1. Fernández, J. L. Carús, R. Usamentiaga and R. Casado, "Face Recognition and
Spoofing Detection System Adapted to Visually-Impaired People," IEEE Latin America
Transactions, vol. 14, no. 2, Feb. 2016.
2. S. Krishna, G. Little, J. Black, and S. Panchanathan, "A wearable face recognition
system for individuals with visual impairments," in Proceedings of the 7th International
ACM SIGACCESS Conference on Computers and Accessibility, Baltimore, MD, USA,
2005, pp. 106-113.
3. M. Turk and A. Pentland, "Face recognition using Eigenfaces," in Proc. of IEEE
Conference on Computer Vision and Pattern Recognition, 1991, pp. 586-591.
4. D. Astler, H. Chau, K. Hsu, A. Hua, A. Kannan, L. Lei, M. Nathanson, E. Paryavi,
K. Ripple, M. Rosen, H. Unno, C. Wang, K. Zaidi, and X. Zhang, "A Computer Vision
Device for Improving Social Interactions of the Visually Impaired," University of
Maryland, 2010.
5. K. M. Kramer, D. S. Hedin, and D. J. Rolkosky, "Smartphone based face recognition
tool for the blind," in Engineering in Medicine and Biology Society (EMBC), 2010
Annual International Conference of the IEEE, 2010, pp. 4538-4541.
6. "VeriLook SDK." Available: http://www.neurotechnology.com/verilook.html
7. F. Battaglia and G. Iannizzotto, "An open architecture to develop a handheld device
for helping visually impaired people," IEEE Transactions on Consumer Electronics,
vol. 58, pp. 1086-1093, 2012.
8. F. Battaglia, G. Iannizzotto, and F. L. Rosa, "An open and portable software
development kit for handheld devices with proprietary operating systems," IEEE Trans.
on Consum. Electron., vol. 55, pp. 2436-2444, 2009.
9. P. Viola and M. Jones, "Rapid object detection using a boosted cascade of simple
features," in IEEE Computer Society Conference on Computer Vision and Pattern
Recognition, 2001, pp. 511-518.
10. L. Balduzzi, G. Fusco, F. Odone, S. Dini, M. Mesiti, A. Destrero, and A. Lovato,
"Low-cost face biometry for visually impaired users," in Biometric Measurements and
Systems for Security and Medical Applications (BIOMS), 2010 IEEE Workshop on,
2010, pp. 45-52.
11. V. N. Vapnik, Statistical Learning Theory. Wiley, 1998.
12. T. Ahonen, A. Hadid, and M. Pietikainen, "Face Description with Local Binary
Patterns: Application to Face Recognition," IEEE Trans. Pattern Anal. Mach. Intell.,
vol. 28, pp. 2037-2041, 2006.
13. X. Tan and B. Triggs, "Enhanced local texture feature sets for face recognition under
difficult lighting conditions," in Proceedings of the 3rd International Conference on
Analysis and Modeling of Faces and Gestures, Rio de Janeiro, Brazil, 2007, pp. 168-182.
14. N. Dalal and B. Triggs, "Histograms of Oriented Gradients for Human Detection," in
Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and
Pattern Recognition (CVPR'05) - Volume 1, 2005, pp. 886-893.
15. Sarala A. Dabhade & Mrunal S. Bewoor (2012), "Real Time Face Detection and
Recognition using Haar-based Cascade Classifier and Principal Component Analysis,"
International Journal of Computer Science and Management Research, Vol. 1, No. 1.
16. Faizan Ahmad, Aaima Najam & Zeeshan Ahmed (2013), "Image-based Face Detection
and Recognition: State of the Art," IJCSI International Journal of Computer Science
Issues, Vol. 9, Issue 6, No. 1.
17. Hussein Rady (2011), "Face Recognition using Principal Component Analysis with
Different Distance Classifiers," IJCSNS International Journal of Computer Science and
Network Security, Vol. 11, No. 10, pp. 134-144.
18. K. Krishan Kumar Subudhi and Ramshankar Mishra, "Human Face Detection and
Recognition."
19. Pulli, Kari; Baksheev, Anatoly; Kornyakov, Kirill; Eruhimov, Victor (1 Apr 2012),
"Realtime Computer Vision with OpenCV," Queue: 40:40-40:56.
doi:10.1145/2181796.2206309 (inactive 2019-12-01).
20. Intel acquires Itseez: https://opencv.org/intel-acquires-itseez.html
21. Dmitry Kurtaev, 2018. Available: https://github.com/opencv/opencv/wiki/Deep-Learning-in-OpenCV
22. Andrey Golubev and Dmitry Kurtaev. Available: https://github.com/opencv/open_vision_capsules

