D8 - Major - Project Report Final
A PROJECT REPORT
ON
“OBJECT RECOGNITION AND FACE RECOGNITION
FOR VISUALLY IMPAIRED PEOPLE”
BACHELOR OF TECHNOLOGY
IN
COMPUTER SCIENCE AND ENGINEERING
Submitted by
May 2022
Rukmini Knowledge Park, Kattigenahalli, Yelahanka, Bengaluru-560064
www.reva.edu.in
SCHOOL OF COMPUTER SCIENCE AND ENGINEERING
CERTIFICATE
Certified that the project work entitled “Object Recognition and Face Recognition for Visually Impaired People” was carried out under the guidance of Prof. Bindushree D C, Assistant Professor, School of CSE, REVA University, by the bonafide students of REVA University during the academic year 2021–22. Mr. Manoj Sharma V (R18CS218), Mr. Manthan N R (R18CS219), Mr. Lalith Kumar Nitesh Kumar Mehtha (R18CS200), and Mr. Krishnapuram Kalyan Kumar Reddy (R18CS191) are submitting the project report in partial fulfillment of the requirements for the award of Bachelor of Technology in Computer Science and Engineering during the academic year 2021–22. The project report has been tested for plagiarism and has passed the plagiarism test with a similarity score of less than 16%. The project report has been approved as it satisfies the academic requirements.
Dr. Ashwinkumar
Prof. Bindushree D C.
Guide Deputy Director
External Examiner
Dr. M Dhanamjaya
Vice Chancellor
1.
2.
DECLARATION
We, Mr. Manoj Sharma V (R18CS218), Mr. Manthan N R (R18CS219), Mr. Lalith Kumar
Nitesh Kumar Mehtha (R18CS200), Mr. Krishnapuram Kalyan Kumar Reddy
(R18CS191), students of B. Tech, belong to School of Computer Science and Engineering,
REVA University, declare that this Project Report entitled “Object Recognition and Face
Recognition for Visually Impaired People” is the result of the project work done by us under
the supervision of Mrs. Bindushree D C, Assistant Professor, School of CSE, REVA
University.
We are submitting this Project Report in partial fulfillment of the requirements for the award of the
degree of Bachelor of Technology in Computer Science and Engineering by REVA University,
Bengaluru, during the academic year 2021–22.
We declare that this project report has been tested for plagiarism and has passed the plagiarism test
with a similarity score of less than 16%, and that it satisfies the academic requirements in respect of
the project work prescribed for the said degree.
We further declare that this project, or any part of it, has not been submitted for any other degree of
this University or any other University/Institution.
1.
2.
3.
4.
(Signature of the Students)
Signed on 27th May 2022.
Certified that this project work, submitted by Mr. Manoj Sharma V (R18CS218), Mr.
Manthan N R (R18CS219), Mr. Lalith Kumar Nitesh Kumar Mehtha (R18CS200), and Mr.
Krishnapuram Kalyan Kumar Reddy (R18CS191), has been carried out under our guidance,
and the declaration made by the candidates is true to the best of our knowledge.
ACKNOWLEDGEMENT
We feel it is our duty to acknowledge the help rendered to us by various persons.
With immense pleasure, we express our sincere gratitude, regards and thanks to our
project guide Mrs. Bindushree D C, Asst. Professor, School of Computer Science and
Engineering, REVA University, Bengaluru, for extending her full support, co-operation,
guidance, motivation and all the help needed throughout the course of our project
work. We can never forget her valuable guidance and the timely suggestions given to us.
We wish to record our profound and sincere gratitude to our Chancellor Dr. P
Shyama Raju and Vice Chancellor Dr. M Dhanamjaya, REVA University, Bengaluru,
for extending their full support and co-operation by allowing us to do the project in the
establishment.
We would like to thank our entire teaching and non-teaching faculty for their
support, and our friends for their friendship, which made life at REVA enjoyable and
memorable.
We would like to thank one and all who directly or indirectly helped us in
completing this project work.
ABSTRACT
The technology presented here is designed to help visually impaired people recognize objects and
faces in their environment, allowing users to walk around safely without colliding with obstacles.
Object identification is performed on the video captured by the camera. OpenCV, YOLO, and
FaceNet are used to recognize faces and objects in the recorded video. When a human face is
spotted, the algorithm matches a name to the individual, and the user is then given an audio
version of that person's identity. Likewise, objects spotted in the area are delivered to the user as
audio.
Keywords: Visually Impaired, YOLO, OpenCV, Object Identification, Face Recognition.
TABLE OF CONTENTS
Acknowledgement
Abstract
Table of Contents
List of Figures
List of Tables
Chapter 1: Introduction
1.1 Introduction
1.2 Problem Statement
1.3 Objective
1.4 Scope of Work
Chapter 2: Literature Survey
Chapter 3: Proposed Work
3.1 Methodology
3.2 System Implementation
LIST OF FIGURES
LIST OF TABLES
Chapter 1
INTRODUCTION
1.1 Introduction
Many people are disabled, whether temporarily or permanently, and there are many blind people
throughout the world. According to the World Health Organization (WHO), about 39 million people
are fully blind, while another 285 million are visually impaired [5]. Many supporting or guiding
systems have been produced, or are being developed, to help such people navigate from one place
to another. Poor eyesight affects common tasks that require medium sight, interaction, reading and
writing that require near sight, analysis of the surrounding space, and activities involving distant sight.
Emerging developments in computer vision technology have prompted researchers to focus their
efforts on providing solutions for persons with visual impairments. These devices are designed to
help users perceive their immediate surroundings. The prospect of deploying technologies to detect
objects [7] and individuals in the immediate surroundings was investigated in this study. The
detection result is delivered to the user as digital audio, since the capabilities of vision and hearing
are similar in many respects. A real-time object detection and facial recognition system is detailed,
with the goal of making the user aware of the objects and people around them.
1.2 Problem Statement
The visually impaired are often unable to move freely because they cannot recognize the terrain
and their surroundings. In daily life, they repeatedly require assistance and walking-support
systems. Without vision, it can be difficult to navigate a room or hallway without running into
objects. Even with assistance, such as a walking stick, avoiding obstacles can be difficult,
uncomfortable, and possibly inaccurate.
1.3 Objective
The goal of this project is to create an object recognition system that can distinguish between 2D and
3D objects in a picture. The features used and the classifier chosen for recognition determine the
performance of an object recognition system. This study aims to present a new feature extraction
approach for extracting global features and obtaining local features from the study area. In addition,
the work aims to combine classical classifiers in order to recognize the object.
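As a sketch of the global-plus-local idea described above (the specific descriptors here — an intensity histogram and grid-of-patch means — are illustrative assumptions, not the report's actual feature extraction method): a global descriptor summarizes the whole image, local descriptors capture regional detail, and the two are concatenated into one vector for a classical classifier.

```python
import numpy as np

def global_histogram(image, bins=16):
    """Global feature: normalized intensity histogram of the whole image."""
    hist, _ = np.histogram(image, bins=bins, range=(0, 256))
    return hist / hist.sum()

def local_patch_means(image, grid=4):
    """Local features: mean intensity of each cell in a grid x grid partition."""
    h, w = image.shape
    ph, pw = h // grid, w // grid
    return np.array([
        image[r * ph:(r + 1) * ph, c * pw:(c + 1) * pw].mean()
        for r in range(grid) for c in range(grid)
    ])

def feature_vector(image):
    """Concatenate global and local descriptors for a classical classifier."""
    return np.concatenate([global_histogram(image), local_patch_means(image)])

img = np.zeros((64, 64), dtype=np.uint8)
vec = feature_vector(img)
print(vec.shape)  # (32,): 16 histogram bins + 16 patch means
```

The resulting fixed-length vector could then be fed to any classical classifier, e.g. an SVM or k-NN from Scikit-learn.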
1.4 Scope of Work
The system is both free and accessible, and it provides results in real time. It is reliable: the visually
impaired user may depend on the system for accurate results. Different objects, such as a chair or a
table, can be clearly distinguished, depending on the video quality.
Chapter 2
LITERATURE SURVEY
[1] One of the most challenging situations for the blind or visually impaired population is coping
with unemployment. Many schools adapt the existing Braille system to educate them, but out of 12
million visually impaired people, only 10% make an effort to learn Braille. Using computer vision
to read any text in any format and lighting condition, an inexpensive wearable device was designed
using a Raspberry Pi along with a camera to record the surrounding content and translate it for the
blind in their language of choice, plus a sensor to alert the user about the distance to an object. The
system combines image processing, machine learning and speech synthesis techniques. The
accuracy recorded with both the optical character recognition and the object recognition algorithms
was found to be 84% [1].
[2] Smart Specs produce a voice output for visually impaired persons using text detection. The
specs comprise an inbuilt camera to capture images, which are then analyzed using Tesseract
Optical Character Recognition (OCR). Text is converted to speech with the open-source speech
synthesizer eSpeak, and headphones deliver the speech produced by TTS. A Raspberry Pi acts as
the interface between the camera, sensors and image processing, and controls the peripheral units.
[3] In the field of electronic travel aids (ETA), which combines sensor technology and signal
processing, the mobility of visually impaired persons in dynamic conditions is greatly improved.
Results have been achieved in areas such as integrated environments for assisted movement,
acoustical virtual reality (AVR), and bio-inspired solutions [3].
[4] It is described how a CNN-based correlation algorithm can help visually impaired persons. Given
the wealth of information that can be derived from pictures captured, adding a visual processing unit
in the framework of systems that aid persons with visual impairments is urgently important,
regardless of the version presented. This research describes a correlation technique that uses cellular
neural networks (CNNs) to improve the characteristics of helping systems and provide more
information from the surroundings to visually impaired people. Parallel processing can handle the
majority of the operations (calculations) in the suggested approach. As a result, the computing time
may be reduced, and the computing time does not rise proportionately with the size of the template
pictures.
[5] Many computer vision technologies have been developed to assist blind or visually impaired
people. Wayfinding, navigation, and finding daily necessities have all been made easier with the
help of camera-based systems. The observer's movement causes all scene objects, whether
stationary or not, to appear to move; as a result, detecting moving objects from a moving observer
is critical.
Chapter 3
PROPOSED WORK
3.1 Methodology
Identifying several objects in a picture is called object detection, and it includes both object
localization and object categorization. A first, basic method would be to slide a window of variable
dimensions over the image and use a network trained on cropped photos to predict the content class
at each position. This procedure has a significant computational cost, but it can be automated
efficiently using convolutions.
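The sliding-window baseline can be sketched as follows; the window size, stride, and the stand-in `classify` function are illustrative assumptions (a real system would call a trained network on each crop), shown only to make the cost of the exhaustive scan concrete.

```python
import numpy as np

def sliding_windows(image, win, stride):
    """Yield (x, y, crop) for every window position over the image."""
    h, w = image.shape[:2]
    for y in range(0, h - win + 1, stride):
        for x in range(0, w - win + 1, stride):
            yield x, y, image[y:y + win, x:x + win]

def classify(crop):
    # Stand-in for a trained classifier: here, "object" just means a bright patch.
    return "object" if crop.mean() > 127 else "background"

image = np.zeros((64, 64), dtype=np.uint8)
image[16:32, 16:32] = 255  # a bright square to detect
hits = [(x, y) for x, y, crop in sliding_windows(image, win=16, stride=16)
        if classify(crop) == "object"]
print(hits)  # [(16, 16)]
```

Every window position triggers one classifier call, which is exactly the cost that a convolutional (YOLO-style) formulation amortizes into a single forward pass.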
The main principle behind YOLO is to place a grid on an image (typically 19×19) in which just one
cell, the one holding the center/midpoint of an object, is responsible for identifying that object.
The recorded image is broken into small grid cells in this method, and the midpoint of each object
is assigned to one of them. The midpoint of the bounding box is (bx, by), and its width and height
are bw and bh. A confidence score is determined from this, and if the probability at the midpoint is
equal to or greater than the confidence threshold, the object or person name matching that
confidence level is predicted.
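The grid assignment can be sketched in a few lines. The 19×19 grid follows the text, while the 608×608 input size and the exact target encoding below are assumptions based on common YOLO conventions, not details taken from this report's implementation.

```python
def encode_box(box, grid=19, img_w=608, img_h=608):
    """Map a box (x_min, y_min, x_max, y_max) in pixels to YOLO-style targets:
    the (row, col) of the grid cell holding the midpoint, plus
    (bx, by) = the midpoint's offset inside that cell, and
    (bw, bh) = the box size relative to the whole image."""
    x_min, y_min, x_max, y_max = box
    cx = (x_min + x_max) / 2.0          # midpoint in pixels
    cy = (y_min + y_max) / 2.0
    cell_w, cell_h = img_w / grid, img_h / grid
    col, row = int(cx // cell_w), int(cy // cell_h)
    bx = cx / cell_w - col              # offset inside the cell, in [0, 1)
    by = cy / cell_h - row
    bw = (x_max - x_min) / img_w        # size relative to the image
    bh = (y_max - y_min) / img_h
    return row, col, bx, by, bw, bh

print(encode_box((0, 0, 64, 64)))
```

Only the cell at (row, col) is trained to predict this object, which is what lets YOLO detect everything in one pass over the grid.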
The Intersection over Union (IoU) method is used to assess object localization; it quantifies the
overlap between two bounding boxes. Many candidate outputs may be generated when estimating
the bounding box of a given object in a particular grid cell; Non-Max Suppression (NMS) ensures
each object is identified just once. It chooses the box with the highest probability and discards the
other boxes that overlap it heavily (high IoU).
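A minimal IoU and Non-Max Suppression sketch, assuming boxes are given as (x_min, y_min, x_max, y_max) corner coordinates and the 0.5 IoU threshold is an illustrative default:

```python
def iou(a, b):
    """Intersection over Union of two boxes (x_min, y_min, x_max, y_max)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / float(area_a + area_b - inter)

def non_max_suppression(detections, iou_threshold=0.5):
    """detections: list of (box, score). Keep the highest-scoring box,
    drop every remaining box that overlaps it too much, repeat."""
    kept = []
    pending = sorted(detections, key=lambda d: d[1], reverse=True)
    while pending:
        best = pending.pop(0)
        kept.append(best)
        pending = [d for d in pending if iou(best[0], d[0]) < iou_threshold]
    return kept

dets = [((10, 10, 50, 50), 0.9),       # two overlapping guesses for one object
        ((12, 12, 52, 52), 0.8),
        ((100, 100, 140, 140), 0.7)]   # a separate object
print(non_max_suppression(dets))
```

Here the two overlapping boxes collapse to the single highest-scoring one, while the distant box survives untouched.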
The camera captures live video, and frames are drawn from the footage. The objects, as well as any
person's face, are identified, and each detection's name and confidence score are determined. The
audio output is created after text-to-speech conversion.
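The detection-to-speech step might look like the following sketch. `format_announcement`, its wording, and the name "Rahul" are hypothetical; a real system would hand the returned string to a TTS engine (e.g. pyttsx3's `say()`) rather than print it.

```python
def format_announcement(label, confidence, is_person=False):
    """Turn one detection into the sentence handed to text-to-speech.
    Hypothetical helper: phrasing and thresholds are illustrative only."""
    pct = int(round(confidence * 100))
    if is_person:
        return f"{label} is in front of you, {pct} percent sure"
    return f"A {label} detected ahead, {pct} percent sure"

# One frame's detections, as (label, confidence, is_person) triples.
frame_detections = [("chair", 0.91, False), ("Rahul", 0.87, True)]
for label, conf, person in frame_detections:
    print(format_announcement(label, conf, person))
```

Keeping this formatting step separate from detection makes it easy to change the spoken phrasing, or the output language, without touching the vision pipeline.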
The Python libraries used include Scikit-learn for machine learning, OpenCV for computer vision,
TensorFlow for neural networks, and more. Real-time computer vision tasks are performed with
OpenCV, while YOLO provides a framework for object detection in near-real time. Keras is a deep
neural network library compatible with TensorFlow and other frameworks. It is user-friendly and
makes neural-network-based machine learning models extremely straightforward to train; it is a
useful toolbox for a number of applications since it contains a variety of neural network add-ons,
such as layers, optimizers, and activation functions.
The hardware used comprises a camera for live video capture and headphones for audio output.
Chapter 4
RESULT ANALYSIS
Figure 2 shows an example output of facial recognition on the system, complete with bounding box,
name of person detected, and confidence score.
Figures 3, 4, 5, and 6 show the sample output, which shows identified objects with bounding boxes, labels,
and confidence ratings. These photos were captured for the purpose of object detection.
The live input, expected output, live output, and status of the face recognition tests are detailed in
Table 1.
Table 2 shows the live input, expected output, live output, YOLO confidence score, and face
recognition testing status.
Chapter 5
CONCLUSION AND FUTURE ENHANCEMENT
5.1 Conclusion
Object categorization and localization within a scene are two of the most challenging aspects of
object detection. The application of deep neural networks has aided the identification of objects;
however, implementing such strategies requires a significant amount of computing and memory
resources. Even so, deep neural network designs for object detection, such as YOLO, produce
positive results, demonstrating that they may be utilized for real-time object identification and face
recognition, which can benefit the visually impaired.
5.2 Future Enhancement
For object detection at night, the camera's night-vision mode should be accessible as an integrated
feature. For visual monitoring, the scale of the design remains constant: when the size of the
monitored object decreases over time, the background takes precedence over the tracked object,
and in that case the item may not be traceable. Splitting and merging with a single camera is not
possible in all cases, resulting in a loss of content from the projection of a 3D object into 2D images.
REFERENCES
[1] M.P. Arakeri, N.S. Keerthana, M. Madhura, A. Sankar, T. Munnavar, “Assistive Technology for
the Visually Impaired Using Computer Vision”, International Conference on Advances in Computing,
Communications and Informatics (ICACCI), Bangalore, India, pp. 1725-1730, Sept. 2018.
[2] R. Ani, E. Maria, J.J. Joyce, V. Sakkaravarthy, M.A. Raja, “Smart Specs: Voice Assisted Text
Reading system for Visually Impaired Persons Using TTS Method”, IEEE International Conference
on Innovations in Green Energy and Healthcare Technologies (IGEHT), Coimbatore, India, Mar.
2017.
[4] L. Ţepelea, A. Gacsádi, I. Gavriluţ, V. Tiponuţ, “A CNN Based Correlation Algorithm to Assist
Visually Impaired Persons”, IEEE Proceedings of the International Symposium on Signals Circuits
and Systems (ISSCS 2011), pp. 169-172, Iasi, Romania, 2011.
[5] P. Szolgay, L. Ţepelea, V. Tiponuţ, A. Gacsádi, “Multicore Portable System for Assisting Visually
Impaired People”, 14th International Workshop on Cellular Nanoscale Networks and their
Applications, pp. 1-2, University of Notre Dame, USA, July 29-31, 2014.
[6] E.A. Hassan, T.B. Tang, “Smart Glasses for the Visually Impaired People”, 15th International
Conference on Computers Helping People with Special Needs (ICCHP), pp. 579-582, Linz, Austria,
2016.
[7] M. Trent, A. Abdelgawad, K. Yelamarthi, “A Smart Wearable Navigation System for Visually
Impaired”, 2nd EAI international Conference on Smart Objects and Technologies for Social Good
(GOODTECHS), pp. 333-341, Venice, Italy, 2016.
[8] Jae Sung Cha, Dong Kyun Lim and Yong-Nyuo Shin, “Design and Implementation of a Voice
Based Navigation for Visually Impaired Persons”, International Journal of Bio-Science and
Bio-Technology, Vol. 5, No. 3, pp. 61-68, June 2013.
[9] S. Khade, Y.H. Dandawate, “Hardware Implementation of Obstacle Detection for Assisting
Visually Impaired People in an Unfamiliar Environment by Using Raspberry Pi”, Smart Trends in
Information Technology and Computer Communications, SMARTCOM 2016, vol. 628,
pp. 889-895, Jaipur, India, 2016.
[10] R. C. Gonzalez, R. E. Woods and S. L. Eddins, “Digital Image Processing using MATLAB”, Pearson
Education, 2004.
We presented our paper at the 4th International Conference on Advances in Computing and
Information Technology (IACIT-2022) on 17th and 18th May 2022, and have applied for
publication in a Scopus-indexed journal.