
23rd National Conference on Science, Engineering and Technology (NCSET 2023)

Object detection and recognition for people with vision complexities using YOLO

JYOSHITA K. G – 22MDT1034
AMISHA GUPTA – 22MDT1008
ROSHINI K – 22MDT1031

DIVISION OF MATHEMATICS
M.SC DATA SCIENCE – I YEAR
SAS, VIT-CHENNAI.

GUIDE: DR. A. FELIX


Abstract

Vision is a crucial ability that enables people to view the world around them.
Many visually impaired people struggle to sense and identify their surroundings, making it challenging for them to navigate and socialize in a new environment.
This project aids visually impaired persons by giving navigational assistance and aural feedback so they can more quickly locate specific objects or pathways within a place, reducing their reliance on others.
This study addresses the notion of interpreting visual objects through hearing: both the aural and optical senses have the capacity to locate items in space. Therefore, this work uses YOLOTINYv3 (You Only Look Once) to recognize objects within a bounding box, and the Pyttsx3 module converts the detected object label into speech.



Introduction

Due to visual impairment, millions of people around the world are unable to interpret their surroundings.
Although they are able to adapt their daily routines, they experience navigational issues and social awkwardness.
For example, in an unfamiliar place it is quite challenging for them to locate a specific room or to tell apart different road paths. It can also be challenging for them to determine who is speaking to them during a conversation.
In this paper, the idea of understanding visual objects through hearing is explored.
Both visual objects and sounds can be localized in space, which is a striking similarity between the senses of sight and hearing.
The project's goal is to guide visually challenged individuals with voice instructions generated from the output of a processor or controller.



Literature Review



YEAR | ARTICLE NAME | AUTHOR | METHOD USED | TOOLS USED
2012 | Wearable object detection system for the blind | Alessandro Dionisi et al. | An RFID tool developed specifically for finding medications in a household medicine cabinet; the device detects drugs and gives the user an auditory signal to help them find the required items. | RFID, RSSI
2015 | Object detection and identification for blind people in video scene | Hanen Jabnoun et al. | A visual substitution system based on extracting and matching local features to identify and locate items in images. | SIFT, feature extraction
2018 | Real time object detection and tracking using deep learning and OpenCV | Chandan et al. | Algorithms implemented for detection and tracking of objects. | Single Shot Detector, MobileNets
2018 | Real-time object detection application for visually impaired people: Third Eye | Selman Tosun et al. | A mobile software application for visually impaired people. | Image processing and machine learning technologies
2019 | Deep learning based shopping assistant for the visually impaired | Daniel Pintado et al. | A CNN is used to train the object recognition system; a text-to-speech system informs the user about the object. | Deep learning, object recognition, computer vision, convolutional neural networks
2019 | Obstacle detection, depth estimation and warning system for visually impaired people | K. C. Shahira et al. | Object detection and depth estimation with a warning system for visually impaired people. | YOLO
2020 | Moving object detection with deep CNNs | Haidi Zhu, Xin Yan et al. | Moving object detection in high-resolution scenes; deep CNNs detect the objects and tiny YOLOv3 improves frames per second (FPS). | Deep convolutional neural networks, tiny YOLOv3
2020 | Deep learning based object detection and recognition framework for the visually-impaired | Swapnil Bhole et al. | Object detection and recognition of human faces and currency notes. | Convolutional neural network, SSD, Inception v3
2020 | Real-time object detection for visually challenged people | Sunit Vaidya et al. | Detects real-time objects through the camera and informs visually impaired people about the object through audio output. | Image processing, machine learning, YOLO
2020 | Fusion of object recognition and obstacle detection approach for assisting visually challenged person | Saumya Yadav et al. | A walking stick developed for visually impaired persons so that they can detect objects, upstairs, downstairs and edges. | Machine learning models
2020 | Indoor objects detection and recognition for an ICT mobility assistance of visually impaired people | Mouna Afif et al. | Indoor object detection and recognition to assist the mobility of people with vision impairments. | YOLO
2020 | A systematic literature review of the mobile application for object recognition for visually impaired people | Zulfadhlina Amira et al. | Recognizes objects using smartphones. | Systematic literature review
2020 | Raspberry Pi based intelligent reader for visually impaired persons | Vaibhav V. Mainkar et al. | A Raspberry Pi camera takes pictures; the scanned image is processed with ImageMagick and passed to Tesseract OCR to convert the image into text. | Raspberry Pi camera, TTS
2021 | Employing real-time object detection for visually impaired people | Kashish Naqvi et al. | An object detector that provides information about the detected object as audio using eSpeak. | OpenCV and TensorFlow API on Raspberry Pi, text-to-speech synthesizer, SSD model with MobileNet v2
2022 | A comprehensive assistive solution for visually impaired persons | Azhar Iqbal et al. | The proposed system consists of a processing unit and a depth camera (Kinect v2) for object detection and recognition. | YOLO


Aim and Objectives
AIM:
To help visually impaired people detect objects in the real world.

OBJECTIVES:
The objective of the project is to employ voice assistance to guide people who are visually impaired.
To implement the trained YOLOTINYv3 model in a real-time application (camera).
To integrate additional features such as text-to-speech feedback to aid in object recognition for
individuals with vision complexities.
To test and evaluate the performance of the implemented system in real-world scenarios.



Algorithms and Modules Utilized

LIBRARIES: ImageAI, Python Flask, Pyttsx3, NumPy

DEEP LEARNING ALGORITHM: YOLOTINYv3
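
Flask presumably provides the web layer that receives camera frames and returns detections. A minimal sketch of such an endpoint follows; the /detect route, port, and the stubbed run_detector() helper are illustrative assumptions, not the project's actual code.

# Minimal Flask sketch: accept an uploaded camera frame and return detected labels.
# The /detect route and run_detector() stub are illustrative assumptions.
from flask import Flask, request, jsonify

app = Flask(__name__)

def run_detector(image_path):
    # Stub: in the real system this would wrap the YOLOTINYv3 detector
    # described in the Methodology section and return the detected labels.
    return ["person"]

@app.route("/detect", methods=["POST"])
def detect():
    frame = request.files["frame"]        # camera frame sent by the client
    frame.save("frame.jpg")               # persist it for the detector
    labels = run_detector("frame.jpg")    # run object detection
    return jsonify({"objects": labels})   # labels can then be spoken to the user

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)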



Methodology
STEP 1: Object detection.
◦ We use the YOLO algorithm in our project (a detection sketch follows below).
◦ The YOLO weights are pre-trained.
◦ A person's proximity to a detected object can also be estimated.
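
A minimal sketch of this step using the ImageAI library listed earlier, assuming ImageAI's ObjectDetection interface with pre-trained TinyYOLOv3 weights; the file names are placeholders, not the project's actual paths.

from imageai.Detection import ObjectDetection

# Load the pre-trained tiny YOLOv3 weights (file name is a placeholder).
detector = ObjectDetection()
detector.setModelTypeAsTinyYOLOv3()
detector.setModelPath("yolo-tiny.h5")
detector.loadModel()

# Detect objects in a single camera frame and print each label with its confidence.
detections = detector.detectObjectsFromImage(
    input_image="frame.jpg",
    output_image_path="frame_detected.jpg",
    minimum_percentage_probability=30,
)
for obj in detections:
    print(obj["name"], obj["percentage_probability"], obj["box_points"])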

STEP 2: Object recognition


◦ A small red bounding box is drawn over the predicted region, along with the distance between the object and the person, both of which are produced by our model (a distance-estimation sketch follows below).
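
The slides do not state how this distance is estimated; one common single-camera approach is the similar-triangles (pinhole) approximation sketched below, which assumes a known real-world width for the object class and a calibrated focal length. Both constants here are illustrative assumptions; the red box and label are drawn with OpenCV.

import cv2
import numpy as np

# Illustrative constants (assumptions): average real width of the object in metres
# and the camera focal length in pixels from a one-off calibration.
KNOWN_WIDTH_M = 0.5
FOCAL_LENGTH_PX = 700.0

def estimate_distance(box_width_px):
    # Pinhole-camera / similar-triangles approximation.
    return (KNOWN_WIDTH_M * FOCAL_LENGTH_PX) / box_width_px

# Dummy frame and a detection box (x1, y1, x2, y2) as returned by the detector.
frame = np.zeros((480, 640, 3), dtype=np.uint8)
x1, y1, x2, y2 = 200, 100, 330, 400

distance_m = estimate_distance(x2 - x1)
cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 0, 255), 2)   # red box (BGR)
cv2.putText(frame, f"person {distance_m:.1f} m", (x1, y1 - 10),
            cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 0, 255), 2)
cv2.imwrite("frame_annotated.jpg", frame)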

STEP 3: Text to Speech conversion


◦ Once an object has been detected, its label needs to be converted into speech so the person can hear it.
◦ The library we employ for text-to-speech conversion is pyttsx3 (a short example follows below).
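
A minimal pyttsx3 sketch of this step; the announcement string mirrors the "I see person" example from the results, and the speech rate is an illustrative setting.

import pyttsx3

# Initialise the offline text-to-speech engine and announce the detected label.
engine = pyttsx3.init()
engine.setProperty("rate", 150)   # speaking rate (words per minute), illustrative value
label = "person"                  # label produced by the detector
engine.say(f"I see {label}")
engine.runAndWait()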



Acquired Result
These are a few of the objects that the YOLO algorithm can identify. The detected objects are converted into speech, such as "I see person", enabling people who are visually impaired to carry out daily activities independently.

The figure illustrates the objects identified by the pre-trained YOLOTINYv3 algorithm; these objects are then converted into speech, alerting people with vision complexities that there is a "car" in front of them so they can move cautiously.



Conclusion and Future Scope

Objects in the real world can now be recognized and identified by people with visual impairments.
The recognized object is subsequently converted into voice for persons who are visually impaired. With the help of this model, they can translate the visual world into an auditory one.
However, visually impaired persons may find the system challenging to operate. This work can be extended with headphones to make it easier to recognize people, and with portable glasses so the user can detect objects more quickly.



Advantages and Limitations of the Work

Object detection and recognition with YOLOTINYv3 will increase visually impaired individuals' independence.
This model will help visually impaired individuals by improving their capacity to accomplish everyday chores and by reducing accidents.
There are a few limitations to the study. Recognition errors can occur, as object detection and recognition systems often struggle to identify objects in complex and cluttered environments. Detection can also be affected by varying lighting conditions, which can lead to incorrect or inconsistent results.



References

[1] Chandan, G., Jain, A., & Jain, H. (2018, July). Real time object detection and tracking using deep learning and OpenCV. In 2018 International Conference on Inventive Research in Computing Applications (ICIRCA) (pp. 1305-1308). IEEE.

[2] Pintado, D., Sanchez, V., Adarve, E., Mata, M., Gogebakan, Z., Cabuk, B., ... & Oh, P. (2019, January). Deep learning-based shopping assistant for the visually impaired. In 2019 IEEE International Conference on Consumer Electronics (ICCE) (pp. 1-6). IEEE.

[3] Zhu, H., Yan, X., Tang, H., Chang, Y., Li, B., & Yuan, X. (2020). Moving object detection with deep CNNs. IEEE Access, 8, 29729-29741.

[4] Bhole, S., & Dhok, A. (2020, March). Deep learning-based object detection and recognition framework for the visually-impaired. In 2020 Fourth International Conference on Computing Methodologies and Communication (ICCMC) (pp. 725-728). IEEE.

[5] Vaidya, S., Shah, N., Shah, N., & Shankarmani, R. (2020, May). Real-time object detection for visually challenged people. In 2020 4th International Conference on Intelligent Computing and Control Systems (ICICCS) (pp. 311-316). IEEE.

[6] Tosun, S., & Karaarslan, E. (2018, September). Real-time object detection application for visually impaired people: Third Eye. In 2018 International Conference on Artificial Intelligence and Data Processing (IDAP) (pp. 1-6). IEEE.

[7] Shahira, K. C., Tripathy, S., & Lijiya, A. (2019, October). Obstacle detection, depth estimation and warning system for visually impaired people. In TENCON 2019 - 2019 IEEE Region 10 Conference (TENCON) (pp. 863-868). IEEE.

[8] Yadav, S., Joshi, R. C., Dutta, M. K., Kiac, M., & Sikora, P. (2020, July). Fusion of object recognition and obstacle detection approach for assisting visually challenged person. In 2020 43rd International Conference on Telecommunications and Signal Processing (TSP) (pp. 537-540). IEEE.

[9] Iqbal, A., Akram, F., Haq, M. I. U., & Ahmad, I. (2022, May). A comprehensive assistive solution for visually impaired persons. In 2022 2nd International Conference of Smart Systems and Emerging Technologies (SMARTTECH) (pp. 60-65). IEEE.

[10] Afif, M., Ayachi, R., Pissaloux, E., Said, Y., & Atri, M. (2020). Indoor objects detection and recognition for an ICT mobility assistance of visually impaired people. Multimedia Tools and Applications, 79(41), 31645-31662.

[11] Hisham, Z. A. N., Faudzi, M. A., Ghapar, A. A., & Rahim, F. A. (2020, August). A systematic literature review of the mobile application for object recognition for visually impaired people. In 2020 8th International Conference on Information Technology and Multimedia (ICIMU) (pp. 316-322). IEEE.

[12] Dionisi, A., Sardini, E., & Serpelloni, M. (2012, May). Wearable object detection system for the blind. In 2012 IEEE International Instrumentation and Measurement Technology Conference Proceedings (pp. 1255-1258). IEEE.

[13] Jabnoun, H., Benzarti, F., & Amiri, H. (2015, December). Object detection and identification for blind people in video scene. In 2015 15th International Conference on Intelligent Systems Design and Applications (ISDA) (pp. 363-367). IEEE.

[14] Naqvi, K., Hazela, B., Mishra, S., & Asthana, P. (2021). Employing real-time object detection for visually impaired people. In Data Analytics and Management (pp. 285-299). Springer, Singapore.

[15] Mainkar, V. V., Bagayatkar, T. U., Shetye, S. K., Tamhankar, H. R., Jadhav, R. G., & Tendolkar, R. S. (2020, March). Raspberry Pi based intelligent reader for visually impaired persons. In 2020 2nd International Conference on Innovative Mechanisms for Industry Applications (ICIMIA) (pp. 323-326). IEEE.



THANK YOU...
