
UNDER VEHICLE ANIMAL PROTECTION SYSTEM

MAJOR PROJECT REPORT

B.TECH.

IN

ELECTRONICS & COMMUNICATION ENGINEERING

Submitted By:

Mohtaash Sharma 0101EC2010

Nancy Bachwani 0101EC2010

Pakhi Bhargava 0101EC201096

Siddhi Mahajan 0101EC2011

Simran Chaturvedi 0101EC2011

DEPARTMENT OF ELECTRONICS & COMMUNICATION ENGINEERING

UNIVERSITY INSTITUTE OF TECHNOLOGY

RAJIV GANDHI PROUDYOGIKI VISHWAVIDYALAYA

BHOPAL-462033 SESSION 2020-2024


RAJIV GANDHI PROUDYOGIKI VISHWAVIDYALAYA, BHOPAL
DEPARTMENT OF ELECTRONICS & COMMUNICATION
ENGINEERING

CERTIFICATE

This is to certify that Pakhi Bhargava, Simran Chaturvedi, Nancy Bachwani, Mohtaash Sharma and Siddhi Mahajan of B.Tech Fourth Year, Electronics & Communication Engineering, have completed their Major Project entitled "Under Vehicle Animal Protection System" during the year 2023-2024 under our guidance and supervision.
We approve the project for submission in partial fulfilment of the requirement for the award of the degree of B.Tech. in Electronics & Communication Engineering.

Date……………..

DECLARATION BY CANDIDATE

We hereby declare that the work presented in the minor project entitled "Sign Language Recognition", submitted in partial fulfilment of the requirement for the award of the Bachelor's degree in Computer Science and Engineering, has been carried out at University Institute of Technology RGPV, Bhopal, and is an authentic record of our work carried out under the guidance of Dr. Piyush Shukla (Project Guide) and Prof. Praveen Yadav (Project Guide), Department of Computer Science and Engineering, UIT RGPV, Bhopal.
The matter in this project has not been submitted by us for the award of any other degree.

Kartik Lodhi 0101CS201058


Ayush Sharma 0101EC201040
Dipanshu Dahate 0101CS201044
Devendra Sondhiya 0101CS201039

ACKNOWLEDGMENT

After the completion of the minor project work, words are not enough to express our feelings about all those who helped us reach our goal; above all, we are indebted to the Almighty for providing us this moment in life.

First and foremost, we take this opportunity to express our deep regards and heartfelt gratitude to our project guides, Dr. Piyush Shukla and Prof. Praveen Yadav, Faculty of the Computer Science and Engineering Department, RGPV Bhopal, for their inspiring guidance and timely suggestions in carrying out our project successfully. They have also been a constant source of inspiration for us.

We are extremely thankful to Prof. Uday Chourasiya, Head, Computer Science and Engineering Department, RGPV Bhopal, for his cooperation and motivation during the project. We would also like to thank all the teachers of our department for providing invaluable support and motivation. We are also grateful to our friends and colleagues for their help and cooperation throughout this work.

INDEX
Abstract
List of figures and tables
Chapter 1: Introduction
1.1: Sign Language
1.2: K-Nearest Neighbor
1.3: Process
1.3.1: Data Collection
1.3.2: Image Pre-Processing
1.3.3: Feature Extraction
1.3.4: Model Training
1.3.5: Recognition
Chapter 2: Literature Survey
2.1: Sign Language
2.2: Vision-Based Approach
2.3: Need
Chapter 3: Software Requirements
Chapter 4: Requirement Analysis
Chapter 5: Algorithms
5.1: Convolutional Neural Networks (CNN)
5.2: K-Nearest Neighbor (KNN)
Chapter 6: Results
Chapter 7: Conclusion & Future Work
7.1: Conclusion
7.2: Future Scope
Chapter 8: References

ABSTRACT

Sign language is essential for people who have hearing and speaking impairments. It is the only mode of communication through which such people can convey their messages, so it becomes very important for others to understand what they want to say. Deep learning and computer vision can be used to make an impact here: converting sign language into words with an algorithm or a model can help bridge the gap between people with hearing or speaking impairments and the rest of the world. Vision-based hand gesture recognition is an area of active research in computer vision and machine learning, and its primary goal is to recognize gestures and use the information they convey.

The main focus of this work is to create a vision-based system to identify sign language gestures from video sequences. The reason for choosing a vision-based system is that it provides a simpler and more intuitive way of communication between a human and a computer.
In this project, we have developed this idea using the OpenCV and Keras modules of Python.

LIST OF FIGURES AND TABLES

Figure No. Caption
1. Components for Sign Language
2. Gestures for Alphabets in Sign Language
3. Project Flowchart
4. Flowchart of Vision-based approach
5. Process of Vision-based approach
6. Different hand signs
7. Accuracy of different algorithms
8. Training model for alphabet A
9. Training model for alphabet B
10. Recognition Sign

Chapter 1: INTRODUCTION

Motion of any body part, such as the face or a hand, is a form of gesture. Here, for gesture recognition, we use image processing and computer vision. Gesture recognition enables a computer to understand human actions and acts as an interpreter between computer and human. It gives people the potential to interact naturally with computers without any physical contact with mechanical devices. Gestures are used by the deaf and mute community to perform sign language. This community uses sign language for communication when broadcasting audio is impossible or when typing and writing are difficult, but vision is possible. In such situations, sign language is the only way to exchange information between people. Normally, sign language can be used by anyone who does not want to speak, but for the deaf and mute community it is the only means of communication. Sign language serves the same purpose as spoken language. It is used by the deaf and mute community all over the world, but in regional forms such as ASL. Sign language can be performed with hand gestures using either one hand or two hands. Isolated sign language consists of a single gesture carrying a single word, while continuous ISL, or continuous sign language, is a sequence of gestures that generates a meaningful sentence. In this report we perform isolated ASL gesture recognition.

1.1 Sign Language

Deaf people around the world communicate using sign language, which is distinct from the spoken language of their everyday environment: it is a visual language that uses a system of manual, facial and body movements as the means of communication. Sign language is not universal, and different sign languages are used in different countries, just like the many spoken languages all over the world. Some countries, such as Belgium, the UK, the USA or India, may have more than one sign language. Hundreds of sign languages are in use around the world, for instance Japanese Sign Language, British Sign Language (BSL), Spanish Sign Language and Turkish Sign Language. Sign language is a visual language and consists of 3 major components:

Table No.1 [Components of Sign Language]

Figure No.1 [Gestures for Alphabets in Sign Language]

1.2 K-Nearest Neighbor

The K-nearest neighbor (KNN) classifier classifies objects on the basis of feature similarity. KNN is a supervised learning algorithm, and the nearest neighbor algorithm is a popular classification technique originally proposed by Fix and Hodges. The KNN classify method assigns each row of the sample data to one of the groups in the training data using the nearest-neighbor rule. Sample and training data must have the same number of columns. Group is the grouping variable for the training data, and its unique values define the groups; each element defines the group to which the corresponding row of the training data belongs. Group can be a numeric vector, a string array, or a cell array of strings in the Matlab environment, and training and group must have the same number of rows. Class indicates the group to which each row of the sample has been assigned, and it is of the same type as group. The default behavior of this method is majority rule, i.e., a sample point is assigned to the class from which the majority of its K nearest neighbors come. When classifying more than two groups, or when k is an even number, ties can occur; in that case the tie is broken at random among the nearest neighbors. The default is majority rule with a nearest-neighbor tie breaker.
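An illustrative sketch of this majority-vote rule in Python with NumPy is given below; the variable names (sample, training, group) mirror the description above, and the data values are made up purely for illustration rather than taken from the project.

import numpy as np

def knn_classify(sample, training, group, k=3):
    """Assign each row of 'sample' to a group by majority vote among
    the k nearest rows of 'training' (Euclidean distance)."""
    labels = []
    for s in sample:
        dists = np.linalg.norm(training - s, axis=1)   # distance to every training row
        nearest = np.argsort(dists)[:k]                # indices of the k closest rows
        votes = group[nearest]
        values, counts = np.unique(votes, return_counts=True)
        labels.append(values[np.argmax(counts)])       # majority rule
    return np.array(labels)

# made-up data for illustration only
training = np.array([[0.0, 0.0], [0.1, 0.2], [5.0, 5.1], [4.9, 5.0]])
group    = np.array(["A", "A", "B", "B"])
sample   = np.array([[0.05, 0.1], [5.0, 5.0]])
print(knn_classify(sample, training, group, k=3))      # expected: ['A' 'B']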

1.3 Process

This project includes following steps:

• Data Collection
• Image Pre-Processing
• Feature Extraction
• Model Training
• Recognition

Figure No.2 [Project Flowchart]

1.3.1 DATA COLLECTION


A light background was selected for acquiring the images. This background was chosen because of its uniformity and the uniformity of its pixel values when capturing features; it also helps in removing the background so that the important features can be extracted. The laptop's built-in webcam was used for image acquisition, and the common PNG file format was used to store the captured images. The database contains 720 images, each at 120 DPI and approximately 124 KB in size.
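A minimal sketch of how such a database can be captured with OpenCV from the built-in webcam is shown below; the folder name, file naming scheme and key bindings are our own assumptions for illustration, not the exact script used in the project.

import cv2

cap = cv2.VideoCapture(0)                 # built-in laptop webcam
count = 0
while count < 720:                        # target size of the database
    ok, frame = cap.read()
    if not ok:
        break
    cv2.imshow("capture", frame)
    key = cv2.waitKey(1) & 0xFF
    if key == ord('s'):                   # press 's' to save the current frame
        cv2.imwrite(f"dataset/sign_{count:03d}.png", frame)   # PNG file format
        count += 1
    elif key == ord('q'):                 # press 'q' to stop early
        break
cap.release()
cv2.destroyAllWindows()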
1.3.2 IMAGE PRE-PROCESSING
After collecting the database from the user, the images were pre-processed. First, the RGB images were converted to gray scale using an rgb2gray function, which converts a true color RGB image to a gray scale intensity image by eliminating the hue and saturation information while retaining the luminance. We then used the first-derivative Sobel edge detection method, which computes the gradient using discrete differences between the rows and columns of a 3×3 neighborhood. The Sobel method finds edges using the Sobel approximation to the derivative and returns edge points where the gradient of the image is maximum. Sobel was preferred because it provides good edges and performs reasonably well in the presence of noise.
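A sketch of equivalent pre-processing with OpenCV is given below; the input file name, Sobel kernel size and edge threshold are illustrative assumptions.

import cv2
import numpy as np

img  = cv2.imread("dataset/sign_000.png")          # hypothetical file from the data collection step
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)       # drop hue/saturation, keep luminance

# first-derivative Sobel gradients over a 3x3 neighborhood
gx = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=3)    # horizontal derivative
gy = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=3)    # vertical derivative
magnitude = np.sqrt(gx ** 2 + gy ** 2)

# keep points where the gradient magnitude is large (edge points)
edges = (magnitude > 0.5 * magnitude.max()).astype(np.uint8) * 255
cv2.imwrite("edges.png", edges)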

1.3.3 FEATURE EXTRACTION


Feature extraction is a form of dimensionality reduction. Input images are too large to process directly, so to process them in reasonable time we reduce the dimension of the input image by feature extraction. Transforming input data into features is called feature extraction, and the features are chosen so that the important image information is retained. Feature extraction is an essential pre-processing step for pattern recognition and machine learning problems and is often decomposed into feature construction and feature selection. It involves simplifying the number of resources required to describe a large set of data accurately. When performing analysis of complex data, one of the major problems stems from the number of variables involved: analysis with a large number of variables generally requires a large amount of memory and computation power, or a classification algorithm which overfits the training sample and generalizes poorly to new samples. Feature extraction is a general term for methods of constructing combinations of the variables to get around these problems while still describing the data with sufficient accuracy.
The feature extraction techniques used in this project are direct pixel value and hierarchical centroid. In the direct pixel value method, the original image (200×300) was resized to 20×30 pixels and the image matrix was then converted into a one-dimensional array containing exactly 600 elements. The end result of the extraction task is a set of features, commonly called a feature vector, which constitutes a representation of the image.
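A sketch of the direct pixel value method is shown below; the image sizes follow the text, while the file name and row-major flattening order are assumptions.

import cv2
import numpy as np

def direct_pixel_features(path):
    """Resize a 200x300 sign image to 20x30 pixels and flatten it into
    a one-dimensional feature vector with exactly 600 elements."""
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    small = cv2.resize(gray, (20, 30))               # (width, height): 30 rows x 20 columns
    return small.flatten().astype(np.float32)        # 600-dimensional feature vector

features = direct_pixel_features("dataset/sign_000.png")   # hypothetical file name
print(features.shape)                                      # (600,)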

1.3.4 MODEL TRAINING


Since we use the K-nearest neighbor algorithm in this project, the system essentially detects the hand and finds the nearest neighbors in the labelled data to determine the way the fingers are bent. The model is then trained on the dataset accordingly, i.e., it learns how the index finger should be positioned in the case of the letter A, and so on.
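A minimal training sketch using scikit-learn's KNeighborsClassifier is given below; the feature/label file names and the choice of k are assumptions for illustration rather than the project's exact training script.

import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# X: one 600-element direct-pixel feature vector per image, y: the corresponding letter label
X = np.load("features.npy")                  # hypothetical file, shape (720, 600)
y = np.load("labels.npy")                    # hypothetical file, shape (720,)

model = KNeighborsClassifier(n_neighbors=5)  # majority vote among the 5 nearest neighbours
model.fit(X, y)                              # KNN "training" simply stores the labelled data

print(model.predict(X[:1]))                  # sanity check on one training sample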

1.3.5 RECOGNITION

After model training, it is time to test the model's accuracy: will the model be able to detect signs accurately or not? For this, a window opens in which the user can make a gesture or sign, and the output for that sign is displayed just above the window, together with the confidence the model has in that prediction.

Chapter 2: LITERATURE SURVEY

2.1 Sign Language

Sign language (SL) is a visual-gestural language used by deaf and hard-of-hearing people. They use three-dimensional space and movements of the hands (and other parts of the body) to convey meaning. It has its own vocabulary and syntax, which is entirely different from spoken and written languages. A gesture may be defined as a movement, usually of the hand or face, that expresses an idea, sentiment or emotion; raising the eyebrows or shrugging the shoulders are gestures we use in our day-to-day life. Sign language is a more organized and defined way of communication in which every word or alphabet is assigned a particular gesture. With the rapid advancement of technology, the use of computers in our daily life has increased manifold.

Around 5% of the world's population uses sign language as a medium of communication. Regionally, different sign languages have evolved, such as ASL (American Sign Language) in America, GSL (German Sign Language) in Germany, BSL (British Sign Language) in the UK and ISL (Indian Sign Language) in India. There are two main motivations for developing a sign language recognition model. The first is the development of an assistive system for deaf or hard-of-hearing people: a system translating sign language into spoken language would be of great help for deaf as well as hearing people. The second is that sign language recognition serves as a good basis for the development of gestural human-machine interfaces. Sign gestures can be divided into two types, static and dynamic: static gestures have a fixed hand position, whereas dynamic gestures involve movement of the hands and body. Sign Language Recognition is the machine recognition of these gestures.
Gesture recognition can be done in either of two ways, a device-based approach or a vision-based approach; the latter is commonly used in pattern recognition. There is no common way of recognizing sign language gestures, so a recognition system has to be formalized. Till now, there is no translator or machine available to help this community in public places in India, and we are in the process of developing a recognition system to help them.
In recent years, there has been tremendous research on hand sign language gesture recognition. The technology for gesture recognition is described below.

2.2 Vision-based Approach

In vision-based methods, the computer's camera is the input device for observing information about the hands and fingers. Vision-based methods require only a camera, thus realizing natural interaction between humans and computers without any extra devices. These systems tend to complement biological vision by describing artificial vision systems implemented in software and/or hardware. This poses a challenging problem, as these systems need to be background invariant, lighting insensitive, and person and camera independent to achieve real-time performance. Moreover, such systems must be optimized to meet requirements including accuracy and robustness. The vision-based hand gesture recognition system is shown in the figure below:

Figure No.3 [Flowchart of Vision-based approach]

Vision-based analysis is based on the way human beings perceive information about their surroundings, yet it is probably the most difficult to implement in a satisfactory way. Several different approaches have been tested so far.

1. One is to build a three-dimensional model of the human hand. The model is matched to images of the hand from one or more cameras, and parameters corresponding to palm orientation and joint angles are estimated. These parameters are then used to perform gesture classification.
2. The second is to capture the image using a camera, extract features from it, and use those features as input to a classification algorithm.

Figure No.4 [Process of Vision-based approach]

2.3 Need

While sign language is very important to deaf-mute people for communicating both with hearing people and among themselves, it still receives little attention from hearing people. We, as hearing people, tend to ignore the importance of sign language unless we have loved ones who are deaf-mute. One solution for communicating with deaf-mute people is to use the services of a sign language interpreter, but this can be costly. A cheap solution is required so that deaf-mute and hearing people can communicate normally. Therefore, researchers want to find a way for deaf-mute people to communicate easily with hearing people. The breakthrough for this is the Sign Language Recognition system.
The system aims to recognize sign language and translate it to the local language via text or speech. However, building such a system is expensive, and the results are difficult to apply in daily use. Early research succeeded in Sign Language Recognition by using data gloves, but the high cost of the gloves and their wearable nature make them difficult to commercialize. Knowing that, researchers then tried to develop pure-vision Sign Language Recognition systems. However, these also come with difficulties, especially in precisely tracking hand movements. The problems of developing sign language recognition range from image acquisition to the classification process. Researchers are still looking for the best method of image acquisition: gathering images with a camera introduces the difficulties of image pre-processing, while using an active sensor device can be costly. Classification methods also present drawbacks. The wide choice of recognition methods makes it hard for researchers to focus on one best method. Focusing on a single method means that other methods, which may be better suited to Sign Language Recognition, go untested, while trying out many methods means that no single method is developed to its full potential.

Figure No.5 [Different hand signs]

Chapter 3: SOFTWARE REQUIREMENTS

➢ Python
• Python 3

➢ Libraries
• Numpy
• Mediapipe
• OpenCV
• Sklearn
• Keras

➢ Operating System
• Windows or Ubuntu

➢ Hardware Requirements Specification


• Laptop with basic hardware
• Webcam

Chapter 4: REQUIREMENT ANALYSIS

➢ Python:

• Python is the language in which the program is written; it makes use of many Python libraries.

➢ Libraries:

• Numpy: prerequisite for array operations.
• Mediapipe: used for detection of the hand and hand gestures.
• OpenCV: used to get the video stream from the webcam.
• Sklearn: used to implement machine learning models and statistical modelling.
• Keras: used to implement neural networks.

➢ Laptop:

• Used to run our code.

➢ Webcam:

• Used to get the video feed; a minimal sketch combining these libraries is given after this list.
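The sketch below shows how these components fit together, with OpenCV providing the video stream and Mediapipe's Hands solution detecting the hand; the detection parameters are illustrative assumptions, not the project's exact settings.

import cv2
import mediapipe as mp

mp_hands = mp.solutions.hands
mp_draw  = mp.solutions.drawing_utils

cap = cv2.VideoCapture(0)                      # OpenCV: video stream from the webcam
with mp_hands.Hands(max_num_hands=1,
                    min_detection_confidence=0.7) as hands:
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        # Mediapipe expects RGB images, OpenCV delivers BGR
        results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if results.multi_hand_landmarks:
            for hand in results.multi_hand_landmarks:
                mp_draw.draw_landmarks(frame, hand, mp_hands.HAND_CONNECTIONS)
        cv2.imshow("Hand detection", frame)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
cap.release()
cv2.destroyAllWindows()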

Chapter 5: ALGORITHMS

5.1 Convolutional Neural Network (CNN)

This chapter emphasizes the use of machine learning and a convolutional neural network (CNN) to recognize hand gestures live, in spite of variations in hand size and spatial position in the image. Our own personalized inputs serve as a dataset representing the gestures according to the classes developed, and the model identifies and classifies each gesture into one of the defined categories. The network uses three layers, two of which are hidden layers and one of which is a convolutional layer. The proposed model has been designed with three classes containing personalized gestures: firstaid, food, and water. This model can be used for in-flight comfort facilities by travellers and anywhere else these gestures are needed.
CNNs are well known for recognition tasks and give better outcomes than other strategies, primarily because they can extract the necessary feature values from the input image and learn the differences between various samples by training on a large number of samples. In the past, however, their development was limited by the speed of computing hardware. Recently, thanks to progress in semiconductor manufacturing, the computing speed of graphics processing units has become much faster, and the removal of this hardware bottleneck has allowed CNNs to grow quickly. The steps followed in applying a CNN are as follows: first, a picture is input (interpreted as an array of pixels); second, processing and filtering are carried out; and third, the results are obtained after classification. Every model has to be trained and then tested so that it can be used; the layered architecture contains many convolutional layers involving kernels (or filters) and a pooling operation.
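A minimal Keras sketch of such a network for the three personalized classes mentioned above is given below; the input resolution and layer sizes are assumptions for illustration, not the exact architecture used in the project.

import keras
from keras import layers

num_classes = 3   # firstaid, food, water

model = keras.Sequential([
    keras.Input(shape=(64, 64, 1)),                   # grayscale gesture image; 64x64 is an assumed size
    layers.Conv2D(32, (3, 3), activation="relu"),     # convolution with 3x3 kernels (filters)
    layers.MaxPooling2D((2, 2)),                      # pooling operation
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),              # hidden layer
    layers.Dense(num_classes, activation="softmax"),  # one output per gesture class
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
# model.fit(train_images, train_labels, epochs=10)    # trained and then tested on the gesture dataset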

5.2 K-Nearest Neighbour (KNN)

For gesture recognition, the K-Nearest Neighbour (KNN) algorithm is a supervised machine-learning algorithm. KNN is used for classification, in which a data point's class is determined by how its neighbours are classified. Euclidean distance is used to find the nearest neighbours in KNN: the target is the minimum Euclidean distance, and the calculation is performed over the several smallest distances. Increasing the value of k can also increase the accuracy.

In general, the Euclidean distance formula is used. With KNN, classification is performed using a threshold value, which is calculated as the average over the k nearest data points. The performance depends entirely on the distance to the nearest neighbours, the similarity measurement, and the threshold value. To obtain a measurement of accuracy, the hidden layer size is given by the number of neurons in the hidden layer, weight optimization is carried out by a solver, and the learning rate is set through the learning_rate_init parameter. This whole setup is available in scikit-learn.
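A sketch of how the accuracy of such a classifier can be measured with scikit-learn for several values of k is shown below; the feature files, split ratio and k values are assumptions.

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score

X = np.load("features.npy")   # hypothetical feature vectors, one per gesture image
y = np.load("labels.npy")     # hypothetical letter labels

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

for k in (1, 3, 5, 7):
    clf = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
    acc = accuracy_score(y_test, clf.predict(X_test))
    print(f"k={k}: accuracy={acc:.2%}")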

The graph below shows different machine learning algorithms and their accuracy:

Figure No.6 [Accuracy of different algorithms]

Chapter 6: RESULTS

➢ Training Model:

Figure No.7 [Training model for alphabet A]

Figure No.8 [Training model for alphabet B]

➢ Predicting or recognising the sign:

S. No.  Alphabet  Result
1.      A
2.      B
3.      D
4.      L
5.      O
6.      W
7.      Y
Table No.2 [Recognising Signs]

Chapter 7: CONCLUSION & FUTURE SCOPE

7.1 CONCLUSION
In conclusion, sign language recognition is a crucial area of research that can greatly improve accessibility and communication for deaf and hard-of-hearing individuals. While the development of accurate sign language recognition systems is challenging, advances in machine learning and computer vision technology have led to promising results.
To achieve accurate sign language recognition, a significant amount of high-quality training data and the use
of appropriate machine learning algorithms are necessary. It's also essential to take into account the
variability and complexity of sign language as a language, including variations in sign language dialects and
the presence of non-manual features.
Overall, the continued advancement of sign language recognition technology has the potential to greatly
improve the lives of individuals who use sign language as their primary form of communication, and it's an
area of research that should continue to be pursued.

7.2 FUTURE SCOPE


The future scope of sign language recognition is vast, and there are several exciting opportunities for further
research and development in this field. Here are some potential areas of focus for the future:
Improved accuracy: One of the primary areas of focus for future research is improving the accuracy of sign
language recognition systems.
Real-time recognition: Another potential area of focus is improving the real-time recognition of sign
language.
Adaptation to different dialects: Sign language is a diverse language, with many variations and dialects, and future systems could be adapted to recognize these regional variations.
Non-manual feature recognition: Sign language also involves non-manual features, such as facial
expressions and body language, that are essential to understanding the meaning of a sign.
Integration with other technologies: There is an opportunity to integrate sign language recognition with
other technologies, such as virtual assistants or augmented reality systems, to provide more seamless and
accessible communication for sign language users.

Chapter 8: REFERENCES

[1] Google LLC. (2023, Feb. 23). Google – Isolated Sign Language Recognition [Online]. Available:
https://www.kaggle.com/competitions/asl-signs

[2] Towards Data Science. (2020, Mar. 05). Sign Language Recognition with Computer Vision [Online]. Available: https://towardsdatascience.com/sign-language-recognition-with-advanced-computer-vision7b74f20f3442

[3] GeeksforGeeks. (2022, Nov. 25). Sign Language Recognition System using TensorFlow in Python [Online]. Available: https://www.geeksforgeeks.org/sign-language-recognition-system-using-tensorflow-in-python

[4] Harsh Thuwal. (2020, Sept. 23). Sign Language Gesture Recognition [Online]. Available: https://github.com/topics/sign-language-recognition-system

[5] Data Flair. "Sign Language Recognition using Python and OpenCV" [Online]. Available: https://data-flair.training/blogs/sign-language-recognition-python-ml-opencv

[6] Reddygari Sandhya Rani, R. Rumana and R. Prema, "A review paper on sign language recognition for the deaf and dumb", International Journal of Engineering Research & Technology (IJERT), Nov. 2021 [Online]. Available: https://www.ijert.org/a-review-paper-on-sign-language-recognition-for-the-deaf-and-dumb

[7] Ashok K. Sahoo, Gouri Sankar Mishra and Kiran Kumar Ravulakollu, "Sign Language Recognition: State of the Art", ARPN Journal of Engineering and Applied Sciences, Feb. 2014 [Online]. Available: https://www.researchgate.net/publication/262187093_Sign_language_recognition_State_of_the_art

[8] Matyáš Boháček and Marek Hrúz, "Sign Pose-Based Transformer for Word-Level Sign Language Recognition", Papers with Code, 2022 [Online]. Available: https://paperswithcode.com/paper/sign-pose-based-transformer-for-word-level

