
Python Based Recognition of Sign Language

Submitted in partial fulfillment of the requirements for the award of the degree of

BACHELOR OF ENGINEERING
IN
COMPUTER SCIENCE & ENGINEERING

Submitted to: Er. Amanpreet Kaur

Submitted by:
Shubham Kr. (19BCS2590)
Shubham Bharti (19BCS2848)
Haris Adnan (19BCS2856)
Mahamad Imsad (19BCS2809)

DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING


Chandigarh University, Gharuan
Abstract

This paper focuses on experimenting with different segmentation approaches and supervised learning algorithms to create an accurate sign language recognition model. To make the problem more tractable and obtain reasonable results, we experimented with up to 10 classes/letters from our self-made dataset rather than all 26 possible letters. We collected 12,000 RGB images and their corresponding depth data using a Microsoft Kinect. Up to half of the data was fed into an autoencoder to extract features, while the other half was used for testing. We achieved a classification accuracy of 98% on a randomly selected set of test data using our trained model. In addition to the work on static images, we also created a live demo version of the project, which runs at a little under 2 seconds per frame to classify signed hand gestures from any person. Sign languages are predefined languages that use the visual-manual modality to convey information. We consider the problem of real-time Indian Sign Language (ISL) fingerspelling recognition. We collected a dataset of depth-based segmented RGB images using a Microsoft Xbox 360 Kinect camera for classifying 36 different gestures (alphabets and numerals). The system takes a hand gesture as input and returns the corresponding recognized character as output in real time on the monitor screen. For classification we used a deep convolutional neural network and achieved an accuracy of 89.30%. Most research implementations for this task have used depth maps generated by depth cameras and high-resolution images. The objective of this project was to see whether neural networks can classify signed ASL letters using simple images of hands taken with a personal device such as a laptop webcam. This aligns with the motivation, as it would make a future implementation of a real-time ASL-to-oral/written language translator practical in everyday situations. The goal is further motivated by the isolation felt within the deaf community: loneliness and depression exist at higher rates among the deaf population, especially when they are immersed in a hearing world. Large barriers that profoundly affect quality of life stem from the communication disconnect between the deaf and the hearing.
1. Introduction

The goal of this project was to build a neural network able to classify which letter of the American Sign Language (ASL) alphabet is being signed, given an image of a signing hand. This project is a first step towards building a possible sign language translator, which can take communications in sign language and translate them into written and oral language. Such a translator would greatly lower the barrier for many deaf and mute individuals to communicate better with others in day-to-day interactions. This goal is further motivated by the isolation felt within the deaf community. Loneliness and depression exist at higher rates among the deaf population, especially when they are immersed in a hearing world. Large barriers that profoundly affect quality of life stem from the communication disconnect between the deaf and the hearing. Some examples are information deprivation, limitation of social connections, and difficulty integrating into society. Most research implementations for this task have used depth maps generated by depth cameras and high-resolution images. The objective of this project was to see whether neural networks can classify signed ASL letters using simple images of hands taken with a personal device such as a laptop webcam. This aligns with the motivation, as it would make a future implementation of a real-time ASL-to-oral/written language translator practical in everyday situations. The problem we are investigating is sign language recognition through unsupervised feature learning. Recognizing sign language is an interesting computer vision problem while simultaneously being extremely useful for deaf people interacting with people who do not understand American Sign Language (ASL). Our approach was to first create a dataset showing the different hand gestures of the ASL alphabet that we wished to classify. The next step was to segment out only the hand region from each image and then use this data for unsupervised feature learning with an autoencoder, followed by training a softmax classifier to decide which letter is being displayed.
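As a rough illustration of this pipeline (not the project's exact code), the sketch below pairs a one-layer autoencoder with a softmax classifier in PyTorch; the 32x32 input size, hidden width, and 10-class output are assumed values chosen only for the example:

    import torch
    import torch.nn as nn

    # One-layer autoencoder for unsupervised feature learning on
    # segmented hand images (input/hidden sizes are assumptions).
    class AutoEncoder(nn.Module):
        def __init__(self, input_dim=32 * 32, hidden_dim=256):
            super().__init__()
            self.encoder = nn.Sequential(nn.Linear(input_dim, hidden_dim), nn.Sigmoid())
            self.decoder = nn.Sequential(nn.Linear(hidden_dim, input_dim), nn.Sigmoid())

        def forward(self, x):
            features = self.encoder(x)
            return self.decoder(features), features

    autoencoder = AutoEncoder()
    classifier = nn.Linear(256, 10)  # softmax over 10 letters via cross-entropy

    x = torch.rand(8, 32 * 32)                # dummy batch of flattened images
    labels = torch.randint(0, 10, (8,))       # dummy letter labels
    recon, features = autoencoder(x)
    recon_loss = nn.functional.mse_loss(recon, x)  # autoencoder objective
    clf_loss = nn.functional.cross_entropy(classifier(features.detach()), labels)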
We tried many different segmentation approaches before discovering that skin-color and depth segmentation worked most consistently.
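For example, skin-color segmentation of this kind is often implemented by thresholding in the YCrCb color space with OpenCV. A minimal sketch follows; the threshold bounds are common textbook values, not the calibrated thresholds used in the project:

    import cv2
    import numpy as np

    def segment_hand(bgr_image):
        """Keep only skin-colored pixels via YCrCb thresholding (illustrative bounds)."""
        ycrcb = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2YCrCb)
        lower = np.array([0, 133, 77], dtype=np.uint8)    # assumed skin-tone bounds
        upper = np.array([255, 173, 127], dtype=np.uint8)
        mask = cv2.inRange(ycrcb, lower, upper)
        # Morphological opening removes small speckles from the mask.
        mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))
        return cv2.bitwise_and(bgr_image, bgr_image, mask=mask)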
Motivation: The various advantages of building a Sign Language Recognition system include:
• Sign language hand-gesture-to-text/speech translation systems or dialog systems for specific public domains such as airports, post offices, or hospitals.
• Sign Language Recognition (SLR) can translate video into text or speech, enabling communication between hearing and deaf people.

2. Technology:

• Data Processing: The load_data.py script contains functions to load the raw image data and save it as NumPy arrays in file storage. The process_data.py script loads the image data from data.npy and preprocesses each image by resizing/rescaling it and applying filters and ZCA whitening to enhance features. During training, the processed image data was split into training, validation, and testing sets and written to storage. Training also involves a load_dataset.py script that loads the relevant data split into a Dataset class. To use the trained model for classifying gestures, an individual image is loaded from the file system and processed. A sketch of these steps appears below.
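The following sketch shows how these preprocessing steps could look, assuming NumPy, OpenCV, and scikit-learn; the 64x64 target size and the labels.npy file are hypothetical stand-ins for the project's actual settings:

    import cv2
    import numpy as np
    from sklearn.model_selection import train_test_split

    def zca_whiten(X, eps=1e-5):
        """ZCA-whiten a batch of flattened images (one image per row)."""
        X = X - X.mean(axis=0)
        cov = np.cov(X, rowvar=False)
        U, S, _ = np.linalg.svd(cov)
        W = U @ np.diag(1.0 / np.sqrt(S + eps)) @ U.T
        return X @ W

    images = np.load("data.npy")      # raw image array saved by load_data.py
    labels = np.load("labels.npy")    # hypothetical label file
    resized = np.stack([cv2.resize(im, (64, 64)) for im in images])
    flat = resized.reshape(len(resized), -1).astype(np.float32) / 255.0
    white = zca_whiten(flat)

    # Hold out a test split; a validation split can be carved out the same way.
    X_train, X_test, y_train, y_test = train_test_split(white, labels, test_size=0.2)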

• Training: The training loop for the model is contained in train_model.py. The model is trained with hyperparameters obtained from a config file that lists the learning rate, batch size, image filtering, and number of epochs. The configuration used to train the model is saved along with the model architecture for future evaluation and tweaking. Within the training loop, the training and validation datasets are loaded as DataLoaders, and the model is trained using the Adam optimizer with cross-entropy loss. The model is evaluated on the validation set every epoch, and the model with the best validation accuracy is saved to storage for further evaluation and use. Upon finishing training, the training and validation error and loss are saved to disk, along with a plot of error and loss over training.
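A condensed sketch of such a loop is shown below, assuming PyTorch; the config keys, dataset objects, and checkpoint filename are placeholders rather than the project's actual names:

    import torch
    from torch.utils.data import DataLoader

    def train(model, train_ds, val_ds, config):
        """Train with Adam + cross-entropy; keep the best validation checkpoint."""
        train_loader = DataLoader(train_ds, batch_size=config["batch_size"], shuffle=True)
        val_loader = DataLoader(val_ds, batch_size=config["batch_size"])
        optimizer = torch.optim.Adam(model.parameters(), lr=config["learning_rate"])
        criterion = torch.nn.CrossEntropyLoss()
        best_acc = 0.0

        for epoch in range(config["epochs"]):
            model.train()
            for images, labels in train_loader:
                optimizer.zero_grad()
                loss = criterion(model(images), labels)
                loss.backward()
                optimizer.step()

            # Evaluate every epoch; save the model with the best accuracy so far.
            model.eval()
            correct = total = 0
            with torch.no_grad():
                for images, labels in val_loader:
                    preds = model(images).argmax(dim=1)
                    correct += (preds == labels).sum().item()
                    total += labels.size(0)
            if correct / total > best_acc:
                best_acc = correct / total
                torch.save(model.state_dict(), "best_model.pt")  # placeholder name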

• Classify Gesture: After a model has been trained, it can be used to classify a new ASL gesture available as a file on the filesystem. The user inputs the filepath of the gesture image, and the test_data.py script passes the filepath to process_data.py to load and preprocess the image in the same way as the training data.
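A small sketch of that classification step, assuming PyTorch and OpenCV; the 64x64 size and the class list are hypothetical and would have to match whatever preprocessing and label order were used during training:

    import cv2
    import torch

    CLASSES = ["A", "B", "C", "D", "E", "F", "G", "H", "I", "K"]  # assumed label order

    def classify_gesture(model, filepath):
        """Load one gesture image, preprocess it as in training, and classify it."""
        image = cv2.imread(filepath, cv2.IMREAD_GRAYSCALE)
        image = cv2.resize(image, (64, 64)).astype("float32") / 255.0
        tensor = torch.from_numpy(image).unsqueeze(0).unsqueeze(0)  # (1, 1, 64, 64)
        model.eval()
        with torch.no_grad():
            logits = model(tensor)
        return CLASSES[logits.argmax(dim=1).item()]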
3. Field of project:
The project "Sign Language Recognition" is written in Python. Python is an interpreted, high-level, general-purpose programming language. Created by Guido van Rossum and first released in 1991, Python's design philosophy emphasizes code readability with its notable use of significant whitespace. Its language constructs and object-oriented approach aim to help programmers write clear, logical code for small and large-scale projects. Python is dynamically typed and garbage-collected. It supports multiple programming paradigms, including procedural, object-oriented, and functional programming.

Python is often described as a "batteries included" language due to its comprehensive standard library.

Python was conceived in the late 1980s as a successor to the ABC language. Python 2.0, released in 2000, introduced features like list comprehensions and a garbage collection system capable of collecting reference cycles. Python 3.0, released in 2008, was a major revision of the language that is not completely backward-compatible, and much Python 2 code does not run unmodified on Python 3.

Visual Studio Code is a free source-code editor made by Microsoft for Windows, Linux, and macOS. Features include support for debugging, syntax highlighting, intelligent code completion, snippets, code refactoring, and embedded Git. Users can change the theme, keyboard shortcuts, and preferences, and install extensions that add additional functionality. Visual Studio Code's source code comes from Microsoft's free and open-source VS Code project released under the permissive Expat License, but the compiled binaries are freeware for any use. In the Stack Overflow 2019 Developer Survey, Visual Studio Code was ranked the most popular developer environment tool, with 50.7% of 87,317 respondents claiming to use it.

4. Feasibility Study:
The cost of building the project is very low, and all the technical resources were available on the internet. The project is designed primarily for educational purposes and is not concerned with any legal issues. Therefore, the project "Sign Language Recognition" is economically, technically, and legally feasible.

4.1. Need of a Sign Language Recognition System:

Deaf and mute people use hand-gesture sign language to communicate, so hearing people face problems in recognizing the language conveyed by those signs. Hence, there is a need for systems that recognize the different signs and convey the information to hearing people.

5. Planning of work:

1. The first and most important step in developing a project is to gather brief information about the project you intend to work on.
2. The second step is to develop knowledge about the platforms and technology you are working with.
3. Then install all the important software used in your project.
4. Distribute the work according to the interests of the members and work as a team to achieve your goal.
5. Take proper references from the internet and your supervisors if the platform is new to your team.
6. Have a proper plan chalked out for how to develop the project and how to overcome technical drawbacks encountered during testing.
7. Development should occur step by step; code the programs in modules so that it helps you debug errors later on.
8. Regularly update your supervisors about the progress of the project.

6. Modules and team-member-wise distribution of work:

• The project has been divided among 4 team members: Shubham Kumar, Shubham Bharti, Haris Adnan, and Mahamad Imsad.
• The work is divided into 4 equal parts. As the development of the project occurs in modules, the modules are distributed evenly.
• A few modules of the project are developed jointly by the members.
• The documentation is also divided equally among all members.
• All members work on the coding, with the documentation proceeding in parallel.
7. Software requirements:

Visual Studio Code, i.e., our main IDE.
Python 3.8.
Windows 8 or higher.

8. Problem Definition:
The purpose of Sign Language Recognition (SLR) systems is to provide an efficient and accurate way to convert sign language into text or voice, serving as an aid for the hearing impaired or, for example, enabling very young children to interact with computers by recognizing sign language, among other uses.
Sign language uses many gestures, so it resembles a language of movement consisting of a series of hand and arm motions. Different countries have different sign languages and hand gestures. It is also noted that some unknown words are translated by simply showing the gesture for each letter of the word. In addition, sign language includes specific gestures for each letter of the English alphabet and for each number between 0 and 9.

9. System Methodology:
The methodological framework for scoping reviews proposed in the literature was adopted in this study, composed of five stages: identification of a research question; identification of relevant studies; study selection; charting the data; and collating, summarizing, and reporting the results. The identification of relevant studies and study selection stages in the scoping review methodology correspond to the "Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA)" workflow phases: study collection, scanning, and eligibility evaluation. Therefore, after the scoping review, our research identifies the relevant works according to the defined research questions and inclusion criteria and discusses them in more detail.
10. References:
[1] https://arxiv.org/ftp/arxiv/papers/2201/2201.01486.pdf
[2] https://data-flair.training/blogs/sign-language-recognition-python-ml-opencv/
[3] https://edu.authorcafe.com/academies/6813/sign-language-recognition
[4] https://www.sciencedirect.com/science/article/pii/S2667305321000454
[5] http://cse.anits.edu.in/projects/projects1920C6.pdf
