
SIGN LANGUAGE TRANSLATOR FOR MOBILE PLATFORMS

Mahesh M, Arvind Jayaprakash, Geetha M


Dept. of Computer Science and Engineering
Amrita School of Engineering, Amritapuri
Amrita Vishwa Vidyapeetham
Amrita University, India
Email: maheshmukund13@gmail.com, arvindjp1995@gmail.com, geetha.m.amrita@gmail.com

Abstract—The communication barrier between the deaf and dumb community and society remains a matter of concern due to the lack of reliable sign language translators. Using mobile phones for communication remains a dream for the deaf and dumb community. We propose an android application that converts sign language to natural language and enables the deaf and dumb community to talk over mobile phones. Developing sign recognition methods for mobile applications brings challenges such as the need for a lightweight method with low CPU and memory utilization. The application captures an image using the device camera, processes it and determines the corresponding gesture. An initial phase of comparison using histogram matching identifies the gestures that are close to the test sample, and only those samples are subjected to Oriented FAST and Rotated BRIEF (Binary Robust Independent Elementary Features) based comparison, hence reducing the CPU time. The user of the application can also add new gestures to the dataset. The application allows easy communication of the deaf and dumb with society. Though there are many computer based applications for sign language recognition, development on the android platform is comparatively scarce.

Keywords—Indian sign language, background subtraction, thresholding, histogram comparison, ORB

I. INTRODUCTION

Sign language is a language that uses gestures of one or both hands as the medium of communication. People with hearing and speech disabilities communicate through this gesture language. The increasing urge to support the deaf and dumb community has made sign language recognition an interesting area for researchers. A working recognition system lets such people communicate without the need for any other interpreter.

Though much research has happened in this field, very few have attempted to develop a recognition system on small hand held android phones. We propose an algorithm for the android platform that captures and recognizes gestures. The existing systems are computer based, so the common man rarely gets the chance to interact with them. This application provides a major breakthrough towards recognition systems on small android devices with limited processing speed and memory.

The algorithms used in computer based recognition systems might not work on the android platform. Our proposed system therefore uses Oriented FAST and Rotated BRIEF (ORB), a successor of BRIEF (Binary Robust Independent Elementary Features) that uses binary strings as an efficient feature point descriptor.

Description with Oriented FAST and Rotated BRIEF is an expensive operation, so in this application not all comparisons happen through ORB. Histogram matching is performed first to minimize the number of images passed to ORB. This makes recognition faster while keeping the probability of error the same.

A. Static Gesture

Gestures are of two types - static gestures and dynamic gestures. Gestures that require movement of the hand are called dynamic gestures, while gestures that are static and need no hand movement are called static gestures.

In Indian Sign Language almost all of the alphabets and numbers can be represented using static gestures. Some words can also be represented with static gestures.

Fig. 1. Some static hand gestures

II. RELATED WORKS

Gesture recognition systems have been developed mostly as computer applications; very little work has happened in mobile based sign recognition. A few works are detailed below.



The "Indian sign language recognition system to aid deaf-dumb people" translates sign language into text using static images of the palm side of the right hand. It is a computer application with an accuracy of 96.87 percent, and its dataset contains 32 sign combinations with 320 images, 10 images of each sign. Being a computer application, it cannot be deployed on android devices.

"Indian sign language recognition" is another computer based application; it presents a framework for a human computer interface capable of recognizing gestures from Indian Sign Language, and the system can be extended to words and sentences. Recognition is done with PCA (Principal Component Analysis), and the paper also proposes recognition using neural networks. Mobile devices with such processing capability are still evolving, so implementing this system on a phone is a difficult task.

"A Vision Based Dynamic Gesture Recognition of Indian Sign Language on Kinect based Depth Images" recognizes 3D dynamic signs corresponding to Indian Sign Language words. 3D images from an external device, the Microsoft Kinect camera, are processed, and a global feature is determined using the axis of least inertia method. The main goal of our application is to determine gestures without any external device, so the use of Kinect is not suitable in our case.

The 2012 IEEE paper "Hand Gesture Recognition for Indian Sign Language" by A. S. Ghotkar introduces a hand gesture recognition system for the alphabets of Indian Sign Language. It has four modules: real time hand tracking, hand segmentation, feature extraction and gesture recognition. The Camshift method and the Hue, Saturation, Value (HSV) colour model are used for hand tracking and segmentation. That computer application was developed to recognize alphabets, whereas we have a broader scope; moreover, real time hand tracking is difficult on mobile devices with low processing speed.

III. PROPOSED METHOD

We propose a solution to the communication barrier between the deaf and dumb community and society: an android application that determines gestures using less memory and less CPU processing time. The application can be divided into two parts - adding new gestures to the dataset and recognizing existing gestures from the dataset. Recognition requires comparison, and a comparison using histogram matching is done first. The main method to identify gestures uses descriptors, while the histogram matching reduces the number of dataset images that have to be compared with descriptors. Gestures also differ between regions; in India itself the gesture for a word or alphabet is not the same in different parts of the country.

Fig. 2. Alphabet E with 2 possible gestures

A. Block Diagram

As a measure to this problem, the application gives the user the right to add new gestures to the dataset. The user enters the name of the new gesture and presses the 'ADD' button, and the new gesture is saved as '<gesturename>.jpg' inside the dataset folder.

This OnClick process requires the android camera permission. On the click of the ADD button, after the gesture name is entered, the device opens the camera and asks the user to capture an image. Once the image is captured, it is saved in the dataset folder with the name given by the user.
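This add flow can be sketched with standard Android framework calls. The snippet below is only an illustration: the request code, the use of the camera-intent thumbnail as the saved image, the dataset directory handling and the JPEG quality are our assumptions, not details taken from the paper.

import android.app.Activity;
import android.content.Intent;
import android.graphics.Bitmap;
import android.provider.MediaStore;
import java.io.File;
import java.io.FileOutputStream;

public class AddGestureHelper {
    static final int REQUEST_CAPTURE = 1;   // arbitrary request code

    // Called when the ADD button is clicked: open the device camera.
    void onAddClicked(Activity activity) {
        Intent intent = new Intent(MediaStore.ACTION_IMAGE_CAPTURE);
        activity.startActivityForResult(intent, REQUEST_CAPTURE);
    }

    // Called from onActivityResult: write the capture as <gesturename>.jpg
    // inside the dataset folder.
    void saveGesture(Bitmap image, File datasetDir, String gestureName) throws Exception {
        File out = new File(datasetDir, gestureName + ".jpg");
        FileOutputStream fos = new FileOutputStream(out);
        image.compress(Bitmap.CompressFormat.JPEG, 90, fos);
        fos.close();
    }
}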
B. Pre-Processing

Pre-processing is a major part of this application. The images captured by the device camera cannot be fed directly to comparison; they must go through a series of pre-processing steps in order to get a more accurate output.

Skin detection is done using three methods, and their results are combined to decide whether a pixel is skin. The three methods work in the RGB, HSI and YCbCr colour spaces, and pre-determined value ranges for skin colour are used in each of them. Whenever a method classifies a pixel as skin colour, a flag variable is incremented; only when all three methods are satisfied is the pixel accepted as a skin colour pixel.

1) Skin Detection - RGB: Skin detection is done pixel-wise. One pixel is taken at a time, and its R, G and B values are used to decide whether it is a skin pixel. The average RGB range for skin colour is pre-determined.

Algorithm 1: Skin detection using RGB
Result: Boolean value depending on whether the pixel is in the skin colour range
if r<95 or g<40 or b<20 or r<g or r<b then
    return false;
end
if -15<r-g<15 then
    return false;
end
if max(r,g,b)-min(r,g,b)<15 then
    return false;
end
return true;
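Written out in Java, Algorithm 1 becomes a single boolean helper. The sketch below mirrors the pseudocode; the method name isSkinRGB is reused by the later thresholding sketch.

public static boolean isSkinRGB(int r, int g, int b) {
    // Too dark, or red is not the dominant channel.
    if (r < 95 || g < 40 || b < 20 || r < g || r < b) {
        return false;
    }
    // Red and green too close together (-15 < r-g < 15).
    if (Math.abs(r - g) < 15) {
        return false;
    }
    // Not enough spread between the channels.
    if (Math.max(r, Math.max(g, b)) - Math.min(r, Math.min(g, b)) < 15) {
        return false;
    }
    return true;
}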

2) Skin Detection - HSI: Here also one pixel is taken at a time; the RGB values of the pixel are converted to HSI, which is then used for skin detection.

Fig. 3. Process flow

Algorithm 2: Skin detection using HSI
Result: Boolean value depending on whether the pixel is in the skin colour range
i=(r+g+b)/3;
r=r/i;
g=g/i;
b=b/i;
w=0.5*((r-g)+(r-b)) / sqrt((r-g)^2+(r-b)*(g-b));
h=acos(w);
if b>g then
    h=2*PI-h;
end
if h<25 degrees or h>230 degrees then
    return true;
end
return false;
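A direct Java transcription of Algorithm 2 is sketched below. The hue formula is the standard RGB-to-HSI conversion; treating the 25 and 230 limits as degrees and guarding the degenerate grey/black cases are our reading of the pseudocode rather than details stated in the paper.

public static boolean isSkinHSI(int r, int g, int b) {
    double i = (r + g + b) / 3.0;
    if (i == 0) return false;                        // pure black pixel: not skin
    double rn = r / i, gn = g / i, bn = b / i;       // intensity-normalised components
    double num = 0.5 * ((rn - gn) + (rn - bn));
    double den = Math.sqrt((rn - gn) * (rn - gn) + (rn - bn) * (gn - bn));
    if (den == 0) return false;                      // grey pixel, hue undefined
    double ratio = Math.max(-1.0, Math.min(1.0, num / den));
    double h = Math.toDegrees(Math.acos(ratio));
    if (b > g) {
        h = 360 - h;                                 // hue lies in the lower half plane
    }
    return h < 25 || h > 230;                        // skin band used by Algorithm 2
}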

3) Skin Detection - YCbCr: Each pixel is taken one by one; its RGB values are converted to YCbCr and then checked against pre-determined bounds.

Algorithm 3: Skin detection using YCbCr
Result: Boolean value depending on whether the pixel is in the skin colour range
y=(0.257*r)+(0.504*g)+(0.098*b)+16;
cb=(-0.148*r)-(0.291*g)+(0.439*b)+128;
cr=(0.439*r)-(0.368*g)-(0.071*b)+128;
if cr>=((1.5862*cb)+20) then
    return false;
end
if cr<=((0.3448*cb)+76.2069) then
    return false;
end
if cr<=((-4.5652*cb)+234.5652) then
    return false;
end
if cr>=((-1.15*cb)+301.75) then
    return false;
end
if cr>=((-2.2857*cb)+432.85) then
    return false;
end
return true;
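A matching Java sketch for Algorithm 3 follows. The YCbCr conversion coefficients are the standard ones; the comparison operators in the five bounds are our reconstruction, since they did not survive in the printed pseudocode.

public static boolean isSkinYCbCr(int r, int g, int b) {
    // y = 0.257r + 0.504g + 0.098b + 16 is computed in Algorithm 3,
    // but the bounds below only use cb and cr.
    double cb = -0.148 * r - 0.291 * g + 0.439 * b + 128;
    double cr =  0.439 * r - 0.368 * g - 0.071 * b + 128;
    if (cr >= 1.5862 * cb + 20)        return false;
    if (cr <= 0.3448 * cb + 76.2069)   return false;
    if (cr <= -4.5652 * cb + 234.5652) return false;
    if (cr >= -1.15 * cb + 301.75)     return false;
    if (cr >= -2.2857 * cb + 432.85)   return false;
    return true;                       // inside all five bounding lines
}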

4) Thresholding: Thresholding is the process of changing every pixel in the image to binary, black or white, depending on some criterion. Here the criterion is whether all the skin detection algorithms are satisfied or not.

Algorithm 4: Thresholding
Result: Pixels changed to either black or white
i=0;
white=0x00ffffff;
black=0x00000000;
while i<pixel.length do
    a=0;
    if isSkinRGB()==true then
        a+=1;
    end
    if isSkinHSI()==true then
        a+=1;
    end
    if isSkinYCbCr()==true then
        a+=1;
    end
    if a==3 then
        pixel[i]=white;
    else
        pixel[i]=black;
    end
    i++;
end
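Algorithm 4 maps naturally onto the ARGB pixel array returned by Android's Bitmap.getPixels. The sketch below reuses the three helper methods shown earlier; the bit-shifting of the packed pixel value is standard, and everything else mirrors the pseudocode.

public static void threshold(int[] pixels) {
    final int WHITE = 0x00ffffff;
    final int BLACK = 0x00000000;
    for (int i = 0; i < pixels.length; i++) {
        int p = pixels[i];
        int r = (p >> 16) & 0xff;                  // ARGB packing used by Bitmap.getPixels
        int g = (p >> 8) & 0xff;
        int b = p & 0xff;
        int votes = 0;                             // the variable 'a' in Algorithm 4
        if (isSkinRGB(r, g, b))   votes++;
        if (isSkinHSI(r, g, b))   votes++;
        if (isSkinYCbCr(r, g, b)) votes++;
        pixels[i] = (votes == 3) ? WHITE : BLACK;  // white only when all three agree
    }
}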
5) Resizing: Android devices nowadays have high quality cameras. As the quality increases, the size of the image also increases, and large images are difficult to process. Hence the image is resized: the height and width of the captured image are taken and reduced by half.
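One plausible way to do this halving on Android, assuming the capture is held as an android.graphics.Bitmap, is the framework call createScaledBitmap:

public static Bitmap halveImage(Bitmap captured) {
    int w = captured.getWidth() / 2;
    int h = captured.getHeight() / 2;
    // 'true' enables bilinear filtering during the downscale.
    return Bitmap.createScaledBitmap(captured, w, h, true);
}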
C. Recognizing existing gestures

The difficult part of the application is the recognition phase. It is done with two different processes. The dataset will be large, so processing every image with ORB() would be time consuming. Hence, before ORB(), the images are first compared using histogram matching; only the images that pass this matching step go to ORB().

The device should compare images and find the match. Comparison is done with all the images in the dataset. The system should not stop iterating at the first best match; the comparison continues until every image in the dataset has been compared. There can be more than one output, as two or more alphabets or words can have the same gesture representation.

The recognition phase starts with the camera capturing an image. Here also, the application requests device permission to open the camera. On click of the recognize button the camera opens and the user is asked to take an image. Once the image is taken, it is pre-processed. The images present in the dataset are loaded one by one on each iteration and are likewise pre-processed. The two pre-processed images are fed to the comparator, which uses histogram matching and ORB descriptor comparison to compare the images and gives the image name as output if a good match is found.
1) Histogram Matching: Compute the histograms of the two images. With the histograms, the cumulative distribution functions (F1 and F2) of the two images are calculated. For each gray level G1 between 0 and 255, G2 is found such that F1(G1)=F2(G2). The value of G2 is taken as the compare variable.

The first phase of comparison happens here. Not all the images are fed to ORB(). As ORB() is a big operation and requires more time, the set of images fed to ORB() is reduced by doing histogram matching first. After histogram matching, if the compare value lies within a possible range the image is fed to ORB().

Algorithm 5: Histogram Matching
Result: Returns text and calls a method depending upon G2
compare=histogram1.compare(histogram2);
if compare==0 then
    "Exact duplicates";
end
if 0<compare<1500 then
    "Possible duplicate";
    ORB();
end
if compare>1500 then
    "Different images";
end
"No image";
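The compare value can be produced along these lines in Java. Building 256-bin grey-level histograms and their CDFs follows the description above; collapsing the G2 values into a single score by summing |G2 - G1| is our assumption, chosen so that the 0 and 1500 thresholds of Algorithm 5 can be applied to one number.

public static double histogramCompare(int[] grey1, int[] grey2) {
    double[] f1 = cdf(grey1);
    double[] f2 = cdf(grey2);
    double compare = 0;
    for (int g1 = 0; g1 < 256; g1++) {
        int g2 = 0;
        while (g2 < 255 && f2[g2] < f1[g1]) {
            g2++;                                  // smallest G2 with F2(G2) >= F1(G1)
        }
        compare += Math.abs(g2 - g1);
    }
    return compare;                                // small values mean near-identical histograms
}

static double[] cdf(int[] grey) {
    double[] hist = new double[256];
    for (int v : grey) {
        hist[v]++;                                 // grey levels assumed in 0..255
    }
    double[] f = new double[256];
    double running = 0;
    for (int k = 0; k < 256; k++) {
        running += hist[k] / grey.length;
        f[k] = running;                            // cumulative distribution function
    }
    return f;
}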
2) ORB: ORB is an efficient algorithm developed as an alternative to SURF/SIFT for producing good matches between images. It combines the FAST key point detector with the BRIEF descriptor.

In ORB, the FAST algorithm is used to detect the key points. As FAST detects many key points and only the best N are needed, the Harris corner measure is used to rank them. Scale invariance is handled with an image pyramid.

As plain BRIEF does not cope well with orientation, steered BRIEF is used here. The steering is driven by the intensity centroid: the direction from the corner to the centroid of its patch is determined and used to steer the descriptor. A greedy algorithm is used to select binary tests with low correlation between pixel blocks, and the descriptor is then computed.

After the descriptors and key points of both images are determined, the good matches are found. If the matching value is above the given match value, the images are considered possible matches.

After histogram matching, the captured image and an image from the reduced dataset are fed to ORB(). The key points of both images are determined and their descriptions are computed with the ORB descriptor. A good match list is produced; if the image matching percentage is greater than our set value, the name of the loaded image (from the dataset) is displayed as output.
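With the OpenCV 2.4-series Java bindings mentioned in Section IV, the ORB stage can be sketched as follows. The class name, the brute-force Hamming matcher and the distance cut-off of 40 used to call a match "good" are our illustrative choices; the paper only states that matches above a set value are accepted.

import org.opencv.core.Mat;
import org.opencv.core.MatOfDMatch;
import org.opencv.core.MatOfKeyPoint;
import org.opencv.features2d.DMatch;
import org.opencv.features2d.DescriptorExtractor;
import org.opencv.features2d.DescriptorMatcher;
import org.opencv.features2d.FeatureDetector;

public class OrbMatcher {
    // Returns the number of "good" matches between two pre-processed images.
    public static int countGoodMatches(Mat img1, Mat img2) {
        FeatureDetector detector = FeatureDetector.create(FeatureDetector.ORB);
        DescriptorExtractor extractor = DescriptorExtractor.create(DescriptorExtractor.ORB);

        MatOfKeyPoint kp1 = new MatOfKeyPoint(), kp2 = new MatOfKeyPoint();
        Mat desc1 = new Mat(), desc2 = new Mat();
        detector.detect(img1, kp1);
        extractor.compute(img1, kp1, desc1);      // binary ORB descriptors
        detector.detect(img2, kp2);
        extractor.compute(img2, kp2, desc2);

        DescriptorMatcher matcher = DescriptorMatcher.create(DescriptorMatcher.BRUTEFORCE_HAMMING);
        MatOfDMatch matches = new MatOfDMatch();
        matcher.match(desc1, desc2, matches);

        int good = 0;
        for (DMatch m : matches.toArray()) {
            if (m.distance < 40) good++;          // assumed Hamming-distance cut-off
        }
        return good;
    }
}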
IV. RESULT AND PERFORMANCE EVALUATION

For developing the android application, the latest version of Android Studio at the time (2.2.3) was downloaded and installed. The SDK components required for the mobile device are determined and installed; which components are needed depends on the Android version of the device. OpenCV version 2.4.1 is used: it is added as a module to the android project, and a compile time dependency on the added module is declared.

The android application is tested with the obtained dataset and the results are formulated.

The performance of the application is measured using the accuracy of the results obtained. This analysis will help future development of the application.

1) Dataset: The dataset contains gestures of alphabets and numbers. All the alphabets that can be represented as static gestures and the first ten whole numbers are included. As the user can add to the dataset, its size is not fixed.

Fig. 4. ORB

Fig. 5. Dataset

A. Testing

The static gestures corresponding to Indian Sign Language are added as images to the dataset folder. From these, a few gestures are chosen that are difficult for the system to distinguish and determine. Each selected gesture is checked a number of times and the outputs are noted. The outputs are analyzed and a tabular form of the results is made.

1) Results: Different alphabets and numbers are taken and tested. The captured images are compared with the ones in the dataset. Repeated tests were carried out and the outputs recorded. In the table, each column corresponds to an alphabet or number obtained as output and each row corresponds to an alphabet or number tested. If the cell value is 1, the tested character gave the column heading as output; if 0, it did not.

Fig. 6. Matrix of results obtained (1 corresponds to the output obtained; e.g., when testing gesture 1 the outputs obtained are 1 and I)

The testing results are also represented in tabular form, showing the characters tested, the corresponding output obtained, and the accuracy of the output with respect to the correct output.

Fig. 7. Matrix of results obtained (1 corresponds to the output obtained; e.g., when testing gesture 1 the outputs obtained are 1 and I)

2) Performance Evaluation: The accuracy of the output is determined over a number of test cases.

Accuracy(%) = (Expected output / Actual output) × 100    (1)

An accuracy graph is prepared using this data.

Fig. 8. Pie chart of accuracy

V. CONCLUSION

A hand held gesture recognition system for the common man makes this a unique contribution.

Interaction with the deaf and dumb society will be increased and discrimination towards them will be reduced.

Static hand gestures captured by the device camera are processed and the corresponding gesture is determined with an accuracy of around 70 percent. The output text obtained as the result of comparison will help the deaf and dumb society to a great extent. As the user can add new gestures to the dataset, the application does not become outdated. We are planning to extend our work to dynamic gesture recognition.

Dynamic gestures, which involve movement of the hand, require video processing to determine the gesture. The goal is to develop an android application that recognizes both static and dynamic gestures at a higher accuracy rate.

VI. REFERENCES

1) T. Starner and A. Pentland, "Real-time American Sign Language recognition from video using hidden Markov models", Technical Report No. 375, M.I.T. Media Laboratory Perceptual Computing Section, 1995.
2) Geetha M, Manjusha C, Unnikrishnan P and Harikrishnan R, "A Vision Based Dynamic Gesture Recognition of Indian Sign Language on Kinect based Depth Images", May 21, 2013.
3) Kusumika Krori Dutta, Satheesh Kumar Raju, Anil Kumar G, Sunny Arokia Swamy, "Double Handed Indian Sign Language to Speech and Text", 2015 Third International Conference on Image Information Processing.
4) Shangeetha R. K., Valliammai V., Padmavathi S., "Computer vision based approach for Indian sign language character recognition", 2012 International Conference on Machine Vision and Image Processing (MVIP 2012), Coimbatore, Tamil Nadu, India.
5) M. Jerin Jose, V. Priyadharshni et al., "Indian Sign Language (ISL) Translation System for Sign Language Learning", IJIRD, vol. 2, May 2013, pp. 358-365.
6) R. Rokade, D. Doye, and M. Kokare, "Hand Gesture Recognition by Thinning Method", International Conference on Digital Image Processing, 2009.
7) Rajam P. S., Balakrishnan G., "Real Time Indian Sign Language Recognition System to aid deaf-dumb people", 2011 IEEE 13th International Conference on Communication Technology (ICCT), pp. 737-742.
8) Nadia R. Albelwi, Yasser M. Alginahi, "Real-Time Arabic Sign Language (ArSL) Recognition", International Conference on Communications and Information Technology, 2012.
9) Megalingam R. K., Rangan V., Krishnan S., Edichery Alinkeezhil A. B., "IR Sensor-Based Gesture Control Wheelchair for Stroke and SCI Patients", IEEE Sensors Journal, Volume 16, Issue 17, 1 September 2016, Article number 7501503.
10) M. K. Bhuyan, D. Ghosh, and P. K. Bora, "Finite state representation of hand gesture using key video object plane", TENCON 2004, IEEE Region 10 Conference, pages 579-582, IEEE, 2004.
11) M. Geetha, Rohit Menon, Suranya Jayan, Raju James, and G. V. V. Janardhan, "Gesture recognition for American Sign Language with polygon approximation", Technology for Education (T4E), 2011 IEEE International Conference on, pages 241-245, IEEE, 2011.
