Research Paper1
ANURAG PRAJAPATI
INFORMATION TECHNOLOGY
GREATER NOIDA INSTITUTE OF TECHNOLOGY (GNIOT)
GREATER NOIDA, UTTAR PRADESH
anuragprajapati52852@gmail.com
NANDINI SHARMA
INFORMATION TECHNOLOGY
GREATER NOIDA INSTITUTE OF TECHNOLOGY (GNIOT)
GREATER NOIDA, UTTAR PRADESH
nandinisharma0126@gmail.com
SAURABH KUMAR
INFORMATION TECHNOLOGY
GREATER NOIDA INSTITUTE OF TECHNOLOGY (GNIOT)
GREATER NOIDA, UTTAR PRADESH
saurabhhaldwan2100@gmail.com
Abstract— Hand Gesture Recognition
Hand gesture recognition systems have received great attention in recent years because of their manifold applications and their ability to
support efficient human-computer interaction. In this paper, a survey of recent hand gesture recognition systems is
presented. A major goal of gesture recognition research is to create systems that can recognize specific human gestures and use them to convey
information. We compare three algorithms to find out which one is best suited to a hand gesture recognition system. We take the
datasets used with each of the three algorithms, compare them, and treat the algorithm with the highest accuracy as the best.
We found that SVM performs best among the three algorithms.
INTRODUCTION
Hand gesture recognition for human-computer interaction is an active research area in computer vision and machine learning. A
major goal of gesture recognition research is to create systems that can recognize specific human gestures and use them to convey
information. In this paper, we present a comparative study of three algorithms for static hand gesture recognition. SVM works
very well, with an accuracy rate of 95.7% [1], while KNN (K-Nearest Neighbours) has an accuracy rate of 91.5% [1] and Naïve
Bayes has the lowest accuracy rate, 82.6% [1].
The Naïve Bayes algorithm is a supervised learning algorithm that is based on Bayes' theorem and used for solving classification
problems. It is mainly used in text classification, which involves high-dimensional training datasets. The Naïve Bayes classifier is one
of the simplest and most effective classification algorithms; it helps in building fast machine learning models that can
make quick predictions. It is a probabilistic classifier, which means it predicts on the basis of the probability of an object. Some
popular applications of the Naïve Bayes algorithm are spam filtering, sentiment analysis, and article classification. Bayes' theorem,
also known as Bayes' rule or Bayes' law, is used to determine the probability of a hypothesis given prior knowledge. It
depends on conditional probability. The formula for Bayes' theorem is given as [4]:
\[ P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)} \]
where P(A|B) is the posterior probability of hypothesis A given evidence B, P(B|A) is the likelihood, P(A) is the prior
probability of A, and P(B) is the probability of the evidence.
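As a small worked illustration of Bayes' theorem, the sketch below computes a posterior probability from a prior and two likelihoods. The function name `posterior` and all the numbers are illustrative assumptions, not values taken from this paper.

```python
def posterior(prior_a, likelihood_b_given_a, likelihood_b_given_not_a):
    """Return P(A|B) from Bayes' theorem, expanding P(B) by total probability."""
    # P(B) = P(B|A) P(A) + P(B|not A) P(not A)
    evidence = (likelihood_b_given_a * prior_a
                + likelihood_b_given_not_a * (1.0 - prior_a))
    return likelihood_b_given_a * prior_a / evidence


# Hypothetical numbers: 20% of frames show a "fist"; some feature fires on
# 90% of fist frames and on 10% of all other frames.
print(round(posterior(0.2, 0.9, 0.1), 3))  # prints 0.692
```

Even with a strong likelihood, the low prior keeps the posterior below 0.7, which is the kind of weighting a Naïve Bayes classifier applies to every class.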
Before the sensor signals are processed by the Naïve Bayes and NN algorithms, the information is filtered. A moving average filter is used
to reduce the noise. To collect data for the training phase, subjects performed each pose thirty times. In the Naïve
Bayes method, the mean and the standard deviation of the RMS signal from the training phase are saved in the database of the
algorithm [4].
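The filtering and training steps described above can be sketched as follows. This is a minimal illustration under stated assumptions: the window length, the trailing form of the moving average, and the function names (`moving_average`, `rms`, `train_pose_model`) are all hypothetical, since the source does not give these details.

```python
import math
import statistics


def moving_average(signal, window=3):
    """Smooth a 1-D signal with a trailing moving-average filter."""
    smoothed = []
    for i in range(len(signal)):
        start = max(0, i - window + 1)
        chunk = signal[start:i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def rms(signal):
    """Root mean square of one signal recording."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def train_pose_model(repetitions, window=3):
    """Return (mean, standard deviation) of the RMS over all repetitions of one pose."""
    rms_values = [rms(moving_average(rep, window)) for rep in repetitions]
    return statistics.mean(rms_values), statistics.stdev(rms_values)
```

In the setting described in the text, `repetitions` would hold the thirty recordings of one pose, and the returned pair is what would be stored for that pose.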
Gaussian: The Gaussian model assumes that features follow a normal distribution. This means that if predictors take continuous
values instead of discrete ones, the model assumes that these values are sampled from a Gaussian distribution.
Multinomial: The Multinomial Naïve Bayes classifier is used when the data follow a multinomial distribution. It is primarily used
for document classification problems, that is, deciding which category a particular document belongs to, such as sports, politics, or
education.
Bernoulli: The Bernoulli classifier works similarly to the Multinomial classifier, but the predictor variables are independent
Boolean variables, such as whether a particular word is present in a document or not. This model is also well known for document
classification tasks.
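For continuous features such as hand measurements, the Gaussian variant is the natural fit. The following is a minimal sketch of a Gaussian Naïve Bayes classifier written from the definition above; the function names and the small variance floor are assumptions made for illustration, not the implementation used in this paper.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(samples, labels):
    """Estimate, per class, the prior and the mean/variance of every feature."""
    grouped = defaultdict(list)
    for x, y in zip(samples, labels):
        grouped[y].append(x)
    model = {}
    for label, rows in grouped.items():
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        # Small constant (an assumption) keeps every variance strictly positive.
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-9
                     for col, m in zip(zip(*rows), means)]
        model[label] = (n / len(samples), means, variances)
    return model


def predict_gaussian_nb(model, x):
    """Return the class with the highest log-posterior for feature vector x."""
    best_label, best_score = None, -math.inf
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for value, mean, var in zip(x, means, variances):
            # Log of the Gaussian density, summed under the independence assumption.
            score += -0.5 * math.log(2 * math.pi * var) - (value - mean) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

Working with log-probabilities avoids numerical underflow when many features are multiplied together.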
MACHINE LEARNING
Machine learning is the field of study that gives computers the capability to learn without being explicitly programmed. ML is
one of the most exciting technologies that one could come across. As is evident from the name, it gives the
computer the ability that makes it more similar to humans: the ability to learn. Machine learning is actively being used today, perhaps in
many more places than one would expect. There are three types of machine learning:
Supervised Learning:
Supervised learning is the type of machine learning in which machines are trained using well-labelled training data and, on the
basis of that data, predict the output. Labelled data means that some input data is already tagged with the correct
output. In supervised learning, the training data provided to the machine works as the supervisor that teaches the machine to
predict the output correctly. It applies the same concept as a student learning under the supervision of a teacher. Supervised
learning is a process of providing input data as well as correct output data to the machine learning model. The aim of a
supervised learning algorithm is to find a mapping function that maps the input variable (x) to the output variable (y). In the
real world, supervised learning can be used for risk assessment, image classification, fraud detection, spam filtering, etc.
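K-Nearest Neighbours, one of the three classifiers compared in this paper, is a simple instance of supervised learning: the labelled pairs (x, y) are stored, and a new input is mapped to the label that is most common among its nearest stored neighbours. The sketch below is illustrative only; the function name, the Euclidean distance, and the default k = 3 are assumptions.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training samples."""
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]
```

For example, with two-dimensional feature vectors labelled "fist" and "palm", `knn_predict(train_x, train_y, [1.1, 2.0])` returns the label held by the majority of the three closest training vectors.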
Unsupervised Learning:
As the name suggests, unsupervised learning is a machine learning technique in which models are not supervised using a
training dataset. Instead, the models themselves find the hidden patterns and insights in the given data. It can be compared to the
learning that takes place in the human brain while learning new things. Unsupervised learning is a type of machine learning
in which models are trained using an unlabelled dataset and are allowed to act on that data without any supervision.
Unsupervised learning cannot be directly applied to a regression or classification problem because, unlike supervised learning,
we have the input data but no corresponding output data. The goal of unsupervised learning is to find the underlying structure
of the dataset, group the data according to similarities, and represent the dataset in a compressed format.
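Grouping data by similarity without labels can be illustrated with k-means clustering. K-means is not one of the methods studied in this paper; it is used here only as a familiar example of the idea, and the function name and one-dimensional setting are assumptions chosen to keep the sketch short.

```python
def kmeans_1d(values, centers, iterations=10):
    """Group 1-D values around the given initial centers (Lloyd's algorithm)."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            # Assign each value to its nearest current center.
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Move each center to the mean of its cluster; keep it if the cluster is empty.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters
```

No labels are supplied: the two groups emerge purely from how close the values are to each other.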
Reinforcement Learning:
Reinforcement learning is an area of machine learning. It is about taking suitable actions to maximize reward in a
particular situation. It is employed by various software and machines to find the best possible behavior or path to
take in a specific situation. Reinforcement learning differs from supervised learning in that, in supervised learning,
the training data comes with the answer key, so the model is trained with the correct answer itself, whereas in reinforcement
LITERATURE SURVEY
The authors presented a scheme for database-driven hand gesture recognition based on a threshold approach. Authors
presented a static hand gesture recognition system using digital image processing. An author proposed a real-time vision-based
system for hand gesture recognition for human-computer interaction in many applications.
They presented an application that helps deaf and speech-impaired people communicate with the rest of the world using sign
language. Currently, hand gesture recognition is used in virtual environment control systems, sign language translation, robot
remote control, and musical creation. With the development of today's technology, and as humans naturally tend to use hand
gestures in their communication to clarify their intentions, hand gesture recognition is considered an important
part of Human Computer Interaction (HCI), which gives computers the ability to capture and interpret hand gestures
and then execute commands. The aim of this study is to perform a systematic literature review to identify the
most prominent techniques, applications and challenges in hand gesture recognition [5].
Research work in hand gesture recognition has been developing for more than 38 years (Prashan, 2014). In 1977, a system
that detects the number of bent fingers using a hand glove was proposed by Zimmerman (Praveen & Shreya, 2015).
Furthermore, in 1983 Gary Grimes developed a system for determining whether the thumb is touching another part of the
hand or fingers (Laura, Angelo & Paolo, 2008). In 1990, despite the limited computer processing power, the systems
developed at the time gave promising results (Prashan, 2014).
The field of hand gesture recognition is very wide, and a large amount of work has been conducted in the last 2 to 3 years. In this
research, we survey the latest research on hand gesture recognition. We also compare the different
techniques, applications, and challenges presented by the surveyed work. We chose to study the most recent research articles
from the IEEE database because we wanted to construct a valid picture of the current situation and
technologies of hand gesture recognition. Furthermore, articles published by IEEE from 2016 to 2018 were
considered, both to sharpen the focus of this study and because recent works have not been sufficiently studied before,
whereas older ones have been studied and compared more, such as in Rafiqul & Noor (2012), Arpita, Sanyal & Majumder
(2013) and Deepali & Chitode (2012).
1. Introducing the most recent research from 2016 to 2018 in the field of hand gesture recognition for the
first time.
2. Comparing the different algorithms, with their datasets, discussed in the currently available hand gesture
recognition technology.
This paper includes the results and analysis of the work studied. The next section discusses the future of hand gesture
recognition. The last section concludes the research.
PROPOSED SYSTEM
The Gesture Recognition System takes the input hand gestures through the built-in web camera at a resolution of 320 x 240
pixels. The images are captured in a brightly lit environment, with light directed at the image source, which is held against a
black background to avoid shadow effects. The images are captured at a specified distance (typically 1.5 to 2 ft)
between camera and signer. The gestures are made with the palm side of the right hand. The captured video is then processed for
hand motion detection, which is done using SAD. Then the hand is segmented. The segmented hand image
is used to find features. These features are used for gesture recognition [7].
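The motion detection step can be sketched as follows, assuming that SAD denotes the sum of absolute differences between consecutive grayscale frames (the text does not expand the abbreviation). Frames are represented as nested lists of pixel intensities, and the function names and the threshold are hypothetical.

```python
def sum_abs_diff(frame_a, frame_b):
    """Sum of absolute differences between two equally sized grayscale frames."""
    return sum(abs(a - b)
               for row_a, row_b in zip(frame_a, frame_b)
               for a, b in zip(row_a, row_b))


def motion_detected(prev_frame, curr_frame, threshold):
    """Flag motion when the SAD between consecutive frames exceeds `threshold`."""
    return sum_abs_diff(prev_frame, curr_frame) > threshold
```

A suitable threshold depends on the frame size and on camera noise; for 320 x 240 frames it would have to be tuned on recorded footage.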
METHODOLOGY
The proposed system consists mainly of three phases:
The first phase -- pre-processing
The next phase -- feature extraction
The final phase -- classification.
The first phase includes hand segmentation, which aims to isolate the hand gesture from the background and remove noise
using special filters. This phase also includes edge detection to find the final shape of the hand.
The next phase, which constitutes the main part of this research, is devoted to the feature extraction problem, where two
feature extraction methods, namely hand contour and complex moments, are employed. These two extraction methods were
applied in this study because they use different approaches to extract the features: a boundary-based approach for hand
contour and a region-based approach for complex moments. The feature extraction algorithms deal with problems associated with hand
gesture recognition such as scaling, translation and rotation. In the classification phase, where neural networks are used to
recognize the gesture image based on its extracted features, we analyze some problems related to the recognition and
convergence of the neural network algorithm. As a classification method, SVM has been widely employed, especially for
real-world applications, because of its ability to work in parallel.
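To make the SVM classification step more concrete, the sketch below trains a binary linear SVM by sub-gradient descent on the regularized hinge loss. This is an illustrative stand-in only: the paper does not state which SVM formulation, kernel, or solver was used, and the function names, learning rate, and regularization strength are assumptions.

```python
def train_linear_svm(samples, labels, epochs=200, lr=0.01, reg=0.01):
    """Fit weights w and bias b for a linear SVM; labels must be +1 or -1."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:
                # Sample is misclassified or inside the margin: step toward it.
                w = [wi + lr * (y * xi - reg * wi) for wi, xi in zip(w, x)]
                b += lr * y
            else:
                # Sample is safely classified: apply only the weight decay.
                w = [wi - lr * reg * wi for wi in w]
    return w, b


def svm_predict(w, b, x):
    """Return +1 or -1 depending on the side of the separating hyperplane."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1
```

A multi-class gesture recognizer would need several such binary classifiers (for example, one per gesture against all the others), which is again an assumption rather than something stated in the source.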
Segmentation
In this step, the hand is segmented from the background. The RGB colour image is converted into a grey-scale image, which is in
turn converted into a black-and-white image. Filling is applied to the image to fill in the holes inside the
hand region. Blob analysis is performed to obtain the largest white area in the binary image, which is taken as the hand. Next, vertical
orientation is performed: the main axis of the hand image (the longest axis) is identified and the image is rotated by the
required angle to bring it to the upright position, so that the hand gesture image is vertical. To smooth the edges,
the morphological filters dilation and erosion are used. Finally, the arm is separated from the palm, based on the idea that the arm is thinner than
the palm.
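The first part of this segmentation chain (grey-scale conversion, thresholding, and blob analysis) can be sketched as follows. Images are plain nested lists, the luma weights and the threshold of 128 are common defaults assumed here, and the function names are hypothetical. Hole filling, orientation correction, morphological smoothing, and arm removal are left out to keep the sketch short.

```python
from collections import deque


def to_gray(rgb_image):
    """Convert rows of (r, g, b) tuples to grey scale with standard luma weights."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for r, g, b in row]
            for row in rgb_image]


def to_binary(gray_image, threshold=128):
    """Threshold a grey-scale image into 1 (white) and 0 (black)."""
    return [[1 if v >= threshold else 0 for v in row] for row in gray_image]


def largest_blob(binary):
    """Keep only the largest 8-connected white region, taken to be the hand."""
    height, width = len(binary), len(binary[0])
    seen = [[False] * width for _ in range(height)]
    best = []
    for r in range(height):
        for c in range(width):
            if binary[r][c] == 1 and not seen[r][c]:
                # Breadth-first flood fill collects one connected region.
                blob, queue = [], deque([(r, c)])
                seen[r][c] = True
                while queue:
                    y, x = queue.popleft()
                    blob.append((y, x))
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = y + dy, x + dx
                            if (0 <= ny < height and 0 <= nx < width
                                    and binary[ny][nx] == 1 and not seen[ny][nx]):
                                seen[ny][nx] = True
                                queue.append((ny, nx))
                if len(blob) > len(best):
                    best = blob
    mask = [[0] * width for _ in range(height)]
    for y, x in best:
        mask[y][x] = 1
    return mask
```

Because the gestures are captured against a black background, a fixed threshold is a plausible starting point; other lighting conditions would call for an adaptive one.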
Feature Extraction
In this step, the binary image is processed using a thinning operation, a morphological operation that removes
selected foreground pixels from a binary image. This produces another binary image, the skeleton image. This image is used
for extracting the end points. In terms of neighbour connectivity, an end point is a pixel that has only one
8-connected neighbour and represents the terminal pixel of a thin segment. Figure 1 shows the end points.
Figure 1. Feature extraction, panels (a) and (d).
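The end point rule stated above (a skeleton pixel with exactly one 8-connected neighbour) translates directly into code. The sketch below assumes the skeleton is already available as a nested list of 0s and 1s; the function name is hypothetical and the thinning operation itself is not shown.

```python
def find_end_points(skeleton):
    """Return (row, col) of skeleton pixels with exactly one 8-connected neighbour."""
    height, width = len(skeleton), len(skeleton[0])
    end_points = []
    for r in range(height):
        for c in range(width):
            if skeleton[r][c] != 1:
                continue
            # Count foreground pixels among the up to eight surrounding pixels.
            neighbours = sum(
                skeleton[r + dr][c + dc]
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
                if (dr, dc) != (0, 0)
                and 0 <= r + dr < height
                and 0 <= c + dc < width
            )
            if neighbours == 1:
                end_points.append((r, c))
    return end_points
```

On a skeleton that is a single horizontal stroke, only the two extreme pixels are reported, since every interior pixel has two neighbours.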
A use case is a methodology used in system analysis to identify, clarify and organize system requirements. A use case is
made up of a set of possible sequences of interactions between systems and users in a particular environment and related to a
particular goal. The method produces a document that describes all the steps taken by a user to complete the activity of the use case.
SVM Dataset
NAÏVE BAYES
This approach to hand gesture recognition is based on the well-known Naïve Bayes classifier. The Naïve Bayes algorithm is a
supervised learning algorithm that is based on Bayes' theorem and used for solving classification problems. It is mainly used
in text classification, which involves high-dimensional training datasets. The Naïve Bayes classifier is one of the simplest and most
effective classification algorithms; it helps in building fast machine learning models that can make quick predictions. It
is a probabilistic classifier, which means it predicts on the basis of the probability of an object. Some popular applications of the
Naïve Bayes algorithm are spam filtering, sentiment analysis, and article classification. Bayes' theorem, also known
as Bayes' rule or Bayes' law, is used to determine the probability of a hypothesis given prior knowledge. It depends on
conditional probability [9].
KNN Dataset
Accuracy is one metric for evaluating classification models. Informally, accuracy is the fraction of predictions our model got
right. Formally, accuracy has the following definition:
\[ \text{Accuracy} = \frac{\text{Number of correct predictions}}{\text{Total number of predictions}} \]
For binary classification, accuracy can also be calculated in terms of positives and negatives as follows:
\[ \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \]
where TP = True Positives, TN = True Negatives, FP = False Positives, and FN = False Negatives.
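Both forms of the accuracy definition can be computed in a few lines. The function names and the sample counts below are illustrative assumptions, not results from this paper.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def binary_accuracy(tp, tn, fp, fn):
    """Accuracy from the four confusion-matrix counts of a binary classifier."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical labels: three of four predictions are correct.
print(accuracy(["fist", "palm", "fist", "palm"],
               ["fist", "palm", "palm", "palm"]))  # prints 0.75
```

The two functions agree whenever the counts are derived from the same predictions, since TP + TN is the number of correct predictions and the denominator is the total.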
Comparison of the accuracy rates of the three algorithms used in this research paper:
Classifier           KNN      Naïve Bayes   SVM
Accuracy rate [1]    91.5%    82.6%         95.7%
CONCLUSIONS
In this paper, various methods for hand gesture recognition are discussed: SVM, Naïve Bayes and KNN (K-Nearest
Neighbours). The experimentation was carried out on various datasets, which justifies that the proposed solution
outperforms the existing methods by being robust to scale variance and not requiring any predefined templates for
recognition. SVM works very well in hand gesture recognition compared with the other two algorithms, KNN and Naïve Bayes [9][10][11].
FUTURE SCOPE
The scope of this project is to build a real-time gesture classification system that can automatically detect gestures
in natural lighting conditions. To accomplish this objective, a real-time gesture-based system is developed to
identify gestures. The aims are to enable disabled people to communicate with able-bodied people, to establish a way
for disabled people to share their thoughts and ideas, and to detect, recognise and interpret hand gestures through
computer vision so that they too can communicate with others.
ACKNOWLEDGEMENT
We have put effort into this project. However, it would not have been possible without the kind support and help of many
individuals and organizations. We would like to extend our sincere thanks to all of them. We are highly indebted to Dr.
Shivani Dubey, Professor in the Department of Information Technology. We are greatly thankful to Dr. Vikas Singhal, Head of
Information Technology, for his support and cooperation. Finally, we thank our parents and project group partners, who
directly and indirectly contributed to the successful completion of our project.
REFERENCES
[1] Hninn, T. and H. Maung, Real-Time Hand Tracking and Gesture Recognition System Using Neural Networks. 2009. 50(February): p. 466-470.
[2] Chaudhary, A., et al., Intelligent Approaches to interact with Machines using Hand Gesture Recognition in Natural way: A Survey. International
Journal of Computer Science & Engineering Survey, 2011. 2(1): p. 122-133.
[3] Chen, Q., N.D. Georganas, and E.M. Petriu, Real-time Vision-based Hand Gesture Recognition Using Haar-like Features, in Instrumentation and
Measurement Technology Conference, IEEE, Editor 2007: Warsaw, Poland.
[4] , R. RapidMiner: Report the Future. December 2011]; Available from: http://rapid-i.com/content/view/181/196/.
[5] Mitra, S. and T. Acharya, Gesture recognition: A Survey, in IEEE Transactions on Systems, Man and Cybernetics. 2007, IEEE. p. 311-324.
[6] Murthy, G.R.S. and R.S. Jadon, A Review of Vision Based Hand Gestures Recognition. International Journal of Information Technology and
Knowledge Management, 2009. 2(2): p. 405-410.
[7] Faria, B.M., N. Lau, and L.P. Reis. Classification of Facial Expressions Using Data Mining and Machine Learning Algorithms. in 4ª Conferência
Ibérica de Sistemas e Tecnologias de Informação. 2009. Póvoa de Varzim, Portugal.
[8] Gillian, N.E., Gesture Recognition for Musician Computer Interaction, in Music Department. 2011, Faculty of Arts, Humanities and Social
Sciences: Belfast. p. 206.
[9] Faria, B.M., et al., Machine Learning Algorithms applied to the Classification of Robotic Soccer Formations and Opponent Teams, in IEEE
Conference on Cybernetics and Intelligent Systems (CIS). 2010: Singapore. p. 344-349.
[10] Mannini, A. and A.M. Sabatini, Machine learning methods for classifying human physical activity from on-body accelerometers. Sensors, 2010.
10(2): p. 1154-75.
[11] Maldonado-Bascón, S., et al., Road-Sign detection and Recognition Based on Support Vector Machines, in IEEE Transactions on Intelligent
Transportation Systems. 2007. p. 264-278.
[12] Vicen-Bueno, R., et al., Complexity Reduction in Neural Networks Applied to Traffic Sign Recognition Tasks. 2004.
[13] Witten, I.H., E. Frank, and M.A. Hall, Data Mining - Practical Machine Learning Tools and Techniques. Third Edition. 2011: Elsevier.
[14] Snyder, W.E. and H. Qi, Machine Vision. 2004: Cambridge University Press.
[15] Stephan, J.J. and S. Khudayer, Gesture Recognition for Human-Computer Interaction (HCI). International Journal of Advancements in
Computing Technology, 2010. 2(4): p. 30-35.
[16] Ben-Hur, A. and J. Weston, A User’s Guide to Support Vector Machines, in Data Mining Techniques for the Life Sciences. 2008.
[17] Ke, W., et al., Real-Time Hand Gesture Recognition for Service Robot. 2010: p. 976-979.
[18] Camastra, F. and A. Vinciarelli, Machine Learning for Audio, Image and Video Analysis. 2008: Springer.
[19] Future Gener Comput Syst (2019) Shady et al.
[20] Int J Comput Appl (2013) Rautaray S.S. et al. Vision based hand gesture recognition for human computer interaction: A survey
[21] Artif Intell Rev (2015) Wachs J.P. et al. Hand gestures are obtained by evaluating the contour captured from the image segmentation using a glove
worn by the speaker in Rosalina et al. (2017). Also, in Danling, Yuanlong & Huaping (2016) they used a novel data glove called YoBu to collect
data for gesture recognition.
[22] The work in Gunawardane & Nimali (2017) compared using a data glove to track the motion of the human hand using flex sensors, gyroscopes
and vision data with Leap Motion Controller.
[23] Furthermore, in Eko, Surya & Rafiidha (2017), Shaun et al. (2017) and Deepali & Milind (2016) the gestures are also captured using Leap
Motion Controller.
[24] Vision-based hand-gesture application. Pattern Recognit Lett (2021) Mitra S. et al.
[25] To control OS on the projected screen for a virtual mouse system without any hardware requirement one camera source was required (Rishabh et
al., 2016). Also, data acquisition was done using camera interfacing in Ashish & Aarti (2016a) and Ashish & Aarti (2016b).
[26] The first method is using vision-based hand gesture recognition to extract images, which was proposed by Weiguo et al. (2017), Ananyaa et al.
(2017), Alvi, Fatema & Mohammad (2016), Shome (2017). In Weiguo et al. (2017) it was built into a real-time system.
[27] Hand Gesture Recognition via Lightweight VGG16 and Ensemble Classifier. 2022, Applied Sciences (Switzerland)
[28] Sarkar A.R. et al. Hand gesture recognition system. Int J Comput Appl (2013)
[29] Rautaray S.S. et al. Vision based hand gesture recognition for human computer interaction: A survey. Artif Intell Rev (2015)
[30] Vallathan G. et al. Suspicious activity detection using deep learning in secure assisted living IoT environments. J Supercomput (2021). Liu S. et
al. Overview and methods of correlation filter algorithms in object tracking. Complex Intell Syst (2021)