Hand Gesture Recognition: (I) Area of Interest Calculation

HAND GESTURE RECOGNITION

1. Introduction
This work deals with the detection and recognition of hand gestures in
images obtained from a webcam. Hand gestures have become one of the
essential means of building a user-friendly environment. Gesture recognition
could help in developing robots that understand sign language, and could give
gamers more direct control of a game than controllers do. Modern researchers
give equal importance to speech and hand gesture recognition, but the latter
has gained a lot of interest recently, mainly because speech recognition
requires more datasets than gesture recognition.
For hand gesture recognition, we can either match the live images against a
predefined database of simple gestures, or identify the gesture on the fly
from the position of each finger with respect to the others. This work focuses
mainly on the latter approach. We make use of concepts from digital image
processing and computer vision.

2. Methodology
(i) Area of interest calculation
The video is first captured through the webcam. The area of interest is then
defined as the part of the frame in which the hand is to be placed.
(figure of hand)
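A minimal sketch of this step, assuming a frame is held as a list of pixel
rows and the area of interest is a fixed rectangle. The function name
crop_roi and the coordinates are hypothetical and are not taken from the
original program.

```python
def crop_roi(frame, top, left, bottom, right):
    """Return rows top..bottom-1 and columns left..right-1 of a frame.

    The frame is assumed to be a list of rows, each row a list of pixels.
    """
    return [row[left:right] for row in frame[top:bottom]]


# Toy 5x6 "frame" whose pixel value encodes its position (10 * row + column).
frame = [[10 * r + c for c in range(6)] for r in range(5)]

# Hypothetical area of interest: rows 1..3, columns 2..4.
roi = crop_roi(frame, top=1, left=2, bottom=4, right=5)
```

Only the pixels inside this rectangle are passed on to the later steps, so
the hand has to be held inside it.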
(ii) Changing the colour space and extracting the skin image
We use the HSV colour space, which is commonly preferred for object
tracking. The portion of the image that contains the hand has to be
segregated. By specifying the range of skin colour in the HSV model, the
skin-coloured pixels are extracted from the region of interest; the result
becomes our mask.
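The masking step can be sketched as follows, using only the Python standard
library. The HSV bounds LOWER and UPPER are hypothetical values chosen for
illustration (the source does not list its range), and each channel is scaled
to 0..1 as colorsys does.

```python
import colorsys

# Hypothetical skin-colour range in HSV, each channel in 0..1 (an assumption).
LOWER = (0.00, 0.15, 0.30)
UPPER = (0.10, 0.70, 1.00)


def skin_mask(rgb_roi, lower=LOWER, upper=UPPER):
    """Return a binary mask: 1 where the pixel's HSV value lies in the range."""
    mask = []
    for row in rgb_roi:
        mask_row = []
        for r, g, b in row:
            hsv = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            inside = all(lo <= v <= hi for v, lo, hi in zip(hsv, lower, upper))
            mask_row.append(1 if inside else 0)
        mask.append(mask_row)
    return mask


# Toy 2x2 region of interest: two skin-like tones, pure blue and dark grey.
roi = [[(224, 172, 105), (0, 0, 255)],
       [(30, 30, 30), (198, 134, 66)]]
mask = skin_mask(roi)
```

In practice the bounds have to be tuned to the camera and lighting, which is
why the range of skin-coloured pixels matters so much for the final result.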
A morphological operation called dilation is then applied to the mask. This
process adds pixels to the boundaries of the regions in the image, so that
even the smallest skin portion remains visible. The parameters of the
dilation process are the kernel and the number of iterations. The kernel is
the structuring element, which has to be specified beforehand, and the
iteration count specifies how many times the dilation is repeated.
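To make the roles of the kernel and the iteration count concrete, here is a
minimal sketch of binary dilation. It assumes a 3x3 square structuring
element; the kernel actually used in the original program is not stated.

```python
def dilate(mask, iterations=1):
    """Binary dilation of a 0/1 mask with a 3x3 square structuring element."""
    rows, cols = len(mask), len(mask[0])
    for _ in range(iterations):
        out = [[0] * cols for _ in range(rows)]
        for r in range(rows):
            for c in range(cols):
                # A pixel is set if any pixel in its 3x3 neighbourhood is set.
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        rr, cc = r + dr, c + dc
                        if 0 <= rr < rows and 0 <= cc < cols and mask[rr][cc]:
                            out[r][c] = 1
        mask = out
    return mask


# A single set pixel in the centre of a 5x5 mask.
mask = [[0] * 5 for _ in range(5)]
mask[2][2] = 1

once = dilate(mask, iterations=1)   # grows to a 3x3 block
twice = dilate(mask, iterations=2)  # grows to fill the whole 5x5 mask
```

Each extra iteration grows every region by one more pixel in all directions,
which closes small holes in the hand mask but also thickens noise.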
(iii) Identifying contours
A contour is the line joining the points along the boundary of a region
that have the same intensity. Here only the hand is segregated, by providing
the range of pixel values that matches the skin colour.
With the chain approximate simple method, the redundant points along the
contour are removed: each straight segment is stored by its end points only.
This compresses the contour and saves some memory.
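The idea behind this compression can be illustrated with a short sketch that
drops every point lying in the middle of a straight run of a closed contour.
It is a simplified stand-in written for this explanation, not the library
routine itself, and the name simplify_contour is hypothetical.

```python
def simplify_contour(points):
    """Keep only the points of a closed contour where the direction changes."""
    n = len(points)
    kept = []
    for i in range(n):
        (x0, y0) = points[i - 1]
        (x1, y1) = points[i]
        (x2, y2) = points[(i + 1) % n]
        # The cross product is zero when the three points are collinear.
        cross = (x1 - x0) * (y2 - y1) - (y1 - y0) * (x2 - x1)
        if cross != 0:
            kept.append(points[i])
    return kept


# Boundary of a small square, traced pixel by pixel (8 points).
square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)]
corners = simplify_contour(square)
```

The eight boundary points reduce to the four corners, which describe the
same shape with half the storage.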
Next, the area occupied by the hand is found. The exterior shape of the hand
is then found using the concept of the convex hull. The convex hull of a
shape is a tight-fitting convex boundary around the points or edges of the
shape. The area of the convex hull is found after this; the area of the hull
and the area of the hand are stored separately. The percentage of the area
not occupied by the hand is calculated in order to mark the defects
accurately. Any deviation of the object from this hull can be marked as a
defect, and is found through convexity defects. A convexity defect is a
cavity in an object or contour that is segmented from an image. In other
words, it is an area that does not belong to the object but is located
inside its outer boundary, i.e. its convex hull.
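The hull and area calculations can be sketched without any library as
follows. The hull uses Andrew's monotone chain algorithm and the areas use
the shoelace formula; both are swapped in for illustration and may differ
from the routines the original program calls. Taking the percentage relative
to the hull area is also an assumption, since the text does not give the
formula.

```python
def cross(o, a, b):
    """Z component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def polygon_area(vertices):
    """Area of a simple polygon by the shoelace formula."""
    total = 0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        total += x0 * y1 - x1 * y0
    return abs(total) / 2


# Toy "hand": a 4x4 square with a V-shaped notch cut into one edge.
contour = [(0, 0), (4, 0), (4, 4), (2, 2), (0, 4)]
hull = convex_hull(contour)
area_hand = polygon_area(contour)
area_hull = polygon_area(hull)

# Assumed definition: share of the hull area that the hand does not cover.
unused_percent = (area_hull - area_hand) / area_hull * 100
```

Here the notch is the only convexity defect: the hull is the full square, and
a quarter of its area is not covered by the contour.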
(iv) Identifying the number of defects
Each defect in the convex hull is described by three points: the start
point, the end point and the farthest point. A fourth parameter specifies the
distance of the farthest point from the hull. The defects that correspond to
gaps between fingers are separated from the rest by using the angle obtained
from the cosine rule.
The angles between the fingers are generally less than 90 degrees, and the
gesture recognition relies on these angles. All defects with an angle greater
than 90 degrees are therefore excluded, and all points that are very close to
the hull are also excluded, since they are treated as noise rather than as
gaps between fingers. After this, a line is drawn around the hand to
differentiate it from all the other portions of the region of interest.
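The filtering described above can be sketched as follows. Each defect is
assumed to be given as (start, end, far, depth), matching the four values
described in the text; the depth threshold min_depth is a hypothetical value,
as the source does not state the one it uses.

```python
import math


def side(p, q):
    """Euclidean distance between two points."""
    return math.hypot(q[0] - p[0], q[1] - p[1])


def defect_angle(start, end, far):
    """Angle in degrees at the farthest point, from the cosine rule."""
    a = side(start, end)   # side opposite the farthest point
    b = side(far, start)
    c = side(far, end)
    cos_angle = (b * b + c * c - a * a) / (2 * b * c)
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding
    return math.degrees(math.acos(cos_angle))


def count_finger_gaps(defects, max_angle=90.0, min_depth=5.0):
    """Count defects that are deep enough and narrower than max_angle."""
    count = 0
    for start, end, far, depth in defects:
        if depth > min_depth and defect_angle(start, end, far) <= max_angle:
            count += 1
    return count


defects = [
    ((0, 10), (10, 10), (5, 0), 10.0),  # narrow and deep: a finger gap
    ((0, 2), (20, 2), (10, 0), 2.0),    # very wide angle: excluded
    ((0, 3), (2, 3), (1, 0), 3.0),      # narrow but too shallow: excluded
]
gaps = count_finger_gaps(defects)
fingers = gaps + 1
```

Only the first defect passes both tests, so this toy input is read as two
raised fingers.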
(v) Finding out the gestures
For gesture recognition, the defects are the primary feature. The number of
defects plus one is the number of fingers in the region of interest. Gestures
are identified based only on the number of defects and the angle between the
fingers.
If the number of defects plus one equals one, the output can be 0, 1 or
"all the best". To decide among these three, the contour ratio is considered.
The contour ratio is highest for 0, while 1 and "all the best" each have
their own contour ratio. The ratios are calculated beforehand and applied in
the if condition to differentiate the gestures. If there is nothing in the
box, the message "place your hand properly" is displayed. The conditions for
all the other gestures are captured in the same way and the results are
displayed.
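A minimal sketch of this decision logic is given below. All three thresholds
are hypothetical placeholders, since the precomputed ratios are not listed in
the text. The ordering of 1 above "all the best" and the use of a minimum
hand area to detect an empty box are also assumptions, and the extra
conditions for gestures such as "ok" are not reproduced.

```python
# Hypothetical thresholds (assumptions, not the values of the original program).
ZERO_MIN_RATIO = 0.9
ONE_MIN_RATIO = 0.8
MIN_HAND_AREA = 2000


def classify(num_defects, contour_ratio, hand_area):
    """Map the defect count and contour ratio to a gesture label."""
    if hand_area < MIN_HAND_AREA:
        # Nothing large enough was found in the region of interest.
        return "place your hand properly"
    fingers = num_defects + 1
    if fingers == 1:
        # No finger gaps: fall back on the contour ratio.
        if contour_ratio >= ZERO_MIN_RATIO:
            return "0"
        if contour_ratio >= ONE_MIN_RATIO:
            return "1"
        return "all the best"
    return str(fingers)


labels = [
    classify(0, 0.95, 5000),
    classify(0, 0.85, 5000),
    classify(0, 0.50, 5000),
    classify(4, 0.60, 5000),
    classify(0, 0.95, 10),
]
```

With real data the thresholds would have to be measured for each gesture
under the same camera and lighting conditions.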

3. Results

Figure 3.1.1 Contour for 0 Figure 3.1.2 Mask for 0


Figure 3.2.1 Contour for all the best Figure 3.2.2 Mask for all the best

Figure 3.3.1 Contour for 1 Figure 3.3.2 Mask for 1


Figure 3.4.1 Contour for 2 Figure 3.4.2 Mask for 2

Figure 3.5.1 Contour for 3 Figure 3.5.2 Mask for 3


Figure 3.6.1 Contour for ok Figure 3.6.2 Mask for ok

Figure 3.7.1 Contour for 4 Figure 3.7.2 Mask for 4

Figure 3.8.1 Contour for 5 Figure 3.8.2 Mask for 5


4. Conclusion
Repeated testing shows that the results depend mainly on:
(i) Noise
Video streaming is always affected by noise. Even the slightest hand
movement can affect the gesture recognition. If the hand is held steady for
long enough, the program output is correct.
(ii) Position of the hand
The hand should be placed within the region of interest; otherwise the
program reports an error indicating that there is no hand inside it.
(iii) Illumination of the place
The place should be illuminated sufficiently for the skin-coloured pixels,
which are then used to recognise the gestures, to be detected. Uneven
lighting across the picture of the hand causes the algorithm to draw contours
along the wrong area and identify the wrong pixels. The place should
therefore be lit properly and the threshold set accordingly.
(iv) Range of skin-coloured pixels
The range of the skin-coloured pixels must be specified correctly to avoid
mistakes during gesture recognition. An incorrect range may lead to a wrong
prediction for the image.
(v) Threshold of the ratio
The thresholds applied after converting the original image to HSV must be
set appropriately. For gesture recognition, the ratio of the area of the
convex hull to that of the original image is very important, since the
gestures are identified based on this ratio.
