
Analysis of Zernike Moment-Based

Features for Sign Language Recognition

Garima Joshi, Renu Vig and Sukhwinder Singh

Abstract This paper discusses a Zernike Moment (ZM) based feature vector that
can characterize the alphabets of Indian Sign Language (ISL). Sign Language
Recognition (SLR) is a multiclass shape classification problem. Studies of the
human visual system reveal that while observing a scene, the focus is greatest at
the center and decreases toward the edges. This observation forms the basis for
calculating ZMs on a unit circular disk. Continuous orthogonal moments such as
the magnitude of ZMs are well known for their shape-representation capabilities.
However, for an SLR system the highest useful order of moments must be estimated,
because the feature vector size increases significantly with the order of the
moments. To find the maximum order that is sufficient to classify the shapes of
hand silhouettes, the performance of various classifiers is analyzed. Results
show that increasing the order of ZM beyond a certain point does not improve
recognition capability. The results improve when the ZMs are combined with some
basic geometric features and commonly used shape descriptors such as Hu moments
(HMs).

Keywords Sign language recognition · Zernike moments · Shape features

1 Introduction

India has a large population of speech and hearing-impaired people. Indian Sign
Language (ISL) is used by the speech and hearing-impaired to communicate.
A gesture recognition system can act as an interpreter for Sign Language (SL) [1].
It can eliminate the need for a translator in conversations with society at any
point of time and can bridge the communication gap. Shape-based features that can
be used for shape recognition include Hu Moments (HMs), Zernike Moments (ZMs),

G. Joshi (✉) · R. Vig · S. Singh
University Institute of Engineering and Technology, Panjab University,
Chandigarh 160014, India
e-mail: joshi_garima5@yahoo.com

© Springer Nature Singapore Pte Ltd. 2018


R. Singh et al. (eds.), Intelligent Communication, Control and Devices,
Advances in Intelligent Systems and Computing 624,
https://doi.org/10.1007/978-981-10-5903-2_140

edge information, and geometric features (GF) [2]. Hand shape is a very important
and basic characteristic used to recognize SL [3]. Gesture recognition is the
mathematical modeling of hand shapes using appearance-based features. These
features can be used to identify different shapes [4]. Moment-based features find
applications in the shape representation domain; they include non-orthogonal and
orthogonal moments, in both continuous and discrete forms [5]. Potocnik
assessed the ability of moment-based descriptors to recognize shapes [6]. It was
found that orthogonal moments such as ZMs can also represent minor variations in
shape, including variations in medical images such as tumors [7]. ZMs are
scale, rotation, and translation invariant. Kim et al. compared
ZM with various competing descriptors and found ZM as an effective region-based
descriptor [8]. Sabhara et al. compared ZM and HM. They reported ZM to be more
accurate, flexible, and easier to reconstruct than HM. Increasing the order of the ZM
increased the accuracy, and as per the system requirement, an optimal order of ZM
could be chosen [9]. Priya and Bora studied the capability of orthogonal moments
to address the challenges associated with signer-independent and viewpoint-variant
hand gesture recognition system [10].
In the pattern recognition domain, moments possess the capability to represent
shape. Depending on the order u of the polynomials, moments are divided into
lower and higher orders. Pattern recognition applications focus primarily on
low-order polynomials. Moreover, the size of the feature vector increases as the
order increases, which in turn affects the performance of a classifier.
Therefore, there is a need to find the optimal order of the moments being derived.
In this work, the performance of a Sign Language (SL) recognition system is
analyzed as the order of ZM increases. The order of moments to be included can be
estimated by observing the basic characteristics of the polynomial order or by
using the reconstruction process. The polynomial characteristics in 1D are
analyzed in Sect. 2. We have created an image dataset of the 26 ISL alphabets
from 90 subjects, with a total of 2300 images.
Geometric features are extracted for the binary hand images using region-based
parameters (area, moments, and axis) and boundary parameters. The geometrical
shape descriptors are listed in Table 1. The details can be found in [11] and in [12].

Table 1 Shape-based feature set

Feature type             Feature set
Geometric features (7)   Circularity ratio (CR), spreadness (SS), roundness (RO), solidity (S), average bending energy (BE), eccentricity (E), convexity (CV)
Hu moments (7)           H1, H2, H3, H4, H5, H6, H7
Zernike moments (121)    ZM_uv of order u = 0 to u = 20 and repetition v, from ZM_0,0 to ZM_20,20, such that u + v is always even, with maximum v = u
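To make the geometric descriptors concrete, the following sketch computes area, eccentricity (from second-order central moments), and circularity for a binary silhouette. It is an illustrative approximation, not the exact implementation of [11, 12]; in particular, the crude boundary-pixel perimeter is an assumption of this sketch.

```python
import numpy as np

def shape_features(mask):
    """Illustrative versions of a few Table 1 descriptors for a binary
    silhouette. The perimeter is a crude boundary-pixel count (an
    assumption of this sketch); real systems trace the contour."""
    ys, xs = np.nonzero(mask)
    area = len(xs)
    cx, cy = xs.mean(), ys.mean()
    # second-order central moments of the foreground pixels
    mu20 = ((xs - cx) ** 2).mean()
    mu02 = ((ys - cy) ** 2).mean()
    mu11 = ((xs - cx) * (ys - cy)).mean()
    # eigenvalues of the covariance matrix give the axis lengths
    common = np.sqrt(4 * mu11 ** 2 + (mu20 - mu02) ** 2)
    lam1 = (mu20 + mu02 + common) / 2
    lam2 = (mu20 + mu02 - common) / 2
    eccentricity = np.sqrt(1 - lam2 / lam1) if lam1 > 0 else 0.0
    # boundary pixels: foreground pixels with a background 4-neighbour
    p = np.pad(mask, 1)
    interior = (p[:-2, 1:-1] & p[2:, 1:-1] & p[1:-1, :-2] & p[1:-1, 2:])
    perimeter = int((mask & ~interior).sum())
    circularity = 4 * np.pi * area / perimeter ** 2
    return {"area": area, "eccentricity": eccentricity,
            "circularity": circularity}

# a filled disk should come out nearly circular (eccentricity ~ 0)
yy, xx = np.mgrid[:64, :64]
feats = shape_features((xx - 32) ** 2 + (yy - 32) ** 2 <= 20 ** 2)
```

For a hand silhouette, such descriptors complement the ZMs by capturing gross shape properties in only a handful of values.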

2 Analysis of Zernike Moments

While calculating ZMs, the first step is to map the image onto the unit disk. The
literature reveals that when a person observes an image, the center part receives
the most attention, which slowly decreases outward from the center. This
observation is used in feature extraction and recognition tasks. In the
calculation of ZMs, the unit disk image is accordingly divided into many rings
spaced equally outward from the center of the image, and pixels in different
rings carry different weights. The unit disk image is then projected onto the
Zernike polynomials, and the weights are included in the calculation of the
Zernike moments [13].
Figure 1 shows the first repetition of the odd Zernike polynomials; it is a sine
function. As the order increases, the number of zero crossings increases,
enhancing the ability of ZMs to represent details within an image [14]. Figure 2
shows the behavior of the Zernike polynomial for u = v. It is observed that for
lower values of (u, v) the plots have distinctive shapes, whereas for higher
values of (u, v) the plots acquire similar shapes. Therefore, for the case where
u and v are equal, only lower-order ZMs are suitable. ZMs are derived inside a
unit disk; when computing them on a Cartesian pixel grid, the information around
the boundary of the disk gets lost, which leads to geometrical errors. To
overcome this problem, the image is resized and the ROI is aligned at the center
to ensure that all the pixels lie inside the unit disk.
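The projection described above can be sketched directly from the standard ZM definition. This is a deliberately unoptimized illustration; the accurate computation methods of [13, 14] reduce the geometric and numerical errors discussed in this section.

```python
import numpy as np
from math import factorial

def zernike_moment(img, u, v):
    """Magnitude of the Zernike moment Z_uv of a square image, computed
    over the inscribed unit disk (order u, repetition v, with u - |v|
    even and |v| <= u). A direct, unoptimized sketch."""
    n = img.shape[0]
    # map pixel indices onto [-1, 1] so the disk is inscribed in the image
    coords = (np.arange(n) - (n - 1) / 2) / (n / 2)
    x, y = np.meshgrid(coords, coords)
    rho = np.hypot(x, y)
    theta = np.arctan2(y, x)
    inside = rho <= 1.0            # pixels outside the disk are discarded
    # radial polynomial R_uv(rho)
    R = np.zeros_like(rho)
    for s in range((u - abs(v)) // 2 + 1):
        c = ((-1) ** s * factorial(u - s) /
             (factorial(s) * factorial((u + abs(v)) // 2 - s)
              * factorial((u - abs(v)) // 2 - s)))
        R += c * rho ** (u - 2 * s)
    basis = R * np.exp(-1j * v * theta)
    # (2/n)^2 is the area of one pixel in unit-disk coordinates
    Z = (u + 1) / np.pi * np.sum(img[inside] * basis[inside]) * (2 / n) ** 2
    return abs(Z)
```

The magnitude |Z_uv| is rotation invariant, which is why it is used as the feature: a rotated silhouette yields the same magnitudes.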

Fig. 1 Zernike polynomial plots for odd values of u and first repetition

Fig. 2 Zernike polynomial plots u = [0, 2, 3, 4, 6, 8, 10], when u = v

Fig. 3 a Improper resizing and b center resizing

2.1 Computational Error in Zernike Moments

Figure 3 shows a unit circle placed on a resized ISL sign of the alphabet A; the
white pixels that fall within the circle are included in calculating the ZMs. In
Fig. 3(a), a large white area falls outside the unit circle, whereas in Fig. 3(b)
the entire hand shape is covered by the circle. Figure 3(b) shows the
center-aligned resized image: extra rows and columns are added so that the center
of the image lies near the center of the ZM unit disk. Therefore, proper center
resizing can minimize the computational error in the calculation of ZMs.
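A minimal sketch of such center resizing, assuming the input is a binary mask; sizing the square canvas to the ROI diagonal guarantees that every foreground pixel lands inside the inscribed unit disk.

```python
import numpy as np

def center_pad(mask):
    """Center-align a binary hand mask in a square canvas so the whole
    shape fits inside the inscribed unit disk (cf. Fig. 3(b)). A minimal
    sketch: the canvas side is the ROI diagonal."""
    ys, xs = np.nonzero(mask)
    # crop to the bounding box of the hand
    roi = mask[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
    h, w = roi.shape
    # diagonal-length side => every ROI pixel lies at radius <= 1
    side = int(np.ceil(np.hypot(h, w)))
    out = np.zeros((side, side), dtype=mask.dtype)
    top, left = (side - h) // 2, (side - w) // 2
    out[top:top + h, left:left + w] = roi
    return out
```

After this step, the unit-disk mapping used for the ZMs discards no foreground pixels, which is exactly the error source Fig. 3(a) illustrates.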

3 Results and Discussion

3.1 Preprocessed Image Analysis

In the preprocessing stage, images without edges, with edges, misaligned, and
centrally aligned are classified. Table 2 shows the comparison of results for the
ISL database.
For classification, Support Vector Machine (SVM), Logistic Model Tree (LMT),
Multi-Layer Perceptron (MLP), Bayes Net (BN), Naive Bayes (NB), J48, and
k-Nearest Neighbor (k-NN) are compared. The conclusions derived from this
experiment are summarized below:
• When edges are not considered, signs having unique and distinguishable
shapes, such as the alphabets A, C, D, E, G, J, L, U, V, Y, and Z, are classified
with high accuracy [11]. Considerably lower accuracy is observed for alphabets
with similar shapes. To distinguish such signs visually, internal edge details
are included; accordingly, the performance of all the classifiers improves when
edges are included.
• With center resizing, some improvement is observed. SVM gives the highest
accuracy: initially, the highest overall accuracy is 86%; it improves to 88.4%
on the inclusion of edge details; and when images are additionally centrally
aligned, 91.1% accuracy is achieved.
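The comparison follows the usual protocol: one feature vector per image, a train/test split, and held-out accuracy per classifier. A minimal sketch of that protocol with a plain 1-NN classifier on synthetic feature vectors (the study itself uses SVM, LMT, MLP, BN, NB, J48, and k-NN implementations, and real ZM/GF/HM features):

```python
import numpy as np

rng = np.random.default_rng(0)

def knn_predict(train_X, train_y, test_X):
    """1-nearest-neighbour by Euclidean distance, the simplest of the
    classifiers compared here."""
    d = np.linalg.norm(test_X[:, None, :] - train_X[None, :, :], axis=2)
    return train_y[d.argmin(axis=1)]

# synthetic stand-in for per-sign feature vectors: 3 well-separated
# classes, 30 samples each, 10 features (illustrative only)
centers = 5.0 * rng.normal(size=(3, 10))
X = np.concatenate([c + rng.normal(size=(30, 10)) for c in centers])
y = np.repeat(np.arange(3), 30)

idx = rng.permutation(len(y))      # 75/25 train/test split
cut = int(0.75 * len(y))
tr, te = idx[:cut], idx[cut:]
acc = (knn_predict(X[tr], y[tr], X[te]) == y[te]).mean()
```

Each row of Table 2 corresponds to rerunning exactly this loop with a different preprocessing variant and classifier.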

3.2 Feature Set Performance Analysis

In Fig. 4, the performance of SVM, NB, and LMT is shown for varying orders of
ZMs. For all the classifiers under study and all the feature vectors, the results
are summarized in Table 3.
SVM gives good results. Multiclass problems are solved using pair-wise
classification. SVM is effective in high-dimensional spaces, even in cases where
the number of feature dimensions is greater than the number of samples. In the
case of SVM, the accuracy for ZM13 (up to the 13th order and all possible values
of v) is 89.8%, and it saturates for higher orders of ZM. The best result of 93%
is obtained for a combined feature vector of size 70, which includes ZM12, GF,
and HM.
In the case of NB, the highest accuracy for a ZM-based feature vector is 81.8%,
for ZM12.

Table 2 Performance of the preprocessing stage for the ISL database (accuracy, %)


Edges Center aligned SVM MLP BN NB k-NN LMT J48
No No 86 82.7 74.6 78.9 77.6 80.7 66.2
Yes No 88.4 88.4 81.3 83.4 82.4 88.4 73.6
Yes Yes 91.1 88.4 83.8 84.4 83.7 90.6 77.7

Fig. 4 Accuracy of SVM, NB, and LMT for various orders of ZM

It decreases for higher orders. The reason for the decrease is the significant
increase in feature vector size, which leads to the curse of dimensionality,
known as the Hughes effect in probability-based classifiers [15]. Adding GF
improves the overall accuracy of NB: the highest accuracy of 84.5% is achieved
for a feature vector of size 63, which includes ZM12 combined with GF. However,
including HM does not improve the results. For LMT, including HM with lower-order
ZMs enhances the performance, whereas further adding GF to this group improves
the performance at higher orders of ZM. The performance of LMT is better than
that of all the other classifiers for smaller feature sets; it is 92.1% for a
combined feature set of size 114. The results for feature vector size variation
are summarized in Table 3. Taking the best combination of ZM, GF, and HM into
account, a feature vector of size 70 and a maximum overall accuracy of 93.7% are
obtained with MLP. The next best performer is SVM with 93% at a feature vector of
size 78, followed by LMT with a feature vector of size 114 and accuracy of 92.1%.
However, the concern with MLP is the time required to build the model when the
feature vector is large. In terms of model building time, SVM performs better
than MLP and LMT. For the other classifiers, accuracy remains below 85%.
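The growth of the ZM feature vector with order, which drives the Hughes effect noted above, can be counted directly from the Table 1 convention:

```python
def zm_feature_count(max_order):
    """Number of |Z_uv| features up to max_order, counting pairs (u, v)
    with 0 <= v <= u and u - v even, as in Table 1."""
    return sum(u // 2 + 1 for u in range(max_order + 1))

# order 20 gives the 121 ZM features listed in Table 1; the count
# grows quadratically with order, while accuracy saturates much earlier
sizes = {n: zm_feature_count(n) for n in (10, 12, 13, 20)}
```

Appending the 7 GF and 7 HM values to a ZM subset gives combined vector sizes in the range reported in Table 3.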

Table 3 Comparison of results in terms of accuracy (%)

Classifier  ZM (121)  ZM best result (a)  ZM + GF (b)  ZM + HM (b)  ZM best + HM + GF (c)  All features (135)  Size (a)/(b)/(c)
SVM         89.4      89.8                92.4         90           93                     92.7                64/71/70
MLP         89.5      89                  89.7         89.8         93.7                   92.7                56/63/78
NB          76.5      81.8                84.5         82.2         84.5                   80.7                56/63/70
BN          74.7      78.7                82           79.4         82.8                   82.8                56/63/70
k-NN        81.1      84.4                83.9         84.2         84                     84                  56/63/70
LMT         87.7      87.7                90.3         90.1         92.1                   91.2                100/117/114
J48         79        73.9                79.1         75.6         79.3                   79.3                64/71/78

3.3 Analysis of Dataset Size Variation

For the proposed feature set and considering 10 classes only, the effect of
varying the ISL sample size is analyzed. The training/test split is 75/25
percent. The dataset is varied from 10 samples/class (equal to the number of
classes) to 90 samples/class (nine times the number of classes). As shown in
Fig. 5, even with a small number of samples, LMT, SVM, and MLP outperform NB.
Performance increases linearly for MLP and NB, and it increases considerably for
SVM when the size is increased from 30 to 50 samples/class. Therefore,
increasing the dataset size improves classifier performance. The performance of
LMT varies only nominally with the number of samples per class; it shows good
results even for smaller datasets.

Fig. 5 Effect of variation of dataset size on classifier performance
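The dataset-size experiment can be sketched as a learning curve. Here synthetic feature vectors and a nearest-centroid classifier stand in for the real features and the classifiers compared above, so the exact numbers are illustrative only:

```python
import numpy as np

rng = np.random.default_rng(1)
n_classes, n_feat = 10, 20                  # 10 classes, as in Sect. 3.3
centers = 3.0 * rng.normal(size=(n_classes, n_feat))  # synthetic class means

def accuracy(samples_per_class):
    """Per-class 75/25 split; nearest-centroid stands in for the
    real classifiers."""
    X = centers[:, None, :] + rng.normal(
        size=(n_classes, samples_per_class, n_feat))
    cut = int(0.75 * samples_per_class)
    train, test = X[:, :cut], X[:, cut:]
    means = train.mean(axis=1)              # one centroid per class
    # distance of every test vector to every class centroid
    d = np.linalg.norm(test[:, :, None, :] - means[None, None, :, :], axis=3)
    pred = d.argmin(axis=2)
    return (pred == np.arange(n_classes)[:, None]).mean()

curve = {m: accuracy(m) for m in (10, 30, 50, 90)}
```

Plotting `curve` against samples/class reproduces the qualitative shape of Fig. 5: accuracy rises with training data and flattens once each class is well represented.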

4 Conclusion

From the results presented in the previous section, the following conclusions are
drawn:
• Internal edge details improve the results, as alphabets with similar outer
boundaries can only be distinguished by including internal details. Central
alignment of the resized images also helps in improving the accuracy.
• Combining geometric features and Hu moments enhances the performance of
Zernike moments.
• For higher orders of ZM, the feature vector size increases considerably while
no significant improvement in accuracy is achieved. In the case of Naive Bayes
in particular, accuracy decreases due to the considerable increase in feature
vector size.
• For the combined feature vector, SVM, LMT, and MLP show similar results and
are better than BN.
• The performance of LMT is better than that of all the other classifiers even
for smaller feature sets, and it performs on par with MLP and SVM for the
combined feature set. LMT, MLP, and SVM are capable of handling larger feature
vectors.

References

1. Hasan, H. and Kareem, S. A., Static Hand Gesture Recognition using Neural Networks,
Artificial Intelligence Review, 37, 1–35 (2012).

2. Zhang, D. and Lu, G., Review of Shape Representation and Description Techniques, Journal of
the Pattern Recognition Society, 37, 1–90 (2004).
3. Kar, P. and Raina, A. M., Semantic Structure of the Indian Sign Language, International
Conference on South Asian Languages, 1–23 (2008).
4. Ibraheem, P. A. and Khan, R. Z., Survey on Various Gesture Recognition Technologies and
Techniques, International Journal of Computer Applications 50(7), 38–44 (2012).
5. Shu, H., Luo, L. and Coatrieux, J. L., Moment-based Approaches in Imaging. Part 1: Basic
Features, IEEE Engineering in Medicine and Biology Magazine, 25, 70–74 (2007).
6. Potocnik, B., Assessment of Region-based Moment Invariants for Object Recognition, IEEE
International Symposium on Multimedia Signal Processing and Communications, 27–32
(2006).
7. Nallasivan, G., Janakiraman, S. and Ishwarya, Comparative Analysis of Zernike Moments
With Region Grows Algorithm on MRI Scan Images for Brain Tumor Detection, Australian
Journal of Basic and Applied Sciences, 9, 1–7 (2015).
8. Kim, H. S. and Lee, H. K., Invariant Image Watermarking using Zernike Moments, IEEE
Trans. Image Processing, 13, 766–775 (2003).
9. Sabhara, R. K., Lee, C .P. and Lim, K. M., Comparative Study of Hu Moments and Zernike
Moments in Object Recognition, Smart Computing Review, 3, 166–173 (2013).
10. Priya, S. P. and Bora, P. K., A Study on Static Hand Gesture Recognition using Moments,
IEEE International Conference on Signal Processing and Communication, 1–5 (2013).
11. Khurana, G., Joshi, G. and Kaur, J., Static Hand Gestures Recognition System using Shape
Based Features, Recent Advances in Engineering and Computational Sciences, 1–4 (2014).
12. Mingqiang, Y., Kidiyo, K. and Joseph, R., A Survey of Shape Feature Extraction Techniques,
Pattern Recognition Techniques Technology and Applications, 25, 43–90 (2008).
13. Singh, C., Walia, E. and Upneja, R., Accurate Calculation of Zernike moments, Information
Sciences, 233, 255–275 (2013).
14. Hosny, K. M., A Systematic Method for Fast Computation of Accurate Full and Subsets
Zernike Moments, Information Sciences, 180, 2299–2313 (2010).
15. Kotsiantis, S., Zaharakis, I. and Pintelas, P., Supervised Machine Learning: A Review of
Classification and Combining Techniques, Artificial Intelligence Review, 26, 159–190
(2006).
