Expert System For Speaker Identification Using Lip Features With PCA
Abstract--Biometric authentication techniques based on traits such as lips, face, and eyes are more reliable and efficient than conventional authentication techniques such as passwords, tokens, cards, and personal identification numbers. This paper focuses on speaker identification based on lip features. We present a detailed comparative analysis of speaker identification using lip features, Principal Component Analysis (PCA), and neural network classifiers. PCA is used for feature extraction from six geometric lip features: the height of the outer corners of the mouth, the width of the outer corners of the mouth, the height of the inner corners of the mouth, the width of the inner corners of the mouth, the height of the upper lip, and the height of the lower lip. These features are then used to train different neural network classifiers: Back Propagation (BP), Radial Basis Function (RBF), and Learning Vector Quantization (LVQ). These approaches are applied to the “TULIPS1 database” (Movellan, 1995), a small audiovisual database of 12 subjects saying the first 4 digits in English. After detailed analysis and evaluation, a maximum speaker recognition accuracy of 91.07% is obtained using PCA with RBF. Speaker identification has a wide range of applications, such as audio processing, medical data, finance, and array processing.

[...] Shape Model (ASM) features and model the lip shape and the intensity profile vector along the normal to the model points by using ASM, while Matthews et al. [6] consider the intensity variation inside the outer lip contour. Previous work has been done on Active Shape Model (ASM) [7] features, with the analysis made using a Hidden Markov Model (HMM) as the classifier. Algorithms have been developed that automatically extract lip areas from speaker images [9]. PCA is an approach used to approximate the original data with lower-dimensional feature vectors. In this paper, we propose a comparative analysis of speaker authentication using lip features, incorporating BP, LVQ, and RBF with PCA, in contrast to previous work, which mainly uses the Hidden Markov Model (HMM) [8]. The present work is a novel approach that uses different Artificial Neural Network (ANN) algorithms for speaker identification. The block diagram of the speaker identification process in this work is shown in Fig. 1.

[Fig 1: block diagram of the speaker identification process; surviving label: “TULIPS1 database” [1]]
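The combination of PCA with an RBF classifier described above can be illustrated with a minimal Python sketch. It is not the implementation used in this work: the data are synthetic stand-ins for the six geometric lip features, and the number of retained components (3), the noise level, the Gaussian width, the placement of one RBF center at each class mean, and the least-squares output layer are all assumptions chosen for illustration. The function names are likewise hypothetical.

```python
import numpy as np

def fit_pca(X, n_components):
    """Return the feature mean and the top principal directions of X."""
    mean = X.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(X, mean, components):
    """Map feature vectors onto the retained principal directions."""
    return (X - mean) @ components.T

def rbf_activations(Z, centers, width):
    """Gaussian hidden-layer outputs, one column per center."""
    sq_dist = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * width ** 2))

def fit_rbf(Z, labels, n_classes, width):
    """Assumed design: one center per class mean, linear output layer by least squares."""
    centers = np.stack([Z[labels == c].mean(axis=0) for c in range(n_classes)])
    targets = np.eye(n_classes)[labels]  # one-hot speaker labels
    hidden = rbf_activations(Z, centers, width)
    weights, *_ = np.linalg.lstsq(hidden, targets, rcond=None)
    return centers, weights

def predict_rbf(Z, centers, weights, width):
    """Pick the speaker whose output unit responds most strongly."""
    return (rbf_activations(Z, centers, width) @ weights).argmax(axis=1)

# Synthetic stand-in for the six lip measurements (NOT the TULIPS1 data).
rng = np.random.default_rng(0)
n_speakers, n_train, n_test = 12, 10, 4  # 12 subjects as in the text; split sizes are assumed
speaker_means = rng.uniform(10.0, 40.0, size=(n_speakers, 6))

def sample(n_per_speaker):
    """Draw noisy six-feature vectors around each speaker's mean."""
    labels = np.repeat(np.arange(n_speakers), n_per_speaker)
    X = speaker_means[labels] + rng.normal(0.0, 0.5, size=(labels.size, 6))
    return X, labels

X_train, y_train = sample(n_train)
X_test, y_test = sample(n_test)

# PCA is fitted on the training set only, then applied to both sets.
mean, components = fit_pca(X_train, n_components=3)
Z_train = project(X_train, mean, components)
Z_test = project(X_test, mean, components)

centers, weights = fit_rbf(Z_train, y_train, n_speakers, width=2.0)
predicted = predict_rbf(Z_test, centers, weights, width=2.0)
accuracy = (predicted == y_test).mean()
print(f"correct: {(predicted == y_test).sum()}/{y_test.size} ({accuracy:.2%})")
```

The printed ratio has the same form as a recognition rate reported as correct identifications over total test samples. Because each Gaussian unit responds only near its own center, the hidden layer acts as a set of local approximators, and training reduces to a single linear solve.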
D. Experimental Results:
The results of the recognition test with BP, RBF, and LVQ are shown in Table IV.

Fig 4. Learning by PCA along with BP.

TABLE IV. STATISTICAL DATA OF DIFFERENT NEURAL NETWORK TECHNIQUES

Methods             BP                 RBF                LVQ
Recognition Rate    89.88% (151/168)   91.07% (153/168)   87.5% (147/168)

The results show that the recognition performance of PCA with RBF is better than that of the other methods used.

IV. CONCLUSIONS

It can be concluded from the results that, among the classifiers considered, RBF outperforms BP and LVQ when used with PCA. The BP network achieved better accuracy than LVQ through its back-propagation mechanism, but RBF achieved the highest accuracy. A second conclusion concerns computational time: RBF took much less time than BP and LVQ because it is fast and requires fewer training samples, as it employs local approximators. Hence, using RBF in lip recognition increases accuracy and decreases computational time. In future, work can be done on an identification system that incorporates the features of both lips and speech. Accuracy can be improved by using other feature extraction techniques, and also by combining several feature extraction techniques into a single one.

V. REFERENCES
1. “TULIPS1 database” (Movellan, 1995): J. R. Movellan, “Visual Speech Recognition with Stochastic Networks”, in G. Tesauro, D. Touretzky, & T. Leen (eds.), Advances in Neural Information Processing Systems, Vol. 7, MIT Press, Cambridge, 1995.
6. [...] Pattern Analysis and Machine Intelligence, vol. 24, issue 2, pp. 198-213, Feb. 2002.
7. L. L. Mok, W. H. Lau, S. H. Leung, S. L. Wang and H. Yan, “Person Authentication Using ASM Based Lip Shape and Intensity Information”, Proc. 2004 IEEE International Conference on Image Processing (ICIP 2004), pp. 561-564, Singapore, Oct. 2004.
8. S. L. Wang and A. W. C. Liew, “ICA-Based Lip Feature Representation For Speaker Authentication”, Proc. 2008 Third International IEEE Conference on Signal-Image Technologies and Internet-Based System, pp. 763-767, 16-18 Dec. 2007.
9. K. L. Sum, W. H. Lau, S. H. Leung, Alan W. C. Liew and K. W. Tse, “A New Optimization Procedure for Extracting the Point-Based Lip Contour Using Active Shape Model”, Proc. 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing, Volume 03, pp. 1485-1488.
10. Guangming Dong, Jin Chen, Xuanyang Lei, Zuogui Ning, Dongsheng Wang, and Xiongxiang Wang, “Global-Based Structure Damage Detection Using LVQ Neural Network and Bispectrum Analysis”, Proc. 2005 Springer-Verlag, pp. 531-537, Berlin Heidelberg, 2005.
11. Jaakko Hollmén, Volker Tresp and Olli Simula, “A Learning Vector Quantization Algorithm For Probabilistic Models”, Proc. 2000 European Signal Processing Conference, Volume II, pp. 721-724.
12. Hui Kong, Xuchun Li, Lei Wang, Earn Khwang Teoh, Jian-Gang Wang, R. Venkateswarlu, “Generalized 2D Principal Component Analysis”, Proc. 2005 IEEE International Joint Conference on Volume 1, Aug. 2005.
13. S. Haykin (1994), Neural Networks: A Comprehensive Foundation, Upper Saddle River, NJ: Prentice Hall.
14. Harry Wechsler, Vishal Kakkad, Jeffrey Huang, Srinivas Gutta, V. Chen, “Automatic Video-based Person Authentication Using the RBF Network”, First International Conference on Audio- and Video-Based Biometric Person Authentication, 1997, pp. 85-92.
15. Xiaopeng Hong, Hongxun Yao, Yuqi Wan, Rong Chen, “A PCA Based Visual DCT Feature Extraction Method for Lip-Reading”, Proc. 2006 International Conference on Intelligent Information Hiding and Multimedia Signal Processing (IIH-MSP'06), pp. 321-326, Pasadena, CA, USA, 2006.