A Summary of Literature Review Face Recognition

Tt Tt (Suan Sunandha Rajabhat University) and Dzulkifli Mohamad (Universiti Teknologi Malaysia)
1. Introduction

Face recognition is one of the few biometric methods that possess the merits of both high accuracy and low intrusiveness: it has the accuracy of a physiological approach without being intrusive. Over the past 30 years, many researchers have proposed different face recognition techniques, motivated by the increasing number of real-world applications that require the recognition of human faces. Several problems make automatic face recognition a very difficult task, since the face images of a person input to the database are usually acquired under different conditions. Automatic face recognition must therefore cope with numerous variations among images of the same face due to changes in parameters such as pose, illumination, expression, motion, facial hair, glasses, and background [1][2].

Face recognition technology is advanced enough to be applied in many commercial applications such as personal identification, security systems, image and film processing, psychology, human-computer interaction, entertainment systems, smart cards, law enforcement, and surveillance. A general problem of face

Figure 1. Configuration of a generic face recognition

2. Face recognition approach

Face recognition can be done on both still images and video sequences, and has its origin in still-image face recognition. The approaches to face recognition for still images can be categorized into three main groups: the holistic approach, the feature-based approach, and the hybrid approach [7]:
This paper has not been revised and corrected according to reviewers' comments. Copyright PARS'07. Postgraduate Annual Research Seminar 2007 (3-4 July 2007).
In this paper, we review some methods from these three approaches: the Hidden Markov Model (HMM) method, the neural network method, the Eigenface method, and the Fisherface method.

Figure 3. HMM for face recognition

Each face image of width W and height H is divided into overlapping blocks of height L and the same width. The block extraction is shown in Fig. 4 [9]. The amount of overlap P has a significant effect on the recognition rate, since features are captured independently of vertical position. The magnitude of L is also important: a small L assigns insufficient discriminating information to the observation vector, while a large value of L increases the chances of cutting across a feature. Therefore, it is important to find a good value for L. Once blocks are extracted from the image, a set of DCT coefficients is calculated for each block. When each block is transformed with the DCT, the most important coefficients, those with low frequencies, are concentrated in a small area of the DCT domain. The authors of [9] use a 12x3 window to pick out this significant part of the signal energy. In this way, the size of the observation vector is reduced significantly, which makes the system very efficient while still retaining a good detection rate. In the training phase, the image is segmented from top to bottom, where each segment corresponds to a state, and the initial observation probability matrix B is obtained from the observation vectors associated with each state. Once B is obtained, the initial values of A and π are set given the left-to-right structure of the face.

Figure 4. Block extraction from image

The face image is recognized if, given the Markov model, the probability of the observation symbols is maximum. For the experiments in the paper, 400 images of 40 individuals, with 10 face images per individual, are used. The image database contains face images with different expressions, hair styles, eyewear and head orientations. The system achieves 84% correct classification with L = 10 and P = 9, while the eigenfaces approach achieves 73% correct classification on the same dataset. Considering this, the HMM approach performs somewhat better than the eigenfaces method for images with variations.

3.2 Neural Network Method

Neural network-based approaches learn from example images and rely on machine-learning techniques to find the relevant characteristics of face images. The learned characteristics, in the form of discriminant functions (i.e. non-linear decision surfaces), are subsequently used for face recognition. Conventionally, face images are projected to a low-dimensional feature space and a nonlinear decision surface is formed using multilayer neural networks for classification and recognition [10]. Neural networks have also been used successfully for the face recognition problem [11],[12],[10]. The advantage of using neural networks for face recognition is that the networks can be trained to capture more knowledge about the variation of face patterns, thereby achieving good generalization [13]. The main drawback of this technique is that the networks have to be extensively tuned to get exceptional performance. Among the neural network approaches to face recognition, the multilayer perceptron (MLP) with the back-propagation (BP) algorithm has been used most often [14]. However, the convergence of MLP networks is slow and the global minimum of the error space may not always be reached [11]. On the other hand, RBF neural networks have fast learning ability [15] and the best-approximation property [16], so in recent times many researchers have used RBF networks for face recognition, as shown in Figure 5 [17],[18],[13]. However, their success rates are not so promising, as the error rates vary from 5 to 9% under variations of pose, orientation, scale and lighting [13]. This may be due to the fact that the selection of the centers of the hidden-layer neurons does not capture knowledge about the distribution of the training patterns and the variations of face pose, orientation and lighting [19].
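As a concrete illustration of the block-extraction and DCT feature scheme described above for the HMM method, the following sketch is our own minimal version, not the code of [9]: the function names, the ORL-style 112x92 image size, and the choice of a 3x12 low-frequency corner (our reading of the "12x3 window" in the text) are assumptions.

```python
import numpy as np

def dct2(block):
    """Orthonormal 2-D DCT-II of a block, built from 1-D DCT matrices."""
    def dct_matrix(n):
        k = np.arange(n).reshape(-1, 1)
        i = np.arange(n).reshape(1, -1)
        m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
        m[0, :] = 1.0 / np.sqrt(n)
        return m
    h, w = block.shape
    return dct_matrix(h) @ block @ dct_matrix(w).T

def extract_observations(image, L=10, P=9, keep=(3, 12)):
    """Slide a full-image-width window of height L down the image with
    overlap P; keep a keep[0] x keep[1] corner of low-frequency DCT
    coefficients as the observation vector of each block."""
    H, _ = image.shape
    step = L - P                           # vertical shift between blocks
    obs = []
    for top in range(0, H - L + 1, step):
        coeffs = dct2(image[top:top + L, :])
        obs.append(coeffs[:keep[0], :keep[1]].ravel())
    return np.array(obs)

obs = extract_observations(np.random.rand(112, 92))   # ORL-sized image
print(obs.shape)                                      # (blocks, 36)
```

Each row of `obs` would then serve as one observation vector in the top-to-bottom, left-to-right HMM training described above.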
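For illustration only, here is a minimal RBF classifier in the spirit of the RBF approaches above, not the specific networks of [17],[18],[13]: fixed centers, a Gaussian hidden layer, and output weights fitted by linear least squares, run on synthetic feature vectors (all names and parameter choices are ours).

```python
import numpy as np

rng = np.random.default_rng(0)

def rbf_design(X, centers, sigma):
    """Gaussian activations: Phi[j, k] = exp(-||x_j - c_k||^2 / (2 sigma^2))."""
    sq = np.square(X[:, None, :] - centers[None, :, :]).sum(axis=-1)
    return np.exp(-sq / (2.0 * sigma ** 2))

def fit_rbf(X, y, centers, sigma):
    """Fit output weights to one-hot class targets by linear least squares."""
    Phi = rbf_design(X, centers, sigma)
    T = np.eye(int(y.max()) + 1)[y]
    W, *_ = np.linalg.lstsq(Phi, T, rcond=None)
    return W

def predict_rbf(X, centers, sigma, W):
    return (rbf_design(X, centers, sigma) @ W).argmax(axis=1)

# toy data: two well-separated classes of 30-dimensional "feature vectors"
X = np.vstack([rng.normal(0.0, 0.5, (20, 30)), rng.normal(3.0, 0.5, (20, 30))])
y = np.array([0] * 20 + [1] * 20)
centers = X[::5]                 # naive center choice: every 5th training sample
W = fit_rbf(X, y, centers, sigma=4.0)
acc = (predict_rbf(X, centers, 4.0, W) == y).mean()
print(acc)
```

As the text notes, the choice of centers is the weak point of this scheme; a data-driven selection (e.g. clustering) would be needed for real face features.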
3.3 Eigenface Method

In 1991, Turk and Pentland used PCA projections as the feature vectors to solve the problem of face recognition, using the Euclidean distance as similarity function [20]. This system, later called Eigenfaces, was the first eigenspace-based face recognition approach, and from then on many eigenspace-based systems have been proposed using different projection methods and similarity functions. In particular, Belhumeur et al. proposed in 1997 the use of FLD as the projection algorithm in the so-called Fisherfaces system [21]. In all standard eigenspace-based approaches a similarity function, which works as a nearest-neighbor classifier [22], is employed.

In 1997, Pentland and Moghaddam proposed a differential eigenspace-based approach that allows the application of statistical analysis in the recognition process [23]. The main idea is to work with differences between face images, rather than with single face images. In this way the recognition problem becomes a two-class problem, because the so-called "differential image" contains information on whether the two subtracted images belong to the same class or to different classes. In this case the number of training images per class increases, so that statistical information becomes available and a statistical classifier can be used for performing the recognition. The system proposed in [23] used dual PCA projections and a Bayesian classifier.

Eigenspace-based approaches approximate the face vectors (face images) by lower-dimensional feature vectors. These approaches include an off-line or training phase, in which the projection matrix W ∈ R^(N×m), the one that achieves the dimensional reduction, is obtained using all the database face images. In the off-line phase, the mean face x̄ and the reduced representation p_k of each database image are also calculated. The recognition process works as follows. A preprocessing module transforms the face image into a unitary vector (the normalization module in the case of Fig. 1) and then subtracts the mean face. The resulting vector is projected using the projection matrix, which depends on the eigenspace method being used (PCA, FLD, etc.). This projection corresponds to a dimensional reduction of the input, starting with vectors in R^N (where N is the dimension of the image vectors) and obtaining projected vectors q in R^m, with m < N (usually m << N). Then the similarity of q with each of the reduced vectors p_k (p_k ∈ R^m) is computed using a certain criterion of similarity (Euclidean distance, for example). The class of the most similar vector is the result of the recognition process, i.e. the identity of the face. In addition, a rejection system for unknown faces is used if the similarity measure is not good enough, as shown in Figure 6 [24].

Figure 6. Block diagram of a generic eigenspace-based face recognition system

3.4 Fisherface Method

The Fisherface algorithm considers the ratio between the variation of one person and that of another person. It maximizes the determinant of the between-class scatter matrix while simultaneously minimizing the determinant of the within-class scatter matrix.

For a C-class problem, the between-class scatter matrix is defined as follows:

S_B = sum_{i=1}^{C} Pr(Ω_i) (µ_i − µ)(µ_i − µ)^T

where Pr(Ω_i) is the prior class probability, µ_i is the mean sample of class Ω_i, and µ is the mean sample of all classes.

The within-class scatter matrix is defined as follows:

S_W = sum_{i=1}^{C} Pr(Ω_i) Σ_i

where

Σ_i = (1/N_i) sum_{x ∈ Ω_i} (x − µ_i)(x − µ_i)^T

is the covariance matrix of the within-class samples and N_i is the number of samples in class Ω_i.

The Fisher criterion function is defined as follows:

J(W) = |W^T S_B W| / |W^T S_W W|
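The off-line phase and the projection-plus-nearest-neighbor matching described in Section 3.3 can be sketched as follows. This is a minimal illustration on synthetic data; the `recognize` helper and the rejection threshold are ours, and the eigenvectors are obtained from the small A A^T matrix as in the usual eigenfaces trick.

```python
import numpy as np

rng = np.random.default_rng(1)

# off-line phase: mean face, projection matrix W, reduced vectors p_k
faces = rng.random((40, 112 * 92))       # 40 flattened database faces
mean_face = faces.mean(axis=0)
A = faces - mean_face
eigvals, V = np.linalg.eigh(A @ A.T)     # eigenvectors of the small 40x40 matrix
order = np.argsort(eigvals)[::-1][:20]   # keep m = 20 components (m << N)
W = A.T @ V[:, order]                    # lift to image space: N x m
W /= np.linalg.norm(W, axis=0)
p = A @ W                                # reduced representation of each image

def recognize(face, reject_threshold=np.inf):
    """Project a probe and return the index of the nearest p_k (Euclidean
    distance), or -1 when the best match exceeds the rejection threshold."""
    q = (face - mean_face) @ W
    d = np.linalg.norm(p - q, axis=1)
    k = int(d.argmin())
    return k if d[k] <= reject_threshold else -1

print(recognize(faces[7]))               # a database image matches itself
```

A finite `reject_threshold` implements the rejection system for unknown faces mentioned in the text.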
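Under the scatter-matrix and Fisher-criterion definitions above, W_fld can be computed as a sketch on synthetic low-dimensional data with equal priors, so that S_W is non-singular here; with real face images the PCA pre-reduction discussed in the text would be applied first. All variable names are ours.

```python
import numpy as np

rng = np.random.default_rng(2)

C, n_i, d = 3, 8, 5                       # C classes, n_i samples each, d dims
X = np.vstack([rng.normal(i, 1.0, (n_i, d)) for i in range(C)])
labels = np.repeat(np.arange(C), n_i)

mu = X.mean(axis=0)                       # mean sample of all classes
S_B = np.zeros((d, d))
S_W = np.zeros((d, d))
for i in range(C):
    Xi = X[labels == i]
    mu_i = Xi.mean(axis=0)
    pr_i = len(Xi) / len(X)               # prior class probability Pr(Omega_i)
    S_B += pr_i * np.outer(mu_i - mu, mu_i - mu)
    S_W += pr_i * np.cov(Xi.T, bias=True) # Sigma_i, the class covariance

# solve the generalized eigenvalue problem S_B w = lambda S_W w
eigvals, eigvecs = np.linalg.eig(np.linalg.solve(S_W, S_B))
order = np.argsort(eigvals.real)[::-1][:C - 1]   # at most C-1 useful directions
W_fld = eigvecs.real[:, order]
print(W_fld.shape)                        # d x (C-1)
```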
Then the projective matrix W_fld can be chosen as follows:

W_fld = arg max_W |W^T S_B W| / |W^T S_W W|

W_fld can be calculated by solving the generalized eigenvalue problem:

S_B w_i = λ_i S_W w_i

In face recognition applications, because the rank of S_W is at most N − c, where N is the number of images in the training set and is typically much smaller than the number of pixels in each image, the within-class scatter matrix S_W is always singular. To overcome this problem, PCA is first utilized to reduce the dimension of the images to N − c; the recalculated S_W is then non-singular, and FLD can be utilized to find the projective matrix W_fld, which is referred to as Fisherfaces. Figure 7 is a comparison of PCA and FLD for a two-class problem in which the samples from each class are randomly perturbed in a direction perpendicular to a linear subspace [25]-[28].

Figure 7. A comparison of principal component analysis (PCA) and Fisher's linear discriminant (FLD) [28]

4. Face Expression Recognition

There has been considerable research on facial expression recognition. The facial expressions under examination were defined by psychologists as a set of six basic facial expressions (anger, disgust, fear, happiness, sadness, and surprise) [44]. In order to make the recognition procedure more standardized, a set of muscle movements known as Facial Action Units (FAUs) that produce each facial expression was created, forming the so-called Facial Action Coding System (FACS) [45]. These FAUs are combined in order to create the rules responsible for the formation of facial expressions, as proposed in [46].

A survey of the research on facial expression recognition can be found in [47] and [48]. The reported approaches can be divided into two main directions, the feature-based ones and the template-based ones, according to the method they use for facial information extraction. The feature-based methods use texture or geometrical information as features for expression information extraction. The template-based methods use 3-D or 2-D head and facial models as templates for expression information extraction. An overview of 2D and 3D face recognition algorithms is summarized in Table 2.

Tanaka et al. [49] also perform curvature-based segmentation and represent the face using an Extended Gaussian Image (EGI). Recognition is then performed using a spherical correlation of the EGIs. Hesher et al. [50] explore PCA-type approaches using different numbers of eigenvectors and image sizes. The image data set used has 6 different facial expressions for each of 37 subjects. The performance figures reported result from using multiple images per subject in the gallery. This effectively gives the probe image more chances to make a correct match, and is known to raise the recognition rate. Medioni et al. [51] perform 3D face recognition using iterative closest point (ICP) matching of the probe face surface against the gallery face surface. Whereas most of the works covered here acquired 3D data using structured light, this work uses a stereo-based system. An Equal Error Rate (EER) of "better than 2%" is reported.
Moreno and co-workers [52] approach 3D face recognition by first performing a segmentation based on Gaussian curvature and then creating a feature vector based on the segmented regions. They report results on a dataset of 420 face meshes representing 60 different persons, with some sampling of different expressions and poses for each person: 78% rank-one recognition on the subset of frontal views, and 93% overall rank-five recognition.

Lee and co-workers perform 3D face recognition by locating the nose tip and then forming a feature vector based on contours along the face at a sequence of depth values [53]. They report 94% correct recognition at rank 5, and do not report rank-one recognition. Given the relatively small dataset, 35 persons, and the recognition rates reported for other works, it would appear that the contour-oriented method is not as powerful as other methods.

Bai et al. [54] use an extended Fisherface with a 3D morphable model to derive multiple images from a single example image, which form the training set for Fisherface. Experiments on the ORL and UMIST face databases demonstrate an impressive performance improvement of their method over the conventional benchmarks for face recognition trained from one image and tested under some expression, illumination and slight pose variations.

Passalis et al. [55] perform face recognition by intraclass retrieval of nonrigid 3D objects. A novel 3D object retrieval method is presented which uses a parameterized annotated model of the shape of the class objects, incorporating its main characteristics and transformed into the wavelet domain. The method does not require user interaction, achieves high accuracy, is efficient for use with large databases, and is suitable for nonrigid object classes. They report an average verification rate of 95.2% at a 10^-3 false accept rate on the Grand Challenge v2 database.

Lao et al. [56] perform 3D face recognition using a sparse depth map constructed from stereo images. Iso-luminance contours are used for the stereo matching. Both 2D edges and iso-luminance contours are used in finding the irises, so in this limited sense the approach is multi-modal. They report 87% to 96% recognition
using a dataset of ten persons, with four images taken at each of nine poses for each person. However, no attempt is made to deal with variation in facial expression.

Beumier and Acheroy [57] perform multi-modal recognition using a weighted sum of 3D and 2D similarity measures. They report on experiments with a dataset of 26 persons in the gallery and 29 persons in the probe set, achieving recognition performance as high as a 2% equal error rate (EER) for multi-modal recognition, compared to 4% for 3D alone and 8% for 2D alone. The 3D+2D data was acquired for a larger set of 120 persons in each of two different acquisition sessions.

Wang et al. [58] use Gabor filter responses in 2D and "point signatures" in 3D to perform multi-modal face recognition. The 2D and 3D features together form a feature vector, and classification uses support vector machines (SVMs). Experiments were performed with images from 50 subjects, six images per subject, with pose and expression variations, and recognition rates exceeded 90%.

Bronstein et al. [59] use an isometric transformation approach to 3D face analysis in an attempt to better cope with variation in face shape due to facial expression. One method they propose is effectively multi-modal 2D+3D recognition using eigen decomposition of flattened textures and canonical images. They show examples of correct and incorrect recognition by different algorithms, but do not report any overall quantitative performance results for any algorithm.

Chang et al. [60] report on PCA-based recognition experiments performed using 3D and 2D images from 200 persons. One experiment uses a single set of images for each person as the probes, and another uses a larger set of 676 probes. Results in both experiments were approximately 99% rank-one recognition for multi-modal 3D+2D, 94% for 3D alone and 89% for 2D alone. The combined result was obtained using a weighted sum of the distances from the individual 3D and 2D face spaces. This work represents the largest experimental study yet reported in the literature either for 3D face alone or for multi-modal 2D+3D.

Mian et al. [61] use multimodal (2D and 3D) and hybrid (feature-based and holistic) matching to achieve efficiency and robustness to facial expressions. A novel 3D Spherical Face Representation (SFR) is used in conjunction with the SIFT descriptor, which quickly eliminates a large number of candidate faces at an early stage for efficient recognition with large galleries. The approach automatically segments the eyes-forehead and nose regions. The results of all the matching engines are fused at the metric level to achieve higher accuracy. This multimodal hybrid algorithm performs better than the other approaches, with identification rates of 99.02% and 95.37% for probes with neutral and non-neutral expressions, respectively.

5. Summary of review

Approaches to 3D face recognition that use a purely curvature-based representation can handle size change between faces, but run into problems with change of facial expression between the enrollment image and the image to be recognized. A facial recognition system should be able to handle variation in expression. The seriousness of this problem is illustrated in the results shown in Figure 2, an experiment that focuses on the effects of expression change. Recognition was done with PCA-based 2D and 3D algorithms. The upper cumulative match characteristic (CMC) curves represent performance with time lapse only between gallery and probe.

A main obstacle to experimental validation and comparison of 3D face recognition is the lack of appropriate datasets. Desirable properties of such a dataset include: (1) a large number and demographic variety of people represented, (2) images of a given person taken at repeated intervals of time, (3) images of a given person that represent substantial variation in facial expression, (4) high spatial resolution, for example depth resolution of 0.1 mm or better, and (5) low frequency of sensor-specific artifacts in the data.
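The rank-one and CMC figures quoted throughout this review, as well as the weighted-sum 2D+3D distance fusion used e.g. by [60], can be illustrated on synthetic gallery-probe distance matrices (the function names and the toy data are ours):

```python
import numpy as np

rng = np.random.default_rng(3)

def cmc(dist, true_ids):
    """Cumulative match characteristic from a (probes x gallery) distance
    matrix: entry r-1 is the fraction of probes whose true match appears
    among the r closest gallery entries."""
    ranks = np.argsort(dist, axis=1)
    hit = np.array([int(np.where(ranks[i] == true_ids[i])[0][0])
                    for i in range(len(true_ids))])
    return np.array([(hit < r).mean() for r in range(1, dist.shape[1] + 1)])

n = 50                                    # 50 gallery subjects, one probe each
true_ids = np.arange(n)
d2 = rng.random((n, n))                   # synthetic 2D probe-gallery distances
d3 = rng.random((n, n))                   # synthetic 3D probe-gallery distances
d2[true_ids, true_ids] *= 0.1             # true matches tend to be closer
d3[true_ids, true_ids] *= 0.1
fused = 0.5 * d2 + 0.5 * d3               # weighted-sum fusion of 2D and 3D
curve = cmc(fused, true_ids)
print(curve[0])                           # rank-one recognition rate
```

The curve is non-decreasing and always reaches 1.0 at full gallery size; rank-one is its first entry.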
6. Future work

In reviewing past research, we found that many approaches can be used for face recognition, each method having different advantages and disadvantages, such as the local, global, and hybrid methods. There are two types of image for face recognition techniques: still images and video images (still-image sequences). However, we found some problems in face recognition systems, such as: (1) the pose problem, since the face image cannot be controlled at capture time and many pose variations change from moment to moment; (2)

References

[3] Chellappa R., Wilson C.L., and Sirohey S., Human and Machine Recognition of Faces: A Survey, Proc. IEEE, 83, 1995, 705-741.

[4] Zhang J., Yan Y., and Lades M., Face Recognition: Eigenfaces, Elastic Matching, and Neural Nets, Proc. IEEE, 85(9), 1997, 1423-1435.

[5] Torres L., Is there any hope for face recognition?, Technical University of Catalonia, Spain, 2004, 1-4.

[6] Kim H. H., Survey paper: Face Detection and Face Recognition, Department of Computer Science, University of Saskatchewan, 2004, 1-7.
[15] Moody J., and Darken C.J., Fast learning in networks of locally-tuned processing units, Neural Computation, 1, 1989, 281-294.

[16] Girosi F., and Poggio T., Networks and the best approximation property, Biol. Cybern., 63, 1990, 169-176.

[17] Howell A. J., and Buxton H., Learning identity with radial basis function networks, Neurocomputing, 20, 1998, 15-34.

[18] Ranganath S., and Arun K., Face recognition using transform features and neural networks, Pattern Recognition, 30, 1997, 1615-1622.

[19] Sing J. K., Basu D. K., Nasipuri M., and Kundu M., Face recognition using point symmetry distance-based RBF network, Applied Soft Computing, 7, January 2007, 58-70.

[20] Turk M., and Pentland A., Eigenfaces for Recognition, J. Cognitive Neuroscience, 3(1), 1991, 71-86.

[21] Belhumeur P.N., Hespanha J.P., and Kriegman D.J., Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection, IEEE Trans. Pattern Analysis and Machine Intelligence, 19(7), July 1997, 711-720.

[22] Duda R.O., Hart P.E., and Stork D.G., Pattern Classification, Second Edition, Wiley, 2001.

[23] Pentland A., and Moghaddam B., Probabilistic Visual Learning for Object Representation, IEEE Trans. Pattern Analysis and Machine Intelligence, 19(7), July 1997, 696-710.

[24] Ruiz-del-Solar J., and Navarrete P., Eigenspace-based Face Recognition: A comparative study of different approaches, IEEE Trans. on Systems, Man, and Cybernetics C, 16(7), 2002, 817-830.

[25] Belhumeur P. N., Hespanha J. P., and Kriegman D. J., Eigenfaces vs. Fisherfaces: Recognition using class specific linear projection, IEEE Trans. Pattern Anal. Machine Intell., 19, May 1997, 711-720.

[26] Bai X. M., Yin B. C., Shi Q., and Sun Y. F., Face recognition using extended Fisherface with 3D morphable model, Proceedings of the Fourth International Conference on Machine Learning and Cybernetics, Guangzhou, August 2005, 18-21.

[27] Duda R., and Hart P., Pattern Classification and Scene Analysis, New York: Wiley, 1973.

[28] Belhumeur P. N., Hespanha J. P., and Kriegman D. J., Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection, IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(7), July 1997.

[29] Kittler J., Hatef M., Duin R.P.W., and Matas J., On combining classifiers, IEEE Trans. Pattern Analysis and Machine Intelligence, 20(3), 1998, 226-239.

[30] Zhou Z.H., Wu J., and Tang W., Ensembling neural networks: Many could be better than all, Artificial Intelligence, 137(1-2), 2002, 239-263.

[31] Hallinan P.W., et al., Two- and Three-Dimensional Patterns of the Face, Natick, MA: A K Peters, Ltd., 1999.

[32] Martinez A.M., Recognizing imprecisely localized, partially occluded, and expression variant faces from a single sample per class, IEEE Trans. Pattern Analysis and Machine Intelligence, 25(6), 2002, 748-763.

[33] Tan X., Chen S.C., Zhou Z.-H., and Zhang F., Recognizing partially occluded, expression variant faces from single training image per person with SOM and soft kNN ensemble, IEEE Transactions on Neural Networks, 16(4), 2005, 875-886.

[34] Heisele B., Serre T., Pontil M., and Poggio T., Component-based face detection, Proceedings, IEEE Conference on Computer Vision and Pattern Recognition, 1, 2001, 657-662.

[35] Costen N.P., Cootes T.F., and Taylor C.J., Compensating for ensemble-specific effects when building facial models, Image and Vision Computing, 20, 2002, 673-682.

[36] Lanitis A., Taylor C.J., and Cootes T.F., Automatic face identification system using flexible appearance models, Image Vis. Comput., 13, 1995, 393-401.
[46] Pantic M., and Rothkrantz L. J. M., Expert system for automatic analysis of facial expressions, Image Vis. Comput., 18(11), Aug. 2000, 881-905.

[56] Lao S., Sumi Y., Kawade M., and Tomita F., 3D template matching for pose invariant face recognition using 3D facial model built with isoluminance line based stereo vision, Int'l Conf. on Pattern Recognition (ICPR 2000), 2000, 911-916.

[57] Beumier C., and Acheroy M., Face verification from 3D and grey level cues, Pattern Recognition Letters, 22, 2001, 1321-1329.

[58] Wang Y., Chua C., and Ho Y., Facial feature detection and face recognition from 2D and 3D images, Pattern Recognition Letters, 23, 2002, 1191-1202.

[60] Chang K., Bowyer K., and Flynn P., Face recognition using 2D and 3D facial data, 2003 Multimodal User Authentication Workshop, December 2003, 25-32.

[61] Mian A., Bennamoun M., and Owens R., An efficient Multimodal 2D-3D Hybrid approach to automatic face recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence, January 2007, 1-34.