
Analysis of Neural Networks for Face Recognition Systems with Feature Extraction to Develop an Eye Localization Based Method


M.R.M. Rizk, Senior Member IEEE, and A. Taha
Alexandria University, Egypt, mrmrizk@ieee.org

Abstract: This paper provides an analysis of Multilayer Perceptron Backpropagation neural networks (MLP/BP NN), Radial Basis Function neural networks (RBF NN), and Multilayer Cluster neural networks (MCNN) applications in face recognition. The feature extraction methods involved in the analysis are the Discrete Wavelet Transform (DWT), the Discrete Radon Transform (DRT), the Discrete Cosine Transform (DCT) and the Principal Component Analysis (PCA) technique. The algorithms were developed using Matlab and tested on the ORL database. A new 2-stage face recognition system is also proposed, based on eye localization and a windowed face area used for recognition.

1. INTRODUCTION

Face recognition is a complicated object recognition problem because of the abundance of variations in facial expression, face position and lighting.

The different methods for solving the problem can be classified into: neural networks [1,2], template matching [3], PCA, geometric feature based matching, and algebraic moments. Hybrid systems constructed from two or more of the previous methods have also been built.

This paper analyzes and compares the performance of neural networks in face recognition. We tested the MLP/BP NN and the RBF NN, and compared them with the MCNN, used previously in character recognition [4]. The input data fed to the networks undergo a feature extraction stage, which helps in reducing dimensionality and removing redundancy in the input data, thus yielding high-speed recognition systems. The feature extraction techniques include the DWT, DRT, DCT and PCA. For the PCA method, we used two types of classifiers: the Euclidean classifier and the neural network classifier.

We have used the ORL database taken at the Olivetti Research Laboratory in Cambridge, U.K. It contains ten different images of each of 40 distinct subjects, with variations in facial expression (open/closed eyes, smiling/non-smiling) and facial details (glasses/no glasses). All the images were taken against a dark homogeneous background with the subjects in an upright frontal position, with tolerance for tilting and rotation of up to about 20 degrees and for scaling of up to about 10%. The images are greyscale with a resolution of 92x112.

We propose a 2-stage face recognition system. The first stage is an MLP/BP NN that detects the eye region in the presented faces; a useful windowed face region (the eye area stretched to include the eyebrows upwards and the mouth downwards) is then constructed and used for recognition in the second stage via a DCT RBF NN.

Fig. 1. Samples of the ORL database

2. APPLYING MLP BACKPROPAGATION NEURAL NETWORK

The MLP/BP NN may be trained to recognize face images directly. For example, if the images are 92x112, the number of network inputs would be 10304. To reduce the computational complexity, two common feature extraction techniques are applied. The first technique is based on the computation of a set of geometrical features from the face image. The second class is based on transforms, used for reducing dimensionality, obtaining multiresolution features, or obtaining features invariant to scaling, translation and rotation within the image.

We focus on the second technique, where DWT, DCT, DRT and PCA coefficients are used, together with normalization methods, for the sake of image compression.

The following steps are used for recognition using dimensionality reduction:



Step 1. The transform is used to reduce the input space from 10304 units (92x112).
Step 2. The image vectors are normalized.
Step 3. The training set used contains 5 samples per subject.
Step 4. The testing set used contains 10 samples per subject, 5 of which were not introduced in the training phase.
Step 5. The MLP/BP NN structure with the reduced number of input units is used, with one hidden layer of 50 units and 40 output units.
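As a concrete illustration of steps 1-5, the following is a minimal Python/scikit-learn sketch (the algorithms in the paper were developed in Matlab). The `extract_features` placeholder stands for whichever reducing transform is under test (DWT, interpolation, DRT or DCT, see sections 2.1-2.4); the data loading and all parameter choices other than the 50-unit hidden layer and 40 outputs are illustrative assumptions.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

def extract_features(img):
    # placeholder: identity flatten (10304 inputs); sections 2.1-2.4 swap in DWT/interpolation/DRT/DCT
    return img.ravel().astype(float)

def normalize(v):
    return v / (np.linalg.norm(v) + 1e-12)          # step 2: unit-norm image vectors

def train_and_test(faces):
    """faces: array of shape (40 subjects, 10 images, 112, 92); ORL loading is omitted here."""
    X_train, y_train, X_test, y_test = [], [], [], []
    for subject, images in enumerate(faces):
        for k, img in enumerate(images):
            x = normalize(extract_features(img))     # steps 1-2
            if k < 5:                                # step 3: 5 training samples per subject
                X_train.append(x)
                y_train.append(subject)
            X_test.append(x)                         # step 4: all 10 samples per subject are tested
            y_test.append(subject)
    # step 5: MLP/BP network, one hidden layer of 50 units, 40 output units (one per subject)
    net = MLPClassifier(hidden_layer_sizes=(50,), max_iter=2000).fit(X_train, y_train)
    return net.score(X_test, y_test)                 # fraction of correctly recognized test images
```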
2.1 Recognition Using DWT

We used D4 wavelets to achieve 93.75% image compression. The DWT is used to reduce the input space to 23x28 units, and steps 2 through 5 are then followed. The resulting recognition rate reached 93.25%.
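A sketch of this reduction using PyWavelets: two levels of a Daubechies-4 ('db4') decomposition, keeping only the coarse approximation sub-band. The 'periodization' boundary mode is an assumption made here so that each level halves the image exactly and the stated 23x28 (644-unit) size is reproduced.

```python
import numpy as np
import pywt

def dwt_features(img):
    """Two-level 2-D D4 wavelet decomposition of a face; keep only the approximation sub-band."""
    # 'periodization' keeps each level at exactly half the size: 112x92 -> 56x46 -> 28x23
    coeffs = pywt.wavedec2(img.astype(float), wavelet='db4', mode='periodization', level=2)
    approx = coeffs[0]                               # 28x23 approximation = 644 coefficients
    v = approx.ravel()
    return v / np.linalg.norm(v)                     # step 2: normalize the reduced image vector

print(dwt_features(np.random.rand(112, 92)).shape)    # (644,)
```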
2.2 Recognition Using Common Interpolation

Image compression is achieved using a common interpolation method based on the nearest-neighbour pixel. The input space is thus resized to 644 units and, following steps 2 through 5, the resulting recognition rate reached 90.25%.
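A minimal sketch of such a reduction: taking every fourth pixel is one simple nearest-neighbour scheme that yields 644 inputs from a 92x112 face. The exact resizing routine used in the paper is not specified, so this decimation factor is an assumption consistent with the 644-unit figure.

```python
import numpy as np

def nn_resize_features(img):
    """Nearest-neighbour down-sampling of a 112x92 face to 28x23 = 644 inputs."""
    small = img[::4, ::4]                  # keep every 4th row and column
    v = small.ravel().astype(float)
    return v / np.linalg.norm(v)           # step 2: normalization

print(nn_resize_features(np.random.rand(112, 92)).shape)   # (644,)
```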
2.3 Recognition Using DRT

We used the Radon transform to achieve around 83.6% dimensionality reduction. The Radon projections are taken over 180 degrees with a step of 10 degrees. The projection vector is truncated to a reasonable size so as to capture the most useful area of the face, leaving 1692 input units, and steps 2 to 4 are then followed. The resulting recognition rate reached 93.75%, which is even better than that of the DWT approach.
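A sketch of the Radon feature extraction with scikit-image: 18 projection angles (0 to 170 degrees in 10-degree steps), with each projection truncated to its central 94 samples so that the feature vector has the 1692 entries quoted above. Keeping the central samples is an assumption; the paper states only that the projections are truncated to the most useful area of the face.

```python
import numpy as np
from skimage.transform import radon

def radon_features(img, keep=94):
    """Radon projections over 180 degrees in 10-degree steps, truncated to 94 x 18 = 1692 inputs."""
    theta = np.arange(0, 180, 10)                               # 18 projection angles
    sino = radon(img.astype(float), theta=theta, circle=False)  # (n_positions, 18) sinogram
    mid = sino.shape[0] // 2
    central = sino[mid - keep // 2 : mid + keep // 2, :]        # central part of every projection
    v = central.ravel()
    return v / np.linalg.norm(v)                                # step 2: normalization

print(radon_features(np.random.rand(112, 92)).shape)              # (1692,)
```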
2.4 Recognition Using the DCT

We have used the DCT to achieve around 99.11% dimensionality reduction, by truncating the high frequency components and preserving only the low spatial frequency DCT components. A scanning strategy that concentrates on the upper left corner of the DCT-space image is used, as shown in figure 2. The image vectors are subtracted from their mean and normalized. We reduced the input space to 91 units and, following steps 3 and 4, used an MLP/BP neural network structure of 91 input units, 40 hidden units, and 40 output units. The resulting recognition rate reached 97.5%, which is the best result achieved in the above experiments.
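A sketch of the DCT feature extraction: a 2-D DCT of the whole image followed by a zigzag scan of the upper-left (low frequency) corner, keeping the first 91 coefficients. The JPEG-style zigzag helper below is an illustrative reconstruction of the scanning strategy of figure 2, and the mean subtraction over the sample set (described in section 7) is omitted for brevity.

```python
import numpy as np
from scipy.fft import dctn

def zigzag_indices(h, w):
    """JPEG-style zigzag ordering of an h x w coefficient matrix (lowest frequencies first)."""
    idx = [(i, j) for i in range(h) for j in range(w)]
    return sorted(idx, key=lambda p: (p[0] + p[1], p[0] if (p[0] + p[1]) % 2 else p[1]))

def dct_features(img, n_coeff=91):
    """Keep the first 91 low frequency 2-D DCT coefficients of a face image."""
    c = dctn(img.astype(float), norm='ortho')          # 2-D DCT of the whole 112x92 image
    order = zigzag_indices(*c.shape)[:n_coeff]         # upper-left corner of the DCT plane
    v = np.array([c[i, j] for i, j in order])
    return v / np.linalg.norm(v)                       # normalization as in step 2

print(dct_features(np.random.rand(112, 92)).shape)       # (91,)
```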
Fig. 2. The scanning strategy used for locating the low frequency DCT components

3. APPLYING RBF NEURAL NETWORKS

The RBF network is known for its computational simplicity, fast convergence, and statistical robustness due to its localized receptive fields in the hidden layer. RBF networks are good at handling sparse, high dimensional data (common in images), and, because they use approximation, which is better suited than interpolation to noisy real-life data, they are claimed to be more accurate than networks based on MLP/BP. They have been widely used in face recognition and have proved their efficiency [5].

The four previous experiments carried out with the MLP/BP NN are repeated with an RBF network replacing the MLP/BP NN in step 5. The results are shown in Table 1.
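The paper does not detail the RBF training procedure, so the following is a generic NumPy sketch of such a classifier, assuming Gaussian basis functions centred on the training samples and a linear output layer solved by least squares; the centre placement, width and solver are all illustrative choices.

```python
import numpy as np

class SimpleRBF:
    """Gaussian RBF hidden layer centred on the training samples + linear read-out."""

    def __init__(self, width=1.0):
        self.width = width

    def _hidden(self, X):
        # localized receptive fields: Gaussian response to the distance from each centre
        d2 = ((X[:, None, :] - self.centres[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2.0 * self.width ** 2))

    def fit(self, X, y, n_classes=40):
        self.centres = np.asarray(X, dtype=float)
        targets = np.eye(n_classes)[np.asarray(y)]              # one-hot targets, 40 output nodes
        H = self._hidden(self.centres)
        self.W, *_ = np.linalg.lstsq(H, targets, rcond=None)    # linear output weights
        return self

    def predict(self, X):
        return (self._hidden(np.asarray(X, dtype=float)) @ self.W).argmax(axis=1)

# usage: rbf = SimpleRBF(width=0.5).fit(train_features, train_labels)
#        accuracy = (rbf.predict(test_features) == test_labels).mean()
```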
4. APPLYING MCNN

The MCNN was first introduced by S. Lee et al. in 1995 [4] for the multiresolution recognition of handwritten numerals. Here, we use the MCNN for face recognition by extracting multiresolution features via D4 wavelets to compress the images to 16x16 and 8x8 matrices. Following the same steps 2 to 4, the MCNN has the structure of a [4(8x8) + 4(4x4)] input layer, a [4(8x8) + 4(4x4)] hidden layer and 40 nodes in the output layer. The resulting recognition rate reached 91%.
5. APPLYING PCA

PCA has been widely used in face recognition systems [6][7]. To compare its performance against that of the previous techniques, the images are normalized and the eigenfaces are calculated using five samples per subject (40 eigenfaces). The Euclidean distance (L2 norm), the RBF NN and the MLP/BP NN were each used for classification. The results are given in Table 1. The MLP/BP NN has the best performance as a classifier.
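A NumPy sketch of the eigenface computation: the flattened training images are mean-centred, the 40 leading eigenfaces are obtained from an SVD, and a test image is classified by the Euclidean (L2) distance between projections. The neural-network classifiers mentioned above would simply replace the nearest-neighbour step acting on the same 40-dimensional projections.

```python
import numpy as np

def fit_eigenfaces(train_vectors, n_components=40):
    """train_vectors: (n_train, 10304) matrix of flattened, normalized training faces."""
    mean = train_vectors.mean(axis=0)
    centred = train_vectors - mean
    _, _, Vt = np.linalg.svd(centred, full_matrices=False)   # rows of Vt are the eigenfaces
    eigenfaces = Vt[:n_components]                            # keep the 40 leading eigenfaces
    return mean, eigenfaces, centred @ eigenfaces.T           # training-set projections

def classify_l2(test_vector, mean, eigenfaces, train_proj, train_labels):
    """Nearest neighbour in eigenface space using the Euclidean (L2) distance."""
    p = (test_vector - mean) @ eigenfaces.T
    dists = np.linalg.norm(train_proj - p, axis=1)
    return train_labels[int(np.argmin(dists))]
```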

6. THE EYE LOCALIZER / FACE RECOGNITION SYSTEM

We propose an eye localizer based on an MLP/BP NN, followed by an RBF NN for face recognition.

6.1 The Eye Localizer Neural Network

84 samples of eye and non-eye blocks (42 of each) are used; the eye block size is 92x15. A sketch of this stage follows figure 3.
1- The eye blocks are first subtracted from their mean.
2- The DCT coefficients are taken using the same scanning strategy as before.
3- The DCT coefficients are truncated to the first 15 low frequency coefficients and then normalized.
4- The input data is then introduced to an MLP/BP NN with a 15x15x2 structure (15 inputs, 15 hidden units, 2 outputs).
The network achieved 96% correct classification of the eye region. Samples of the network input are shown in figure 3.

Fig. 3.a. Positive samples    Fig. 3.b. Negative samples
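The sketch below illustrates the eye-localizer stage listed above: each 92x15 block is mean-subtracted, its first 15 low frequency 2-D DCT coefficients are taken with the same zigzag scan as in section 2.4, and an MLP with 15 inputs, 15 hidden units and 2 outputs labels the block as eye or non-eye. The harvesting of the 84 training blocks from the faces is not shown, and the per-block mean subtraction is an interpretation of step 1.

```python
import numpy as np
from scipy.fft import dctn
from sklearn.neural_network import MLPClassifier

def zigzag_indices(h, w):
    idx = [(i, j) for i in range(h) for j in range(w)]
    return sorted(idx, key=lambda p: (p[0] + p[1], p[0] if (p[0] + p[1]) % 2 else p[1]))

def eye_block_features(block):
    """block: a 15x92 strip; return its first 15 low frequency DCT coefficients, normalized."""
    block = block.astype(float) - block.mean()        # step 1: subtract the mean
    c = dctn(block, norm='ortho')                     # step 2: 2-D DCT, same scan as section 2.4
    order = zigzag_indices(*c.shape)[:15]             # step 3: first 15 low frequency coefficients
    v = np.array([c[i, j] for i, j in order])
    return v / np.linalg.norm(v)

def train_eye_localizer(blocks, labels):
    """blocks: 84 strips of 15x92 pixels; labels: 1 for eye blocks, 0 for non-eye blocks."""
    X = np.array([eye_block_features(b) for b in blocks])
    # step 4: MLP/BP network with 15 inputs, 15 hidden units and 2 outputs
    return MLPClassifier(hidden_layer_sizes=(15,), max_iter=5000).fit(X, labels)
```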

6.2 The Face Recognition NN

The output of the previous neural network is taken and a useful face area is constructed accordingly. The area is set such that the eye block is stretched 10 pixels upwards and 35 pixels downwards, so that the overall area amounts to 92x60 pixels, removing 46.4% of the redundant data. The RBF stage is summarized as follows (a sketch of the window construction follows figure 4):
1- The 5 training samples are taken and subtracted from their mean.
2- The first 105 low frequency DCT coefficients are taken from the samples and normalized.
3- An RBF NN is used for classification.
The network achieved a 90% recognition rate.

Fig. 4. Examples of the RBF NN input data
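A sketch of the windowed-face construction, assuming the localizer reports the top row of the detected 15-pixel-high eye strip; the window extends 10 pixels above and 35 pixels below the strip, giving the 60x92 region described above (the clamping at the image border is an added safeguard). The cropped window is then passed through the DCT feature extraction (first 105 coefficients) and an RBF classifier as in section 3.

```python
import numpy as np

def face_window(img, eye_top):
    """Crop the 60-row x 92-column recognition window around the detected eye strip.

    img     : full 112x92 ORL face
    eye_top : row index of the top of the 15-pixel-high eye strip found by the localizer
    """
    top = min(max(eye_top - 10, 0), img.shape[0] - 60)   # 10 rows above the eye block, clamped
    return img[top:top + 60, :]                          # 10 + 15 + 35 = 60 rows of 92 pixels

print(face_window(np.random.rand(112, 92), eye_top=40).shape)   # (60, 92)
```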
7. DISCUSSION

It is evident that the MLP/BP NN trained with DCT coefficients has the best recognition rate. Aside from the recognition rates, it also exhibits the best performance in terms of data reduction and compatibility. This is because DCT coefficients, unlike PCA, are data independent, and a very small number of low frequency components preserves the important facial features; the DCT coefficients show the most variance over the training images.

The MLP/BP NN trained with the DRT follows the MLP/BP DCT NN in terms of recognition rate, together with the RBF NN trained with the DWT, but the latter is better in terms of speed. This is due to the nature of RBF networks, with their fast generalization and local receptive fields. Also, the DWT provides one of the best image compression ratios as well as noise immunity, and is much better than the DRT in this respect. Table 1 shows the overall results.

Table 1. The overall performance of the techniques used
Classifier Type | Coefficient Type | Recognition Rate
MLP/BP NN | DWT | 93.25%

The propagating error in the eye localization / face recognition system caused its relatively low recognition rate of 90.5%: misclassifications in the first (MLP/BP NN) stage produce an inaccurate face region and feed wrong data to the second stage, which in turn has its own recognition errors.

We investigated the effect of increasing the database size and the number of training samples on the performance of the MLP/BP neural network. Figures 5 and 6 summarize the obtained results. It is evident from figure 5 that with small databases most techniques perform very well, but as the database volume increases only the DCT keeps its high recognition rate. We also find from figure 6 that if the number of available training samples is low, PCA and DWT perform very well, while if the number of available training samples is high, most techniques perform well and the DCT is at the top.

Fig. 5. The effect of database increase ("Effect of Database Increase on MLP/BP NN").

Fig. 6. The effect of increasing the training samples.

The DCT has been performed by taking the whole image (a 92x112 matrix) and converting it into a vector (1x10304); for the whole set of 200 sample images, the average vector is calculated and then subtracted from each image vector. The resulting vector is converted back into a 92x112 matrix and the 2-dimensional DCT is performed on this whole matrix. We took the first 91 coefficients from the DCT matrix using the zigzag scanning strategy and used them for recognition.

8. CONCLUSIONS

The paper discussed the application of artificial neural networks in face recognition systems. Through these applications, many feature extraction techniques were used, including the DWT, DCT, DRT, PCA and compression using interpolation.

The experiments that we held on the ORL database showed that MLP/BP networks fed with DCT low frequency components exhibit the best performance: the lowest overall training time, the least redundant data, and the highest recognition rates.
9. REFERENCES

[1] S. Haykin, "Neural Networks", Macmillan, 1994.
[2] J. Zhang, Y. Yan, and M. Lades, "Face Recognition: Eigenfaces, Elastic Matching, and Neural Networks", Proceedings of the IEEE, vol. 85, no. 9, pp. 1422-1435, 1997.
[3] R. Brunelli and T. Poggio, "Face Recognition: Features versus Templates", IEEE Trans. on PAMI, vol. 15, no. 10, October 1993.
[4] S. Lee, C. Kim and Y. Tang, "Multiresolution Recognition of Unconstrained Handwritten Numerals with Wavelet Transform and Multilayer Cluster Neural Network", Pattern Recognition, vol. 29, no. 12, pp. 1953-1961, 1996.
[5] A. Jonathan Howell and H. Buxton, "Face Recognition Using Radial Basis Function Neural Networks", Proceedings of the British Machine Vision Conference, pp. 455-464, Edinburgh, BMVA Press, 1996.
[6] W. S. Yambor, B. A. Draper and J. R. Beveridge, "Analyzing PCA-based Face Recognition Algorithms: Eigenvector Selection and Distance Measures", 2nd Workshop on Empirical Evaluation in Computer Vision, Dublin, Ireland, July 1, 2000.
[7] O. A. Abdel-Alim, M. Rizk and M. M. Saii, "Position Invariant Face Detection Based on PCA", Proceedings of the 17th National Radio Science Conference, March 2000.
