
3-D Face Recognition Using Curvelet Local Features
S. Elaiwat, M. Bennamoun, F. Boussaid, and A. El-Sallam
Manuscript received August 29, 2013; revised October 26, 2013; accepted December 06, 2013. Date of publication December 13, 2013; date of current version January 03, 2014. This work was supported by the ARC under DP110102166. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. Jing-Ming Guo. S. Elaiwat, M. Bennamoun, and A. El-Sallam are with the School of Computer Science and Software Engineering, The University of Western Australia, Perth, Australia (e-mail: elaiws01@student.uwa.edu.au). F. Boussaid is with the School of Electrical, Electronic and Computer Engineering, The University of Western Australia, Perth, Australia. Digital Object Identifier 10.1109/LSP.2013.2295119
Abstract: In this letter, we present a robust single-modality feature-based algorithm for 3-D face recognition. The proposed algorithm exploits the Curvelet transform not only to detect salient points on the face but also to build multi-scale local surface descriptors that can capture highly distinctive rotation/displacement-invariant local features around the detected keypoints. This approach is shown to provide robust and accurate recognition under varying illumination conditions and facial expressions. Using the well-known and challenging FRGC v2 dataset, we report a superior performance compared to other algorithms, with a 97.83% verification rate for probes under all facial expressions.

Index Terms: Digital curvelet transform, face recognition, local features.
I. INTRODUCTION
Despite decades of research efforts, 2-D recognition systems remain sensitive to variations in illumination, pose and facial expressions. The use of commercially available 3-D imaging devices has been shown to overcome sensitivity to illumination and pose [1]. In addition, 3-D face recognition exploits structural information about the face, such as geodesic distances and surface curvatures. However, variations in facial expressions still constitute a major challenge because they can result in important geometrical facial changes [1]. Reported approaches to the problem of face recognition can be classified into three categories [2]: (i) holistic matching algorithms, which use the whole face region for recognition; (ii) local feature-based matching algorithms, which extract local features from some facial regions (e.g. eyes and nose); (iii) hybrid matching algorithms, which combine holistic and local feature-based matching, but at the expense of a greater computational cost. Among these approaches, the local feature-based matching algorithms are potentially the most effective, in that they can exclude those facial regions that are most affected by perturbations such as changes in facial expression or spurious elements. They are also robust to occlusion and clutter [3]. Their performance depends on their ability to extract distinctive local facial features.
Curvelet theory provides a powerful framework to extract such key local facial features. Unlike the isotropic elements of other transforms such as wavelets, the needle-shaped elements of the Curvelet transform have very high directional sensitivity and anisotropy, making them well suited for curvature representation. Other directional transforms, such as Gabor wavelets and the Dual-Tree Complex Wavelet Transform (DTCWT), cover only part of the spectrum in the frequency domain. The Ridgelet transform, on the other hand, is only applicable to objects with global straight-line singularities, which are rarely observed in real applications [4]. The main contributions of this paper lie in exploiting the Curvelet transform in two novel ways:
A) As a keypoint detector to extract salient points on the face. The identification of the keypoints is undertaken in the Curvelet domain by examining the Curvelet coefficients in each subband. Given that a high Curvelet coefficient is indicative of high global variations at coarse resolutions and of high local variations at fine resolutions, the idea is to retain dominant coefficients associated with the mid-bands of the Curvelets. This is done so as to lessen the bias introduced by noise, face boundary endpoints or other spurious elements. Because each keypoint is represented by a scale, orientation, Curvelet position, spatial position and magnitude, the identification of the detected keypoints is robust and repeatable in the Curvelet domain. Given that our keypoint detector is based on anisotropically scaled basis functions, it can capture a large variety of geometrical features, including curves and edges. Other keypoint detectors, such as the Difference of Gaussians (DoG) in SIFT, act as isotropic filters, thereby requiring an additional step to detect edges (used to detect keypoints) [5].
B) As multi-scale local surface descriptors that can extract highly distinctive features around the detected keypoints. Unlike previously reported Curvelet-based works ([6], [7]), which extract global features (e.g. PCA, entropy, standard deviation and mean) from an entire 2-D face, our descriptor operates on depth facial images only. In addition, it exploits the Curvelet decomposition (scale, orientation and Curvelet position) to construct highly descriptive local (rather than global) features around the detected keypoints of all subbands. To overcome the sensitivity of Curvelets to rotation, the elements (descriptors) of the feature vectors are reordered (reoriented) based on the orientation of the detected keypoints, thus resulting in rotation-invariant local features. Because our descriptor is constructed in the Curvelet domain, accurate localization of the keypoints is not required. In contrast, other keypoint detectors require an additional step to invert back accurately to the spatial domain [5].
Fig. 1. Illustration of a 4-scale Curvelet decomposition.
II. ALGORITHM DESCRIPTION
A. Keypoint Detector
The Discrete Curvelet transform decomposes each image into a set of subbands indexed by frequency (scale) and angle, as shown in Fig. 1. The angle decomposition process produces identical subband coefficients at angles $\theta$ and $\theta + \pi$ for the same scale. As a result, only half of the subbands need to be considered. Given a 3-D face image (depth image) $f$, the face Curvelet coefficients are determined using the Fast Discrete Curvelet Transform (FDCT) defined by [8],

$$c(j, l, k) = \sum_{0 \le t_1, t_2 < n} f[t_1, t_2]\, \overline{\varphi^{D}_{j,l,k}[t_1, t_2]} \qquad (1)$$

where $c(j,l,k)$ is a discrete Curvelet coefficient at scale $j$, subband $l$ and position $k$, while $\varphi^{D}_{j,l,k}$ and $f$ are the discrete mother Curvelet and the 2-D array (e.g. image), respectively.
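In practice, the coefficients of Eq. (1) are obtained from an FDCT implementation such as CurveLab [8]. The fragment below is only a sketch against a hypothetical wrapper `fdct2` (not a real library API); it illustrates the coefficient layout assumed in the rest of this section and the point that only half of the angular subbands per scale need to be kept.

```python
def fdct2(depth_image, num_scales=4):
    """Hypothetical FDCT wrapper: bind this to CurveLab or any other
    FFT-based Curvelet implementation. Expected to return coeffs, where
    coeffs[j][l] is a 2-D complex array of Curvelet coefficients at
    scale j and angular subband l."""
    raise NotImplementedError("bind to a concrete FDCT implementation")

# Usage sketch: coefficients at angles theta and theta + pi are
# identical, so only the first half of each scale's subbands is kept;
# scales 2 and 3 (indices 1 and 2 in a 4-scale transform) are the
# mid-bands used by the keypoint detector below.
# coeffs = fdct2(depth_image, num_scales=4)
# mid_bands = {j: coeffs[j][:len(coeffs[j]) // 2] for j in (1, 2)}
```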
To extract keypoints from the Curvelet coefficients (Fig. 2(b)), a keypoint selection measure based on the coefficient magnitudes is applied at each scale:

$$K_j = \left\{ k \;:\; |c(j, l, k)| > \mu_j \right\} \qquad (2)$$

where $K_j$ is the set of keypoints at scale $j$, while $\mu_j$ represents the mean magnitude of all coefficients at scale $j$. The mean value has previously been applied successfully, as a threshold, to the Curvelet transform for feature extraction and coefficient filtering [9]. This work uses the mean value for keypoint selection
(Fig. 2(c)). However, since the keypoints are extracted from all
subbands of a given scale, several keypoints could invert back to the same spatial location or region, which would result in redundant features. To address this, a keypoint $k$ in set $K_j$ is discarded if

$$M_s(k) < \alpha\, M_p(k) \qquad (3)$$

where $M_s(k)$ is the maximum magnitude

$$M_s(k) = \max_{k' \in K_j,\; x_{k'} = x_k} \left| c(j, l_{k'}, k') \right| \qquad (4)$$

found by examining all keypoints in $K_j$ that share the same spatial position as keypoint $k$, and $M_p(k)$ refers to the maximum magnitude

$$M_p(k) = \max_{k' \in K_j,\; x_{k'} \in \mathcal{P}(x_k)} \left| c(j, l_{k'}, k') \right| \qquad (5)$$

found by examining all keypoints present in a patch $\mathcal{P}(x_k)$ centered around spatial position $x_k$. The weighting factor $\alpha$ in Eq. (3) was set to 80% to keep only the most significant keypoints in each patch. Fig. 2(b) and Fig. 2(c) illustrate the keypoint detector algorithm at a given scale $j$. In our experiments, we applied a four-scale Curvelet transform. The coarsest (first) and finest (last) scales were discarded because (i) neither has an angle decomposition, and (ii) information in the lowest (flat surfaces) and highest (noise and image boundaries) frequency bands is rarely significant. Furthermore, all Curvelet coefficients falling on the face boundaries and the mouth area, including the chin/beard (areas below the nose), were automatically excluded, as they can be affected by changes in facial expression. This was done by creating a reference line below the nose tip to exclude the Curvelet coefficients belonging to the mouth and chin/beard.
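To make the detector concrete, here is a minimal numpy sketch of the per-scale selection (Eq. (2)) and redundancy pruning (Eqs. (3)-(5)). It assumes the subband coefficients of one scale have already been resampled onto a common spatial grid, so that an index (u, v) refers to the same face location in every subband; the real FDCT wedges are sampled differently, and that inversion step is abstracted away here.

```python
import numpy as np

def detect_keypoints(subbands, alpha=0.8, patch=5):
    """Keypoint selection at one Curvelet scale (sketch).

    subbands : list of 2-D complex arrays, one per angular subband,
               assumed to share a spatial grid (see lead-in).
    Returns a list of (subband, u, v, magnitude) tuples.
    """
    mags = [np.abs(s) for s in subbands]
    # Eq. (2): threshold at the mean magnitude of the whole scale.
    mu = np.mean(np.concatenate([m.ravel() for m in mags]))
    cands = [(l, u, v, m[u, v])
             for l, m in enumerate(mags)
             for u, v in zip(*np.where(m > mu))]

    # Eq. (4): maximum magnitude per spatial position across subbands.
    m_s = np.zeros(mags[0].shape)
    for l, u, v, m in cands:
        m_s[u, v] = max(m_s[u, v], m)

    # Eqs. (3) and (5): discard a keypoint if its positional maximum is
    # below alpha times the maximum within the surrounding patch.
    kept, r = [], patch // 2
    for l, u, v, m in cands:
        m_p = m_s[max(0, u - r):u + r + 1, max(0, v - r):v + r + 1].max()
        if m_s[u, v] >= alpha * m_p:
            kept.append((l, u, v, m))
    return kept
```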
B. Local Feature Representation
Once the dominant keypoints are identified (detected), unique multi-scale local surface descriptors are extracted around each of these keypoints, considering each subband of a given scale (Fig. 2(d)). To explain how these descriptors (features) are extracted, let us consider one of these keypoints, $k$. Let $k_l$ be the corresponding position of the keypoint in each subband $l = 1, \dots, L_j$. The first step is to extract sub-patches $p_l$ around each position $k_l$ in order to form a 3-D patch $P$ defined as

$$P(l, u, v) = c\big(j, l, k_l + (u, v)\big) \qquad (6)$$

where $c(j,l,k)$ is a discrete Curvelet coefficient at scale $j$, subband $l$, and position $k$. The size of the first dimension in $P$ represents the number of subbands in scale $j$. The other two dimensions represent the size of the sub-patches, which determines the degree of locality of the features. A smaller sub-patch results in weaker descriptiveness, while a larger sub-patch exhibits higher sensitivity to facial expressions. Experimental tests conducted with varying sub-patch sizes have shown that a $5 \times 5$ sub-patch offers the best trade-off between descriptiveness and sensitivity to facial expressions. This was true for both scales 2 and 3. Extracting sub-patches at different scales
using an adaptive patch size is also possible, but at the cost of a more complex matching process, since feature vectors would then be of different sizes. After extraction, each 3-D patch $P$ is scaled by a weighting Gaussian window $G$,

$$G(u, v) = e^{-d(u,v)^2 / (2\sigma^2)} \qquad (7)$$

where $d(u,v)$ is the Euclidean distance of a coefficient to the keypoint; thus, more emphasis is given to the keypoint. The size of the sub-patch was fixed to 5, while the optimal value of the parameter $\sigma$ was determined experimentally for both scales 2 and 3. The scaled patch $\hat{P}$ is defined by

$$\hat{P}(l, u, v) = G(u, v)\, P(l, u, v) \qquad (8)$$

Fig. 2. Block diagram of the keypoint detection and feature extraction algorithm (best seen in color): (a) Curvelet transform, (b) keypoint detection at scale $j$, (c) keypoint selection and spatial representation, and (d) feature extraction around keypoint $k$.

Fig. 3. Verification and identification results for the FRGC v2 dataset.
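A minimal numpy sketch of the patch assembly of Eqs. (6)-(8) follows; the half-width r = 2 reproduces the 5 x 5 window above, while sigma is left as a free parameter since its tuned value is not given here. Keypoints are assumed to lie away from the array border.

```python
import numpy as np

def extract_patch(subbands, positions, r=2, sigma=1.0):
    """Build the Gaussian-weighted 3-D patch of Eqs. (6)-(8) (sketch).

    subbands  : list of 2-D complex coefficient arrays, one per subband l.
    positions : list of (u, v) keypoint positions k_l, one per subband.
    r         : sub-patch half-width (r=2 gives a 5x5 window).
    sigma     : Gaussian width, tuned experimentally in the paper.
    """
    # Eq. (7): Gaussian over the Euclidean distance to the patch centre.
    uu, vv = np.mgrid[-r:r + 1, -r:r + 1]
    G = np.exp(-(uu ** 2 + vv ** 2) / (2.0 * sigma ** 2))

    P = []
    for s, (u, v) in zip(subbands, positions):
        p = s[u - r:u + r + 1, v - r:v + r + 1]  # sub-patch p_l, Eq. (6)
        P.append(G * p)                          # scaling of Eq. (8)
    return np.stack(P)  # shape: (num_subbands, 2r+1, 2r+1)
```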
Building a feature vector directly from the resulting scaled sub-patches is not advisable, because Curvelet coefficients are sensitive to rotation. To address this issue, we propose to first reorder each scaled patch $\hat{P}$ w.r.t. the keypoint orientation $l_k$ using a circular shift:

$$\tilde{P} = \big[\, \hat{p}_{l_k - 1},\ \boldsymbol{\hat{p}}_{l_k},\ \hat{p}_{l_k + 1},\ \hat{p}_{l_k + 2},\ \dots \,\big] \qquad (9)$$

where $\boldsymbol{\hat{p}}_{l_k}$ represents the keypoint sub-patch (written in bold to illustrate the shift) and subband indices are taken modulo the number of subbands. This is done so as to keep the keypoint sub-patch always at a fixed position (e.g. the second position in Eq. (9)) and to preserve the relative subband order. This ensures rotation invariance for each feature vector $f$, which is constructed by concatenating all elements of a patch $\tilde{P}$ into one vector. This feature extraction is applied for scales 2 and 3 (Fig. 2(c)), as stated previously.
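The circular shift of Eq. (9) amounts to a single roll of the subband axis. A sketch, with the keypoint sub-patch pinned to index 1 as in Eq. (9):

```python
import numpy as np

def reorder_patch(P_hat, l_k):
    """Rotation-normalising circular shift of Eq. (9) (sketch).

    P_hat : array of shape (L, h, w), the scaled patch of Eq. (8).
    l_k   : index of the keypoint's own subband (its orientation).
    """
    # Roll the subband axis so the keypoint sub-patch lands at a fixed
    # index (1 here), preserving the relative subband order.
    shifted = np.roll(P_hat, shift=1 - l_k, axis=0)
    return shifted.ravel()  # concatenate into one feature vector f
```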
To calculate the similarity between a probe face and a gallery face, corresponding features at each scale are matched using the cosine rule,

$$S(f^p, f^g) = \frac{f^p \cdot f^g}{\| f^p \| \, \| f^g \|} \qquad (10)$$

where $f^p$ and $f^g$ refer to probe and gallery local facial features, respectively. When the two features are exactly equal, $S(f^p, f^g)$ equals 1, corresponding to a perfect match. For each probe feature (row), the best match in the similarity matrix is included in one vector $v$, which is constructed by concatenating all of the best matches found across all features. The matching score between a given probe face and a given gallery face is then calculated as the mean value of the best matches in vector $v$. Note that for each face we have two types of features: scale 2 and scale 3 features. Probe and gallery faces can be matched using each type (scale) of features separately, or by fusing scale 2 and scale 3 at the score level using weighted sum fusion [3].
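The matching stage can likewise be sketched in a few lines; feature vectors are assumed real-valued here (e.g. coefficient magnitudes), and the fusion weight w is an assumption, since only a weighted sum is stated above [3].

```python
import numpy as np

def match_score(F_probe, F_gallery):
    """Mean best-match cosine similarity, Eq. (10) (sketch).

    F_probe, F_gallery : 2-D arrays whose rows are the local feature
    vectors of one scale for the probe and gallery face, respectively.
    """
    Fp = F_probe / np.linalg.norm(F_probe, axis=1, keepdims=True)
    Fg = F_gallery / np.linalg.norm(F_gallery, axis=1, keepdims=True)
    S = Fp @ Fg.T                # all pairwise cosine similarities
    return S.max(axis=1).mean()  # mean of per-feature best matches

# Score-level fusion of the two scales (weight w is an assumption):
# score = w * match_score(Fp2, Fg2) + (1 - w) * match_score(Fp3, Fg3)
```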
TABLE I
PERFORMANCE EVALUATION ON FRGC V2 DATASET
III. EXPERIMENTAL RESULTS
We performed our experiments on the FRGC v2 dataset, which includes 3-D face scans partitioned into a training set and a validation set. 4007 3-D scans of 466 subjects were used for validation. 466 scans under neutral expression were taken to build a gallery, while the remaining 3541 scans, representing probes, were divided into neutral (1944 images) and non-neutral (1597 images) expression categories. The dataset exhibits large variations in facial expression and illumination conditions, but limited pose variations. A more detailed description of the FRGC v2 dataset can be found in [10]. Fig. 3 reports our identification and verification results using scale 2, scale 3 and both scales.
In all cases, scale 2 performs slightly better than scale 3, because more distinctive features can be found in scale 2. Table I reports identification and verification rates when combining scales 2 and 3, for the neutral and non-neutral cases. For both cases, our algorithm achieves higher recognition rates than state-of-the-art approaches, including geometric features [11], [12], [13], local features [14] and optimized ICP [15]. The reduction of the Curvelet coefficients' sensitivity to rotation through the circular shift (Eq. (9)) of the feature vectors was also evaluated, using all 1597 non-neutral faces of the FRGC v2 dataset. The resulting identification rates were 86.2% without and 90.4% with the circular shift.
In addition to the FRGC v2 dataset, further experiments were conducted on the BU-3DFE dataset, which exhibits larger expression variations [16]. The dataset contains a total of 2500 facial expressions, distributed over 100 subjects, with six different facial expressions at four levels of intensity for each subject [16]. The resulting identification rate at 0.1 FAR was 98.21% when combining scales 2 and 3. This is comparable to the most recently reported work by Lei et al. [17], who achieved 98.20% but had to rely on special masks to isolate the forehead and nose regions. The computational cost of our keypoint detector and surface descriptor was evaluated on a standard desktop with an Intel Core i7 3.4 GHz processor and 8.0 GB of RAM. On average, for a standard depth image of the dataset, detecting all keypoints and building the descriptors around each detected keypoint incurs only a modest computational cost. Given that the implemented Curvelet transform is based on the FFT, the proposed approach is well suited to real-time FPGA or GPU implementations [18].
IV. CONCLUSION
We presented a novel Curvelet-based feature extraction algorithm for 3-D face recognition. The algorithm first identifies important keypoints on the face by examining Curvelet coefficients in each subband. Such an identification is shown to be robust and repeatable because each keypoint is represented by scale, orientation, Curvelet position, spatial position and magnitude. Rotation-invariant multi-scale local surface descriptors are then built around the detected keypoints to extract highly distinctive facial features for robust feature matching. Experiments performed on the FRGC v2 dataset have shown that the proposed algorithm is robust and accurate under varying facial expressions, with a superior verification rate of 97.83%.
REFERENCES
[1] K. Bowyer, K. Chang, and P. Flynn, "A survey of approaches and challenges in 3D and multi-modal 3D + 2D face recognition," Comput. Vis. Image Understand., vol. 101, no. 1, pp. 1-15, 2006.
[2] W. Zhao, R. Chellappa, P. J. Phillips, and A. Rosenfeld, "Face recognition: A literature survey," ACM Comput. Surv., vol. 35, no. 4, pp. 399-458, 2003.
[3] A. Mian, M. Bennamoun, and R. Owens, "An efficient multimodal 2D-3D hybrid approach to automatic face recognition," IEEE Trans. Patt. Anal. Mach. Intell., vol. 29, no. 11, pp. 1927-1943, 2007.
[4] J. Ma and G. Plonka, "A review of curvelets and recent applications," IEEE Signal Process. Mag., 2009.
[5] J. Fauqueur, N. Kingsbury, and R. Anderson, "Multiscale keypoint detection using the dual-tree complex wavelet transform," in IEEE Int. Conf. Image Processing, 2006.
[6] T. Mandal, Q. J. Wu, and Y. Yuan, "Curvelet based face recognition via dimension reduction," Signal Process., vol. 89, no. 12, pp. 2345-2353, 2009.
[7] S. Rahman, S. Naim, A. Al Farooq, and M. Islam, "Curvelet texture based face recognition using principal component analysis," in Int. Conf. Computer and Information Technology (ICCIT), 2010.
[8] E. Candès, L. Demanet, D. Donoho, and L. Ying, "Fast discrete curvelet transforms," J. Multiscale Model. Simul., vol. 5, no. 3, pp. 861-899, 2006.
[9] I. Sumana, M. Islam, D. Zhang, and G. Lu, "Content based image retrieval using curvelet transform," in IEEE Workshop on Multimedia Signal Processing, 2008.
[10] P. Phillips, P. Flynn, T. Scruggs, K. Bowyer, J. Chang, K. Hoffman, J. Marques, J. Min, and W. Worek, "Overview of the face recognition grand challenge," in IEEE Comput. Soc. Conf. CVPR, 2005, vol. 1.
[11] P. Liu, Y. Wang, D. Huang, Z. Zhang, and L. Chen, "Learning the spherical harmonic features for 3-D face recognition," IEEE Trans. Image Process., vol. 22, no. 3, 2013.
[12] D. Smeets, J. Keustermans, D. Vandermeulen, and P. Suetens, "MeshSIFT: Local surface features for 3D face recognition under expression variations and partial data," Comput. Vis. Image Understand., vol. 117, no. 2, 2013.
[13] Y. Ming and Q. Ruan, "Robust sparse bounding sphere for 3D face recognition," Image Vis. Comput., vol. 30, no. 8, 2012.
[14] Y. Lei, M. Bennamoun, M. Hayat, and Y. Guo, "An efficient 3D face recognition approach using local geometrical signatures," Patt. Recognit., 2013.
[15] H. Mohammadzade and D. Hatzinakos, "Iterative closest normal point for 3D face recognition," IEEE Trans. Patt. Anal. Mach. Intell., vol. 35, no. 2, pp. 381-397, 2013.
[16] L. Yin, X. Wei, Y. Sun, J. Wang, and M. Rosato, "A 3D facial expression database for facial behavior research," in 7th Int. Conf. Automatic Face and Gesture Recognition (FGR 2006), 2006, pp. 211-216.
[17] Y. Lei, M. Bennamoun, and A. A. El-Sallam, "An efficient 3D face recognition approach based on the fusion of novel local low-level features," Patt. Recognit., vol. 46, no. 1, pp. 24-37, 2013.
[18] K. Moreland and E. Angel, "The FFT on a GPU," in Proc. ACM SIGGRAPH/EUROGRAPHICS Conf. Graphics Hardware, 2003, pp. 112-119.
