
Cross Spectral Iris Matching based on Predictive Image Mapping

Jinyu Zuo, Francesco Nicolo and Natalia A. Schmid

Abstract— An adaptive method to predict the NIR channel image from color iris images is introduced. Both visual inspection of the predicted images and the verification performance indicate that the adaptive mapping linking the NIR image and the color image is a potential solution to the practical problem of matching NIR images against color images. When matched against an NIR enrolled image, the predicted NIR image achieves significantly higher performance compared to the case when the same NIR image is matched against the R channel alone.

I. INTRODUCTION

Iris images captured in the near infrared (NIR) band are the traditional input to iris recognition systems [1]. In recent years, however, biometricians have turned their attention to iris images acquired in the visible band of the electromagnetic spectrum [2]. This trend is supported by a variety of factors: (1) optical cameras in the visible range are cheap and characterized by a very high resolution; (2) in terms of applications, most security and surveillance camera systems (installed for monitoring various human activities) are visible range cameras. Apart from monitoring activities, these cameras may capture face or iris images, which may further be used to authenticate a suspicious or violent individual. Thus iris images acquired by cameras in the visible spectrum are query images that have to be compared against an enrolled database of irises, which is traditionally formed from iris images in the NIR range.

Although some publications are available on the topic (for example, publications involving the UBIRIS dataset and the West Virginia University (WVU) multi-spectral iris dataset), the investigation of matching NIR and visible range iris images has been very limited, and in general the problem of cross spectral matching remains unsolved.

Extracting object reflectivity information across multiple spectral bands is the scope of multi-spectral and hyper-spectral imaging [3], [4]. For example, both the surface of the earth (a very general object category) [5] and the human face or fingertip (very specific object categories) have different reflectivity responses in different spectral bands [6]–[8]. This spectral information (often called spectral signatures) can be used to solve many inference problems: to perform precise segmentation of ground regions with different reflectivity responses, to detect spectral anomalies, or to perform classification based on an object's reflectivity function.

Similar to the application of multi-spectral imaging in geoscience, in biometrics we can use the spectral information to enhance cross spectral biometric matching. Spectral signatures of different regions of the iris, face or fingertip can be estimated from multi-spectral data. This estimate can further be used to predict the appearance of a given spectral representation of a biometric modality in a different spectral band. That is, the estimated relationship is used to map one spectral representation into another. To be more specific, a nonlinear adaptive model to predict the value of the NIR channel from a visible range iris image is proposed. The predicted NIR channel is compared with real NIR iris images using the classical log-Gabor filter-based algorithm [9]. The results of the performance evaluation are promising. The proposed approach outperforms the NIR vs. R cross match comparison.

Recently, Burge and Monaco presented a model for approximating an NIR iris image using features derived from the color and structure of a color iris image in [10], [11]. In their approach, cluster analysis in the L*a*b* color space was first applied to the visible light channels of the multispectral iris images to identify a set of representative color clusters. Next, for each of the clusters, a set of tuples (L*, a*, b*, NIR) was constructed. This set is used to establish a relationship between the L*a*b* values and the NIR values of registered images. The tuples are then augmented with local structural information derived from k multiscale log-Gabor filters G(w), where w is the center frequency of the filters. For each cluster, a functional mapping relating (G(w_1), G(w_2), ..., G(w_k), L*, a*, b*) and NIR is constructed using a supervised learning algorithm, which outputs an approximation to the NIR term using the first k + 3 terms as inputs.

In our approach, we use the RGB color space (the traditional color space) as input to a nonlinear multivariate adaptive mapping and the NIR image as output. Furthermore, the camera used for our experiments records three channels: an R component, a combined G/B component and an NIR component. Apart from this, Burge and Monaco have not demonstrated any numerical results in their publications.

The remainder of the paper is organized as follows. Sec. II describes our proposed methodology. Sec. III presents numerical results. Conclusions and future work are discussed in Sec. IV.

J. Zuo and F. Nicolo are with the Lane Department of Computer Science and Electrical Engineering, West Virginia University, Evansdale Drive — Morgantown, WV 26506-6070, USA {jzuo, fnicolo}@mix.wvu.edu
N. A. Schmid is on the Faculty in the Lane Department of Computer Science and Electrical Engineering, West Virginia University, P.O. Box 6109 — Morgantown, WV 26506-6109, USA Natalia.Schmid@mail.wvu.edu

978-1-4244-7580-3/10/$26.00 ©2010 IEEE

II. METHODOLOGY

Since our methodology for performing iris cross spectral matching requires that images from different spectral bands be well registered, we will first introduce an existing database of multispectral iris images which satisfies this assumption. Then we will present the proposed processing and matching methods.

Fig. 1. An illustration of a false color image.

Fig. 2. Pixel-based prediction.
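The dataset used here (Sec. II-A) provides three perfectly registered spectral channels (NIR, R and G/B) that can be displayed as the false-color composite of Fig. 1. A minimal sketch of that channel remapping, assuming each registered channel is available as a 2-D NumPy array (the function name is ours, not from the paper):

```python
import numpy as np

def false_color(nir, r, gb):
    """Form a false-color composite from three registered channels:
    NIR is shown as red, R as green, and G/B as blue."""
    if not (nir.shape == r.shape == gb.shape):
        raise ValueError("channels must be registered (equal shapes)")
    return np.dstack([nir, r, gb])  # H x W x 3 array

# toy 4x4 "channels" standing in for real sensor data
nir = np.full((4, 4), 0.9)
r = np.full((4, 4), 0.5)
gb = np.full((4, 4), 0.2)
img = false_color(nir, r, gb)
print(img.shape)  # (4, 4, 3)
```

The registration check matters: the prediction and matching steps below assume pixel-wise correspondence across channels.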

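The neighborhood-based prediction strategy of Fig. 3 (Secs. II-B and II-D) forms, for each pixel, an input vector of 18 intensity values: the 3x3 neighborhood from the G/B channel followed by the 3x3 neighborhood from the R channel, with the pixel's NIR intensity as the regression target. A minimal sketch of this feature construction, assuming the channels are NumPy arrays (the function name is ours):

```python
import numpy as np

def neighborhood_features(gb, r, nir):
    """Build (18-input, 1-output) training pairs: for each interior
    pixel, the 3x3 neighborhood of the G/B channel followed by the
    3x3 neighborhood of the R channel predicts the NIR intensity."""
    h, w = gb.shape
    X, y = [], []
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            gb_patch = gb[i - 1:i + 2, j - 1:j + 2].ravel()  # 9 G/B values
            r_patch = r[i - 1:i + 2, j - 1:j + 2].ravel()    # 9 R values
            X.append(np.concatenate([gb_patch, r_patch]))
            y.append(nir[i, j])
    return np.array(X), np.array(y)

# toy 4x4 channels; a real run would use segmented iris pixels only
gb = np.arange(16, dtype=float).reshape(4, 4)
r = gb + 100.0
nir = gb * 0.5
X, y = neighborhood_features(gb, r, nir)
print(X.shape, y.shape)  # (4, 18) (4,)
```

The resulting pairs (X, y) are what the feed-forward network of Sec. II-C is trained on; only pixels inside the segmented iris region would be kept in practice.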
A. Multi-spectral Iris Dataset and Data used in Simulations

Multi-spectral iris data collected at WVU (see [12] for a detailed description of the dataset) are used to demonstrate the performance of the proposed method. The MS3100 camera manufactured by Geospatial Systems, Inc. was employed for data acquisition. The MS3100 is a 3-chip multi-spectral digital camera with 1.4M+ pixels per sensor. The multispectral iris dataset collected at WVU involves data of 35 users (68 classes, 232 images). Each iris snapshot outputs data acquired by three different spectral channels: NIR, R and G/B. They are perfectly registered and synchronized and can be used to form a false-color image if those three channels are treated as traditional R, G, and B channels. A sample image with NIR falsely mapped into red, red into green and G/B into blue is provided in Fig. 1.

Fig. 3. Neighborhood-based prediction.

B. Proposed Predictive Model

The variations in human eye color are attributed to varying ratios of eumelanin produced by melanocytes in the iris. Based on iris color, all irises can be placed in three broad groups: blue, brown, and green. Each group requires a different predictive model. Thus, the models for blue, brown and green irises have to be designed and trained separately. It has been empirically confirmed that a prediction model trained on a blue iris performs really well only on blue irises. To ensure that the models are equally well trained, a sufficient amount of data for training and testing of each model has to be collected.

The predictive model is trained as follows. The problem of training the predictive model can be stated as a nonlinear multivariate regression problem, where the predictive model establishes a relationship between input parameters and output parameters. As shown in Fig. 2, for each pixel in a single iris snapshot (composed of three channels R, G/B, and NIR) selected for training, the pair of parameters (value of the R channel, value of the G/B channel) forms an input and the value of the NIR channel forms an output parameter. In our simulations described in later sections, the predictive model is estimated using a neural network (NN). The type of NN and the number of hidden layers can be selected based on the amount and type of data involved. Once the mapping is estimated, it is used to predict the output parameter (NIR image) from the input parameters (R and G/B channels of a query image). To ensure a reliable model for predictive mapping and robust matching, in place of using only two intensity values as the input parameters (as illustrated in Fig. 2) we also involve the intensity values of the nearest 8-neighborhood. The new prediction strategy is illustrated in Fig. 3.

C. Neural Network

Here the nonlinear mapping is estimated using a Feed Forward Neural Network (FFNN). The optimal number of hidden layers (here we selected two) was determined by trading off the complexity of the network against its predictive performance. For the experiments cross matching multispectral iris images, the first hidden layer of the FFNN is composed of 16 neurons, while the second layer is composed of 3 neurons. The training data are divided randomly into two subsets: a learning subset composed of 60% of the training data and a validation subset made of the remaining 40%. The training process stops when the mean square error drops below 10^-5. The experimental results described below are obtained using functions from the Neural Network Toolbox in MATLAB(TM).

D. Training

Each training image is first segmented to guarantee that only the iris region is selected for training the proposed predictive model.

As shown in the block-diagram in Fig. 4, training the predictive model requires a vector of 18 input parameters and a single output parameter. In our implementation, the first 9 input parameters come from the G/B channel and the following 9 input parameters come from the R channel.

Since the estimation of the predictive model depends on the initial condition, we run the training experiment a number of times and select the model which produces the minimum genuine score. This score is formed by comparing the predicted NIR data versus the true NIR data.

Fig. 4. Block-diagram of the training procedure.

E. Testing

During testing, the estimated mapping is used to predict the NIR channel from the inputs (R channel, G/B channel). The testing procedure is described in Fig. 5.

Fig. 5. The block-diagram of the testing procedure.

F. Model Selection Criterion

Since different prediction models produce different results, the question of how to choose the best model arises. There is a variety of theoretical and practical model selection criteria. In this work we used only the data available to us, that is, the information in the G/B and R channels. To select the best model, the median intensity values of the G/B and R channels can be used. The medians are evaluated over the entire available area within the images. The optimal predictive model i is selected if it minimizes the following cost function:

[ (Test_{G/B} - Train^i_{G/B})^2 + (Test_R - Train^i_R)^2 ]^{1/2},

where Test_{G/B} and Test_R are the median values of the G/B and R channels for the testing image, and Train^i_{G/B} and Train^i_R are the median values of the G/B and R channels for the ith training image.

An alternative solution would be to involve several prediction models (in place of one). These models are then used separately to predict the output. The best predictive model is the one which optimizes the verification performance. For example, in our implementation, for each testing image, 10 different prediction results were generated by 10 different predictive models. When a query visible light image is compared with NIR enrolled images, 10 different matching scores (Hamming distances) for each NIR enrollment are recorded. The minimum score is selected as the best score. This approach is especially useful when dealing with a set of images of the same iris acquired under different environmental conditions.

G. Illustration of Predicted Results

An example showing an iris snapshot (its original 3 channels: R, G/B, and NIR) and a predicted NIR channel is shown in Fig. 6. From this example, it can be easily seen that the predicted NIR channel is visually very similar to the original NIR channel and substantially differs from the R and G/B channels. Note, however, that due to the predictive model, which involves neighbors, the predicted NIR channel is slightly over-smoothed compared to the original NIR channel.

Fig. 6. (a) The original NIR channel; (b) the R channel; (c) the G/B channel; and (d) the predicted NIR channel using (b) and (c) as the input to the predictive mapping function.

For a close examination, the unwrapped iris patterns are displayed in Figs. 7 and 8.

III. NUMERICAL RESULTS

To evaluate the recognition performance of predicted NIR iris images, all iris images from the testing set were segmented using the robust iris segmentation algorithm developed by Zuo and Schmid [13]. The segmentation was applied to the NIR channel data only. Then the extracted segmentation parameters were used to segment the other spectral channels. After segmentation, the data were encoded using an enhanced version of Masek's iris encoding algorithm (the details are provided in [14]). Three similarity matrices were formed: (1) the first matrix compared the true NIR imagery
versus the true NIR imagery. This case is treated as the best achieved performance; (2) the second matrix compares the true NIR images versus the predicted NIR data; and (3) the third matrix compares the true NIR versus the R channel.

Fig. 7. Unwrapped iris templates: (a) R channel; (b) NIR channel; and (c) the predicted NIR channel.

Fig. 8. Unwrapped iris templates: (a) R channel; (b) NIR channel; and (c) the predicted NIR channel.

Note that the G and B channels were not involved in the performance evaluation. Compared to the G and B channels, the R iris image strongly resembles the NIR iris image. When involving simple fusion rules at the match score level, the matching performance of the R channel is much higher than the matching performance due to the G and B channels. Therefore, the match score of the R channel strongly dominates the decision making process. Here the comparison of the NIR channel vs. the R channel is treated as the baseline comparison. During the numerical evaluations, 5 out of 150 testing images were discarded due to poor segmentation. The remaining data were used to plot receiver operating characteristic (ROC) curves.

The results of the performance comparison are summarized in Fig. 9 and in Table I. Fig. 9 shows the ROC curves for the three comparisons above. The ROCs are displayed as a plot of Genuine Acceptance Rate (GAR) versus False Acceptance Rate (FAR). Note the performance improvement due to the involvement of the predictive model compared to the direct cross spectral matching of NIR data against R-channel data. The corresponding values of the average genuine scores, the average impostor scores, d-prime and Equal Error Rate (EER) are summarized in Table I. It shows the improved average performance of the predicted NIR data compared to the performance of the R channel.

Fig. 9. Performance improvement due to the proposed predictive model.

TABLE I
THE RESULTS OF THREE CROSS COMPARISONS.

                Mean of          Mean of            d'        EER
                Genuine Scores   Impostor Scores
NIR vs. NIR     0.0505           0.3986             10.3211   0
NIR vs. P NIR   0.0841           0.3825             8.7493    0.00034
NIR vs. R       0.1177           0.4012             6.3913    0.00051

Results in Table I indicate that the improved performance is due to the reduction of the values of the genuine scores. When comparing the two cases (Predicted NIR, NIR) genuine comparison and (R, NIR) genuine comparison, in 267 out of 280 cases the (Predicted NIR, NIR) comparison outperforms the (R, NIR) comparison. Similarly, when comparing the two cases (NIR, NIR) genuine comparison and (Predicted NIR, NIR) genuine comparison, in 271 out of 280 cases the (NIR, NIR) comparison outperforms the (Predicted NIR, NIR) comparison. However, a few outliers are observed in both cases. We introduce the gain (loss) of performance for these genuine comparisons and define it as the difference between the matching score of one probe image and the matching score of another probe image compared to the same NIR target image. The cumulative gain (loss) of performance can be visualized by plotting histograms of performance gains on genuine comparisons (see Fig. 10).

The prediction was also proven to be robust with respect to imperfect segmentation. We conducted the following experiment on all 150 images. All segmentations of the R channel and the predicted NIR channel were slightly distorted in the radial direction. The segmentation of the NIR iris images was not modified. The ROC curves for these experiments are displayed in Fig. 11. A few statistics for the same case are summarized in Table II. Overall, the performance dropped slightly, but the relative position of the ROC curve of the predicted NIR versus the true NIR remained unchanged.

TABLE II
THE RESULTS OF THREE CROSS COMPARISONS WITH IMPERFECT RELATIVE SEGMENTATION OF R AND PREDICTED NIR CHANNELS.

                Mean of          Mean of            d'        EER
                Genuine Scores   Impostor Scores
NIR vs. NIR     0.0671           0.4093             7.9266    0.0069
NIR vs. P NIR   0.0967           0.3907             6.8603    0.0131
NIR vs. R       0.1300           0.4118             5.5570    0.0137
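Tables I and II report the decidability index d'. A common definition in the iris literature computes it from the means and variances of the genuine and impostor Hamming-distance distributions; a minimal sketch with made-up score samples (the numbers below are illustrative, not the paper's data):

```python
import statistics

def d_prime(genuine, impostor):
    """Decidability index: separation of the genuine and impostor
    score distributions, in pooled standard-deviation units."""
    mu_g = statistics.mean(genuine)
    mu_i = statistics.mean(impostor)
    var_g = statistics.pvariance(genuine)
    var_i = statistics.pvariance(impostor)
    return abs(mu_i - mu_g) / ((var_g + var_i) / 2) ** 0.5

# illustrative Hamming-distance scores (not from the paper's experiments)
genuine = [0.05, 0.08, 0.06, 0.09, 0.07]
impostor = [0.38, 0.41, 0.40, 0.39, 0.42]
print(round(d_prime(genuine, impostor), 2))  # ~23.33
```

A larger d' means the two score distributions overlap less, which is consistent with the ordering NIR vs. NIR > NIR vs. P NIR > NIR vs. R seen in both tables.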
Fig. 10. Performance gains on genuine scores: (a) (Predicted NIR, NIR) vs. (R, NIR); (b) (NIR, NIR) vs. (Predicted NIR, NIR); and (c) (NIR, NIR) vs. (R, NIR).

Fig. 11. The performance improvement due to the proposed predictive model when different segmentations for the R and predicted NIR channels were used.

IV. CONCLUSIONS AND FUTURE WORKS

We have demonstrated that a well designed predictive mapping, which is used to map a color image into a predicted NIR image, is a promising approach to improve cross spectral iris recognition. Application of a well designed predictive mapping, as shown here, resulted in a 10% improvement when the color image is first mapped into an NIR image and then compared against an NIR image. This improvement is measured with respect to the R channel versus NIR image comparison (which is treated as the worst possible comparison).

Similar experiments have been performed on multispectral face images. The results, unfortunately, are not as promising as the results for iris biometrics. Designing a new predictive mapping to improve the performance of cross spectral face recognition is ongoing work in our lab.

REFERENCES

[1] J. Daugman, "Iris recognition," American Scientist, vol. 89, pp. 326–333, July-August 2001.
[2] H. Proenca, S. Filipe, R. Santos, J. Oliveira, and L. A. Alexandre, "The UBIRIS.v2: A database of visible wavelength images captured on-the-move and at-a-distance," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 99, no. PrePrints, 2009.
[3] MicroImages, Inc., "Introduction to hyperspectral imaging," http://www.microimages.com/getstart/pdf/hyprspec.pdf, accessed on May 17, 2010.
[4] J. M. Ellis, "Searching for oil seeps and oil-impacted soil with hyperspectral imagery," Earth Observation Magazine, pp. 25–28, January 2001.
[5] G. Shaw and D. Manolakis, "Signal processing for hyperspectral image exploitation," IEEE Signal Processing Magazine, vol. 19, no. 1, pp. 12–16, January 2002.
[6] Z. Pan, G. Healey, M. Prasad, and B. Tromberg, "Face recognition in hyperspectral images," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 25, no. 12, pp. 1552–1560, December 2003.
[7] R. K. Rowe, K. A. Nixon, and S. P. Corcoran, "Multispectral fingerprint biometrics," in Information Assurance Workshop, 2005. IAW '05. Proceedings from the Sixth Annual IEEE SMC, 2005, pp. 14–20.
[8] R. K. Rowe, U. Uludag, M. Demirkus, S. Parthasaradhi, and A. K. Jain, "A multispectral whole-hand biometric authentication system," in Biometrics Symposium, 2007, September 2007, pp. 1–6.
[9] L. Masek, "Recognition of human iris patterns for biometric identification," Bachelor's Dissertation, The School of Computer Science and Software Engineering, The University of Western Australia, 2003.
[10] M. J. Burge and M. K. Monaco, "Multispectral iris fusion for enhancement, interoperability, and cross wavelength matching," S. S. Shen and P. E. Lewis, Eds., vol. 7334, no. 1. SPIE, 2009, p. 73341D. [Online]. Available: http://link.aip.org/link/?PSI/7334/73341D/1
[11] ——, "Multispectral iris fusion for enhancement and interoperability," United States Patent Application 20090279790, November 2009.
[12] C. Boyce, A. Ross, M. Monaco, L. Hornak, and X. Li, "Multispectral iris analysis: A preliminary study," in Computer Vision and Pattern Recognition Workshop on Biometrics (CVPRW), June 2006, pp. 51–59.
[13] J. Zuo, N. D. Kalka, and N. A. Schmid, "A robust iris segmentation procedure for unconstrained subject presentation," in Proceedings of 2006 Biometric Consortium Conference (BCC'06), September 2006.
[14] J. Zuo and N. A. Schmid, "On a local ordinal binary extension to Gabor wavelet-based encoding for improved iris recognition," in Proceedings of 2007 Biometric Consortium Conference (BCC'07), September 2007.
