
DeepIris: Iris Recognition Using A Deep Learning Approach

Shervin Minaee∗ , Amirali Abdolrashidi†


∗ New York University
† University of California, Riverside

arXiv:1907.09380v1 [cs.CV] 22 Jul 2019

Abstract—Iris recognition has been an active research area during the last few decades, because
of its wide applications in security, from airports to homeland security border control.
Different features and algorithms have been proposed for iris recognition in the past. In this
paper, we propose an end-to-end deep learning framework for iris recognition based on a residual
convolutional neural network (CNN), which can jointly learn the feature representation and
perform recognition. We train our model on a well-known iris recognition dataset using only a
few training images from each class, and show promising results and improvements over previous
approaches. We also present a visualization technique which is able to detect the areas in iris
images that most affect the recognition results. We believe this framework can be widely used
for other biometric recognition tasks, helping to build more scalable and accurate systems.

I. INTRODUCTION

To personalize an experience or make an application more secure and less accessible to undesired
people, we need to be able to distinguish a person from everyone else. There are various ways to
identify a person, and biometrics are one of the most secure options so far. They can be divided
into two categories: behavioral and physiological features. Behavioral features are those
actions that a person can uniquely create or express, such as signatures and walking rhythm,
while physiological features are those characteristics that a person possesses, such as
fingerprints and iris pattern. Many works have revolved around recognition and categorization of
such data including, but not limited to, fingerprints, faces, palmprints and iris patterns
[1]-[5].

Iris recognition systems are widely used for security applications, since irises contain a rich
set of features and do not change significantly over time. They are also virtually impossible to
fake. One of the first modern algorithms for iris recognition was developed by John Daugman and
used the 2D Gabor wavelet transform [6]. Since then, there have been various works proposing
different approaches for iris recognition. Many of the traditional approaches follow the
two-step machine learning approach, where in the first step a set of hand-crafted features is
derived from iris images, and in the second step a classifier is used to recognize the iris
images. Here we discuss some of the previous works proposed for iris recognition.

In a more recent work, Kumar [6] proposed an algorithm based on a combination of Log-Gabor, Haar
wavelet, DCT and FFT features, and achieved high accuracy. In [7], Farouk proposed a scheme
which uses elastic graph matching and Gabor wavelets for iris recognition. Each iris is
represented as a labeled graph and a similarity function is defined to compare the two graphs.
In [8], Belcher used a region-based SIFT descriptor for iris recognition and achieved relatively
good performance. In [9], Umer proposed an algorithm for iris recognition using multiscale
morphologic features. More recently, Minaee et al. [10] proposed an iris recognition method
using multi-layer scattering convolutional networks, which decomposes iris images using wavelets
of different scales and orientations, and uses those features for iris recognition. An
illustration of the decomposed images in the first two layers of the scattering network is shown
in Fig. 1.

Fig. 1. The images from the first (on top) and second layers of the scattering transform [10]
for a sample iris image. Each image captures the wavelet energies along a specific orientation
and scale.

Although many of the previous works for iris recognition achieve high accuracy rates, they
involve a lot of pre-processing (including iris segmentation, and unwrapping the original iris
into a rectangular area) and use hand-crafted features, which may not be optimal for different
iris datasets (collected under different lighting and environmental conditions). In recent
years, there has been a lot of focus on developing models for jointly learning the features,
while
doing prediction. Along this direction, convolutional neural networks [11] have been very
successful in various computer vision and natural language processing (NLP) tasks [12]. Their
success is mainly due to three key factors: the availability of large-scale manually labeled
datasets, powerful processing tools (such as Nvidia's GPUs), and good regularization techniques
(such as dropout) that can prevent overfitting.

Deep learning has been used for various problems in computer vision, such as image
classification, image segmentation, super-resolution, image captioning, emotion analysis, face
recognition, and object detection, and has significantly improved the performance over
traditional approaches [13]-[20]. It has also been used heavily for various NLP tasks, such as
sentiment analysis, machine translation, named-entity recognition, and question answering
[21]-[24].

More interestingly, it has been shown that the features learned by some of these deep
architectures transfer very well to other tasks. In other words, one can take the features from
a model trained for a specific task and use them for a different task (by training a
classifier/predictor on top of them) [25]. Inspired by [25], Minaee et al. [26] explored the
application of learned convolutional features for iris recognition and showed that features
learned by training a ConvNet on a general image classification task can be directly used for
iris recognition, beating all the previous approaches.

For the iris recognition task, there are several public datasets with a reasonable number of
samples, but for most of them the number of samples per class is limited, which makes it
difficult to train a convolutional neural network from scratch on these datasets. In this work
we propose a deep learning framework for iris recognition for the case where only a few samples
are available for each class (few-shot learning).

The structure of the rest of this paper is as follows. Section II describes the overall proposed
framework. Section III provides the experimental studies and a comparison with previous works.
Finally, the paper is concluded in Section IV.

II. THE PROPOSED FRAMEWORK

In this work we propose an iris recognition framework based on a transfer learning approach. We
take a convolutional neural network pre-trained on ImageNet and fine-tune it on a popular iris
recognition dataset. Before discussing the model architecture, we provide a quick introduction
to transfer learning.

A. Transfer Learning

Transfer learning is a machine learning technique in which a model trained on one task is
modified and applied to another related task, usually by some adaptation toward the new task.
For example, one can imagine using an image classification model trained on ImageNet [27] to
perform medical image classification. Given that a model trained on general-purpose object
classification should learn an abstract representation for images, it makes sense to use the
representation learned by that model for a different task.

There are two main ways in which the pre-trained model is used for a different task. In the
first approach, the pre-trained model is treated as a feature extractor, and a
classifier/regressor model is trained on top of it to perform the second task. In this approach
the internal weights of the pre-trained model are not adapted to the new task. Using a
pre-trained language model to derive word representations for another task (such as sentiment
analysis or NER) is an example of the first approach. In the second approach, the whole network
(or a subset of the layers/parameters of the model) is fine-tuned on the new task; the
pre-trained model weights are therefore treated as the initial values for the new task, and are
updated during the training procedure.

B. Iris Image Classification Using ResNet

In this work, we focus on the iris recognition task, choose a dataset with a large number of
subjects but a limited number of images per subject, and propose a transfer learning approach to
perform identity recognition using a deep residual convolutional network. We use a ResNet50 [13]
model pre-trained on the ImageNet dataset, and fine-tune it on our training images. ResNet is a
popular CNN architecture that won the ImageNet 2015 visual recognition competition. It provides
easier gradient flow, which makes training more efficient. The core idea of ResNet is
introducing a so-called identity shortcut connection that skips one or more layers, as shown in
Figure 2. This gives the gradient a direct path to the very early layers of the network, making
the gradient updates for those layers much easier.

Fig. 2. The residual block used in the ResNet model.

To perform recognition on our iris dataset, we fine-tuned a ResNet model with 50 layers on the
augmented training set. The overall block diagram of the ResNet50 model, and how it is used for
iris recognition, is illustrated in Figure 3.

We fine-tune this model for a fixed number of epochs, which is determined based on the
performance on a validation set, and then evaluate it on the test set. The model is trained with
a cross-entropy loss function. To reduce the chance of over-fitting, an $\ell_2$-norm penalty
can be added to the loss function, resulting in the overall loss function

$$L_{final} = L_{class} + \lambda_1 \|W_{fc}\|_F^2 \qquad (1)$$

where $L_{class} = -\sum_i p_i \log(q_i)$ is the cross-entropy loss, and $\|W_{fc}\|_F^2$
denotes the squared Frobenius norm of the weight matrix in the last layer. We can then minimize
this loss function using stochastic gradient descent or the Adam optimizer.
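
As a minimal sketch of Eq. (1), the pure-Python code below evaluates the regularized loss for a single sample and takes one gradient-descent step on the last-layer weights only. The helper names, the toy two-class example, the omission of a bias term, and the values chosen for the regularization weight and the learning rate are illustrative assumptions, not the settings or the implementation used in our experiments.

```python
import math


def softmax(logits):
    """Turn raw class scores into probabilities q_i (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fc_forward(W_fc, features):
    """Class scores of a bias-free fully connected layer: W_fc times features."""
    return [sum(w * x for w, x in zip(row, features)) for row in W_fc]


def final_loss(W_fc, features, target, lam):
    """Eq. (1): cross-entropy plus lam times the squared Frobenius norm of W_fc."""
    q = softmax(fc_forward(W_fc, features))
    l_class = -sum(p * math.log(qi + 1e-12) for p, qi in zip(target, q))
    frob_sq = sum(w * w for row in W_fc for w in row)
    return l_class + lam * frob_sq


def sgd_step(W_fc, features, target, lam, lr):
    """One gradient-descent update of W_fc on a single sample.

    For softmax with cross-entropy and a target that sums to one:
    dL/dW[k][j] = (q_k - p_k) * x_j + 2 * lam * W[k][j]
    """
    q = softmax(fc_forward(W_fc, features))
    return [
        [w - lr * ((qk - pk) * x + 2.0 * lam * w) for w, x in zip(row, features)]
        for row, qk, pk in zip(W_fc, q, target)
    ]


# Toy example (assumed values): 2 classes, 2-dimensional features, one-hot target.
W = [[0.0, 0.0], [0.0, 0.0]]
x = [1.0, 2.0]
p = [1.0, 0.0]
loss_before = final_loss(W, x, p, lam=0.01)
W = sgd_step(W, x, p, lam=0.01, lr=0.1)
loss_after = final_loss(W, x, p, lam=0.01)
print(loss_before, loss_after)  # the loss decreases after the update
```

Because only the last-layer weights are updated here, the sketch corresponds to the feature-extractor variant of transfer learning; full fine-tuning would propagate the same loss to the earlier layers as well.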
[Figure 3 (block diagram): Input -> 7x7x64 convolution, /2 -> pooling -> stacked 3x3 convolution layers with 64, 128, 256, and 512 kernels (the first layer of the 128-, 256-, and 512-kernel groups uses /2 downsampling) -> average pooling -> 1x2048 feature vector -> FC layer -> softmax -> classes.]

Fig. 3. The architecture of ResNet50 neural network [13], and how it is transferred for iris recognition. The last layer is changed to match the number of
classes in our dataset.
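
To make the size of the replaced last layer concrete, the short sketch below counts the parameters of a fully connected layer that maps the 1x2048 pooled feature vector of Fig. 3 to class scores. It assumes the layer has a bias term, and it compares the 1000 classes of the ImageNet classification task with the 224 classes (subjects) of the IIT Delhi database used in Section III; the function name is hypothetical.

```python
def fc_parameter_count(in_features, num_classes, bias=True):
    """Number of weights (plus optional biases) in a fully connected layer."""
    return in_features * num_classes + (num_classes if bias else 0)


FEATURE_DIM = 2048       # "1x2048" feature vector in Fig. 3
IMAGENET_CLASSES = 1000  # size of the original classification head
IRIS_CLASSES = 224       # subjects in the IIT Delhi iris database

original_head = fc_parameter_count(FEATURE_DIM, IMAGENET_CLASSES)  # 2049000
iris_head = fc_parameter_count(FEATURE_DIM, IRIS_CLASSES)          # 458976
print(original_head, iris_head)
```

Under these assumptions the new head is the only part of the network that has to be initialized from scratch; all other layers start from their pre-trained values.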

III. EXPERIMENTAL RESULTS

In this section we provide the experimental results for the proposed algorithm, and a comparison
with previous works on this dataset.

Before presenting the results of the proposed model, let us first discuss the hyper-parameters
used in our training procedure. We train the proposed model for 100 epochs using an Nvidia Tesla
GPU. The batch size is set to 24, and the Adam optimizer is used to optimize the loss function,
with a learning rate of 0.0002. All images are down-sampled to 224x224 before being fed to the
neural network. All our implementations are done in PyTorch [28]. We present the details of the
dataset used for our work in the next section, followed by quantitative and visual experimental
results.

A. Dataset

We have tested our algorithm on the IIT Delhi iris database, which contains 2240 iris images
captured from 224 different people. The resolution of these images is 320x240 pixels [29]. Six
sample images from this dataset are shown in Fig. 4. As we can see, the iris images in this
dataset have slightly different color distributions, as well as different sizes.

Fig. 4. Six sample iris images from the IIT Delhi dataset [30].

For each person, 4 images are randomly selected as test samples, and the rest are used for
training and validation.

B. Recognition Accuracy

Table I provides the recognition accuracy achieved by the proposed model and one of the previous
works on this dataset, for the iris identification task.

TABLE I. COMPARISON OF PERFORMANCE OF DIFFERENT ALGORITHMS

Method                                 Accuracy Rate
Multiscale Morphologic Features [9]    87.94%
The proposed algorithm                 95.5%

C. Important Regions Visualization

Here we provide a simple approach to visualize the most important regions while performing iris
recognition using a convolutional network, inspired by the work in [31]. We start from the
top-left corner of an image, and each time zero out a square region of size $N \times N$ inside
the image, and make a prediction using the trained model on the occluded image. If occluding
that region makes the model mislabel that iris image, that region is considered an important
region for iris recognition. On the other hand, if removing that region does not impact the
model's prediction, we infer that the region is not as important. If we repeat this procedure
for different sliding windows of size $N \times N$, each time shifting them with a stride of
$S$, we obtain a saliency map of the most important regions for recognizing irises. The saliency
maps for four example iris images are shown in Figure 5. As can be seen, most regions inside the
iris area seem to be important for iris recognition.

IV. CONCLUSION

In this work we propose a deep learning framework for iris recognition, by fine-tuning a
convolutional model pre-trained on ImageNet. This framework is applicable to other biometric
recognition problems, and is especially useful for cases where only a few labeled images are
available for each class. We apply the proposed framework to a well-known iris dataset,
IIT-Delhi, and achieve promising results, which outperform previous approaches on this dataset.
We train these models with very few original images per class. We also present a visualization
technique for detecting the most important regions for iris recognition.

ACKNOWLEDGMENT

The authors would like to thank IIT Delhi for providing the iris dataset used in this work. We
would also like to thank Facebook AI Research for open sourcing the PyTorch package.
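
The occlusion procedure of Section III-C can be sketched in a few lines of pure Python. Here `predict` stands for any trained classifier that maps an image (given as a list of pixel rows) to a label; this interface, the toy classifier in the usage example, and the choice to count at every pixel how many occluding windows flip the prediction are assumptions made for illustration, not a description of our implementation.

```python
def occlusion_saliency(image, predict, true_label, window, stride):
    """Slide a zeroed window x window patch over the image.

    Every pixel covered by a patch whose occlusion changes the predicted
    label gets its saliency count increased by one.
    """
    height, width = len(image), len(image[0])
    saliency = [[0] * width for _ in range(height)]
    for top in range(0, height - window + 1, stride):
        for left in range(0, width - window + 1, stride):
            # Copy the image and zero out the current square region.
            occluded = [row[:] for row in image]
            for r in range(top, top + window):
                for c in range(left, left + window):
                    occluded[r][c] = 0
            # A changed prediction marks the region as important.
            if predict(occluded) != true_label:
                for r in range(top, top + window):
                    for c in range(left, left + window):
                        saliency[r][c] += 1
    return saliency


def toy_predict(image):
    """Hypothetical stand-in for a trained model: looks at a single pixel."""
    return 1 if image[2][2] > 0 else 0


toy_image = [[1] * 4 for _ in range(4)]
saliency_map = occlusion_saliency(toy_image, toy_predict, true_label=1,
                                  window=2, stride=2)
for row in saliency_map:
    print(row)
```

In this toy case only the bottom-right 2x2 window covers the pixel that the stand-in classifier depends on, so only that window is marked. With a stride smaller than the window size the patches overlap, and the counts give a graded rather than a binary map.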
Fig. 5. The saliency map of important regions for iris recognition.

REFERENCES

[1] Marasco, Emanuela, and Arun Ross. "A survey on antispoofing schemes for fingerprint recognition systems." ACM Computing Surveys (CSUR) 47.2 (2015): 28.
[2] Minaee, Shervin, and AmirAli Abdolrashidi. "Highly accurate palmprint recognition using statistical and wavelet features." Signal Processing and Signal Processing Education Workshop (SP/SPE), IEEE, 2015.
[3] Bowyer, Kevin W., and Mark J. Burge, eds. "Handbook of iris recognition." London, UK: Springer, 2016.
[4] Ding, Changxing, and Dacheng Tao. "Robust face recognition via multimodal deep face representation." IEEE Transactions on Multimedia 17.11 (2015): 2049-2058.
[5] S Minaee, A Abdolrashidi, and Y Wang. "Face recognition using scattering convolutional network." Signal Processing in Medicine and Biology Symposium (SPMB), IEEE, 2017.
[6] A. Kumar and A. Passi. "Comparison and combination of iris matchers for reliable personal authentication." Pattern Recognition, vol. 43, no. 3, pp. 1016-1026, Mar. 2010.
[7] RM. Farouk. "Iris recognition based on elastic graph matching and Gabor wavelets." Computer Vision and Image Understanding, Elsevier, 115.8: 1239-1244, 2011.
[8] C. Belcher and Y. Du. "Region-based SIFT approach to iris recognition." Optics and Lasers in Engineering, Elsevier 47.1: 139-147, 2009.
[9] S Umer, BC Dhara, and Bhabatosh Chanda. "Iris recognition using multiscale morphologic features." Pattern Recognition Letters 65: 67-74, 2015.
[10] S Minaee, A Abdolrashidi, and Y Wang. "Iris recognition using scattering transform and textural features." Signal Processing and Signal Processing Education Workshop (SP/SPE), IEEE, 2015.
[11] LeCun, Yann, et al. "Gradient-based learning applied to document recognition." Proceedings of the IEEE: 2278-2324, 1998.
[12] A Krizhevsky, I Sutskever, GE Hinton. "Imagenet classification with deep convolutional neural networks." Advances in neural information processing systems, 2012.
[13] He, Kaiming, et al. "Deep residual learning for image recognition." Proceedings of the IEEE conference on computer vision and pattern recognition, 2016.
[14] Badrinarayanan, Vijay, Alex Kendall, and Roberto Cipolla. "Segnet: A deep convolutional encoder-decoder architecture for image segmentation." IEEE transactions on pattern analysis and machine intelligence 39.12: 2481-2495, 2017.
[15] Ren, S., He, K., Girshick, R., Sun, J. "Faster r-cnn: Towards real-time object detection with region proposal networks." Advances in neural information processing systems, 2015.
[16] Dong, Chao, et al. "Learning a deep convolutional network for image super-resolution." European conference on computer vision, Springer, Cham, 2014.
[17] Minaee, Shervin, and Amirali Abdolrashidi. "Deep-Emotion: Facial Expression Recognition Using Attentional Convolutional Network." arXiv preprint arXiv:1902.01019, 2019.
[18] Sun, Yi, et al. "Deep learning face representation by joint identification-verification." NIPS, 2014.
[19] Minaee, Shervin, et al. "MTBI Identification From Diffusion MR Images Using Bag of Adversarial Visual Features." IEEE transactions on medical imaging, 2019.
[20] Minaee, Shervin, et al. "A deep unsupervised learning approach toward MTBI identification using diffusion MRI." Engineering in Medicine and Biology Society (EMBC), IEEE, 2018.
[21] Kim, Yoon. "Convolutional neural networks for sentence classification." Conference on Empirical Methods on Natural Language Processing, 2014.
[22] A Severyn, A Moschitti. "Learning to rank short text pairs with convolutional deep neural networks." SIGIR conference on research and development in information retrieval, ACM, 2015.
[23] S Minaee, Z Liu. "Automatic question-answering using a deep similarity neural network." Global Conference on Signal and Information Processing, IEEE, 2017.
[24] Bahdanau, Dzmitry, Kyunghyun Cho, and Yoshua Bengio. "Neural machine translation by jointly learning to align and translate." arXiv preprint arXiv:1409.0473 (2014).
[25] AS Razavian, H Azizpour, et al. "CNN features off-the-shelf: an astounding baseline for recognition." IEEE conference on computer vision and pattern recognition workshops, 2014.
[26] S Minaee, A Abdolrashidi, Y Wang. "An experimental study of deep convolutional features for iris recognition." Signal Processing in Medicine and Biology Symposium (SPMB), IEEE, 2016.
[27] Deng, Jia, et al. "Imagenet: A large-scale hierarchical image database." 2009 IEEE conference on computer vision and pattern recognition, IEEE, 2009.
[28] https://pytorch.org/
[29] Ajay Kumar and Arun Passi. "Comparison and combination of iris matchers for reliable personal authentication." Pattern Recognition, vol. 43, no. 3, pp. 1016-1026, Mar. 2010.
[30] https://www4.comp.polyu.edu.hk/~csajaykr/IITD/Database-Iris.htm
[31] M Zeiler, R Fergus. "Visualizing and understanding convolutional networks." European conference on computer vision, Springer, Cham, 2014.
