The SWAX Benchmark

You might also like

Download as pdf or txt
Download as pdf or txt
You are on page 1of 6

T HE SWAX B ENCHMARK

Attacking Biometric Systems with Wax Figures


Rafael Henrique Vareto, Araceli Marcia Sandanha, William Robson Schwartz
Smart Sense Laboratory, Department of Computer Science
Universidade Federal de Minas Gerais, Belo Horizonte, Brazil
Email: {rafaelvareto, william}@dcc.ufmg.br, asaldanha@ufmg.br

Abstract—A face spoofing attack occurs when an intruder at-


tempts to impersonate someone who carries a gainful authentica-
tion clearance. It is a trending topic due to the increasing demand
for biometric authentication on mobile devices, high-security
areas, among others. This work introduces a new database named
Sense Wax Attack dataset (SWAX), comprised of real human and
wax figure images that endorse the problem of face spoofing
detection. The dataset consists of more than 1800 face samples of
55 people/waxworks, arranged in training, validation and test sets
with a large range in expression, illumination and pose variations.
Experiments performed with baseline methods show that despite
the progress in recent years, advanced spoofing methods are still
vulnerable to high-quality violation attempts. Fig. 1. A Korean celebrity on the left and a realistic sculpted wax dummy.

I. I NTRODUCTION
This work presents a database comprised of both real human
Over the last decade, biometric systems have been widely and wax figure images in the interest of endorsing the problem
employed on the recognition of humans beings. These tech- of face spoofing detection. The proposed database, entitled
niques aim at recognizing individuals by taking into account Sense Wax Attack dataset (SWAX), is designed to investigate
their behavioral and observable aspects. In the same degree, the problem in which a face image is presented to a system that
biometric technologies are constantly susceptible to malicious must determine whether it categorizes a bona fide (authentic)
attacks and can be exposed to emerging high-quality cheating or a counterfeit (attack) sample. The goal is to deliver a novel
mechanisms, requiring research to persevere with the concep- spoofing benchmark as well as to specify the dataset protocols
tion of new anti-imposture approaches and databases. and the usage requirements associated with them.
The term spoofing, also referred to as presentation attack, Although several datasets are focused on spoofing detection,
represents a legitimate threat for biometric systems. The most of them engage in recapturing authentic images or
attack occurs when a malefactor attempts to pass him/herself videos in distinct mediums [5]–[7] by varying input sensors,
off as someone who carries an advantageous authentication attack types and capture conditions. Only recently have few
clearance. Intruders regularly employ falsified data to bypass researchers turned themselves to modeling emerging attack
security procedures and gain illegitimate access. As a coun- strategies [8]–[10], such as silicon masks, which may involve
termeasure to presentation attacks, new datasets are regularly concealing a suspect’s real appearance or impersonating some-
released in the literature as an attempt to address the latest one else’s identity. To the best of our knowledge, SWAX is the
attack nuances and leverage upcoming researches [1]–[3]. only dataset comprising both real and wax-modeled persons
The human face may have become the “universal” authen- up to the present time. It provides means of studying presenta-
tication biometric for several reasons, such as the increasing tion attacks in unconstrained environments as it encompasses
distribution of personal pictures on social networks, smart significant divergence in pose, expression, lighting, and scene,
devices management, spread of surveillance cameras, and and camera settings.
convenience to name a few [4]. Therefore, most spoofing The proposed SWAX dataset consists of genuine and coun-
attacks comprise still or motion facial pictures from users terfeit images for all available subjects. As illustrated in
enrolled in recognition systems since face images are easily Figure 1, the database contains labeled photographs of char-
acquired due to their broad availability when contrasted to acters, celebrities and public figures, chosen based on a list
iris and fingerprint data. The low-cost access to face images of waxworks obtained from a famous chain of wax museums.
contributes to the increase of criminals designing presentation More precisely, this work aims at investigating face spoofing
attacks to be validated as authentic users, turning face spoofing detection in realistic scenarios, whither there is little control
into a popular way of deceiving biometric applications. over the pictures acquisition. In contemplation of fair algo-
WIFS‘2019, December, 9-12, 2019, Delft, Netherlands. rithm comparisons, we provide four protocols for developing
XXX-X-XXXX-XXXX-X/XX/$XX.00 2019c IEEE. and evaluating algorithms using the SWAX benchmark.
Fig. 2. Face samples collected from online resources and constituting bona fide (top row) and counterfeit (bottom row) images.

The main contributions of this work are: 1) generation Erdogmus et al. [12] designed 3 D - MAD, the first publicly
of a public set of real human and wax figure images in available 3D spoofing benchmark, composed of real access
the wild, indicating that these faces carry most common and mask attack videos of 17 subjects recorded by a Kinect
variations observed in ordinary scenarios and, consequently, sensor. Liu et al. [8] came up with 12 masks from two
represent practical situations; 2) development of straightfor- distinct factories and having different presentation qualities,
ward baselines, which stand on long-established techniques in constituting a database with more than a thousand videos.
the interest of meeting the requirements of each proposed pro- Manjani et al. [10] propose SMAD, a database containing
tocol and guarantee a legitimate comparison among algorithms face videos of subjects with and without silicone masks
considering the sort of data they are trained on. under unconstrained environments. The masks are stretchable,
allowing actors to speak and blink, in an attempt to pass on
II. R ELATED WORKS lifelike sensation. Similarly, Bhattacharjee et al. [3] consider
rigid and flexible silicone masks as they release CS - MAD, a
Most datasets in the literature concentrate either on mod- dataset captured using low-cost cameras with the collaboration
eling impersonation attack strategies or recapturing genuine of 14 subjects.
images and videos on different mediums. The former demands
a high expertise level since it requires the manufacturing of Attacks involving rigid, silicone and latex rubber masks
realistic masks and, therefore, is an expensive process. The consistently fail to proper fit and match real faces. The use of
latter usually involves high-quality cameras for the sake of masks can restrict natural facial functions/movements such as
capturing deceptive print and video replay attacks. smiling, talking and blinking. The problem becomes even more
Chingovska et al. [1] proposed the REPLAY- ATTACK, a complex under the need of accurately reproducing another
biometric dataset consisting of more than one thousand short person’s appearance due to the requirement of facial casts.
video recordings from 50 subjects. In a similar manner, More precisely, a realistic mask impersonation demands a
Zhang et al. [11] presented CASIA - FASD, a dataset with 50 face mold from the subject being modeled and one more
individuals where falsified samples derive from authentic still mold from the person about to wear the disguise. The process
and motion pictures. Wen et al. [5] proposed MSU - MFSD, of making lifelike masks can cost high figures due to its
which contains 280 video recordings of photo and video attack manufacturing complexity and, to make matters worse, real
attempts of 35 subjects in the interest of modeling mobile presentation attacks do not rely on the cooperation of the
phone attacks. Boulkenafet et al. [6] designed one of the individual “being cloned”. All these drawbacks may end up
largest mobile-based benchmarks, OULU - NPU, holding more turning the adoption of masks into an unfeasible situation.
than four thousand real-access and attack videos captured Besides the dataset proposition as described earlier, there
using frontal cameras of six mobile devices. Liu et al. [7] are numerous approaches focusing on print or replay spoofing
introduced another massive anti-spoofing dataset, called SIW, attacks. Many methods deal with the design of handcrafted
with live and spoof videos of 165 individuals covering a large feature descriptors and learning algorithms whereas others fo-
range of expression, illumination and pose variations. cus on the world-wide trend on convolutional neural networks,
Even though been very important to provide advances in the as described as follows.
research area, the aforementioned datasets represent situations Wen et al. [5] came up with an algorithm built on im-
that fail to generalize well in conditions where attacks do not age distortion analysis and low-level feature descriptors that
proceed from print or replay spoof mediums. They are limited consists of an embedding of SVM classification algorithms
to real subjects accessing laptops and mobile devices through evaluated on cross-dataset scenarios. Pinto et al. [13] explored
built-in or peripheral cameras, whereas attacks originate from the spatial domain during the recapture process as it takes over
photo or video recordings of the same person. Thus, they the noise with Fourier transforms followed by visual rhythm
require that algorithms learn whether a medium corresponds algorithms and the extraction of gray-level co-occurrence
to an attack rather than the individual itself. A way to make matrices. Li et al. [14] employed a multiple-input hierarchical
biometric systems robust to other realistic intrusions is to lay neural network combining either shearlet or optical-flow-based
out medium-free presentation attacks, like mask-based attacks. features.
Valle et al. [15] presented a transfer learning method using A. Evaluation Protocols
a pre-trained DNN model on static features to recognize The following paragraphs specify the evaluation protocols as
photo, video and mask attacks. Liu et al. [7] combined DNN we provide a guideline in the interest of avoiding data misuse
with RNN to estimate the depth of face images along with and clarifying the proper manipulation of training, validation
rPPG signals to boost the detection of unauthorized access. and test collections. These sets are characterized by the cor-
Jourabloo et al. [16] decomposed a spoofing face into noise responding subject/figure identity in which each face image is
and face information by modeling the process of how the presented as either bona fide (authentic) or counterfeit (attack)
attack is generated from its original live image. sample. The proposed protocols and evaluation procedures are
Only recently have researchers turned their attention to inspired on the update report of Huang et al. [20] although
fraudulent 3D-mask attacks as they start to become significant it defines instructions for face identification and verification
threats. Steiner et al. [17] adopted spectral signatures of problems. The four protocols2 are: unsupervised with addi-
material surfaces in the short wave infrared range to dis- tional data, restricted without additional data, unrestricted with
tinguish human skin from other materials, regardless of the no-label additional data, and totally unrestricted.
hide characteristics. Bhattacharjee et al. [18] used off-the- 1) Protocol 01: Unsupervised, with additional data: Unsu-
shelf extended-range imaging devices to detect print and replay pervised learning is a paradigm characterized by autonomous
intrusions as well as mask-based presentation attacks near- intelligence that learns about the data they observe with focus
real-time applications. Many literature methods end up being on no particular problem. This task relies on having input
restricted to specific datasets domains, especially when cam- data and no corresponding output variables in the interest of
eras have comparable capture quality. However, they usually modeling its underlying structure or distribution and thereupon
fail to achieve good results on cross-dataset evaluations [7], learn more about the data [21]. More precisely, algorithms
[13], [19]. have access to input features but no associated target variable.
III. T HE SWAX DATASET For the spoofing detection problem, an unsupervised method
The proposed Sense Wax Attack dataset (SWAX)1 is com- cannot access information indicating whether an image sample
piled from unrestrained online resources and consists of char- is authentic or concerns a counterfeit attack. There should
acters, celebrities and public figure images to whom wax be no access to image class labels or their respective label
dummies have been sculpted into. As a result of collecting statistics, and any plan of action to generating these labels is
image samples from the Internet, there are differences in prohibited.
resolution, aspect ratio, quality and file format. All pictures The unsupervised protocol is crucial to face spoofing de-
are manually captured under uncontrolled scenarios, formed tection since it models the dominant uncertainty of realis-
by uncooperative individuals and distinct camera viewpoints. tic scenarios: unexplored presentations attacks. Real-world
The SWAX database consists of 1,812 images of 55 peo- biometric applications are inclined to anticipate all sorts of
ple/figures, arranged in training, validation and test sets. Fig- illegal intrusions and undergo attacks of distinct nature, which
ure 2 points up the main aspects of the proposed database, are unknown to the training stage. However, unpredictable
depicting authentic and attack samples, respectively. Bona attack characteristics require a notable generalization potential
fide individual samples also incorporate human appearance that is not usually represented in supervised and multi-class
alterations due to the natural aging process as pictures are classification techniques as they tend to become too specific
collected without any sort of age restraint. Some subjects to some particular attacks.
samples also present accessories like spectacles or hats, and It is pertinent to handle spoofing detection problems with
some male individuals may even have mustaches or beards. one-class-based techniques when labels are not available. Al-
Consequently, there is a significant variation in facial expres- ternate unsupervised strategies comprehend autoencoders [16],
sions, illumination, pose, scene, camera types and imaging clustering [22] and domain adaptation [23] algorithms. One-
conditions. class classification can be defined as a special case of unsu-
The proposed dataset comprises mutually exclusive sets, pervised learning where only the class comprising authentic
which indicates that no subject identity contained in the face images is well characterized by training data instances. It
training set can be made available in the validation or the implies that counterfeit images are not known at training time
test sets simultaneously. The same applies to validation and but may emerge at test time. In essence, having all image
test sets, suggesting that alike subjects should not come out samples sharing identical labels in the target variable is analo-
in distinct subsets. For development purposes, the proposed gous to being unlabeled seeing that there is no discriminative
dataset encompasses independent randomly generated splits information about their class names [24].
for 10-fold cross validation in an attempt to escape unfairly For the sake of following the unsupervised protocol appro-
algorithm biases and expose overfitting occurrences. In addi- priately, the following conditions shall apply:
tion, the data presented in the SWAX dataset should be used • Procedures claimed to be unsupervised cannot make use
as is, according to the provided training, evaluation and test of presentation attack samples at the training stage, being
sets. 2 Protocol 01 follows the unsupervised paradigm whereas the remaining
1 The SWAX dataset will be released upon paper acceptance. ones adhere to the supervised learning task.
restricted to authentic samples only. such as manual face segmentation or facial landmark
• There should be no parameters carrying bona fide or annotation.
counterfeit labels, not to mention any other relevant 3) Protocol 03: Unrestricted, with no-label additional data:
information such as file names or singular identifiers. Differently from the first two protocols, which restrain in-
• Approaches can neither benefit from “beforehand in- vestigators and developers in such a way they must employ
formation” concerning the number of samples included either no-label or SWAX training data alone, protocol 03
in each class nor use the label distribution inherent to acknowledges the exploitation of additional data sources in
training and test sets. the interest of improving an algorithm’s precision. Actually,
• Supplementary samples, outside of SWAX database, are the main distinction between protocols 02 and 03 is that the
allowed in cases where they depict bona fide individuals latter supports the adoption of SWAX independent data if and
only and should not be composed of hand-labeled data only if they do not consist of bona fide or counterfeit labels.
or information indicating whether images are authentic The external training data may encompass genuine or
or comprise presentation attacks. fraudulent face samples, but the annotation data must be
• A decision threshold is not supposed to be established limited to labeled or segmented face components, for instance,
in view of test results as it suggests that the threshold is mouth contour and eye corners to name a few. Auxiliary
being chosen in a supervised manner. algorithms proposed for feature description, face detection
2) Protocol 02: Restricted, without additional data: In the and face alignment, even though having the possibility of
supervised paradigm, the training process infers a function being tuned on non-SWAX face samples, are legitimate as we
that relates input values to the output based on previously understand that outside data and methods used exclusively
seen data-label pairs. According to Brink et al. [25], one for no-spoofing reasons is significantly different from using
of the most difficult obstacles in supervised learning is the supplementary data/methods to learn spoofing detection classi-
acquisition of training instances with known target variables fiers, deserving an exclusive category. Therefore, this protocol
due to its expensive or time-consuming effort to label data. On grants permission to utilizing feature extractors and alignment
the other hand, it supports multiple classes and well-definite algorithms that have been developed for completely unrelated
decision boundaries that are capable of distinguishing different purposes.
classes accurately. The third protocol is distinguished from the others as it is
The second protocol is in consonance with the supervised subjected to the following conditions:
learning task, in which there is access to target variables for a • Outside data should not consist of individuals included
set of training data. Each face image is provided with one out in SWAX database.
of two categories: bona fide or counterfeit class. These two • External images cannot hold corresponding information
labels represent ground-truth information that favor the model indicating whether they are genuine or fraudulent sam-
in encountering the best-fitted mapping function to guarantee ples.
good predictions about “future” image presentations. That is, • Additional images can be labeled with keypoints or seg-
authentic and attack samples not listed in the training set ments for the sake of designing preprocessing algorithms
to provide more accurate guidance on how well approaches and, as a result, enhance the overall performance of
generalize to unseen data. spoofing detection methods.
In contrast to the unsupervised paradigm, this protocol • Non- SWAX annotations cannot present information that
admits information indicating whether an image sample is may accredit the formation of authentic or attack face
authentic or consists of an attack. Still, it dismisses any type images.
of annotation or data from outside SWAX database, including 4) Protocol 04: Totally Unrestricted: Supplementary la-
supplementary image samples, external tools like facial land- beled data play an important role in machine learning algo-
mark detectors and alignment methods learned on separate rithms as it qualifies biometric algorithms to identify a broader
images, or feature extractors trained on other data sources. variety of spoofing patterns. On the contrary previously de-
Such external algorithms must be unsupervised, and the data scribed protocols, protocol 04 endorses the use of additional
they operate on must be entirely within SWAX database’s image samples regardless of the data annotation provided. The
training sets. In other words, authentic/attack training labels totally unrestricted pattern is the most permissive protocol as it
provided in protocol 02 are authorized to be used along with admits outside datasets, external feature extractors and other
external algorithms, provided that they satisfy the requirements methods that have been built on independent image data as
below: long as they adhere to the subsequent requirements:
• Algorithms under this protocol are not allowed to use • Supplementary genuine and fraudulent images in which
supplementary data, either to identify presentation attacks their corresponding identity is not available in the
or perform any kind of image preprocessing. SWAX database.
• Validation and test sets are exclusive to their own pur- • External face samples may include annotated keypoints,
poses and cannot be employed to learn auxiliary methods. attributes, segments as well as information carrying bona
• Researchers cannot rely on supplementary labeled data, fide or counterfeit labels.
Protocol 04 admits cross-dataset experiments in the interest A. External Datasets
of assessing the generalization capability of algorithms and Protocol 01 and 04 authorize researchers to incorporate ex-
increasing their performance. In consequence, researchers are ternal data in the interest of evaluating forthcoming approaches
allowed to avoid SWAX samples, using external data only in appropriately. We select Labeled Faces in the Wild (LFW) [29]
the training stage, but compelled to evaluate the designed database, initially designed for studying the problem of uncon-
approaches on the provided testing splits. The idea of combin- strained face recognition, to compose the proposed unsuper-
ing multiple datasets provides improved statistical power and vised analysis. LFW is compatible with SWAX’s first protocol
enhanced classification ability on different domains, turning as it only comprises face images of distinct public figures, hav-
an algorithm more robust to diversified attacks. ing no incidence of waxwork models. The MSU Mobile Face
Spoofing Database (MFSD) consists of genuine live recordings
B. Evaluation Metrics and distinct counterfeit shots captured by different cameras
The vast majority of spoofing detection approaches in in different scenarios. MFSD satisfies the totally unrestricted
the literature explore machine learning algorithms. Evalua- protocol requirements as it provides authentic/attack label
tion metrics are employed in order to point out a model’s information for the training stage. Supplementary datasets are
performance. For the SWAX database, we select two differ- depicted in Table I between parenthesis.
ent metrics: the Receiver Operating Characteristic and the
B. Feature Descriptors
standardized ISO/IEC 30107-3 assessment mechanisms for
biometric systems. We select GLCM [30] and LBP [31] feature descriptors since
The IEC 30107-3 designates a particular set of metrics traditional machine learning algorithms, such as SVM [32] and
for supervised and unsupervised paradigms [26]. They are PLS [33], usually require the extraction of visual features in
denominated Attack lower dimensional spaces. GLCM features are computed with
PVP APresentation Classification Error Rate, directions θ ∈ {0, 45, 90, 135} degrees, distance d ∈ {1, 2},
APCER = V 1 i=1 (1 − Resi ); and Bona Fide Presenta-
PA
tion Classification Error Rate, BPCER = VBF 1
PVBF 16 bins and six texture properties: contrast, dissimilarity,
i=1 (Resi ).
VP A indicates the number of spoofing attacks whereas VBF homogeneity, energy, correlation, and angular second moment.
expresses the total number of authentic presentations. Resi They are obtained from each image’s corresponding Fourier
receives the value 1 when the i-th probe video presentation transform in the frequency domain [13]. LBP information
is categorized as an attack and 0 if labeled as bona fide comprises 256 bins, a radius equal to 1, and eight points
presentation. APCER and BPCER resemble False Acceptance arranged in a 3 × 3 matrix. They are taken from each face
and False Rejection Rates, traditionally employed in the image color band after being converted from RGB into HSV
literature when assessing binary classification methods. The and YC R C B [34]. ALL refers to experiments in which GLCM and
LBP descriptors are extracted and concatenated into a single
Average Classification Error Rate (ACER) summarizes the
overall system performance as it comprehends the mean of feature vector.
APCER and BPCER at the decision threshold determined on C. Baselines
the validation set [27].
As showed in Table I, some approaches have been picked
Although not used in this work, researchers can also con- out from literature and employed as baselines, meeting the
sider both extensively employed Receiver Operating Charac- requirements and cross validation fold splits of each SWAX’s
teristic (ROC) and its associated Area Under Curve (AUC) [28] protocol. W SVM [35] is a classification algorithm that provides
for the unsupervised task. The ROC curve is a performance solutions for non-linear classification in open-set scenarios.
measurement with varying thresholds that tells how much a For the unsupervised paradigm, we combine one-class W SVM
model is capable of distinguishing between classes, whereas with different feature descriptors in an attempt to encounter
AUC represents its degree of separability. The curve presents
an adequate boundary between authentic and attack samples.
the true positive rate on the Y axis and the false positive rate R HYTHM [13] searches for frequency domain artifacts as it
on the X axis, indicating that the plot’s top left corner is the is trained and tested on SWAX’s second and third protocols in
optimal point. Perfect presentation attack detection systems favor of distinguishing counterfeit from valid images. Since
would present true positive rates for the ROC curve equal to Protocol 03 admits external data and methods as long as they
one. Similarly, AUC ranges from zero to one, being preferable are not designed for spoofing purposes, additional algorithms
values approaching one. can be engaged in preprocessing tasks. A LIGN [36] is a
DNN -based face alignment method that iteratively improves
IV. BASELINE E XPERIMENTS
the locations of the facial landmarks estimated in previous
In addition to analyzing the performance of some algorithms stages. Such alignment algorithms tend to improve biometric
on the SWAX benchmark in terms of the proposed protocols, systems performance due to carefully positioning subject faces
this section also reports the employed feature descriptors, ad- into a canonical pose. For Protocol 04, D E -S POOFING [16]
ditional dataset and implementation details of straightforward performs a cross-dataset evaluation as it is trained on OULU -
spoofing baselines. Table I shows the achieved results for all NPU dataset [6] and then aims to estimate spoof noises from
protocols following the IEC 30107-3 evaluation metrics. SWAX probe face samples.
TABLE I [6] Z. Boulkenafet, J. Komulainen, L. Li, X. Feng, and A. Hadid, “Oulu-npu:
APCER AND BPCER RESULTS (%) ON SWAX ’ S PROTOCOLS . A mobile face presentation attack database with real-world variations,”
Prot. Method APCER BPCER ACER in FG. IEEE, 2017.
GLCM +W SVM 85.8 ± 3.65 20.2 ± 3.93 53.0 ± 3.79 [7] Y. Liu, A. Jourabloo, and X. Liu, “Learning deep models for face anti-
LBP +W SVM 84.2 ± 4.92 11.4 ± 2.15 47.8 ± 3.53 spoofing: Binary or auxiliary supervision,” in CVPR. IEEE, 2018.
1
ALL +W SVM 89.9 ± 3.10 11.0 ± 4.26 50.5 ± 3.68 [8] S. Liu, B. Yang, P. C. Yuen, and G. Zhao, “A 3d mask face anti-spoofing
ALL +W SVM ( LFW ) 83.6 ± 4.38 16.2 ± 5.32 49.9 ± 4.85 database with real world variations,” in CVPRW, 2016.
R HYTHM 57.8 ± 12.1 41.0 ± 8.84 49.5 ± 10.5 [9] A. Agarwal, D. Yadav, N. Kohli, R. Singh, M. Vatsa, and A. Noore,
2 ALL +E PLS 34.6 ± 1.75 20.6 ± 2.13 27.6 ± 1.94 “Face presentation attack with latex masks in multispectral videos,” in
ALL +E SVM 46.0 ± 2.12 19.9 ± 2.10 33.0 ± 2.11 CVPRW, 2017.
A LIGN +R HYTHM 51.1 ± 5.24 43.2 ± 4.86 47.1 ± 5.05 [10] I. Manjani, S. Tariyal, M. Vatsa, R. Singh, and A. Majumdar, “Detecting
3 A LIGN + ALL +E PLS 32.4 ± 1.68 19.8 ± 1.25 26.1 ± 1.46 silicone mask-based presentation attack via deep dictionary learning,”
A LIGN + ALL +E SVM 44.9 ± 3.29 22.0 ± 3.40 33.5 ± 3.34 TIFS, vol. 12, no. 7, 2017.
D E -S POOFING 13.6 ± 6.20 67.6 ± 11.9 40.6 ± 9.07 [11] Z. Zhang, J. Yan, S. Liu, Z. Lei, D. Yi, and S. Z. Li, “A face antispoofing
4 ALL +E PLS ( MFSD ) 32.1 ± 1.21 20.1 ± 1.41 26.1 ± 1.31 database with diverse attacks,” in ICB. IEEE, 2012.
ALL +E SVM ( MFSD ) 47.9 ± 1.75 17.6 ± 1.11 32.7 ± 1.43 [12] N. Erdogmus and S. Marcel, “Spoofing in 2d face recognition with 3d
masks and anti-spoofing with kinect,” in BTAS. IEEE, 2013.
We propose an embedding of PLS [33] and SVM [32] clas- [13] A. Pinto, W. R. Schwartz, H. Pedrini, and A. de Rezende Rocha, “Using
visual rhythms for detecting video-based facial spoof attacks,” TIFS,
sifiers, entitled E PLS and E SVM, respectively. The embedding vol. 10, no. 5, 2015.
is learned on random subsets of the training set to create an [14] L. Li, X. Feng, Z. Boulkenafet, Z. Xia, M. Li, and A. Hadid, “An original
array of classifiers, guaranteeing a balanced division of bona face anti-spoofing approach using partial convolutional neural network,”
in IPTA. IEEE, 2016.
fide (positive class) and counterfeit (negative class) samples [15] E. Valle and R. Lotufo, “Transfer learning using convolutional neural
within each classification model. At test time the method networks for face anti-spoofing,” in ICIAR. Springer, 2017.
projects a probe image onto all trained classifiers and computes [16] A. Jourabloo, Y. Liu, and X. Liu, “Face de-spoofing: Anti-spoofing via
noise modeling,” in ECCV, 2018.
the ratio of the number of positive responses attained to the [17] H. Steiner, S. Sporrer, A. Kolb, and N. Jung, “Design of an active mul-
total number of classification models. If most classifiers return tispectral swir camera system for skin detection and face verification,”
positive responses, it implies that the face image is likely to Journal of Sensors, vol. 2016, 2016.
[18] S. Bhattacharjee and S. Marcel, “What you can’t see can help you-
be an authentic sample, or a spoofing attack, otherwise. extended-range imaging for 3d-mask pad,” in BIOSIG. IEEE, 2017.
Evaluated algorithms provide an insight into the perfor- [19] Z. Boulkenafet, J. Komulainen, and A. Hadid, “Face spoofing detection
mance of countermeasure methods when confronted with wax using colour texture analysis,” TIFS, vol. 11, no. 8, 2016.
[20] G. B. Huang and E. Learned-Miller, “Labeled faces in the wild:
figure images. Experiments in Table I show that advanced Updates,” Massachusetts Amherst, Tech. Rep., 2014.
spoofing methods are still vulnerable to high-quality violation [21] G. E. Hinton, T. J. Sejnowski, and T. A. Poggio, Unsupervised learning:
attempts, demanding researchers to deliver more accurate foundations of neural computation. MIT press, 1999.
[22] O. Nikisins, A. Mohammadi, A. Anjos, and S. Marcel, “On effectiveness
biometric systems. The attained results are not satisfactory and of anomaly detection approaches against unseen presentation attacks in
reveal that face anti-spoofing has a large room for improve- face anti-spoofing,” in ICB. IEEE, 2018.
ment despite the significant progress in recent years. [23] H. Li, W. Li, H. Cao, S. Wang, F. Huang, and A. C. Kot, “Unsupervised
domain adaptation for face anti-spoofing,” TIFS, vol. 13, no. 7, 2018.
V. C ONCLUSIONS [24] W. Liu, G. Hua, and J. R. Smith, “Unsupervised one-class learning for
automatic outlier removal,” in CVPR. IEEE, 2014.
This work comprises a public benchmark of real human and [25] H. Brink, J. W. Richards, M. Fetherolf, and B. Cronin, Real-world
wax figure images, entitled Sense Wax Attack dataset, devel- machine learning. Manning, 2017.
[26] “Information technology - biometric presentation attack detection,”
oped for face anti-spoofing purposes. The database consists of ISO/IEC JTC 1/SC 37 Biometrics, Tech. Rep., 2017. [Online].
different protocols, cross validation splits and evaluation met- Available: https://www.iso.org/standard/67380.html
rics to operate fair comparisons among different algorithms. [27] Z. Boulkenafet, J. Komulainen, Z. Akhtar, A. Benlamoudi, D. Samai,
S. Bekhouche, A. Ouafi, F. Dornaika, A. Taleb-Ahmed, L. Qin et al.,
Results show that wax figures can be employed in pursuance “A competition on generalized software-based face presentation attack
of bypassing biometric systems and obtain illegitimate access. detection in mobile scenarios,” in IJCB. IEEE, 2017.
Therefore, there is a lot to improve when it comes to wax- [28] A. Jain, L. Hong, and S. Pankanti, “Biometric identification,” ACM,
vol. 43, no. 2, 2000.
based face spoofing attacks. We hope that with the advent of [29] G. B. Huang, M. Mattar, T. Berg, and E. Learned-Miller, “Labeled faces
this dataset researchers will consider new sorts of spoofing in the wild: A database forstudying face recognition in unconstrained
attacks as a motivation to delivering robust methods to the environments,” in WFRLF, 2008.
[30] R. M. Haralick, K. Shanmugam et al., “Textural features for image
challenging area of spoofing detection. classification,” TSMC, no. 6, 1973.
R EFERENCES [31] T. Ojala, M. Pietikainen, and T. Maenpaa, “Multiresolution gray-scale
and rotation invariant texture classification with local binary patterns,”
[1] I. Chingovska, A. Anjos, and S. Marcel, “On the effectiveness of local PAMI, vol. 24, no. 7, 2002.
binary patterns in face anti-spoofing,” in BIOSIG, 2012. [32] I. Steinwart and A. Christmann, Support vector machines. Springer
[2] A. Pinto, H. Pedrini, W. R. Schwartz, and A. Rocha, “Face spoofing Science & Business Media, 2008.
detection through visual codebooks of spectral temporal cubes,” TIP, [33] R. Rosipal and N. Krämer, “Overview and recent advances in partial
vol. 24, no. 12, 2015. least squares,” in ISOPW. Springer, 2005.
[3] S. Bhattacharjee, A. Mohammadi, and S. Marcel, “Spoofing deep face [34] Z. Boulkenafet, J. Komulainen, and A. Hadid, “Face anti-spoofing based
recognition with custom silicone masks,” in BTAS. IEEE, 2018. on color texture analysis,” in ICIP. IEEE, 2015.
[4] S. Kumar, S. Singh, and J. Kumar, “A comparative study on face [35] W. J. Scheirer, L. P. Jain, and T. E. Boult, “Probability models for open
spoofing attacks,” in ICCCA. IEEE, 2017. set recognition,” PAMI, vol. 36, no. 11, 2014.
[5] D. Wen, H. Han, and A. K. Jain, “Face spoof detection with image [36] M. Kowalski, J. Naruniec, and T. Trzcinski, “Deep alignment network:
distortion analysis,” TIFS, vol. 10, no. 4, 2015. A convolutional neural network for robust face alignment,” in CVPRW,
2017.

You might also like