
2012 2nd IEEE International Conference on Parallel, Distributed and Grid Computing

Advances in Face Detection Techniques in Video
Zafar G. Sheikh, V. M. Thakare, S. S. Sherekar
P.G. Department of Computer Science and Engineering
SGB Amravati University
Amravati (M.S.), India
zgsheikh@gmail.com, vilthakare@yahoo.co.in, ss_sherekar@rediffmail.com

Abstract- With the enormous growth in video applications, a huge amount of video data is generated every day. The study, analysis, and investigation of recent developments help to set objectives for future work. The proposed work is motivated by this issue as it concerns face detection in video: searching, browsing, and retrieving a human face of interest from a video database will be a future demand of several applications.

The goal of the proposed work is to systematically address recent work on face detection in video through an evaluation that permits a meaningful, objective comparison of techniques and provides the research community with sufficient data for the exploration of automatic modeling techniques. The paper fulfils three objectives: i) study and classification of recent techniques, ii) analysis of the status of recent techniques with respect to results and performance, and iii) identification of the most feasible and optimized techniques, with a discussion of possible improvements. Information about some video databases is also provided. Meanwhile, such objective evaluation should remain useful to the computer vision research community for years to come.

Keywords: face detection; Adaboost; Haar features; cascade classifier

I. INTRODUCTION

Video data is growing enormously because web-enabled devices and applications such as YouTube, Google+, and Flickr constantly deal with video for searching, uploading, and downloading. Security and surveillance systems also generate a huge amount of video data every day. Research that deals with video is therefore a present requirement, and many researchers are working on the immense challenges that video poses for organization, storage, retrieval, and recovery operations.

Videos contain information about characters or persons, such as gait, speech, motion, and face. The face is a complex object, and it becomes an even more complex problem in video. Face detection in still images has reached mature results after years of research; the challenges discussed by N. Ahuja et al. [1], such as pose, scale, illumination, and expression, have become working areas for many researchers. Video-based face detection is the new challenge of the era. In the video setting, researchers focus on overcoming problems such as complex backgrounds [2] and detection and tracking on high-resolution (5 MP) NICTA smart cameras [3]. There will be future demand for video summarization and for character-based searching, browsing, and retrieval of a human face of interest from video databases in applications such as security and surveillance and personal and industrial systems, and face detection is the first step towards achieving these tasks.

The proposed work is inspired by the same issue and by the work presented by R. Kasturi et al. [4], a framework for evaluating object detection and tracking in video, specifically for face, text, and vehicle objects. This framework includes the source video data, ground-truth annotations (along with guidelines for annotation), performance metrics, evaluation protocols, and tools including scoring software and baseline algorithms. The goal of the proposed work is to systematically address recent work on face detection in video through an evaluation that permits a meaningful, objective comparison of techniques and provides the research community with sufficient data for the exploration of automatic modeling techniques. Such objective evaluation would be extremely useful to the computer vision research community for years to come, and the study should lead to the development of systems that close the gap between current techniques and these demands.

The paper is organized as follows. Section II describes the classification of face detection techniques for video. Section III contains a tabular analysis of the different approaches and of the available databases, called the significant analysis. Performance evaluation and discussion are given in Section IV. The paper ends with the conclusion in Section V.

II. STUDY AND CLASSIFICATION

This section contains a study of recent techniques for human face detection specifically in video, along with a classification of the available techniques based on the attributes they use.

A. Temporal difference approach

The temporal difference approach is effortless and easy to implement. Its methods compute differences between frames, against a background, or both over time, and object localization is then performed by other methods that scan the resulting binary image.

The model proposed by W. Lushen et al. [5] uses temporal difference and background difference to acquire and track a moving object, and mosaic gray rules are adopted to detect the human face. A Gaussian background model is built from the first frame, and a temporal difference model is applied every 30 frames. The temporal difference method obtains the current frame without the moving region, the background model is updated with the current frame, and a binarized difference image is calculated by setting a threshold. A scanning method then processes the binary image and draws a rectangle around the moving region, again by setting a threshold. After the moving target has been detected, face detection with the mosaic algorithm is performed using histogram equalization: the position and size of the face region are determined using a frequency histogram and a threshold, non-face regions are removed, and overlapping face regions are merged.

Although this approach is simple and easy to implement, the threshold value has to be set manually in every operation, and because it relies entirely on background or frame differences, its results vary with lighting conditions.
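To make the frame/background differencing idea above concrete, the following is a minimal sketch in Python with OpenCV. The input file name, the fixed binarization threshold, the minimum blob area, and the background update rate are illustrative assumptions, not the settings of [5], and the mosaic-based face verification step is omitted.

```python
import cv2
import numpy as np

# Minimal sketch of temporal/background differencing with a slowly updated
# background model. "input.avi", the threshold of 25, the 500-pixel area
# filter and the 0.05 update rate are illustrative values, not those of [5].
cap = cv2.VideoCapture("input.avi")
ok, frame = cap.read()
background = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY).astype(np.float32)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY).astype(np.float32)

    # Difference against the background model, binarized by a fixed threshold
    diff = cv2.absdiff(gray, background)
    _, motion_mask = cv2.threshold(diff.astype(np.uint8), 25, 255,
                                   cv2.THRESH_BINARY)

    # Scan the binary image and draw rectangles around the moving regions
    contours, _ = cv2.findContours(motion_mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    for c in contours:
        if cv2.contourArea(c) > 500:          # ignore small noise blobs
            x, y, w, h = cv2.boundingRect(c)
            cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)

    # Slowly fold the current frame back into the background model
    cv2.accumulateWeighted(gray, background, 0.05)

    cv2.imshow("moving regions", frame)
    if cv2.waitKey(1) == 27:                  # Esc quits
        break

cap.release()
cv2.destroyAllWindows()
```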

B. Multi feature based approach

Methods that combine different facial features have been proposed to locate or detect faces. Most of them use global features such as skin color, size, and shape to find face candidates, and then verify these candidates using local, detailed features such as eyebrows, nose, and hair.

Face detection in a given frame or still image has been proposed using YUV color information [6]. Every region is decomposed into wavelet-domain sub-images with a 2D high-pass filter or wavelet packet decomposition, and the resulting image contains the edge information. The edge projections of the candidate image region are used as features; the components of the feature vector are the horizontal, vertical, and filter-like projections. Dynamic programming and SVM classifiers are used for the classification problem: a distance metric is measured between the probe feature vector and a typical prototype feature vector, and classification is applied by thresholding the resulting distance.
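The edge-projection features of [6] can be sketched as follows; here a plain Sobel high-pass filter stands in for the wavelet sub-image decomposition of the cited work, and the image file, the stored prototype vector, and the distance threshold are hypothetical placeholders.

```python
import cv2
import numpy as np

def edge_projection_features(bgr_region, size=(32, 32)):
    """Horizontal and vertical projections of an edge map as a feature vector.
    A Sobel high-pass stands in for the wavelet sub-images of [6]; the 32x32
    size and the L2 normalisation are illustrative choices."""
    gray = cv2.resize(cv2.cvtColor(bgr_region, cv2.COLOR_BGR2GRAY), size)
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0, ksize=3)
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1, ksize=3)
    edges = cv2.magnitude(gx, gy)
    feat = np.concatenate([edges.sum(axis=1), edges.sum(axis=0)])
    return feat / (np.linalg.norm(feat) + 1e-8)

# Nearest-prototype decision by distance threshold, mirroring the
# "probe vector vs. prototype vector" step described for [6]. The image file,
# the stored prototype and the 0.5 threshold are hypothetical placeholders.
probe = edge_projection_features(cv2.imread("candidate_region.png"))
prototype = np.load("face_prototype.npy")
is_face = np.linalg.norm(probe - prototype) < 0.5
print("face candidate" if is_face else "rejected")
```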
C. Template based

Template matching is based on correlation values. Features such as contours and edges are extracted, and their relative locations are matched against predetermined estimates; conformance to a certain 'skeleton' is considered a detection.

Real-time face detection in CMOS technology has been proposed by Y. Hori and T. Kuroda [7]. The design is optimized for both algorithm and hardware to improve performance while reducing area and power dissipation. Two kinds of templates with facial features are proposed to achieve high-speed and accurate face detection, and a Steady State Genetic Algorithm is used for the high-speed hardware implementation of template matching. To reduce area and power dissipation, the frame memory is kept to a minimum and the detection engine is shared between the two kinds of template matching. The core can detect eight faces in each frame of moving pictures at 30 frames/second.
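A minimal sketch of correlation-based template matching with OpenCV is shown below; the frame and template images and the acceptance threshold are illustrative and unrelated to the optimized hardware templates of [7].

```python
import cv2

# Correlation-based template matching. The frame, the face template and the
# 0.7 acceptance threshold are illustrative; [7] uses optimized templates and
# a genetic algorithm in hardware.
frame = cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE)
template = cv2.imread("face_template.png", cv2.IMREAD_GRAYSCALE)

# Normalised cross-correlation score for every template position
scores = cv2.matchTemplate(frame, template, cv2.TM_CCOEFF_NORMED)
_, best_score, _, best_loc = cv2.minMaxLoc(scores)

if best_score > 0.7:
    h, w = template.shape
    cv2.rectangle(frame, best_loc, (best_loc[0] + w, best_loc[1] + h), 255, 2)
    print("face candidate at", best_loc, "score", round(float(best_score), 3))
```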
D. Wavelet based probabilistic method

R. Verma et al. [8] have presented a probabilistic method for detecting and tracking multiple faces in a video sequence. Pose is represented by the combination of two detectors, one for frontal views and one for profiles. Face detection is fully automatic and is based on the well-known method developed by Schneiderman and Kanade, which uses local histograms of wavelet coefficients represented with respect to a coordinate frame fixed to the object. A probability of detection is obtained for each image position at several scales and poses. The detection probabilities are propagated over time using a Condensation filter and factored sampling: prediction is based on a zero-order model for position, scale, and pose, and the update uses the probability maps produced by the detection routine. The method handles multiple faces, appearing and disappearing faces, and changing scale and pose.

E. Naive Bayes classifier method

D. Nguyen et al. [9] have proposed a technique for face detection and lip feature extraction in which a Naive Bayes classifier classifies an edge-extracted representation of an image for face detection. Using the edge representation significantly reduces the model size to only 5184 B, which is 2417 times smaller than a comparable statistical modeling technique, while achieving an 86.6% correct detection rate under various lighting conditions.
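The idea of classifying an edge-extracted, binarised window with a Naive Bayes model can be sketched in software as follows (the cited work [9] runs on an FPGA with its own pipeline); the window size, edge threshold, and the training lists are assumptions introduced only for illustration.

```python
import cv2
import numpy as np
from sklearn.naive_bayes import BernoulliNB

def binary_edge_vector(gray_window, size=(19, 19), edge_threshold=64):
    """Edge-extracted, binarised representation of a window, flattened into a
    feature vector. The 19x19 window and the threshold are illustrative."""
    patch = cv2.resize(gray_window, size)
    gx = cv2.Sobel(patch, cv2.CV_32F, 1, 0, ksize=3)
    gy = cv2.Sobel(patch, cv2.CV_32F, 0, 1, ksize=3)
    edges = cv2.magnitude(gx, gy)
    return (edges > edge_threshold).astype(np.uint8).ravel()

# face_windows and nonface_windows are assumed to be lists of grayscale
# patches prepared elsewhere; they are placeholders, not data from [9].
X = np.array([binary_edge_vector(w) for w in face_windows + nonface_windows])
y = np.array([1] * len(face_windows) + [0] * len(nonface_windows))
model = BernoulliNB().fit(X, y)

# Classify a new window as face / non-face
is_face = model.predict([binary_edge_vector(new_window)])[0] == 1
```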
F. Adaboost based

In [10], W. L. Cheong et al. perform face detection by drawing bounding boxes on each frame. A simple Adaboost-based optimization rescales or resizes the image to a threshold value, which requires less processing time and space than the non-optimized image. Only a face tracked in the live video sequence for at least an existence threshold is recognized. Face tracking is performed by finding the closest match between the bounding-box positions of the face candidates in the current and previous frames; the pair with the minimum distance is matched. Face recognition is template-based or uses Eigenfaces.

L. Zhang et al. [11] use a background subtraction method. The current image has a fixed background, and the difference yields the candidate region of the moving face: the grayscale difference between each point of the current image and the background image is computed, and a threshold determines which pixels are moving points. A dynamic threshold method improves the robustness of the motion detection, and the initial background image is acquired with a median method. A median filter removes some noise, and morphological closing and opening operations give a more accurate motion region, whose boundaries are determined by a projection method. Finally, the Viola-Jones face detector with Adaboost and Haar features is applied to detect the human face region.
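The Viola-Jones detector with Haar features and an AdaBoost-trained cascade, as used in [11], is available off the shelf in OpenCV; the sketch below uses the stock frontal-face cascade and generic parameters, not the exact configuration of the cited works, and the video file name is a placeholder.

```python
import cv2

# Off-the-shelf Viola-Jones cascade (Haar features + AdaBoost) from OpenCV.
# The stock frontal-face cascade and the parameters below are generic
# defaults, not the exact configuration of [10] or [11].
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

cap = cv2.VideoCapture("input.avi")            # hypothetical video file
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.equalizeHist(cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY))
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1,
                                     minNeighbors=5, minSize=(30, 30))
    for (x, y, w, h) in faces:
        cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.imshow("faces", frame)
    if cv2.waitKey(1) == 27:
        break
cap.release()
cv2.destroyAllWindows()
```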
C.-R. Chen et al. [12] have proposed a novel cascade face detection architecture based on a reduced two-field feature extraction scheme for faster integral image calculation and feature extraction. The authors claim that the proposed scheme reduces the memory required for storing integral images by 75% and employs multiple register files instead of a single SRAM to speed up the integral image updating and feature extraction processes. The reduced integral images retain only 5% of the features of the original images; although this requires more weak classifiers, the proposed parallel cascade detection architecture reduces the average detection time per feature to 63% of the original. Integral images are used to quickly generate feature values, and in the training process a set of features called weak classifiers is generated from the training images using the AdaBoost algorithm. The detection process performs cascade detection on a sliding sub-window in the image: the feature value extractor fetches the selected features from the integral images, and the cascade detection process compares the trained weak classifiers with the calculated feature values to decide whether to discard the sample or identify a face. Each selected feature includes five parameters: type, position, size, threshold, and weight. It is nearly impossible to build an efficient classification system with a single weak classifier, so iterations are needed to select more features; weak classifiers with different weights form a strong (final) classifier, and the cascade classifier discards negative samples in the early stages while maintaining high detection rates. With the two-field integral image, the approach retains two integral windows (IW), the odd-field and the even-field IW, of the same size simultaneously during the face detection procedure. Using the odd-field integral image for feature extraction allows the system to calculate the even-field integral image from the frame buffer and update it to a new location for subsequent detection. The reduced two-field integral image decreases the storage requirements for integral images, speeding up feature extraction and lowering the detection time, and a tradeoff in the precision of storing the input image makes it possible to reduce the frame buffer size. Although the total number of weak classifiers is greater when using the RFS, the major increase falls in the later stages of the cascade classifier.
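The integral image that makes Haar-like feature values cheap to evaluate can be computed with two cumulative sums; the sketch below is a generic single-field version for illustration, not the reduced two-field scheme of [12], and the frame is random stand-in data.

```python
import numpy as np

def integral_image(gray):
    """Summed-area table: ii[y, x] holds the sum of all pixels above and to
    the left of (x, y). Zero padding on the top/left row and column removes
    boundary checks in box_sum."""
    ii = np.cumsum(np.cumsum(gray.astype(np.int64), axis=0), axis=1)
    return np.pad(ii, ((1, 0), (1, 0)))

def box_sum(ii, x, y, w, h):
    """Sum of the w x h rectangle with top-left corner (x, y), obtained from
    the integral image with four lookups - the property that makes Haar-like
    feature values cheap to evaluate."""
    return ii[y + h, x + w] - ii[y, x + w] - ii[y + h, x] + ii[y, x]

# A two-rectangle Haar-like feature (left half minus right half) on a random
# stand-in frame; a real detector evaluates thousands of such features per
# sliding sub-window.
gray = np.random.randint(0, 256, (120, 160))
ii = integral_image(gray)
feature_value = box_sum(ii, 10, 10, 12, 24) - box_sum(ii, 22, 10, 12, 24)
print(feature_value)
```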
A robust face detection technique with mouth localization, processing every frame in real time (video rate), is presented in [13]. It is further exploited for on-site motion analysis to verify "liveness" and to achieve lip reading of digits. A methodological novelty is the suggested quantized angle features ("quangles"), designed for illumination invariance without the need for preprocessing (e.g., histogram equalization). This is achieved by using both the gradient direction and the double-angle direction (the structure tensor angle), and by ignoring the magnitude of the gradient. Boosting techniques are applied in the quantized feature space. A major benefit is reduced processing time: the training of effective cascaded classifiers is feasible in a very short time, less than 1 h for data sets of order 10^4. Scale invariance is implemented through an image scale pyramid. "Liveness" verification barriers are proposed as applications for which a significant amount of computation is avoided when estimating motion, and novel strategies to avert advanced spoofing attempts (e.g., replayed videos that include person utterances) are demonstrated. Favorable results are presented on face detection for the YALE face test set and competitive results for the CMU-MIT frontal face test set, as well as on the "liveness" verification barriers.

A robust object/face detection technique processing every frame in real time (video rate) is presented in [14]. Again the methodological novelty is the quantized angle features ("quangles"), designed for illumination invariance without the need for preprocessing such as histogram equalization, obtained by using both the gradient direction and the double-angle direction (the structure tensor angle) and by ignoring the gradient magnitude. Boosting techniques are applied in the quantized feature space. Separable filtering and the use of lookup tables favor detection speed, and the gradient may then be reused for other tasks as well. A side effect is that the training of effective cascaded classifiers is feasible in a very short time, less than 1 hour for data sets of order 10^4.
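A rough illustration of the quantized-angle idea follows: the gradient orientation and its double-angle version are kept, the magnitude is discarded, and the angles are quantized into a few bins. The Sobel derivatives and the bin count are assumptions made here for illustration, not the exact recipe of [13-14].

```python
import cv2
import numpy as np

def quantized_angle_features(gray, bins=8):
    """Keep gradient orientation and its double-angle version, drop the
    magnitude, and quantize each angle into a small number of bins. The Sobel
    derivatives and the bin count are assumptions made for illustration."""
    g = gray.astype(np.float32)
    gx = cv2.Sobel(g, cv2.CV_32F, 1, 0, ksize=3)
    gy = cv2.Sobel(g, cv2.CV_32F, 0, 1, ksize=3)

    theta = np.arctan2(gy, gx)                # gradient direction in [-pi, pi)
    double = np.arctan2(np.sin(2 * theta),    # double-angle direction
                        np.cos(2 * theta))    # (structure-tensor-style angle)

    def quantize(angle):
        # Map [-pi, pi) onto integer bins 0..bins-1, ignoring magnitude
        return (np.floor((angle + np.pi) / (2 * np.pi) * bins)).astype(int) % bins

    return quantize(theta), quantize(double)

# Per-pixel quantized angle maps for a stand-in window; a boosted classifier
# would then be trained on simple tests over these maps rather than on raw
# intensities.
q_theta, q_double = quantized_angle_features(np.random.randint(0, 256, (24, 24)))
```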


III. SIGNIFICANT ANALYSIS

Analysis is an important part of any research. Analysis can be performed using scientific tools and data under a stated objective; before moving to scientific analysis, however, one should know the fundamental aspects and information, which can be called theoretical analysis. This section deals with such theoretical analysis, which is very useful for further development.

Table I provides a technical analysis of recent human face detection techniques. It includes brief information about the approach, the parameters, and the database used. The results obtained during the experiments are also covered, together with special remarks that highlight the processing steps that optimized the overall method for its particular objective.

Detailed information about some video databases is given in Table II, presented in Appendix A. It covers the related information for each video database, such as format, frame rate, number of subjects, and number of datasets available, along with special remarks regarding the camera and the video conditions. We have considered four video databases used specifically for face detection and recognition from video under controlled and uncontrolled conditions. In addition, NIST [20] provides video databases for different purposes that are available to researchers: the TREC Video Retrieval Evaluation (TRECVID), sponsored by NIST, has run a series of video "tracks" devoted to research in automatic segmentation, indexing, and content-based retrieval of digital video, and it also provides evaluation tools for content-based retrieval.

TABLE I. SIGNIFICANT ANALYSIS OF FACE DETECTION TECHNIQUES

[5] Temporal difference approach
    Technique: temporal difference + background difference
    Parameters: frequency histogram, threshold
    Database: real-time video (320 x 240 images)
    Results: not specified
    Remarks: 1. Mosaic algorithm using histogram equalization with integral image.

[6] Multi feature based approach
    Technique: edge projection + color information
    Parameters: threshold on the resulting distance
    Database: CVL database (480 x 640 images), FERET (32 x 32)
    Results: 95.8% using dynamic programming, 99.6% using SVM
    Remarks: 1. High-pass filter of the wavelet transformation.

[7] Template based
    Technique: skin-tone edge extraction + coarse face detection
    Parameters: colour and shape
    Database: moving pictures at 30 frames/sec
    Results: 92% accuracy
    Remarks: 1. Genetic algorithm for template matching. 2. Noise eliminated using a median filter and opening/closing filters.

[8] Wavelet based probabilistic approach
    Technique: Schneiderman & Kanade detection for frontal and profile faces
    Parameters: position, scale and face pose
    Database: UMIST database, videos from NIST
    Results: 3 (0.065%) missed detections and 40 false detections in the temporal approach
    Remarks: 1. Probability score. 2. Temporal information for tracking. 3. Appearance and disappearance of head pose. 4. Performance analysis on quantitative and time measures.

[9] Naive Bayes classifier based (appearance based)
    Technique: Sobel filter + Naive Bayes classifier
    Parameters: threshold
    Database: YALE
    Results: 86.6% detection rate with no false positives
    Remarks: 1. Bootstrapping to reduce training time.

[10] Adaboost based (appearance based)
    Parameters: threshold
    Database: real-time video (480 x 640 images)
    Results: detection time reduced to about 22 sec
    Remarks: 1. Resize and reduce resolution.

[11] Adaboost based (appearance based)
    Technique: fixed background + motion detection + Adaboost algorithm + Haar features
    Parameters: dynamic threshold
    Database: ORL and a self-built face database (320 x 240 images)
    Results: 92.7% correct detection rate
    Remarks: 1. Median filter for noise reduction. 2. Projection method to determine the boundaries of the motion region.

[12] Adaboost based (appearance based)
    Technique: cascade + reduced two-field feature extraction + integral image + Adaboost
    Parameters: type, position, size, threshold and weight
    Database: Bao, CMU, EssexFace94 and NCKU (160 x 120 images)
    Results: approx. 81% with a very low false positive rate
    Remarks: 1. Reduced two-field features. 2. Parallel classifier reduces time by 63%. 3. Smaller area and faster processing speed.

[13] Adaboost based (appearance based)
    Technique: Viola and Jones (Adaboost)
    Parameters: threshold
    Database: CMU-MIT frontal face set, YALE, XM2VTS
    Results: 93% detection rate with a 1x10^-6 false positive rate
    Remarks: 1. Preprocessing not required. 2. Minimum number of weak classifiers used to build the strong classifier for computation speed.

[14] Adaboost based (appearance based)
    Technique: Adaboost + gradient direction and double-angle direction + cascade classifier
    Parameters: threshold (quangle masks, number of rotations)
    Database: YALE, CMU-MIT frontal face set
    Results: 100% on YALE and 93% detection rate with a 1x10^-6 false positive rate on CMU-MIT
    Remarks: 1. Illumination-invariant features. 2. Bootstrapping strategy for replacing rejected negative samples with new ones. 3. Less training time because fewer features suffice (quangles built upon derivative features).

IV. PERFORMANCE EVALUATION AND DISCUSSION

The approaches analyzed in the tables have different objectives arising from real-life problems. The major concern of the analysis is to evaluate their performance; for this evaluation, the objective considered is human face detection in video.

The recent available techniques have been studied and classified according to the type of approach used. The approaches are feature invariant, template based, and appearance based. The feature-invariant approaches [5-6] are based on structural features, whereas the template-based approach [7] is based on defined or trained templates. In contrast to template matching, appearance-based methods [8-12] learn the models (or templates) from a set of training images that capture the representative variability of facial appearance; these learned models are then used for detection.

The approaches are further divided into techniques, and a technique may belong to more than one approach; for example, skin tone is a colour feature, a subcategory of the feature-invariant approach, which is used in [7] along with a template. The classification is performed solely on the type of approach because a technique may be a combination of more than one approach.

The temporal difference method [5] would provide more accurate results in real time, but the background must be fixed. It also uses an integral image along with colour information, which helps achieve robustness without a fixed background. A detection rate above 95% [6] has been achieved on image databases using dynamic programming and SVM; being based on edge and colour information, like [7], it may give false results under changing lighting conditions. The challenges related to profile and frontal face detection have been addressed in [8] and tested in various conditions of appearance and disappearance with the NIST [20] video database. The training time is reduced in [9], tested on the YALE image database.

Most of the methods use an Adaboost-based approach, with variations in technique, to achieve real-time results. All the Adaboost-based methods perform well in terms of accuracy and speed, and most of this work has been tested on real-time video. The main challenge lies in selecting the best minimal set of features for training: the number of selected features is directly proportional to accuracy and inversely proportional to speed, and less training time is achieved when fewer features suffice. Haar features (square) or two-field features (circle) have been used, and optimum feature selection for training is a critical step toward better results. The next important aspect is the rejection of negative samples: a combination of weak classifiers building a strong classifier is used for this purpose, and a bootstrapping strategy that replaces rejected negative samples with new ones reduces processing time (a generic boosting sketch is given after the list below).

Adaboost has various advantages over other methods on video and in real time. It provides better results under the following factors:

• Selection of optimum features, which reduces training and processing time.
• Selection of a minimum number of classifiers with a bootstrapping strategy.
• Preprocessing such as filtering, resizing, and resolution reduction.
• A parallel classifier, which reduces detection time.
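The weak-to-strong classifier combination discussed above can be illustrated with a generic boosting sketch, assuming a recent scikit-learn: depth-one decision stumps over feature values act as weak classifiers and AdaBoost weights and combines them. The data here is random stand-in material, not Haar or quangle responses from a real detector.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

# Depth-one decision stumps over precomputed feature values act as weak
# classifiers; AdaBoost weights and combines them into a strong classifier.
# The feature matrix below is random stand-in data, not Haar or quangle
# responses from a real training set.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 50))        # 1000 windows, 50 feature values each
y = rng.integers(0, 2, size=1000)      # toy face / non-face labels

strong = AdaBoostClassifier(
    estimator=DecisionTreeClassifier(max_depth=1),  # the weak classifier
    n_estimators=50,                                # number of boosting rounds
).fit(X, y)

print("training accuracy:", strong.score(X, y))
```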
V. CONCLUSION AND FUTURE WORK

The study and analysis help to identify problems in current techniques so that they can be improved by eliminating them, and they clarify the use and impact of video databases for face detection. The proposed work provides a review and analysis of the recent available approaches for human face detection in video. The study leads towards the development of an improved approach or method that reduces the existing limitations of current systems. Categorization and classification have been performed on the available face detection techniques; to our knowledge, such a classification of face detection in video has not previously been available.

A meaningful, objective comparison of techniques would provide the research community with sufficient data for the exploration of automatic modeling techniques. The work can be extended to various objectives such as searching, browsing, and indexing. We will extend the study and analysis to other available and more recent relevant approaches with a suitable classification, and this work will be extended with scientific tools and data for evaluation and performance analysis.

ACKNOWLEDGMENT

The authors are very thankful to M. Pietikainen, University of Oulu, Finland, for making the Oulu face video sequences available on request. It is also our duty to acknowledge Honda/UCSD for making the database of face video sequences available to the research community.

REFERENCES

[1] M. Yang, D. J. Kriegman and N. Ahuja, "Detecting Faces in Images: A Survey", IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 24, No. 1, pp. 36-58, Jan 2002.
[2] X. Wang, Y. Tian, S. Liu, J. Li, C. Peng, "Face Detection and Tracking Algorithm in Video Images with Complex Background", Proc. 2010 IEEE International Conference on Robotics and Biomimetics, pp. 1206-1211, Dec 2010.
[3] Y. M. Mustafah, T. Shan, A. W. Azman, A. Bigdeli, B. C. Lovell, "Real-Time Face Detection and Tracking for High Resolution Smart Camera System", IEEE Computer Society International Conference on Digital Image Computing Techniques and Applications, pp. 387-393, 2007.
[4] R. Kasturi, D. Goldgof, P. Soundararajan, V. Manohar, J. Garofolo, R. Bowers, M. Boonstra, V. Korzhova, and J. Zhang, "Framework for Performance Evaluation of Face, Text, and Vehicle Detection and Tracking in Video: Data, Metrics, and Protocol", IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 31, No. 2, pp. 319-336, Feb 2009.
[5] W. Lushen, W. Peimin, M. Fanwen, "A Fast Face Detection for Video Sequences", Proc. Second IEEE International Conference on Intelligent Human-Machine Systems and Cybernetics, pp. 117-120, 2010.
[6] M. Türkan, B. Dülek, I. Onaran, and A. E. Çetin, "Human face detection in video using edge projections", Proc. 14th European Signal Processing Conference, Volume 6246, pp. 624-628, Sep 2006.
[7] Y. Hori and T. Kuroda, "A 0.79-mm2 29-mW Real-Time Face Detection Core", IEEE Journal of Solid-State Circuits, Vol. 42, No. 4, pp. 790-797, April 2007.
[8] R. Verma, C. Schmid, and K. Mikolajczyk, "Face Detection and Tracking in a Video by Propagating Detection Probabilities", IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 25, No. 10, pp. 1215-1228, Oct 2003.
[9] D. Nguyen, D. Halupka, P. Aarabi and A. Sheikholeslami, "Real-Time Face Detection and Lip Feature Extraction Using Field-Programmable Gate Arrays", IEEE Trans. Systems, Man, and Cybernetics - Part B: Cybernetics, Vol. 36, No. 4, pp. 902-912, Aug 2006.
[10] W. L. Cheong, C. M. Char, Y. C. Lim, S. Lim, S. W. Khor, "Building a Computation Savings Real-Time Face Detection and Recognition System", IEEE 2nd International Conference on Signal Processing Systems (ICSPS), pp. V1-815-819, 2010.
[11] L. Zhang and Y. Liang, "A Fast Method of Face Detection in Video Images", Proc. 2nd International Conference on Advanced Computer Control, pp. 490-494, 2010.
[12] C. Chen, W. Wong, and C. Chiu, "A 0.64 mm2 Real-Time Cascade Face Detection Design Based on Reduced Two-Field Extraction", IEEE Trans. Very Large Scale Integration (VLSI) Systems, Vol. 19, No. 11, pp. 1937-1948, Nov 2011.
[13] K. Kollreider, H. Fronthaler, M. I. Faraj, and J. Bigun, "Real-Time Face Detection and Motion Analysis With Application in "Liveness" Assessment", IEEE Trans. Information Forensics and Security, Vol. 2, No. 3, pp. 548-558, Sep 2007.
[14] K. Kollreider, H. Fronthaler, and J. Bigun, "Real-Time Face Detection Using Illumination Invariant Features", Proc. 15th Scandinavian Conference on Image Analysis, pp. 41-50, Springer-Verlag Berlin Heidelberg, 2007.
[15] E. Bailly-Bailliere, S. Bengio, F. Bimbot, M. Hamouz, J. Kittler, J. Mariethoz, J. Matas, K. Messer, F. Poree, B. Ruiz, "The BANCA database and evaluation protocol", In Proc. Int. Conf. on Audio- and Video-Based Biometric Person Authentication, 2003. BANCA database: available at http://www.ee.surrey.ac.uk/CVSSP/banca/
[16] K. C. Lee, J. Ho, M. H. Yang, and D. Kriegman, "Video-based face recognition using probabilistic appearance manifolds", In Proc. IEEE Int. Conf. on Computer Vision and Pattern Recognition, pp. 313-320, 2003. Honda/UCSD database: available on request at http://vision.ucsd.edu/~leekc/HondaUCSDVideoDatabase/HondaUCSD.html
[17] K. C. Lee, J. Ho, M. H. Yang, and D. Kriegman, "Visual Tracking and Recognition Using Probabilistic Appearance Manifolds", Computer Vision and Image Understanding, 2005.
[18] G. Zhao, M. Pietikainen, and A. Hadid, "Local spatiotemporal descriptors for visual recognition of spoken phrases", Proc. 2nd International Workshop on Human-Centered Multimedia (HCM2007), pp. 57-65, 2007. Oulu database: available on request at http://www.cse.oulu.fi/CMV/Downloads/OuluVS
[19] K. Messer, J. Matas, J. Kittler, J. Luettin, G. Maitre, "XM2VTSDB: The Extended M2VTS Database", In Second International Conference on Audio and Video-based Biometric Person Authentication, 1999. XM2VTSDB database: available at http://www.ee.surrey.ac.uk/CVSSP/xm2vtsdb
[20] Proc. TREC Video Retrieval Evaluation (TRECVID), 2012. http://www-nlpir.nist.gov/projects/trecvid/

APPENDIX A

TABLE II. VIDEO DATABASES FOR FACE DETECTION AND RECOGNITION

BANCA [15]
    Frame rate: 25 frames/sec; Color: yes; Resolution: 720x576
    Conditions: controlled, degraded and adverse
    Subjects: 208 (half men, half women)
    File format: AVI; Availability: paid
    Remarks: 12 different sessions spanning three months.

Honda UCSD [16-17]
    Frame rate: 15 frames/sec; Color: yes; Resolution: 640x480
    Conditions: indoor
    Subjects: 20 and 15 human subjects for the 1st and 2nd datasets respectively
    File format: AVI; Availability: on request, with conditions
    Remarks: Two datasets. The first, recorded with a SONY EVI-D30 camera, contains subsets of 20, 42 and 13 videos; the second, recorded with a SONY DFW-V500 camera, contains two subsets of 30 videos.

Oulu [18]
    Frame rate: 15 frames/sec; Color: yes; Resolution: 160x120
    Conditions: real indoor and outdoor illumination conditions
    Subjects: 12, 6 and 5 in three different datasets
    File format: AVI; Availability: on request, with conditions
    Remarks: Three sets: Alaris contains 12 .avi files along with a zip archive, Nogatech contains 6 .avi files along with a zip archive, and Sony contains 5 zip files.

XM2VTSDB [19]
    Frame rate: 25 frames/sec; Color: yes; Resolution: 720x576
    Conditions: subject rotating the head to center, left, right, top, down and center
    Subjects: 3 clips for each of the 295 subjects in each of the four sessions
    File format: MPEG 7; Availability: paid
    Remarks: The set contains all 295 subjects for all 4 sessions, a total of 1,180 video clips.