
Face Detection and Recognition
Zoltán VÁMOSSY

Budapest Tech, John von Neumann Faculty of Informatics

30/08/2006 IPCV 2006 Budapest 1


Face tasks in CV
 Face detection: Are there any faces in the image, and if so, where?
– concerned with a category of objects
 Face localization:
– Aim to determine the image position of a single face
– A simplified detection problem with the assumption that
an input image contains only one face
 Facial feature extraction:
– To detect the presence and location of features such as
eyes, nose, nostrils, eyebrow, mouth, lips, ears, etc
– Usually assume that there is only one face in an image
 Face recognition (identification): concerned with individual
identity
 Facial expression recognition
 Human pose estimation and tracking
30/08/2006 IPCV 2006 Budapest 2
Typical Biometric Authentication
Workflow
 Enrollment subsystem: biometric reader → feature extractor → template
(e.g., 1010010…) stored in the template database
 Authentication subsystem: biometric reader → feature extractor →
matcher compares the fresh template against the stored one → Match or No Match
30/08/2006 IPCV 2006 Budapest 3


Identification vs. Verification
 Identification (1:N): biometric reader → biometric matcher searches the
whole database → "This person is Emily Dawson"
 Verification (1:1): claimed identity ("I am Emily Dawson") and ID →
biometric reader → biometric matcher checks against the stored template →
Match / No Match
30/08/2006 IPCV 2006 Budapest 4
Face Recognition Applications

 Automatic border control (Sydney airport)
 Samsung’s Magicgate: door lock control
30/08/2006 IPCV 2006 Budapest 5
A Solved Problem?
 Excellent results and
products:
– Fast, multi-pose,
handles partial occlusion
 Is face recognition
and detection a solved
problem?
 Not quite

Omron’s face detector


30/08/2006 IPCV 2006 Budapest [Liu et al. 04] 6
Outline
 Relevant psychophysics/neuroscience issues
 Face Detection
– Detection on gray scale images
– Detection on color images
– Research directions
 Face Recognition
– Still image face recognition
 Face projects at Budapest Tech

30/08/2006 IPCV 2006 Budapest 7


Psychophysics Issues
 Is face perception the result of holistic/configural or local
feature analysis? Is it a dedicated process?
– Determines whether to build a special-purpose or a generic system
 Ranking of significance of facial features
– Different weights for features
 Movement and face recognition
– Favors video-based recognition
 Role of race/gender/familiarity
– Suggests an adaptive system

30/08/2006 IPCV 2006 Budapest 8


Inverted Faces

Prof. Eric H. Chudler Dept. of Anesthesiology University of Washington


30/08/2006 IPCV 2006 Budapest 9
Inverted Faces

Prof. Eric H. Chudler Dept. of Anesthesiology University of Washington


30/08/2006 IPCV 2006 Budapest 10
More Inverted Faces

Prof. Eric H. Chudler Dept. of Anesthesiology University of Washington


30/08/2006 IPCV 2006 Budapest 11
Negated Faces

30/08/2006 IPCV 2006 Budapest 12


Human Face Recognition Limitations

 Degraded images (Harmon and Julesz, 1973)
 Humans can handle low resolution [Sinha05]
30/08/2006 IPCV 2006 Budapest 13
Outline of Detection Part
 Objective
– Survey major face detection works
– Address “how” and “why” questions
– Pros and cons of detection methods
– Research directions

 This part of the presentation is based on the survey by M.-H. Yang:
http://vision.ai.uiuc.edu/mhyang/face-detection-survey.html

30/08/2006 IPCV 2006 Budapest 14


Face Detection
 Identify and locate
human faces in an
image regardless
of their
– position
– scale
– in-plane rotation
– orientation
– pose (out-of-plane rotation)
– and illumination
 Face is a highly non-rigid object
Where are the faces, if any?

30/08/2006 IPCV 2006 Budapest 15


Why Is Face Detection Important?
 First step for any fully automatic face
recognition system
 First step in many surveillance systems
 Lots of applications
 A step towards Automatic Target
Recognition (ATR) or generic object
detection/recognition

30/08/2006 IPCV 2006 Budapest 16


Why Is Face Detection Difficult?
 Pose (Out-of-Plane Rotation): frontal, 45
degree, profile, upside down
 Presence or absence of structural
components: beard, mustache and
glasses
 Facial expression: face appearance is
directly affected by a person's facial
expression
 Occlusion: faces may be partially occluded
by other objects
 Orientation (In-Plane Rotation): face
appearance varies directly with rotation
about the camera's optical axis
 Imaging conditions: lighting (spectra,
source distribution and intensity) and
camera characteristics (sensor response,
gain control, lenses), resolution

30/08/2006 IPCV 2006 Budapest 17


In One Thumbnail Face Image
 Consider a thumbnail 19 × 19 face pattern
 256^361 possible combinations of gray values
– 256^361 = 2^(8×361) = 2^2888
 Total world population is about 6,600,000,000 ≈ 2^32
 The number of possible patterns exceeds the world population by a
factor of roughly 2^2856
 Extremely high dimensional space!

30/08/2006 IPCV 2006 Budapest 18


What is a Correct Detection?

 Different interpretations of “correct detection”

 Precision of face location
 These affect the reported results: detection rate, false
positive rate, false negative rate
30/08/2006 IPCV 2006 Budapest 19
Receiver Operator Characteristic Curve
 Useful for detailed performance
assessment
 Plot true positive (TP) proportion against
the false positive (FP) proportion for
various possible settings
 False positive: Predict a face when there is
actually none
 False negative: Predict a non-face where
there is actually one
30/08/2006 IPCV 2006 Budapest 20
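
As an illustration of how such ROC points are obtained, the Python sketch below sweeps a decision threshold over detector scores; the scores, labels and thresholds are invented for the example and are not taken from any detector discussed in these slides.

import numpy as np

def roc_points(scores, labels, thresholds):
    """Compute (FP rate, TP rate) pairs for a set of decision thresholds.

    scores : detector confidence per candidate window (higher = more face-like)
    labels : 1 for a true face window, 0 for a non-face window
    """
    scores, labels = np.asarray(scores), np.asarray(labels)
    points = []
    for t in thresholds:
        predicted_face = scores >= t
        tp = np.sum(predicted_face & (labels == 1))   # faces correctly detected
        fp = np.sum(predicted_face & (labels == 0))   # non-faces reported as faces
        tpr = tp / max(np.sum(labels == 1), 1)        # true positive proportion
        fpr = fp / max(np.sum(labels == 0), 1)        # false positive proportion
        points.append((fpr, tpr))
    return points

# Hypothetical scores and ground-truth labels for a handful of windows
print(roc_points([0.9, 0.8, 0.4, 0.35, 0.1], [1, 1, 0, 1, 0],
                 thresholds=np.linspace(0, 1, 5)))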
Receiver Operator Characteristic (ROC)
 Correct
identification of
nearly all objects
will cause a large
percentage of
unknown objects to
be incorrectly
classified into some
known class

30/08/2006 IPCV 2006 Budapest 21


Face Detector: Ingredients
 Target application domain: single image, video
 Representation: holistic, feature-based, etc.
 Pre-processing: histogram equalization, etc.
 Cues: color, motion, depth, voice, etc.
 Search strategy: exhaustive, greedy, focus of
attention, etc.
 Classifier design: ensemble, cascade
 Post-processing: interpret detection results

30/08/2006 IPCV 2006 Budapest 22


In our Focus
 Focus on detecting
– upright, frontal
faces
– in a single gray-
scale image
– with decent
resolution
– under good lighting
conditions

30/08/2006 IPCV 2006 Budapest 23


Methods to Detect/Locate Faces
 Knowledge-based methods:
– Encode human knowledge of what constitutes a typical
face (usually, the relationships between facial features)
 Feature invariant approaches:
– Aim to find structural features of a face that exist even
when the pose, viewpoint, or lighting conditions vary
 Template matching methods:
– Several standard patterns stored to describe the face
as a whole or the facial features separately
 Appearance based methods:
– The models (or templates) are learned from a set of
training images which capture the representative
variability of facial appearance

30/08/2006 IPCV 2006 Budapest 24


Agenda (Detection)
 Detecting faces in gray scale images
– Knowledge-based
– Feature-based
– Template-based
– Appearance-based
 Detecting faces in color images
 Detecting faces in video
 Performance evaluation
 Research direction and concluding remarks

30/08/2006 IPCV 2006 Budapest 25


Knowledge-Based Methods
 Top-down approach: Represent a face using a
set of human-coded rules
 Examples:
– The center part of face has uniform intensity
values
– The difference between the average intensity
values of the center part and the upper part is
significant
– A face often appears with two eyes that are
symmetric to each other, a nose and a mouth
 Use these rules to guide the search process

30/08/2006 IPCV 2006 Budapest 26


Knowledge-Based Method: [Yang and Huang 94]
 Multi-resolution focus-of-
attention approach
 Level 1 (lowest
resolution): apply the rule
“the center part of the
face has 4 cells with a
basically uniform
intensity” to search for
candidates
 Level 2: local histogram
equalization followed by
edge detection
 Level 3: search for eye
and mouth features for
validation
30/08/2006 IPCV 2006 Budapest 27
Knowledge-Based Method:
[Kotropoulos & Pitas 94]

 Horizontal/vertical projection to search for candidates

 Search eyebrow/eyes, nostrils/nose for validation


 Difficult to detect multiple faces or faces in a complex
background

30/08/2006 IPCV 2006 Budapest 28


Knowledge-based Methods: Summary
 Pros:
– Easy to find simple rules to describe the features of a face
and their relationships
– Based on the coded rules, facial features in an input image
are extracted first and face candidates are identified
– Work well for face localization in uncluttered background
 Cons:
– Difficult to translate human knowledge into rules precisely:
detailed rules fail to detect faces and general rules may
find many false positives
– Difficult to extend this approach to detect faces in different
poses: implausible to enumerate all the possible cases

30/08/2006 IPCV 2006 Budapest 29


Agenda (Detection)
 Detecting faces in gray scale images
– Knowledge-based
– Feature-based
– Template-based
– Appearance-based
 Detecting faces in color images
 Detecting faces in video
 Performance evaluation
 Research direction and concluding remarks

30/08/2006 IPCV 2006 Budapest 30


Feature-based Methods
 Bottom-up approach: Detect facial features
(eyes, nose, mouth, etc) first
– How can we detect facial features: edge,
intensity, shape, texture, color, etc
– Aim to detect invariant features
 Group features into candidates and verify
them

30/08/2006 IPCV 2006 Budapest 31


Feature-based Example [Yang 2003]
 Use the Canny operator to find an edge
map
 Separate face from background by
grouping edges
 Best fit an ellipse to the remaining edges

30/08/2006 IPCV 2006 Budapest 32


Feature Grouping [Yow and Cipolla 90]
 Apply a 2nd derivative Gaussian filter to
search for interest points
 Group the edges near interest points into
regions
 Each feature and grouping is evaluated
within a Bayesian network
 Handle a few poses

30/08/2006 IPCV 2006 Budapest 33


Feature Grouping [Yow and Cipolla 90]
 Face model and components
 Model each facial feature as a pair of edges
 Apply an interest point operator and an edge detector to search for
features
 Use a Bayesian network to combine evidence

30/08/2006 IPCV 2006 Budapest 34


Random Graph Matching [Leung et al. 95]
 Formulate as a problem to find the
correct geometric arrangement of
facial features
– Facial features are defined by
the average responses of multi-
orientation, multi-scale
Gaussian derivative filters
– Learn the configuration of
features with Gaussian
distribution of mutual distance
between facial features
 Random graph matching among
the candidates to locate faces
30/08/2006 IPCV 2006 Budapest 35
Feature-Based Methods: Summary
 Pros:
– Features are invariant to pose and orientation
change
 Cons:
– Difficult to locate facial features due to corruption
(illumination, noise, occlusion)
– Difficult to detect features in complex
background

30/08/2006 IPCV 2006 Budapest 36


Agenda (Detection)
 Detecting faces in gray scale images
– Knowledge-based
– Feature-based
– Template-based
– Appearance-based
 Detecting faces in color images
 Detecting faces in video
 Performance evaluation
 Research direction and concluding remarks

30/08/2006 IPCV 2006 Budapest 37


Template Matching Methods
 Store a template
– Predefined: based on edges or regions
– Deformable: based on facial contours
(e.g., Snakes)
 Templates are hand-coded (not learned)
 Use correlation to locate faces

30/08/2006 IPCV 2006 Budapest 38


Face Template
 Use relative pair-wise
ratios of the brightness of
facial regions (14 × 16
pixels): the eyes are
usually darker than the
surrounding face [Sinha 94]
 Use average area
intensity values rather than
absolute pixel values
 See also Point Distribution
Model (PDM) [Lanitis et al. 95]

30/08/2006 IPCV 2006 Budapest 39


Template-Based Methods: Summary
 Pros:
– Relatively simple
 Cons:
– Templates need to be initialized near the
face images
– Difficult to enumerate templates for
different poses (similar to knowledge-
based methods)
30/08/2006 IPCV 2006 Budapest 40
Agenda (Detection)
 Detecting faces in gray scale images
– Knowledge-based
– Feature-based
– Template-based
– Appearance-based
 Detecting faces in color images
 Detecting faces in video
 Performance evaluation
 Research direction and concluding remarks

30/08/2006 IPCV 2006 Budapest 41


Appearance-Based Methods
Idea: train a classifier using positive (and
usually negative) examples of faces
 Representation
 Pre-processing
 Train a classifier
 Search strategy
 Post-processing

30/08/2006 IPCV 2006 Budapest 42


Appearance-Based Methods: Classifiers
 Neural network: Multilayer Perceptrons
 Principal Component Analysis (PCA), Factor Analysis
 Support vector machine (SVM)
 Mixture of PCA, Mixture of factor analyzers
 Distribution-based method
 Naive Bayes classifier
 Hidden Markov model
 Sparse network of winnows (SNoW)
 Kullback relative information
 Inductive learning: C4.5
 AdaBoost (Jones and Viola)
 …

30/08/2006 IPCV 2006 Budapest 43


Representation
 Holistic: Each image is raster scanned and
represented by a vector of intensity values
 Block-based: Decompose each face image into a
set of overlapping or non-overlapping blocks
– At multiple scales
– Further processed with vector quantization,
Principal Component Analysis, etc.

30/08/2006 IPCV 2006 Budapest 44


Face and Non-Face Examples
 Positive examples:
– Get as much variation as possible
– Manually crop and normalize each face image into a
standard size face (e.g., 19 × 19 pixels)
– Creating virtual examples [Sung and Poggio 94]

 Negative examples:
– Fuzzy idea
– Any images that
do not contain faces
– A large image subspace

30/08/2006 IPCV 2006 Budapest 45


Distribution-Based Method [Sung & Poggio 94]
 Masking: reduce the
unwanted noise in a face
pattern
 Illumination gradient
correction: fit the best
brightness plane and subtract
it to reduce heavy shadows
caused by extreme lighting
angles
 Histogram equalization:
compensates for imaging
effects due to changes in
illumination and different
camera input gains

30/08/2006 IPCV 2006 Budapest 46
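
A minimal Python sketch of the last two correction steps above, assuming a small grayscale face window stored as a NumPy array; the least-squares plane fit and the simple histogram mapping are one possible implementation, not necessarily the one used by Sung and Poggio.

import numpy as np

def illumination_gradient_correction(window):
    """Fit a brightness plane a*x + b*y + c to the window and subtract it."""
    h, w = window.shape
    ys, xs = np.mgrid[0:h, 0:w]
    A = np.column_stack([xs.ravel(), ys.ravel(), np.ones(h * w)])
    coeffs, *_ = np.linalg.lstsq(A, window.ravel().astype(float), rcond=None)
    plane = (A @ coeffs).reshape(h, w)
    return window - plane            # heavy shadows / lighting gradients largely removed

def histogram_equalization(window, levels=256):
    """Map gray values so their cumulative distribution becomes roughly uniform."""
    vals = np.clip(window - window.min(), 0, None).astype(int)
    hist = np.bincount(vals.ravel(), minlength=vals.max() + 1)
    cdf = np.cumsum(hist) / vals.size
    return (cdf[vals] * (levels - 1)).astype(np.uint8)

# Hypothetical 19x19 face window with a left-to-right lighting gradient
window = np.tile(np.linspace(50, 200, 19), (19, 1))
corrected = histogram_equalization(illumination_gradient_correction(window))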


Creating Virtual Positive Examples
 Simple and very
effective method
 Randomly mirror,
rotate, translate
and scale face
samples by small
amounts
 Less sensitive to
alignment error
30/08/2006 IPCV 2006 Budapest 47
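
A sketch of this augmentation step in Python (scipy assumed available); the mirror/rotation/translation/scaling ranges are illustrative values, not the ones used in the original work.

import numpy as np
from scipy import ndimage

def virtual_examples(face, n=10, rng=np.random.default_rng(0)):
    """Create virtual positive examples by small random perturbations."""
    out = []
    for _ in range(n):
        img = face[:, ::-1] if rng.random() < 0.5 else face              # random mirror
        img = ndimage.rotate(img, rng.uniform(-10, 10), reshape=False)   # small rotation
        img = ndimage.shift(img, rng.uniform(-1, 1, size=2))             # sub-pixel translation
        img = ndimage.zoom(img, rng.uniform(0.95, 1.05))                 # small scaling
        out.append(img)          # in practice, crop/resize back to the canonical size
    return out

# Hypothetical 19x19 face patch
samples = virtual_examples(np.random.rand(19, 19).astype(np.float32))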
Search over Space and Scale

 Scan the input image at one-pixel increments horizontally and vertically
 Downsample the input image by a factor of 1.2 and continue to search
30/08/2006 IPCV 2006 Budapest 48
Continue to Search over Space and Scale

 Continue to downsample the input image


and search until the image size is too small
30/08/2006 IPCV 2006 Budapest 49
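
A Python sketch of this exhaustive search over space and scale; classify_window is a placeholder for whatever face/non-face classifier is plugged in and is an assumption of the sketch, not part of the original slides.

import numpy as np
from scipy import ndimage

def search_space_and_scale(image, classify_window, win=19, scale_step=1.2):
    """Slide a win x win window over an image pyramid and collect detections."""
    detections, scale = [], 1.0
    while min(image.shape) >= win:
        for y in range(image.shape[0] - win + 1):          # one-pixel vertical steps
            for x in range(image.shape[1] - win + 1):      # one-pixel horizontal steps
                if classify_window(image[y:y + win, x:x + win]):
                    # report the window in original-image coordinates
                    detections.append((int(x * scale), int(y * scale), int(win * scale)))
        image = ndimage.zoom(image, 1 / scale_step)         # downsample by 1.2
        scale *= scale_step
    return detections

# Toy usage with a hypothetical classifier that never fires
dets = search_space_and_scale(np.zeros((60, 80)), lambda w: False)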
Detecting Faces in Different Pose
 Probabilistic Modeling of
Local Appearance
 Extend to detect faces in
different pose with
multiple detectors
 Each detector specializes
to a view: frontal, left pose
and right pose

[Schneiderman and Kanade 98]

30/08/2006 IPCV 2006 Budapest 50


Viola & Jones: Robust Real-time Face Detection

Viola and Jones


 Goals
 Motivation
 Features (rectangle features & integral
image)
 Learning classifier functions (AdaBoost)
 Cascade of classifiers
 Results

30/08/2006 IPCV 2006 Budapest 51


Viola & Jones: Robust Real-time Face Detection

Goals of Viola and Jones


 Robust real-time face detection in images
 Decide for each sub-window (starting at
24x24) of the image whether it is a face

30/08/2006 IPCV 2006 Budapest 52


Viola & Jones: Robust Real-time Face Detection

Motivation
 Rapid face detection in images and image
sequences
 Previous approaches too slow
– They use auxiliary information
(image differences, colors)
 Only grey scale images needed

30/08/2006 IPCV 2006 Budapest 53


Viola & Jones: Robust Real-time Face Detection

The Framework of the Method


 Haar-like Feature: A feature representation
 Integral Image: A new image representation
and a fast Haar-like feature calculation
method
 Perceptron: Classifier, uses the features
 Boost Algorithm: An efficient classifier
generating algorithm
 Cascade: Method for combining classifiers

30/08/2006 IPCV 2006 Budapest 54


Viola & Jones: Robust Real-time Face Detection

Features
 Detection based on features, not pixels

 Why?
– Features can encode domain knowledge
that is difficult to learn
– Faster than a pixel-based system
– Scale independent

30/08/2006 IPCV 2006 Budapest 55


Viola & Jones: Robust Real-time Face Detection

Rectangle Features

 (Sum of pixels in black rectangles) - (sum of


pixels in white rectangles)
 Features are very simple ⇒ rapid
computation
 How to compute them rapidly?
30/08/2006 IPCV 2006 Budapest 56
Viola & Jones: Robust Real-time Face Detection

Integral Image
 Rectangle Features
can be computed
rapidly using an
intermediate
representation of the
image: integral image
 Value at point (x,y) is
the sum of all the pixel
values above and to
the left
30/08/2006 IPCV 2006 Budapest 57
Viola & Jones: Robust Real-time Face Detection

Integral Image
 The integral image can be computed with
the following recurrences:
s(x, y) = s(x, y − 1) + i(x, y)
ii(x, y) = ii(x − 1, y) + s(x, y)
 where s(x, y) is the cumulative column sum,
with s(x, −1) = 0 and ii(−1, y) = 0
 One pass over the image

30/08/2006 IPCV 2006 Budapest 58


Viola & Jones: Robust Real-time Face Detection

Computation of Rectangular Sums

Sum of rectangle D: ii(4) – ii(3) – ii(2) + ii(1)


30/08/2006 IPCV 2006 Budapest 59
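
A NumPy sketch of the integral image and the four-reference rectangle sum described above; the zero padding, the example two-rectangle feature and the toy 24x24 values are choices of the sketch.

import numpy as np

def integral_image(img):
    """ii[y, x] = sum of img[:y, :x]; padded with a zero row/column so that
    corner look-ups need no boundary checks."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = np.cumsum(np.cumsum(img, axis=0), axis=1)   # one pass, two running sums
    return ii

def rect_sum(ii, y, x, h, w):
    """Sum of pixels in the h x w rectangle with top-left corner (y, x):
    four array references, independent of the rectangle size."""
    return ii[y + h, x + w] - ii[y, x + w] - ii[y + h, x] + ii[y, x]

def two_rect_feature(ii, y, x, h, w):
    """Example two-rectangle (vertical edge) feature: left half minus right half."""
    return rect_sum(ii, y, x, h, w // 2) - rect_sum(ii, y, x + w // 2, h, w // 2)

img = np.arange(24 * 24).reshape(24, 24)          # toy 24x24 sub-window
ii = integral_image(img)
print(rect_sum(ii, 2, 3, 5, 4), two_rect_feature(ii, 0, 0, 24, 24))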
Viola & Jones: Robust Real-time Face Detection

Integral Image
 Rectangular sum can be computed in four
array references
 Two-rectangle features need six array
references
 Three-rectangle features need eight array
references
 Four-rectangle features need nine array
references
30/08/2006 IPCV 2006 Budapest 60
Viola & Jones: Robust Real-time Face Detection

Feature properties
 Rough, but sensitive to simple image
structures
 Only 2 orientations
 Extremely efficient
 Efficiency compensates for these disadvantages
 Face detection for an entire image at every
scale in real-time

30/08/2006 IPCV 2006 Budapest 61


Viola & Jones: Robust Real-time Face Detection

Learning Classification Functions


 Given: feature set and training set of positive +
negative face images
– In principle any machine learning approach
can be used (SVM, neural network, …)
– But: 45 396 rectangle features for each sub-
window
 Experiments showed that only a few features
needed for an efficient classifier
 How to find them?

30/08/2006 IPCV 2006 Budapest 62


Viola & Jones: Robust Real-time Face Detection

AdaBoost-Based Detector [Viola and Jones 01]


 Main idea:
– Feature selection: select important features
– Focus of attention: focus on potential regions
– Use an integral image for fast feature evaluation
 Use AdaBoost to learn
– A small set of important features (feature selection)
 sort them in the order of importance
 each feature can be used as a simple (weak) classifier
– A cascade of classifiers that
 combine all the weak classifiers to do a difficult task
 filter out the regions that most likely do not contain
faces

30/08/2006 IPCV 2006 Budapest 63


Viola & Jones: Robust Real-time Face Detection

Feature Selection [Viola and Jones 01]


 Training: If x is a face, then x
– most likely has feature 1 (easiest feature, and of greatest importance)
– very likely has feature 2 (easy feature)
– …
– likely has feature n (more complex feature, and of less importance
since it does not exist in all the faces in the training set)
 Testing: Given a test sub-image x’
– if x’ has feature 1:
 test whether x’ has feature 2
– if so, test whether x’ has feature 3, …, down to feature n
– else, it is not a face
 else, it is not a face
 Similar to a decision tree

30/08/2006 IPCV 2006 Budapest 64


Viola & Jones: Robust Real-time Face Detection

AdaBoost
 Variant of AdaBoost is used both to select
features and train the classifier
 AdaBoost is used to boost classification
performance of a simple (weak) learning
algorithm
 It combines a collection of weak classifiers to form
a stronger one
 The weak learning algorithm is designed to select
the single rectangle feature which best separates
the positive and negative examples

30/08/2006 IPCV 2006 Budapest 65


Viola & Jones: Robust Real-time Face Detection

AdaBoost
 Searches over the set of possible weak classifiers
and selects those with the lowest classification
error
 Greedy feature selection process
 Efficient for selecting a small number of “good”
features (fj(x)) with significant variety
 Weak classifier: hj(x) = 1 if pj fj(x) < pj θj, else 0
(fj: feature value, θj: threshold, pj: parity)

 Error rates are between 0.1 and 0.3 in early rounds, and
between 0.4 and 0.5 in later rounds

30/08/2006 IPCV 2006 Budapest 66


Viola & Jones: Robust Real-time Face Detection

AdaBoost in Detail
 Given example images (x1, y1), …, (xn, yn), where yi = 0, 1 for
negative and positive examples
 Initialize weights w1,i = 1/(2m) for yi = 0 and w1,i = 1/(2l) for yi = 1,
where m and l are the numbers of negatives and positives
 For t = 1,…,T:
1. Normalize the weights: wt,i ← wt,i / Σj wt,j
2. For each feature j train a classifier hj which is restricted to
using a single feature; its error is εj = Σi wt,i |hj(xi) − yi|
3. Choose the classifier ht with the lowest error εt
4. Update weights: wt+1,i = wt,i βt^(1−ei), with βt = εt / (1 − εt),
where ei = 0 if xi is correctly classified, else ei = 1
 Final strong classifier: h(x) = 1 if Σt αt ht(x) ≥ ½ Σt αt, else 0,
where αt = log(1/βt)

30/08/2006 IPCV 2006 Budapest 67
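
A compact Python sketch of this boosting loop using one-feature (stump) weak classifiers over a precomputed feature matrix; it follows the update rules above, but the tiny feature matrix and labels are invented for illustration.

import numpy as np

def train_adaboost(F, y, T):
    """F: (n_samples, n_features) feature values; y: 0/1 labels; T: rounds.
    Returns a list of (feature index, threshold, parity, alpha)."""
    m, l = np.sum(y == 0), np.sum(y == 1)
    w = np.where(y == 0, 1.0 / (2 * m), 1.0 / (2 * l))       # initial weights
    strong = []
    for _ in range(T):
        w = w / w.sum()                                      # 1. normalize
        best = (None, None, None, np.inf)                    # (j, theta, parity, error)
        for j in range(F.shape[1]):                          # 2. one stump per feature
            for theta in np.unique(F[:, j]):
                for parity in (+1, -1):
                    h = (parity * F[:, j] < parity * theta).astype(int)
                    err = np.sum(w * np.abs(h - y))
                    if err < best[3]:
                        best = (j, theta, parity, err)
        j, theta, parity, err = best                         # 3. lowest-error classifier
        beta = max(err, 1e-10) / max(1 - err, 1e-10)
        h = (parity * F[:, j] < parity * theta).astype(int)
        w = w * beta ** (1 - np.abs(h - y))                  # 4. reweight examples
        strong.append((j, theta, parity, np.log(1.0 / beta)))
    return strong

def strong_classify(strong, f):
    """Final classifier: weighted vote of the selected weak classifiers."""
    s = sum(a * (p * f[j] < p * t) for j, t, p, a in strong)
    return int(s >= 0.5 * sum(a for *_, a in strong))

# Invented toy data: 6 sub-windows, 3 rectangle-feature values each
F = np.array([[5, 1, 2], [6, 1, 3], [7, 0, 2], [1, 8, 9], [0, 7, 8], [2, 9, 7]])
y = np.array([1, 1, 1, 0, 0, 0])
model = train_adaboost(F, y, T=3)
print([strong_classify(model, f) for f in F])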


Viola & Jones: Robust Real-time Face Detection

Feature Selection by AdaBoost

 First and second features selected by AdaBoost
 The two-feature classifier costs about 60 microprocessor
instructions

30/08/2006 IPCV 2006 Budapest 68


Viola & Jones: Robust Real-time Face Detection

AdaBoost – Learning Results


PIII 700 MHz, 2001
 200-feature-classifier
 Detection rate: 95%
 False positive rate: 1/14084
 Takes 0.7 seconds to scan a 384x288 image
 Adding features to the classifier increases
computation time

30/08/2006 IPCV 2006 Budapest 69


Viola & Jones: Robust Real-time Face Detection

Cascade of Classifiers
 0.7 s for one image ⇒ too slow
 Idea: a cascade of classifiers to reduce
computation time
 Simple classifiers reject the majority of negative
sub-windows (many sub-windows contain no faces)
[Cascade diagram] All sub-windows → classifier 1 → classifier 2 →
classifier 3 → … → further processing; a sub-window rejected (F) at any
stage is discarded, only sub-windows passing (T) every stage continue
30/08/2006 IPCV 2006 Budapest 70
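
A Python sketch of how such a cascade is evaluated on one sub-window: each stage is a boosted classifier with its own threshold, and a window must pass every stage to be reported; the two stage functions and thresholds here are placeholders, not trained values.

def cascade_classify(window, stages):
    """stages: list of (score_fn, threshold). Reject as soon as one stage fails,
    so most negative sub-windows cost only the first, cheapest stages."""
    for score_fn, threshold in stages:
        if score_fn(window) < threshold:
            return False            # rejected sub-window, no further processing
    return True                     # passed every stage: report as a face

# Hypothetical two-stage cascade: a cheap 2-feature stage, then a larger stage
stages = [
    (lambda w: sum(w[:2]), 0.5),    # stand-ins for boosted stage scores
    (lambda w: sum(w), 1.5),
]
print(cascade_classify([0.4, 0.3, 0.9], stages))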
Viola & Jones: Robust Real-time Face Detection

Cascade of Classifiers - Experiment


 Monolithic 200-feature classifier vs. a cascade of
ten 20-feature classifiers
 Trained with 5000 faces and 10000 non-face
sub-windows

30/08/2006 IPCV 2006 Budapest 71


Viola & Jones: Robust Real-time Face Detection

Results - Training
 Structure of the Cascade:
– 32 layers with 4297 features
– First layer uses two features, 60% non-faces
rejected, close to 100% faces detected
– Second layer uses five features, rejects 80%
non-faces
– 3rd 20-feature-classifier
– 4th 50-feature-classifier
– 5th 100-feature-classifier
– 20th 200-feature-classifier

30/08/2006 IPCV 2006 Budapest 72


Viola & Jones: Robust Real-time Face Detection

Results - Speed
 Speed of the detector depends on the
number of features evaluated
 On the MIT+CMU test set, an average of 8
features (out of 4297) is evaluated per sub-window; on a
700 MHz Pentium III, a 384x288 image takes 0.067 s (2001)

30/08/2006 IPCV 2006 Budapest 73
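
For context, detectors of this kind ship with OpenCV; the snippet below is a usage sketch only, with a hypothetical input file name and commonly used default parameters rather than values from these slides.

import cv2

# Pretrained Viola-Jones style cascade bundled with OpenCV
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

img = cv2.imread("group_photo.jpg")            # hypothetical input image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
faces = cascade.detectMultiScale(gray, scaleFactor=1.2, minNeighbors=3,
                                 minSize=(24, 24))
for (x, y, w, h) in faces:
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
cv2.imwrite("detections.jpg", img)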


Viola & Jones: Robust Real-time Face Detection

Outlook
 Viola & Jones improved their detector to detect not
only frontal faces, but also non-upright and non-
frontal faces
 Additional rectangle features

 Pose estimator
 Classifier for each rotation class (12)

30/08/2006 IPCV 2006 Budapest 74


Haar-like Features Extension
 45º rotated rectangle
feature by Lienhart
 Feature is represented by
(type, x, y, w, h): type
[1(a), 2(b), etc.]; (x, y) for
location in detection
window and (w, h) for
feature size
 Number of features: for a
24x24 detection window
there are 117,941 different
features, a large number
30/08/2006 IPCV 2006 Budapest 75
Integral Image, Fast Feature
Calculation
 Corresponding to the original (upright) features and the 45º rotated
features, there are two types of integral image, called by Lienhart the
Summed Area Table SAT(x,y) and the Rotated Summed Area Table RSAT(x,y)

30/08/2006 IPCV 2006 Budapest 76


Integral Image, Fast Feature
Calculation
 SAT and RSAT calculation: recurrences analogous to the upright
integral image
 Sum of a rectangle: a few corner look-ups into SAT (upright features)
or RSAT (rotated features)
 Feature value: calculated as the weighted
sum of the rectangles
30/08/2006 IPCV 2006 Budapest 77
Appearance-Based Methods: Summary
 Pros:
– Use powerful machine learning algorithms
– Has demonstrated good empirical results
– Fast and fairly robust
– Extended to detect faces in different pose and
orientation
 Cons
– Usually needs to search over space and scale
– Need lots of positive and negative examples
– Limited view-based approach

30/08/2006 IPCV 2006 Budapest 78


Agenda (Detection)
 Detecting faces in gray scale images
– Knowledge-based
– Feature-based
– Template-based
– Appearance-based
 Detecting faces in color images
 Detecting faces in video
 Performance evaluation
 Research direction and concluding remarks

30/08/2006 IPCV 2006 Budapest 79


Color-Based Face Detector
 Distribution of skin color across different ethnic
groups
– Under controlled illumination conditions:
compact
– Arbitrary conditions: less compact
 Color space
– RGB, normalized RGB, HSV, HSI, YCrCb,
YIQ, UES, CIE XYZ, CIE LUV, …
 Statistical analysis
– Histogram, look-up table, Gaussian model,
mixture model, …
30/08/2006 IPCV 2006 Budapest 80
Methods of Skin Detection
 Pixel-Based Methods
– Classify each pixel as skin or non-skin individually,
independently from its neighbors
– Simpler, faster
– Color Based Methods fall in this category
 Region Based Methods
– Try to take the spatial arrangement of skin pixels into
account during the detection stage to enhance the
methods performance
– More computationally expensive
– Additional knowledge in terms of texture etc. is
required

30/08/2006 IPCV 2006 Budapest 81


Skin Color Based Methods - Advantages
 Allows fast processing
 Robust to geometric variations of the skin
patterns
 Robust under partial occlusion
 Robust to resolution changes
 Experience suggests that human skin has a
characteristic color, which is easily recognized
by humans

30/08/2006 IPCV 2006 Budapest 82


Skin Modelling
 Explicitly defined skin region
– Using simple rules
 Non-parametric skin distribution modeling -
Estimate skin color distribution from the training
data without deriving an explicit model of the skin
– Look up table or Histogram Model
– Bayes Classifier
 Parametric skin distribution modeling - Deriving a
parametric model from the training set
30/08/2006 IPCV 2006 Budapest 83
Explicit Rule
 To define explicitly (through a number of rules) the boundaries of the
skin cluster in some color space (RGB, rgb, HSI, YUV, etc.)
 E.g., in RGB space, (R’,G’,B’) is classified as skin if [Peer 2003]:
 R’ > 95 and G’ > 40 and B’ > 20,
 max{R’,G’,B’} – min{R’,G’,B’} > 15, and
 |R’-G’| > 15 and R’ > G’ and R’ > B’

 The simplicity of this method is attractive
 The simplicity of the rules leads to a very rapid classifier
 The main difficulty in achieving high recognition rates is the need to find
– a good color space
– adequate decision rules

30/08/2006 IPCV 2006 Budapest 84
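
A direct vectorized Python sketch of the rule quoted above, assuming an 8-bit image in R, G, B channel order.

import numpy as np

def skin_mask_peer(rgb):
    """Explicit-rule skin classifier (Peer 2003 rule quoted above).
    rgb: H x W x 3 uint8 array in R, G, B order. Returns a boolean mask."""
    r = rgb[..., 0].astype(int)
    g = rgb[..., 1].astype(int)
    b = rgb[..., 2].astype(int)
    return ((r > 95) & (g > 40) & (b > 20)
            & (rgb.max(axis=-1).astype(int) - rgb.min(axis=-1).astype(int) > 15)
            & (np.abs(r - g) > 15) & (r > g) & (r > b))

# Toy image: one skin-like pixel and one blue pixel
img = np.array([[[200, 120, 90], [30, 40, 200]]], dtype=np.uint8)
print(skin_mask_peer(img))     # [[ True False]]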


Example: YCrCb Space [Yoo and Kim]
 Skin-colors form a cluster in YCrCb color space
 We can approximate the skin color region with
several lines or with an ellipse

30/08/2006 IPCV 2006 Budapest 85


Color Histogram
 If
– h(rk, gk, bk)=nk,
 Then
– p(rk, gk, bk) = nk / N
– p(r’k, g’k, b’k) = (N-nk) / N

 That is:
– If there are nk skin color (rk, gk, bk) pixels in the image,
the probability of any pixel which is a skin pixel is nk/N
– and the probability of any pixel which is NOT a skin pixel
is (N-nk)/N

30/08/2006 IPCV 2006 Budapest 86


Bayes Probability - Recall
 P(A,B) = P(A|B) x P(B)
 P(A,B) = P(B|A) x P(A)
 If P(B) + P(~B) = 1, and P(A) + P(~A) = 1, then
– P(A) = P(A|B) x P(B) + P(A|~B) x P(~B)
– P(B) = P(B|A) x P(A) + P(B|~A) x P(~A)

 P(A|B) x P(B) = P(B|A) x P(A)

P(B|A) x P(A)
P(A|B) = -------------------
P(B)
P(B|A) x P(A)
P(A|B) = -----------------------------------------------
P(B|A) x P(A) + P(B|~A) x P(~A)

30/08/2006 IPCV 2006 Budapest 87


Bayes Classifier (1)
 p(r,g,b|skin) is a conditional probability - a
probability of observing color (red, or green, or
blue), knowing that we see a skin pixel
 A more appropriate measure for skin detection
would be p(skin|r,g,b) - a probability of observing
skin, given a concrete (r,g,b) color value
 To compute this probability, the Bayes rule is
used:

30/08/2006 IPCV 2006 Budapest 88


Bayes Classifier (2)

 p(r,g,b|skin) and p(r,g,b|non-skin) can be


computed from skin and non-skin color histograms
 The prior probabilities p(skin) and p(non-skin) can
also be estimated from the overall number of skin
and non-skin samples in the training set
 An inequality
p(skin|r,g,b) >= Q,
where Q is a threshold value, can be used as a
skin detection rule

30/08/2006 IPCV 2006 Budapest 89


Bayes Classifier (3)
 Consider normalised RGB space
 Histograms approximate probability densities
– p(skin) can be estimated as the ratio of the number of skin
pixels Nskin to the total number of pixels Ntotal
– p(non-skin) can be estimated as the ratio of the number of
non-skin pixels Nnon-skin to Ntotal
– p((r,g,b)|skin): the probability density of (r,g,b) given
skin
– p((r,g,b)|non-skin): the probability density of (r,g,b)
given non-skin

30/08/2006 IPCV 2006 Budapest 90


Bayes Classifier (4)
 Then we can compute for any color (r,g,b)
the probability p(skin|(r,g,b)) → Probability
Map of the image
 Skin Probability Map (SPM)
 That is:
p(skin|(r,g,b)) = p((r,g,b)|skin) p(skin) /
[ p((r,g,b)|skin) p(skin) + p((r,g,b)|non-skin) p(non-skin) ]
– a ratio of histogram-based estimates

30/08/2006 IPCV 2006 Budapest 91
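
A Python sketch of the histogram-ratio skin probability map under these definitions; the coarse 32-bin-per-channel RGB histograms and the toy training pixels are choices of the sketch, not values from the slides.

import numpy as np

BINS = 32                      # coarse bins per channel (sketch choice)

def color_histogram(pixels):
    """3D RGB histogram over N x 3 uint8 pixels, with BINS bins per channel."""
    idx = (pixels // (256 // BINS)).astype(int)
    hist = np.zeros((BINS, BINS, BINS))
    np.add.at(hist, (idx[:, 0], idx[:, 1], idx[:, 2]), 1)
    return hist

def skin_probability_map(image, skin_pixels, nonskin_pixels):
    """p(skin | r,g,b) for every pixel, via Bayes rule on histogram estimates."""
    h_skin, h_non = color_histogram(skin_pixels), color_histogram(nonskin_pixels)
    p_rgb_skin = h_skin / max(h_skin.sum(), 1)          # p((r,g,b) | skin)
    p_rgb_non = h_non / max(h_non.sum(), 1)             # p((r,g,b) | non-skin)
    p_skin = skin_pixels.shape[0] / (skin_pixels.shape[0] + nonskin_pixels.shape[0])
    idx = (image // (256 // BINS)).astype(int)
    num = p_rgb_skin[idx[..., 0], idx[..., 1], idx[..., 2]] * p_skin
    den = num + p_rgb_non[idx[..., 0], idx[..., 1], idx[..., 2]] * (1 - p_skin)
    return np.where(den > 0, num / np.maximum(den, 1e-12), 0.0)

# Toy training pixels and a 2x2 test image (all values invented)
skin = np.array([[200, 120, 90]] * 50, dtype=np.uint8)
non = np.array([[30, 40, 200]] * 150, dtype=np.uint8)
img = np.array([[[200, 120, 90], [30, 40, 200]],
                [[199, 119, 91], [32, 41, 198]]], dtype=np.uint8)
spm = skin_probability_map(img, skin, non)
mask = spm >= 0.5                                        # threshold Q = 0.5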


Bayes Classifier from [Rehg & Jones]
 Skin photos:

 Non skin photos:

30/08/2006 IPCV 2006 Budapest 92


Results from [Rehg & Jones]
 Used 18 696 images to build a general color model
 Density is concentrated around the gray line and is more
sharply peaked at white than black
 Most colors fall on or near the gray line
 Black and white are by far the most frequent colors, with
white occurring slightly more frequently
 There is a marked skew in the distribution toward the red
corner of the color cube
 77% of the possible 24 bit RGB colors are never
encountered (i.e. the histogram is mostly empty)
 52% of web images have people in them

30/08/2006 IPCV 2006 Budapest 93


Marginal Distributions [Rehg & Jones]

30/08/2006 IPCV 2006 Budapest 94


Skin model [Rehg & Jones]

30/08/2006 IPCV 2006 Budapest 95


Non Skin Model [Rehg & Jones]

30/08/2006 IPCV 2006 Budapest 96


Skin and Non-Skin Color Model
 Analyze 1 billion labeled pixels
 Skin and non-skin models
 A significant degree of separation between
the skin and non-skin models
 Achieves 80% detection rate with 8.5%
false positives
 Histogram method outperforms Gaussian
mixture model
30/08/2006 IPCV 2006 Budapest 97
Experimental Results [Jones and Rehg 02]
 May have lots of
false positives in
the raw results
 Need further
processing to
eliminate false
positives and
group color pixels
for face detection

30/08/2006 IPCV 2006 Budapest 98


Results from [Zarit et al.]
 They compared 5 different color spaces CIELab, HSV, HS,
Normalized RGB and YCrCb
 Four different metrics are used to evaluate the results of the
skin detection algorithms
– C %– Skin and Non Skin pixels identified correctly
– S %– Skin pixels identified correctly
– SE – Skin error – skin pixels identified as non skin
– NSE – Non Skin error – non skin pixels identified as skin
 They compared the 5 color spaces with 2 color models –
a look-up table and a Bayes classifier

30/08/2006 IPCV 2006 Budapest 99


Look up table results
 HSV, HS
gave the best
results
 Normalized
rg is not far
behind
 CIELAB and
YCrCb gave
poor results
30/08/2006 IPCV 2006 Budapest 100
Color-Based Face Detector: Summary
 Pros:
– Easy to implement
– Effective and efficient in constrained
environment
– Insensitive to pose, expression, rotation
variation
 Cons:
– Sensitive to environment and lighting change
– Noisy detection results (body parts, skin-tone-like
regions)

30/08/2006 IPCV 2006 Budapest 101


Agenda (Detection)
 Detecting faces in gray scale images
– Knowledge-based
– Feature-based
– Template-based
– Appearance-based
 Detecting faces in color images
 Detecting faces in video
 Performance evaluation
 Research direction and concluding remarks

30/08/2006 IPCV 2006 Budapest 102


Video-Based Face Detector
 Motion cues:
– Frame differencing
– Background modeling and subtraction
 Can also use depth cues (e.g., from stereo)
when available
 Reduces the search space dramatically

30/08/2006 IPCV 2006 Budapest 103


HONDA Humanoid Robot: ASIMO
 Using motion and depth cue
– Motion cue: from a gradient-based tracker
– Depth cue: from stereo camera
 Dramatically reduce search space
 Cascade face detector

30/08/2006 IPCV 2006 Budapest 104


Video-Based Detectors: Summary
 Pros:
– An easier problem than detection in still
images
– Use all available cues: motion, depth, voice,
etc. to reduce search space
 Cons:
– Need efficient and effective methods to
process the multimodal cues
– Data fusion
30/08/2006 IPCV 2006 Budapest 105
Agenda (Detection)
 Detecting faces in gray scale images
– Knowledge-based
– Feature-based
– Template-based
– Appearance-based
 Detecting faces in color images
 Detecting faces in video
 Performance evaluation
 Research direction and concluding remarks

30/08/2006 IPCV 2006 Budapest 106


Performance Evaluation
 Need to set the evaluation criteria/protocol
– Training set
– Test set
– What is a correct detect?
– Detection rate: false positive/negative
– Precision of face location
– Speed: training/test stage

30/08/2006 IPCV 2006 Budapest 107


Training Sets
 Cropped and pre-processed data set with face
and non-face images provided by MIT CBCL:
http://www.ai.mit.edu/projects/cbcl.old/software-
datasets/index.html

30/08/2006 IPCV 2006 Budapest 108


Standard Test Sets
 MIT test set: subsumed by CMU test set
 CMU test set (http://www.cs.cmu.edu/~har)
(de facto benchmark): 130 gray scale
images with a total of 507 frontal faces
 CMU profile face test set: 208 images with
faces in profile views
 Kodak data set (Eastman Kodak Corp):
faces of multiple size, pose and varying
lighting conditions in color images
30/08/2006 IPCV 2006 Budapest 109
CMU Test Set I: Upright Frontal Faces
 130 images with 507
frontal faces
 Collected by K.-K. Sung and H. Rowley
 Including 23 images used
in [Sung and Poggio 94]
 Some images have low
resolution
 De facto benchmark

30/08/2006 IPCV 2006 Budapest 110


CMU Test Sets: Rotated and Profile
Faces
 50 images with 223 faces with in-plane rotation,
collected by H. Rowley

 208 images with 441 faces, collected by
H. Schneiderman

30/08/2006 IPCV 2006 Budapest 111


Agenda (Detection)
 Detecting faces in gray scale images
– Knowledge-based
– Feature-based
– Template-based
– Appearance-based
 Detecting faces in color images
 Detecting faces in video
 Performance evaluation
 Research direction and concluding remarks

30/08/2006 IPCV 2006 Budapest 112


Research Issues
 Representation: How to describe a typical face?
 Scale: How to deal with faces of different sizes?
 Search strategy: How to spot these faces?
 Speed: How to speed up the process?
 Precision: How to locate the faces precisely?
 Post processing: How to combine detection
results?

30/08/2006 IPCV 2006 Budapest 113


Face Detection: A Solved Problem?
 Not quite yet…
 Factors:
– Shadows
– Occlusions
– Robustness
– Resolution
 Lots of potential
applications
 Can be applied to
other domains

30/08/2006 IPCV 2006 Budapest 114


Outline of Recognition Part
 Objective
– Categorization
– Brief description of representative face
recognition techniques
– Open research issues

 This part of the presentation is based on the survey by W. Zhao,
R. Chellappa, P. J. Phillips and A. Rosenfeld:
“Face Recognition: A Literature Survey”, ACM Computing Surveys,
Vol. 35, Dec. 2003

30/08/2006 IPCV 2006 Budapest 115


Recognition from Intensity Images
High-level Categorization
 1. Holistic matching methods
– Classification using whole face region
 2. Feature-based (structural) matching methods
– Structural classification using local features
 3. Hybrid Methods
– Using local features and whole face region

30/08/2006 IPCV 2006 Budapest 116


Holistic Approaches
 Principal Component Analysis (PCA)
– Eigenface
– Fisherface/Subspace LDA
– SVM
– ICA
 Other Representations
– LDA/FLA
– PDBNN
30/08/2006 IPCV 2006 Budapest 117
IPCV, Saint-Etienne, 2004, Prof. V. Bochko

What is PCA?
 Principal component analysis (PCA) is a
technique for extracting a smaller set of
variables with less redundancy from high-
dimensional data sets in order to retain as
much of the information as possible
 A small number of principal components
often give most of the structure in the data

30/08/2006 IPCV 2006 Budapest 118


IPCV, Saint-Etienne, 2004, Prof. V. Bochko

Standard PCA
 Assume an n-dimensional observed vector x
 The vector is centered: xc = x − m, where m = E[x]
 The covariance matrix is C = E[xc xc^T], where E is the expectation
 The eigendecomposition is C = U Λ U^T
 The matrix of eigenvectors is U = [u1, …, un], where C ui = λi ui
 The eigenvector ui corresponds to eigenvalue λi
 The eigenvalues λ1 ≥ λ2 ≥ … ≥ λn are in descending order
 The first k principal components are y = Uk^T xc, with Uk = [u1, …, uk]
 Finally, the reconstruction is x ≈ Uk y + m

30/08/2006 IPCV 2006 Budapest 119
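
A minimal NumPy sketch of exactly these steps on a small data matrix; the toy data are random and the covariance is estimated from samples rather than taken as a true expectation.

import numpy as np

def pca(X, k):
    """X: (n_samples, n_dims) data matrix. Returns (mean, top-k eigenvectors)."""
    m = X.mean(axis=0)                         # m = E[x]
    Xc = X - m                                 # centered vectors
    C = Xc.T @ Xc / X.shape[0]                 # covariance matrix estimate
    eigvals, U = np.linalg.eigh(C)             # eigendecomposition (ascending order)
    order = np.argsort(eigvals)[::-1]          # sort eigenvalues in descending order
    return m, U[:, order[:k]]

def project(x, m, Uk):
    return Uk.T @ (x - m)                      # first k principal components

def reconstruct(y, m, Uk):
    return Uk @ y + m                          # approximation of the original vector

X = np.random.default_rng(0).normal(size=(100, 5))    # toy 5-dimensional data
m, Uk = pca(X, k=2)
y = project(X[0], m, Uk)
x_hat = reconstruct(y, m, Uk)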


Facial Recognition: PCA - Overview
 Create training set of faces and
calculate the eigenfaces
 Project the new image onto the
eigenfaces
 Check if image is close to “face
space”
 Check closeness to one of the
known faces
 Add unknown faces to the
training set and re-calculate

30/08/2006 IPCV 2006 Budapest 120


Facial Recognition: PCA Training
 Find average of training
images
 Subtract average face from
each image
 Create covariance matrix
 Generate eigenfaces
 Each original image can be
expressed as a linear
combination of the
eigenfaces – face space
30/08/2006 IPCV 2006 Budapest 121
Facial Recognition: PCA Recognition
 A new image is projected into the “face space”
 Create a vector of weights that describes this image
 Compare the distance from the image to the face space and to the
weight vectors of the known faces
 If the distances are within certain thresholds, it is a recognized face

30/08/2006 IPCV 2006 Budapest 122
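
Continuing the PCA sketch above, a hedged version of this recognition step: known faces are stored as weight vectors, and a new image is accepted only if its distance to the face space (reconstruction error) and its distance to the nearest stored face are below thresholds; the thresholds, basis and gallery values below are invented.

import numpy as np

def recognize(x, m, Uk, gallery, t_face_space=1.0, t_identity=1.0):
    """gallery: dict of person name -> stored weight vector in the same PCA basis."""
    y = Uk.T @ (x - m)                               # project into face space
    recon_err = np.linalg.norm((x - m) - Uk @ y)     # distance to face space
    if recon_err > t_face_space:
        return "not a face"
    name, d = min(((p, np.linalg.norm(y - w)) for p, w in gallery.items()),
                  key=lambda t: t[1])                # nearest known face
    return name if d < t_identity else "unknown face"

# Toy usage with an identity-like basis; in practice m and Uk come from PCA training
m, Uk = np.zeros(5), np.eye(5)[:, :2]
gallery = {"person_A": np.array([0.0, 0.0]), "person_B": np.array([3.0, 1.0])}
print(recognize(np.array([0.1, -0.2, 0.0, 0.0, 0.0]), m, Uk, gallery))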


Feature Based Approaches
 Elastic bunch graph matching using Gabor
wavelets

30/08/2006 IPCV 2006 Budapest 123


Detecting Feature Points
 Use Gabor wavelets
 Create a filter bank with 5 frequencies and 8 orientations
[Figure: spatial view and top view of a Gabor kernel]


30/08/2006 IPCV 2006 Budapest 124
Gabor filters
 The responses of the filters [Figure: original, filtered, result]
 2D Gabor filters can be used to detect structure along a specific
orientation
 Jet: the set of 40 complex coefficients at a reference point

30/08/2006 IPCV 2006 Budapest 125
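
A Python sketch of such a filter bank and jet extraction; the kernel parameterization (wavelengths, Gaussian width, kernel size) is a common textbook form with illustrative values, not parameters taken from the slides.

import numpy as np
from scipy import ndimage

def gabor_kernel(wavelength, theta, size=21, sigma=4.0):
    """Complex 2D Gabor kernel: a plane wave at angle theta times a Gaussian."""
    half = size // 2
    ys, xs = np.mgrid[-half:half + 1, -half:half + 1]
    xr = xs * np.cos(theta) + ys * np.sin(theta)        # rotate coordinates
    envelope = np.exp(-(xs**2 + ys**2) / (2 * sigma**2))
    return envelope * np.exp(1j * 2 * np.pi * xr / wavelength)

def gabor_jet(image, x, y, wavelengths=(4, 6, 8, 11, 16), n_orient=8):
    """Jet: 5 frequencies x 8 orientations = 40 complex responses at (x, y)."""
    jet = []
    for lam in wavelengths:
        for k in range(n_orient):
            kern = gabor_kernel(lam, theta=k * np.pi / n_orient)
            # filter the whole image (wasteful but simple) and read one point
            resp = (ndimage.convolve(image, kern.real, mode="nearest")
                    + 1j * ndimage.convolve(image, kern.imag, mode="nearest"))
            jet.append(resp[y, x])
    return np.array(jet)            # 40 complex coefficients

img = np.random.rand(64, 64)        # toy image; a fiducial point at (32, 32)
jet = gabor_jet(img, 32, 32)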


Face Graph
 Facial fiducial points
– Pupil, tip of mouth, etc.
 Face graph
– Nodes at the fiducial points
– Un-directed graph
– The structure of the graph is the same for each face
– Fitting a face image to a face graph is done automatically
– Some nodes may be undefined due to occlusion
 Bunch
– A set of jets associated with the same fiducial point
– E.g., an eye bunch may consist of jets for different types of eyes:
open, closed, etc.
 Face bunch graph (FBG)
– Same as a face graph, except each node consists of a jet bunch
rather than a single jet

30/08/2006 IPCV 2006 Budapest 126


Face Bunch Graph
 FBG has the same structure as
individual face graph
– Each node labeled with a
bunch of jets
– Each edge labeled with
average distance between
corresponding nodes in face
samples

30/08/2006 IPCV 2006 Budapest 127


Still Face Recognition: Open Issues
 Addressing the issue of recognition being too
sensitive to inaccurate facial feature localization
 Robustly recognizing faces
– Small and/or noisy images
– Images acquired years apart
– Outdoor acquisition: lighting, pose
 What is the limit on the number of faces that can
be distinguished?
 What is the principled and optimal way to arbitrate
between and combine local features and global features?

30/08/2006 IPCV 2006 Budapest 128


Web Resources (detection)
 Face detection home page
http://home.t-online.de/home/Robert.Frischholz/face.htm
 Henry Rowley’s home page
http://www-2.cs.cmu.edu/~har/faces.html
 Henry Schneiderman’s home page
http://www.ri.cmu.edu/projects/project_416.html
 MIT CBCL web page
http://www.ai.mit.edu/projects/cbcl.old/software-datasets/
index.html
 Face detection resources
http://vision.ai.uiuc.edu/mhyang/face-detection-survey.html

30/08/2006 IPCV 2006 Budapest 129


Web Resources (recognition)
Face Recognition Home Page http://www.cs.rug.nl/~peterkr/FACE/frhp.html
Face Databases
 UT Dallas www.utdallas.edu/dept/bbs/FACULTY_PAGES/otoole/database.htm
 Notre Dame database www.nd.edu/~cvrl/HID-data.html
 MIT database ftp://whitechapel.media.mit.edu/pub/images
 Edelman ftp://ftp.wisdom.weizmann.ac.il/pub/FaceBase
 CMU PIE www.ri.cmu.edu/projects/project_418.htm
 Stirling database pics.psych.stir.ac.uk
 M2VTS multimodal www.tele.ucl.ac.be/M2VTS
 Yale database cvc.yale.edu/projects/yalefaces/yalefaces.htm
 Yale database B cvc.yale.edu/projects/yalefacesB/yalefacesB.htm
 Harvard database hrl.harvard.edu/pub/faces
 Weizmann database www.wisdom.weizmann.ac.il/~yael
 UMIST database images.ee.umist.ac.uk/danny/database.html
 Purdue rvl1.ecn.purdue.edu/~aleix/aleix_face_DB.html
 Olivetti database www.cam-orl.co.uk/facedatabase.html
 Oulu physics-based www.ee.oulu.fi/research/imag/color/pbfd.html

30/08/2006 IPCV 2006 Budapest 130


References (detection)
 M.-H. Yang, D. J. Kriegman, and N. Ahuja,
“Detecting Faces in Images: A Survey”, IEEE
Transactions on Pattern Analysis and Machine
Intelligence (PAMI), vol. 24, no. 1, pp. 34-58, 2002.
(http://vision.ai.uiuc.edu/mhyang/face-detection-survey.html)
 M.-H. Yang and N. Ahuja, Face Detection and
Hand Gesture Recognition for Human Computer
Interaction, Kluwer Academic Publishers, 2001.
 P. Viola and M. J. Jones, Robust Real-time
Object Detection, Cambridge Research Laboratory
Technical Report Series, CRL 2001/01, Feb. 2001.

30/08/2006 IPCV 2006 Budapest 131


Face and Some CV-based
Projects at Budapest Tech

30/08/2006 IPCV 2006 Budapest 132


Combined verification

[Diagram] Voice branch: audio recording → voice sample → LP spectrum →
neural network
Face branch: face mask; original image → filter response → features

30/08/2006 IPCV 2006 Budapest 133
Morphing by Automatic Feature
Detection

30/08/2006 IPCV 2006 Budapest 134


Eye and Lip Tracking
 Real time face detection
 Eye and mouth motion tracking
 Client-server internet communication

30/08/2006 IPCV 2006 Budapest 135


MYRA
 Face detection
 Skin tone based
tracking
 Feature tracking
 Motion detection
 Recognition options:
– Eigenface method
– Gabor wavelet
based bunch
graph

30/08/2006 IPCV 2006 Budapest 136


Mobile robots – navigation using CV

30/08/2006 IPCV 2006 Budapest 137


Recognition and navigation

30/08/2006 IPCV 2006 Budapest 138
