
Face Detection and


Budapest Tech, John von Neumann Faculty of Informatics

30/08/2006 IPCV 2006 Budapest 1

Face tasks in CV
 Face detection: Are there any faces in the image, and if so, where?
– Concerned with a category of objects
 Face localization:
– Aims to determine the image position of a single face
– A simplified detection problem with the assumption that
an input image contains only one face
 Facial feature extraction:
– Detects the presence and location of features such as
eyes, nose, nostrils, eyebrows, mouth, lips, ears, etc.
– Usually assumes that there is only one face in an image
 Face recognition (identification): concerned with individual identity
 Facial expression recognition
 Human pose estimation and tracking
Typical Biometric Authentication
[Figure: two block diagrams. Enrollment subsystem: biometric reader → feature extractor → template. Authentication subsystem: biometric reader → feature extractor → biometric matcher (compared with the stored template) → match or no match.]

Identification vs. Verification
[Figure: identification (1:N): biometric reader → biometric matcher → “This person is Emily Dawson”. Verification (1:1): claim “I am Emily Dawson” → biometric reader → biometric matcher. A biometric database backs the matching.]
Face Recognition Applications

Automatic border control: Sydney airport
Samsung’s Magicgate: door lock control
A Solved Problem?
 Excellent results and
– Fast, multi pose,
partial occlusion
 Is face recognition
and detection a solved problem?
 Not quite

Omron’s face detector

[Liu et al. 04]
 Relevant psychophysics/neuroscience issues
 Face Detection
– Detection on gray scale images
– Detection on color images
– Research directions
 Face Recognition
– Still image face recognition
 Face projects at Budapest Tech


Psychophysics Issues
 Is face perception the result of holistic/configural or local
feature analysis? Is it a dedicated process?
– Build a special or generic system
 Ranking of significance of facial features
– Different weights for features
 Movement and face recognition
– Favor video based recognition
 Role of race/gender/familiarity
– Suggest adaptive system


Inverted Faces

Prof. Eric H. Chudler Dept. of Anesthesiology University of Washington

More Inverted Faces

Prof. Eric H. Chudler Dept. of Anesthesiology University of Washington

Negated Faces


Human Face Recognition Limitations

Harmon and Julesz, 1973

Degraded images

Humans can handle low resolution

Outline of Detection Part
 Objective
– Survey major face detection works
– Address “how” and “why” questions
– Pros and cons of detection methods
– Research directions

 This part of the presentation is based on the survey by M-H. Yang:

Face Detection
 Identify and locate
human faces in an
image regardless
of their
– position
– scale
– orientation (in-plane rotation)
– pose (out-of-plane rotation)
– and illumination
 Face is a highly non-rigid object
Where are the faces, if any?


Why Is Face Detection Important?
 First step for any fully automatic face
recognition system
 First step in many surveillance systems
 Lots of applications
 A step towards Automatic Target
Recognition (ATR) or generic object recognition

Why Is Face Detection Difficult?
 Pose (out-of-plane rotation): frontal, 45 degrees, profile, upside down
 Presence or absence of structural components: beard, mustache and glasses
 Facial expression: face appearance is directly affected by a person's facial expression
 Occlusion: faces may be partially occluded by other objects
 Orientation (in-plane rotation): face appearance varies directly with rotation about the camera's optical axis
 Imaging conditions: lighting (spectra, source distribution and intensity) and camera characteristics (sensor response, gain control, lenses), resolution

In One Thumbnail Face Image
 Consider a thumbnail 19 × 19 face pattern
 $256^{361}$ possible combinations of gray values
– $256^{361} = 2^{8 \times 361} = 2^{2888}$
 Total world population 6,600,000,000
– $6{,}600{,}000{,}000 \approx 2^{33}$
 The exponent is about 87 times larger ($2888 / 33 \approx 87$), i.e. roughly the 87th power of the world population!
 Extremely high dimensional space!
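The arithmetic on this slide can be checked in a few lines. This is only a sketch: the population figure is the one quoted above, and rounding its base-2 logarithm up to 33 is an assumption made here to reproduce the factor of about 87.

```python
import math

side = 19                     # thumbnail is 19 x 19 pixels
pixels = side * side          # number of pixels in the pattern
bits = 8 * pixels             # 8 bits per gray value

patterns = 256 ** pixels      # exact count of distinct gray-level patterns
population = 6_600_000_000    # figure quoted on the slide

# Exponent of the nearest power of two at or above the population.
population_bits = math.ceil(math.log2(population))

# How many times larger the pattern exponent is than the population exponent.
exponent_ratio = bits // population_bits
```

Python integers are arbitrary precision, so `patterns` is the exact value and not a floating-point approximation.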


What is a Correct Detection?

 Different interpretations of “correct detection”

 Precision of face location
 Affects the reported results: detection, false
positive, false negative rates
Receiver Operator Characteristic Curve
 Useful for detailed performance
 Plot true positive (TP) proportion against
the false positive (FP) proportion for
various possible settings
 False positive: Predict a face when there is
actually none
 False negative: Predict a non-face where
there is actually one
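As a rough illustration of how such a curve is obtained, the sketch below sweeps a decision threshold over classifier scores and records one (FP proportion, TP proportion) point per setting. The function name `roc_points` and the scores and labels are made up for the example.

```python
def roc_points(scores, labels):
    """Return (FP proportion, TP proportion) pairs, one per threshold setting.

    scores: classifier outputs, higher means "more face-like".
    labels: 1 for a true face window, 0 for a non-face window.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    # Use every distinct score as a decision threshold, from strict to lenient.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]   # made-up detector scores
labels = [1, 1, 0, 1, 0, 0]               # made-up ground truth
curve = roc_points(scores, labels)
```

Lowering the threshold can only add detections, so both proportions grow together along the curve.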
Receiver Operator Characteristic (ROC)
 Correct
identification of
nearly all objects
will cause a large
percentage of
unknown objects to
be incorrectly
classified into some
known class


Face Detector: Ingredients
 Target application domain: single image, video
 Representation: holistic, feature, etc.
 Preprocessing: histogram equalization, etc.
 Cues: color, motion, depth, voice, etc.
 Search strategy: exhaustive, greedy, focus of attention, etc.
 Classifier design: ensemble, cascade
 Postprocessing: interpret detection results

In our Focus
 Focus on detecting
– upright, frontal
– in a single gray-scale image
– with decent
– under good lighting


Methods to Detect/Locate Faces
 Knowledge-based methods:
– Encode human knowledge of what constitutes a typical
face (usually, the relationships between facial features)
 Feature invariant approaches:
– Aim to find structural features of a face that exist even
when the pose, viewpoint, or lighting conditions vary
 Template matching methods:
– Several standard patterns stored to describe the face
as a whole or the facial features separately
 Appearance based methods:
– The models (or templates) are learned from a set of
training images which capture the representative
variability of facial appearance


Agenda (Detection)
 Detecting faces in gray scale images
– Knowledge-based
– Feature-based
– Template-based
– Appearance-based
 Detecting faces in color images
 Detecting faces in video
 Performance evaluation
 Research direction and concluding remarks


Knowledge-Based Methods
 Top-down approach: Represent a face using a
set of human-coded rules
 Examples:
– The center part of face has uniform intensity
– The difference between the average intensity
values of the center part and the upper part is
– A face often appears with two eyes that are
symmetric to each other, a nose and a mouth
 Use these rules to guide the search process


Knowledge-Based Method: [Yang and Huang 94]
 Multi-resolution focus-of-attention approach
 Level 1 (lowest resolution): apply the rule “the center part of the face has 4 cells with a basically uniform intensity” to search for
 Level 2: local histogram equalization followed by edge detection
 Level 3: search for eye and mouth features for
Knowledge-Based Method:
[Kotropoulos & Pitas 94]

 Horizontal/vertical projection to search for candidates

 Search eyebrow/eyes, nostrils/nose for validation

 Difficult to detect multiple people or in complex


Knowledge-based Methods: Summary
 Pros:
– Easy to find simple rules to describe the features of a face
and their relationships
– Based on the coded rules, facial features in an input image
are extracted first and face candidates are identified
– Work well for face localization in uncluttered background
 Cons:
– Difficult to translate human knowledge into rules precisely:
detailed rules fail to detect faces and general rules may
find many false positives
– Difficult to extend this approach to detect faces in different
poses: implausible to enumerate all the possible cases


Agenda (Detection)
 Detecting faces in gray scale images
– Knowledge-based
– Feature-based
– Template-based
– Appearance-based
 Detecting faces in color images
 Detecting faces in video
 Performance evaluation
 Research direction and concluding remarks


Feature-based Methods
 Bottom-up approach: Detect facial features
(eyes, nose, mouth, etc) first
– How can we detect facial features: edge,
intensity, shape, texture, color, etc
– Aim to detect invariant features
 Group features into candidates and verify


Feature-based Example [Yang 2003]
 Use the Canny operator to find an edge
 Separate face from background by
grouping edges
 Best fit an ellipse to the remaining edges


Feature Grouping [Yow and Cipolla 90]
 Apply a 2nd derivative Gaussian filter to
search for interest points
 Group the edges near interest points into
 Each feature and grouping is evaluated
within a Bayesian network
 Handle a few poses


Feature Grouping [Yow and Cipolla 90]
 Face model and

 Model each facial feature as a pair of edges
 Apply an interest point operator and an edge detector to search for features
 Use a Bayesian network to combine evidence

Random Graph Matching [Leung et al. 95]
 Formulate as a problem to find the
correct geometric arrangement of
facial features
– Facial features are defined by
the average responses of multi-
orientation, multi-scale
Gaussian derivative filters
– Learn the configuration of
features with Gaussian
distribution of mutual distance
between facial features
 Random graph matching among
the candidates to locate faces
Feature-Based Methods: Summary
 Pros:
– Features are invariant to pose and orientation
 Cons:
– Difficult to locate facial features due to several kinds of
corruption (illumination, noise, occlusion)
– Difficult to detect features in complex


Agenda (Detection)
 Detecting faces in gray scale images
– Knowledge-based
– Feature-based
– Template-based
– Appearance-based
 Detecting faces in color images
 Detecting faces in video
 Performance evaluation
 Research direction and concluding remarks


Template Matching Methods
 Store a template
– Predefined: based on edges or regions
– Deformable: based on facial contours
(e.g., Snakes)
 Templates are hand-coded (not learned)
 Use correlation to locate faces
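To make "use correlation to locate faces" concrete, here is a minimal sketch of template matching by normalized cross-correlation on small gray-level grids. The helper names `ncc` and `best_match` and the tiny image are illustrative assumptions; a real detector would use a face-sized template and a proper image library.

```python
import math

def ncc(patch, template):
    """Normalized cross-correlation of two equally sized gray-level grids."""
    a = [v for row in patch for v in row]
    b = [v for row in template for v in row]
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a) *
                    sum((y - mean_b) ** 2 for y in b))
    return num / den if den else 0.0   # flat patches carry no correlation

def best_match(image, template):
    """Slide the template over the image; return (score, row, col) of the best fit."""
    th, tw = len(template), len(template[0])
    best = (-2.0, 0, 0)                # NCC is always >= -1
    for r in range(len(image) - th + 1):
        for c in range(len(image[0]) - tw + 1):
            patch = [row[c:c + tw] for row in image[r:r + th]]
            best = max(best, (ncc(patch, template), r, c))
    return best

template = [[10, 200], [200, 10]]
image = [[50, 50, 50, 50],
         [50, 12, 190, 50],
         [50, 205, 9, 50],
         [50, 50, 50, 50]]
score, row, col = best_match(image, template)
```

Subtracting the means and dividing by the norms makes the score insensitive to uniform brightness and contrast changes, which plain correlation is not.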


Face Template
 Use relative pair-wise ratios of the brightness of facial regions (14 × 16 pixels): the eyes are usually darker than the surrounding face [Sinha 94]
 Use average area intensity values rather than absolute pixel values
 See also Point Distribution Model (PDM) [Lanitis et al. 95]

Template-Based Methods: Summary
 Pros:
– Relatively simple
 Cons:
– Templates need to be initialized near the face images
– Difficult to enumerate templates for different poses (similar to knowledge-based methods)
Agenda (Detection)
 Detecting faces in gray scale images
– Knowledge-based
– Feature-based
– Template-based
– Appearance-based
 Detecting faces in color images
 Detecting faces in video
 Performance evaluation
 Research direction and concluding remarks


Appearance-Based Methods
Idea: train a classifier using positive (and
usually negative) examples of faces
 Representation
 Pre processing
 Train a classifier
 Search strategy
 Post processing


Appearance-Based Methods: Classifiers
 Neural network: Multilayer Perceptrons
 Principal Component Analysis (PCA), Factor Analysis
 Support vector machine (SVM)
 Mixture of PCA, Mixture of factor analyzers
 Distribution-based method
 Naive Bayes classifier
 Hidden Markov model
 Sparse network of winnows (SNoW)
 Kullback relative information
 Inductive learning: C4.5
 AdaBoost (Jones and Viola)
 …


 Holistic: Each image is raster scanned and
represented by a vector of intensity values
 Block-based: Decompose each face image into a
set of overlapping or non-overlapping blocks
– At multiple scales
– Further processed with vector quantization,
Principal Component Analysis, etc.


Face and Non-Face Examples
 Positive examples:
– Get as much variation as possible
– Manually crop and normalize each face image into a
standard size face (e.g., 19 × 19 pixels)
– Creating virtual examples [Sung and Poggio 94]

 Negative examples:
– Fuzzy idea
– Any images that
do not contain faces
– A large image subspace


Distribution-Based Method [Sung & Poggio 94]
 Masking: reduce the unwanted noise in a face
 Illumination gradient correction: find the best-fit brightness plane and subtract it from the window to reduce heavy shadows caused by extreme lighting angles
 Histogram equalization: compensates for the imaging effects due to changes in illumination and different camera input gains
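The histogram equalization step can be sketched as follows for a window stored as a 2-D list of integer gray values. This is a generic textbook formulation, not necessarily the exact variant used by Sung and Poggio; the function name `equalize` and the sample window are assumptions.

```python
def equalize(window, levels=256):
    """Histogram-equalize a 2-D list of integer gray values in [0, levels)."""
    flat = [v for row in window for v in row]
    hist = [0] * levels
    for v in flat:
        hist[v] += 1
    # Cumulative distribution of gray values.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    total = len(flat)
    if total == cdf_min:          # constant window: nothing to stretch
        return [row[:] for row in window]
    scale = (levels - 1) / (total - cdf_min)
    return [[round((cdf[v] - cdf_min) * scale) for v in row] for row in window]

dark = [[10, 10, 20], [20, 30, 30], [40, 40, 40]]   # low-contrast sample window
bright = equalize(dark)
```

After equalization the darkest value maps to 0 and the brightest to 255, so windows taken under different camera gains become more comparable.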


Creating Virtual Positive Examples
 Simple and very effective method
 Randomly mirror, rotate, translate and scale face samples by small
 Less sensitive to alignment error
Search over Space and Scale

 Scan an input image at one-pixel increments horizontally and vertically
 Downsample the input image by a factor of 1.2 and continue to search
Continue to Search over Space and Scale

 Continue to downsample the input image and search until the image size is too small
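The search over space and scale described on these two slides can be sketched as a generator of window positions. Only the geometry is enumerated here, no image is actually resampled. The 19-pixel window and the factor of 1.2 come from the slides; the function name `scan_positions` and the 40 × 30 test size are assumptions.

```python
def scan_positions(width, height, window=19, scale_step=1.2, stride=1):
    """Yield (level, scale, x, y) for every window position in an image pyramid.

    `scale` is the factor by which the original image has been shrunk at
    that pyramid level; (x, y) is the top-left corner of the window there.
    """
    level, scale = 0, 1.0
    w, h = width, height
    while w >= window and h >= window:       # stop when the image is too small
        for y in range(0, h - window + 1, stride):
            for x in range(0, w - window + 1, stride):
                yield level, scale, x, y
        level += 1
        scale *= scale_step
        w, h = int(width / scale), int(height / scale)

positions = list(scan_positions(40, 30))
levels = {p[0] for p in positions}
```

Even this tiny 40 × 30 image yields several hundred windows, which is why the per-window classifier has to be cheap.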
Detecting Faces in Different Pose
 Probabilistic Modeling of Local Appearance
 Extend to detect faces in different poses with multiple detectors
 Each detector specializes to a view: frontal, left pose and right pose

[Schneiderman and Kanade 98]

Viola & Jones: Robust Real-time Face Detection

Viola and Jones

 Goals
 Motivation
 Features (rectangle features & integral image)
 Learning classifier functions (AdaBoost)
 Cascade of classifiers
 Results


Goals of Viola and Jones

 Robust real-time face detection in images
 Decide for each sub-window (starting at
24x24) of the image whether it is a face

Motivation

 Rapid face detection in images and image
 Previous approaches too slow
– They are using auxiliary information
(image differences, colors)
 Only grey scale images needed


The Framework of the Method

 Haar-like Feature: A feature representation
 Integral Image: A new image representation
and a fast Haar-like feature calculation
 Perceptron: Classifier, uses the features
 Boost Algorithm: An efficient classifier
generating algorithm
 Cascade: Method for combining classifiers

Features

 Detection based on features, not pixels

 Why?
– Features can encode domain knowledge
that is difficult to learn
– Faster than a pixel-based system
– Scale independent


Rectangle Features

 (Sum of pixels in black rectangles) - (sum of pixels in white rectangles)
 Features are very simple → rapid
 How to compute them rapidly?

Integral Image
 Rectangle features can be computed rapidly using an intermediate representation of the image: the integral image
 Value at point (x,y) is the sum of all the pixel values above and to the left

Integral Image
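 The integral image can be computed with
the following recurrences:

$$s(x, y) = s(x, y - 1) + i(x, y)$$
$$ii(x, y) = ii(x - 1, y) + s(x, y)$$

 where
i(x, y): original image, ii(x, y): integral image,
s(x, y): cumulative column sum, with s(x, -1) = 0 and ii(-1, y) = 0

 One pass over the image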


Computation of Rectangular Sums

Sum of rectangle D: ii(4) – ii(3) – ii(2) + ii(1)
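A minimal sketch of the integral image and of the four-reference rectangle sum follows, using plain nested lists. The function names and the small test image are illustrative. The corner numbering in the comment follows the formula above: 4 is the bottom-right corner of D, 1 the corner diagonally opposite.

```python
def integral_image(img):
    """Compute ii in one pass, using the cumulative column sum s."""
    h, w = len(img), len(img[0])
    ii = [[0] * w for _ in range(h)]
    s = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            s[y][x] = (s[y - 1][x] if y > 0 else 0) + img[y][x]
            ii[y][x] = (ii[y][x - 1] if x > 0 else 0) + s[y][x]
    return ii

def rect_sum(ii, x, y, w, h):
    """Sum of the w x h rectangle whose top-left pixel is (x, y)."""
    def at(px, py):               # ii outside the image counts as 0
        return ii[py][px] if px >= 0 and py >= 0 else 0
    x1, y1 = x + w - 1, y + h - 1
    # Corner 4 minus corners 3 and 2, plus corner 1.
    return at(x1, y1) - at(x - 1, y1) - at(x1, y - 1) + at(x - 1, y - 1)

def two_rect_feature(ii, x, y, w, h):
    """Left half minus right half of a (2*w) x h window: a two-rectangle feature."""
    return rect_sum(ii, x, y, w, h) - rect_sum(ii, x + w, y, w, h)

img = [[1, 2, 3, 4],
       [5, 6, 7, 8],
       [9, 10, 11, 12]]
ii = integral_image(img)
```

Each rectangle sum costs the same four look-ups regardless of the rectangle's size. `two_rect_feature` makes eight look-ups for clarity; because the two rectangles share an edge, two corners coincide and six references suffice.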


Integral Image
 Rectangular sum can be computed in four array references
 Two-rectangle features need six array references
 Three-rectangle features need eight array references
 Four-rectangle features need nine array references

Feature properties
 Rough, but sensitive to simple image
 Only 2 orientations
 Extremely efficient
 Compensates for the disadvantages
 Face detection for an entire image at every
scale in real-time


Learning Classification Functions

 Given: feature set and training set of positive +
negative face images
– In principle any machine learning approach
can be used (SVM, neural network, …)
– But: 45,396 rectangle features for each sub-window
 Experiments showed that only a few features are
needed for an efficient classifier
 How to find them?


AdaBoost-Based Detector [Viola and Jones 01]

 Main idea:
– Feature selection: select important features
– Focus of attention: focus on potential regions
– Use an integral image for fast feature evaluation
 Use AdaBoost to learn
– A small set of important features (feature selection)
 sort them in the order of importance
 each feature can be used as a simple (weak) classifier
– A cascade of classifiers that
 combine all the weak classifiers to do a difficult task
 filter out the regions that most likely do not contain faces


Feature Selection [Viola and Jones 01]

 Training: If x is a face, then x
– most likely has feature 1 (easiest feature, and of greatest importance)
– is very likely to have feature 2 (easy feature)
– …
– is likely to have feature n (more complex feature, and of less importance
since it does not exist in all the faces in the training set)
 Testing: Given a test sub-image x’
– if x’ has feature 1:
 Test whether x’ has feature 2
– …
– Test whether x’ has feature n
– else …
 else, it is not a face
– else, it is not a face
 Similar to a decision tree


 Variant of AdaBoost is used both to select
features and train the classifier
 AdaBoost is used to boost classification
performance of a simple (weak) learning algorithm
 It combines a collection of weak classifiers to form
a stronger one
 The weak learning algorithm is designed to select
the single rectangle feature which best separates
the positive and negative examples


 Searches over the set of possible weak classifiers and selects those with the lowest classification error
 Greedy feature selection process
 Efficient for selecting a small number of “good” features (fj(x)) with significant variety
 Weak classifier: $h_j(x) = 1$ if $p_j f_j(x) < p_j \theta_j$, and $0$ otherwise (threshold $\theta_j$, parity $p_j$)
 Error rates between 0.1 and 0.3 (early rounds) and between 0.4 and 0.5 (later rounds)


AdaBoost in Detail
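 Given example images $(x_1, y_1), \dots, (x_n, y_n)$, where $y_i = 0, 1$ for negative and positive examples respectively
 Initialize weights $w_{1,i} = \frac{1}{2m}, \frac{1}{2l}$ for $y_i = 0, 1$, where m and l are the numbers of negatives and positives
 For t = 1,…,T:
1. Normalize the weights: $w_{t,i} \leftarrow w_{t,i} / \sum_{k=1}^{n} w_{t,k}$
2. For each feature j, train a classifier $h_j$ which is restricted to using a single feature; the error is $\epsilon_j = \sum_i w_{t,i} \, |h_j(x_i) - y_i|$
3. Choose the classifier $h_t$ with the lowest error $\epsilon_t$
4. Update the weights: $w_{t+1,i} = w_{t,i} \, \beta_t^{1 - e_i}$, with $\beta_t = \epsilon_t / (1 - \epsilon_t)$,
where $e_i = 0$ if $x_i$ is classified correctly, else $e_i = 1$
 Final strong classifier: $h(x) = 1$ if $\sum_{t=1}^{T} \alpha_t h_t(x) \ge \frac{1}{2} \sum_{t=1}^{T} \alpha_t$, and $0$ otherwise, where $\alpha_t = \log(1 / \beta_t)$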
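The loop above can be sketched in a few dozen lines. This is a toy illustration under stated assumptions, not the authors' implementation: each weak classifier is a threshold on one precomputed feature value, thresholds are searched exhaustively over the observed values, and the error is clamped to avoid a division by zero on perfectly separable data. All names and the six-example data set are made up.

```python
import math

def train_adaboost(features, labels, rounds):
    """AdaBoost over single-feature threshold classifiers.

    features[i][j]: value of feature j on example i; labels[i]: 1 face, 0 non-face.
    Returns a list of (alpha, j, theta, polarity) weak classifiers.
    """
    n, k = len(labels), len(features[0])
    pos, neg = sum(labels), n - sum(labels)
    w = [1 / (2 * pos) if y == 1 else 1 / (2 * neg) for y in labels]
    strong = []
    for _ in range(rounds):
        total = sum(w)
        w = [wi / total for wi in w]                     # 1. normalize
        best = None
        for j in range(k):                               # 2. one stump per feature
            for theta in sorted({f[j] for f in features}):
                for p in (1, -1):
                    preds = [1 if p * f[j] < p * theta else 0 for f in features]
                    err = sum(wi for wi, h, y in zip(w, preds, labels) if h != y)
                    if best is None or err < best[0]:
                        best = (err, j, theta, p, preds)
        err, j, theta, p, preds = best                   # 3. lowest error
        err = min(max(err, 1e-10), 1 - 1e-10)            # guard against log(0)
        beta = err / (1 - err)
        # 4. shrink the weights of correctly classified examples (e_i = 0)
        w = [wi * (beta if h == y else 1.0) for wi, h, y in zip(w, preds, labels)]
        strong.append((math.log(1 / beta), j, theta, p))
    return strong

def classify(strong, x):
    """Final strong classifier: weighted vote against half the total alpha."""
    vote = sum(a for a, j, theta, p in strong if p * x[j] < p * theta)
    return 1 if vote >= 0.5 * sum(a for a, _, _, _ in strong) else 0

X = [[1, 5], [2, 1], [3, 4], [6, 2], [7, 6], [8, 3]]   # two made-up features
y = [1, 1, 1, 0, 0, 0]
strong = train_adaboost(X, y, rounds=2)
```

On this toy set feature 0 separates the classes on its own, so it is selected first; real rectangle features are far weaker individually, which is why many rounds are needed.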


Feature Selection by AdaBoost

 First and second feature selected by AdaBoost
 60 microprocessor


AdaBoost – Learning Results

PIII 700 MHz, 2001
 200-feature-classifier
 Detection rate: 95%
 False positive rate: 1/14084
 Takes 0.7 seconds to scan 384x288 image
 Adding features to the classifier increases
computation time


Cascade of Classifiers
 0.7 s for 1 image is too slow
 Idea: cascade of classifiers to reduce computation time
 Simple classifiers reject the majority of negative sub-windows: many sub-windows have no faces
[Figure: all sub-windows → classifier 1 → (T) classifier 2 → (T) classifier 3 → (T) further processing]
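The control flow of a cascade is simple enough to sketch directly. The stages below are hypothetical stand-ins, each a threshold on one made-up feature value, and the `calls` list exists only to show which stages actually ran.

```python
def cascade_detect(stages, window):
    """Return True only if every stage accepts the window.

    stages: list of callables, cheapest first; each returns True ("maybe a
    face, pass it on") or False ("reject now").
    """
    for stage in stages:
        if not stage(window):
            return False          # rejected early, later stages never run
    return True

# Hypothetical stages on a window summarised as a dict of feature values.
calls = []

def make_stage(name, key, threshold):
    def stage(window):
        calls.append(name)        # record that this stage was evaluated
        return window[key] > threshold
    return stage

stages = [make_stage("stage1", "f1", 0.2),
          make_stage("stage2", "f2", 0.5),
          make_stage("stage3", "f3", 0.8)]

background = {"f1": 0.1, "f2": 0.9, "f3": 0.9}
face = {"f1": 0.6, "f2": 0.7, "f3": 0.9}

is_background_face = cascade_detect(stages, background)
calls_for_background = list(calls)
calls.clear()
is_face = cascade_detect(stages, face)
```

The background window costs one stage evaluation while the face costs three. Since most sub-windows are background, the average cost per window stays close to that of the first stage.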

Cascade of Classifiers - Experiment

 Monolithic 200-feature classifier vs. cascade of
ten 20-feature classifiers
 Trained with 5000 faces and 10000 non-face


Results - Training
 Structure of the Cascade:
– 32 layers with 4297 features
– First layer uses two features, 60% non-faces
rejected, close to 100% faces detected
– Second layer uses five features, rejects 80%
– 3rd 20-feature-classifier
– 4th 50-feature-classifier
– 5th 100-feature-classifier
– 20th 200-feature-classifier


Results - Speed
 Speed of the detector depends on the
number of features evaluated
 On the MIT+CMU test set, an average of 8 features (out of 4297) is evaluated per sub-window
 On a 700 MHz Pentium III, a 384x288 image takes 0.067 s (2001)


 Viola & Jones improved their detector to detect not
only frontal faces, but also non-upright and non-
frontal faces
 Additional rectangle features

 Pose estimator
 Classifier for each rotation class (12)


Haar-like Features Extension
 45º rotated rectangle feature by Lienhart
 Feature is represented by (type, x, y, w, h): type [1(a), 2(b), etc.]; (x, y) for location in the detection window and (w, h) for feature size
 Number of features: for a 24x24 window there are 117,941 different features, a large number
Integral Image, Fast Feature
 Corresponding to the original feature and the 45º rotated feature, there are two types of integral image, which Lienhart calls the Summed Area Table, SAT(x,y), and the Rotated Summed Area Table

Integral Image, Fast Feature
 SAT and RSAT Calculation:

 Sum of Rect Calculation:

 Feature value: calculated as the weighted sum of the rectangles
Appearance-Based Methods: Summary
 Pros:
– Use powerful machine learning algorithms
– Has demonstrated good empirical results
– Fast and fairly robust
– Extended to detect faces in different pose and
 Cons
– Usually needs to search over space and scale
– Need lots of positive and negative examples
– Limited view-based approach


Agenda (Detection)
 Detecting faces in gray scale images
– Knowledge-based
– Feature-based
– Template-based
– Appearance-based
 Detecting faces in color images
 Detecting faces in video
 Performance evaluation
 Research direction and concluding remarks


Color-Based Face Detector
 Distribution of skin color across different ethnic
– Under controlled illumination conditions:
– Arbitrary conditions: less compact
 Color space
– RGB, normalized RGB, HSV, HSI, YCrCb,
 Statistical analysis
– Histogram, look-up table, Gaussian model,
mixture model, …
Methods of Skin Detection
 Pixel-Based Methods
– Classify each pixel as skin or non-skin individually,
independently from its neighbors
– Simpler, faster
– Color Based Methods fall in this category
 Region Based Methods
– Try to take the spatial arrangement of skin pixels into
account during the detection stage to enhance the
method's performance
– More computationally expensive
– Additional knowledge in terms of texture etc. are


Skin Color Based Methods - Advantages
 Allows fast processing
 Robust to geometric variations of the skin
 Robust under partial occlusion
 Robust to resolution changes
 Experience suggests that human skin has a
characteristic color, which is easily recognized
by humans


Skin Modelling
 Explicitly defined skin region
– Using simple rules
 Non-parametric skin distribution modeling -
Estimate skin color distribution from the training
data without deriving an explicit model of the skin
– Look up table or Histogram Model
– Bayes Classifier
 Parametric skin distribution modeling - Deriving a
parametric model from the training set
Explicit Rule
 Define explicitly (through a number of rules) the boundaries of the skin cluster in some color space
 E.g. in some space (RGB, rgb, HSI, YUV, etc.)
– (R’,G’,B’) is classified as skin if (Peer, 2003):
 R’ > 95 and G’ > 40 and B’ > 20,
 max{R’,G’,B’} – min{R’,G’,B’} > 15, and
 |R’-G’| > 15 and R’ > G’ and R’ > B’

 The simplicity of this method is attractive

 Simplicity of skin detection rules leads to construction of a very
rapid classifier
 The main difficulty in achieving high recognition rates is the need to find
– Good color space
– Adequate decision rules
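The explicit rule listed on this slide translates almost literally into code. The sketch assumes 8-bit R, G, B values; the function names and the four sample pixels are illustrative.

```python
def is_skin_rgb(r, g, b):
    """Explicit RGB skin rule as listed on the slide (8-bit channel values)."""
    return (r > 95 and g > 40 and b > 20
            and max(r, g, b) - min(r, g, b) > 15
            and abs(r - g) > 15 and r > g and r > b)

def skin_mask(image):
    """Apply the rule to every pixel of a 2-D list of (r, g, b) tuples."""
    return [[is_skin_rgb(*px) for px in row] for row in image]

image = [[(220, 170, 140), (30, 30, 30)],
         [(100, 100, 100), (200, 120, 90)]]
mask = skin_mask(image)
```

The gray pixel fails the spread and the red-dominance tests, and the dark pixel fails the minimum-brightness test. Each pixel is classified with a handful of comparisons, which is what makes this kind of classifier so fast.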


Example: YCrCb Space [Yoo and Kim]
 Skin-colors form a cluster in YCrCb color space
 We can approximate the skin color region with
several lines or with an ellipse


Color Histogram
 If
– h(rk, gk, bk) = nk, i.e. the histogram bin of color (rk, gk, bk) holds nk of the N pixels,
 Then
– p(rk, gk, bk) = nk / N
– p(r’k, g’k, b’k) = (N-nk) / N

 That is:
– If there are nk pixels of skin color (rk, gk, bk) among the N pixels in the image,
the probability that a pixel has this skin color is nk/N
– and the probability that a pixel does NOT have it
is (N-nk)/N


Bayes Probability - Recall
 P(A,B) = P(A|B) x P(B)
 P(A,B) = P(B|A) x P(A)
 If P(B) + P(~B) = 1, and P(A) + P(~A) = 1, then
– P(A) = P(A|B) x P(B) + P(A|~B) x P(~B)
– P(B) = P(B|A) x P(A) + P(B|~A) x P(~A)

 P(A|B) x P(B) = P(B|A) x P(A)

$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)} = \frac{P(B \mid A)\,P(A)}{P(B \mid A)\,P(A) + P(B \mid \lnot A)\,P(\lnot A)}$$


Bayes Classifier (1)
 p(r,g,b|skin) is a conditional probability - the probability of observing the color (r,g,b), knowing that we see a skin pixel
 A more appropriate measure for skin detection would be p(skin|r,g,b) - the probability of observing skin, given a concrete (r,g,b) color value
 To compute this probability, the Bayes rule is used

Bayes Classifier (2)

 p(r,g,b|skin) and p(r,g,b|non-skin) can be computed from skin and non-skin color histograms
 The prior probabilities p(skin) and p(non-skin) can also be estimated from the overall number of skin and non-skin samples in the training set
 An inequality p(skin|r,g,b) >= Q, where Q is a threshold value, can be used as a skin detection rule
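A histogram-based Bayes skin classifier along these lines might look as follows. It is a sketch under assumptions: colors are quantized into 32 bins per channel, the priors are taken from the sample counts, and a color never seen in training gets posterior 0. The function name `train_skin_bayes` and the handful of training pixels are made up.

```python
from collections import Counter

def train_skin_bayes(skin_pixels, nonskin_pixels, bins=32):
    """Build p(skin | color) from labelled (r, g, b) pixels via histograms."""
    step = 256 // bins

    def quantize(px):
        return tuple(c // step for c in px)

    skin_hist = Counter(quantize(p) for p in skin_pixels)
    non_hist = Counter(quantize(p) for p in nonskin_pixels)
    n_skin, n_non = len(skin_pixels), len(nonskin_pixels)
    p_skin = n_skin / (n_skin + n_non)          # prior from sample counts
    p_non = 1 - p_skin

    def posterior(px):
        key = quantize(px)
        like_skin = skin_hist[key] / n_skin      # p(r,g,b | skin)
        like_non = non_hist[key] / n_non         # p(r,g,b | non-skin)
        evidence = like_skin * p_skin + like_non * p_non
        return like_skin * p_skin / evidence if evidence else 0.0

    return posterior

skin = [(200, 120, 90)] * 3 + [(100, 100, 100)]
nonskin = [(100, 100, 100)] * 3 + [(30, 30, 30)]
posterior = train_skin_bayes(skin, nonskin)

Q = 0.5                                          # illustrative threshold
is_skin = posterior((201, 121, 91)) >= Q
```

Evaluating `posterior` on every pixel of an image gives a skin probability map, and comparing it with Q gives the detection rule.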


Bayes Classifier (3)
 Consider normalised RGB space
 Histograms approximate probability densities
– p(skin) can be estimated as the ratio of the skin size
Nskin to the size of the image Ntotal ,
– p(non-skin) can be estimated as the ratio of the non-skin
size Nnon-skin to the size of the image Ntotal ,
– p((r,g,b)|skin) the probability density of (r,g,b) given
the skin,
– p((r,g,b)|non-skin) the probability density of (r,g,b)
given the non-skin,


Bayes Classifier (4)
 Then we can compute for any color (r,g,b)
the probability p(skin|(r,g,b)) → Probability
Map of the image
 Skin Probability Map (SPM)
 That is:

Ratio of


Bayes Classifier from [Rehg & Jones]
 Skin photos:

 Non skin photos:


Results from [Rehg & Jones]
 Used 18 696 images to build a general color model
 Density is concentrated around the gray line and is more
sharply peaked at white than black
 Most colors fall on or near the gray line
 Black and white are by far the most frequent colors, with
white occurring slightly more frequently
 There is a marked skew in the distribution toward the red
corner of the color cube
 77% of the possible 24 bit RGB colors are never
encountered (i.e. the histogram is mostly empty)
 52% of web images have people in them


Marginal Distributions [Rehg & Jones]


Skin model [Rehg & Jones]


Non Skin Model [Rehg & Jones]


Skin and Non-Skin Color Model
 Analyze 1 billion labeled pixels
 Skin and non-skin models
 A significant degree of separation between
skin and non-skin models
 Achieves 80% detection rate with 8.5%
false positives
 Histogram method outperforms Gaussian
mixture model
Experimental Results [Jones and Rehg 02]
 May have lots of
false positives in
the raw results
 Need further
processing to
eliminate false
negatives and
group color pixels
for face detection


Results from [Zarit et al.]
 They compared 5 different color spaces: CIELab, HSV, HS, Normalized RGB and YCrCb
 Four different metrics are used to evaluate the results of the skin detection algorithms
– C %: skin and non-skin pixels identified correctly
– S %: skin pixels identified correctly
– SE: skin error, i.e. skin pixels identified as non-skin
– NSE: non-skin error, i.e. non-skin pixels identified as skin
 They compared the 5 color spaces with 2 color models: look-up table and Bayes classifier
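The four metrics can be computed from a predicted mask and a ground-truth mask as sketched below. The slide does not say how SE and NSE are normalized, so this sketch reports them as raw pixel counts, which is an assumption. C is taken over all pixels and S over the true skin pixels.

```python
def skin_metrics(predicted, truth):
    """Compare two equally sized flat lists of booleans (True = skin)."""
    pairs = list(zip(predicted, truth))
    correct = sum(p == t for p, t in pairs)
    skin_total = sum(t for _, t in pairs)
    skin_hit = sum(p and t for p, t in pairs)
    return {
        "C": 100.0 * correct / len(pairs),          # all pixels labelled correctly, %
        "S": 100.0 * skin_hit / skin_total,         # skin pixels labelled correctly, %
        "SE": sum(t and not p for p, t in pairs),   # skin called non-skin (count)
        "NSE": sum(p and not t for p, t in pairs),  # non-skin called skin (count)
    }

truth     = [True, True, True, False, False, False, False, False]
predicted = [True, True, False, True, False, False, False, False]
m = skin_metrics(predicted, truth)
```

Reporting S next to C matters because skin is usually a small fraction of the image: a detector that labels everything as non-skin would still score a high C.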


Look up table results
gave the best
 Normalized
rg is not far
 CIELAB and
YCrCb gave
poor results
Color-Based Face Detector: Summary
 Pros:
– Easy to implement
– Effective and efficient in constrained
– Insensitive to pose, expression, rotation
 Cons:
– Sensitive to environment and lighting change
– Noisy detection results (body parts, skin-tone-like
regions)


Agenda (Detection)
 Detecting faces in gray scale images
– Knowledge-based
– Feature-based
– Template-based
– Appearance-based
 Detecting faces in color images
 Detecting faces in video
 Performance evaluation
 Research direction and concluding remarks


Video-Based Face Detector
 Motion cues:
– Frame differencing
– Background modeling and subtraction
 Can also use depth cues (e.g., from stereo)
when available
 Reduces the search space dramatically
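Frame differencing, the simplest of the motion cues above, can be sketched on gray-level frames stored as 2-D lists. The threshold of 25 gray levels and the function names are assumptions; the bounding box of the moving pixels is the reduced region in which a face detector would then be run.

```python
def motion_mask(prev_frame, frame, threshold=25):
    """Mark pixels whose gray value changed by more than `threshold`."""
    return [[abs(a - b) > threshold for a, b in zip(row_prev, row_cur)]
            for row_prev, row_cur in zip(prev_frame, frame)]

def bounding_box(mask):
    """Smallest (x0, y0, x1, y1) box containing all moving pixels, or None."""
    coords = [(x, y) for y, row in enumerate(mask)
              for x, moving in enumerate(row) if moving]
    if not coords:
        return None
    xs, ys = [c[0] for c in coords], [c[1] for c in coords]
    return min(xs), min(ys), max(xs), max(ys)

prev_frame = [[10] * 5 for _ in range(4)]
frame = [row[:] for row in prev_frame]
frame[1][2] = frame[2][2] = frame[2][3] = 200     # something moved here
box = bounding_box(motion_mask(prev_frame, frame))
still = bounding_box(motion_mask(prev_frame, prev_frame))
```

Only the pixels inside `box` need to be scanned, and a static scene (`still` is `None`) needs no scan at all.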


HONDA Humanoid Robot: ASIMO
 Using motion and depth cue
– Motion cue: from a gradient-based tracker
– Depth cue: from stereo camera
 Dramatically reduce search space
 Cascade face detector


Video-Based Detectors: Summary
 Pros:
– An easier problem than detection in still
– Use all available cues: motion, depth, voice,
etc. to reduce search space
 Cons:
– Need efficient and effective methods to
process the multimodal cues
– Data fusion
30/08/2006 IPCV 2006 Budapest 105
Agenda (Detection)
 Detecting faces in gray scale images
– Knowledge-based
– Feature-based
– Template-based
– Appearance-based
 Detecting faces in color images
 Detecting faces in video
 Performance evaluation
 Research direction and concluding remarks


Performance Evaluation
 Need to set the evaluation criteria/protocol
– Training set
– Test set
– What is a correct detection?
– Detection rate: false positive/negative
– Precision of face location
– Speed: training/test stage
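
One possible protocol for the questions above is sketched below. It assumes that a detection counts as correct when its intersection-over-union (IoU) with an unmatched ground-truth box reaches 0.5; this criterion, the box format `(x1, y1, x2, y2)` and the function names are assumptions made for illustration, since the slides leave the definition of a correct detection open.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def evaluate(detections, ground_truth, min_iou=0.5):
    """Return (detection rate, false positives, false negatives)."""
    matched = set()
    false_pos = 0
    for det in detections:
        best, best_i = 0.0, None
        for i, gt in enumerate(ground_truth):
            if i in matched:
                continue                      # each face may be matched only once
            score = iou(det, gt)
            if score > best:
                best, best_i = score, i
        if best_i is not None and best >= min_iou:
            matched.add(best_i)
        else:
            false_pos += 1
    rate = len(matched) / len(ground_truth) if ground_truth else 0.0
    return rate, false_pos, len(ground_truth) - len(matched)
```

Changing `min_iou` changes both the detection rate and the false positive count, which is why published results are only comparable when the criterion is stated.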


Training Sets
 Cropped and preprocessed data set with face and non-face images provided by MIT CBCL


Standard Test Sets
 MIT test set: subsumed by CMU test set
 CMU test set (de facto benchmark): 130 gray scale images with a total of 507 frontal faces
 CMU profile face test set: 208 images with faces in profile views
 Kodak data set (Eastman Kodak Corp): faces of multiple sizes and poses under varying lighting conditions in color images

CMU Test Set I: Upright Frontal Faces
 130 images with 507
frontal faces
 Collected by K.-K. Sung and H. Rowley
 Including 23 images used
in [Sung and Poggio 94]
 Some images have low resolution
 De facto benchmark


CMU Test Sets: Rotated and Profile
 50 images with 223 faces with in-plane rotation, collected by H. Rowley
 208 images with 441 profile faces, collected by H. Schneiderman


Agenda (Detection)
 Detecting faces in gray scale images
– Knowledge-based
– Feature-based
– Template-based
– Appearance-based
 Detecting faces in color images
 Detecting faces in video
 Performance evaluation
 Research direction and concluding remarks


Research Issues
 Representation: How to describe a typical face?
 Scale: How to deal with faces of different sizes?
 Search strategy: How to spot these faces?
 Speed: How to speed up the process?
 Precision: How to locate the faces precisely?
 Post processing: How to combine detection results?


Face Detection: A Solved Problem?
 Not quite yet…
 Factors:
– Shadows
– Occlusions
– Robustness
– Resolution
 Lots of potential
 Can be applied to
other domains


Outline of Recognition Part
 Objective
– Categorization
– Brief description of representative face
recognition techniques
– Open research issues

 This part of the presentation is based on the survey: W. Zhao, R. Chellappa, P. J. Phillips and A. Rosenfeld, “Face Recognition: A Literature Survey”, ACM Computing Surveys, vol. 35, no. 4, Dec. 2003


Recognition from Intensity Images
High-level Categorization
 1. Holistic matching methods
– Classification using whole face region
 2. Feature-based (structural) matching methods
– Structural classification using local features
 3. Hybrid Methods
– Using local features and whole face region


Holistic Approaches
 Principal Component Analysis (PCA)
– Eigenface
– Fisherface/Subspace LDA
 Other Representations

IPCV, Saint-Etienne, 2004, Prof. V. Bochko

What is PCA?
 Principal component analysis (PCA) is a
technique for extracting a smaller set of
variables with less redundancy from high-
dimensional data sets in order to retain as
much of the information as possible
 A small number of principal components often capture most of the structure in the data


IPCV, Saint-Etienne, 2004, Prof. V. Bochko

Standard PCA
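 Assume that an n-dimensional observed vector is $\mathbf{x} = (x_1, \ldots, x_n)^T$
 The vector is centered: $\mathbf{x} \leftarrow \mathbf{x} - E\{\mathbf{x}\}$
 The covariance matrix is $\mathbf{C} = E\{\mathbf{x}\mathbf{x}^T\}$, where $E\{\cdot\}$ is an expectation
 The eigendecomposition is $\mathbf{C} = \mathbf{E}\mathbf{D}\mathbf{E}^T$
 The matrix of eigenvectors is $\mathbf{E} = (\mathbf{e}_1, \ldots, \mathbf{e}_n)$
 The eigenvectors correspond to the eigenvalues $\lambda_1, \ldots, \lambda_n$
 The eigenvalues are $\mathbf{D} = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$
 The eigenvalues are in descending order: $\lambda_1 \ge \lambda_2 \ge \ldots \ge \lambda_n$
 The first k principal components are $\mathbf{y} = \mathbf{E}_k^T \mathbf{x}$, with $\mathbf{E}_k = (\mathbf{e}_1, \ldots, \mathbf{e}_k)$
 Finally, the reconstruction is $\hat{\mathbf{x}} = \mathbf{E}_k \mathbf{y}$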
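
The steps of standard PCA listed above can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions: the expectation is replaced by the sample mean over the rows of a data matrix, and the function names `pca`, `project` and `reconstruct` are hypothetical.

```python
import numpy as np

def pca(X, k):
    """PCA of X (one n-dimensional observation per row), keeping k components."""
    mean = X.mean(axis=0)
    Xc = X - mean                               # centering
    C = Xc.T @ Xc / X.shape[0]                  # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(C)        # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]           # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    return mean, eigvecs[:, :k], eigvals

def project(X, mean, E_k):
    """First k principal components of each row of X."""
    return (X - mean) @ E_k

def reconstruct(Y, mean, E_k):
    """Approximate the original observations from their principal components."""
    return Y @ E_k.T + mean
```

`np.linalg.eigh` is used because the covariance matrix is symmetric; the reconstruction error shrinks as k grows and vanishes once k reaches the rank of the centered data.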


Facial Recognition: PCA - Overview
 Create training set of faces and calculate the eigenfaces
 Project the new image onto the eigenfaces
 Check if image is close to “face space”
 Check closeness to one of the known faces
 Add unknown faces to the training set and re-calculate


Facial Recognition: PCA Training
 Find average of training images
 Subtract average face from each image
 Create covariance matrix
 Generate eigenfaces
 Each original image can be expressed as a linear combination of the eigenfaces (face space)

Facial Recognition: PCA Recognition
 A new image is projected into the “facespace”
 Create a vector of weights that describes this image
 The distances from the image to the face space and from its weight vector to those of the known faces are computed
 If within certain thresholds then it is a
recognized face
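
The training and recognition steps can be sketched together as below. This is an illustrative example, not the implementation used in the projects described here: the eigenfaces are obtained from an SVD of the centered images rather than from an explicit covariance matrix (the two are mathematically equivalent), and the function names, thresholds and returned strings are assumptions.

```python
import numpy as np

def train_eigenfaces(faces, k):
    """faces: (M, P) array with one flattened training image per row."""
    mean_face = faces.mean(axis=0)
    A = faces - mean_face                        # subtract the average face
    # Rows of Vt are the eigenvectors of the covariance matrix of the images.
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    eigenfaces = Vt[:k]                          # (k, P)
    weights = A @ eigenfaces.T                   # weight vector of each training face
    return mean_face, eigenfaces, weights

def recognize(image, mean_face, eigenfaces, weights, labels,
              face_thresh, id_thresh):
    centered = image - mean_face
    w = eigenfaces @ centered                    # project into face space
    recon = eigenfaces.T @ w
    if np.linalg.norm(centered - recon) > face_thresh:
        return "not a face"                      # too far from face space
    dists = np.linalg.norm(weights - w, axis=1)  # distance to every known face
    best = int(np.argmin(dists))
    if dists[best] > id_thresh:
        return "unknown face"
    return labels[best]
```

Both thresholds have to be tuned on held-out data; an image classified as an unknown face is the kind that can be added to the training set before the eigenfaces are recalculated.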


Feature Based Approaches
 Elastic bunch graph matching using Gabor wavelets


Detecting Feature Points
 Using Gabor wavelets
 Create a filter bank with 5 frequencies and 8 orientations
[Figure: Gabor wavelet, spatial view and top view]

Gabor filters
 The responses of the filters
[Figure: original image and filtered result]
 2D Gabor filters can be used to detect dimensions along a specific orientation
 Jet: the set of 40 complex coefficients (5 frequencies × 8 orientations) at a reference point
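
A jet can be sketched as the responses of a bank of 5 × 8 complex Gabor kernels at one pixel. The kernel parameterization below (wave numbers k = 2^(-(ν+2)/2)·π, orientations μ·π/8, σ = 2π, DC-free carrier) follows the form commonly used with elastic bunch graph matching and is an assumption, as are the 33 × 33 window and the function names.

```python
import numpy as np

def gabor_kernel(freq_index, orient_index, size=33, sigma=2 * np.pi):
    """Complex Gabor kernel for frequency index 0..4 and orientation index 0..7."""
    k = np.pi * 2.0 ** (-(freq_index + 2) / 2.0)     # wave number
    phi = orient_index * np.pi / 8.0                 # orientation angle
    kx, ky = k * np.cos(phi), k * np.sin(phi)
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    envelope = (k ** 2 / sigma ** 2) * np.exp(-(k ** 2) * (x ** 2 + y ** 2) / (2 * sigma ** 2))
    carrier = np.exp(1j * (kx * x + ky * y)) - np.exp(-sigma ** 2 / 2)  # DC-free
    return envelope * carrier

def jet(image, row, col, size=33):
    """40 complex filter responses at pixel (row, col)."""
    half = size // 2
    patch = image[row - half:row + half + 1, col - half:col + half + 1]
    if patch.shape != (size, size):
        raise ValueError("reference point too close to the image border")
    # The kernel flipped about its centre equals its complex conjugate,
    # so this sum is the convolution of the image with the kernel at that pixel.
    return np.array([np.sum(patch * np.conj(gabor_kernel(nu, mu, size)))
                     for nu in range(5) for mu in range(8)])
```

The magnitude of each coefficient indicates how strongly the local image structure matches that frequency and orientation, while the phase changes quickly with small shifts of the reference point.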


Face Graph
 Facial fiducial points
– Pupil, tip of mouth, etc.
 Face graph
– Nodes at fiducial pts.
– Un-directed graph
– The structure of graph is the same for each face
– Fitting a face image to a face graph is done automatically
– Some nodes may be undefined due to occlusion
 Bunch
– A set of jets assoc. with the same fiducial point
– e.g. an eye bunch may consist of jets from different types of eyes: open, closed, etc.
 Face bunch graph (FBG):
– Same as a face graph, except each node consists of a jet bunch rather than a jet


Face Bunch Graph
 FBG has the same structure as
individual face graph
– Each node labeled with a
bunch of jets
– Each edge labeled with
average distance between
corresponding nodes in face
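
The structure described above can be sketched as two dictionaries: one bunch of jets per node and one average distance per edge. The data layout, the function names and the magnitude-based jet similarity (normalized dot product of coefficient magnitudes) are assumptions chosen for illustration; the slides do not specify how jets are compared.

```python
import numpy as np

def jet_similarity(j1, j2):
    """Normalized dot product of the coefficient magnitudes of two jets."""
    a1, a2 = np.abs(j1), np.abs(j2)
    return float(a1 @ a2 / np.sqrt((a1 @ a1) * (a2 @ a2)))

def build_bunch_graph(face_graphs, edges):
    """face_graphs: list of {node: (position array (2,), jet array (40,))}."""
    nodes = list(face_graphs[0].keys())
    # Each node is labeled with the bunch of jets found at that fiducial point.
    bunches = {n: np.stack([g[n][1] for g in face_graphs]) for n in nodes}
    # Each edge is labeled with the average distance between its two nodes.
    edge_lengths = {
        (a, b): float(np.mean([np.linalg.norm(g[a][0] - g[b][0]) for g in face_graphs]))
        for a, b in edges
    }
    return bunches, edge_lengths

def best_match_in_bunch(jet, bunch):
    """Index and similarity of the jet in the bunch that best matches `jet`."""
    sims = [jet_similarity(jet, other) for other in bunch]
    i = int(np.argmax(sims))
    return i, sims[i]
```

Taking the best match per node lets a new face borrow, for example, the eye of one model and the mouth of another, which is the motivation for storing a bunch instead of a single jet.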


Still Face Recognition: Open Issues
 Addressing the issue of recognition being too
sensitive to inaccurate facial feature localization
 Robustly recognizing faces
– Small and/or noisy images
– Images acquired years apart
– Outdoor acquisition: lighting, pose
 What is the limit on the number of faces that can
be distinguished?
 What is the principled and optimal way to arbitrate and combine local features and global features?


Web Resources (detection)
 Face detection home page
 Henry Rowley’s home page
 Henry Schneiderman’s home page
 MIT CBCL web page
 Face detection resources


Web Resources (recognition)
Face Recognition Home Page
Face Databases
 UT Dallas
 Notre Dame database
 MIT database
 Edelman
 CMU PIE
 Stirling database
 M2VTS multimodal
 Yale database
 Yale database B
 Harvard database
 Weizmann database
 UMIST database
 Purdue
 Olivetti database
 Oulu physics-based


References (detection)
 M.-H. Yang, D. J. Kriegman, and N. Ahuja, “Detecting Faces in Images: A Survey”, IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), vol. 24, no. 1, pp. 34-58, 2002.
 M.-H. Yang and N. Ahuja, Face Detection and Hand Gesture Recognition for Human Computer Interaction, Kluwer Academic Publishers, 2001.
 P. Viola and M. J. Jones, Robust Real-time Object Detection, Cambridge Research Laboratory Technical Report Series, CRL 2001/01, Feb. 2001.


Face and Some CV-based
Projects at Budapest Tech


Combined verification

[Figure labels: Voice recording, Voice sample, LP spectrum, Neural network, Face mask, Original, Filter response, Features]

Morphing by Automatic Feature


Eye and Lip Tracking
 Real time face detection
 Eye and mouth motion tracking
 Client-server internet communication


 Face detection
 Skin tone based
 Feature tracking
 Motion detection
 Recognition options:
– Eigenface method
– Gabor wavelet based bunch graph


Mobile robots – navigation using CV


Recognition and navigation
