
Face Detection and Recognition
Zoltán VÁMOSSY

Budapest Tech, John von Neumann Faculty of Informatics

30/08/2006 IPCV 2006 Budapest 1


Face tasks in CV
 Face detection: Are there any faces in the image, and if so, where?
– concerned with a category of objects
 Face localization:
– Aim to determine the image position of a single face
– A simplified detection problem with the assumption that
an input image contains only one face
 Facial feature extraction:
– To detect the presence and location of features such as
eyes, nose, nostrils, eyebrow, mouth, lips, ears, etc
– Usually assume that there is only one face in an image
 Face recognition (identification): concerned with individual
identity
 Facial expression recognition
 Human pose estimation and tracking
30/08/2006 IPCV 2006 Budapest 2
Typical Biometric Authentication
Workflow
 Enrollment subsystem: biometric reader → feature extractor → template
(e.g., 1010010…) stored in the template database
 Authentication subsystem: biometric reader → feature extractor →
matcher compares the fresh template against the stored one → Match or No Match
30/08/2006 IPCV 2006 Budapest 3


Identification vs. Verification
 Identification (1:N): biometric reader → biometric matcher searches the
whole database → "This person is Emily Dawson"
 Verification (1:1): claimed identity ("I am Emily Dawson") and ID →
biometric reader → biometric matcher checks against the stored template →
Match / No Match
30/08/2006 IPCV 2006 Budapest 4
Face Recognition Applications

 Automatic border control (Sydney airport)
 Samsung’s Magicgate: door lock control
30/08/2006 IPCV 2006 Budapest 5
A Solved Problem?
 Excellent results and
products:
– Fast, multi-pose,
handles partial occlusion
 Is face recognition
and detection a solved
problem?
 Not quite

Omron’s face detector


30/08/2006 IPCV 2006 Budapest [Liu et al. 04] 6
Outline
 Relevant psychophysics/neuroscience issues
 Face Detection
– Detection on gray scale images
– Detection on color images
– Research directions
 Face Recognition
– Still image face recognition
 Face projects at Budapest Tech

30/08/2006 IPCV 2006 Budapest 7


Psychophysics Issues
 Is face perception the result of holistic/configural or local
feature analysis? Is it a dedicated process?
– Determines whether to build a special-purpose or a generic system
 Ranking of significance of facial features
– Different weights for features
 Movement and face recognition
– Favors video-based recognition
 Role of race/gender/familiarity
– Suggests an adaptive system

30/08/2006 IPCV 2006 Budapest 8


Inverted Faces

Prof. Eric H. Chudler Dept. of Anesthesiology University of Washington


30/08/2006 IPCV 2006 Budapest 9
Inverted Faces

Prof. Eric H. Chudler Dept. of Anesthesiology University of Washington


30/08/2006 IPCV 2006 Budapest 10
More Inverted Faces

Prof. Eric H. Chudler Dept. of Anesthesiology University of Washington


30/08/2006 IPCV 2006 Budapest 11
Negated Faces

30/08/2006 IPCV 2006 Budapest 12


Human Face Recognition Limitations

 Degraded images (Harmon and Julesz, 1973)
 Humans can handle low resolution [Sinha05]
30/08/2006 IPCV 2006 Budapest 13
Outline of Detection Part
 Objective
– Survey major face detection works
– Address “how” and “why” questions
– Pros and cons of detection methods
– Research directions

 This part of the presentation is based on the survey by M.-H. Yang:
http://vision.ai.uiuc.edu/mhyang/face-detection-survey.html

30/08/2006 IPCV 2006 Budapest 14


Face Detection
 Identify and locate
human faces in an
image regardless
of their
– position
– scale
– in-plane rotation
– orientation
– pose (out-of-plane rotation)
– and illumination
 Face is a highly non-rigid object
Where are the faces, if any?

30/08/2006 IPCV 2006 Budapest 15


Why Is Face Detection Important?
 First step for any fully automatic face
recognition system
 First step in many surveillance systems
 Lots of applications
 A step towards Automatic Target
Recognition (ATR) or generic object
detection/recognition

30/08/2006 IPCV 2006 Budapest 16


Why Is Face Detection Difficult?
 Pose (Out-of-Plane Rotation): frontal, 45
degree, profile, upside down
 Presence or absence of structural
components: beard, mustache and
glasses
 Facial expression: face appearance is
directly affected by a person's facial
expression
 Occlusion: faces may be partially occluded
by other objects
 Orientation (In-Plane Rotation): face
appearance varies directly with rotation
about the camera's optical axis
 Imaging conditions: lighting (spectra,
source distribution and intensity) and
camera characteristics (sensor response,
gain control, lenses), resolution

30/08/2006 IPCV 2006 Budapest 17


In One Thumbnail Face Image
 Consider a thumbnail 19 × 19 face pattern
 256^361 possible combinations of gray values
– 256^361 = 2^(8×361) = 2^2888
 Total world population is about 6,600,000,000 ≈ 2^32
 The number of possible patterns exceeds the world population by a
factor of roughly 2^2856
 Extremely high dimensional space!

30/08/2006 IPCV 2006 Budapest 18


What is a Correct Detection?

 Different interpretations of “correct detection”

 Precision of face location
 These affect the reported results: detection rate, false
positive rate, false negative rate
30/08/2006 IPCV 2006 Budapest 19
Receiver Operator Characteristic Curve
 Useful for detailed performance
assessment
 Plot true positive (TP) proportion against
the false positive (FP) proportion for
various possible settings
 False positive: Predict a face when there is
actually none
 False negative: Predict a non-face where
there is actually one
30/08/2006 IPCV 2006 Budapest 20
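
As an illustration of how such ROC points are obtained, the Python sketch below sweeps a decision threshold over detector scores; the scores, labels and thresholds are invented for the example and are not taken from any detector discussed in these slides.

import numpy as np

def roc_points(scores, labels, thresholds):
    """Compute (FP rate, TP rate) pairs for a set of decision thresholds.

    scores : detector confidence per candidate window (higher = more face-like)
    labels : 1 for a true face window, 0 for a non-face window
    """
    scores, labels = np.asarray(scores), np.asarray(labels)
    points = []
    for t in thresholds:
        predicted_face = scores >= t
        tp = np.sum(predicted_face & (labels == 1))   # faces correctly detected
        fp = np.sum(predicted_face & (labels == 0))   # non-faces reported as faces
        tpr = tp / max(np.sum(labels == 1), 1)        # true positive proportion
        fpr = fp / max(np.sum(labels == 0), 1)        # false positive proportion
        points.append((fpr, tpr))
    return points

# Hypothetical scores and ground-truth labels for a handful of windows
print(roc_points([0.9, 0.8, 0.4, 0.35, 0.1], [1, 1, 0, 1, 0],
                 thresholds=np.linspace(0, 1, 5)))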
Receiver Operator Characteristic (ROC)
 Correct
identification of
nearly all objects
will cause a large
percentage of
unknown objects to
be incorrectly
classified into some
known class

30/08/2006 IPCV 2006 Budapest 21


Face Detector: Ingredients
 Target application domain: single image, video
 Representation: holistic, feature-based, etc.
 Pre-processing: histogram equalization, etc.
 Cues: color, motion, depth, voice, etc.
 Search strategy: exhaustive, greedy, focus of
attention, etc.
 Classifier design: ensemble, cascade
 Post-processing: interpret detection results

30/08/2006 IPCV 2006 Budapest 22


In our Focus
 Focus on detecting
– upright, frontal
faces
– in a single gray-
scale image
– with decent
resolution
– under good lighting
conditions

30/08/2006 IPCV 2006 Budapest 23


Methods to Detect/Locate Faces
 Knowledge-based methods:
– Encode human knowledge of what constitutes a typical
face (usually, the relationships between facial features)
 Feature invariant approaches:
– Aim to find structural features of a face that exist even
when the pose, viewpoint, or lighting conditions vary
 Template matching methods:
– Several standard patterns stored to describe the face
as a whole or the facial features separately
 Appearance based methods:
– The models (or templates) are learned from a set of
training images which capture the representative
variability of facial appearance

30/08/2006 IPCV 2006 Budapest 24


Agenda (Detection)
 Detecting faces in gray scale images
– Knowledge-based
– Feature-based
– Template-based
– Appearance-based
 Detecting faces in color images
 Detecting faces in video
 Performance evaluation
 Research direction and concluding remarks

30/08/2006 IPCV 2006 Budapest 25


Knowledge-Based Methods
 Top-down approach: Represent a face using a
set of human-coded rules
 Examples:
– The center part of face has uniform intensity
values
– The difference between the average intensity
values of the center part and the upper part is
significant
– A face often appears with two eyes that are
symmetric to each other, a nose and a mouth
 Use these rules to guide the search process

30/08/2006 IPCV 2006 Budapest 26


Knowledge-Based Method: [Yang and Huang 94]
 Multi-resolution focus-of-
attention approach
 Level 1 (lowest
resolution): apply the rule
“the center part of the
face has 4 cells with a
basically uniform
intensity” to search for
candidates
 Level 2: local histogram
equalization followed by
edge detection
 Level 3: search for eye
and mouth features for
validation
30/08/2006 IPCV 2006 Budapest 27
Knowledge-Based Method:
[Kotropoulos & Pitas 94]

 Horizontal/vertical projection to search for candidates

 Search eyebrow/eyes, nostrils/nose for validation


 Difficult to detect multiple faces or faces in a complex
background

30/08/2006 IPCV 2006 Budapest 28


Knowledge-based Methods: Summary
 Pros:
– Easy to find simple rules to describe the features of a face
and their relationships
– Based on the coded rules, facial features in an input image
are extracted first and face candidates are identified
– Work well for face localization in uncluttered background
 Cons:
– Difficult to translate human knowledge into rules precisely:
detailed rules fail to detect faces and general rules may
find many false positives
– Difficult to extend this approach to detect faces in different
poses: implausible to enumerate all the possible cases

30/08/2006 IPCV 2006 Budapest 29


Agenda (Detection)
 Detecting faces in gray scale images
– Knowledge-based
– Feature-based
– Template-based
– Appearance-based
 Detecting faces in color images
 Detecting faces in video
 Performance evaluation
 Research direction and concluding remarks

30/08/2006 IPCV 2006 Budapest 30


Feature-based Methods
 Bottom-up approach: Detect facial features
(eyes, nose, mouth, etc) first
– How can we detect facial features: edge,
intensity, shape, texture, color, etc
– Aim to detect invariant features
 Group features into candidates and verify
them

30/08/2006 IPCV 2006 Budapest 31


Feature-based Example [Yang 2003]
 Use the Canny operator to find an edge
map
 Separate face from background by
grouping edges
 Best fit an ellipse to the remaining edges

30/08/2006 IPCV 2006 Budapest 32


Feature Grouping [Yow and Cipolla 90]
 Apply a 2nd derivative Gaussian filter to
search for interest points
 Group the edges near interest points into
regions
 Each feature and grouping is evaluated
within a Bayesian network
 Handle a few poses

30/08/2006 IPCV 2006 Budapest 33


Feature Grouping [Yow and Cipolla 90]
 Face model and components
 Model each facial feature as a pair of edges
 Apply an interest point operator and an edge detector to search for
features
 Use a Bayesian network to combine evidence

30/08/2006 IPCV 2006 Budapest 34


Random Graph Matching [Leung et al. 95]
 Formulate as a problem to find the
correct geometric arrangement of
facial features
– Facial features are defined by
the average responses of multi-
orientation, multi-scale
Gaussian derivative filters
– Learn the configuration of
features with Gaussian
distribution of mutual distance
between facial features
 Random graph matching among
the candidates to locate faces
30/08/2006 IPCV 2006 Budapest 35
Feature-Based Methods: Summary
 Pros:
– Features are invariant to pose and orientation
change
 Cons:
– Difficult to locate facial features due to corruption
(illumination, noise, occlusion)
– Difficult to detect features in complex
background

30/08/2006 IPCV 2006 Budapest 36


Agenda (Detection)
 Detecting faces in gray scale images
– Knowledge-based
– Feature-based
– Template-based
– Appearance-based
 Detecting faces in color images
 Detecting faces in video
 Performance evaluation
 Research direction and concluding remarks

30/08/2006 IPCV 2006 Budapest 37


Template Matching Methods
 Store a template
– Predefined: based on edges or regions
– Deformable: based on facial contours
(e.g., Snakes)
 Templates are hand-coded (not learned)
 Use correlation to locate faces

30/08/2006 IPCV 2006 Budapest 38


Face Template
 Use relative pair-wise
ratios of the brightness of
facial regions (14 × 16
pixels): the eyes are
usually darker than the
surrounding face [Sinha 94]
 Use average area
intensity values rather than
absolute pixel values
 See also Point Distribution
Model (PDM) [Lanitis et al. 95]

30/08/2006 IPCV 2006 Budapest 39


Template-Based Methods: Summary
 Pros:
– Relatively simple
 Cons:
– Templates need to be initialized near the
face images
– Difficult to enumerate templates for
different poses (similar to knowledge-
based methods)
30/08/2006 IPCV 2006 Budapest 40
Agenda (Detection)
 Detecting faces in gray scale images
– Knowledge-based
– Feature-based
– Template-based
– Appearance-based
 Detecting faces in color images
 Detecting faces in video
 Performance evaluation
 Research direction and concluding remarks

30/08/2006 IPCV 2006 Budapest 41


Appearance-Based Methods
Idea: train a classifier using positive (and
usually negative) examples of faces
 Representation
 Pre-processing
 Train a classifier
 Search strategy
 Post-processing

30/08/2006 IPCV 2006 Budapest 42


Appearance-Based Methods: Classifiers
 Neural network: Multilayer Perceptrons
 Principal Component Analysis (PCA), Factor Analysis
 Support vector machine (SVM)
 Mixture of PCA, Mixture of factor analyzers
 Distribution-based method
 Naive Bayes classifier
 Hidden Markov model
 Sparse network of winnows (SNoW)
 Kullback relative information
 Inductive learning: C4.5
 AdaBoost (Jones and Viola)
 …

30/08/2006 IPCV 2006 Budapest 43


Representation
 Holistic: Each image is raster scanned and
represented by a vector of intensity values
 Block-based: Decompose each face image into a
set of overlapping or non-overlapping blocks
– At multiple scales
– Further processed with vector quantization,
Principal Component Analysis, etc.

30/08/2006 IPCV 2006 Budapest 44


Face and Non-Face Examples
 Positive examples:
– Get as much variation as possible
– Manually crop and normalize each face image into a
standard size face (e.g., 19 × 19 pixels)
– Creating virtual examples [Sung and Poggio 94]

 Negative examples:
– Fuzzy idea
– Any images that
do not contain faces
– A large image subspace

30/08/2006 IPCV 2006 Budapest 45


Distribution-Based Method [Sung & Poggio 94]
 Masking: reduce the
unwanted noise in a face
pattern
 Illumination gradient
correction: fit the best
brightness plane and subtract
it to reduce heavy shadows
caused by extreme lighting
angles
 Histogram equalization:
compensates for imaging
effects due to changes in
illumination and different
camera input gains

30/08/2006 IPCV 2006 Budapest 46
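
A minimal Python sketch of the last two correction steps above, assuming a small grayscale face window stored as a NumPy array; the least-squares plane fit and the simple histogram mapping are one possible implementation, not necessarily the one used by Sung and Poggio.

import numpy as np

def illumination_gradient_correction(window):
    """Fit a brightness plane a*x + b*y + c to the window and subtract it."""
    h, w = window.shape
    ys, xs = np.mgrid[0:h, 0:w]
    A = np.column_stack([xs.ravel(), ys.ravel(), np.ones(h * w)])
    coeffs, *_ = np.linalg.lstsq(A, window.ravel().astype(float), rcond=None)
    plane = (A @ coeffs).reshape(h, w)
    return window - plane            # heavy shadows / lighting gradients largely removed

def histogram_equalization(window, levels=256):
    """Map gray values so their cumulative distribution becomes roughly uniform."""
    vals = np.clip(window - window.min(), 0, None).astype(int)
    hist = np.bincount(vals.ravel(), minlength=vals.max() + 1)
    cdf = np.cumsum(hist) / vals.size
    return (cdf[vals] * (levels - 1)).astype(np.uint8)

# Hypothetical 19x19 face window with a left-to-right lighting gradient
window = np.tile(np.linspace(50, 200, 19), (19, 1))
corrected = histogram_equalization(illumination_gradient_correction(window))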


Creating Virtual Positive Examples
 Simple and very
effective method
 Randomly mirror,
rotate, translate
and scale face
samples by small
amounts
 Less sensitive to
alignment error
30/08/2006 IPCV 2006 Budapest 47
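
A sketch of this augmentation step in Python (scipy assumed available); the mirror/rotation/translation/scaling ranges are illustrative values, not the ones used in the original work.

import numpy as np
from scipy import ndimage

def virtual_examples(face, n=10, rng=np.random.default_rng(0)):
    """Create virtual positive examples by small random perturbations."""
    out = []
    for _ in range(n):
        img = face[:, ::-1] if rng.random() < 0.5 else face              # random mirror
        img = ndimage.rotate(img, rng.uniform(-10, 10), reshape=False)   # small rotation
        img = ndimage.shift(img, rng.uniform(-1, 1, size=2))             # sub-pixel translation
        img = ndimage.zoom(img, rng.uniform(0.95, 1.05))                 # small scaling
        out.append(img)          # in practice, crop/resize back to the canonical size
    return out

# Hypothetical 19x19 face patch
samples = virtual_examples(np.random.rand(19, 19).astype(np.float32))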
Search over Space and Scale

 Scan the input image at one-pixel increments horizontally and vertically
 Downsample the input image by a factor of 1.2 and continue to search
30/08/2006 IPCV 2006 Budapest 48
Continue to Search over Space and Scale

 Continue to downsample the input image


and search until the image size is too small
30/08/2006 IPCV 2006 Budapest 49
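
A Python sketch of this exhaustive search over space and scale; classify_window is a placeholder for whatever face/non-face classifier is plugged in and is an assumption of the sketch, not part of the original slides.

import numpy as np
from scipy import ndimage

def search_space_and_scale(image, classify_window, win=19, scale_step=1.2):
    """Slide a win x win window over an image pyramid and collect detections."""
    detections, scale = [], 1.0
    while min(image.shape) >= win:
        for y in range(image.shape[0] - win + 1):          # one-pixel vertical steps
            for x in range(image.shape[1] - win + 1):      # one-pixel horizontal steps
                if classify_window(image[y:y + win, x:x + win]):
                    # report the window in original-image coordinates
                    detections.append((int(x * scale), int(y * scale), int(win * scale)))
        image = ndimage.zoom(image, 1 / scale_step)         # downsample by 1.2
        scale *= scale_step
    return detections

# Toy usage with a hypothetical classifier that never fires
dets = search_space_and_scale(np.zeros((60, 80)), lambda w: False)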
Detecting Faces in Different Pose
 Probabilistic Modeling of
Local Appearance
 Extend to detect faces in
different pose with
multiple detectors
 Each detector specializes
to a view: frontal, left pose
and right pose

[Schneiderman and Kanade 98]

30/08/2006 IPCV 2006 Budapest 50


Viola & Jones: Robust Real-time Face Detection

Viola and Jones


 Goals
 Motivation
 Features (rectangle features & integral
image)
 Learning classifier functions (AdaBoost)
 Cascade of classifiers
 Results

30/08/2006 IPCV 2006 Budapest 51


Viola & Jones: Robust Real-time Face Detection

Goals of Viola and Jones


 Robust real-time face detection in images
 Decide for each sub-window (starting at
24x24) of the image whether it is a face

30/08/2006 IPCV 2006 Budapest 52


Viola & Jones: Robust Real-time Face Detection

Motivation
 Rapid face detection in images and image
sequences
 Previous approaches too slow
– They use auxiliary information
(image differences, colors)
 Only grey scale images needed

30/08/2006 IPCV 2006 Budapest 53


Viola & Jones: Robust Real-time Face Detection

The Framework of the Method


 Haar-like Feature: A feature representation
 Integral Image: A new image representation
and a fast Haar-like feature calculation
method
 Perceptron: Classifier, uses the features
 Boost Algorithm: An efficient classifier
generating algorithm
 Cascade: Method for combining classifiers

30/08/2006 IPCV 2006 Budapest 54


Viola & Jones: Robust Real-time Face Detection

Features
 Detection based on features, not pixels

 Why?
– Features can encode domain knowledge
that is difficult to learn
– Faster than a pixel-based system
– Scale independent

30/08/2006 IPCV 2006 Budapest 55


Viola & Jones: Robust Real-time Face Detection

Rectangle Features

 (Sum of pixels in black rectangles) - (sum of


pixels in white rectangles)
 Features are very simple ⇒ rapid
computation
 How to compute them rapidly?
30/08/2006 IPCV 2006 Budapest 56
Viola & Jones: Robust Real-time Face Detection

Integral Image
 Rectangle Features
can be computed
rapidly using an
intermediate
representation of the
image: integral image
 Value at point (x,y) is
the sum of all the pixel
values above and to
the left
30/08/2006 IPCV 2006 Budapest 57
Viola & Jones: Robust Real-time Face Detection

Integral Image
 The integral image can be computed with
the following recurrences:
s(x, y) = s(x, y − 1) + i(x, y)
ii(x, y) = ii(x − 1, y) + s(x, y)
 where s(x, y) is the cumulative column sum,
with s(x, −1) = 0 and ii(−1, y) = 0
 One pass over the image

30/08/2006 IPCV 2006 Budapest 58


Viola & Jones: Robust Real-time Face Detection

Computation of Rectangular Sums

Sum of rectangle D: ii(4) – ii(3) – ii(2) + ii(1)


30/08/2006 IPCV 2006 Budapest 59
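
A NumPy sketch of the integral image and the four-reference rectangle sum described above; the zero padding, the example two-rectangle feature and the toy 24x24 values are choices of the sketch.

import numpy as np

def integral_image(img):
    """ii[y, x] = sum of img[:y, :x]; padded with a zero row/column so that
    corner look-ups need no boundary checks."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = np.cumsum(np.cumsum(img, axis=0), axis=1)   # one pass, two running sums
    return ii

def rect_sum(ii, y, x, h, w):
    """Sum of pixels in the h x w rectangle with top-left corner (y, x):
    four array references, independent of the rectangle size."""
    return ii[y + h, x + w] - ii[y, x + w] - ii[y + h, x] + ii[y, x]

def two_rect_feature(ii, y, x, h, w):
    """Example two-rectangle (vertical edge) feature: left half minus right half."""
    return rect_sum(ii, y, x, h, w // 2) - rect_sum(ii, y, x + w // 2, h, w // 2)

img = np.arange(24 * 24).reshape(24, 24)          # toy 24x24 sub-window
ii = integral_image(img)
print(rect_sum(ii, 2, 3, 5, 4), two_rect_feature(ii, 0, 0, 24, 24))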
Viola & Jones: Robust Real-time Face Detection

Integral Image
 Rectangular sum can be computed in four
array references
 Two-rectangle features need six array
references
 Three-rectangle features need eight array
references
 Four-rectangle features need nine array
references
30/08/2006 IPCV 2006 Budapest 60
Viola & Jones: Robust Real-time Face Detection

Feature properties
 Rough, but sensitive to simple image
structures
 Only 2 orientations
 Extremely efficient
 Efficiency compensates for these disadvantages
 Face detection for an entire image at every
scale in real-time

30/08/2006 IPCV 2006 Budapest 61


Viola & Jones: Robust Real-time Face Detection

Learning Classification Functions


 Given: feature set and training set of positive +
negative face images
– In principle any machine learning approach
can be used (SVM, neural network, …)
– But: 45 396 rectangle features for each sub-
window
 Experiments showed that only a few features
needed for an efficient classifier
 How to find them?

30/08/2006 IPCV 2006 Budapest 62


Viola & Jones: Robust Real-time Face Detection

AdaBoost-Based Detector [Viola and Jones 01]


 Main idea:
– Feature selection: select important features
– Focus of attention: focus on potential regions
– Use an integral image for fast feature evaluation
 Use AdaBoost to learn
– A small set of important features (feature selection)
 sort them in the order of importance
 each feature can be used as a simple (weak) classifier
– A cascade of classifiers that
 combine all the weak classifiers to do a difficult task
 filter out the regions that most likely do not contain
faces

30/08/2006 IPCV 2006 Budapest 63


Viola & Jones: Robust Real-time Face Detection

Feature Selection [Viola and Jones 01]


 Training: If x is a face, then x
– most likely has feature 1 (easiest feature, and of greatest importance)
– very likely has feature 2 (easy feature)
– …
– likely has feature n (more complex feature, and of less importance
since it does not exist in all the faces in the training set)
 Testing: Given a test sub-image x’
– if x’ has feature 1:
 test whether x’ has feature 2
– if so, test whether x’ has feature 3, …, down to feature n
– else, it is not a face
 else, it is not a face
 Similar to a decision tree

30/08/2006 IPCV 2006 Budapest 64


Viola & Jones: Robust Real-time Face Detection

AdaBoost
 Variant of AdaBoost is used both to select
features and train the classifier
 AdaBoost is used to boost classification
performance of a simple (weak) learning
algorithm
 It combines a collection of weak classifiers to form
a stronger one
 The weak learning algorithm is designed to select
the single rectangle feature which best separates
the positive and negative examples

30/08/2006 IPCV 2006 Budapest 65


Viola & Jones: Robust Real-time Face Detection

AdaBoost
 Searches over the set of possible weak classifiers
and selects those with the lowest classification
error
 Greedy feature selection process
 Efficient for selecting a small number of “good”
features (fj(x)) with significant variety
 Weak classifier: hj(x) = 1 if pj fj(x) < pj θj, else 0
(fj: feature value, θj: threshold, pj: parity)

 Error rates are between 0.1 and 0.3 in early rounds, and
between 0.4 and 0.5 in later rounds

30/08/2006 IPCV 2006 Budapest 66


Viola & Jones: Robust Real-time Face Detection

AdaBoost in Detail
 Given example images (x1, y1), …, (xn, yn), where yi = 0, 1 for
negative and positive examples
 Initialize weights w1,i = 1/(2m) for yi = 0 and w1,i = 1/(2l) for yi = 1,
where m and l are the numbers of negatives and positives
 For t = 1,…,T:
1. Normalize the weights: wt,i ← wt,i / Σj wt,j
2. For each feature j train a classifier hj which is restricted to
using a single feature; its error is εj = Σi wt,i |hj(xi) − yi|
3. Choose the classifier ht with the lowest error εt
4. Update weights: wt+1,i = wt,i βt^(1−ei), with βt = εt / (1 − εt),
where ei = 0 if xi is correctly classified, else ei = 1
 Final strong classifier: h(x) = 1 if Σt αt ht(x) ≥ ½ Σt αt, else 0,
where αt = log(1/βt)

30/08/2006 IPCV 2006 Budapest 67
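
A compact Python sketch of this boosting loop using one-feature (stump) weak classifiers over a precomputed feature matrix; it follows the update rules above, but the tiny feature matrix and labels are invented for illustration.

import numpy as np

def train_adaboost(F, y, T):
    """F: (n_samples, n_features) feature values; y: 0/1 labels; T: rounds.
    Returns a list of (feature index, threshold, parity, alpha)."""
    m, l = np.sum(y == 0), np.sum(y == 1)
    w = np.where(y == 0, 1.0 / (2 * m), 1.0 / (2 * l))       # initial weights
    strong = []
    for _ in range(T):
        w = w / w.sum()                                      # 1. normalize
        best = (None, None, None, np.inf)                    # (j, theta, parity, error)
        for j in range(F.shape[1]):                          # 2. one stump per feature
            for theta in np.unique(F[:, j]):
                for parity in (+1, -1):
                    h = (parity * F[:, j] < parity * theta).astype(int)
                    err = np.sum(w * np.abs(h - y))
                    if err < best[3]:
                        best = (j, theta, parity, err)
        j, theta, parity, err = best                         # 3. lowest-error classifier
        beta = max(err, 1e-10) / max(1 - err, 1e-10)
        h = (parity * F[:, j] < parity * theta).astype(int)
        w = w * beta ** (1 - np.abs(h - y))                  # 4. reweight examples
        strong.append((j, theta, parity, np.log(1.0 / beta)))
    return strong

def strong_classify(strong, f):
    """Final classifier: weighted vote of the selected weak classifiers."""
    s = sum(a * (p * f[j] < p * t) for j, t, p, a in strong)
    return int(s >= 0.5 * sum(a for *_, a in strong))

# Invented toy data: 6 sub-windows, 3 rectangle-feature values each
F = np.array([[5, 1, 2], [6, 1, 3], [7, 0, 2], [1, 8, 9], [0, 7, 8], [2, 9, 7]])
y = np.array([1, 1, 1, 0, 0, 0])
model = train_adaboost(F, y, T=3)
print([strong_classify(model, f) for f in F])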


Viola & Jones: Robust Real-time Face Detection

Feature Selection by AdaBoost

 First and second features selected by AdaBoost
 The two-feature classifier costs about 60 microprocessor
instructions

30/08/2006 IPCV 2006 Budapest 68


Viola & Jones: Robust Real-time Face Detection

AdaBoost – Learning Results


PIII 700 MHz, 2001
 200-feature-classifier
 Detection rate: 95%
 False positive rate: 1/14084
 Takes 0.7 seconds to scan a 384x288 image
 Adding features to the classifier increases
computation time

30/08/2006 IPCV 2006 Budapest 69


Viola & Jones: Robust Real-time Face Detection

Cascade of Classifiers
 0.7 s for one image ⇒ too slow
 Idea: a cascade of classifiers to reduce
computation time
 Simple classifiers reject the majority of negative
sub-windows (many sub-windows contain no faces)
[Cascade diagram] All sub-windows → classifier 1 → classifier 2 →
classifier 3 → … → further processing; a sub-window rejected (F) at any
stage is discarded, only sub-windows passing (T) every stage continue
30/08/2006 IPCV 2006 Budapest 70
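
A Python sketch of how such a cascade is evaluated on one sub-window: each stage is a boosted classifier with its own threshold, and a window must pass every stage to be reported; the two stage functions and thresholds here are placeholders, not trained values.

def cascade_classify(window, stages):
    """stages: list of (score_fn, threshold). Reject as soon as one stage fails,
    so most negative sub-windows cost only the first, cheapest stages."""
    for score_fn, threshold in stages:
        if score_fn(window) < threshold:
            return False            # rejected sub-window, no further processing
    return True                     # passed every stage: report as a face

# Hypothetical two-stage cascade: a cheap 2-feature stage, then a larger stage
stages = [
    (lambda w: sum(w[:2]), 0.5),    # stand-ins for boosted stage scores
    (lambda w: sum(w), 1.5),
]
print(cascade_classify([0.4, 0.3, 0.9], stages))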
Viola & Jones: Robust Real-time Face Detection

Cascade of Classifiers - Experiment


 Monolithic 200-feature classifier vs. a cascade of
ten 20-feature classifiers
 Trained with 5000 faces and 10000 non-face
sub-windows

30/08/2006 IPCV 2006 Budapest 71


Viola & Jones: Robust Real-time Face Detection

Results - Training
 Structure of the Cascade:
– 32 layers with 4297 features
– First layer uses two features, 60% non-faces
rejected, close to 100% faces detected
– Second layer uses five features, rejects 80%
non-faces
– 3rd 20-feature-classifier
– 4th 50-feature-classifier
– 5th 100-feature-classifier
– 20th 200-feature-classifier

30/08/2006 IPCV 2006 Budapest 72


Viola & Jones: Robust Real-time Face Detection

Results - Speed
 Speed of the detector depends on the
number of features evaluated
 On the MIT+CMU test set, an average of 8
features (out of 4297) is evaluated per sub-window; on a
700 MHz Pentium III, a 384x288 image takes 0.067 s (2001)

30/08/2006 IPCV 2006 Budapest 73
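
For context, detectors of this kind ship with OpenCV; the snippet below is a usage sketch only, with a hypothetical input file name and commonly used default parameters rather than values from these slides.

import cv2

# Pretrained Viola-Jones style cascade bundled with OpenCV
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

img = cv2.imread("group_photo.jpg")            # hypothetical input image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
faces = cascade.detectMultiScale(gray, scaleFactor=1.2, minNeighbors=3,
                                 minSize=(24, 24))
for (x, y, w, h) in faces:
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
cv2.imwrite("detections.jpg", img)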


Viola & Jones: Robust Real-time Face Detection

Outlook
 Viola & Jones improved their detector to detect not
only frontal faces, but also non-upright and non-
frontal faces
 Additional rectangle features

 Pose estimator
 Classifier for each rotation class (12)

30/08/2006 IPCV 2006 Budapest 74


Haar-like Features Extension
 45º rotated rectangle
feature by Lienhart
 Feature is represented by
(type, x, y, w, h): type
[1(a), 2(b), etc.]; (x, y) for
location in detection
window and (w, h) for
feature size
 Number of features: for a
24x24 detection window
there are 117,941 different
features, a large number
30/08/2006 IPCV 2006 Budapest 75
Integral Image, Fast Feature
Calculation
 Corresponding to the original (upright) features and the 45º rotated
features, there are two types of integral image, called by Lienhart the
Summed Area Table SAT(x,y) and the Rotated Summed Area Table RSAT(x,y)

30/08/2006 IPCV 2006 Budapest 76


Integral Image, Fast Feature
Calculation
 SAT and RSAT calculation: recurrences analogous to the upright
integral image
 Sum of a rectangle: a few corner look-ups into SAT (upright features)
or RSAT (rotated features)
 Feature value: calculated as the weighted
sum of the rectangles
30/08/2006 IPCV 2006 Budapest 77
Appearance-Based Methods: Summary
 Pros:
– Use powerful machine learning algorithms
– Has demonstrated good empirical results
– Fast and fairly robust
– Extended to detect faces in different pose and
orientation
 Cons
– Usually needs to search over space and scale
– Need lots of positive and negative examples
– Limited view-based approach

30/08/2006 IPCV 2006 Budapest 78


Agenda (Detection)
 Detecting faces in gray scale images
– Knowledge-based
– Feature-based
– Template-based
– Appearance-based
 Detecting faces in color images
 Detecting faces in video
 Performance evaluation
 Research direction and concluding remarks

30/08/2006 IPCV 2006 Budapest 79


Color-Based Face Detector
 Distribution of skin color across different ethnic
groups
– Under controlled illumination conditions:
compact
– Arbitrary conditions: less compact
 Color space
– RGB, normalized RGB, HSV, HSI, YCrCb,
YIQ, UES, CIE XYZ, CIE LUV, …
 Statistical analysis
– Histogram, look-up table, Gaussian model,
mixture model, …
30/08/2006 IPCV 2006 Budapest 80
Methods of Skin Detection
 Pixel-Based Methods
– Classify each pixel as skin or non-skin individually,
independently from its neighbors
– Simpler, faster
– Color Based Methods fall in this category
 Region Based Methods
– Try to take the spatial arrangement of skin pixels into
account during the detection stage to enhance the
methods performance
– More computationally expensive
– Additional knowledge in terms of texture etc. is
required

30/08/2006 IPCV 2006 Budapest 81


Skin Color Based Methods - Advantages
 Allows fast processing
 Robust to geometric variations of the skin
patterns
 Robust under partial occlusion
 Robust to resolution changes
 Experience suggests that human skin has a
characteristic color, which is easily recognized
by humans

30/08/2006 IPCV 2006 Budapest 82


Skin Modelling
 Explicitly defined skin region
– Using simple rules
 Non-parametric skin distribution modeling -
Estimate skin color distribution from the training
data without deriving an explicit model of the skin
– Look up table or Histogram Model
– Bayes Classifier
 Parametric skin distribution modeling - Deriving a
parametric model from the training set
30/08/2006 IPCV 2006 Budapest 83
Explicit Rule
 To define explicitly (through a number of rules) the boundaries of the
skin cluster in some color space (RGB, rgb, HSI, YUV, etc.)
 E.g., in RGB space, (R’,G’,B’) is classified as skin if [Peer 2003]:
 R’ > 95 and G’ > 40 and B’ > 20,
 max{R’,G’,B’} – min{R’,G’,B’} > 15, and
 |R’-G’| > 15 and R’ > G’ and R’ > B’

 The simplicity of this method is attractive
 The simplicity of the rules leads to a very rapid classifier
 The main difficulty in achieving high recognition rates is the need to find
– a good color space
– adequate decision rules

30/08/2006 IPCV 2006 Budapest 84
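
A direct vectorized Python sketch of the rule quoted above, assuming an 8-bit image in R, G, B channel order.

import numpy as np

def skin_mask_peer(rgb):
    """Explicit-rule skin classifier (Peer 2003 rule quoted above).
    rgb: H x W x 3 uint8 array in R, G, B order. Returns a boolean mask."""
    r = rgb[..., 0].astype(int)
    g = rgb[..., 1].astype(int)
    b = rgb[..., 2].astype(int)
    return ((r > 95) & (g > 40) & (b > 20)
            & (rgb.max(axis=-1).astype(int) - rgb.min(axis=-1).astype(int) > 15)
            & (np.abs(r - g) > 15) & (r > g) & (r > b))

# Toy image: one skin-like pixel and one blue pixel
img = np.array([[[200, 120, 90], [30, 40, 200]]], dtype=np.uint8)
print(skin_mask_peer(img))     # [[ True False]]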


Example: YCrCb Space [Yoo and Kim]
 Skin-colors form a cluster in YCrCb color space
 We can approximate the skin color region with
several lines or with an ellipse

30/08/2006 IPCV 2006 Budapest 85


Color Histogram
 If
– h(rk, gk, bk)=nk,
 Then
– p(rk, gk, bk) = nk / N
– p(r’k, g’k, b’k) = (N-nk) / N

 That is:
– If there are nk skin color (rk, gk, bk) pixels in the image,
the probability of any pixel which is a skin pixel is nk/N
– and the probability of any pixel which is NOT a skin pixel
is (N-nk)/N

30/08/2006 IPCV 2006 Budapest 86


Bayes Probability - Recall
 P(A,B) = P(A|B) x P(B)
 P(A,B) = P(B|A) x P(A)
 If P(B) + P(~B) = 1, and P(A) + P(~A) = 1, then
– P(A) = P(A|B) x P(B) + P(A|~B) x P(~B)
– P(B) = P(B|A) x P(A) + P(B|~A) x P(~A)

 P(A|B) x P(B) = P(B|A) x P(A)

P(B|A) x P(A)
P(A|B) = -------------------
P(B)
P(B|A) x P(A)
P(A|B) = -----------------------------------------------
P(B|A) x P(A) + P(B|~A) x P(~A)

30/08/2006 IPCV 2006 Budapest 87


Bayes Classifier (1)
 p(r,g,b|skin) is a conditional probability - a
probability of observing color (red, or green, or
blue), knowing that we see a skin pixel
 A more appropriate measure for skin detection
would be p(skin|r,g,b) - a probability of observing
skin, given a concrete (r,g,b) color value
 To compute this probability, the Bayes rule is
used:

30/08/2006 IPCV 2006 Budapest 88


Bayes Classifier (2)

 p(r,g,b|skin) and p(r,g,b|non-skin) can be


computed from skin and non-skin color histograms
 The prior probabilities p(skin) and p(non-skin) can
also be estimated from the overall number of skin
and non-skin samples in the training set
 An inequality
p(skin|r,g,b) >= Q,
where Q is a threshold value, can be used as a
skin detection rule

30/08/2006 IPCV 2006 Budapest 89


Bayes Classifier (3)
 Consider normalised RGB space
 Histograms approximate probability densities
– p(skin) can be estimated as the ratio of the number of skin
pixels Nskin to the total number of pixels Ntotal
– p(non-skin) can be estimated as the ratio of the number of
non-skin pixels Nnon-skin to Ntotal
– p((r,g,b)|skin): the probability density of (r,g,b) given
skin
– p((r,g,b)|non-skin): the probability density of (r,g,b)
given non-skin

30/08/2006 IPCV 2006 Budapest 90


Bayes Classifier (4)
 Then we can compute for any color (r,g,b)
the probability p(skin|(r,g,b)) → Probability
Map of the image
 Skin Probability Map (SPM)
 That is:
p(skin|(r,g,b)) = p((r,g,b)|skin) p(skin) /
[ p((r,g,b)|skin) p(skin) + p((r,g,b)|non-skin) p(non-skin) ]
– a ratio of histogram-based estimates

30/08/2006 IPCV 2006 Budapest 91
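
A Python sketch of the histogram-ratio skin probability map under these definitions; the coarse 32-bin-per-channel RGB histograms and the toy training pixels are choices of the sketch, not values from the slides.

import numpy as np

BINS = 32                      # coarse bins per channel (sketch choice)

def color_histogram(pixels):
    """3D RGB histogram over N x 3 uint8 pixels, with BINS bins per channel."""
    idx = (pixels // (256 // BINS)).astype(int)
    hist = np.zeros((BINS, BINS, BINS))
    np.add.at(hist, (idx[:, 0], idx[:, 1], idx[:, 2]), 1)
    return hist

def skin_probability_map(image, skin_pixels, nonskin_pixels):
    """p(skin | r,g,b) for every pixel, via Bayes rule on histogram estimates."""
    h_skin, h_non = color_histogram(skin_pixels), color_histogram(nonskin_pixels)
    p_rgb_skin = h_skin / max(h_skin.sum(), 1)          # p((r,g,b) | skin)
    p_rgb_non = h_non / max(h_non.sum(), 1)             # p((r,g,b) | non-skin)
    p_skin = skin_pixels.shape[0] / (skin_pixels.shape[0] + nonskin_pixels.shape[0])
    idx = (image // (256 // BINS)).astype(int)
    num = p_rgb_skin[idx[..., 0], idx[..., 1], idx[..., 2]] * p_skin
    den = num + p_rgb_non[idx[..., 0], idx[..., 1], idx[..., 2]] * (1 - p_skin)
    return np.where(den > 0, num / np.maximum(den, 1e-12), 0.0)

# Toy training pixels and a 2x2 test image (all values invented)
skin = np.array([[200, 120, 90]] * 50, dtype=np.uint8)
non = np.array([[30, 40, 200]] * 150, dtype=np.uint8)
img = np.array([[[200, 120, 90], [30, 40, 200]],
                [[199, 119, 91], [32, 41, 198]]], dtype=np.uint8)
spm = skin_probability_map(img, skin, non)
mask = spm >= 0.5                                        # threshold Q = 0.5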


Bayes Classifier from [Rehg & Jones]
 Skin photos:

 Non skin photos:

30/08/2006 IPCV 2006 Budapest 92


Results from [Rehg & Jones]
 Used 18 696 images to build a general color model
 Density is concentrated around the gray line and is more
sharply peaked at white than black
 Most colors fall on or near the gray line
 Black and white are by far the most frequent colors, with
white occurring slightly more frequently
 There is a marked skew in the distribution toward the red
corner of the color cube
 77% of the possible 24 bit RGB colors are never
encountered (i.e. the histogram is mostly empty)
 52% of web images have people in them

30/08/2006 IPCV 2006 Budapest 93


Marginal Distributions [Rehg & Jones]

30/08/2006 IPCV 2006 Budapest 94


Skin model [Rehg & Jones]

30/08/2006 IPCV 2006 Budapest 95


Non Skin Model [Rehg & Jones]

30/08/2006 IPCV 2006 Budapest 96


Skin and Non-Skin Color Model
 Analyze 1 billion labeled pixels
 Skin and non-skin models
 A significant degree of separation between
the skin and non-skin models
 Achieves 80% detection rate with 8.5%
false positives
 Histogram method outperforms Gaussian
mixture model
30/08/2006 IPCV 2006 Budapest 97
Experimental Results [Jones and Rehg 02]
 May have lots of
false positives in
the raw results
 Need further
processing to
eliminate false
positives and
group color pixels
for face detection

30/08/2006 IPCV 2006 Budapest 98


Results from [Zarit et al.]
 They compared 5 different color spaces CIELab, HSV, HS,
Normalized RGB and YCrCb
 Four different metrics are used to evaluate the results of the
skin detection algorithms
– C %– Skin and Non Skin pixels identified correctly
– S %– Skin pixels identified correctly
– SE – Skin error – skin pixels identified as non skin
– NSE – Non Skin error – non skin pixels identified as skin
 They compared the 5 color spaces with 2 color models –
a look-up table and a Bayes classifier

30/08/2006 IPCV 2006 Budapest 99


Look up table results
 HSV, HS
gave the best
results
 Normalized
rg is not far
behind
 CIELAB and
YCrCb gave
poor results
30/08/2006 IPCV 2006 Budapest 100
Color-Based Face Detector: Summary
 Pros:
– Easy to implement
– Effective and efficient in constrained
environment
– Insensitive to pose, expression, rotation
variation
 Cons:
– Sensitive to environment and lighting change
– Noisy detection results (body parts, skin-tone-like
regions)

30/08/2006 IPCV 2006 Budapest 101


Agenda (Detection)
 Detecting faces in gray scale images
– Knowledge-based
– Feature-based
– Template-based
– Appearance-based
 Detecting faces in color images
 Detecting faces in video
 Performance evaluation
 Research direction and concluding remarks

30/08/2006 IPCV 2006 Budapest 102


Video-Based Face Detector
 Motion cues:
– Frame differencing
– Background modeling and subtraction
 Can also use depth cues (e.g., from stereo)
when available
 Reduces the search space dramatically

30/08/2006 IPCV 2006 Budapest 103


HONDA Humanoid Robot: ASIMO
 Using motion and depth cue
– Motion cue: from a gradient-based tracker
– Depth cue: from stereo camera
 Dramatically reduce search space
 Cascade face detector

30/08/2006 IPCV 2006 Budapest 104


Video-Based Detectors: Summary
 Pros:
– An easier problem than detection in still
images
– Use all available cues: motion, depth, voice,
etc. to reduce search space
 Cons:
– Need efficient and effective methods to
process the multimodal cues
– Data fusion
30/08/2006 IPCV 2006 Budapest 105
Agenda (Detection)
 Detecting faces in gray scale images
– Knowledge-based
– Feature-based
– Template-based
– Appearance-based
 Detecting faces in color images
 Detecting faces in video
 Performance evaluation
 Research direction and concluding remarks

30/08/2006 IPCV 2006 Budapest 106


Performance Evaluation
 Need to set the evaluation criteria/protocol
– Training set
– Test set
– What is a correct detect?
– Detection rate: false positive/negative
– Precision of face location
– Speed: training/test stage

30/08/2006 IPCV 2006 Budapest 107


Training Sets
 Cropped and pre-processed data set with face
and non-face images provided by MIT CBCL:
http://www.ai.mit.edu/projects/cbcl.old/software-
datasets/index.html

30/08/2006 IPCV 2006 Budapest 108


Standard Test Sets
 MIT test set: subsumed by CMU test set
 CMU test set (http://www.cs.cmu.edu/~har)
(de facto benchmark): 130 gray scale
images with a total of 507 frontal faces
 CMU profile face test set: 208 images with
faces in profile views
 Kodak data set (Eastman Kodak Corp):
faces of multiple size, pose and varying
lighting conditions in color images
30/08/2006 IPCV 2006 Budapest 109
CMU Test Set I: Upright Frontal Faces
 130 images with 507
frontal faces
 Collected by K.-K. Sung and H. Rowley
 Including 23 images used
in [Sung and Poggio 94]
 Some images have low
resolution
 De facto benchmark

30/08/2006 IPCV 2006 Budapest 110


CMU Test Sets: Rotated and Profile
Faces
 50 images with 223 faces with in-plane rotation,
collected by H. Rowley

 208 images with 441 faces, collected by
H. Schneiderman

30/08/2006 IPCV 2006 Budapest 111


Agenda (Detection)
 Detecting faces in gray scale images
– Knowledge-based
– Feature-based
– Template-based
– Appearance-based
 Detecting faces in color images
 Detecting faces in video
 Performance evaluation
 Research direction and concluding remarks

30/08/2006 IPCV 2006 Budapest 112


Research Issues
 Representation: How to describe a typical face?
 Scale: How to deal with faces of different sizes?
 Search strategy: How to spot these faces?
 Speed: How to speed up the process?
 Precision: How to locate the faces precisely?
 Post processing: How to combine detection
results?

30/08/2006 IPCV 2006 Budapest 113


Face Detection: A Solved Problem?
 Not quite yet…
 Factors:
– Shadows
– Occlusions
– Robustness
– Resolution
 Lots of potential
applications
 Can be applied to
other domains

30/08/2006 IPCV 2006 Budapest 114


Outline of Recognition Part
 Objective
– Categorization
– Brief description of representative face
recognition techniques
– Open research issues

 This part of the presentation is based on the survey by W. Zhao,
R. Chellappa, P. J. Phillips and A. Rosenfeld:
“Face Recognition: A Literature Survey”, ACM Computing Surveys,
Vol. 35, Dec. 2003

30/08/2006 IPCV 2006 Budapest 115


Recognition from Intensity Images
High-level Categorization
 1. Holistic matching methods
– Classification using whole face region
 2. Feature-based (structural) matching methods
– Structural classification using local features
 3. Hybrid Methods
– Using local features and whole face region

30/08/2006 IPCV 2006 Budapest 116


Holistic Approaches
 Principal Component Analysis (PCA)
– Eigenface
– Fisherface/Subspace LDA
– SVM
– ICA
 Other Representations
– LDA/FLA
– PDBNN
30/08/2006 IPCV 2006 Budapest 117
IPCV, Saint-Etienne, 2004, Prof. V. Bochko

What is PCA?
 Principal component analysis (PCA) is a
technique for extracting a smaller set of
variables with less redundancy from high-
dimensional data sets in order to retain as
much of the information as possible
 A small number of principal components
often give most of the structure in the data

30/08/2006 IPCV 2006 Budapest 118


IPCV, Saint-Etienne, 2004, Prof. V. Bochko

Standard PCA
 Assume an n-dimensional observed vector x
 The vector is centered: xc = x − m, where m = E[x]
 The covariance matrix is C = E[xc xc^T], where E is the expectation
 The eigendecomposition is C = U Λ U^T
 The matrix of eigenvectors is U = [u1, …, un], where C ui = λi ui
 The eigenvector ui corresponds to eigenvalue λi
 The eigenvalues λ1 ≥ λ2 ≥ … ≥ λn are in descending order
 The first k principal components are y = Uk^T xc, with Uk = [u1, …, uk]
 Finally, the reconstruction is x ≈ Uk y + m

30/08/2006 IPCV 2006 Budapest 119
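
A minimal NumPy sketch of exactly these steps on a small data matrix; the toy data are random and the covariance is estimated from samples rather than taken as a true expectation.

import numpy as np

def pca(X, k):
    """X: (n_samples, n_dims) data matrix. Returns (mean, top-k eigenvectors)."""
    m = X.mean(axis=0)                         # m = E[x]
    Xc = X - m                                 # centered vectors
    C = Xc.T @ Xc / X.shape[0]                 # covariance matrix estimate
    eigvals, U = np.linalg.eigh(C)             # eigendecomposition (ascending order)
    order = np.argsort(eigvals)[::-1]          # sort eigenvalues in descending order
    return m, U[:, order[:k]]

def project(x, m, Uk):
    return Uk.T @ (x - m)                      # first k principal components

def reconstruct(y, m, Uk):
    return Uk @ y + m                          # approximation of the original vector

X = np.random.default_rng(0).normal(size=(100, 5))    # toy 5-dimensional data
m, Uk = pca(X, k=2)
y = project(X[0], m, Uk)
x_hat = reconstruct(y, m, Uk)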


Facial Recognition: PCA - Overview
 Create training set of faces and
calculate the eigenfaces
 Project the new image onto the
eigenfaces
 Check if image is close to “face
space”
 Check closeness to one of the
known faces
 Add unknown faces to the
training set and re-calculate

30/08/2006 IPCV 2006 Budapest 120


Facial Recognition: PCA Training
 Find average of training
images
 Subtract average face from
each image
 Create covariance matrix
 Generate eigenfaces
 Each original image can be
expressed as a linear
combination of the
eigenfaces – face space
30/08/2006 IPCV 2006 Budapest 121
Facial Recognition: PCA Recognition
 A new image is projected into the “face space”
 Create a vector of weights that describes this image
 Compare the distance from the image to the face space and to the
weight vectors of the known faces
 If the distances are within certain thresholds, it is a recognized face

30/08/2006 IPCV 2006 Budapest 122
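
Continuing the PCA sketch above, a hedged version of this recognition step: known faces are stored as weight vectors, and a new image is accepted only if its distance to the face space (reconstruction error) and its distance to the nearest stored face are below thresholds; the thresholds, basis and gallery values below are invented.

import numpy as np

def recognize(x, m, Uk, gallery, t_face_space=1.0, t_identity=1.0):
    """gallery: dict of person name -> stored weight vector in the same PCA basis."""
    y = Uk.T @ (x - m)                               # project into face space
    recon_err = np.linalg.norm((x - m) - Uk @ y)     # distance to face space
    if recon_err > t_face_space:
        return "not a face"
    name, d = min(((p, np.linalg.norm(y - w)) for p, w in gallery.items()),
                  key=lambda t: t[1])                # nearest known face
    return name if d < t_identity else "unknown face"

# Toy usage with an identity-like basis; in practice m and Uk come from PCA training
m, Uk = np.zeros(5), np.eye(5)[:, :2]
gallery = {"person_A": np.array([0.0, 0.0]), "person_B": np.array([3.0, 1.0])}
print(recognize(np.array([0.1, -0.2, 0.0, 0.0, 0.0]), m, Uk, gallery))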


Feature Based Approaches
 Elastic bunch graph matching using Gabor
wavelets

30/08/2006 IPCV 2006 Budapest 123


Detecting Feature Points
 Use Gabor wavelets
 Create a filter bank with 5 frequencies and 8 orientations
[Figure: spatial view and top view of a Gabor kernel]


30/08/2006 IPCV 2006 Budapest 124
Gabor filters
 The responses of the filters [Figure: original, filtered, result]
 2D Gabor filters can be used to detect structure along a specific
orientation
 Jet: the set of 40 complex coefficients at a reference point

30/08/2006 IPCV 2006 Budapest 125
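
A Python sketch of such a filter bank and jet extraction; the kernel parameterization (wavelengths, Gaussian width, kernel size) is a common textbook form with illustrative values, not parameters taken from the slides.

import numpy as np
from scipy import ndimage

def gabor_kernel(wavelength, theta, size=21, sigma=4.0):
    """Complex 2D Gabor kernel: a plane wave at angle theta times a Gaussian."""
    half = size // 2
    ys, xs = np.mgrid[-half:half + 1, -half:half + 1]
    xr = xs * np.cos(theta) + ys * np.sin(theta)        # rotate coordinates
    envelope = np.exp(-(xs**2 + ys**2) / (2 * sigma**2))
    return envelope * np.exp(1j * 2 * np.pi * xr / wavelength)

def gabor_jet(image, x, y, wavelengths=(4, 6, 8, 11, 16), n_orient=8):
    """Jet: 5 frequencies x 8 orientations = 40 complex responses at (x, y)."""
    jet = []
    for lam in wavelengths:
        for k in range(n_orient):
            kern = gabor_kernel(lam, theta=k * np.pi / n_orient)
            # filter the whole image (wasteful but simple) and read one point
            resp = (ndimage.convolve(image, kern.real, mode="nearest")
                    + 1j * ndimage.convolve(image, kern.imag, mode="nearest"))
            jet.append(resp[y, x])
    return np.array(jet)            # 40 complex coefficients

img = np.random.rand(64, 64)        # toy image; a fiducial point at (32, 32)
jet = gabor_jet(img, 32, 32)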


Face Graph
 Facial fiducial points
– Pupil, tip of mouth, etc.
 Face graph
– Nodes at the fiducial points
– Un-directed graph
– The structure of the graph is the same for each face
– Fitting a face image to a face graph is done automatically
– Some nodes may be undefined due to occlusion
 Bunch
– A set of jets associated with the same fiducial point
– E.g., an eye bunch may consist of jets for different types of eyes:
open, closed, etc.
 Face bunch graph (FBG)
– Same as a face graph, except each node consists of a jet bunch
rather than a single jet

30/08/2006 IPCV 2006 Budapest 126


Face Bunch Graph
 FBG has the same structure as
individual face graph
– Each node labeled with a
bunch of jets
– Each edge labeled with
average distance between
corresponding nodes in face
samples

30/08/2006 IPCV 2006 Budapest 127


Still Face Recognition: Open Issues
 Addressing the issue of recognition being too
sensitive to inaccurate facial feature localization
 Robustly recognizing faces
– Small and/or noisy images
– Images acquired years apart
– Outdoor acquisition: lighting, pose
 What is the limit on the number of faces that can
be distinguished?
 What is the principled and optimal way to arbitrate
between and combine local features and global features?

30/08/2006 IPCV 2006 Budapest 128


Web Resources (detection)
 Face detection home page
http://home.t-online.de/home/Robert.Frischholz/face.htm
 Henry Rowley’s home page
http://www-2.cs.cmu.edu/~har/faces.html
 Henry Schneiderman’s home page
http://www.ri.cmu.edu/projects/project_416.html
 MIT CBCL web page
http://www.ai.mit.edu/projects/cbcl.old/software-datasets/
index.html
 Face detection resources
http://vision.ai.uiuc.edu/mhyang/face-detection-survey.html

30/08/2006 IPCV 2006 Budapest 129


Web Resources (recognition)
Face Recognition Home Page http://www.cs.rug.nl/~peterkr/FACE/frhp.html
Face Databases
 UT Dallas www.utdallas.edu/dept/bbs/FACULTY_PAGES/otoole/database.htm
 Notre Dame database www.nd.edu/~cvrl/HID-data.html
 MIT database ftp://whitechapel.media.mit.edu/pub/images
 Edelman ftp://ftp.wisdom.weizmann.ac.il/pub/FaceBase
 CMU PIE www.ri.cmu.edu/projects/project_418.htm
 Stirling database pics.psych.stir.ac.uk
 M2VTS multimodal www.tele.ucl.ac.be/M2VTS
 Yale database cvc.yale.edu/projects/yalefaces/yalefaces.htm
 Yale database B cvc.yale.edu/projects/yalefacesB/yalefacesB.htm
 Harvard database hrl.harvard.edu/pub/faces
 Weizmann database www.wisdom.weizmann.ac.il/~yael
 UMIST database images.ee.umist.ac.uk/danny/database.html
 Purdue rvl1.ecn.purdue.edu/~aleix/aleix_face_DB.html
 Olivetti database www.cam-orl.co.uk/facedatabase.html
 Oulu physics-based www.ee.oulu.fi/research/imag/color/pbfd.html

30/08/2006 IPCV 2006 Budapest 130


References (detection)
 M.-H. Yang, D. J. Kriegman, and N. Ahuja,
“Detecting Faces in Images: A Survey”, IEEE
Transactions on Pattern Analysis and Machine
Intelligence (PAMI), vol. 24, no. 1, pp. 34-58, 2002.
(http://vision.ai.uiuc.edu/mhyang/face-detection-survey.html)
 M.-H. Yang and N. Ahuja, Face Detection and
Hand Gesture Recognition for Human Computer
Interaction, Kluwer Academic Publishers, 2001.
 P. Viola and M. J. Jones, Robust Real-time
Object Detection, Cambridge Research Laboratory
Technical Report Series, CRL 2001/01, Feb. 2001.

30/08/2006 IPCV 2006 Budapest 131


Face and Some CV-based
Projects at Budapest Tech

30/08/2006 IPCV 2006 Budapest 132


Combined verification

[Diagram] Voice branch: audio recording → voice sample → LP spectrum →
neural network
Face branch: face mask; original image → filter response → features

30/08/2006 IPCV 2006 Budapest 133
Morphing by Automatic Feature
Detection

30/08/2006 IPCV 2006 Budapest 134


Eye and Lip Tracking
 Real time face detection
 Eye and mouth motion tracking
 Client-server internet communication

30/08/2006 IPCV 2006 Budapest 135


MYRA
 Face detection
 Skin tone based
tracking
 Feature tracking
 Motion detection
 Recognition options:
– Eigenface method
– Gabor wavelet
based bunch
graph

30/08/2006 IPCV 2006 Budapest 136


Mobile robots – navigation using CV

30/08/2006 IPCV 2006 Budapest 137


Recognition and navigation

30/08/2006 IPCV 2006 Budapest 138
