
Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories

Svetlana Lazebnik Cordelia Schmid Jean Ponce


September 19, 2011


Motivation

Consider the problem of recognizing the semantic category of an image.
Classify a photograph as depicting a scene (forest, street, office, etc.)
Bag-of-features approach with global geometric correspondence
Subdividing the image and computing histograms of local features at increasingly fine resolutions


Histogram intersection

$$I^l = I(H_X^l, H_Y^l) = \sum_{i=1}^{D} \min\left(H_X^l(i), H_Y^l(i)\right)$$
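As a minimal sketch of the histogram intersection on this slide, the function below sums the bin-wise minima of two histograms with the same number of bins D. The function name and the toy histograms are assumptions made for illustration; they do not come from the slides.

```python
def histogram_intersection(hx, hy):
    """Sum of bin-wise minima of two histograms with the same number of bins."""
    if len(hx) != len(hy):
        raise ValueError("histograms must have the same number of bins")
    return sum(min(a, b) for a, b in zip(hx, hy))


# Toy histograms with D = 4 bins (values invented for illustration).
h_x = [3, 0, 2, 5]
h_y = [1, 4, 2, 2]
print(histogram_intersection(h_x, h_y))  # 1 + 0 + 2 + 2 = 5
```

The result counts how many points of the two sets fall into the same bin, which is why it can be read as a number of matches at a given resolution.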





Pyramid match kernel

The matches found at level l also include all the matches found at the finer level l + 1

New matches found at level l: $I^l - I^{l+1}$ for $l = 0, \dots, L-1$

Penalize matches found in larger cells by weighting level l with $\frac{1}{2^{L-l}}$:

$$\kappa^L(X, Y) = I^L + \sum_{l=0}^{L-1} \frac{1}{2^{L-l}} \left(I^l - I^{l+1}\right) = \frac{1}{2^L} I^0 + \sum_{l=1}^{L} \frac{1}{2^{L-l+1}} I^l$$
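The two forms of the pyramid match kernel can be checked against each other numerically. The sketch below assumes the per-level intersections I^0, ..., I^L have already been computed and are passed as a list ordered from coarsest to finest; the function names and the example values are hypothetical.

```python
def pyramid_match(intersections):
    """Pyramid match kernel from per-level intersections I^0..I^L.

    intersections[l] is I^l; l = 0 is the coarsest level, l = L the finest.
    """
    L = len(intersections) - 1
    kernel = intersections[L]
    for l in range(L):
        # Matches that are new at level l, penalized by 1 / 2^(L - l).
        new_matches = intersections[l] - intersections[l + 1]
        kernel += new_matches / 2 ** (L - l)
    return kernel


def pyramid_match_weighted(intersections):
    """Equivalent form: I^0 weighted by 1/2^L, I^l by 1/2^(L - l + 1)."""
    L = len(intersections) - 1
    kernel = intersections[0] / 2 ** L
    for l in range(1, L + 1):
        kernel += intersections[l] / 2 ** (L - l + 1)
    return kernel


# Invented intersection values for L = 2, coarse to fine.
I = [10, 6, 3]
print(pyramid_match(I), pyramid_match_weighted(I))  # both give 5.5
```

With these values the first form gives 3 + 4/4 + 3/2 and the second gives 10/4 + 6/4 + 3/2, which agree.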


Spatial Matching Scheme

m: feature type (channel)
$X_m$: coordinates of features of type m
For L levels and M channels:

$$K^L(X, Y) = \sum_{m=1}^{M} \kappa^L(X_m, Y_m)$$

Vector dimensionality:

$$M \sum_{l=0}^{L} 4^l = M \, \frac{1}{3} \left(4^{L+1} - 1\right)$$
However, these operations are efficient because the histogram vectors are extremely sparse
The computational complexity of the kernel is linear in the number of features
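To make the dimensionality formula concrete, here is a minimal sketch that builds the concatenated (unweighted) pyramid vector for a set of quantized features and compares its length with the closed form. It assumes image coordinates normalized to [0, 1) and integer channel labels; the function names and the toy features are hypothetical, and the per-level kernel weights are deliberately left out.

```python
def spatial_pyramid_vector(features, num_channels, num_levels):
    """Concatenate per-cell, per-channel counts for levels 0..num_levels.

    features: list of (x, y, m) with x, y in [0, 1) and channel m in
    range(num_channels). Level l splits the image into 2^l x 2^l cells.
    """
    vector = []
    for level in range(num_levels + 1):
        cells = 2 ** level
        hist = [0] * (cells * cells * num_channels)
        for x, y, m in features:
            col = min(int(x * cells), cells - 1)
            row = min(int(y * cells), cells - 1)
            hist[(row * cells + col) * num_channels + m] += 1
        vector.extend(hist)
    return vector


def pyramid_dimension(num_channels, num_levels):
    """Closed form M * (4^(L+1) - 1) / 3 (always an integer)."""
    return num_channels * (4 ** (num_levels + 1) - 1) // 3


# Three invented features, M = 2 channels, L = 2 levels.
feats = [(0.1, 0.1, 0), (0.9, 0.2, 1), (0.6, 0.7, 0)]
vec = spatial_pyramid_vector(feats, num_channels=2, num_levels=2)
print(len(vec), pyramid_dimension(2, 2))  # 42 42
print(pyramid_dimension(400, 3))          # 34000
```

Most entries of the vector are zero even in this tiny example, which illustrates why sparsity keeps the kernel computation cheap.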


Mercer's kernel

According to Mercer's theorem, a kernel K is positive semi-definite if and only if there exists a mapping $\phi$ such that

$$K(x_i, x_j) = \langle \phi(x_i), \phi(x_j) \rangle, \quad \forall x_i, x_j \in X$$

where $\langle \cdot, \cdot \rangle$ denotes a scalar dot product
Histogram intersection is an inner product in a suitable feature space:

$$V(H) = \Big( \underbrace{\overbrace{1, \dots, 1}^{H(1)}, \overbrace{0, \dots, 0}^{m - H(1)}}_{\text{first bin}}, \; \dots, \; \underbrace{\overbrace{1, \dots, 1}^{H(r)}, \overbrace{0, \dots, 0}^{m - H(r)}}_{\text{last bin}} \Big)$$

p-dimensional binary vector, $p = m \cdot r$
m: total number of points in the histogram
r: number of bins
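The encoding V(H) can be tried on a small example: each bin becomes a run of H(i) ones followed by m - H(i) zeros, and the ordinary dot product of two such vectors equals the histogram intersection. The sketch below assumes integer bin counts and uses invented histograms; the function names are hypothetical.

```python
def unary_encode(hist, m):
    """Map an r-bin histogram with integer counts <= m to a binary vector of length m * r."""
    vec = []
    for count in hist:
        if not 0 <= count <= m:
            raise ValueError("each bin count must lie in [0, m]")
        vec.extend([1] * count + [0] * (m - count))
    return vec


def dot(u, v):
    """Plain scalar dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


# Two toy histograms with r = 3 bins and m = 5 points each.
h_x = [3, 0, 2]
h_y = [1, 2, 2]
m = 5
v_x, v_y = unary_encode(h_x, m), unary_encode(h_y, m)
print(len(v_x))                                  # p = m * r = 15
print(dot(v_x, v_y))                             # 3
print(sum(min(a, b) for a, b in zip(h_x, h_y)))  # 3
```

Within each bin the two runs of ones overlap in exactly min(H_X(i), H_Y(i)) positions, so the dot product reproduces the intersection bin by bin.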


Feature Extraction
1. Weak features

Oriented edge points
Points whose gradient magnitude in a given direction exceeds a minimum threshold
Extract edge points at two scales and eight orientations
M = 16 channels

2. Strong features

SIFT descriptors
Dense regular grid
16 × 16 pixel patches
Vocabulary sizes: M = 200, M = 400 (k-means)
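For the strong features, each descriptor is mapped to one of the M visual words of the k-means vocabulary. The following is a rough sketch of that quantization step only, assuming the vocabulary (the cluster centers) has already been learned and that nearest-center assignment uses Euclidean distance. Real SIFT descriptors are 128-dimensional; the 2-D vectors and function names here are invented for illustration.

```python
def nearest_word(descriptor, vocabulary):
    """Index of the vocabulary entry closest to descriptor (squared Euclidean distance)."""
    def sq_dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return min(range(len(vocabulary)), key=lambda k: sq_dist(descriptor, vocabulary[k]))


def bag_of_words(descriptors, vocabulary):
    """Histogram of visual-word counts over all descriptors of one image."""
    hist = [0] * len(vocabulary)
    for d in descriptors:
        hist[nearest_word(d, vocabulary)] += 1
    return hist


# Toy vocabulary with M = 3 words and four toy descriptors.
vocabulary = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
descriptors = [[0.1, 0.2], [0.9, 0.8], [0.2, 0.9], [0.0, 0.1]]
print(bag_of_words(descriptors, vocabulary))  # [2, 1, 1]
```

The word index plays the role of the channel m, so the positions of all descriptors assigned to the same word form one set $X_m$.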


Experiments

Three diverse datasets:

Fifteen scene categories
Caltech-101 [3]
Graz

Perform all processing in grayscale

All experiments are repeated ten times with different randomly selected training and test images
The final result is reported as the mean and standard deviation of the results from the individual runs
Multi-class classification: support vector machine (SVM), trained using the one-versus-all rule
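The evaluation protocol can be outlined in a few lines. This sketch covers only the bookkeeping: a random train/test split, the one-versus-all decision (pick the class whose classifier responds most strongly), and the mean and standard deviation over runs. SVM training itself is not shown, all names and numbers are hypothetical, and the use of the sample standard deviation is an assumption since the slides do not say which estimator was used.

```python
import random
import statistics


def random_split(items, num_train, rng):
    """Shuffle a copy of items and split it into training and test parts."""
    shuffled = list(items)
    rng.shuffle(shuffled)
    return shuffled[:num_train], shuffled[num_train:]


def one_vs_all_predict(scores):
    """scores maps class name -> decision value of that class's binary classifier."""
    return max(scores, key=scores.get)


def summarize(run_accuracies):
    """Mean and sample standard deviation over the individual runs."""
    return statistics.mean(run_accuracies), statistics.stdev(run_accuracies)


rng = random.Random(0)
train, test = random_split(range(20), num_train=15, rng=rng)
print(len(train), len(test))  # 15 5

# Invented decision values for one test image.
print(one_vs_all_predict({"forest": -0.2, "street": 0.7, "office": 0.1}))  # street

# Invented accuracies standing in for three of the repeated runs.
mean_acc, std_acc = summarize([0.80, 0.82, 0.78])
print(round(mean_acc, 3), round(std_acc, 3))  # 0.8 0.02
```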


Scene Category Recognition


Spatial pyramid kernel and strong features with M = 200

Probabilistic latent semantic analysis (pLSA): dimensionality reduction of the feature space from 200 to 60


Confusion table

Confusion occurs between the indoor classes (kitchen, bedroom, living room)

Retrieval from the scene category database.


Caltech-101

This database contains from 31 to 800 images per category


The most diverse object database


The Graz database

High intra-class variation


Two object classes, bikes (373 images) and persons (460 images)

Train detectors for persons and bikes on 100 positive and 100 negative images
Results for strong features:


Conclusions

This paper presents a method for recognizing scene categories based on approximate global geometric correspondence
Efficient algorithm: the computational complexity of the kernel is linear in the number of features
Does very well on global scene classification tasks
When a class is characterized by high geometric variability, it is difficult to find useful global features
