
Feature Matching

Light and Color


Computer Vision
2021/2022 - 2
Local features: main components

1) Detection:
Find a set of distinctive key points.

2) Description:
Extract feature descriptor around
each interest point as vector.

x1 = [ x1(1), …, xd(1) ]

3) Matching:
x2 = [ x1(2), …, xd(2) ]
Compute distance between feature
vectors to find correspondence.

d(x1, x2) < T
1
Review: Harris corner detector E(u, v)

• Approximate distinctiveness by local
auto-correlation.
• Approximate local auto-correlation by
second moment matrix M.
• Distinctiveness (or cornerness) relates
to the eigenvalues of M.
• Instead of computing eigenvalues
directly, we can use the determinant
and trace of M.

M = Σ_{x,y} w(x,y) [ Ix²    Ix·Iy
                     Ix·Iy  Iy²  ]

[Figure: ellipse with axis lengths (λmax)^(−1/2) and (λmin)^(−1/2)]
James Hays
Trace / determinant and eigenvalues
• Given an n x n matrix A with eigenvalues λ1…λn:
  det(A) = λ1·…·λn,  trace(A) = λ1 + … + λn

• For the 2x2 matrix M:
  R = λ1λ2 − α(λ1 + λ2)² = det(M) − α·trace(M)²

3
Harris Corner Detector [Harris88]
James Hays

0. Input image I.
We want to compute M at each pixel.

1. Compute image derivatives Ix, Iy (optionally, blur first).

2. Compute M components
as squares of derivatives: Ix², Iy², Ix·Iy.

3. Gaussian filter g() with width σ:
g(Ix²), g(Iy²), g(Ix ∘ Iy)

4. Compute cornerness
C = det(M) − α·trace(M)²
  = g(Ix²) ∘ g(Iy²) − [g(Ix ∘ Iy)]²
  − α·[g(Ix²) + g(Iy²)]²

5. Threshold on C to pick high cornerness.

6. Non-maxima suppression to pick peaks.
4
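The six Harris steps above can be sketched in a few lines of NumPy/SciPy. This is a minimal illustration, not the slides' reference code; the function name and the default values for sigma, alpha, and the threshold are assumptions.

```python
import numpy as np
from scipy import ndimage

def harris(image, sigma=1.0, alpha=0.05, threshold=1e-6):
    """Sketch of the Harris pipeline; `image` is a float grayscale array."""
    # 1. Image derivatives (central differences stand in for Ix, Iy)
    Iy, Ix = np.gradient(image)
    # 2. Products of derivatives at each pixel
    Ixx, Iyy, Ixy = Ix * Ix, Iy * Iy, Ix * Iy
    # 3. Gaussian-weighted sums -> entries of M at every pixel
    g = lambda a: ndimage.gaussian_filter(a, sigma)
    Sxx, Syy, Sxy = g(Ixx), g(Iyy), g(Ixy)
    # 4. Cornerness C = det(M) - alpha * trace(M)^2
    C = Sxx * Syy - Sxy ** 2 - alpha * (Sxx + Syy) ** 2
    # 5. Threshold on C
    C[C < threshold] = 0
    # 6. Non-maxima suppression: keep only local peaks in a 3x3 neighborhood
    peaks = (C == ndimage.maximum_filter(C, size=3)) & (C > 0)
    return np.argwhere(peaks)  # (row, col) keypoint coordinates
```

A bright square on a dark background should fire near its four corners, while a constant image yields no detections.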
Local Image Descriptors
Read Szeliski 4.1

Acknowledgment: Many slides from James Hays, Derek Hoiem and Grauman & Leibe 2008 AAAI Tutorial
Local features: main components
1) Detection:
Find a set of distinctive key points.

2) Description:
Extract feature descriptor around each
interest point as vector.

x1 = [ x1(1), …, xd(1) ]

3) Matching:
x2 = [ x1(2), …, xd(2) ]
Compute distance between feature
vectors to find correspondence.

d(x1, x2) < T
6
Image representations

• Templates
– Intensity, gradients, etc.

• Histograms
– Color, texture, SIFT descriptors, etc.

7
James Hays
Image Representations: Histograms

Global histogram to represent
distribution of features
– How ‘well exposed’ a photo is

What about a local histogram per
detected point?

[Figure: Space Shuttle Cargo Bay]
8
Images from Dave Kauchak
Image Representations: Histograms
Histogram: Probability or count of data in each bin

• Joint histogram
  – Requires lots of data
  – Loss of resolution to avoid empty bins

• Marginal histogram
  – Requires independent features
  – More data/bin than joint histogram
9
Images from Dave Kauchak
Image Representations: Histograms

Clustering

Use the same cluster centers for all images.

[Figures: EASE Truss Assembly; Space Shuttle Cargo Bay]
10
Images from Dave Kauchak
Computing histogram distance
Histogram intersection (assuming normalized histograms):

histint(hi, hj) = 1 − Σ_{m=1..K} min( hi(m), hj(m) )

Chi-squared histogram matching distance:

χ²(hi, hj) = (1/2) Σ_{m=1..K} [hi(m) − hj(m)]² / (hi(m) + hj(m))

11
Cars found by color histogram matching using chi-squared James Hays
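Both distances above are one-liners over histogram arrays. A minimal sketch, assuming normalized NumPy histograms; the small epsilon guarding empty bins in the chi-squared denominator is an implementation choice, not from the slides.

```python
import numpy as np

def hist_intersection_dist(h_i, h_j):
    # 1 - sum of bin-wise minima; 0 for identical normalized histograms
    return 1.0 - np.minimum(h_i, h_j).sum()

def chi2_dist(h_i, h_j, eps=1e-10):
    # 0.5 * sum of (h_i - h_j)^2 / (h_i + h_j); eps guards empty bins
    return 0.5 * np.sum((h_i - h_j) ** 2 / (h_i + h_j + eps))
```

Identical histograms give distance 0 under both; completely disjoint normalized histograms give intersection distance 1.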
Histograms: Implementation issues
• Quantization
– Grids: fast but applicable only with few dimensions
– Clustering: slower but can quantize data in higher
dimensions

Few bins: need less data, coarser representation.
Many bins: need more data, finer representation.

• Matching
– Histogram intersection or Euclidean may be faster
– Chi-squared often works better
– Earth mover’s distance is good for when nearby bins
represent similar values
12
James Hays
For what things might we compute histograms?
• Color

L*a*b* color space HSV color space

• Model local appearance


13
James Hays
For matching between points,
we care about local texture

• Texture = distribution of image differences


• Build local histograms of oriented gradients

• Key algorithm:
SIFT: Scale Invariant Feature Transform
– David Lowe, IJCV 2004
– Effective: landmark algorithm in computer vision
– Extremely popular (63k citations in 2021)

14
James Hays
SIFT

SCALE-INVARIANT
FEATURE TRANSFORM

15
SIFT descriptor formation
• Compute on local 16 x 16 window around detection.
• Rotate and scale window according to discovered
orientation θ and scale σ (to gain invariance).
• Compute gradients weighted by a Gaussian of
variance half the window (for smooth falloff).

Actually 16x16, only showing 8x8 16


James Hays
SIFT vector formation
• 4x4 array of gradient orientation histograms weighted by
gradient magnitude.
• Bin into 8 orientations x 4x4 array = 128 dimensions.

Showing only 2x2 here but is 4x4

17
James Hays
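The 4x4 array of 8-bin orientation histograms can be sketched with hard binning over a 16x16 patch. This is a simplified illustration only: Lowe additionally applies Gaussian weighting and trilinear interpolation (next slide), and the function name is an assumption.

```python
import numpy as np

def sift_histograms(mags, oris, n_cells=4, n_bins=8):
    """mags, oris: 16x16 arrays of gradient magnitude / orientation (radians).
    Hard binning only; the real descriptor interpolates across bins."""
    cell = mags.shape[0] // n_cells            # 4-pixel-wide spatial cells
    desc = np.zeros((n_cells, n_cells, n_bins))
    # Map each orientation in [0, 2*pi) to one of 8 bins
    bins = ((oris % (2 * np.pi)) / (2 * np.pi) * n_bins).astype(int) % n_bins
    for r in range(mags.shape[0]):
        for c in range(mags.shape[1]):
            # Add gradient magnitude into the cell's orientation bin
            desc[r // cell, c // cell, bins[r, c]] += mags[r, c]
    return desc.ravel()                        # 4 x 4 x 8 = 128 dimensions
```

Every pixel's magnitude lands in exactly one of the 128 bins, so the descriptor sums to the total gradient magnitude of the patch.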
Ensure smoothness
• Gaussian weight the magnitudes
• Trilinear interpolation
– A given gradient contributes to 8 bins:
4 in space x 2 in orientation

Showing only 2x2 here but is 4x4

18
James Hays
Reduce effect of illumination
• 128-dim vector normalized to 1
• Threshold gradient magnitudes to avoid excessive
influence of high gradients
– After normalization, clamp gradients > 0.2
– Renormalize
Showing only 2x2 here but is 4x4

19
James Hays
SIFT steps
1. Find Harris corner scale-space extrema as key
point locations
– Alternative: Use Difference of Gaussian points
(see hidden slides in previous slide deck)

2. Post-processing
– Find subpixel key point positions by interpolation
– Discard low-contrast points
– Eliminate points along edges

3. Estimate dominant orientation per feature point

20
SIFT Orientation Estimation
• Compute gradient orientation histogram
• Select dominant orientation ϴ

[Histogram over orientations 0 to 2π]
21
[Lowe, SIFT, 1999]
SIFT Orientation Normalization
• Compute orientation histogram
• Select dominant orientation ϴ
• Normalize: rotate to fixed orientation

[Histogram over orientations 0 to 2π]
22
[Lowe, SIFT, 1999]
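Picking the dominant orientation amounts to finding the peak of a magnitude-weighted orientation histogram. A simplified sketch: Lowe uses 36 bins and refines the peak by interpolation (and keeps secondary peaks above 80% of the maximum), which this version omits.

```python
import numpy as np

def dominant_orientation(grad_angles, grad_mags, n_bins=36):
    # Magnitude-weighted orientation histogram over [0, 2*pi)
    hist, edges = np.histogram(grad_angles % (2 * np.pi), bins=n_bins,
                               range=(0, 2 * np.pi), weights=grad_mags)
    k = np.argmax(hist)          # bin with the largest weighted count
    return 0.5 * (edges[k] + edges[k + 1])   # return the bin center
```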
SIFT steps
4. Descriptor extraction based on texture
– Texture helps describe area around key point
– But texture will vary as camera view changes
– Solution: Balance sensitivity to spatial layout of
texture with area statistics
– Use spatial blocks of histograms to provide this.

23
SIFT Descriptor Extraction
• Given a keypoint with scale and orientation:
– Pick scale-space image which most closely matches
estimated scale
– Resample image to match orientation OR
– Subtract detector orientation from vector to give
invariance to general image rotation.

Rotate image to remove rotation


Resize image to set scale 24
SIFT Descriptor Extraction
• Given a keypoint with scale and orientation

Gradient magnitude and orientation per pixel;
compute using horiz + vert edge filters + trigonometry.

8-bin ‘histogram’ – add magnitude amounts!
25
Utkarsh Sinha
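The "edge filters + trigonometry" step could look like this sketch, with Sobel filters standing in for the horizontal/vertical edge filters (SciPy and the function name are assumptions):

```python
import numpy as np
from scipy import ndimage

def grad_mag_ori(image):
    # Horizontal + vertical edge filters (Sobel)
    gx = ndimage.sobel(image, axis=1)
    gy = ndimage.sobel(image, axis=0)
    # Trigonometry: magnitude and orientation per pixel
    magnitude = np.hypot(gx, gy)
    orientation = np.arctan2(gy, gx) % (2 * np.pi)  # in [0, 2*pi)
    return magnitude, orientation
```

On a left-to-right intensity ramp, interior pixels have a purely horizontal gradient, so the orientation is 0.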
SIFT Descriptor Extraction
Weight 16x16 grid by Gaussian to add location
robustness and reduce effect of outer regions

σ = half window width

Gradient magnitude and orientation per pixel;
compute using horiz + vert edge filters + trigonometry.

8-bin ‘histogram’ – add magnitude amounts!
26
Utkarsh Sinha
SIFT Descriptor Extraction
5. Extract 8 x 16 values into 128-dim vector
– Simply copy them in sequence!
6. Illumination invariance:
– Working in gradient space, so robust to I = I + b
– Normalize vector to [0…1]
• Robust to I = αI brightness changes
– Clamp all vector values > 0.2 to 0.2.
• Robust to “non-linear illumination effects”
• Image value saturation / specular highlights
– Renormalize
27
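Step 6 (normalize, clamp at 0.2, renormalize) is easy to express directly. A minimal sketch assuming a NumPy float vector; the function name and the epsilon guarding division by zero are assumptions.

```python
import numpy as np

def normalize_descriptor(v, clamp=0.2):
    # Normalize to unit length (robust to multiplicative changes I -> a*I)
    v = v / (np.linalg.norm(v) + 1e-10)
    # Clamp large values (robust to saturation / specular highlights)
    v = np.minimum(v, clamp)
    # Renormalize back to unit length
    return v / (np.linalg.norm(v) + 1e-10)
```

Note that after the final renormalization individual entries may again exceed 0.2; the clamp only limits the *relative* influence of large gradients.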
Specular highlights move between image pairs!

28
Efficient Implementation
• Filter using oriented kernels based on
directions of histogram bins.

• Use low-level mathematical operations to bin
values; exploit rounding to integers.

29
30
SIFT-like descriptor in HW 2
• SIFT is hand designed based on intuition
• You implement your own SIFT-like descriptor
– Ignore scale/orientation to start.

• Parameters: stick with defaults + minor tweaks


• Feel free to look at papers / resources for
inspiration

Lowe’s original paper: http://www.cs.ubc.ca/~lowe/papers/ijcv04.pdf


31
Review: Local Descriptors
• Most features can be thought of as templates,
histograms (counts), or combinations
• The ideal descriptor should be
– Robust and Distinctive
– Compact and Efficient
• Most available descriptors focus on
edge/gradient information
– Capture texture information
– Color rarely used

32
Next Class: Feature Matching, and Light & Color

33
Motion illusion, rotating snakes

34
Does it work on other animals?

35
Feature Matching
Read Szeliski 4.1

Many slides from James Hays, Derek Hoiem, and Grauman&Leibe 2008 AAAI Tutorial
Local features: main components
1) Detection:
Find a set of distinctive key points.

2) Description:
Extract feature descriptor around each
interest point as vector.

x1 = [ x1(1), …, xd(1) ]

3) Matching:
x2 = [ x1(2), …, xd(2) ]
Compute distance between feature
vectors to find correspondence.

37
How do we decide which features match?

Distance: 0.34, 0.30, 0.40


38
Distance: 0.61, 1.22
Think-Pair-Share
• Design a feature point matching scheme.
• Two images, I1 and I2

• Two sets X1 and X2 of feature points


– Each feature point x1 has a descriptor x1 = [ x1(1), …, xd(1) ]

• Distance, bijective/injective/surjective, noise,


confidence, computational complexity,
generality…
39
Euclidean distance vs. Cosine Similarity
• Euclidean distance:
  d(x1, x2) = sqrt( Σi (x1(i) − x2(i))² )

• Cosine similarity:
  cos(θ) = (x1 · x2) / (‖x1‖ ‖x2‖)
40
Wikipedia
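Both measures are one-liners over descriptor vectors; a minimal sketch with NumPy (function names are assumptions):

```python
import numpy as np

def euclidean_dist(x1, x2):
    # Length of the difference vector
    return np.linalg.norm(x1 - x2)

def cosine_similarity(x1, x2):
    # Dot product normalized by both vector lengths: 1 = same direction
    return np.dot(x1, x2) / (np.linalg.norm(x1) * np.linalg.norm(x2))
```

Note the difference in behavior: scaling a vector changes its Euclidean distance to others but leaves its cosine similarity unchanged.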
Feature Matching
• Criteria 1:
– Compute distance in feature space, e.g., Euclidean
distance between 128-dim SIFT descriptors
– Match point to lowest distance (nearest neighbor)

• Problems:
– Does everything have a match?

41
Feature Matching
• Criteria 2:
– Compute distance in feature space, e.g., Euclidean
distance between 128-dim SIFT descriptors
– Match point to lowest distance (nearest neighbor)
– Ignore anything higher than threshold (no match!)

• Problems:
– Threshold is hard to pick
– Non-distinctive features could have lots of close
matches, only one of which is correct
42
Nearest Neighbor Distance Ratio

Compare distance of closest (NN1) and second-closest (NN2) feature vector neighbor.

• If NN1 ≈ NN2, the ratio NN1/NN2 will be ≈ 1 -> matches too close.
• As NN1 << NN2, the ratio NN1/NN2 tends to 0.

Sorting by this ratio puts matches in order of confidence.


Threshold ratio – but how to choose?
43
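The ratio test above can be sketched as follows. This naive per-point loop is for clarity only; the 0.8 default follows Lowe's suggested threshold, while the function name and return format are assumptions.

```python
import numpy as np

def nndr_matches(desc1, desc2, ratio=0.8):
    """For each descriptor in desc1, find its two nearest neighbors in
    desc2 and keep the match only if NN1 < ratio * NN2."""
    matches = []
    for i, d in enumerate(desc1):
        dists = np.linalg.norm(desc2 - d, axis=1)
        nn = np.argsort(dists)[:2]           # indices of NN1, NN2
        if dists[nn[0]] < ratio * dists[nn[1]]:
            matches.append((i, nn[0], dists[nn[0]] / (dists[nn[1]] + 1e-10)))
    # Sorting by the ratio orders matches by confidence
    return sorted(matches, key=lambda m: m[2])
```

A distinctive match (far-away second neighbor) survives; an ambiguous one (two near-equal neighbors) is rejected.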
Nearest Neighbor Distance Ratio
• Lowe computed probability distribution functions of ratios
• 40,000 keypoints with hand-labeled ground truth

Ratio threshold
depends on your
application’s view on
the trade-off between
the number of false
positives and true
positives!

44
Lowe IJCV 2004
Efficient compute cost
• Naïve looping: Expensive

• Operate on matrices of descriptors


• E.g., for row vectors,

  features_image1 * features_image2^T

produces a matrix of dot product results
for all pairs of features.

45
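The matrix trick above extends to full Euclidean distances via the identity ‖a − b‖² = ‖a‖² + ‖b‖² − 2 a·b, so a single matrix multiply replaces the naive double loop. A sketch (function name is an assumption):

```python
import numpy as np

def pairwise_sq_dists(F1, F2):
    """Squared Euclidean distances between all rows of F1 (n1 x d)
    and F2 (n2 x d), using ||a-b||^2 = ||a||^2 + ||b||^2 - 2 a.b"""
    sq1 = np.sum(F1 ** 2, axis=1)[:, None]   # column of row norms of F1
    sq2 = np.sum(F2 ** 2, axis=1)[None, :]   # row of row norms of F2
    # F1 @ F2.T is the matrix of all pairwise dot products
    return np.maximum(sq1 + sq2 - 2.0 * F1 @ F2.T, 0.0)
```

The `np.maximum(..., 0)` clamp guards against tiny negative values from floating-point round-off.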
HOW GOOD IS SIFT?

46
SIFT Repeatability

47
Lowe IJCV 2004
SIFT Repeatability

48
Lowe IJCV 2004
SIFT Repeatability

49
Lowe IJCV 2004
SIFT Repeatability

50
Lowe IJCV 2004
Why is SIFT so good?
Tries to be invariant to:
• Translation [Harris]
• Rotation [finding major orientation]
• Scale [computed at many scales]

• Brightness changes [gradient space]


• Outlier brightnesses [clipping at 0.2]

• Local texture variation within 4x4 [binning]

51
Local Descriptors: SURF
• Fast approximation of SIFT idea
  – Efficient computation by 2D box filters & integral images
  ⇒ 6 times faster than SIFT
  – Equivalent quality for object identification

• GPU implementation available
  – Feature extraction @ 200 Hz (detector + descriptor, 640×480 image)
  – http://www.vision.ee.ethz.ch/~surf

[Bay, ECCV’06], [Cornelis, CVGPU’08] 52


Local Descriptors: Shape Context

Count the number of points
inside each bin, e.g.:

Count = 4

...
Count = 10

Log-polar binning:
More precision for nearby
points, more flexibility for
farther points.

Belongie & Malik, ICCV 2001 53


K. Grauman, B. Leibe
Shape Context Descriptor

54
Local Descriptors: Geometric Blur

Compute
edges at four
orientations
Extract a patch
in each channel

Apply spatially varying
blur and sub-sample
Example descriptor
(Idealized signal)

Berg & Malik, CVPR 2001 55


K. Grauman, B. Leibe
Self-similarity Descriptor

Matching Local Self-Similarities across Images
and Videos, Shechtman and Irani, 2007
56
James Hays
Learning Local Image Descriptors
Winder and Brown, 2007

59
Right features are application specific
• Shape: scene-scale, object-scale, detail-scale
– 2D form, shading, shadows, texture, linear
perspective
• Material properties: albedo, feel, hardness, …
– Color, texture
• Motion
– Optical flow, tracked points
• Distance
– Stereo, position, occlusion, scene shape
– If known object: size, other objects

60
Available at a web site near you…
• Many local feature detectors have executables
available online:
– http://www.robots.ox.ac.uk/~vgg/research/affine
– http://www.cs.ubc.ca/~lowe/keypoints/
– http://www.vision.ee.ethz.ch/~surf

61
Review: Interest points

• Keypoint detection: repeatable


and distinctive
– Corners, blobs, stable regions
– Harris, DoG

• Descriptors: robust and selective


– Spatial histograms of orientation
– SIFT

62
Recap so far
• What is an image?
– Sampling and aliasing
– Thinking in frequency
• Image filtering
– Kernels and their responses
– Kernel orientation
– Kernel sizes and scale spaces
• Local features
– Detection by finding corners via peaks
– Description by hand-coding local ‘texture’
– Matching robustly 63
Feature matching can be used for fingerprints

Fingerprints have more specific features, but our general approach of
feature point detection, description, and matching remains the same.

[Figure: Arches, Loops, Whorls]
64


Biometric Data: Definitions

European GDPR
“Personal data resulting from specific technical processing relating to the physical,
physiological, or behavioral characteristics of a natural person, which allows or confirms
the unique identification of that natural person, such as facial images or fingerprint data.”

California Consumer Privacy Act


CCPA: “An individual's physiological, biological or behavioral characteristics, including an
individual's D.N.A., that can be used, singly or in combination with each other or with
other identifying data, to establish individual identity.”

65
Biometric Data + Privacy

Possible rights:
- Right of disclosure or access
- Right to be forgotten
- Data portability – data must be in a commonly used and readable format
- Right to request that businesses not sell their personal information
- Right to opt-out
(Opt-in is the primary consent standard mandated by European GDPR)
- Right of action (penalties).
66
Biometric Data + Consent

[Cathal McNaughton/REUTERS] [Munir Uz Zaman/AFP/Getty Images]

Rohingya people (majority Muslim) live in Myanmar (majority Buddhist). Tensions flare in 2016,
leading to genocide and mass exodus (1m+) of Rohingya as refugees to neighboring countries
like Bangladesh.
UN Refugee Agency/Bangladesh issue SmartCards containing biometric data to access aid
and services. Bangladesh then shared Rohingya refugee biometric data without explicit
consent with Myanmar for possible repatriation and a risk of future persecution.

Rohingya Crisis in Maps, Al Jazeera, 28th October 2017


https://www.aljazeera.com/news/2017/10/28/rohingya-crisis-explained-in-maps

Human Rights Watch, 15th June 2021 67


https://www.hrw.org/news/2021/06/15/un-shared-rohingya-data-without-informed-consent
Biometric Data + Surveillance

US Department of Homeland Security runs the
Office of Biometric Identity Management, using
the Automated Biometric Identification System
to store, match, and share over 260 million
identities.

Panopticon (Bentham): A circular prison with cells


arranged around a central well, from which prisoners
could always be observed to condition their
behaviour.

Observation = Knowledge = Power

68
https://www.dhs.gov/biometrics [Presidio Modelo, Cuba]
Biometric Data + AI

South Korean government develops “smart epidemiological investigation


system” using AI for policing and to track spread of disease.

Trained using biometric data collected from CCTV camera footage without
people’s consent, and made accessible to private companies.

69
https://www.hani.co.kr/arti/english_edition/e_national/1019531.html
