Materi 08 VK 2122 2
Local features: main components
1) Detection:
Find a set of distinctive key points.
2) Description:
Extract a feature descriptor around each
interest point as a vector.
x1 = [x1(1), …, xd(1)]
x2 = [x1(2), …, xd(2)]
3) Matching:
Compute the distance between feature
vectors to find correspondences.
d(x1, x2) < T
Review: Harris corner detector E(u, v)
Harris Corner Detector [Harris88]
James Hays
0. Input image I. We want to compute M at each pixel.
1. Compute image derivatives Ix, Iy (optionally, blur first).
2. Compute the components of M
as squares of derivatives: Ix², Iy², IxIy.
3. Gaussian filter g() with width σ.
4. Compute cornerness:
C = det(M) − α trace(M)²
= g(Ix²) ∘ g(Iy²) − [g(Ix ∘ Iy)]² − α [g(Ix²) + g(Iy²)]²
5. Threshold on C to pick high cornerness.
6. Non-maxima suppression to pick peaks.
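The steps above can be sketched in NumPy/SciPy. The function names, the default α = 0.05, and the relative threshold and NMS window size are illustrative choices, not the detector's canonical values:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, maximum_filter

def harris_cornerness(img, sigma=1.0, alpha=0.05):
    """Cornerness C = det(M) - alpha * trace(M)^2 at every pixel."""
    Iy, Ix = np.gradient(img.astype(float))      # 1. image derivatives
    Sxx = gaussian_filter(Ix * Ix, sigma)        # 2.-3. squared derivatives,
    Syy = gaussian_filter(Iy * Iy, sigma)        #       Gaussian filtered
    Sxy = gaussian_filter(Ix * Iy, sigma)
    det_M = Sxx * Syy - Sxy ** 2                 # 4. cornerness
    trace_M = Sxx + Syy
    return det_M - alpha * trace_M ** 2

def harris_corners(img, sigma=1.0, alpha=0.05, rel_thresh=0.01, nms=5):
    """Steps 5-6: threshold on C, then keep only local maxima."""
    C = harris_cornerness(img, sigma, alpha)
    peaks = (C == maximum_filter(C, size=nms))   # 6. non-maxima suppression
    return np.argwhere(peaks & (C > rel_thresh * C.max()))  # 5. threshold
```

On a white square against a black background this returns (row, col) positions near the square's four corners.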
Local Image Descriptors
Read Szeliski 4.1
Acknowledgment: Many slides from James Hays, Derek Hoiem and Grauman & Leibe 2008 AAAI Tutorial
Local features: main components
1) Detection:
Find a set of distinctive key points.
2) Description:
Extract a feature descriptor around each
interest point as a vector.
x1 = [x1(1), …, xd(1)]
x2 = [x1(2), …, xd(2)]
3) Matching:
Compute the distance between feature
vectors to find correspondences.
d(x1, x2) < T
Image representations
• Templates
– Intensity, gradients, etc.
• Histograms
– Color, texture, SIFT descriptors, etc.
James Hays
Image Representations: Histograms
[Figure: clustering-based quantization examples]

Chi-squared histogram matching distance:

χ²(h_i, h_j) = ½ Σ_{m=1}^{K} [h_i(m) − h_j(m)]² / [h_i(m) + h_j(m)]

Cars found by color histogram matching using chi-squared. (James Hays)
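The chi-squared distance above can be sketched directly; the small epsilon guarding against empty bins is my addition:

```python
import numpy as np

def chi2_distance(h_i, h_j, eps=1e-10):
    """Chi-squared histogram matching distance:
    0.5 * sum_m (h_i[m] - h_j[m])^2 / (h_i[m] + h_j[m])."""
    h_i = np.asarray(h_i, dtype=float)
    h_j = np.asarray(h_j, dtype=float)
    return 0.5 * float(np.sum((h_i - h_j) ** 2 / (h_i + h_j + eps)))
```

Identical histograms give distance 0; disjoint ones give the total mass they disagree on.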
Histograms: Implementation issues
• Quantization
– Grids: fast but applicable only with few dimensions
– Clustering: slower but can quantize data in higher
dimensions
• Matching
– Histogram intersection or Euclidean may be faster
– Chi-squared often works better
– Earth mover’s distance is good for when nearby bins
represent similar values
James Hays
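The grid option above can be sketched as a joint RGB histogram; the function name, the 4-bins-per-channel default, and the assumption of color values in [0, 1] are illustrative choices:

```python
import numpy as np

def grid_color_histogram(pixels, bins_per_channel=4):
    """Quantize RGB pixels (N x 3 array, values in [0, 1]) on a regular
    grid and return a normalized joint color histogram."""
    pixels = np.asarray(pixels, dtype=float)
    # Grid cell index per channel; values of exactly 1.0 map to the top bin.
    idx = np.minimum((pixels * bins_per_channel).astype(int),
                     bins_per_channel - 1)
    # Flatten the 3D cell index into a single bin number.
    flat = (idx[:, 0] * bins_per_channel + idx[:, 1]) * bins_per_channel \
           + idx[:, 2]
    hist = np.bincount(flat, minlength=bins_per_channel ** 3).astype(float)
    return hist / hist.sum()
```

This is the "fast but few dimensions" regime: the bin count grows as bins_per_channel³, which is why clustering takes over in higher dimensions.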
For what things might we compute histograms?
• Color
• Key algorithm:
SIFT: Scale Invariant Feature Transform
– David Lowe, IJCV 2004
– Effective: landmark algorithm in computer vision
– Extremely popular (63k citations in 2021)
James Hays
SIFT: Scale-Invariant Feature Transform
SIFT descriptor formation
• Compute on a local 16 x 16 window around each detection.
• Rotate and scale the window according to the discovered
orientation θ and scale σ (to gain invariance).
• Compute gradients weighted by a Gaussian with σ equal to
half the window width (for smooth falloff).
James Hays
Ensure smoothness
• Gaussian weight the magnitudes
• Trilinear interpolation
– A given gradient contributes to 8 bins:
4 in space x 2 in orientation
James Hays
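The orientation half of that trilinear scheme can be sketched for a single gradient sample, splitting its magnitude across the 2 nearest of the 8 orientation bins; the spatial 4-way split works the same way along x and y. The function name and bin layout are my assumptions:

```python
import numpy as np

def soft_orientation_vote(theta, magnitude, n_bins=8):
    """Split one gradient's magnitude linearly between the two nearest
    orientation bins (the orientation axis of trilinear interpolation)."""
    hist = np.zeros(n_bins)
    # Continuous bin position; bin centers sit at (k + 0.5) * 2*pi / n_bins.
    pos = (theta % (2 * np.pi)) / (2 * np.pi) * n_bins - 0.5
    b0 = int(np.floor(pos)) % n_bins
    b1 = (b0 + 1) % n_bins          # orientation wraps around
    frac = pos - np.floor(pos)
    hist[b0] += magnitude * (1 - frac)
    hist[b1] += magnitude * frac
    return hist
```

The total magnitude is conserved, so small shifts in orientation move votes smoothly between bins instead of jumping.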
Reduce effect of illumination
• 128-dim vector normalized to 1
• Threshold gradient magnitudes to avoid excessive
influence of high gradients
– After normalization, clamp gradients > 0.2
– Renormalize
(The figure shows only a 2x2 grid; the real descriptor uses 4x4.)
James Hays
SIFT steps
1. Find Harris-corner scale-space extrema as key
point locations
– Alternative: Use Difference of Gaussian points
(see hidden slides in previous slide deck)
2. Post-processing
– Find subpixel key point positions by interpolation
– Discard low-contrast points
– Eliminate points along edges
SIFT Orientation Estimation
• Compute a gradient orientation histogram over [0, 2π)
• Select the dominant orientation θ
[Lowe, SIFT, 1999]
SIFT Orientation Normalization
• Compute the orientation histogram over [0, 2π)
• Select the dominant orientation θ
• Normalize: rotate to a fixed orientation
[Lowe, SIFT, 1999]
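The orientation step can be sketched as below, assuming a 36-bin histogram as in Lowe's paper; the peak-interpolation refinement and the handling of secondary peaks are omitted, and the function name is mine:

```python
import numpy as np

def dominant_orientation(angles, magnitudes, n_bins=36):
    """Histogram gradient orientations over [0, 2*pi), weighted by
    gradient magnitude, and return the center of the peak bin."""
    hist, edges = np.histogram(np.asarray(angles) % (2 * np.pi),
                               bins=n_bins, range=(0, 2 * np.pi),
                               weights=magnitudes)
    k = int(np.argmax(hist))
    return 0.5 * (edges[k] + edges[k + 1])   # peak bin center
```

Subtracting this angle from every gradient in the patch is what "rotate to a fixed orientation" means in practice.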
SIFT steps
4. Descriptor extraction based on texture
– Texture helps describe area around key point
– But texture will vary as camera view changes
– Solution: Balance sensitivity to spatial layout of
texture with area statistics
– Use spatial blocks of histograms to provide this.
SIFT Descriptor Extraction
• Given a keypoint with scale and orientation:
– Pick scale-space image which most closely matches
estimated scale
– Resample image to match orientation OR
– Subtract detector orientation from vector to give
invariance to general image rotation.
[Figure: gradient magnitude and orientation are computed per pixel
using horizontal + vertical edge filters plus trigonometry; each cell
accumulates the magnitudes into an 8-bin orientation 'histogram'.]
Utkarsh Sinha
SIFT Descriptor Extraction
Weight the 16x16 grid by a Gaussian (σ = half the window width) to add
location robustness and reduce the effect of outer regions.
Utkarsh Sinha
SIFT Descriptor Extraction
5. Extract 8 x 16 values into 128-dim vector
– Simply copy them in sequence!
6. Illumination invariance:
– Working in gradient space, so robust to I = I + b
– Normalize the vector to unit length
• Robust to I = αI brightness changes
– Clamp all vector values > 0.2 to 0.2.
• Robust to “non-linear illumination effects”
• Image value saturation / specular highlights
– Renormalize
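Step 6 (normalize, clamp at 0.2, renormalize) can be sketched as a minimal pipeline; the epsilon guarding against division by zero is my addition:

```python
import numpy as np

def normalize_descriptor(v, clamp=0.2):
    """SIFT-style illumination normalization: unit-normalize, clamp
    large components, renormalize."""
    v = np.asarray(v, dtype=float)
    v = v / (np.linalg.norm(v) + 1e-12)   # robust to I = alpha * I
    v = np.minimum(v, clamp)              # damp saturation / speculars
    return v / (np.linalg.norm(v) + 1e-12)
```

Note the clamp deliberately throws away how much larger the big gradients were, keeping only that they were large; the renormalization restores unit length.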
Specular highlights move between image pairs!
Efficient Implementation
• Filter using oriented kernels based on
directions of histogram bins.
SIFT-like descriptor in HW 2
• SIFT is hand designed based on intuition
• You implement your own SIFT-like descriptor
– Ignore scale/orientation to start.
Next Class: Feature Matching, and Light & Color
Motion illusion, rotating snakes
Does it work on other animals?
Feature Matching
Read Szeliski 4.1
Many slides from James Hays, Derek Hoiem, and Grauman&Leibe 2008 AAAI Tutorial
Local features: main components
1) Detection:
Find a set of distinctive key points.
2) Description:
Extract a feature descriptor around each
interest point as a vector.
x1 = [x1(1), …, xd(1)]
x2 = [x1(2), …, xd(2)]
3) Matching:
Compute the distance between feature
vectors to find correspondences.
How do we decide which features match?
• Cosine similarity: cos θ = (x1 · x2) / (‖x1‖ ‖x2‖)
Wikipedia
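The formula above translates directly to code; the function name is mine:

```python
import numpy as np

def cosine_similarity(a, b):
    """cos(theta) = (a . b) / (||a|| ||b||); 1 means same direction,
    0 means orthogonal descriptors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
```

For unit-normalized descriptors (like SIFT's), ranking by cosine similarity and ranking by Euclidean distance give the same ordering.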
Feature Matching
• Criteria 1:
– Compute distance in feature space, e.g., Euclidean
distance between 128-dim SIFT descriptors
– Match point to lowest distance (nearest neighbor)
• Problems:
– Does everything have a match?
Feature Matching
• Criteria 2:
– Compute distance in feature space, e.g., Euclidean
distance between 128-dim SIFT descriptors
– Match point to lowest distance (nearest neighbor)
– Ignore anything higher than threshold (no match!)
• Problems:
– Threshold is hard to pick
– Non-distinctive features could have lots of close
matches, only one of which is correct
Nearest Neighbor Distance Ratio
NNDR = d(NN1) / d(NN2)

• If d(NN1) ≈ d(NN2), the ratio ≈ 1 -> the two candidate matches are
too close to call.
• As d(NN1) << d(NN2), the ratio tends to 0.

The ratio threshold depends on your application's view of the
trade-off between the number of false positives and true positives!
Lowe IJCV 2004
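The ratio test can be sketched as below; the function name and the 0.8 default (one commonly used value) are illustrative, and desc2 is assumed to contain at least two descriptors:

```python
import numpy as np

def nndr_match(desc1, desc2, ratio_thresh=0.8):
    """For each descriptor in desc1, find its two nearest neighbours in
    desc2 and keep the match only if d(NN1)/d(NN2) < ratio_thresh."""
    matches = []
    for i, d in enumerate(desc1):
        dists = np.linalg.norm(desc2 - d, axis=1)
        nn1, nn2 = np.argsort(dists)[:2]   # indices of two closest
        if dists[nn1] / (dists[nn2] + 1e-12) < ratio_thresh:
            matches.append((i, int(nn1)))  # distinctive -> keep
    return matches
```

A non-distinctive feature has several near-equidistant neighbours, so its ratio sits near 1 and the match is discarded.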
Efficient compute cost
• Naïve looping over all descriptor pairs is expensive.
• Better: a single matrix product, features_image1 × features_image2ᵀ,
computes all pairwise dot products at once.
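One way to turn that matrix product into all pairwise squared distances uses the identity ‖a − b‖² = ‖a‖² − 2 a·b + ‖b‖²; the function name is mine:

```python
import numpy as np

def pairwise_sq_dists(A, B):
    """All squared Euclidean distances between rows of A (n x d) and
    rows of B (m x d). The only O(n*m*d) work is one matrix multiply."""
    aa = np.sum(A * A, axis=1)[:, None]    # n x 1 column of ||a||^2
    bb = np.sum(B * B, axis=1)[None, :]    # 1 x m row of ||b||^2
    # Clamp at 0 to absorb tiny negative values from floating-point error.
    return np.maximum(aa - 2.0 * A @ B.T + bb, 0.0)
```

The result is an n x m distance matrix; nearest neighbors and NNDR ratios then come from sorting its rows.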
HOW GOOD IS SIFT?
SIFT Repeatability
Lowe IJCV 2004
Why is SIFT so good?
Tries to be invariant to:
• Translation [Harris]
• Rotation [finding major orientation]
• Scale [computed at many scales]
Local Descriptors: SURF
• Fast approximation of the SIFT idea
– Efficient computation via 2D box filters & integral images
⇒ ~6 times faster than SIFT
– Equivalent quality for object identification
[Figure: box sums computed with an integral image]
• Log-polar binning: more precision for nearby points,
more flexibility for farther points.
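The integral-image trick behind SURF's box filters can be sketched as follows: after one O(n) pass, the sum over any rectangle costs four table look-ups regardless of the box size. Function names are mine:

```python
import numpy as np

def integral_image(img):
    """Summed-area table with a zero border: ii[y, x] = img[:y, :x].sum()."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1))
    ii[1:, 1:] = np.cumsum(np.cumsum(img, axis=0), axis=1)
    return ii

def box_sum(ii, y0, x0, y1, x1):
    """Sum of img[y0:y1, x0:x1] from 4 look-ups, for any box size."""
    return ii[y1, x1] - ii[y0, x1] - ii[y1, x0] + ii[y0, x0]
```

This constant cost per box is what makes large box-filter responses as cheap as small ones.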
Local Descriptors: Geometric Blur
1. Compute edges at four orientations
2. Extract a patch in each channel
3. Apply spatially varying blur and sub-sample
[Example descriptor shown for an idealized signal]
The right features are application-specific
• Shape: scene-scale, object-scale, detail-scale
– 2D form, shading, shadows, texture, linear
perspective
• Material properties: albedo, feel, hardness, …
– Color, texture
• Motion
– Optical flow, tracked points
• Distance
– Stereo, position, occlusion, scene shape
– If known object: size, other objects
Available at a web site near you…
• Many local feature detectors have executables
available online:
– http://www.robots.ox.ac.uk/~vgg/research/affine
– http://www.cs.ubc.ca/~lowe/keypoints/
– http://www.vision.ee.ethz.ch/~surf
Review: Interest points
Recap so far
• What is an image?
– Sampling and aliasing
– Thinking in frequency
• Image filtering
– Kernels and their responses
– Kernel orientation
– Kernel sizes and scale spaces
• Local features
– Detection by finding corners via peaks
– Description by hand-coding local ‘texture’
– Matching robustly
Feature matching can be used for fingerprints
European GDPR
“Personal data resulting from specific technical processing relating to the physical,
physiological, or behavioral characteristics of a natural person, which allows or confirms
the unique identification of that natural person, such as facial images or fingerprint data.”
Biometric Data + Privacy
Possible rights:
- Right of disclosure or access
- Right to be forgotten
- Data portability – data must be in a commonly used and readable format
- Right to request that a business not sell their personal information
- Right to opt-out
(Opt-in is the primary consent standard mandated by European GDPR)
- Right of action (penalties).
Biometric Data + Consent
Rohingya people (majority Muslim) live in Myanmar (majority Buddhist). Tensions flare in 2016,
leading to genocide and mass exodus (1m+) of Rohingya as refugees to neighboring countries
like Bangladesh.
The UN Refugee Agency and Bangladesh issued SmartCards containing biometric data,
required to access aid and services. Bangladesh then shared Rohingya refugees'
biometric data with Myanmar without their explicit consent, for possible
repatriation, creating a risk of future persecution.
https://www.dhs.gov/biometrics [Presidio Modelo, Cuba]
Biometric Data + AI
An AI system trained using biometric data collected from CCTV camera
footage without people's consent, and made accessible to private companies.
https://www.hani.co.kr/arti/english_edition/e_national/1019531.html