
21AI601 – COMPUTER VISION

UNIT II & LP 8 – ORIENTATION HISTOGRAM, SIFT, SURF, HOG, GLOH, SCALE-SPACE ANALYSIS

1. FEATURE DESCRIPTORS
1.1 Scale Invariant Feature Transform (SIFT)
• Step 1: Scale-space Extrema Detection – Detect interest points (invariant to scale and
orientation) using the Difference of Gaussians (DoG).
• Step 2: Keypoint Localization – Determine the location and scale at each candidate location, and
select keypoints based on stability.
• Step 3: Orientation Estimation – Use local image gradients to assign an orientation to each
localized keypoint. Preserve theta, scale and location for each feature.
• Step 4: Keypoint Descriptor – Extract local image gradients at the selected scale around each
keypoint and form a representation invariant to local shape distortion and illumination.
Step 1: Detect interesting points using DoG.
Step 2: Accurate keypoint localization – Aim: reject low-contrast points and points that lie on an
edge.
Low contrast points elimination:
Fit the keypoint to nearby data using a quadratic approximation; points whose fitted extremum has
low contrast are rejected.


Eliminating edge response:


The DoG gives strong responses along edges – eliminate those responses.
Solution: check the “cornerness” of each keypoint (a sketch of this test follows).
• On an edge, one of the principal curvatures is much bigger than the other.
• High cornerness – no dominant principal curvature component.
• Consider the concept of the Hessian matrix and the Harris corner measure.
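A minimal sketch of the edge-response test, in the spirit of Lowe's principal-curvature ratio check
(the threshold r = 10 matches the SIFT paper; the `dog` array, the coordinate convention and the
finite-difference Hessian are assumptions of this sketch):

```python
import numpy as np

def passes_edge_test(dog, x, y, r=10.0):
    """Reject edge-like DoG keypoints via the principal-curvature ratio.
    `dog` is one DoG level as a 2-D float array; (x, y) is an interior
    candidate keypoint given as (row, col)."""
    # 2x2 Hessian of the DoG at the keypoint, via finite differences.
    dxx = dog[x + 1, y] - 2 * dog[x, y] + dog[x - 1, y]
    dyy = dog[x, y + 1] - 2 * dog[x, y] + dog[x, y - 1]
    dxy = (dog[x + 1, y + 1] - dog[x + 1, y - 1]
           - dog[x - 1, y + 1] + dog[x - 1, y - 1]) / 4.0
    tr, det = dxx + dyy, dxx * dyy - dxy * dxy
    if det <= 0:          # curvatures of opposite sign: not a stable extremum
        return False
    # On an edge one principal curvature dominates, making Tr^2/Det large.
    return tr * tr / det < (r + 1) ** 2 / r
```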

Step 3: Orientation Assignment –
Aim: Assign a consistent orientation to each keypoint based on local image properties to obtain
rotational invariance.
The gradient magnitude and orientation of an image patch I(x, y) at a particular scale are:
m(x, y) = sqrt( (I(x+1, y) − I(x−1, y))^2 + (I(x, y+1) − I(x, y−1))^2 )
θ(x, y) = atan2( I(x, y+1) − I(x, y−1), I(x+1, y) − I(x−1, y) )

• Create a weighted (magnitude × Gaussian) histogram of local gradient directions computed at the
selected scale.
• Assign the dominant orientation of the region as the peak of the smoothed histogram.
• For multiple peaks, create multiple keypoints.
We have now obtained a precise location, scale and orientation for each keypoint; a sketch of the
orientation histogram follows.
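A minimal sketch of the orientation-histogram step, assuming precomputed gradient magnitude `mag`
and orientation `ori` arrays at the keypoint's scale. The 36-bin histogram, the 1.5σ Gaussian
weighting and the 0.8 peak ratio follow Lowe's description; thresholding bins directly (rather than
finding true local peaks and interpolating them) is a simplification of this sketch:

```python
import numpy as np

def dominant_orientations(mag, ori, x, y, sigma, n_bins=36, peak_ratio=0.8):
    """Gaussian-weighted, magnitude-weighted orientation histogram
    around keypoint (x, y); returns candidate dominant orientations."""
    sigma_w = 1.5 * sigma
    radius = int(round(3 * sigma_w))
    hist = np.zeros(n_bins)
    for i in range(-radius, radius + 1):
        for j in range(-radius, radius + 1):
            r, c = x + i, y + j
            if not (0 <= r < mag.shape[0] and 0 <= c < mag.shape[1]):
                continue
            w = np.exp(-(i * i + j * j) / (2 * sigma_w ** 2))
            b = int(n_bins * (ori[r, c] % (2 * np.pi)) / (2 * np.pi)) % n_bins
            hist[b] += w * mag[r, c]          # magnitude x Gaussian weight
    if hist.max() == 0:
        return []
    # Every bin within 80% of the maximum spawns its own keypoint.
    return [2 * np.pi * b / n_bins for b in range(n_bins)
            if hist[b] >= peak_ratio * hist.max()]
```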
Step 4: Local image descriptor
Aim – Obtain a local descriptor that is highly distinctive yet invariant to variations such as
illumination and affine change.
• Consider a 16×16 rectangular grid oriented along the dominant orientation of the region.
• Divide the region into 4×4 sub-regions.
• Apply a Gaussian weighting window over the region, giving higher weight to pixels closer to the
center of the descriptor (σ is half the window size).
• Create an 8-bin gradient histogram for each sub-region, weighted by gradient magnitude and the
Gaussian window.
Finally, normalize the resulting 4×4×8 = 128-dimensional vector to make it invariant to
illumination changes.
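A minimal usage sketch of the full pipeline, assuming opencv-python ≥ 4.4 (where SIFT ships in the
main package); the filename 'img.png' is a placeholder:

```python
import cv2

# Detect keypoints and compute their 128-D descriptors in one call.
img = cv2.imread("img.png", cv2.IMREAD_GRAYSCALE)
sift = cv2.SIFT_create()
keypoints, descriptors = sift.detectAndCompute(img, None)
# Each keypoint carries location, scale and orientation; each descriptor
# row is the normalized 128-dimensional vector described above.
print(len(keypoints), descriptors.shape)   # e.g. (N, 128)
```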
1.1.1 Applications of SIFT
• Object recognition
• Object categorization
• Location recognition
• Robot localization
• Image retrieval
• Image panoramas

1.2 Speeded-Up Robust Features (SURF)


SURF speeds up computation by fast approximation of
(i) the Hessian matrix and
(ii) the descriptor, using “integral images”.
Integral Image
• The integral image IΣ(x, y) of an image I(x, y) represents the sum of all pixels of I inside the
rectangular region formed by (0, 0) and (x, y):
IΣ(x, y) = Σ_{i=0}^{x} Σ_{j=0}^{y} I(i, j)
• Using integral images, it takes only four array references to calculate the sum of pixels over a
rectangular region of any size (see the sketch below).
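A minimal NumPy sketch of the integral image and the four-reference box sum; the border guards mean
fewer than four references are needed when the region touches the image edge:

```python
import numpy as np

def integral_image(img):
    """Cumulative sum over rows then columns gives I_sigma(x, y)."""
    return np.cumsum(np.cumsum(img.astype(np.float64), axis=0), axis=1)

def box_sum(ii, r0, c0, r1, c1):
    """Sum of img[r0:r1+1, c0:c1+1] from at most four array references."""
    s = ii[r1, c1]
    if r0 > 0:
        s -= ii[r0 - 1, c1]          # strip above the region
    if c0 > 0:
        s -= ii[r1, c0 - 1]          # strip left of the region
    if r0 > 0 and c0 > 0:
        s += ii[r0 - 1, c0 - 1]      # corner subtracted twice, add back
    return s
```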

Approximate Lxx, Lyy, and Lxy using box filters.


(The box filters are 9×9 – good approximations of the Gaussian second-order derivatives with
σ = 1.2.)
These box-filter responses can be computed very quickly using integral images.
In SIFT, images are repeatedly smoothed with a Gaussian and subsequently subsampled in order to
reach higher levels of the pyramid.
Alternatively, we can apply filters of increasing size to the original image.
Because of integral images, filters of any size can be applied at exactly the same speed!

Instead of using different measures for selecting the location and scale of interest points
(e.g., the Hessian and DoG in SIFT), SURF uses the determinant of the approximated Hessian matrix
to find both.
Once interest points have been localized both in space and scale, the next steps are: (1)
Orientation assignment (2) Keypoint descriptor.
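A sketch of SURF's interest measure under the box-filter approximation; `dxx`, `dyy` and `dxy`
stand for the box-filter responses above, and the weight w ≈ 0.9 is the balancing factor used in
the SURF paper:

```python
def surf_hessian_determinant(dxx, dyy, dxy, w=0.9):
    """Approximated Hessian determinant: det(H) ~= Dxx*Dyy - (w*Dxy)^2.
    Dxx, Dyy, Dxy are box-filter responses computed via the integral
    image; w compensates for the box-filter approximation error."""
    return dxx * dyy - (w * dxy) ** 2
```

Interest points are then taken as local maxima of this measure over both image position and filter
scale.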
1.3 Histograms of Oriented Gradients
Local object appearance and shape can often be characterized rather well by the
distribution of local intensity gradients or edge directions, even without precise knowledge of
the corresponding gradient or edge positions. In practice this is implemented by dividing the
image window into small spatial regions (“cells”), for each cell accumulating a local 1-D
histogram of gradient directions or edge orientations over the pixels of the cell. The combined
histogram entries form the representation. For better invariance to illumination, shadowing,
etc., it is also useful to contrast-normalize the local responses before using them. This can be
done by accumulating a measure of local histogram “energy” over somewhat larger spatial
regions (“blocks”) and using the results to normalize all of the cells in the block. We will refer
to the normalized descriptor blocks as Histogram of Oriented Gradient (HOG) descriptors.
Gradient.
Approximate the two components Ix and Iy of the gradient of I by central differences:
Ix(r, c) = I(r, c+1) − I(r, c−1) and Iy(r, c) = I(r−1, c) − I(r+1, c)
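A small sketch of these central differences; edge padding at the image border is an assumption
here, as implementations differ:

```python
import numpy as np

def hog_gradients(img):
    """Central-difference gradient components as defined above."""
    I = np.pad(img.astype(np.float64), 1, mode="edge")
    Ix = I[1:-1, 2:] - I[1:-1, :-2]    # I(r, c+1) - I(r, c-1)
    Iy = I[:-2, 1:-1] - I[2:, 1:-1]    # I(r-1, c) - I(r+1, c)
    return Ix, Iy
```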
Cell Orientation Histograms.
Divide the window into adjacent, non-overlapping cells of size C × C pixels (C = 8). In each cell,
compute a histogram of the gradient orientations binned into B bins (B = 9). With so few bins, a pixel
whose orientation is close to a bin boundary might end up contributing to a different bin, were the
image to change slightly. To prevent these quantization artifacts, each pixel in a cell contributes to
the two adjacent bins (modulo B) a fraction of the pixel’s gradient magnitude that decreases linearly
with the distance of that pixel’s gradient orientation from the two bin centers, as sketched below.
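A minimal sketch of this soft binning for one cell, assuming unsigned gradient orientations in
[0, 180) degrees (a common HOG convention, not stated in the text above):

```python
import numpy as np

def cell_histogram(mag, ang, B=9):
    """Soft-binned orientation histogram for one C x C cell.
    `mag`, `ang`: gradient magnitude and unsigned orientation (degrees)
    for the cell's pixels, as 2-D arrays of the same shape."""
    hist = np.zeros(B)
    bin_width = 180.0 / B
    # Continuous bin coordinate; the 0.5 shift centers orientations on bins.
    f = ang.ravel() / bin_width - 0.5
    lo = np.floor(f).astype(int)
    frac = f - lo                       # distance past the lower bin center
    for m, b, t in zip(mag.ravel(), lo, frac):
        hist[b % B] += m * (1 - t)      # linear vote split between the
        hist[(b + 1) % B] += m * t      # two nearest bins (modulo B)
    return hist
```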

1.4 Gradient Location and Orientation Histogram (GLOH):


First 3 steps – same as SIFT.
Step 4 – Local image descriptor:
• Consider a log-polar location grid with 3 different radii and 8 angular directions for the two
outer radii, giving 17 location bins in total.
• Form a histogram of gradient orientations with 16 bins in each location bin.
• This yields a feature vector of 272 dimensions (17 × 16).
• Perform dimensionality reduction (PCA) to project the features to a 128-dimensional space (a
binning sketch follows).
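A sketch of the log-polar location binning, assuming a pixel offset (dx, dy) from the keypoint
center; the radii values here are illustrative placeholders, not taken from this text:

```python
import numpy as np

def gloh_location_bin(dx, dy, r1=6.0, r2=11.0, r3=15.0):
    """Map a pixel offset to one of GLOH's 17 log-polar location bins,
    or None if the pixel falls outside the descriptor support."""
    r = np.hypot(dx, dy)
    if r >= r3:
        return None                  # outside the outermost radius
    if r < r1:
        return 0                     # single central bin (no angular split)
    # 8 angular sectors for each of the two outer rings -> 1 + 8 + 8 = 17.
    sector = int(((np.arctan2(dy, dx) % (2 * np.pi)) / (2 * np.pi)) * 8) % 8
    ring = 1 if r < r2 else 2
    return 1 + (ring - 1) * 8 + sector
```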
