Professional Documents
Culture Documents
Feature Extraction: Dr. Rafiqul Islam Professor, Department of CSE
Feature Extraction: Dr. Rafiqul Islam Professor, Department of CSE
• STEP-01: Preprocess the image and bring down the width to height ratio to 1:2. The image size
should preferably be 64 x 128. This is because we will be dividing the image into 8*8 and 16*16 patches
to extract the features. Having the specified size (64 x 128) will make all our calculations pretty simple
HOG
• Let’s apply the Pythagoras theorem to calculate the total gradient magnitude:
• Total Gradient Magnitude = √[(Gx)2+(Gy)2]
• Total Gradient Magnitude = √[(11)2+(8)2] = 13.6
• Next, calculate the orientation (or direction) for the same pixel.
• tan(Φ) = Gy / Gx
• Hence, the value of the angle would be:
• Φ = atan(Gy / Gx)
• The orientation comes out to be 36 when we plug in the values. So now, for every pixel value, have the total gradient
(magnitude) and the orientation (direction).
• Generate the histogram using these gradients and orientations.
HOG
• Keypoint Localization
• Once the images have been created, the next step is to find the important keypoints from the
image that can be used for feature matching. The idea is to find the local maxima and minima
for the images.
• This part is divided into two steps:
• Find the local maxima and minima
• Remove low contrast keypoints (keypoint selection)
• Local Maxima and Local Minima
• To locate the local maxima and minima, we go through every pixel in the image and
compare it with its neighboring pixels.
SIFT, or Scale Invariant Feature Transform
• Keypoint Localization
• The pixel marked x is compared with the neighboring pixels (in green) and is
selected as a keypoint if it is the highest or lowest among the neighbors:
SIFT, or Scale Invariant Feature Transform
• Keypoint Selection
• But some of these keypoints may not be robust to noise. This is why to perform a final check to
have the most accurate keypoints to represent the image features.
• Eliminate the keypoints that have low contrast, or lie very close to the edge.
• To deal with the low contrast keypoints, a second-order Taylor expansion is computed for each
keypoint. If the resulting value is less than 0.03 (in magnitude), reject the keypoint.
• Perform a check to identify the poorly located keypoints. These are the keypoints that are close to
the edge and have a high edge response but may not be robust to a small amount of noise. A
second-order Hessian matrix is used to identify such keypoints.
• Perform both the contrast test and the edge test to reject the unstable keypoints to assign an
orientation value for each keypoint to make the rotation invariant.
SIFT, or Scale Invariant Feature Transform
• Orientation Assignment
• Assign an orientation to each of these keypoints so that they are invariant to
rotation.
• We can again divide this step into two smaller steps:
• Calculate the magnitude and orientation
• Create a histogram for magnitude and orientation
SIFT, or Scale Invariant Feature Transform
• Keypoint Descriptor
• This is the final step for SIFT.
• Use the neighboring pixels, their orientations, and magnitude, to generate a unique fingerprint for this keypoint called a ‘descriptor’.
• Additionally, since the surrounding pixels are used, the descriptors will be partially invariant to illumination or brightness of the
images.
• Take a 16×16 neighborhood around the keypoint. This 16×16 block is further divided into 4×4 sub-blocks and for each of these sub-
blocks, generate the histogram using magnitude and orientation.