
NPTEL – Electronics & Communication Engineering – Pattern Recognition

FEATURE EXTRACTION

Dr. K.Vijayarekha
Associate Dean
School of Electrical and Electronics Engineering
SASTRA University, Thanjavur-613 401

Joint Initiative of IITs and IISc – Funded by MHRD Page 1 of 8



Table of Contents

1. FEATURE EXTRACTION
   1.1 Introduction
   1.2 Image texture
   1.3 Statistical Features
   1.4 Histogram-based features
       Fig. 4.2 Textural images extracted from splitting infected fruits
       Fig. 4.3 Textural images extracted from stem-end rot infected fruits
   1.5 Co-occurrence matrix based features


1. FEATURE EXTRACTION

1.1 Introduction
After an image has been segmented into regions of interest, representing and describing the image in a form suitable for further processing becomes important. A region can be represented in two ways: (1) in terms of its external characteristics and (2) in terms of its internal characteristics. Choosing a proper representation scheme makes the data useful. Based on the representation chosen, the region must then be described. An external representation is chosen when the primary focus is on shape analysis; an internal representation is preferred when the primary focus is on regional properties such as color and texture.

The automatic grading and sorting of citrus fruits requires that external surface defects be identified and classified. Since the shape of the fruit plays no role in such an application, external representation is of little use. Internal representation is chosen for this study, as the focus of the work is to distinguish regions of the fruit image by either color or texture. Even though the color difference between the normal surface and the defective surface is prominent, color alone will not serve the purpose: natural products like fruits and vegetables always show color variations, so color may be of little value in classification. Hence textural description is preferred. The textural description is done by extracting features from the image. An image feature is a distinguishing primitive characteristic or attribute of an image.

1.2 Image texture

Color is an important cue not only in human vision but also in digital image processing, where its impact is still rising. Color is measured globally from the histogram, ignoring local neighboring pixels. Natural products like fruits and vegetables do not have uniform color throughout and show variations between samples belonging to a single class.
Images of natural scenes are devoid of sharp edges over large areas. In these areas the scene can be characterized as exhibiting a consistent structure analogous to the texture of cloth. Image texture measurements can be used to segment an image and classify its segments. Texture is characterized by the relationship between the intensities of neighboring pixels, ignoring their color, i.e. by the spatial distribution of gray levels in a neighborhood. Texture plays an important role in many machine vision tasks such as surface inspection, scene classification, and surface orientation and shape determination.


Texture is an important cue for the analysis of many images. It is usually used to describe intrinsic properties of surfaces, especially those that do not have a smoothly varying intensity. Several image properties such as smoothness, coarseness, depth and regularity can be associated with texture. Texture can also be defined as a descriptor of local brightness variation from pixel to pixel in a small neighborhood of an image, or as an attribute representing the spatial arrangement of the gray levels of the pixels in a region of a digital image. It is often qualitatively described by its coarseness, and the coarseness index is related to the spatial repetition period of the local structure.

A large period implies a coarse texture; a small period implies a fine texture. Texture is a neighborhood property of an image point, so texture measures depend on the size of the observation neighborhood. Texture analysis has played an important role in many areas including medical imaging, remote sensing, industrial inspection and image retrieval.
Texture analysis methods are diverse and differ from one another in how textural features are extracted. The four categories of textural feature extraction are: 1) statistical methods, 2) structural methods, 3) model-based methods and 4) transform-based methods.

Statistical texture analysis techniques describe the texture of regions in an image through higher-order moments of their grayscale histograms. The most commonly used method for texture analysis is based on extracting various textural features from a gray level co-occurrence matrix (GLCM). The GLCM approach is based on second-order statistics of the image gray levels. Structural texture analysis techniques describe a texture as a composition of well-defined texture elements such as regularly spaced parallel lines.

The properties and placement rules of the texture elements define the image texture. Model-based texture analysis techniques fit an empirical model to each pixel in the image, for example as a weighted average of the pixel intensities in its neighborhood; the estimated model parameters are used as textural feature descriptors. Transform-based texture analysis techniques convert the image into a new form using the spatial-frequency properties of the pixel intensity variations. The success of this type lies in the choice of transform used to extract textural characteristics from the image. The processing of citrus fruit images using statistical and transform-based texture analysis is explained here.


Fig. 4.1 Textural images extracted from pitting infected fruits

Fig. 4.2 Textural images extracted from splitting infected fruits

Fig. 4.3 Textural images extracted from stem-end rot infected fruits

Except for the wavelet packet transform features, which use the full fruit image for training and testing, all the other methods make use of cropped windows containing the surface defect region. Features are extracted from the images in imbank-3 and stored as feature vectors.

When classification is performed for the full fruit image, the image is cropped, features are extracted and the classification task is performed. Sample cropped windows with the three surface defects are shown in Fig. 4.1, Fig. 4.2 and Fig. 4.3.

1.3 Statistical Features

One of the simplest approaches for describing texture is to use the statistical moments of the gray-level histogram of the image. The various statistical textural features are based on the gray-level histogram, the gray-level co-occurrence matrix, edge frequency and run-length distribution. In this work we concentrate only on first- and second-order statistics, i.e. gray-level and co-occurrence based measures.

1.4 Histogram based features

The histogram-based features used in this work are first-order statistics: mean, variance, skewness and kurtosis. Let z be a random variable denoting image gray levels and p(z_i),


i = 0, 1, 2, ..., L-1, be the corresponding normalized histogram, where L is the number of distinct gray levels. The features are calculated from this histogram.

(a) Mean

The mean gives the average gray level of each region; it provides only a rough idea of intensity, not really of texture.

$$m = \sum_{i=0}^{L-1} z_i \, p(z_i)$$

(b) Variance

The variance gives the amount of gray-level fluctuation about the mean gray-level value.

$$\mu_2(z) = \sum_{i=0}^{L-1} (z_i - m)^2 \, p(z_i)$$

(c) Skewness

Skewness is a measure of the asymmetry of the gray levels around the sample mean. If skewness
is negative, the data are spread out more to the left of the mean than to the right. If skewness is
positive, the data are spread out more to the right.

$$\mu_3(z) = \sum_{i=0}^{L-1} (z_i - m)^3 \, p(z_i)$$

(d) Kurtosis

Kurtosis is a measure of how outlier-prone a distribution is; it describes the shape of the tails of the histogram.

$$\mu_4(z) = \sum_{i=0}^{L-1} (z_i - m)^4 \, p(z_i)$$
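The four histogram-based features above reduce to a few lines of code. A minimal sketch in Python with NumPy, following the un-normalized central-moment definitions given here (the function name and the choice of 256 gray levels are illustrative, not from the original study):

```python
import numpy as np

def histogram_features(image, levels=256):
    """First-order statistical features (mean, variance, skewness,
    kurtosis) computed from the gray-level histogram of an image."""
    # p(z_i): normalized histogram over the L gray levels
    hist, _ = np.histogram(image, bins=levels, range=(0, levels))
    p = hist / hist.sum()
    z = np.arange(levels)

    mean = np.sum(z * p)                      # m  = sum z_i p(z_i)
    variance = np.sum((z - mean) ** 2 * p)    # mu_2
    skewness = np.sum((z - mean) ** 3 * p)    # mu_3
    kurtosis = np.sum((z - mean) ** 4 * p)    # mu_4
    return mean, variance, skewness, kurtosis
```

For a constant (texture-free) region the histogram has a single non-zero bin, so variance, skewness and kurtosis all vanish, matching the intuition that these moments measure gray-level spread.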

1.5 Co-occurrence matrix based features

Measures of texture computed from histograms suffer from the limitation that they carry no information about the relative positions of the pixels with respect to each other. One way to bring this information into the texture analysis process is to consider not only the distribution of intensities but also the relative positions of pixels with equal or nearly equal intensity values. Gray level co-occurrence matrices provide one such type of feature extraction.


The second-order gray-level probability distribution of a texture image can be calculated by considering the gray levels of pixels in pairs. A second-order probability is often called a GLC probability. For a given displacement vector d = (Δx, Δy), it is the joint probability P(i, j) that a pixel at location (x, y) has gray level i and the pixel at location (x + Δx, y + Δy) has gray level j; equivalently, it is the joint probability of the intensity values i and j of two pixels a distance d apart along a given direction. This joint probability takes the form of a square array P_d with row and column dimensions equal to the number of discrete gray levels (intensities) in the image being examined. If an intensity image were entirely flat (i.e. contained no texture), the resulting GLCM would be completely diagonal. As the image texture increases, the off-diagonal values in the GLCM become larger. The features that can be calculated from the co-occurrence matrix C include inertia (contrast), absolute value, inverse difference, energy and entropy.
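The construction just described can be sketched directly: count, for every valid pixel, the pair of gray levels separated by the displacement (Δx, Δy), then normalize the counts into a joint probability matrix. A hypothetical helper (not from the original study) in Python with NumPy:

```python
import numpy as np

def glcm(image, dx, dy, levels):
    """Build a normalized gray level co-occurrence matrix P_d for the
    displacement vector (dx, dy) over an integer-valued image."""
    h, w = image.shape
    P = np.zeros((levels, levels))
    for y in range(h):
        for x in range(w):
            x2, y2 = x + dx, y + dy
            if 0 <= x2 < w and 0 <= y2 < h:
                # count the pair (i, j) = (I(x, y), I(x+dx, y+dy))
                P[image[y, x], image[y2, x2]] += 1
    return P / P.sum()  # convert pair counts to joint probabilities
```

On a flat image every co-occurring pair has identical gray levels, so the matrix is purely diagonal, as the text above notes.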

(a) Contrast

Contrast is the element-difference moment of order 2; it has a relatively low value when the high values of C are near the main diagonal.

$$\text{contrast} = \sum_i \sum_j (i - j)^2 \, c_{ij}$$

(b) Energy

Energy (the angular second moment) is lowest when all entries of the co-occurrence matrix are equal and highest when the matrix is concentrated in a few large entries.

$$\text{energy} = \sum_i \sum_j c_{ij}^2$$

(c) Entropy

Entropy is a measure of the randomness of the image gray levels.

$$\text{entropy} = -\sum_i \sum_j c_{ij} \log_2 c_{ij}$$
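Given a normalized co-occurrence matrix c, the three measures above reduce to a few array operations. A sketch in Python with NumPy (assuming c sums to one; the function name is illustrative):

```python
import numpy as np

def glcm_features(c):
    """Contrast, energy and entropy of a normalized co-occurrence matrix c."""
    i, j = np.indices(c.shape)
    contrast = np.sum((i - j) ** 2 * c)  # element-difference moment of order 2
    energy = np.sum(c ** 2)              # angular second moment
    nz = c[c > 0]                        # skip zero entries to avoid log2(0)
    entropy = -np.sum(nz * np.log2(nz))
    return contrast, energy, entropy
```

For a purely diagonal matrix the contrast is zero, reflecting a texture-free image; as off-diagonal mass grows, so does the contrast.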

The statistical feature set consists of seven features (mean, variance, skewness, kurtosis, contrast, energy and entropy) for each 40x40 sub-window of the three surface defects. Table 4.1 gives the statistical features for three windows of each type.


Table 4.1 Statistical feature set

S.No  Surface Defect   Mean     Variance  Skewness  Kurtosis  Contrast  Energy   Entropy
1     Pitting           0.9612   0.2012   -0.6061   -0.26857  -0.8064   -2.0549   2.7062
2     Pitting           0.9194  -0.5200   -0.2363   -0.26056  -0.0607   -1.3335   1.2576
3     Pitting           0.9466   0.0086   -0.2017   -0.25200   0.3597   -0.6546  -0.3759
4     Splitting         0.6772   0.6937   -0.6280   -0.27106  -0.7040   -0.2659  -0.8223
5     Splitting         0.2893  -0.0914   -0.5049   -0.27064   2.3688    1.3415  -0.5925
6     Splitting         1.2805   0.1706   -0.6135   -0.27132   0.7655   -0.2468   0.3321
7     Stem end rot     -1.3333  -0.8154    1.5414    0.26310  -0.6483    1.5765  -1.1797
8     Stem end rot      0.1101  -0.7381    0.5747   -0.20437  -0.0545    1.6302  -0.7412
9     Stem end rot      0.4198   0.3768   -0.1083   -0.25952  -1.3815    0.9285  -1.0100
