CLASSIFICATION OF HYPERSPECTRAL IMAGES
A RESEARCH PROJECT BY SAHIL SIDDHARTH
UNDER DR. ALOKE DATTA
ELECTROMAGNETIC SPECTRUM

• The electromagnetic spectrum is the range of frequencies (the spectrum) of electromagnetic radiation and their respective wavelengths and photon energies.
REMOTE SENSING

• Remote sensing is the acquisition of information about an object or phenomenon without making physical contact with the object, in contrast to in situ or on-site observation.
HYPERSPECTRAL IMAGING

• Hyperspectral imaging (HSI) is a technique that analyzes a wide spectrum of light instead of just assigning primary colors (red, green, blue) to each pixel. The light striking each pixel is broken down into many different spectral bands in order to provide more information on what is imaged.
APPLICATIONS OF HYPERSPECTRAL
IMAGING
• Seed Viability Study
• Biotechnology
• Remote Sensing
• Environmental Monitoring
• Pharmaceuticals
• Medical Diagnosis
• Forensic Science
• Thin Films
• Oil and Gas
CLASSIFICATION OF HSI

• Given a set of observations with known class labels
• Goal: assign a class label to each pixel
• Challenges:
 High dimensionality
 Insufficient training samples
ML CLASSIFICATION TECHNIQUES

1. K-Nearest Neighbors
2. Support Vector Machines
3. Random Forest
K-NEAREST NEIGHBORS

• This algorithm is based on the supervised learning technique. It is called a lazy learner because it does not learn from the training set immediately; instead, it stores the dataset and performs the computation at classification time. The algorithm is quite simple to implement and is robust to noisy training data.
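A minimal scikit-learn sketch of the idea, with synthetic data standing in for a flattened (pixels x bands) HSI matrix; the names and split sizes are illustrative assumptions, not the project's actual code.

```python
# KNN on a synthetic stand-in for flattened HSI pixels (illustrative only).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=2000, n_features=200, n_informative=50,
                           n_classes=16, n_clusters_per_class=1, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

knn = KNeighborsClassifier(n_neighbors=3)  # N=3, as in the results slides
knn.fit(X_tr, y_tr)                        # "lazy": it only stores the data
print("accuracy:", knn.score(X_te, y_te))
```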
K-NEAREST NEIGHBORS
SUPPORT VECTOR MACHINES

• The objective of the support vector machine algorithm is to find a hyperplane in an N-dimensional space (N being the number of features) that distinctly classifies the data points. To separate two classes of data points, there are many possible hyperplanes that could be chosen. Our objective is to find the plane with the maximum margin, i.e. the maximum distance between data points of both classes. Maximizing the margin provides some reinforcement so that future data points can be classified with more confidence.
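A matching scikit-learn sketch with an RBF kernel on the same kind of synthetic stand-in data; scaling is included because SVMs are sensitive to feature magnitudes. Everything here is illustrative.

```python
# RBF-kernel SVM on a synthetic stand-in for flattened HSI pixels.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_classification(n_samples=2000, n_features=200, n_informative=50,
                           n_classes=16, n_clusters_per_class=1, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Standardize the bands, then fit a maximum-margin classifier with an RBF kernel.
svm = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
svm.fit(X_tr, y_tr)
print("accuracy:", svm.score(X_te, y_te))
```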
SUPPORT VECTOR MACHINES
RANDOM FOREST

• Random forest is a supervised learning algorithm that builds decision trees on samples of the data, gets a prediction from each tree, and selects the final class by majority vote. It is an ensemble method that performs better than a single decision tree because it reduces over-fitting by averaging the results.
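A scikit-learn sketch with 50 estimators, matching the experiments reported later; the data setup is the same illustrative stand-in as above.

```python
# Random forest (50 trees) on a synthetic stand-in for flattened HSI pixels.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=200, n_informative=50,
                           n_classes=16, n_clusters_per_class=1, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

rf = RandomForestClassifier(n_estimators=50, random_state=0)
rf.fit(X_tr, y_tr)                        # each tree votes; the majority wins
print("accuracy:", rf.score(X_te, y_te))
```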
RANDOM FOREST
DIMENSIONALITY REDUCTION

Two major components of dimensionality reduction:

1. Feature selection is the process of identifying and selecting relevant features for your samples. It usually follows one of three approaches (a filter-method sketch follows this list):
   i. Filter
   ii. Wrapper
   iii. Embedded
2. Feature extraction reduces the data in a high-dimensional space to a lower-dimensional space.
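A minimal sketch of the filter approach: score each band independently against the labels and keep the top k. The scoring function and the value of k are illustrative assumptions.

```python
# Filter-style feature selection: rank bands by a univariate ANOVA F-score.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif

X, y = make_classification(n_samples=2000, n_features=200, n_informative=50,
                           n_classes=16, n_clusters_per_class=1, random_state=0)

selector = SelectKBest(score_func=f_classif, k=40)
X_sel = selector.fit_transform(X, y)             # keep the 40 best-scoring bands
print(X_sel.shape)                               # (2000, 40)
print(selector.get_support(indices=True)[:10])   # indices of kept bands
```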
PRINCIPAL COMPONENT ANALYSIS (PCA)

• Principal Component Analysis (PCA) is an unsupervised statistical technique primarily used for dimensionality reduction. It reduces the computational complexity of the model, which makes machine learning algorithms run faster (see the sketch below).
• Other techniques: Linear Discriminant Analysis (LDA) and Generalised Discriminant Analysis (GDA).
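A minimal scikit-learn PCA sketch on the same kind of stand-in data, projecting 200 bands down to 40 components as in the Indian Pines experiments below.

```python
# PCA: project 200 spectral bands down to 40 principal components.
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=2000, n_features=200, n_informative=50,
                           n_classes=16, n_clusters_per_class=1, random_state=0)

pca = PCA(n_components=40)
X_pca = pca.fit_transform(StandardScaler().fit_transform(X))
print(X_pca.shape)                          # (2000, 40)
print(pca.explained_variance_ratio_.sum())  # fraction of variance retained
```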
INDIAN PINES DATASET

• 145 x 145 pixels
• 200 bands
• 16 classes
INDIAN PINES DETAILS

• The Indian Pines (IP) HSI data is gathered using the AVIRIS sensor over the Indian Pines test site in north-western Indiana and consists of 145 x 145 pixels, 16 classes, and 200 bands (a loading sketch follows this slide).
• The Indian Pines scene contains nearly 66% agriculture and 33% forest or other natural perennial vegetation.
• The scene was taken in June, so some of the crops present, like corn and soybeans, were still in early stages of growth with less than 5% coverage.
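A minimal sketch of loading the cube and flattening it into per-pixel samples; the .mat file names and keys follow the commonly distributed copies of the dataset and are assumptions here.

```python
# Load the Indian Pines cube and ground truth, then flatten to (pixels, bands).
import numpy as np
from scipy.io import loadmat

X = loadmat("Indian_pines_corrected.mat")["indian_pines_corrected"]  # (145, 145, 200)
y = loadmat("Indian_pines_gt.mat")["indian_pines_gt"]                # (145, 145)

X = X.reshape(-1, X.shape[-1]).astype(np.float32)  # one sample per pixel
y = y.ravel()

mask = y > 0                     # class 0 is unlabeled background
X, y = X[mask], y[mask]
print(X.shape, y.shape)          # e.g. (10249, 200) (10249,)
```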
VISUALISATION OF RANDOM SPECTRAL
BANDS (OF 200)
PCA ANALYSIS

• Selected 40 principal components


SVM (RBF)

PCA          ACCURACY
Before PCA   89.73%
After PCA    85.16%


KNN (N=3)

PCA          ACCURACY
Before PCA   73.98%
After PCA    73.34%


RANDOM FOREST (50 ESTIMATORS)

PCA          ACCURACY
Before PCA   83.88%
After PCA    76.35%


PAVIA UNIVERSITY DATASET

• 610 x 610 pixels
• 103 bands
• 9 classes
PAVIA UNIVERSITY DETAILS

• The Pavia University dataset contains scenes acquired by the ROSIS sensor during a flight campaign over Pavia, northern Italy.
• The total number of spectral bands is 103.
• There are 610 x 610 pixels with a geometric resolution of 1.3 metres, but some of the samples contain no information and have to be discarded before the analysis.
• The ground-truth images classify the dataset into 9 classes.
VISUALISATION OF RANDOM SPECTRAL
BANDS (OF 103)
PCA ANALYSIS

• Selected 2 principal components


SVM (RBF)

PCA          ACCURACY
Before PCA   95.44%
After PCA    83.65%


KNN (N=3)

PCA          ACCURACY
Before PCA   88.48%
After PCA    81.20%


RANDOM FOREST (50 ESTIMATORS)

PCA          ACCURACY
Before PCA   92.08%
After PCA    83.07%


WAY FORWARD: DEEP LEARNING
NEURAL NETWORKS

• Neural networks are a series of algorithms that mimic the operations of a human brain to recognize relationships between vast amounts of data.
• An Artificial Neural Network is made up of 3 components:
1. Input layer
2. Hidden (computation) layers
3. Output layer
NEURAL NETWORKS
CONVOLUTIONAL NEURAL NETWORKS
(CNN)
• A Convolutional Neural Network (ConvNet/CNN) is a Deep
Learning algorithm which can take in an input image, assign
importance (learnable weights and biases) to various
aspects/objects in the image and be able to differentiate one
from the other.
• The pre-processing required in a CNN is much lower as
compared to other classification algorithms.
• While in primitive methods filters are hand-engineered, with enough training, CNNs have the ability to learn these filters/characteristics.
CONVOLUTIONAL NEURAL NETWORKS
(CNN)
• The architecture of a CNN is analogous to that of the
connectivity pattern of Neurons in the Human Brain and was
inspired by the organization of the Visual Cortex.
• Individual neurons respond to stimuli only in a restricted
region of the visual field known as the Receptive Field.
• A collection of such fields overlap to cover the entire visual
area.
CONVOLUTIONAL NEURAL NETWORKS
(CNN)
ADVANTAGES OF A CNN

• A CNN is able to successfully capture the spatial and temporal dependencies in an image through the application of relevant filters.
• Little dependence on pre-processing, reducing the human effort needed to engineer features.
• It is easy to understand and fast to implement.
• It achieves among the highest accuracies of all image classification algorithms.
CNN BASIC LAYOUT
STEP 1A: CONVOLUTION

• We create many feature maps to obtain our first convolution layer.
• Convolution finds the features in our images while maintaining spatial integrity.
STEP 1B: RELU LAYER

• ReLU (the rectified linear activation function) increases non-linearity in the feature maps.
• Helps the network find features better.
• ReLU(x) = max(0, x)
STEP 2: POOLING

• A max pooling layer reduces the spatial size of the convolved features and also helps reduce over-fitting by providing an abstracted representation of them.
• It is a sample-based discretization process.
STEP 3: FLATTENING

• Flattening converts the pooled feature maps into a single one-dimensional vector.
• This vector becomes the input to the fully connected layers that follow.
STEP 4: FULL CONNECTION

• Each node in one layer is connected to every node in the next layer.
• We use one or more fully connected layers at the end of a CNN.
• Adding a fully connected layer helps learn non-linear combinations of the high-level features output by the convolutional layers.
STEP 4: FULL CONNECTION

• Fully connected layers
LOSS FUNCTION – CROSS ENTROPY

• A loss function is a method of evaluating how well the model fits the dataset.
• The loss function outputs a higher number when the predictions are far from the actual target values and a lower number when they are close.
• For multi-class classification we use cross entropy as our loss function. It outputs a higher value if the predicted probability for the actual target class is low.
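For a single sample with one-hot target y and predicted class probabilities p over C classes, categorical cross-entropy is L(y, p) = −Σ_{c=1..C} y_c log(p_c), which reduces to −log(p_t) for the true class t: the lower the predicted probability of the correct class, the higher the loss.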
CNN BUILD MODEL
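A minimal Keras sketch in the spirit of the steps above; the patch size, PCA-reduced depth, filter counts, and dense widths are illustrative assumptions, not the presented model.

```python
# A small CNN for per-pixel HSI classification on PCA-reduced patches.
from tensorflow.keras import layers, models

def build_cnn(patch=5, bands=40, n_classes=16):
    model = models.Sequential([
        layers.Input(shape=(patch, patch, bands)),
        layers.Conv2D(64, 3, padding="same", activation="relu"),   # Steps 1a/1b
        layers.MaxPooling2D(2),                                    # Step 2
        layers.Conv2D(128, 3, padding="same", activation="relu"),
        layers.Flatten(),                                          # Step 3
        layers.Dense(128, activation="relu"),                      # Step 4
        layers.Dense(n_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",  # the loss described above
                  metrics=["accuracy"])
    return model

model = build_cnn()
model.summary()
```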
CLASS WISE ACCURACY (INDIAN PINES)

• Overall accuracy was 99.71%
PREDICTION VS GROUND-TRUTH
MODEL LOSS AND ACCURACY GRAPHS
U-NET ARCHITECTURE

• U-Net, evolved from the traditional convolutional neural network, was first designed and applied in 2015 to process biomedical images.
• A general convolutional neural network focuses on image classification, where the input is an image and the output is a single label; biomedical cases, however, require us not only to detect whether a disease is present but also to localise the area of abnormality.
WHAT IS DIFFERENT ABOUT U-NET?

• It is able to localise and distinguish borders by doing classification on every pixel, so the input and output share the same size.
U-NET OVERVIEW

• It has a “U” shape.
• The architecture is symmetric and consists of two major parts: the left part is the contracting path, built from the general convolutional process;
• the right part is the expansive path, built from transposed 2D convolutional layers.
U-NET OVERVIEW
CONTRACTING PATH

• Each step consists of two convolutional layers, and the number of channels changes from 1 → 64, as the convolution process increases the depth of the image. The red arrow pointing down is the max pooling step, which halves the size of the image.
• The size also shrinks slightly because the convolutions are unpadded.
CONTRACTING PATH

• The process is repeated 3 more times.
BOTTOM OF THE U-NET

• At the bottom of the U-Net, we still build 2 convolution layers.
• There is no max pooling.
EXPANSIVE PATH

• In the expansive path, the image is upsized back to its original size.
• Transposed convolution is an upsampling technique that expands the size of images.
EXPANSIVE PATH

• After the transposed convolution, the image (for example) is upsized from 28x28x1024 → 56x56x512.
• Then this image is concatenated with the corresponding image from the contracting path, together making an image of size 56x56x1024.
• The reason is to combine the information from the earlier layers in order to get a more precise prediction.
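A sketch of one expansive-path step in Keras, following the example above: a transposed convolution doubles the spatial size, then the matching contracting-path tensor is concatenated in. Names and filter counts are illustrative.

```python
# One expansive-path step: upsample, concatenate the skip connection, convolve.
from tensorflow.keras import layers

def up_block(x, skip, filters):
    x = layers.Conv2DTranspose(filters, 2, strides=2, padding="same")(x)
    x = layers.Concatenate()([x, skip])   # e.g. 56x56x512 + 56x56x512 -> 56x56x1024
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    return x
```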
OUTPUT SEGMENT

• After we have reached the uppermost part of the architecture, the last step is to reshape the image to satisfy our prediction requirements.
• The last layer is a convolution layer with 1 filter of size 1x1 (notice that there is no dense layer in the whole network). The rest is the same as for ordinary neural network training.
U-NET MODEL FOR INDIAN PINES
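A minimal Keras sketch of a small U-Net in this spirit; the input shape (the 145x145 scene assumed padded/cropped to 144x144 after PCA to 40 bands), depth, and filter counts are illustrative assumptions, not the exact model used.

```python
# A compact U-Net: contracting path, bottom, expansive path with skip links.
from tensorflow.keras import layers, models

def conv_block(x, filters):
    # Two 3x3 convolutions, as in each contracting-path step.
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    return x

def build_unet(shape=(144, 144, 40), n_classes=16):
    inp = layers.Input(shape)
    # Contracting path: conv blocks with max pooling in between.
    c1 = conv_block(inp, 64);  p1 = layers.MaxPooling2D(2)(c1)
    c2 = conv_block(p1, 128);  p2 = layers.MaxPooling2D(2)(c2)
    # Bottom of the U: two convolutions, no max pooling.
    b = conv_block(p2, 256)
    # Expansive path: transposed convolutions plus skip concatenations.
    u2 = layers.Conv2DTranspose(128, 2, strides=2, padding="same")(b)
    c3 = conv_block(layers.Concatenate()([u2, c2]), 128)
    u1 = layers.Conv2DTranspose(64, 2, strides=2, padding="same")(c3)
    c4 = conv_block(layers.Concatenate()([u1, c1]), 64)
    # 1x1 convolution output: per-pixel class probabilities, no dense layer.
    out = layers.Conv2D(n_classes, 1, activation="softmax")(c4)
    model = models.Model(inp, out)
    model.compile(optimizer="adam", loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model

model = build_unet()
model.summary()
```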
MODEL ACCURACY AND LOSS
CLASS WISE ACCURACY

• Overall accuracy of the model was 99.6%
PREDICTION VS GROUND-TRUTH
THANK YOU

Thesis Presentation
by

Sahil Siddharth (17dcs013)
