CLASSIFICATION OF HYPERSPECTRAL IMAGES
A RESEARCH PROJECT BY SAHIL SIDDHARTH
UNDER DR. ALOKE DATTA
ELECTROMAGNETIC SPECTRUM

• The electromagnetic spectrum is the range of frequencies (the spectrum) of electromagnetic radiation and their respective wavelengths and photon energies.
REMOTE SENSING

• Remote sensing is the acquisition of information about an object or phenomenon without making physical contact with the object, in contrast to in situ or on-site observation.
HYPERSPECTRAL IMAGING

• Hyperspectral imaging (HSI) is a technique that analyzes a wide spectrum of light instead of just assigning primary colors (red, green, blue) to each pixel. The light striking each pixel is broken down into many different spectral bands in order to provide more information on what is imaged.
APPLICATIONS OF HYPERSPECTRAL
IMAGING
• Seed Viability Study
• Biotechnology
• Remote Sensing
• Environmental Monitoring
• Pharmaceuticals
• Medical Diagnosis
• Forensic Science
• Thin Films
• Oil and Gas
CLASSIFICATION OF HSI

• Given a set of observations with known class labels
• Goal: assign a class label to each pixel
• Challenges:
 High dimensionality
 Insufficient training samples
ML CLASSIFICATION TECHNIQUES

1. K-Nearest Neighbors
2. Support Vector Machines
3. Random Forest
K-NEAREST NEIGHBORS

• This algorithm is based on the supervised learning technique. It is called a lazy learner because it does not learn from the training set immediately; instead, it stores the dataset and performs the computation at classification time. The algorithm is quite simple to implement and is robust to noisy training data.
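A minimal scikit-learn sketch of the idea, with synthetic data standing in for a flattened (pixels x bands) HSI matrix; the names and split sizes are illustrative assumptions, not the project's actual code.

```python
# KNN on a synthetic stand-in for flattened HSI pixels (illustrative only).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=2000, n_features=200, n_informative=50,
                           n_classes=16, n_clusters_per_class=1, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

knn = KNeighborsClassifier(n_neighbors=3)  # N=3, as in the results slides
knn.fit(X_tr, y_tr)                        # "lazy": it only stores the data
print("accuracy:", knn.score(X_te, y_te))
```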
K-NEAREST NEIGHBORS
SUPPORT VECTOR MACHINES

• The objective of the support vector machine algorithm is to find a hyperplane in an N-dimensional space (N being the number of features) that distinctly classifies the data points. To separate two classes of data points, there are many possible hyperplanes that could be chosen. Our objective is to find the plane with the maximum margin, i.e. the maximum distance between data points of both classes. Maximizing the margin provides some reinforcement so that future data points can be classified with more confidence.
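A matching scikit-learn sketch with an RBF kernel on the same kind of synthetic stand-in data; scaling is included because SVMs are sensitive to feature magnitudes. Everything here is illustrative.

```python
# RBF-kernel SVM on a synthetic stand-in for flattened HSI pixels.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_classification(n_samples=2000, n_features=200, n_informative=50,
                           n_classes=16, n_clusters_per_class=1, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Standardize the bands, then fit a maximum-margin classifier with an RBF kernel.
svm = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
svm.fit(X_tr, y_tr)
print("accuracy:", svm.score(X_te, y_te))
```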
SUPPORT VECTOR MACHINES
RANDOM FOREST

• Random forest is a supervised learning algorithm that builds decision trees on samples of the data, gets a prediction from each tree, and selects the final class by majority vote. It is an ensemble method that performs better than a single decision tree because it reduces over-fitting by averaging the results.
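A scikit-learn sketch with 50 estimators, matching the experiments reported later; the data setup is the same illustrative stand-in as above.

```python
# Random forest (50 trees) on a synthetic stand-in for flattened HSI pixels.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=200, n_informative=50,
                           n_classes=16, n_clusters_per_class=1, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

rf = RandomForestClassifier(n_estimators=50, random_state=0)
rf.fit(X_tr, y_tr)                        # each tree votes; the majority wins
print("accuracy:", rf.score(X_te, y_te))
```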
RANDOM FOREST
DIMENSIONALITY REDUCTION

Two major components of dimensionality reduction:

1. Feature selection is the process of identifying and selecting relevant features for your samples. It usually follows one of three approaches (a filter-method sketch follows this list):
   i. Filter
   ii. Wrapper
   iii. Embedded
2. Feature extraction reduces the data in a high-dimensional space to a lower-dimensional space.
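A minimal sketch of the filter approach: score each band independently against the labels and keep the top k. The scoring function and the value of k are illustrative assumptions.

```python
# Filter-style feature selection: rank bands by a univariate ANOVA F-score.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif

X, y = make_classification(n_samples=2000, n_features=200, n_informative=50,
                           n_classes=16, n_clusters_per_class=1, random_state=0)

selector = SelectKBest(score_func=f_classif, k=40)
X_sel = selector.fit_transform(X, y)             # keep the 40 best-scoring bands
print(X_sel.shape)                               # (2000, 40)
print(selector.get_support(indices=True)[:10])   # indices of kept bands
```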
PRINCIPAL COMPONENT ANALYSIS (PCA)

• Principal Component Analysis (PCA) is an unsupervised statistical technique primarily used for dimensionality reduction. It reduces the computational complexity of the model, which makes machine learning algorithms run faster (see the sketch below).
• Other techniques: Linear Discriminant Analysis (LDA) and Generalised Discriminant Analysis (GDA).
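A minimal scikit-learn PCA sketch on the same kind of stand-in data, projecting 200 bands down to 40 components as in the Indian Pines experiments below.

```python
# PCA: project 200 spectral bands down to 40 principal components.
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=2000, n_features=200, n_informative=50,
                           n_classes=16, n_clusters_per_class=1, random_state=0)

pca = PCA(n_components=40)
X_pca = pca.fit_transform(StandardScaler().fit_transform(X))
print(X_pca.shape)                          # (2000, 40)
print(pca.explained_variance_ratio_.sum())  # fraction of variance retained
```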
INDIAN PINES DATASET

• 145 x 145 pixels
• 200 bands
• 16 classes
INDIAN PINES DETAILS

• The Indian Pines (IP) HSI data is gathered using the AVIRIS sensor over the Indian Pines test site in north-western Indiana and consists of 145 x 145 pixels, 16 classes, and 200 bands (a loading sketch follows this slide).
• The Indian Pines scene contains nearly 66% agriculture and 33% forest or other natural perennial vegetation.
• The scene was taken in June, so some of the crops present, like corn and soybeans, were still in early stages of growth with less than 5% coverage.
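A minimal sketch of loading the cube and flattening it into per-pixel samples; the .mat file names and keys follow the commonly distributed copies of the dataset and are assumptions here.

```python
# Load the Indian Pines cube and ground truth, then flatten to (pixels, bands).
import numpy as np
from scipy.io import loadmat

X = loadmat("Indian_pines_corrected.mat")["indian_pines_corrected"]  # (145, 145, 200)
y = loadmat("Indian_pines_gt.mat")["indian_pines_gt"]                # (145, 145)

X = X.reshape(-1, X.shape[-1]).astype(np.float32)  # one sample per pixel
y = y.ravel()

mask = y > 0                     # class 0 is unlabeled background
X, y = X[mask], y[mask]
print(X.shape, y.shape)          # e.g. (10249, 200) (10249,)
```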
VISUALISATION OF RANDOM SPECTRAL
BANDS (OF 200)
PCA ANALYSIS

• Selected 40 principal components


SVM (RBF)

PCA          ACCURACY
Before PCA   89.73%
After PCA    85.16%


KNN (N=3)

PCA          ACCURACY
Before PCA   73.98%
After PCA    73.34%


RANDOM FOREST (50 ESTIMATORS)

PCA          ACCURACY
Before PCA   83.88%
After PCA    76.35%


PAVIA UNIVERSITY DATASET

• 610 x 610 pixels
• 103 bands
• 9 classes
PAVIA UNIVERSITY DETAILS

• The Pavia University dataset contains scenes acquired by the ROSIS sensor during a flight campaign over Pavia, northern Italy.
• The total number of spectral bands is 103.
• There are 610 x 610 pixels with a geometric resolution of 1.3 metres, but some of the samples contain no information and have to be discarded before the analysis.
• The ground-truth images classify the dataset into 9 classes.
VISUALISATION OF RANDOM SPECTRAL
BANDS (OF 103)
PCA ANALYSIS

• Selected 2 principal components


SVM (RBF)

PCA          ACCURACY
Before PCA   95.44%
After PCA    83.65%


KNN (N=3)

PCA          ACCURACY
Before PCA   88.48%
After PCA    81.20%


RANDOM FOREST (50 ESTIMATORS)

PCA          ACCURACY
Before PCA   92.08%
After PCA    83.07%


WAY FORWARD: DEEP LEARNING
NEURAL NETWORKS

• Neural networks are a series of algorithms that mimic the operations of a human brain to recognize relationships between vast amounts of data.
• An Artificial Neural Network is made up of 3 components:
1. Input layer
2. Hidden (computation) layers
3. Output layer
NEURAL NETWORKS
CONVOLUTIONAL NEURAL NETWORKS
(CNN)
• A Convolutional Neural Network (ConvNet/CNN) is a Deep
Learning algorithm which can take in an input image, assign
importance (learnable weights and biases) to various
aspects/objects in the image and be able to differentiate one
from the other.
• The pre-processing required in a CNN is much lower as
compared to other classification algorithms.
• While in primitive methods filters are hand-engineered, with enough training, CNNs have the ability to learn these filters/characteristics.
CONVOLUTIONAL NEURAL NETWORKS
(CNN)
• The architecture of a CNN is analogous to that of the
connectivity pattern of Neurons in the Human Brain and was
inspired by the organization of the Visual Cortex.
• Individual neurons respond to stimuli only in a restricted
region of the visual field known as the Receptive Field.
• A collection of such fields overlap to cover the entire visual
area.
CONVOLUTIONAL NEURAL NETWORKS
(CNN)
ADVANTAGES OF A CNN

• A CNN is able to successfully capture the spatial and temporal dependencies in an image through the application of relevant filters.
• Little dependence on pre-processing, reducing the human effort needed to engineer features.
• It is easy to understand and fast to implement.
• It achieves among the highest accuracies of all image classification algorithms.
CNN BASIC LAYOUT
STEP 1A: CONVOLUTION

• We create many feature maps to obtain our first convolution layer.
• Convolution finds the features in our images while maintaining spatial integrity.
STEP 1B: RELU LAYER

• ReLU (the rectified linear activation function) increases non-linearity in the feature maps.
• Helps the network find features better.
• ReLU(x) = max(0, x)
STEP 2: POOLING

• A max pooling layer reduces the spatial size of the convolved features and also helps reduce over-fitting by providing an abstracted representation of them.
• It is a sample-based discretization process.
STEP 3: FLATTENING

• Flattening converts the pooled feature maps into a single one-dimensional vector.
• This vector becomes the input to the fully connected layers that follow.
STEP 4: FULL CONNECTION

• Each node in one layer is connected to every node in the next layer.
• We use one or more fully connected layers at the end of a CNN.
• Adding a fully connected layer helps learn non-linear combinations of the high-level features output by the convolutional layers.
STEP 4: FULL CONNECTION

• Fully connected layers
LOSS FUNCTION – CROSS ENTROPY

• A loss function is a method of evaluating how well the model fits the dataset.
• The loss function outputs a higher number when the predictions are far from the actual target values and a lower number when they are close.
• For multi-class classification we use cross entropy as our loss function. It outputs a higher value if the predicted probability for the actual target class is low.
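For a single sample with one-hot target y and predicted class probabilities p over C classes, categorical cross-entropy is L(y, p) = −Σ_{c=1..C} y_c log(p_c), which reduces to −log(p_t) for the true class t: the lower the predicted probability of the correct class, the higher the loss.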
CNN BUILD MODEL
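A minimal Keras sketch in the spirit of the steps above; the patch size, PCA-reduced depth, filter counts, and dense widths are illustrative assumptions, not the presented model.

```python
# A small CNN for per-pixel HSI classification on PCA-reduced patches.
from tensorflow.keras import layers, models

def build_cnn(patch=5, bands=40, n_classes=16):
    model = models.Sequential([
        layers.Input(shape=(patch, patch, bands)),
        layers.Conv2D(64, 3, padding="same", activation="relu"),   # Steps 1a/1b
        layers.MaxPooling2D(2),                                    # Step 2
        layers.Conv2D(128, 3, padding="same", activation="relu"),
        layers.Flatten(),                                          # Step 3
        layers.Dense(128, activation="relu"),                      # Step 4
        layers.Dense(n_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",  # the loss described above
                  metrics=["accuracy"])
    return model

model = build_cnn()
model.summary()
```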
CLASS WISE ACCURACY (INDIAN PINES)

• Overall accuracy was 99.71%
PREDICTION VS GROUND-TRUTH
MODEL LOSS AND ACCURACY GRAPHS
U-NET ARCHITECTURE

• U-Net, evolved from the traditional convolutional neural network, was first designed and applied in 2015 to process biomedical images.
• A general convolutional neural network focuses on image classification, where the input is an image and the output is a single label; biomedical cases, however, require us not only to detect whether a disease is present but also to localise the area of abnormality.
WHAT IS DIFFERENT ABOUT U-NET?

• It is able to localise and distinguish borders by doing classification on every pixel, so the input and output share the same size.
U-NET OVERVIEW

• It has a “U” shape.
• The architecture is symmetric and consists of two major parts: the left part is the contracting path, built from the general convolutional process;
• the right part is the expansive path, built from transposed 2D convolutional layers.
U-NET OVERVIEW
CONTRACTING PATH

• Each step consists of two convolutional layers, and the number of channels changes from 1 → 64, as the convolution process increases the depth of the image. The red arrow pointing down is the max pooling step, which halves the size of the image.
• The size also shrinks slightly because the convolutions are unpadded.
CONTRACTING PATH

• The process is repeated 3 more times.
BOTTOM OF THE U-NET

• At the bottom of the U-Net, we still build 2 convolution layers.
• There is no max pooling.
EXPANSIVE PATH

• In the expansive path, the image is upsized back to its original size.
• Transposed convolution is an upsampling technique that expands the size of images.
EXPANSIVE PATH

• After the transposed convolution, the image (for example) is upsized from 28x28x1024 → 56x56x512.
• Then this image is concatenated with the corresponding image from the contracting path, together making an image of size 56x56x1024.
• The reason is to combine the information from the earlier layers in order to get a more precise prediction.
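A sketch of one expansive-path step in Keras, following the example above: a transposed convolution doubles the spatial size, then the matching contracting-path tensor is concatenated in. Names and filter counts are illustrative.

```python
# One expansive-path step: upsample, concatenate the skip connection, convolve.
from tensorflow.keras import layers

def up_block(x, skip, filters):
    x = layers.Conv2DTranspose(filters, 2, strides=2, padding="same")(x)
    x = layers.Concatenate()([x, skip])   # e.g. 56x56x512 + 56x56x512 -> 56x56x1024
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    return x
```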
OUTPUT SEGMENT

• After we have reached the uppermost part of the architecture, the last step is to reshape the image to satisfy our prediction requirements.
• The last layer is a convolution layer with 1 filter of size 1x1 (notice that there is no dense layer in the whole network). The rest is the same as for ordinary neural network training.
U-NET MODEL FOR INDIAN PINES
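A minimal Keras sketch of a small U-Net in this spirit; the input shape (the 145x145 scene assumed padded/cropped to 144x144 after PCA to 40 bands), depth, and filter counts are illustrative assumptions, not the exact model used.

```python
# A compact U-Net: contracting path, bottom, expansive path with skip links.
from tensorflow.keras import layers, models

def conv_block(x, filters):
    # Two 3x3 convolutions, as in each contracting-path step.
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    return x

def build_unet(shape=(144, 144, 40), n_classes=16):
    inp = layers.Input(shape)
    # Contracting path: conv blocks with max pooling in between.
    c1 = conv_block(inp, 64);  p1 = layers.MaxPooling2D(2)(c1)
    c2 = conv_block(p1, 128);  p2 = layers.MaxPooling2D(2)(c2)
    # Bottom of the U: two convolutions, no max pooling.
    b = conv_block(p2, 256)
    # Expansive path: transposed convolutions plus skip concatenations.
    u2 = layers.Conv2DTranspose(128, 2, strides=2, padding="same")(b)
    c3 = conv_block(layers.Concatenate()([u2, c2]), 128)
    u1 = layers.Conv2DTranspose(64, 2, strides=2, padding="same")(c3)
    c4 = conv_block(layers.Concatenate()([u1, c1]), 64)
    # 1x1 convolution output: per-pixel class probabilities, no dense layer.
    out = layers.Conv2D(n_classes, 1, activation="softmax")(c4)
    model = models.Model(inp, out)
    model.compile(optimizer="adam", loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model

model = build_unet()
model.summary()
```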
MODEL ACCURACY AND LOSS
CLASS WISE ACCURACY

• Overall accuracy of the model was 99.6%
PREDICTION VS GROUND-TRUTH
THANK YOU

Thesis Presentation
by

Sahil Siddharth (17dcs013)
