Review
Machine Learning and Deep Learning Techniques for Spectral
Spatial Classification of Hyperspectral Images: A
Comprehensive Survey
Reaya Grewal 1 , Singara Singh Kasana 2 and Geeta Kasana 3
1 Computer Science and Engineering Department, Thapar Institute of Engineering and Technology,
Patiala 147004, India; rgrewal_phd19@thapar.edu
2 Computer Science and Engineering Department, Thapar Institute of Engineering and Technology,
Patiala 147004, India; singara@thapar.edu
3 Computer Science and Engineering Department, Thapar Institute of Engineering and Technology,
Patiala 147004, India; gkasana@thapar.edu
* Correspondence: singara@thapar.edu
Abstract: The growth of Hyperspectral Image (HSI) analysis is due to technology advancements that
enable cameras to collect hundreds of contiguous spectral bands for each pixel in an image. HSI
classification is challenging due to the large number of redundant spectral bands, limited training
samples and the non-linear relationship between the collected spatial position and the spectral
bands. Our survey highlights recent research in HSI classification using traditional Machine
Learning (ML) techniques like kernel-based learning, Support Vector Machines, Dimension Reduction
and Transform-based techniques. Our study also examines Deep Learning (DL) techniques that use
Autoencoders and 1D, 2D and 3D Convolutional Neural Networks to classify HSI. From the comparison,
it is observed that DL-based classification techniques outperform ML-based techniques. It has also
been observed that spectral-spatial HSI classification outperforms pixel-by-pixel classification
because it incorporates both spectral signatures and spatial domain information. The performance of
ML and DL-based classification techniques has been reviewed on commonly used land cover datasets
like Indian Pines, Salinas Valley and Pavia University.
Keywords: hyperspectral images; classification; deep learning; PSO; SVM; KNN; decision tree; PCA;
DWT; ANN; CNN
Citation: Grewal, R.; Singh Kasana, S.; Kasana, G. Machine Learning and Deep Learning Techniques for
Spectral Spatial Classification of Hyperspectral Images: A Comprehensive Survey. Electronics
2023, 1, 0. https://doi.org/
Academic Editor: Gemma Piella
Received: 17 December 2022
Revised: 6 January 2023
Accepted: 8 January 2023
Published:
Copyright: © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution (CC BY)
license (https://creativecommons.org/licenses/by/4.0/).
1. Introduction
Remote Sensing (RS) has advanced significantly, leading to the use of new technologies.
This enables novel data processing methods and better quality data with enhanced spatial
and spectral resolutions for a variety of research applications. HSI are high-resolution
electromagnetic spectrum images with a large number of contiguous bands. The spectral
response of a material to incident light is distinct and this response is responsible for its
colour. The availability of HSI has enriched RS research with sounder data quality and the
capability to distinguish different features by their spectral profile, which is unique to
them as shown in Figure 1. HSI has two spatial dimensions (Sx and Sy) and one spectral
dimension (Sz). The hyperspectral data is illustrated as a 3D hyperspectral data cube in
Figure 2. Spatial and spectral resolution play essential roles in various HSI applications.
This has drawn researchers interested in developing HSI techniques in both the
spatial and spectral domains.
The term “classification” refers to the process of assigning individual pixels of an
image to a class and producing a classification map as output. Based on the training pattern
and availability of the data labels, classification techniques are broadly categorized as
supervised and unsupervised. Supervised techniques classify input data by using a set
of representative labelled samples. Unsupervised classifiers segregate the pixels
on the basis of similar spectral or spatial behaviour without any prior information. HSI
classification consistently suffers from various hurdles such as high dimensionality, limited
or unbalanced training patterns, spectral variability and mixed pixels.
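The cube layout and the supervised route to a classification map described above can be sketched in a few lines. This is a minimal illustration on synthetic data, not a method from the surveyed papers: the cube size, the two random class signatures, the 10% labelled subset and the nearest-class-mean rule are all assumptions chosen for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic HSI cube: two spatial dimensions (Sx, Sy) and one spectral dimension (Sz).
Sx, Sy, Sz = 20, 20, 50
truth = np.zeros((Sx, Sy), dtype=int)
truth[:, Sy // 2:] = 1                       # two classes: left half and right half
signatures = rng.random((2, Sz))             # one hypothetical spectral signature per class
cube = signatures[truth] + 0.05 * rng.standard_normal((Sx, Sy, Sz))

# Flatten the cube so that every pixel becomes one spectral vector.
pixels = cube.reshape(-1, Sz)                # shape (Sx * Sy, Sz)
labels = truth.ravel()

# Supervised step: estimate class means from a small labelled subset (10% of pixels).
train_idx = rng.choice(pixels.shape[0], size=pixels.shape[0] // 10, replace=False)
class_means = np.stack([pixels[train_idx][labels[train_idx] == c].mean(axis=0)
                        for c in (0, 1)])

# Assign each pixel to the nearest class mean and reshape into a classification map.
dists = np.linalg.norm(pixels[:, None, :] - class_means[None, :, :], axis=2)
class_map = dists.argmin(axis=1).reshape(Sx, Sy)
overall_accuracy = (class_map == truth).mean()
```

An unsupervised variant would replace the labelled class means with cluster centres estimated from the pixels alone.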
• Food and Safety: HSI has contributed to food quality assessment and safety. It
has been used to identify defects and levels of contamination. For example,
Leiva et al. [3] employed HSI to measure the firmness of blueberries and achieved an accuracy
of 87%.
• Medical Diagnosis: High spectral resolution captures materials sharply and highlights
their chemical and physical compositions. HSI has shown excellent performance in
studying and diagnosing tissues. For example, Liu, Wang and Li [4] used HSI images of
tongue tissues to detect tumors; the spectral signatures of the tissues played a vital
role in detection.
• Precision Agriculture: Manual crop monitoring is limited since apparent symptoms
often develop late in the disease’s progression, making it difficult to restore plant
health. Advances in HSI methodologies have made crop stress assessment and the study
of soil and vegetation attributes more cost-effective. For example, Liu et al. [5] used
spectral signatures to estimate the yield of wheat crops.
• Environment Monitoring: HSI has also been applied to flood and water resources
management. HSI provides efficient and reliable information on water quality parameters,
which include hydrophysical, biochemical and biological properties. For example,
Kutser et al. [6] used HSI to measure chlorophyll content in water bodies.
Electronics 2023, 1, 0 4 of 35
There are many approaches to classifying an HSI. In this work, ML and DL classification
techniques have been reviewed and compared. ML-based image classification
focuses on developing algorithms to predict and detect patterns without human intervention.
Various classifiers like Support Vector Machine (SVM), K-Nearest Neighbor (KNN) and
Decision Trees (DT) are trained. Several steps of data pre-processing and feature engineering
need to be performed to get insights from raw images and improve the performance of
classification techniques. In this study, we have sub-categorised traditional ML techniques
into those commonly employed in recent years: kernel-based learning, SVM
classification, dimension reduction and transform-based techniques. Researchers have mostly
used kernel-based techniques to efficiently learn the non-linearity of HSI datasets. Spectral and
spectral-spatial kernels have been added by authors as another dimension of learning to
capture complex details of HSI. The SVM classifier also belongs to the family of kernel learning.
SVM has been extensively used to classify high-dimensional HSI data and is discussed as a
separate category. With transform-based techniques, authors have been able to extract useful
information while suppressing noise in HSI datasets. Classification performance improves as the
number of available training samples grows. When HSI training samples are limited,
classification performance diminishes as the spectral dimension rises. This effect is
famously termed the “Hughes phenomenon.” To address this challenge, many authors have
implemented dimension reduction techniques prior to classification. We have discussed
various dimension reduction driven HSI classification techniques that work on spectral features.
Unlike traditional ML techniques, DL delivers a dynamic approach for unsupervised
feature learning using a huge raw image data set. DL-based techniques can depict complex
relationships in data using numerous neural connections. The DL models for HSI classification
generally consist of three layers: (i) input data, (ii) construction of the deep layer and
(iii) classification [7]. A general representation of DL-based HSI classification is
illustrated in Figure 4.
The papers reviewed focus on how different state-of-the-art classification techniques
have been used for HSI in the previous decade. A brief discussion of existing
classification techniques is given in Section 2, together with the methodology adopted to
conduct this survey. Section 3 elaborates on the traditional ML techniques employed
by authors, like SVM, kernel-based methods, dimension reduction and transform-based
methods. Section 4 emphasises DL techniques for spectral and spectral-spatial HSI
classification. Sections 5 and 6 highlight the analysis of this survey and compare the
performance of ML and DL techniques for HSI. The paper concludes with challenges
and the future scope of research and improvement in HSI analysis.
2. Preliminaries
This section briefly defines the HSI classification techniques utilised in the surveyed
publications.
ods that are well-known are Principal Component Analysis (PCA) and Independent
component analysis (ICA). Figure 8 illustrates basic steps of PCA dimension reduction.
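The basic PCA steps (centre the bands, compute the band covariance, keep the leading eigenvectors, project the pixels) can be sketched as follows. This is an illustrative NumPy version under assumed inputs; the function name `pca_reduce` and the random demo cube are hypothetical and not taken from the surveyed papers.

```python
import numpy as np

def pca_reduce(cube, n_components):
    """Project an (Sx, Sy, Sz) cube onto its first n_components principal components."""
    sx, sy, sz = cube.shape
    pixels = cube.reshape(-1, sz).astype(float)
    centred = pixels - pixels.mean(axis=0)               # remove the per-band mean
    cov = centred.T @ centred / (centred.shape[0] - 1)   # band-by-band covariance
    eigvals, eigvecs = np.linalg.eigh(cov)               # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]     # indices of the largest ones
    scores = centred @ eigvecs[:, order]                 # principal component scores
    explained = eigvals[order] / eigvals.sum()           # fraction of variance per PC
    return scores.reshape(sx, sy, n_components), explained

rng = np.random.default_rng(0)
demo_cube = rng.random((8, 8, 30))                       # hypothetical 30-band cube
reduced, explained = pca_reduce(demo_cube, n_components=3)
```

Each slice `reduced[:, :, i]` is a PC image of the kind that later methods use as input for spatial feature extraction.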
Figure 7. A schematic approach of Wavelet Transform decomposing data into two levels.
In the same year, Gao et al. [18] used a composite Spectral-Spatial Kernel for Anomaly
Detection (SSCAD). It considered the non-linear characteristics of the data, unlike other detection
models that worked in linear space and exploited only spectral information. With a kernel-based
approach, the data is implicitly mapped into a high-dimensional feature space that
handles non-linear problems well. Using local homogeneity, superpixels were extracted
with Entropy Rate Superpixel (ERS) segmentation, which provided spatial information. This was
fused with the spectral information extracted directly from the images to form a composite
kernel. The weights were adaptively determined using an iterative kernel learning algorithm
based on Centred Kernel Alignment (CKA). CKA measures the cosine similarity between two
centred kernels; a high CKA value indicates that the two kernels are similar. The authors
focused on obtaining the highest possible CKA value between the composite kernel and the
target kernel. The detection map was built using the kernel-based Reed-Xiaoli anomaly
detection algorithm, which uses the Mahalanobis distance to form decision rules that
distinguish test pixels from the background. The proposed work was implemented on real
datasets obtained using the HYDICE sensor, the ROSIS sensor over Pavia Centre and the AVIRIS
sensor over the San Diego area. It performed better in terms of the Receiver Operating
Characteristic (ROC) curve and the Area Under the ROC Curve (AUC)
than state-of-the-art anomaly detection methods.
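The CKA measure used to weight the composite kernel can be written compactly. The sketch below computes the alignment between centred kernels and forms a two-kernel composite with a fixed weight. The RBF kernels, the random feature matrices and the weight `mu` are assumptions for illustration, and the iterative weight learning of the paper is not reproduced.

```python
import numpy as np

def centre_kernel(K):
    """Centre a kernel matrix in feature space: Kc = H K H with H = I - 11^T / n."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def centred_kernel_alignment(K1, K2):
    """Cosine similarity (Frobenius inner product) between two centred kernels."""
    K1c, K2c = centre_kernel(K1), centre_kernel(K2)
    return np.sum(K1c * K2c) / (np.linalg.norm(K1c) * np.linalg.norm(K2c))

def rbf_kernel(X, gamma):
    """Gaussian RBF kernel matrix for the rows of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

rng = np.random.default_rng(0)
spectral = rng.random((30, 10))   # hypothetical spectral features of 30 pixels
spatial = rng.random((30, 4))     # hypothetical superpixel-derived spatial features
K_spec, K_spat = rbf_kernel(spectral, 0.5), rbf_kernel(spatial, 0.5)

mu = 0.6                          # assumed fixed weight; the paper learns it iteratively
K_comp = mu * K_spec + (1 - mu) * K_spat
alignment_self = centred_kernel_alignment(K_spec, K_spec)
alignment_comp = centred_kernel_alignment(K_comp, K_spec)
```

A kernel is perfectly aligned with itself, so `alignment_self` serves as a quick sanity check on the implementation.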
Following this, an MKL-based approach combining spectral, spatial and semantic information
with SVM was used by Wang et al. [20] for better HSI classification results. The first
three PCs (PC1-PC3) were obtained by applying PCA. These were used to obtain Gabor
features, an entropy rate superpixel segmentation map and EMPs. Structural and textural
features were extracted and stacked as feature vectors for each pixel using a combination
of Gabor and EMP features. For uniformity in spatial characteristics, mean filtering was
performed within each superpixel. For semantic information, a k-means clustering map and
the ERS segmentation map were used to produce a semantic feature vector for each superpixel.
Each superpixel was treated as a separate document/image. Spectral features, the ERS
map and a manually chosen number ‘k’ of cluster centroids were the inputs used to create
semantic features with Bag of Visual Words (BOVW). K-means clustering was performed on the
spectral features to group them around ‘k’ cluster centres, which were used as the visual
dictionary. The number of pixels belonging to each cluster inside each superpixel was counted,
giving a k × 1 histogram feature vector for each superpixel. Three individual kernels
were used to extract spectral, spatial and semantic information. For the final results, a
composite kernel formed as the weighted sum of these three kernels was applied with SVM. The
work was implemented on Indian Pines and Pavia University and obtained highest OAs of 98.39% and
99.77%, respectively.
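The semantic feature step, which counts how many pixels of each cluster fall inside each superpixel, reduces to a two-dimensional tally. The sketch below assumes the k-means cluster map and the superpixel map are already available as integer label images; the tiny 3 × 4 maps and the function name `superpixel_histograms` are made up for illustration.

```python
import numpy as np

def superpixel_histograms(cluster_map, superpixel_map, k):
    """Return one k-bin histogram of cluster labels per superpixel."""
    n_sp = superpixel_map.max() + 1
    hist = np.zeros((n_sp, k), dtype=int)
    # Increment hist[superpixel, cluster] once for every pixel.
    np.add.at(hist, (superpixel_map.ravel(), cluster_map.ravel()), 1)
    return hist

# Hypothetical label images: cluster index (visual word) and superpixel index per pixel.
cluster_map = np.array([[0, 0, 1, 2],
                        [0, 1, 1, 2],
                        [2, 2, 1, 1]])
superpixel_map = np.array([[0, 0, 1, 1],
                           [0, 0, 1, 1],
                           [2, 2, 2, 2]])
histograms = superpixel_histograms(cluster_map, superpixel_map, k=3)
```

Each row of `histograms` is the k × 1 semantic feature vector of one superpixel, which could then feed a semantic kernel.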
HSI datasets contain mixed pixels, and purely pixel-driven classifiers like SVM
cannot deal with overlapping data. Recently in 2021, Ma et al. [19] overcame this using Kernel
Constrained Energy Minimization (KCEM) and Kernel Linearly Constrained Minimum
Variance (KLCMV) classification. KCEM was used for binary classification whereas KLCMV
was used for multi-class classification. KCEM and KLCMV achieved OAs of 99.48% and 99.50%
on the Indian Pines dataset, respectively, and both achieved an OA of 99.6% on Salinas
Valley. They surpassed the performance of other spectral-spatial methods. The aforementioned
kernel-based classification techniques have been compared in Table 1.
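For orientation, the linear Constrained Energy Minimization filter that KCEM kernelises has a closed form, w = R⁻¹d / (dᵀR⁻¹d), where R is the sample autocorrelation matrix and d the target signature. The sketch below implements only this linear version on synthetic data; the kernel extension and the multi-class KLCMV constraints of Ma et al. are not reproduced, and all data are assumptions.

```python
import numpy as np

def cem_filter(pixels, target):
    """Linear CEM: minimise the average output energy subject to w^T target = 1."""
    R = pixels.T @ pixels / pixels.shape[0]      # sample autocorrelation matrix
    R_inv_d = np.linalg.solve(R, target)         # R^{-1} d without forming the inverse
    return R_inv_d / (target @ R_inv_d)

rng = np.random.default_rng(0)
background = rng.random((500, 20))               # hypothetical background spectra
target = rng.random(20) + 1.0                    # hypothetical target signature d
target_pixels = target + 0.01 * rng.standard_normal((5, 20))
pixels = np.vstack([background, target_pixels])

w = cem_filter(pixels, target)
scores = pixels @ w                              # detector output for every pixel
```

Thresholding `scores` gives a binary target-versus-background decision, which is the sense in which CEM acts as a binary classifier.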
combined and classified using SVM. The proposed work achieved an OA of 98.8% and an AA of
99.0% on the Pavia University dataset.
In 2017, Two-Dimensional Empirical Wavelet Transform (2D-EWT) was used by Prabhakar
and Geetha [33] for selection of informative and non-redundant bands. It was
compared with Image Empirical Mode Decomposition (IEMD). EWT segmented the signal
via the Fourier transform and estimated supports, which helped in building
the wavelet filter banks. Filtering the signal yielded the frequency components
to be processed, with detail and approximation coefficients. The proposed work implemented
a 2-D extension of the Littlewood-Paley transform. Sparse-based classifiers such as
Subspace Pursuit (SP) and Orthogonal Matching Pursuit (OMP) were employed for classification
of the HSI dataset, along with SVM and Hybrid Support Vector Selection and Adaptation
(HSVSA). The methodology was evaluated on the Indian Pines dataset. IEMD gave better
OA but took more time than 2D-EWT. The low-frequency components of IEMD
and 2D-EWT gave improved kappa measure, OA and Average Accuracy (AA).
In 2019, Ji et al. [34] detected bruises on potatoes using the Discrete Wavelet Transform
(DWT) technique. Characteristic bands were selected using PCA, and the most significant PC
images were chosen. The texture of the images was enhanced with histogram equalization. The
processed PC images were decomposed using DWT. Textural properties like contrast,
entropy and correlation were obtained using the Gray-Level Co-occurrence Matrix (GLCM). These
features were further extracted using the AdaBoost-Fisher Linear Discriminant (FLD) algorithm.
Bruised potatoes were identified by AdaBoost modeling. The highest detection accuracy
obtained was 99.82%.
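The three GLCM texture properties named above have standard definitions that are easy to state in code. The sketch below builds a symmetric co-occurrence matrix for horizontally adjacent pixels and derives contrast, entropy and correlation from it. The pixel offset, the number of grey levels and the checkerboard test image are assumptions; the settings used by Ji et al. are not given in this survey.

```python
import numpy as np

def glcm_features(image, levels=8):
    """Contrast, entropy (bits) and correlation from a horizontal-neighbour GLCM.

    Assumes a non-constant image so that quantisation and correlation are defined.
    """
    img = np.asarray(image, dtype=float)
    # Quantise grey values into `levels` bins.
    scaled = levels * (img - img.min()) / (img.max() - img.min())
    q = np.minimum(scaled.astype(int), levels - 1)
    # Count co-occurrences of each pixel with its right-hand neighbour.
    glcm = np.zeros((levels, levels))
    np.add.at(glcm, (q[:, :-1].ravel(), q[:, 1:].ravel()), 1)
    glcm = glcm + glcm.T                         # make the matrix symmetric
    p = glcm / glcm.sum()                        # joint probabilities
    i, j = np.indices(p.shape)
    contrast = np.sum(p * (i - j) ** 2)
    entropy = -np.sum(p[p > 0] * np.log2(p[p > 0]))
    mu_i, mu_j = np.sum(i * p), np.sum(j * p)
    sd_i = np.sqrt(np.sum(p * (i - mu_i) ** 2))
    sd_j = np.sqrt(np.sum(p * (j - mu_j) ** 2))
    correlation = np.sum(p * (i - mu_i) * (j - mu_j)) / (sd_i * sd_j)
    return contrast, entropy, correlation

checker = np.indices((4, 4)).sum(axis=0) % 2     # hypothetical 4 x 4 checkerboard
contrast, entropy, correlation = glcm_features(checker, levels=2)
```

In a pipeline like the one described, such features would be computed on each DWT sub-band of the selected PC images.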
In 2021, Anand et al. [35] extracted 3D spectral-spatial features of an HSI cube simultaneously
using Haar, Coiflet and Fejer-Korovkin filters. The features were
fed into SVM, KNN and Random Forest classifiers. The highest performance was achieved with
KNN and Random Forest.
Recently in 2022, Xu, Zhao and Liu [36] implemented 3D wavelet transform in the
pre-processing step to reduce the number of the learnable parameters of CNN. It extracted
both spatial and spectral features and had robust feature representation. Haar wavelet was
used as the mother wavelet.
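A one-level 3D Haar transform can be built by applying the same pairwise average and difference along each axis of the cube. The sketch below keeps only the low-pass (approximation) sub-band, which halves every dimension and is one way such pre-processing can shrink the input to a CNN. The cube size and the choice to discard detail sub-bands are assumptions, not details from Xu, Zhao and Liu.

```python
import numpy as np

def haar_step(data, axis):
    """One-level orthonormal Haar split along `axis` (its length must be even)."""
    data = np.moveaxis(data, axis, 0)
    even, odd = data[0::2], data[1::2]
    approx = (even + odd) / np.sqrt(2.0)         # low-pass coefficients
    detail = (even - odd) / np.sqrt(2.0)         # high-pass coefficients
    return np.moveaxis(approx, 0, axis), np.moveaxis(detail, 0, axis)

def haar_dwt3d_approx(cube):
    """Approximation sub-band of a one-level 3D Haar transform."""
    out = cube
    for axis in range(3):                        # two spatial axes, then the spectral axis
        out, _ = haar_step(out, axis)
    return out

rng = np.random.default_rng(0)
cube = rng.random((8, 8, 16))                    # hypothetical HSI patch
approx_band = haar_dwt3d_approx(cube)
a, d = haar_step(cube, axis=2)                   # spectral-axis split, for inspection
```

Because the transform is orthonormal, the energy of `a` and `d` together equals the energy of the input, which is a convenient correctness check.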
In 2022, Miclea et al. [37] obtained spectral features with the aid of wavelets and
concatenated them with spatial Local Binary Pattern (LBP) features. The spectral-spatial
features were fed to SVM. To prevent overlap between training and testing data, which
exaggerates classification results, the training and testing sets were divided through
controlled sampling. Training samples were selected so that they had spectral and spatial
variance, and samples were added through region growing with a specified window size.
The aforementioned transform-based
classification techniques have been compared in Table 2.
2008, Akbari et al. [30]: Compression of medical image using wavelet transform. LVQ ANN for
segmentation and classification. Post processing using region growing.
Table 2. Cont.
2020, Cao et al. [39]: 3D-DWT to extract features and remove stripe noise effect. CNN
classification with active learning strategy. Results: Indian Pines: OA 91.98% and AA 80.34%;
Pavia University: OA 91.27% and AA 82.37%.
2020, Zikiou et al. [40]: Texture classification using graph-based wavelet transform,
de-correlation between close pixels based on spectral similarity to build spectral graph
wavelets, SVM classification. Results: Indian Pines: OA 98.90% and AA 98.77%; Pavia
University: OA 99.65% and AA 99.47%; Kennedy Space Centre: OA 99.73% and AA 99.80%.
3.4.1. Unsupervised
In 2011, Villa et al. [43] focused on removal of redundant bands and used Independent
Component Discriminant Analysis (ICDA) for the same. The authors obtained classifi-
cation results using Bayesian classifier. Their approach achieved better accuracy than
SVM classification.
In 2016, HSI band selection using a combination of entropy filtering and K-means
clustering was performed by Santos and Pedrini [44]. For increased intra-cluster similarity and
inter-cluster variance, the bands were grouped using their correlations. The images
were downsized with bi-cubic interpolation, giving smaller feature vectors, to improve
computation time. K-means was applied with each band treated as a sample, and
the Pearson correlation coefficient was used to measure their similarity. K representative
bands were selected from the grouped bands and a 2D entropy filter was applied to each band.
The central pixel of each kernel was replaced with the computed entropy, giving a new vector
that was submitted to a radial kernel SVM. The methodology obtained OAs of 97.1%, 98.3% and
97.1% on the Indian Pines, Salinas Valley and Pavia Centre datasets, respectively.
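The 2D entropy filter step can be illustrated directly: for each pixel, the grey-level histogram of its surrounding window is turned into a Shannon entropy, which replaces the central pixel. The window size, the number of quantisation levels and the reflective border handling below are assumptions for the sketch, not settings reported by Santos and Pedrini.

```python
import numpy as np

def entropy_filter(band, window=3, levels=16):
    """Replace each pixel by the Shannon entropy (bits) of its local window."""
    band = np.asarray(band, dtype=float)
    span = band.max() - band.min()
    if span == 0:                                 # constant band: a single grey level
        q = np.zeros(band.shape, dtype=int)
    else:
        q = np.minimum((levels * (band - band.min()) / span).astype(int), levels - 1)
    pad = window // 2
    padded = np.pad(q, pad, mode="reflect")       # mirror the borders
    out = np.zeros(band.shape)
    for r in range(band.shape[0]):
        for c in range(band.shape[1]):
            patch = padded[r:r + window, c:c + window]
            counts = np.bincount(patch.ravel(), minlength=levels)
            p = counts[counts > 0] / patch.size   # local grey-level probabilities
            out[r, c] = -np.sum(p * np.log2(p))
    return out

flat_entropy = entropy_filter(np.ones((5, 5)))
checker_entropy = entropy_filter(np.indices((6, 6)).sum(axis=0) % 2, levels=2)
```

A flat region yields zero entropy everywhere, while a textured region such as the checkerboard yields high values, which is what makes the filtered bands useful as texture features.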
In 2017, Schclar and Averbuch [45] focused on improving the classification results of
HSI using a Diffusion Bases (DB)-based methodology. The non-linear correlations amongst
wavelengths were captured to produce a low-dimensional representation of the data, reducing
the amount of noise. A modified version of the DB method was also proposed that used
eigendecomposition of symmetric matrices. These were conjugate to the non-symmetric
Markov matrix and used weight functions comprising pairwise similarity between pixels.
To cluster the low-dimensional data, a two-phase histogram-based segmentation method
named Wavelength-Wise Global segmentation (WWG) was used. In the wavelength-wise
view of an n-band HSI, the cube was considered as a collection of n images of
size m × m. The clustering was performed on the basis of colour similarity. The colour-based
segmentation included normalisation of the input image followed by its quantization.
A colour frequency histogram was built in which a certain number of the highest peaks were
detected and assumed to belong to different objects in the image, with the highest peak
being the largest homogeneous area, i.e., the background. It was assumed that quantized colour
vectors belonging to the same peak were part of the same coloured object. After identification
of the peaks, each quantized colour vector was associated with a single peak using Euclidean
distance and the final images were constructed. Microscopy and remotely sensed images
of Washington DC’s National Mall were used, on which various iterations of the proposed
methodology were performed. The classification results depended on the dimension
of the diffusion space, whose optimal value selection was yet to be studied by the authors.
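The relation between the non-symmetric Markov matrix and its symmetric conjugate can be made concrete with a generic diffusion-map sketch. With W the pairwise similarity weights and D the diagonal degree matrix, P = D⁻¹W is the Markov matrix and A = D^(-1/2) W D^(-1/2) is symmetric with the same eigenvalues. The Gaussian weight function, the scale `eps` and the random data are assumptions, and this is not the exact Diffusion Bases algorithm of the paper.

```python
import numpy as np

def diffusion_embedding(X, eps, n_dims):
    """Low-dimensional diffusion coordinates for the rows of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2 * X @ X.T, 0.0)
    W = np.exp(-d2 / eps)                         # pairwise similarity weights
    deg = W.sum(axis=1)
    P = W / deg[:, None]                          # non-symmetric Markov matrix D^{-1} W
    A = W / np.sqrt(np.outer(deg, deg))           # symmetric conjugate D^{-1/2} W D^{-1/2}
    eigvals, eigvecs = np.linalg.eigh(A)          # stable solver for symmetric matrices
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    right = eigvecs / np.sqrt(deg)[:, None]       # right eigenvectors of P
    # Skip the first (constant) eigenvector, whose eigenvalue is 1.
    coords = right[:, 1:n_dims + 1] * eigvals[1:n_dims + 1]
    return coords, eigvals, P

rng = np.random.default_rng(0)
X = rng.random((40, 6))                           # hypothetical data vectors
coords, eigvals, P = diffusion_embedding(X, eps=1.0, n_dims=2)
```

The number of retained coordinates, `n_dims`, plays the role of the diffusion-space dimension on which the reported classification results depend.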
In 2018, Jain et al. [46] proposed classification of HSI and trained on the important features
by optimizing the SVM using Self Organizing Maps (SOM). They classified the interior and
exterior pixels using posterior probabilities. SOM is a data compression technique in
which an incoming signal/pattern of any dimension is reduced to a 1D or 2D lattice using
competitive learning of neurons. In their approach the input images were converted to
grayscale, and ROIs were selected over which the SOM algorithm was applied to group
together the pixels in terms of features and intensity levels. The SOM training algorithm
provided inputs and weights to each edge of the image. Based on the neighbourhood of the Best
Matching Unit (BMU), found using Euclidean distance, the weights of each neighbouring node were
updated iteratively, bringing them closer to the input pattern. For classification of interior
and exterior pixels, posterior probabilities and an optimal threshold were computed. If the
probability of a pixel was greater than the threshold, the pixel belonged to the interior
of the particular class; otherwise it belonged to the boundary of a class. The experiment
was performed on the Indian Pines and Pavia University datasets, where it outperformed other
baseline methods, achieving highest accuracies of 85.29% and 95.46%, respectively.
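The BMU search and neighbourhood update at the heart of SOM training can be sketched as below. This is a generic SOM on synthetic two-cluster data; the grid size, the linearly decaying learning rate and neighbourhood width, and the helper names `som_train` and `quantisation_error` are all assumptions rather than the configuration used by Jain et al.

```python
import numpy as np

def som_train(data, grid=(4, 4), epochs=20, lr0=0.5, sigma0=1.5, seed=0):
    """Train a 2D SOM lattice; returns weights of shape (rows, cols, n_features)."""
    rng = np.random.default_rng(seed)
    rows, cols = grid
    weights = rng.random((rows, cols, data.shape[1]))
    coords = np.stack(np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1)
    n_steps, step = epochs * len(data), 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            frac = step / n_steps
            lr = lr0 * (1 - frac)                       # decaying learning rate
            sigma = sigma0 * (1 - frac) + 1e-3          # shrinking neighbourhood width
            # Best Matching Unit: node whose weight is closest (Euclidean) to x.
            d = np.linalg.norm(weights - x, axis=2)
            bmu = np.unravel_index(d.argmin(), d.shape)
            # Gaussian neighbourhood on the lattice, centred on the BMU.
            lattice_d2 = np.sum((coords - np.array(bmu)) ** 2, axis=2)
            h = np.exp(-lattice_d2 / (2 * sigma ** 2))
            weights += lr * h[:, :, None] * (x - weights)  # pull nodes towards x
            step += 1
    return weights

def quantisation_error(data, weights):
    """Mean distance from each sample to its nearest SOM node."""
    flat = weights.reshape(-1, weights.shape[-1])
    d = np.linalg.norm(data[:, None, :] - flat[None, :, :], axis=2)
    return d.min(axis=1).mean()

rng = np.random.default_rng(1)
data = np.vstack([0.2 + 0.02 * rng.standard_normal((50, 5)),
                  0.8 + 0.02 * rng.standard_normal((50, 5))])
untrained = som_train(data, epochs=0)     # same initial weights, no updates
trained = som_train(data, epochs=20)
```

Comparing `quantisation_error` before and after training shows the lattice nodes moving towards the input patterns.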
Band reduction techniques can reveal non-linear properties, but at the expense of
losing the original data’s representation. To address this, Ahmad et al. [47] in 2019 used a
non-linear, unsupervised, non-segmented and segmented Denoising Autoencoder (UDAE)-based
method for improving the classification of HSI. For segmented UDAE, the HSI cubes
were segmented spatially based on the pixel locations and further processing of the segmented
HSI images was done spectrally by the autoencoder. The experiment was performed on the Pavia
Centre, Pavia University and Salinas Valley datasets, where the proposed methodology
achieved the highest accuracy using SVM.
3.4.2. Semi-Supervised
In 2016, Romaszewski et al. [48] proposed a co-training approach based on the P-N
learning scheme, inspired by the Tracking-Learning-Detection (TLD) framework used to
track objects in videos. In the P-N scheme, two independent learners, P and N,
scored the unlabeled samples in different feature spaces and extended the training set.
The P-expert assumed the same class for spatially close pixels based on region growing. Its
score function was estimated using Gaussian Kernel Density Estimation over the distance from
known samples (seeds). The N-expert assumed the same class for pixels with similar spectra
and was defined as a Nearest Neighbor (NN) classifier with a rejection score for pixel
i. It identified the n closest spectral neighbours among the seeds, and the spectral Euclidean
distance was computed between pixel i and pixel j. The score formula was based on
probability estimation with the distance-weighted KNN rule. The scores from both
experts were combined. Spectral classification was performed for unlabeled pixels that could
not be labeled using region growing due to disjoint regions. They applied the approach
on six data sets: the Indian Pines, Salinas Valley, University of Pavia, La Selva Biological
Station and Madonna, Villelongue, France. The method achieved the highest classification
accuracy in comparison with various state-of-the-art approaches.
3.4.3. Supervised
In 2016, Li et al. [49] used a dual-layer supervised Mahalanobis distance kernel for
HSI classification. The traditional unsupervised approach was modified using a supervised
Mahalanobis matrix to obtain a new kernel that uses relativity information of the various
materials present in the images. The proposed approach was executed in two steps.
First, the traditional Mahalanobis matrix was used to map the raw data. Then, using the
mapped data, difficult-to-identify classes were selected and a second
Mahalanobis matrix was learned using this data only. A new Mahalanobis kernel
was formed by combining these two matrices. Finally, SVM was applied to this dimensionally
reduced data, achieving high performance on the Indian Pines, Salinas
Valley and Pavia University datasets. It resolved the drawback of traditional Mahalanobis
distance metric learning, which learns a matrix without taking into account the weights
of each class.
Nhaila et al. [50] performed supervised classification of HSI in 2019 using SVM, KNN,
RF and Linear Discriminant Analysis (LDA) with different kernels, along with Mutual
Information (MI) for dimension reduction. The features/bands were selected by computing the MI
between the ground truth and each band. The subset of bands was initialised with the band
having the highest MI value with the ground truth. The average of the last band and a new
candidate band built a reference map called the estimated ground truth. Finally, the candidate
band was added to the subset if it increased the previous MI value between the ground truth
and the reference map. The experiment was performed on the Indian Pines, Salinas Valley and
Pavia University datasets. SVM with RBF kernel and RF outperformed the other learners.
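The MI-driven selection loop described above can be sketched with a histogram-based MI estimate. The code below is one possible reading of the procedure, in which a candidate is accepted when averaging it with the last selected band raises the MI with the ground truth. The bin count, the candidate visiting order (descending individual MI) and the synthetic bands are assumptions, not details confirmed by the paper.

```python
import numpy as np

def mutual_information(a, b, bins=16):
    """MI (bits) between two 1-D arrays, estimated from a joint histogram."""
    joint, _, _ = np.histogram2d(a, b, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)           # marginal of a, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)           # marginal of b, shape (1, bins)
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])))

def greedy_mi_selection(bands, ground_truth, max_bands):
    """bands: (n_pixels, n_bands); ground_truth: (n_pixels,) class labels."""
    mi = [mutual_information(bands[:, b], ground_truth) for b in range(bands.shape[1])]
    selected = [int(np.argmax(mi))]               # start from the most informative band
    best = mi[selected[0]]
    for b in np.argsort(mi)[::-1][1:]:            # remaining bands, most informative first
        if len(selected) == max_bands:
            break
        # Reference map ("estimated ground truth"): average of last band and candidate.
        reference = (bands[:, selected[-1]] + bands[:, b]) / 2.0
        score = mutual_information(reference, ground_truth)
        if score > best:                          # keep the band only if MI increases
            selected.append(int(b))
            best = score
    return selected, best

rng = np.random.default_rng(0)
gt = rng.integers(0, 2, size=2000)                # hypothetical two-class ground truth
bands = np.column_stack([gt + 0.05 * rng.standard_normal(2000),   # informative band
                         rng.standard_normal(2000),               # pure-noise band
                         gt + 0.8 * rng.standard_normal(2000)])   # weakly informative band
selected, best_mi = greedy_mi_selection(bands, gt, max_bands=2)
```

The selected band indices would then be used to subset the cube before training the SVM, KNN, RF or LDA classifiers.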
The aforementioned supervised, semi-supervised and unsupervised dimension reduction-
based classification techniques have been compared in Table 3.
Table 3. Cont.
MKL could be built for compact representation. The drawback was choosing an appropriate
number of kernels which was a tradeoff between efficiency and accuracy. The number was
chosen between 9 and 12.
In 2017, Yang et al. [55] also worked on representative band selection in HSI. The distances
between spectral bands were computed using disjoint information. Bands were clustered
using k-means and ‘K’ representative bands were selected from these clusters. The criterion
for optimal selection was based on minimizing the distances between bands inside the
clusters and maximizing the gap between different representative bands. The disjoint
information was calculated using the joint entropy and MI of two spectral images. The proposed
technique used KNN and SVM classifiers on the Indian Pines dataset and outperformed
various state-of-the-art techniques.
In 2018, Medjahed et al. [56] posed feature selection in HSI as an optimization problem
solved with a stochastic approach, namely simulated annealing. It was used to optimize an
objective function combining the classification accuracy rate and the relevance among features
in terms of MI. The approach was compared with existing feature selection approaches
like Mutual Information (MI) Feature Selection, MI Maximization (MIM), Joint MI (JMI),
Minimum Redundancy Maximum Relevance (MRMR) and Conditional MI Maximization
(CMIM). The proposed work achieved the highest accuracy rate of 88.75% with 10 features
compared to the above techniques on the Pavia University dataset. Their study achieved the
highest OA of 91.47% compared to the other classifiers in their literature on the same
dataset. For the Indian Pines dataset, the highest OA of 76.48% and AA of 71.72% were obtained
in comparison with SVM and a genetic algorithm, using 10 features and 20% training pixels.
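Simulated annealing over band subsets follows a simple pattern: propose a neighbouring subset, always accept improvements, and accept worse subsets with a probability that shrinks as the temperature cools. The sketch below uses a toy objective (the sum of hypothetical per-band relevance scores) in place of the accuracy-plus-MI objective of the paper; the cooling schedule and the swap move are likewise assumptions.

```python
import numpy as np

def simulated_annealing_select(objective, n_features, n_select, n_iter=300,
                               t0=1.0, cooling=0.98, seed=0):
    """Maximise objective(indices) over subsets of size n_select."""
    rng = np.random.default_rng(seed)
    current = rng.choice(n_features, size=n_select, replace=False)
    cur_val = objective(current)
    best, best_val = current.copy(), cur_val
    temp = t0
    for _ in range(n_iter):
        # Neighbour: swap one selected feature for one unselected feature.
        proposal = current.copy()
        unused = np.setdiff1d(np.arange(n_features), current)
        proposal[rng.integers(n_select)] = rng.choice(unused)
        val = objective(proposal)
        # Accept improvements always; accept worse moves with Boltzmann probability.
        if val >= cur_val or rng.random() < np.exp((val - cur_val) / temp):
            current, cur_val = proposal, val
            if cur_val > best_val:
                best, best_val = current.copy(), cur_val
        temp *= cooling                           # geometric cooling schedule
    return np.sort(best), best_val

relevance = np.linspace(0.0, 1.0, 20)             # hypothetical per-band relevance scores

def objective(idx):
    """Toy objective: total relevance of the chosen bands."""
    return float(relevance[idx].sum())

best, best_val = simulated_annealing_select(objective, n_features=20, n_select=5)
```

In a wrapper setting, `objective` would instead train a classifier on the candidate bands and return its accuracy, possibly combined with an MI-based relevance term.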
Xie et al. [57] addressed the problem of dimensionality reduction in 2019 via selection of
features/bands that were information-rich and less redundant. Improved Subspace
Decomposition (ISD) and the Artificial Bee Colony (ABC) algorithm were used. The correlation
coefficients between adjacent bands were calculated. Local minima and spectral curve
visualization helped in achieving the subspace decomposition for choosing m bands from
the original n bands. Band subset selection was done by randomly choosing k bands
from each band subspace. It was optimized by the ABC algorithm with the help of ISD
and maximum entropy. In the end, SVM was applied for classification of the obtained
optimized band subsets. The proposed work was implemented on the Pavia University, Indian
Pines and Salinas Valley datasets and achieved better performance than various
state-of-the-art approaches for feature selection.
In 2019, Sellami et al. [58] focused on tackling the curse of dimensionality and the limited
number of training samples by selecting appropriate features/bands. Adaptive dimension
reduction was used that sought relevant bands with high discrimination, high information and
low redundancy. To extract spatial-spectral information, the spatial window included features
from neighbouring pixels. These were loaded into a semi-supervised 3-D CNN with convolutional
encoder-decoder layers for 3-D convolution and max-pooling. The classification
map was created using a linear regression classifier. The investigation was carried out
using data from Indian Pines, Pavia University and Salinas Valley. In comparison to other
recent techniques, the proposed approach attained the highest OA for all datasets.
Elzaimi et al. [59] used a filter-based approach with an information gain function to
reduce the dimensionality in 2019. The bands were chosen based on their interaction
and complementarity. Classification was performed using SVM. The algorithm selected
the discriminative bands using an evaluation of interaction gain that maximised the compromise
of the MI between the ground truth and the selected band. The average of the
interaction information helped in controlling the redundancy. The selected band subset
was initialized with the band that had the highest MI with the class label, which served as
the estimated ground truth. Iteratively, candidate bands were added by computing their MI with
the ground truth. Their information gain was calculated based on the mean interaction
information between the candidate bands, the ground truth and the estimated ground truth. The
band that maximized the information gain criterion was chosen in each step. The experiment was
performed on two benchmark hyperspectral datasets, Indian Pines and Pavia University,
and compared with other band selection algorithms like MI Feature Selection, the Minimum
Redundancy Maximum Relevance (MRMR) method and the MI-based Filter approach (MIBF).
The proposed work achieved highest OAs of 95.25% and 96.83% on the Indian Pines and Pavia
University datasets, respectively.
In 2020, Sawant et al. [60] proposed a meta-heuristic optimization method for
band selection using the Modified Cuckoo Search (MCS) algorithm. Initially, a Chebyshev
chaotic map was used to initialize the nest locations (solutions). This
ensured that similar bands were not generated repeatedly. The fitness value and current
iteration number were used to iteratively update the step size and a scaling factor of the
Levy Flight method, which generated new solutions (bands) in every iteration. These two
modifications to the standard Cuckoo Search (CS) algorithm gave MCS and helped in escaping
from local optima. They used a wrapper-based selection method, in which accuracy was checked
by involving the classifier in every iteration. The global best solution was obtained in the
end. The proposed technique outperformed the standard CS algorithm and achieved a maximum
OA of 95.10% for the Pavia University dataset and 86.92% for the Indian Pines dataset.
To reduce the complexity of numerous spectral bands, Zhu et al. [61] used the Affinity
Propagation (AP) clustering algorithm. An improved AP was used in which subsets were created
inside the clusters and the information entropy was combined to change the availability matrix
and create clusters with arbitrary shapes. It achieved an OA of 91.5% on Salinas Valley.
The aforementioned feature selection-based classification techniques have been compared
in Table 4.
Table 4. Cont.
Paul et al. [68] used an MI-based S-SAE method in 2018. MI is a dependency measure
between bands: a value of 1 indicates high dependency while 0 indicates independent bands.
Non-parametric MI-based spectral segmentation was performed. Local features of each segment
were extracted using S-SAE. MPs of the segmented spectral features gave spatial information.
The experiment was performed on 10%, 5% and 10% training samples of each class of
the Indian Pines, Pavia University and Botswana datasets, respectively. SVM with Gaussian
kernel gave better performance in classification of the Pavia University and Botswana
datasets, while Random Forest classified the Indian Pines dataset better. The method overcame
the limitation of time-consuming and complex SAE-based feature extraction, and performed well
even with a limited number of samples. In future, other non-linear feature extraction
methods like kernel PCA could be used with the proposed method, and DL models could be
integrated for spectral-spatial classification.
The comparative study of the aforementioned feature extraction-based classification
techniques is presented in Table 5.
2018 Paul et al. [68] MI-based SAEs, MPs for spatial features. NA
Hu et al. [76] were inspired by the application of CNNs to 2D images and, in 2015, applied
the same idea in the spectral domain of HSIs. They used a 1-D CNN with five layers consisting of
input, convolution, max pooling and fully connected layers. It helped in discriminating
each spectral signature from the others. Their 5-layer CNN architecture achieved better
accuracy than a traditional SVM, a 2-layer Neural Network and the LeNet-5 architecture.
Chan et al. [77] proposed a DL-based network in 2015. It consisted of basic processing
components: cascaded PCA to learn multistage filter banks, and binary hashing and blockwise
histograms for indexing and pooling. This network was called PCANet. It was applied to
benchmark visual datasets for digit and face recognition. PCANet served as an effective
baseline against which more advanced processing components or more sophisticated architectures
could be justified.
DL has been extensively used for HSI analysis and classification, but high-quality
labeled samples are needed for DL to be utilised efficiently. In 2016, Liu et al. [78] tackled
this challenge using weighted incremental dictionary learning, on which an active learning-based
algorithm was developed. They selected only those training samples which improved
the selection criteria, namely uncertainty and representativeness. This trained the deep network on
how and which samples to select at each iteration for training. Their approach achieved
accuracies of 92.4% and 91.6% on the Pavia University and Botswana datasets, respectively.
In 2016, Chen et al. [79] dealt with the challenges of limited training samples and
high dimensionality using a regularized deep feature extraction method. To obtain better
spectral-spatial features, the authors employed a 3D CNN. They also applied L2 regularization
and dropout to overcome overfitting. The authors further improved the CNN
performance by using virtual samples, which were generated by multiplying training samples by a random
factor and adding noise. Their work achieved an OA of 97.56%,
99.54% and 96.31% on the Indian Pines, Pavia University and Kennedy Space Centre datasets,
respectively. In future, a post-processing methodology could further improve
classification.
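The virtual-sample idea described above (a training sample multiplied by a random factor, plus noise) can be sketched as follows. The function name `make_virtual_samples`, the factor range and the noise level are illustrative assumptions, not the settings used by Chen et al. [79].

```python
import numpy as np

def make_virtual_samples(pixels, n_virtual, alpha_range=(0.9, 1.1),
                         noise_std=0.01, seed=0):
    """Create virtual spectra as alpha * x + noise from randomly chosen real spectra.

    pixels: (N, B) array of training spectra. Returns the virtual spectra and
    the index of the source pixel, whose class label each virtual sample inherits.
    """
    rng = np.random.default_rng(seed)
    idx = rng.integers(0, pixels.shape[0], size=n_virtual)
    alpha = rng.uniform(*alpha_range, size=(n_virtual, 1))   # random factor
    noise = rng.normal(0.0, noise_std, size=(n_virtual, pixels.shape[1]))
    return pixels[idx] * alpha + noise, idx

# Toy training set: 20 pixels with 200 bands each.
train = np.random.default_rng(1).random((20, 200))
virtual, source_idx = make_virtual_samples(train, n_virtual=50)
```

Returning `source_idx` lets each virtual sample reuse the label of the real pixel it was derived from.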
Electronics 2023, 1, 0 22 of 35
Dimension reduction and feature extraction were performed by Zabalza et al. [80] in 2016 using
Segmented Stacked Autoencoders (S-SAE). With S-SAE, the spectral segmentation of the
pixels was performed: the original features were divided into smaller segments of data, which were
processed separately by smaller, local SAEs on the segmented spectrum. The complexity
was greatly reduced with the proposed method. It achieved better accuracy in segmentation
and classification of the scenes in the Indian Pines and Centre of Pavia datasets. The work could
be extended using saliency detection methods, adaptive sparse representation and weakly
supervised learning. The major drawback was that spatial features were not extracted.
To deal with highly correlated bands and limited samples, Yu et al. [81] proposed a
CNN in 2017 which processed raw HSI input in an end-to-end manner. They also used a
small training dataset to optimise the parameters of the CNN, which helped with the problem
of overfitting. To deal with HSI information, 1 × 1 convolutional layers were adopted. Their
approach obtained an OA of 64.19% on Indian Pines, 67.85% on Pavia University and
85.4% on Salinas Valley using only 3 labelled samples per class for training.
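A 1 × 1 convolution applies the same linear map across the spectral bands of every pixel, without mixing neighbouring pixels. The following minimal sketch shows this for an HSI cube; the function name `conv1x1`, the cube size and the number of output channels are illustrative assumptions and do not reproduce the architecture of Yu et al. [81].

```python
import numpy as np

def conv1x1(cube, weights, bias):
    """Apply a 1 x 1 convolution to an (H, W, B) cube.

    weights: (B, C) matrix shared by all pixels; bias: (C,). Returns (H, W, C).
    """
    return np.tensordot(cube, weights, axes=([2], [0])) + bias

rng = np.random.default_rng(0)
cube = rng.random((5, 5, 103))           # toy cube with 103 bands
weights = rng.normal(size=(103, 16))     # 16 output channels (assumed)
bias = np.zeros(16)
features = conv1x1(cube, weights, bias)
```

Each output pixel equals `cube[r, c] @ weights + bias`, so the layer recombines bands while leaving the spatial layout untouched.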
A CNN needs a lot of parameters, and hence more training samples are desired
for the convolution filters. With the limited samples of HSI, however, overfitting happens, which
gives overoptimistic results. Addressing these concerns, Chen et al. [82] focused on reducing
the feature extraction burden of the CNN by using Gabor filters, which extracted spatial information,
edge information and textural features. They combined convolution filters with Gabor
filters, and grid search was used to find the parameters of the Gabor filters. In comparison with
traditional methods like SVM, CNN with PCA and simple CNN, their approach achieved the
highest OA and AA.
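For reference, a real-valued Gabor kernel of the kind used for such texture and edge features can be generated as in the sketch below. It uses the standard form of a Gaussian envelope multiplied by a cosine carrier; the function name `gabor_kernel` and all parameter values are illustrative assumptions rather than the values selected by grid search in [82].

```python
import numpy as np

def gabor_kernel(size, sigma, theta, wavelength, gamma=0.5, psi=0.0):
    """Real Gabor kernel: Gaussian envelope times a cosine carrier along theta."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    x_r = x * np.cos(theta) + y * np.sin(theta)      # rotated coordinates
    y_r = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(x_r**2 + (gamma * y_r)**2) / (2.0 * sigma**2))
    carrier = np.cos(2.0 * np.pi * x_r / wavelength + psi)
    return envelope * carrier

# A small bank over four orientations (assumed values).
bank = [gabor_kernel(size=9, sigma=2.0, theta=t, wavelength=4.0)
        for t in np.linspace(0, np.pi, 4, endpoint=False)]
```

A grid search would simply loop over candidate values of `sigma`, `wavelength` and `theta` and keep the combination that gives the best validation accuracy.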
A deep CNN was used to reconstruct images and enhance their spatial features by
Yunsong et al. [83]. Each band was normalized to the range [0, 1]. The spatial features of
different classes that had similar characteristics were enhanced to avoid spectral distortion.
PCA was performed to extract PC images, and the first PC image was chosen as the reference image
because of its high spatial information. The Gray Level Co-occurrence Matrix (GLCM) was used to extract
spatial features like entropy, contrast, correlation and dissimilarity. The GLCM features of the bands
were compared with the corresponding features of the first PC as a ratio, and the band
with the minimum value of the ratio was selected as the training label. A CNN model with optimized
parameters was used to train on the data, and ELM was used for further classification. This combined
framework gave high performance with fewer training samples of the Indian Pines, Salinas Valley
and Centre of Pavia datasets. Image reconstruction helped in increasing the AA of
ELM by as much as 30.04%. It also ran faster than other state-of-the-art classifiers.
Although earlier authors obtained good performance with 1-D CNNs, these resulted in
information loss when representing HSI pixels, which are sequence-based data. Hence,
Mou et al. [84] analysed the pixels using a deep Recurrent Neural Network (RNN) with the
Parametric Rectified tanh (PRetanh) activation instead of the regular activation functions used by others,
such as tanh or the rectified linear unit. With this approach, band-to-band variability and spectral
correlation were captured well. It also allowed training with high learning rates without the
risk of divergence. The authors reduced the number of parameters
by using gated recurrent units to build their network. These units used PRetanh for the hidden
representation and processed HSI efficiently. Their approach outperformed traditional
methods like SVM, RF and CNN.
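A bounded, parametric activation of this kind can be sketched as below. The form max(0, tanh(x)) + λ·min(0, tanh(x)) is an assumption stated here for illustration, with the slope λ fixed to a constant rather than learned; the function name `pretanh` is hypothetical.

```python
import numpy as np

def pretanh(x, lam=0.25):
    """Assumed PRetanh form: tanh on the positive side, lam * tanh on the negative side."""
    t = np.tanh(x)
    return np.maximum(0.0, t) + lam * np.minimum(0.0, t)

x = np.array([-2.0, 0.0, 2.0])
y = pretanh(x, lam=0.25)
```

Because the output stays within (-1, 1), hidden states cannot grow without bound across the band sequence, which is consistent with the stability at high learning rates noted above.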
In 2018, Zhang et al. [85] used a context-aware CNN framework encoded with semantic
information. Their approach had more discriminative power due to diverse
region-based inputs. Their model had different CNN branches, with each branch representing
a different region for the pixel under inspection. Unlike the traditional square window
around a pixel, they extracted six regions, namely the right, left, top, bottom, whole and
local regions of a pixel, with flexible patch shapes. They also extracted deep spectral-spatial
features using a multi-scale summation module, which dealt with limited training
samples, enhanced learning capability and improved generalization. Accuracies of 98.54%
and 98.33% were recorded for Indian Pines and Salinas Valley, respectively.
Although many joint spectral-spatial feature representations for limited
samples had been proposed earlier, they were not very generic or robust. Deng et al. [86] built
a unified deep network in combination with Active Transfer Learning (ATL). Initially,
the authors extracted joint spectral-spatial features using Stacked Sparse AutoEncoders
(SSAE). With the help of ATL, they transferred the pre-trained SSAE network and limited
training samples from a source domain to a target domain. The SSAE network was
then fine-tuned using limited samples from both the source and target domains
with active learning strategies. They obtained the highest OA of 99.61% and 99.86% after
transferring the samples from Pavia University to Pavia Centre and vice versa,
respectively.
HSI classification is improved by fusing spectral-spatial information. Taking advantage
of this, Liang et al. [87] extracted deep multi-scale spectral-spatial features for HSI and
named the framework DMSF. They transferred the filter banks of the VGG16 model, which
learned the spatial structure of HSI. They fused these deep spatial features with raw
spectral information using sparse autoencoders. They obtained the final discriminative
features by a weighted fusion of these spectral-spatial features in VGG16. The resulting
features were classified using SVM and gave high accuracy.
Wang et al. [88] focused on improving the training time and accuracy of HSI classification.
Traditional methods used hand-crafted features and needed improvement in
accuracy. To solve this, they developed the end-to-end Fast Dense Spectral-Spatial Convolution
framework (FDSSC), which did not rely on PCA or any other feature extractor. In FDSSC,
they used “valid” convolutions of different sizes to extract spectral-spatial features and
reduce dimensions. To achieve highly accurate results, they used densely connected
layers, where each previous layer contributed to the following layers. The authors
used a dynamic learning rate, parametric Rectified Linear Unit (PReLU) activation,
batch normalisation and dropout layers to increase speed and reduce overfitting. This helped
the authors to achieve high performance within 80 epochs.
Yang et al. [89] exploited the success of CNNs in HSI classification in 2018. They
used both spectral and spatial information and built different models: 2D CNN, 3D
CNN, Recurrent 2D CNN (R-2D CNN) and R-3D CNN. Their models converged faster in
comparison with traditional methods like CNN and SVM. Although their models were
superior, they needed more training samples than other methods. Incorporating prior
domain knowledge of the dataset and transfer learning could help improve performance further.
Pan et al. [90] used PCANet as the foundation, into which multi-grain and semi-supervised
information were integrated. The resulting multi-grained network, called MugNet, was
a simplified DL model designed to deal with few training samples. Each grain contained
a DL model, and classification results were obtained via an ensemble approach. MugNet
was built with three strategies to enhance the classification accuracy. First, a multi-grained
scanning approach utilized the spectral relationships between the bands and the spatial
correlation among neighbouring pixels; this scanning strategy extracted the joint
spatial-spectral information. Second, the convolutional kernels were generated
in a semi-supervised manner. Lastly, it did not include any hyperparameters for tuning.
MugNet had two parallel branches: spectral MugNet and spatial MugNet. Both
were based on Semi-Supervised PCANet (SSPCANet), which had 4 layers: 1
input layer, 2 convolutional layers and 1 output layer. SSPCANet used the unlabeled pixels
to learn more representative convolutional kernels, while the labeled pixels were used to train
an SVM classifier. It obtained the highest OA of 90.65%, 90.82% and 93.15% on the Indian Pines,
Grss_dfc_2013 and Grss_dfc_2014 datasets, respectively, in comparison with other state-of-the-art
approaches. The computational efficiency needs to be improved. In future, MugNet
could be made completely end-to-end.
Paoletti et al. [91] proposed a 3D CNN architecture to obtain spectral and spatial
features of HSI and classified them using Graphics Processing Units (GPUs). A border
mirroring strategy was applied to process the border areas of the image. The images were
divided into patches of d × d × n, where d was the width and height of the neighbourhood
window centered at a pixel and n was the number of spectral bands of the original image.
d/2 border pixels were mirrored outwards so that border pixels could be treated like any other
pixel in the image. The 3D patches were grouped into batches and sent to the convolution
layers. Four fully connected layers were used, and cross entropy was the loss function of the
CNN. The experiment was performed on the Indian Pines and Pavia University datasets using
various values of the parameter d. In comparison with 1D, 2D and 3D CNNs and the Multi-Layer Perceptron, it
achieved the highest accuracy for different values of d. The classification accuracy
was dependent on manual selection of parameters.
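The border mirroring and d × d × n patch extraction described above can be sketched with NumPy padding. The function name `extract_patches` and the choice of `mode="symmetric"` (which repeats the edge pixel when mirroring) are assumptions; the exact mirroring convention of Paoletti et al. [91] is not specified here.

```python
import numpy as np

def extract_patches(cube, d):
    """Return one d x d x n patch per pixel of an (H, W, n) cube, for odd d."""
    pad = d // 2
    # Mirror d // 2 border pixels outwards in both spatial dimensions.
    padded = np.pad(cube, ((pad, pad), (pad, pad), (0, 0)), mode="symmetric")
    h, w, n = cube.shape
    patches = np.empty((h * w, d, d, n), dtype=cube.dtype)
    for r in range(h):
        for c in range(w):
            patches[r * w + c] = padded[r:r + d, c:c + d, :]
    return patches

cube = np.arange(4 * 5 * 3, dtype=float).reshape(4, 5, 3)   # toy 4 x 5 image, 3 bands
patches = extract_patches(cube, d=3)
```

Every pixel, including those on the border, now has a full neighbourhood, and the centre of each patch is the pixel to be classified.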
In 2018, Chen et al. [92] proposed HSI classification driven by joint spatial and spectral
features. Image blocks containing local neighbourhood features gave the spatial features, and the spatial and
spectral features were merged using the convolutional layers. The results were obtained
from the fully connected layer, and the network outperformed other state-of-the-art approaches.
The proposed network was also combined with an SVM (RBF kernel) in some of the fully
connected layers. An adaptive mechanism to select the spatial window size was also proposed.
For obtaining the features, the first convolution layer was a multi-scale feature extraction
layer that extracted features invariant to deformation and scaling. The second convolution
layer, the feature fusion layer, merged the spatial and spectral features and was followed by a feature
reduction convolution layer. The proposed network obtained an OA of 98.02% on the Indian
Pines dataset, which was higher than other approaches. In combination with SVM, the highest
accuracies of 98.39% and 98.44% were obtained on the Indian Pines and Pavia University
datasets, respectively. The best size for the adaptive window was selected on the basis
of a confidence criterion, where Conf(k) represented the possibility of the input pattern being
classified into the kth class. The algorithm worked as follows: two window sizes,
A×A and B×B with A > B, were chosen at random. For the A×A window, ‘m’ was the most probable class
and ‘n’ the second most probable class. If, for A, Conf(n) < Conf(m)×theta, then
the output would be the mth class. If the condition was not satisfied, the B×B window
would give a more confident result and the input block was classified into the m’th class. Adaptive
window size selection helped in overcoming the problem of a large window that might
contain many intersecting categories and hence confuse the network. The proposed method
improved the classification accuracy for HSI significantly.
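The confidence criterion for adaptive window selection can be sketched as follows. The function name `adaptive_window_label` and the example confidence vectors are hypothetical, and the fallback class is assumed to be the most probable class under the smaller B×B window.

```python
import numpy as np

def adaptive_window_label(conf_a, conf_b, theta):
    """Pick a class label from two confidence vectors (large window A, small window B)."""
    order = np.argsort(conf_a)[::-1]
    m, n = order[0], order[1]            # most and second most probable classes for A
    if conf_a[n] < conf_a[m] * theta:    # A is confident enough: keep class m
        return int(m)
    return int(np.argmax(conf_b))        # otherwise trust the smaller window B

conf_small = np.array([0.20, 0.70, 0.10])
# Large window is ambiguous (0.48 vs 0.45): fall back to the small window.
label_ambiguous = adaptive_window_label(np.array([0.48, 0.45, 0.07]), conf_small, theta=0.5)
# Large window is confident (0.90 vs 0.06): keep its decision.
label_confident = adaptive_window_label(np.array([0.90, 0.06, 0.04]), conf_small, theta=0.5)
```

In the first call the two leading classes are too close for the large window, so the small window decides (class 1); in the second call the large window's class 0 is kept.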
Earlier classification techniques did not extract HSI features effectively. To address
this concern, Singh and Kasana [93] used deep features to classify HSI. The authors
first reduced the dimension to suppress data redundancy using Locality Preserving
Projection (LPP). The processed data was forwarded to a Stacked Auto Encoder (SAE) for
deep feature extraction. Logistic regression was used for classification, and their work achieved an OA of
84.4% and 87.2% on Indian Pines and Salinas Valley, respectively.
In 2019, Zhou et al. [94] used the spectral-spatial LSTM networks shown in Figure 11
for the classification of HSI. The spectral values of each pixel in all the channels were fed
into the Spectral LSTM (SeLSTM). Initially, the pixel vector with K
bands was transformed into a K-length sequence. This sequence was fed element by element into
SeLSTM, and the last output was fed to the SVM. For the spatial LSTM (SaLSTM), local patches centered at a
pixel were taken from the 1st PC image, and the row vectors of each image patch were fed one by one into
SaLSTM; the rows of the neighbourhood were thus converted into an S-length sequence. Figures 12
and 13 display the structures of SeLSTM and SaLSTM, respectively. For classification, spectral
and spatial features were obtained separately for each pixel. A decision fusion strategy
was adopted to obtain joint features: for joint spectral-spatial classification, the results of the
individual LSTMs were fused by weighted summation. The performance of
SeLSTM, SaLSTM and SSLSTMs was compared with several methods, including PCA,
LDA, non-parametric weighted feature extraction (NWFE), regularized local discriminant
embedding (RLDE), matrix-based discriminant analysis (MDA) and CNN; their
method improved the classification accuracy by at least 2.69%, 1.53% and 1.08% on the Indian
Pines, Pavia University and Kennedy Space Centre datasets, respectively.
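The weighted-summation decision fusion can be illustrated with the short sketch below. The function name `fuse_decisions`, the weight value and the toy class probabilities are assumptions for illustration only; in practice the two inputs would be the per-class outputs of the spectral and spatial branches.

```python
import numpy as np

def fuse_decisions(p_spectral, p_spatial, w_spectral=0.5):
    """Fuse per-class probabilities of two branches by weighted summation."""
    fused = w_spectral * p_spectral + (1.0 - w_spectral) * p_spatial
    return fused, fused.argmax(axis=1)

# Two pixels, three classes: rows are per-pixel class probabilities.
p_se = np.array([[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]])   # spectral branch
p_sa = np.array([[0.2, 0.7, 0.1], [0.1, 0.3, 0.6]])   # spatial branch
fused, labels = fuse_decisions(p_se, p_sa, w_spectral=0.4)
```

With a spectral weight of 0.4 the fused probabilities are [0.36, 0.54, 0.10] and [0.14, 0.38, 0.48], so the two pixels are assigned to classes 1 and 2.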
In 2019, Fang et al. [95] also extracted deep spectral-spatial features at different patch
scales using 3D dilated convolutions. All the feature maps were densely connected with
each other. To obtain more distinguishing and less redundant spectral features, the authors
also built a spectral-wise attention (SA) mechanism which used soft weights for features. It
achieved an OA of 86.62% on Indian Pines and 92.99% on Pavia University.
Earlier research implementing ELM did not deal with insufficient samples efficiently.
To address this, Liu et al. [96] in 2020 implemented ELM-based ensemble transfer
learning. The learners of the target domain helped in determining whether the source
dataset was useful or not. They retained the biases and weights learned by the ELM in the target
domain and utilised the instances of the source domain to iteratively update the output
weights of the ELM. These weights were used for the training models, which
were then ensembled. In this manner, they used source data to improve
the ability of the learner in the target domain. They used Pavia University and Pavia Centre
interchangeably as source and target domains to check the efficiency of their approach.
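For context, a basic ELM has fixed random hidden weights and biases, and only its output weights are solved for in closed form. The sketch below shows this baseline on toy data; it does not reproduce the transfer or ensemble steps of Liu et al. [96], and the class name `ELM`, the tanh activation and the toy data are assumptions.

```python
import numpy as np

class ELM:
    """Basic extreme learning machine: random hidden layer, least-squares output layer."""

    def __init__(self, n_hidden, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, x):
        return np.tanh(x @ self.w + self.b)

    def fit(self, x, y, n_classes):
        # Hidden weights and biases are drawn once and never trained.
        self.w = self.rng.normal(size=(x.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        targets = np.eye(n_classes)[y]                      # one-hot labels
        # Output weights via the pseudo-inverse of the hidden activations.
        self.beta = np.linalg.pinv(self._hidden(x)) @ targets
        return self

    def predict(self, x):
        return (self._hidden(x) @ self.beta).argmax(axis=1)

# Two well-separated toy classes with 10 features each.
rng = np.random.default_rng(1)
x = np.vstack([rng.normal(loc=-2.0, size=(50, 10)),
               rng.normal(loc=2.0, size=(50, 10))])
y = np.array([0] * 50 + [1] * 50)
model = ELM(n_hidden=40).fit(x, y, n_classes=2)
train_acc = float((model.predict(x) == y).mean())
```

Because only `beta` is learned, updating the output weights with additional source-domain instances, as described above, amounts to re-solving or incrementally refining this least-squares problem.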
Ramamurthy et al. [97] tried to reduce computational complexity by denoising and
reducing the dimensions of HSI. Initially, they recognised the edges of images through image
denoising and David Marr edge recognition with the Canny edge detector. Further, they
segmented HSIs into pixels, reconstructed them and optimised the reconstruction loss.
The HSIs were denoised again using AutoEncoders, and the dimension was reduced using PCA.
In the end, they obtained classification results using a CNN. They obtained an OA of 92.5%
on the Pavia University dataset.
Sharifi et al. [98] also focused on extracting spectral-spatial features of HSI. Earlier,
Gabor filters were used to extract shallow texture features, which were fed into a DL model. The authors
aimed to improve the performance and hence extracted two-stage textural features.
The authors applied PCA, then extracted Gabor features and took their mean over
all directions at each scale. They then obtained the LBP of these Gabor features, which was more
discriminative than Gabor features or LBP alone. They stacked these features and used a
3D CNN for classification. Their work recorded an OA of 97.72% on the Indian Pines dataset.
Cao et al. [99] proposed a new CNN architecture termed 3D-2D SSHDR, an
end-to-end hybrid dilated residual network that took 3D hyperspectral cubes as
input. 3D-2D SSHDR contained five parts, i.e., a spectral feature learning process, a 3D to 2D
deformable part, a spatial feature learning process, an average pooling layer and a fully
connected layer. The 3D spectral residual blocks learned discriminant spectral features.
For spatial feature learning, the extracted spectral features of the 3D images were converted
into 2D feature maps. To continue learning discriminative spatial features, hybrid dilated
convolution (HDC) residual blocks were used, which increased the receptive field of the
convolution kernel without increasing the number of parameters. The proposed network was
trained using supervised learning. The experiment was performed on the Indian Pines, Kennedy
Space Center and Pavia University datasets, achieving OA of 99.46%, 99.89% and
99.81%, respectively, which was high compared with other CNN models. The spatial features were not
extracted in 3D. In future, transfer learning could help to extend samples and
improve accuracy.
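The effect of dilation on the receptive field can be checked with a small calculation. A k × k kernel with dilation rate r covers k + (k − 1)(r − 1) pixels per side while still holding only k × k weights. The function names and the dilation rates [1, 2, 3] below are illustrative assumptions, not the configuration of 3D-2D SSHDR.

```python
def effective_kernel(k, rate):
    """Side length covered by a k x k kernel with the given dilation rate."""
    return k + (k - 1) * (rate - 1)

def receptive_field(k, rates):
    """Receptive field of stacked stride-1 convolutions with the given dilation rates."""
    rf = 1
    for r in rates:
        rf += effective_kernel(k, r) - 1
    return rf

rf_plain = receptive_field(3, [1, 1, 1])    # three ordinary 3 x 3 layers -> 7
rf_hybrid = receptive_field(3, [1, 2, 3])   # hybrid dilation rates -> 13
weights_per_kernel = 3 * 3                  # unchanged by dilation
```

Three ordinary 3 × 3 layers see a 7 × 7 neighbourhood, whereas the dilated stack sees 13 × 13 with the same nine weights per kernel.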
Nalepa et al. [100] proposed a resource-frugal quantized spectral CNN. The weights and
activations were represented in a compact format, such as integer or binary numbers, without
affecting the classification process. They utilized multi-stage quantization-aware training:
the deep model was trained in full precision, followed by fake quantization, and trained
again before being quantized to the final low-bit version. Fake quantization was used as an
intermediate step to simulate the quantization of weights and activations. The experiment was
performed on Pavia University and Salinas Valley. The model, four times smaller in size
than its original counterpart, segmented equally well. It helped to reduce the memory
footprint of a large-capacity model for classifying HSI. Varying the quantization levels could
help to understand the abilities of DL models better.
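Fake quantization can be sketched as a quantize-then-dequantize step: values are rounded onto a low-bit integer grid and immediately mapped back to floating point, so the network experiences the rounding error during training. The uniform min-max scheme and the function name `fake_quantize` below are assumptions for illustration and not necessarily the scheme used in [100].

```python
import numpy as np

def fake_quantize(x, num_bits=8):
    """Round x onto a uniform num_bits grid and map it back to floats."""
    qmin, qmax = 0, 2**num_bits - 1
    x_min, x_max = float(x.min()), float(x.max())
    scale = (x_max - x_min) / (qmax - qmin) if x_max > x_min else 1.0
    q = np.clip(np.round((x - x_min) / scale), qmin, qmax)   # integer levels
    return q * scale + x_min                                 # back to float values

weights = np.random.default_rng(0).normal(size=(64, 32))
w_q = fake_quantize(weights, num_bits=8)
max_err = float(np.abs(weights - w_q).max())
step = float(weights.max() - weights.min()) / 255
```

The result takes at most 256 distinct values and differs from the original weights by no more than half a quantization step; storing the 8-bit levels instead of 32-bit floats gives the fourfold size reduction mentioned above.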
Vaddi et al. [101] worked on data normalization and CNN-based classification of HSI.
The normalization was performed by dividing the pixel values
by the maximum pixel intensity value. Probabilistic PCA was used to extract spectral
features, and a Gabor filter helped in acquiring the spatial features. The spatial and spectral
information were integrated to form fused features used by the CNN. The experiment was
performed on the Indian Pines, Salinas Valley and Pavia University datasets, where the proposed
approach gave the highest accuracy compared with other state-of-the-art approaches. The running
time of the proposed approach needs to be improved.
Various deep neural network models were used by Jiao et al. [102] for HSI classification.
In the first approach, multi-scale spatial features were extracted using a convolution
network based on VGG-verydeep-16. It contained 13 convolutional layers, five pooling layers,
three fully connected layers, and activation and dropout layers. The deep multi-scale spatial
features were fused with spectral features using a weighted fusion method and z-score. It
was used to segment the scenes and obtained pixel-based classification results on the Indian
Pines dataset. In the second approach, Recursive Autoencoders (RAE) were employed, which formed
high-level spatial-spectral features from the original data. The network learned the local homogeneous
area of the image around the pixel under investigation. The spatial features of the pixel were
learned using a weighting scheme based on the neighbouring pixels, with the weights
determined by the spectral similarity between the investigated pixel and its neighbours.
The unsupervised RAE was employed on the Pavia University dataset, achieving an accuracy of
99.91%. The third approach involved Superpixel-based Multi Local CNN (SML-CNN). Superpixels
were formed using a linear iterative clustering algorithm. Multiple local regions of the
superpixels, namely the original, central and corner regions, were jointly represented. This gave a
different semantic environment for each superpixel even if there was spectral similarity, and
features were fused from these regions. The classification was improved using a multi-information
modification strategy, which eliminated errors by combining semantic (superpixel-level) and
detailed (pixel-level) information. The proposed algorithm achieved good accuracy.
Sharifi et al. [98] extracted complex spatial features using a multi-scale CNN in which
patches of different sizes were used. Spatial features had been shown to improve
classification performance. Hence, the authors included spatial features obtained from
Gabor filters, morphological operations and LBP. All these features were fused with PCA’s
spectral features at the decision level for classification. It achieved an OA of 97.98% and
99.44% with 1% and 5% training samples from each class, respectively.
Due to radiometric and atmospheric corrections, many informative bands can be
lost. In 2021, Singh and Kasana [103] performed spectral-spatial classification
by approximating the lost noisy bands. They used linear interpolation to obtain the approximated
bands. Further, they reduced the spectral dimension and obtained spatial features through a
combination of LPP and PCA. The features were classified using a deep network along with
SAE. The work achieved an OA of 88.9%, 93.3%, 91% and 91.5% on Indian Pines (IP), Salinas (Sa), Kennedy Space Centre (KSC) and Pavia University (PU), respectively.
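Approximating lost bands by linear interpolation can be sketched as follows: for each pixel, the values at the lost band indices are interpolated from the surviving bands of the same spectrum. The function name `interpolate_bands` and the toy cube are assumptions; the exact procedure of Singh and Kasana [103] may differ.

```python
import numpy as np

def interpolate_bands(cube, lost):
    """Replace the lost band indices of an (H, W, B) cube by per-pixel linear interpolation."""
    bands = np.arange(cube.shape[2])
    kept = np.setdiff1d(bands, lost)                 # indices of surviving bands
    flat = cube.reshape(-1, cube.shape[2])
    out = flat.copy()
    for i, spectrum in enumerate(flat):
        out[i, lost] = np.interp(lost, kept, spectrum[kept])
    return out.reshape(cube.shape)

# Toy cube whose spectra are linear in the band index, so interpolation is exact.
h, w, b = 2, 2, 10
cube = np.arange(b, dtype=float) * np.ones((h, w, 1)) + np.arange(h * w).reshape(h, w, 1)
lost = np.array([3, 4, 7])
damaged = cube.copy()
damaged[:, :, lost] = 0.0                            # simulate removed bands
restored = interpolate_bands(damaged, lost)
```

On real data the recovered bands are only approximations, with accuracy depending on how smoothly the spectrum varies around the lost bands.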
The recent DL classification techniques discussed above have been compared in Table 6.
Table 6. Cont.
2021 Singh and Kasana [103] Approximation of lost noisy bands, PCA and LPP-based spectral-spatial features, Deep network SAE. • Indian Pines: OA-99.02% and AA-99.17%. • Pavia University: OA-99.94% and AA-99.92%.
5. Discussion
After an extensive survey of spectral, spatial and spectral-spatial feature-based
classification of HSI, the following insights were observed.
• Mostly land cover HSI datasets have been covered in this work. Indian Pines and
Pavia University are the most commonly used datasets for classification, as depicted in
Figure 14. Figure 15 displays the highest and lowest OA achieved by different classification
techniques in the survey.
• In traditional ML, kernel-based techniques have been employed for land cover images.
Table 1 shows the greatest OA of 99.5%, obtained with shape-adaptable kernels. These
incorporated spectral and spatial features, which helped to increase performance.
The main disadvantage of mathematical kernels is their computational overhead.
• The SVM classifier, a kernel-based classifier, has been widely used for land cover
images. Its highest reported accuracy was 98.68%. The SVM classifier improves
classification results when combined with a spatial Gaussian filter.
• The transform-based techniques aid in the denoising and compression of HSI. Table 2
shows the highest OA of 99.0% with SVM on benchmark land cover images and
99.82% using Adaboost modelling to detect bruising in fruits.
• PCA has been commonly utilised as a data pre-processing step in traditional ML
approaches. It aided in the elimination of unnecessary spectral data.
• Many classification methods include dimension reduction techniques as pre-processing
steps. However, we have explicitly included a few different strategies, such as supervised,
unsupervised, feature selection and feature extraction, to emphasise their performance.
Table 5 shows that, for land cover images, bilateral filtering combined with spectral
similarity used in sparse representation classification had the
greatest OA of 99.76%.
• DL techniques have become dominant in HSI research. They have shown better
performance due to in-built feature processing and convolution kernels that deal with
complex HSI data. The resource-frugal networks for land cover images achieved the
highest OA of 99.89%, as evident in Table 7. However, data partitioning remains
a challenge for HSI: due to limited samples, training and testing data overlap and
exaggerated results are recorded.
Figure 15. The highest and lowest OA achieved by different classification techniques in the survey.
The purpose of this paper is to explore how well various classification techniques
performed for HSI analysis. Some authors employed either spectral or spatial data; however,
in recent papers, the emphasis has shifted to using both spectral and spatial data. In terms of
OA, Table 7 demonstrates significant differences between classic ML and DL approaches.
Although the OA of the two families is comparable in some cases, DL performs better owing to its automatic
feature learning and robustness in dealing with complex HSI.
• Even though rich spectral information is available, the low spatial resolution causes
irregularities and makes interpretation difficult.
• The vast number of continuous spectral channels also gives rise to redundant and less
informative bands.
• The available datasets have limited labelled training samples.
• With fewer samples and a huge number of spectral bands, the Hughes phenomenon
occurs in HSI: as the number of bands increases, the classification
performance increases initially but then decreases gradually.
• Target detection also remains one of HSI’s significant challenges, as the inherent
variability in target and background spectra poses a severe obstacle to developing
effective target detection algorithms for HSI. This may be due to
unknown backgrounds or a shortage of sufficient target data, which makes the problem more
challenging and calls for more sophisticated techniques.
Author Contributions: All the authors made significant contributions to this work. Conceptualiza-
tion, S.S.K. and G.K.; Writing—original draft preparation, R.G.; Writing—revision and editing, R.G.,
S.S.K. and G.K. All authors have read and agreed to the published version of the manuscript.
Funding: This research received no external funding.
Data Availability Statement: Publicly available datasets were analyzed in this study. These data can
be found at https://rslab.ut.ac.ir/data.
Conflicts of Interest: The authors declare that there is no conflict of interest.
References
1. Huete, A.R. Vegetation indices, remote sensing and forest monitoring. Geogr. Compass 2012, 6, 513–532.
2. Khan, M.J.; Khan, H.S.; Yousaf, A.; Khurshid, K.; Abbas, A. Modern trends in hyperspectral image analysis: A review. IEEE
Access 2018, 6, 14118–14129.
3. Leiva-Valenzuela, G.A.; Lu, R.; Aguilera, J.M. Prediction of firmness and soluble solids content of blueberries using hyperspectral
reflectance imaging. J. Food Eng. 2013, 115, 91–98.
4. Liu, Z.; Wang, H.; Li, Q. Tongue tumor detection in medical hyperspectral images. Sensors 2011, 12, 162–174.
5. Liu, L.; Wang, J.; Huang, W.; Zhao, C.; Zhang, B.; Tong, Q. Improving winter wheat yield prediction by novel spectral index.
Trans. CSAE 2004, 20, 172–175.
6. Kutser, T.; Paavel, B.; Verpoorter, C.; Kauer, T.; Vahtmäe, E. Remote sensing of water quality in optically complex lakes. ISPRS
Ann. Photogramm. Remote Sens. Spat. Inf. Sci. 2012, 39, B8.
7. Zhang, L.; Zhang, L.; Du, B. Deep learning for remote sensing data: A technical tutorial on the state of the art. IEEE Geosci.
Remote Sens. Mag. 2016, 4, 22–40.
8. Gogineni, R.; Chaturvedi, A. Hyperspectral image classification. In Processing and Analysis of Hyperspectral Data; IntechOpen:
London, UK, 2019.
9. Gu, Y.; Chanussot, J.; Jia, X.; Benediktsson, J.A. Multiple kernel learning for hyperspectral image classification: A review. IEEE
Trans. Geosci. Remote Sens. 2017, 55, 6547–6565.
10. Rani, A.; Kumar, N.; Kumar, J.; Sinha, N.K. Machine learning for soil moisture assessment. In Deep Learning for Sustainable
Agriculture; Elsevier: Amsterdam, The Netherlands, 2022; pp. 143–168.
11. Lakshmi, T.V.H.; Madhu, T. Satellite Image Resolution Enhancement Using Discrete Wavelet Transform and Gaussian Mixture
Model. Int. Res. J. Eng. Technol. IRJET 2015, 2, 95–100.
12. Maduranga, U. Dimensionality Reduction in Data Mining. 2020. Available online:
https://towardsdatascience.com/dimensionality-reduction-in-data-mining-f08c734b3001 (accessed on 25 December 2022).
13. Gu, Y.; Liu, H. Sample-screening MKL method via boosting strategy for hyperspectral image classification. Neurocomputing 2016,
173, 1630–1639.
14. Fang, L.; He, N.; Li, S.; Ghamisi, P.; Benediktsson, J.A. Extinction profiles fusion for hyperspectral images classification. IEEE
Trans. Geosci. Remote Sens. 2017, 56, 1803–1815.
15. Li, L.; Wang, C.; Li, W.; Chen, J. Hyperspectral image classification by AdaBoost weighted composite kernel extreme learning
machines. Neurocomputing 2018, 275, 1725–1733.
16. Li, F.; Lu, H.; Zhang, P. An innovative multi-kernel learning algorithm for hyperspectral classification. Comput. Electr. Eng. 2019,
79, 106456.
17. Li, D.; Wang, Q.; Kong, F. Adaptive Kernel Sparse Representation Based on Multiple Feature Learning for Hyperspectral Image
Classification. Neurocomputing 2020, 400, 97–112.
18. Gao, Y.; Cheng, T.; Wang, B. Nonlinear Anomaly Detection Based on Spectral-Spatial Composite Kernel for Hyperspectral Images.
IEEE Geosci. Remote. Sens. Lett. 2020, 18, 1269–1273.
19. Ma, K.Y.; Chang, C.I. Kernel-based constrained energy minimization for hyperspectral mixed pixel classification. IEEE Trans.
Geosci. Remote Sens. 2021, 60, 1–23.
20. Wang, Y.; Yu, W.; Fang, Z. Multiple kernel-based SVM classification of hyperspectral images by combining spectral, spatial, and
semantic information. Remote Sens. 2020, 12, 120.
21. Ansari, M.; Homayouni, S.; Safari, A.; Niazmardi, S. A New Convolutional Kernel Classifier for Hyperspectral Image
Classification. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2021, 14, 11240–11256.
22. Krishna, S.L.; Jeya, I.; Deepa, S. Fuzzy-twin proximal SVM kernel-based deep learning neural network model for hyperspectral
image classification. Neural Comput. Appl. 2022, 34, 19343–19376.
23. Wang, A.; Xing, S.; Zhao, Y.; Wu, H.; Iwahori, Y. A hyperspectral image classification method based on adaptive spectral spatial
kernel combined with improved vision transformer. Remote Sens. 2022, 14, 3705.
24. Dalla Mura, M.; Villa, A.; Benediktsson, J.A.; Chanussot, J.; Bruzzone, L. Classification of hyperspectral images by using extended
morphological attribute profiles and independent component analysis. IEEE Geosci. Remote Sens. Lett. 2010, 8, 542–546.
25. Licciardi, G.; Marpu, P.R.; Chanussot, J.; Benediktsson, J.A. Linear versus nonlinear PCA for the classification of hyperspectral
data based on the extended morphological profiles. IEEE Geosci. Remote Sens. Lett. 2011, 9, 447–451.
26. Dópido, I.; Li, J.; Marpu, P.R.; Plaza, A.; Dias, J.M.B.; Benediktsson, J.A. Semisupervised self-learning for hyperspectral image
classification. IEEE Trans. Geosci. Remote Sens. 2013, 51, 4032–4044.
27. Zhong, S.; Chang, C.I.; Zhang, Y. Iterative support vector machine for hyperspectral image classification. In Proceedings of the
2018 25th IEEE International Conference on Image Processing (ICIP), Athens, Greece, 7–10 October 2018; pp. 3309–3312.
28. Pathak, D.K.; Kalita, S.K.; Bhattacharya, D.K. Hyperspectral image classification using support vector machine: A spectral spatial
feature based approach. Evol. Intell. 2022, 15, 1809–1823.
29. Li, R.; Cui, K.; Chan, R.H.; Plemmons, R.J. Classification of hyperspectral images using SVM with shape-adaptive reconstruction
and smoothed total variation. arXiv 2022, arXiv:2203.15619.
30. Akbari, H.; Kosugi, Y.; Kojima, K.; Tanaka, N. Wavelet-based compression and segmentation of hyperspectral images in surgery.
In Proceedings of the International Workshop on Medical Imaging and Virtual Reality; Springer: Berlin/Heidelberg, Germany, 2008;
pp. 142–149.
31. Chen, C.; Guo, B.; Wu, X.; Shen, H. An edge detection method for hyperspectral image classification based on mean shift.
In Proceedings of the 2014 7th International Congress on Image and Signal Processing, Dalian, China, 14–16 October 2014;
pp. 553–557.
32. Quesada-Barriuso, P.; Argüello, F.; Heras, D.B. Spectral–spatial classification of hyperspectral images using wavelets and extended
morphological profiles. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2014, 7, 1177–1185.
33. Prabhakar, T.N.; Geetha, P. Two-dimensional empirical wavelet transform based supervised hyperspectral image classification.
ISPRS J. Photogramm. Remote Sens. 2017, 133, 37–45.
34. Ji, Y.; Sun, L.; Li, Y.; Ye, D. Detection of bruised potatoes using hyperspectral imaging technique based on discrete wavelet
transform. Infrared Phys. Technol. 2019, 103, 103054.
35. Anand, R.; Veni, S.; Aravinth, J. Robust classification technique for hyperspectral images based on 3D-discrete wavelet transform.
Remote Sens. 2021, 13, 1255.
36. Xu, J.; Zhao, J.; Liu, C. An Effective Hyperspectral Image Classification Approach Based on Discrete Wavelet Transform and
Dense CNN. IEEE Geosci. Remote Sens. Lett. 2022, 19, 1–5.
37. Miclea, A.V.; Terebes, R.M.; Meza, S.; Cislariu, M. On Spectral-Spatial Classification of Hyperspectral Images Using Image
Denoising and Enhancement Techniques, Wavelet Transforms and Controlled Data Set Partitioning. Remote Sens. 2022, 14, 1475.
38. Ji, Y.; Sun, L.; Li, Y.; Li, J.; Liu, S.; Xie, X.; Xu, Y. Non-destructive classification of defective potatoes based on hyperspectral
imaging and support vector machine. Infrared Phys. Technol. 2019, 99, 71–79.
Electronics 2023, 1, 0 33 of 35
39. Cao, X.; Yao, J.; Fu, X.; Bi, H.; Hong, D. An enhanced 3-D discrete wavelet transform for hyperspectral image classification. IEEE
Geosci. Remote Sens. Lett. 2020, 18, 1104–1108.
40. Zikiou, N.; Lahdir, M.; Helbert, D. Hyperspectral image classification using graph-based wavelet transform. Int. J. Remote Sens.
2020, 41, 2624–2643.
41. Manoharan, P.; Boggavarapu, P.K.L. Improved whale optimization based band selection for hyperspectral remote sensing image
classification. Infrared Phys. Technol. 2021, 119, 103948.
42. Tulapurkar, H.; Banerjee, B.; Buddhiraju, K.M. Multi-head attention with CNN and wavelet for classification of hyperspectral
image. Neural Comput. Appl. 2022, 1–15. doi: 10.1007/s00521-022-08056-w.
43. Villa, A.; Benediktsson, J.A.; Chanussot, J.; Jutten, C. Hyperspectral image classification with independent component discriminant
analysis. IEEE Trans. Geosci. Remote Sens. 2011, 49, 4865–4876.
44. Santos, A.; Pedrini, H. A combination of k-means clustering and entropy filtering for band selection and classification in
hyperspectral images. Int. J. Remote Sens. 2016, 37, 3005–3020.
45. Schclar, A.; Averbuch, A. A diffusion approach to unsupervised segmentation of hyper-spectral images. In Proceedings of the
International Joint Conference on Computational Intelligence; Springer: Cham, Switzerland, 2017; pp. 163–178.
46. Jain, D.K.; Dubey, S.B.; Choubey, R.K.; Sinhal, A.; Arjaria, S.K.; Jain, A.; Wang, H. An approach for hyperspectral image
classification by optimizing SVM using self organizing map. J. Comput. Sci. 2018, 25, 252–259.
47. Ahmad, M.; Alqarni, M.A.; Khan, A.M.; Hussain, R.; Mazzara, M.; Distefano, S. Segmented and non-segmented stacked denoising
autoencoder for hyperspectral band reduction. Optik 2019, 180, 370–378.
48. Romaszewski, M.; Głomb, P.; Cholewa, M. Semi-supervised hyperspectral classification from a small number of training samples
using a co-training approach. ISPRS J. Photogramm. Remote Sens. 2016, 121, 60–76.
49. Li, L.; Sun, C.; Lin, L.; Li, J.; Jiang, S. A dual-layer supervised Mahalanobis kernel for the classification of hyperspectral images.
Neurocomputing 2016, 214, 430–444.
50. Nhaila, H.; Elmaizi, A.; Sarhrouni, E.; Hammouch, A. Supervised classification methods applied to airborne hyperspectral
images: Comparative study using mutual information. Procedia Comput. Sci. 2019, 148, 97–106.
51. Ren, J.; Wang, R.; Liu, G.; Feng, R.; Wang, Y.; Wu, W. Partitioned relief-F method for dimensionality reduction of hyperspectral
images. Remote Sens. 2020, 12, 1104.
52. Liu, H.; Li, W.; Xia, X.G.; Zhang, M.; Tao, R. Superpixelwise Collaborative-Representation Graph Embedding for Unsupervised
Dimension Reduction in Hyperspectral Imagery. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2021, 14, 4684–4698.
53. Ding, S.; Keal, C.A.; Zhao, L.; Yu, D. Dimensionality reduction and classification for hyperspectral image based on robust
supervised ISOMAP. J. Ind. Prod. Eng. 2022, 39, 19–29.
54. Qi, C.; Wang, Y.; Tian, W.; Wang, Q. Multiple kernel boosting framework based on information measure for classification. Chaos
Solitons Fractals 2016, 89, 175–186.
55. Yang, R.; Su, L.; Zhao, X.; Wan, H.; Sun, J. Representative band selection for hyperspectral image classification. J. Vis. Commun.
Image Represent. 2017, 48, 396–403.
56. Medjahed, S.A.; Ouali, M. Band selection based on optimization approach for hyperspectral image classification. Egypt. J. Remote
Sens. Space Sci. 2018, 21, 413–418.
57. Xie, F.; Li, F.; Lei, C.; Yang, J.; Zhang, Y. Unsupervised band selection based on artificial bee colony algorithm for hyperspectral
image classification. Appl. Soft Comput. 2019, 75, 428–440.
58. Sellami, A.; Farah, M.; Farah, I.R.; Solaiman, B. Hyperspectral imagery classification based on semi-supervised 3-D deep neural
network and adaptive band selection. Expert Syst. Appl. 2019, 129, 246–259.
59. Elmaizi, A.; Nhaila, H.; Sarhrouni, E.; Hammouch, A.; Nacir, C. A novel information gain based approach for classification and
dimensionality reduction of hyperspectral images. Procedia Comput. Sci. 2019, 148, 126–134.
60. Sawant, S.; Manoharan, P. Hyperspectral band selection based on metaheuristic optimization approach. Infrared Phys. Technol.
2020, 107, 103295.
61. Zhu, Q.; Wang, Y.; Wang, F.; Song, M.; Chang, C.I. Hyperspectral band selection based on improved affinity propagation.
In Proceedings of the 2021 11th Workshop on Hyperspectral Imaging and Signal Processing: Evolution in Remote Sensing
(WHISPERS), Amsterdam, The Netherlands, 24–26 March 2021; pp. 1–4.
62. Uddin, M.P.; Mamun, M.A.; Afjal, M.I.; Hossain, M.A. Information-theoretic feature selection with segmentation-based folded
principal component analysis (PCA) for hyperspectral image classification. Int. J. Remote Sens. 2021, 42, 286–321.
63. Zhang, J. A hybrid clustering method with a filter feature selection for hyperspectral image classification. J. Imaging 2022, 8, 180.
64. Imani, M.; Ghassemian, H. Binary coding based feature extraction in remote sensing high dimensional data. Inf. Sci. 2016,
342, 191–208.
65. Qi, C.; Zhou, Z.; Sun, Y.; Song, H.; Hu, L.; Wang, Q. Feature selection and multiple kernel boosting framework based on PSO with
mutation mechanism for hyperspectral classification. Neurocomputing 2017, 220, 181–190.
66. Ksieniewicz, P.; Krawczyk, B.; Woźniak, M. Ensemble of Extreme Learning Machines with trained classifier combination and
statistical features for hyperspectral data. Neurocomputing 2018, 271, 28–37.
67. Qiao, T.; Yang, Z.; Ren, J.; Yuen, P.; Zhao, H.; Sun, G.; Marshall, S.; Benediktsson, J.A. Joint bilateral filtering and spectral similarity-
based sparse representation: A generic framework for effective feature extraction and data classification in hyperspectral imaging.
Pattern Recognit. 2018, 77, 316–328.
68. Paul, S.; Kumar, D.N. Spectral-spatial classification of hyperspectral data with mutual information based segmented stacked
autoencoder approach. ISPRS J. Photogramm. Remote Sens. 2018, 138, 265–280.
69. Chen, Z.; Jiang, J.; Zhou, C.; Fu, S.; Cai, Z. SuperBF: Superpixel-based bilateral filtering algorithm and its application in feature
extraction of hyperspectral images. IEEE Access 2019, 7, 147796–147807.
70. Li, Q.; Zheng, B.; Tu, B.; Wang, J.; Zhou, C. Ensemble EMD-based spectral-spatial feature extraction for hyperspectral image
classification. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2020, 13, 5134–5148.
71. Wang, D.; Du, B.; Zhang, L.; Xu, Y. Adaptive spectral–spatial multiscale contextual feature extraction for hyperspectral image
classification. IEEE Trans. Geosci. Remote Sens. 2020, 59, 2461–2477.
72. Liang, N.; Duan, P.; Xu, H.; Cui, L. Multi-View Structural Feature Extraction for Hyperspectral Image Classification. Remote Sens.
2022, 14, 1971.
73. Ratle, F.; Camps-Valls, G.; Weston, J. Semisupervised neural networks for efficient hyperspectral image classification. IEEE Trans.
Geosci. Remote Sens. 2010, 48, 2271–2282.
74. Lin, Z.; Chen, Y.; Zhao, X.; Wang, G. Spectral-spatial classification of hyperspectral image using autoencoders. In Proceedings of
the 2013 9th International Conference on Information, Communications & Signal Processing, Taiwan, China, 10–13 December
2013; pp. 1–5.
75. Yue, J.; Zhao, W.; Mao, S.; Liu, H. Spectral–spatial classification of hyperspectral images using deep convolutional neural
networks. Remote Sens. Lett. 2015, 6, 468–477.
76. Hu, W.; Huang, Y.; Wei, L.; Zhang, F.; Li, H. Deep convolutional neural networks for hyperspectral image classification. J. Sens.
2015, 2015, 258619.
77. Chan, T.H.; Jia, K.; Gao, S.; Lu, J.; Zeng, Z.; Ma, Y. PCANet: A simple deep learning baseline for image classification? IEEE Trans.
Image Process. 2015, 24, 5017–5032.
78. Liu, P.; Zhang, H.; Eom, K.B. Active deep learning for classification of hyperspectral images. IEEE J. Sel. Top. Appl. Earth Obs.
Remote Sens. 2016, 10, 712–724.
79. Chen, Y.; Jiang, H.; Li, C.; Jia, X.; Ghamisi, P. Deep feature extraction and classification of hyperspectral images based on
convolutional neural networks. IEEE Trans. Geosci. Remote Sens. 2016, 54, 6232–6251.
80. Zabalza, J.; Ren, J.; Zheng, J.; Zhao, H.; Qing, C.; Yang, Z.; Du, P.; Marshall, S. Novel segmented stacked autoencoder for effective
dimensionality reduction and feature extraction in hyperspectral imaging. Neurocomputing 2016, 185, 1–10.
81. Yu, S.; Jia, S.; Xu, C. Convolutional neural networks for hyperspectral image classification. Neurocomputing 2017, 219, 88–98.
82. Chen, Y.; Zhu, L.; Ghamisi, P.; Jia, X.; Li, G.; Tang, L. Hyperspectral images classification with Gabor filtering and convolutional
neural network. IEEE Geosci. Remote Sens. Lett. 2017, 14, 2355–2359.
83. Li, Y.; Xie, W.; Li, H. Hyperspectral image reconstruction by deep convolutional neural network for classification. Pattern Recognit.
2017, 63, 371–383.
84. Mou, L.; Ghamisi, P.; Zhu, X.X. Deep recurrent neural networks for hyperspectral image classification. IEEE Trans. Geosci. Remote
Sens. 2017, 55, 3639–3655.
85. Zhang, M.; Li, W.; Du, Q. Diverse region-based CNN for hyperspectral image classification. IEEE Trans. Image Process. 2018,
27, 2623–2634.
86. Deng, C.; Xue, Y.; Liu, X.; Li, C.; Tao, D. Active transfer learning network: A unified deep joint spectral–spatial feature learning
model for hyperspectral image classification. IEEE Trans. Geosci. Remote Sens. 2018, 57, 1741–1754.
87. Liang, M.; Jiao, L.; Yang, S.; Liu, F.; Hou, B.; Chen, H. Deep multiscale spectral-spatial feature fusion for hyperspectral images
classification. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2018, 11, 2911–2924.
88. Wang, W.; Dou, S.; Jiang, Z.; Sun, L. A fast dense spectral–spatial convolution network framework for hyperspectral images
classification. Remote Sens. 2018, 10, 1068.
89. Yang, X.; Ye, Y.; Li, X.; Lau, R.Y.; Zhang, X.; Huang, X. Hyperspectral image classification with deep learning models. IEEE Trans.
Geosci. Remote Sens. 2018, 56, 5408–5423.
90. Pan, B.; Shi, Z.; Xu, X. MugNet: Deep learning for hyperspectral image classification using limited samples. ISPRS J. Photogramm.
Remote Sens. 2018, 145, 108–119.
91. Paoletti, M.; Haut, J.; Plaza, J.; Plaza, A. A new deep convolutional neural network for fast hyperspectral image classification.
ISPRS J. Photogramm. Remote Sens. 2018, 145, 120–147.
92. Chen, C.; Jiang, F.; Yang, C.; Rho, S.; Shen, W.; Liu, S.; Liu, Z. Hyperspectral classification based on spectral–spatial convolutional
neural networks. Eng. Appl. Artif. Intell. 2018, 68, 165–171.
93. Singh, S.; Kasana, S.S. Efficient classification of the hyperspectral images using deep learning. Multimed. Tools Appl. 2018,
77, 27061–27074.
94. Zhou, F.; Hang, R.; Liu, Q.; Yuan, X. Hyperspectral image classification using spectral-spatial LSTMs. Neurocomputing 2019,
328, 39–47.
95. Fang, B.; Li, Y.; Zhang, H.; Chan, J.C.W. Hyperspectral images classification based on dense convolutional networks with
spectral-wise attention mechanism. Remote Sens. 2019, 11, 159.
96. Liu, X.; Hu, Q.; Cai, Y.; Cai, Z. Extreme learning machine-based ensemble transfer learning for hyperspectral image classification.
IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2020, 13, 3892–3902.
97. Ramamurthy, M.; Robinson, Y.H.; Vimal, S.; Suresh, A. Auto encoder based dimensionality reduction and classification using
convolutional neural networks for hyperspectral images. Microprocess. Microsyst. 2020, 79, 103280.
98. Sharifi, O.; Mokhtarzade, M.; Beirami, B.A. A Deep Convolutional Neural Network based on Local Binary Patterns of Gabor
Features for Classification of Hyperspectral Images. In Proceedings of the 2020 International Conference on Machine Vision and
Image Processing (MVIP), Qom, Iran, 18–20 February 2020; pp. 1–5.
99. Cao, F.; Guo, W. Deep hybrid dilated residual networks for hyperspectral image classification. Neurocomputing 2020, 384, 170–181.
100. Nalepa, J.; Antoniak, M.; Myller, M.; Lorenzo, P.R.; Marcinkiewicz, M. Towards resource-frugal deep convolutional neural
networks for hyperspectral image segmentation. Microprocess. Microsyst. 2020, 73, 102994.
101. Vaddi, R.; Manoharan, P. Hyperspectral image classification using CNN with spectral and spatial features integration. Infrared
Phys. Technol. 2020, 107, 103296.
102. Jiao, L.; Shang, R.; Liu, F.; Zhang, W. Brain and Nature-Inspired Learning, Computation and Recognition; Elsevier: Amsterdam, The
Netherlands, 2020.
103. Singh, S.; Kasana, S.S. A Pre-processing framework for spectral classification of hyperspectral images. Multimed. Tools Appl. 2021,
80, 243–261.
104. Li, L.; Ge, H.; Gao, J. A spectral-spatial kernel-based method for hyperspectral imagery classification. Adv. Space Res. 2017,
59, 954–967.
105. Manifold, B.; Men, S.; Hu, R.; Fu, D. A versatile deep learning architecture for classification and label-free prediction of
hyperspectral images. Nat. Mach. Intell. 2021, 3, 306–315.
106. Xue, Z.; Yu, X.; Liu, B.; Tan, X.; Wei, X. HResNetAM: Hierarchical residual network with attention mechanism for hyperspectral
image classification. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2021, 14, 3566–3580.
107. Sellami, A.; Tabbone, S. Deep neural networks-based relevant latent representation learning for hyperspectral image classification.
Pattern Recognit. 2022, 121, 108224.
108. Zhan, Y.; Wu, K.; Dong, Y. Enhanced Spectral–Spatial Residual Attention Network for Hyperspectral Image Classification. IEEE J.
Sel. Top. Appl. Earth Obs. Remote Sens. 2022, 15, 7171–7186.
109. Sharifi, O.; Mokhtarzadeh, M.; Asghari Beirami, B. A new deep learning approach for classification of hyperspectral images:
Feature and decision level fusion of spectral and spatial features in multiscale CNN. Geocarto Int. 2021, 37, 1–26.