
Table of contents

Topic: Applications of deep learning in GIS and Remote Sensing

 Artificial intelligence
 Machine learning
 Deep learning
 Artificial intelligence Vs Machine learning Vs Deep learning
 Evolution of deep learning
 How deep learning works?
 Deep learning Neural networks
 Application of deep learning in GIS and RS
 The integration of GIS and DL
 Application of integration of DL with GIS
 Image classification
 Object detection
 Semantic segmentation
 Instance segmentation
 Deep learning for mapping
 Application of deep learning in remote sensing
 Importance of Deep Learning
 Challenges of deep learning in GIS and Remote sensing
 Main challenges of remote sensing image scene classification
 Gaps and future trends
 Conclusions
 References

Artificial Intelligence

“Artificial intelligence (AI), also known as machine intelligence, is a branch of computer science that focuses on building and managing technology that can learn to autonomously make decisions and carry out actions on behalf of a human being.”

Artificial intelligence

The ideal characteristic of artificial intelligence is its ability to rationalize and take
actions that have the best chance of achieving a specific goal. A subset of artificial intelligence is
machine learning (ML), which refers to the concept that computer programs can
automatically learn from and adapt to new data without being assisted by humans. Deep
learning techniques enable this automatic learning through the absorption of huge amounts of
unstructured data such as text, images, or video.

The term AI was first introduced in 1955 by John McCarthy, a computer scientist and
professor at Stanford University.
 AI is not a single technology. Instead, it is an umbrella term that includes any type of
software or hardware component that supports machine learning (ML), computer
vision (CV), natural language understanding (NLU), natural language generation, natural
language processing (NLP) and robotics.
 Artificial intelligence (AI) refers to the simulation or approximation of human
intelligence in machines.
 AI is being used today across different industries from finance to healthcare.
 Weak AI tends to be simple and single-task oriented, while strong AI carries out tasks
that are more complex and human-like.
 Some critics fear that the extensive use of advanced AI can have a negative effect on
society.

Machine Learning

“Machine learning is an application of artificial intelligence that enables systems to learn and improve from experience without being explicitly programmed. Machine learning focuses on developing computer programs that can access data and use it to learn for themselves.”
 Machine learning is a subset of artificial intelligence, concerned mainly with
developing algorithms that allow a computer to learn from data and past
experience on its own.
 The term machine learning was first introduced by Arthur Samuel in 1959.
 ML applications are fed with new data, and they can independently learn, grow,
develop, and adapt.
 The performance of ML algorithms improves adaptively as the number of available
samples grows during the ‘learning’ process. For example, deep learning is a
sub-domain of machine learning that trains computers to imitate natural human
traits like learning from examples, and it often outperforms
conventional ML algorithms.

Working of machine learning

Today, with the rise of big data, IoT, and ubiquitous computing, machine learning has become
essential for solving problems across numerous areas, such as

 Computational finance (credit scoring, algorithmic trading)


 Computer vision (facial recognition, motion tracking, object detection)
 Computational biology (DNA sequencing, brain tumor detection, drug discovery)
 Automotive, aerospace, and manufacturing (predictive maintenance)
 Natural language processing (voice recognition)
Deep Learning

“A type of machine learning based on artificial neural networks in which multiple layers of processing are used to extract progressively higher-level features from data.” (From Oxford Languages)

“Deep learning is a subfield of ML that uses algorithms called artificial neural networks (ANNs), which are inspired by the structure and function of the brain and are capable of self-learning. ANNs are trained to “learn” models and patterns rather than being explicitly told how to solve a problem.”

Explanation:
 Deep learning is a type of machine learning and artificial intelligence (AI) that imitates
the way humans gain certain types of knowledge. Deep learning is an important
element of data science, which includes statistics and predictive modeling.
 It is extremely beneficial to data scientists who are tasked with collecting, analyzing and
interpreting large amounts of data; deep learning makes this process faster and easier.
 Deep learning is computationally demanding. The required GPU and CPU resources
depend on the number and size of the network’s layers; without adequate hardware,
training can take days, months, or even longer.

To understand deep learning, imagine a toddler whose first word is dog. The
toddler learns what a dog is -- and is not -- by pointing to objects and saying the word dog. The
parent says, "Yes, that is a dog," or, "No, that is not a dog." As the toddler continues to point to
objects, he becomes more aware of the features that all dogs possess. What the toddler does,
without knowing it, is clarify a complex abstraction -- the concept of dog -- by building a
hierarchy in which each level of abstraction is created with knowledge that was gained from the
preceding layer of the hierarchy.
Artificial Intelligence vs Machine Learning vs Deep Learning
 Broadly speaking, AI is the ability of computers to perform a task that typically requires
some level of human intelligence.
 Machine learning is one type of engine that makes this possible. It uses data-driven
algorithms to learn from data to give you the answers that you need.
 One type of machine learning that has emerged recently is deep learning. Deep
learning uses computer-generated neural networks, which are inspired by and loosely
resemble the human brain, to solve problems and make predictions.

Evolution of Deep Learning

 Most of us think of deep learning as a 21st-century invention, but believe it or not, it
has been around since the 1940s.
 The term deep learning (DL) rose to prominence in 2006 as a new field of
research within machine learning. Over the years, deep learning has evolved, causing
massive disruption across industries and business domains.

History of Deep Learning Over the Years


 The history of deep learning can be traced back to 1943, when Walter Pitts and
Warren McCulloch created a computer model based on the neural networks of the
human brain.
They used a combination of algorithms and mathematics they called
“threshold logic” to mimic the thought process. Since that time, Deep Learning has
evolved steadily, with only two significant breaks in its development. Both were tied to
the infamous Artificial Intelligence winters.
 1958: Frank Rosenblatt creates the perceptron, an algorithm for pattern recognition
based on a two-layer computer neural network using simple addition and subtraction.
He also proposed additional layers with mathematical notations, but these wouldn’t
be realized until 1975.
 In 1962, Stuart Dreyfus developed a simpler version of backpropagation based only on the chain rule.
Developing Deep Learning Algorithms
 The earliest efforts in developing deep learning algorithms date to 1965, when Alexey
Grigoryevich Ivakhnenko and Valentin Grigorʹevich Lapa used models with polynomial
(complicated equations) activation functions, which were subsequently analysed
statistically.
 During the 1970s, AI development suffered a brief setback: a lack of funding
limited both deep learning and artificial intelligence research. However, individuals
carried on the research without funding through those difficult years.
 1980: Kunihiko Fukushima proposes the Neocognitron, a hierarchical, multilayered
artificial neural network that has been used for handwriting recognition and other
pattern recognition problems.
 1989: Scientists were able to create algorithms that used deep neural networks, but
training times for the systems were measured in days, making them impractical for real-
world use.
 1992: Juyang Weng publishes Cresceptron, a method for performing 3-D object
recognition automatically from cluttered scenes.

Deep Learning from the 2000s and Beyond


 Mid-2000s: The term “deep learning” begins to gain popularity after a paper by
Geoffrey Hinton and Ruslan Salakhutdinov showed how a many-layered neural
network could be pre-trained one layer at a time.
 In 2001, a research report by META Group (now Gartner) described the
challenges and opportunities of data growth as three-dimensional.
 In 2009, the NIPS Workshop on Deep Learning for Speech Recognition found that with
a large enough data set, neural networks don’t need pre-training, and error
rates drop significantly.
 By 2011, the speed of GPUs had increased significantly, making it possible to train
convolutional neural networks “without” the layer-by-layer pre-training.
 In 2012, Google Brain released the results of an unusual free-spirited project called
the Cat Experiment which explored the difficulties of unsupervised learning.
 In 2014, Google buys the UK artificial intelligence startup DeepMind for £400m.
 In 2015, Facebook puts deep learning technology called DeepFace into operation
to automatically tag and identify Facebook users in photographs. The algorithm
performs superior face recognition using deep networks that take into account 120
million parameters.
 In 2016, Google DeepMind’s algorithm AlphaGo masters the complex
board game Go and beats professional Go player Lee Sedol at a highly publicized
tournament in Seoul.

The promise of deep learning is not that computers will start to think like
humans. That’s a bit like asking an apple to become an orange. Rather, it demonstrates that
given a large enough data set, fast enough processors, and a sophisticated enough algorithm,
computers can begin to accomplish tasks that used to be completely left in the realm of human
perception — like recognizing cat videos on the web (and other, perhaps more useful
purposes).

How Deep Learning Works

DL is a learning algorithm based on neural networks (Schmidhuber, 2015). A neural network comprises neurons or units with certain activations and parameters (Litjens et al., 2017). DL models (networks) are composed of many layers that transform input data (e.g., images) into outputs (e.g., categories) while learning progressively higher-level features (Litjens et al., 2017). Layers between the input and output are often referred to as “hidden” layers. A neural network containing multiple hidden layers is typically considered a “deep” neural network, hence the term “deep learning”.

Deep learning models are trained by using large sets of labeled data and neural
network architectures that learn features directly from the data without the need for manual
feature extraction.
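The layered transformation described above can be sketched with plain NumPy. This is an illustrative toy, not a trained model: the layer sizes are arbitrary and the weights are random, where a real network would learn them from labeled data.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)          # nonlinearity used in hidden layers

def softmax(z):
    e = np.exp(z - z.max())            # numerically stabilized exponentials
    return e / e.sum()                 # probabilities over output classes

def forward(x, layers):
    """Transform an input through successive layers, as a deep network does."""
    for W, b in layers[:-1]:
        x = relu(W @ x + b)            # each hidden layer: weighted sum + ReLU
    W, b = layers[-1]
    return softmax(W @ x + b)          # output layer: class probabilities

rng = np.random.default_rng(0)
sizes = [4, 8, 8, 3]                   # 4 inputs -> two hidden layers -> 3 classes
layers = [(rng.normal(size=(m, n)), np.zeros(m)) for n, m in zip(sizes, sizes[1:])]
probs = forward(rng.normal(size=4), layers)
```

Training would adjust the weight matrices so that the output probabilities match the labels in the training data.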

Figure: Neural networks are organized in layers consisting of sets of interconnected nodes. Networks can have tens or hundreds of hidden layers.

Deep Learning Neural Networks

A DL workflow requires vast amounts of data and sophisticated computational resources such as algorithms, graphics processing units (GPUs), and network architectures. Training (learning) and inference (prediction) are the two stages of a neural network, corresponding to development and production respectively. The number of layers and the kind of neural network are chosen by the developer, while the weights are determined through training. There are several types of DL algorithms available today.
Neural networks

Autoencoders
Autoencoders are neural networks made up of input, hidden, and output layers. An autoencoder can learn different coding patterns. In an autoencoder, the number of nodes is the same in the input and output layers, and the network is trained to reproduce its inputs at the output layer rather than to predict separate target values.

Basic structure of an autoencoder
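The idea, as a minimal NumPy sketch (untrained, with arbitrary dimensions): the input is compressed into a small code and then decoded back to a reconstruction of the same size as the input; training would minimize the reconstruction error.

```python
import numpy as np

rng = np.random.default_rng(1)
n_in, n_code = 6, 2                          # bottleneck smaller than the input
W_enc = rng.normal(scale=0.1, size=(n_code, n_in))
W_dec = rng.normal(scale=0.1, size=(n_in, n_code))

def encode(x):
    return np.tanh(W_enc @ x)                # compressed hidden representation

def decode(code):
    return W_dec @ code                      # reconstruction, same size as input

x = rng.normal(size=n_in)
x_hat = decode(encode(x))
recon_error = np.mean((x - x_hat) ** 2)      # training would minimize this
```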

Multi-layer Perceptrons (MLP)


An MLP is a feed-forward artificial neural network. The multilayer perceptron is one of the most fundamental DL models, consisting of a sequence of fully connected layers. Each successive layer is a collection of nonlinear functions of the weighted sum of all outputs from the previous one. These networks are widely used in speech recognition and other machine learning applications.

Multilayer perceptrons

Convolutional Neural Networks (CNN)


A convolutional neural network (CNN) is a feed-forward model related to the multilayer perceptron. The initial layers of a deep network identify low-level characteristics, while subsequent layers reassemble them into higher-level features. CNNs excel at visual recognition and at identifying varied visual patterns. Aside from image processing, CNNs have been successfully applied to video recognition and a variety of natural language processing tasks.
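The core operation, sliding a small filter over an image to produce a feature map, can be sketched in NumPy. This is a toy "valid" cross-correlation with a hypothetical vertical-edge filter, not a full CNN:

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D cross-correlation: slide the kernel over the image,
    taking a weighted sum at each position."""
    kh, kw = kernel.shape
    H, W = image.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

# A 5x5 image with a vertical edge, and a filter that responds to
# left-to-right brightness increases
image = np.zeros((5, 5))
image[:, 2:] = 1.0                         # dark left half, bright right half
kernel = np.array([[-1.0, 0.0, 1.0]] * 3)  # simple vertical-edge detector
fmap = conv2d(image, kernel)               # strong response at the edge
```

In a real CNN the kernel values are learned during training rather than hand-crafted.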
Recurrent Neural Networks (RNN)
A convolutional model takes a fixed-size input and produces a fixed-size output after a predetermined number of steps. Recurrent networks instead operate on sequences of vectors at both the input and the output. Unlike a typical neural network, a recurrent network may include connections that loop back into previous layers (or into the same layer). This feedback enables RNNs to retain a memory of previous inputs and to model problems that evolve over time. Long Short-Term Memory (LSTM) is the most widely known RNN variant. This type of network is used to build chatbots and text-to-speech systems.
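The feedback loop can be sketched as a single recurrence step applied along a sequence (a plain untrained RNN cell, with arbitrary sizes and random weights, not an LSTM):

```python
import numpy as np

rng = np.random.default_rng(2)
n_in, n_hidden = 3, 4
W_xh = rng.normal(scale=0.1, size=(n_hidden, n_in))      # input -> hidden
W_hh = rng.normal(scale=0.1, size=(n_hidden, n_hidden))  # the feedback loop
b = np.zeros(n_hidden)

def rnn_step(x_t, h_prev):
    """New hidden state mixes the current input with the previous state."""
    return np.tanh(W_xh @ x_t + W_hh @ h_prev + b)

h = np.zeros(n_hidden)                  # initial memory is empty
sequence = rng.normal(size=(5, n_in))   # 5 time steps of input
for x_t in sequence:
    h = rnn_step(x_t, h)                # h carries memory of earlier inputs
```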

Deep Reinforcement Learning

This approach adopts a very different technique. It places an agent in an environment with specific boundaries, defined productive and undesirable actions, and an overarching goal to achieve. In some respects it is comparable to supervised learning, in which programmers must provide algorithms with well-defined goals and specify positive and negative reinforcement. The learning algorithm modifies the agent’s actions during training, and its objective is to identify the behaviour that maximizes the long-term reward accumulated over the activity. Deep reinforcement learning can be used in planning and control applications.
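Deep RL replaces the value table below with a neural network, but the reward-maximizing update is the same. A minimal tabular Q-learning sketch on a hypothetical 1-D corridor environment (the environment, parameters, and names are all illustrative):

```python
import numpy as np

# Toy environment: a 1-D corridor of 5 cells; reward 1.0 only at the right end.
n_states, n_actions, goal = 5, 2, 4      # actions: 0 = left, 1 = right
alpha, gamma = 0.5, 0.9                  # learning rate and discount factor
Q = np.zeros((n_states, n_actions))      # long-term value of each (state, action)
rng = np.random.default_rng(3)

def step(state, action):
    nxt = max(0, min(n_states - 1, state + (1 if action == 1 else -1)))
    return nxt, (1.0 if nxt == goal else 0.0), nxt == goal

for _ in range(300):                     # training episodes
    s = 0
    for _ in range(30):
        a = int(rng.integers(n_actions))           # explore randomly (off-policy)
        s2, reward, done = step(s, a)
        # Q-learning update: nudge Q toward reward + discounted best future value
        Q[s, a] += alpha * (reward + gamma * Q[s2].max() - Q[s, a])
        s = s2
        if done:
            break

policy = Q.argmax(axis=1)                # learned best action in each state
```

After training, the learned policy moves right in every non-goal state, the action that maximizes long-term reward.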
Application of deep learning in GIS and RS

Machine learning tools have been a core component of spatial analysis in GIS for decades. AI, machine learning, and deep learning are helping us make the world better by, for example, increasing crop yield through precision agriculture, understanding crime patterns, and predicting when the next big storm will hit so that we are better equipped to handle it.

Esri has developed tools and workflows to utilize the latest innovations in deep
learning to answer some of the challenging questions in GIS and remote sensing applications.
Computer vision, or the ability of computers to gain understanding from digital images or
videos, is an area that has been shifting from the traditional machine learning algorithms to
deep learning methods.

The Integration of GIS and DL

 Massive increases in computing power are rapidly providing new possibilities,
particularly in DL, and GIS relies heavily on such computational capacity.
Integration of GIS with DL has proven valuable, particularly in 3D
modelling, map generation, and route calculation.
 When remotely sensed data are integrated with other geographical variables structured
within a GIS framework, the analytical capability of the system increases. GIS is
helpful throughout AI workflows: for example, images recognized using DL may be
geo-tagged in GIS, and DL algorithms can detect features in imagery managed in GIS.
ANNs require a compatible environment for data storage, retrieval, analysis, and
visualization, which integration with GIS can provide.
 The integration of GIS with DL is helpful in various applications such as the
classification of RS images and attribute data analysis. Combining a GIS database with
powerful DL models aids enhanced environmental mapping and object
search/detection in an integrated database; GIS data can in turn be incorporated
into remote sensing image classification.
Applications of Integration of DL with GIS

 GIS has evolved into a must-have tool for processing, analyzing, and visualizing spatial
data. Geographic data and geographic information systems (GIS) are so crucial in
environmental disciplines that we now consider them essential components of
research, education, and policy. There are numerous software programs available to aid
in GIS decision making.
 The Scopus, Google Scholar, and Web of Science databases were used to review
publications that included the integration of DL with GIS.
 One area of AI where deep learning has done exceedingly well is computer vision, or
the ability for computers to see. This is particularly useful for GIS because satellite,
aerial, and drone imagery is being produced at a rate that makes it impossible to
analyze and derive insight through traditional means.
 Image classification, object detection, semantic segmentation, and instance
segmentation are some of the most important computer vision tasks that can be
applied to GIS.

Image classification
“Image classification involves assigning a label or class to a digital image.”
Image classification is a method of assigning labels to imagery; which aspects of an image can be detected and categorized depends on its spectral and spatial resolution.

 Image classification has been performed in GIS for a long time, and DL has also
been used to perform image classification.
 This type of classification is also known as object classification or image recognition,
and it can be used in GIS to categorize features in an image.

Example:
For example, the drone image on the left below might be labeled crowd, and the
digital photo on the right might be labeled cat.

Drone image Digital image

Object Detection:
“With object detection, the computer needs to find the objects within an
image as well as their location.”

This is a very important task in GIS because it finds what is in a satellite, aerial, or
drone image, locates it, and plots it on a map. This task can be used for infrastructure mapping,
anomaly detection, and feature extraction. This process typically involves drawing a bounding
box around the features of interest.

Example:
For example, in the remote sensing image below, the neural network found the location
of an airplane. In a more general computer vision use case, a model may be able to detect the
location of different animals.

Semantic segmentation
“Semantic segmentation occurs when each pixel in an image is classified as belonging to a
class.”

 In GIS, this is often referred to as pixel classification, image segmentation, or image


classification.
 In GIS, semantic segmentation can be used for land-cover classification or to extract
road networks from satellite imagery.

Example:
For example, in the image on the left below, road pixels are classified separately from non-road pixels. On the right, pixels that make up a cat in a photo are classified as cat, while the other pixels in the image belong to other classes.

A nice early example of this work and its impact is the success the Chesapeake Conservancy has had in combining Esri GIS technology with the Microsoft Cognitive Toolkit (CNTK) AI tools and cloud solutions to produce the first high-resolution land-cover map of the Chesapeake watershed.

Land-cover classification using deep learning

Instance segmentation
“Instance segmentation is a more precise object detection method in which the
boundary of each object instance is drawn.”

 Instance segmentation is a computer vision task for detecting and localizing objects
in an image. It is a natural extension of semantic segmentation, and also one
of the most challenging tasks compared to other segmentation
techniques.
 Instance segmentation can be used for tasks like improving basemaps.
 This can be done by adding building footprints or reconstructing 3D buildings from lidar
data.
 This type of deep learning application is also known as object segmentation.
Example,

Building reconstructed in 3D using aerial LiDAR. The same building reconstructed in 3D from
the masks digitized by human editors (left), and semantic segmentation masks produced by
the Mask R-CNN (right)

 Esri recently collaborated with NVIDIA to use deep learning to automate the manually
intensive process of creating complex 3D building models from aerial lidar data for
Miami-Dade County in Florida.
 The project used this data to create segmentation masks for roof segments, which
were then used for 3D reconstruction of the buildings.
Deep learning for mapping
 An important application of deep learning for satellite imagery is to create digital maps
by automatically extracting road networks and building footprints.
 Imagine applying a trained deep learning model to a large geographic area and arriving
at a map containing all the roads in the region, along with the ability to create driving
directions using the detected road network. This can be particularly useful for
developing countries that do not have high-quality digital maps or for areas where new
development has taken place.

Roads can be detected using deep learning and then converted to geographic features.

Good maps need more than just roads—they need buildings. Instance segmentation
models like Mask R-CNN are particularly useful for building footprint segmentation and can
help create building footprints without any need for manual digitizing. However, these models
typically result in irregular building footprints that look more like Antoni Gaudí masterpieces
than regular buildings with straight edges and right angles. Using the Regularize Building
Footprint tool in ArcGIS Pro can help restore the straight edges and right angles necessary for
an accurate representation of building footprints.
Application of deep learning in Remote Sensing

The aim of deep learning for remote sensing is to use science-based methods to create decision-ready analysis. This involves gathering and preparing data (raster, point cloud, vector, field observations, etc.) and analyzing data from surfaces, principal components, band ratios, band indices, statistical models, metadata, etc., to provide decision-ready analysis that includes impervious-surface, occlusion, and encroachment data.

To extract information from imagery, traditional approaches often employ image processing techniques, such as edge detection, and hand-crafted feature extraction, such as SIFT (Scale-Invariant Feature Transform), HOG (Histogram of Oriented Gradients), and BoW (Bag of Words). However, most of this work uses natural scene images taken with an optical camera, and additional challenges arise when the models are applied to remote sensing imagery.

For instance, such data provide only a roof view of target objects, and the area coverage is
large, but the objects are usually small. Therefore, the available information of objects is
limited, not to mention issues of rotation, scale, complex background, and object-background
occlusions. Therefore, expansion and customization are often needed when utilizing deep
learning models with remote sensing imagery.
Image-level classification
“Image-level classification involves the prediction of content in a remotely sensed
image with one or more labels. This is also known as multi-label classification (MLC).”

 MLC can be used to predict land-use or land-cover types within a remotely sensed
image; it can also be used to predict natural or man-made features in order to
classify different types of images. In the computer vision domain, this has been a
very popular topic and a primary application area for CNNs.
 In remote sensing image analysis, CNNs and their combination with other machine
learning models are leveraged to support MLC.
 Recent work shows that combining a CNN with a GNN can additionally capture
spatio-topological relationships and therefore contributes to a more powerful image
classification model.
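The difference from single-label classification can be sketched as follows: instead of a softmax picking one class, each label gets an independent (sigmoid) probability, and every label above a threshold is predicted. The label set and scores here are hypothetical:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

LABELS = ["forest", "water", "urban", "cropland"]   # hypothetical scene labels

def multi_label_predict(logits, threshold=0.5):
    """Each label gets an independent probability; all labels whose
    probability clears the threshold are predicted for the scene."""
    probs = sigmoid(np.asarray(logits, dtype=float))
    return [lab for lab, p in zip(LABELS, probs) if p >= threshold]

# Scores a (hypothetical) CNN might emit for one scene containing forest + water
preds = multi_label_predict([2.1, 0.8, -1.5, -0.3])
```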

Deep learning for Remote sensing image classification


Object detection
“Object detection aims to identify the presence of objects in terms of their classes and
bounding box (BBOX) locations within an image.”

There are in general two types of object detectors: region-based and regression-based.

 Object detection finds a wide range of applications across social and environmental
science domains, from detecting natural and human-made features in
remote sensing imagery to inspecting the living conditions of underserved
communities. It has also found application in the aviation domain, where satellite
images are used to detect aircraft, helping track aerial activity as well as
associated environmental factors such as air and noise pollution.
 CapsNet is a framework that enables the automatic detection of targets in remote
sensing images for military applications.
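Detections are typically scored against ground truth with intersection-over-union (IoU) of their bounding boxes; a predicted BBOX counts as correct when its IoU with a ground-truth box exceeds a threshold. A self-contained sketch (the box coordinates are illustrative):

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)      # overlap rectangle corners
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    return inter / (area_a + area_b - inter) if inter else 0.0

# A predicted aircraft box compared against the ground-truth box
score = iou((10, 10, 50, 50), (30, 30, 70, 70))
```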

Object detection in remote sensing image

Semantic segmentation
“Semantic segmentation involves classifying individual image pixels into a
certain class, resulting in the division of the entire image into semantically varied regions
representing different objects or classes. It is also a kind of pixel-level classification.”
 Several methods have been developed to support semantic segmentation. To achieve
this, most neural-network-based models use an encoder/decoder
architecture, such as U-Net, FCN, SegNet, DeepLab, AdaptSegNet, Fast-SCNN, HANet,
Panoptic-DeepLab, SegFormer, or Lawin.
 The encoder conducts feature extraction through CNNs and derives an abstract
representation (also called a feature map) of the original image.
 The decoder takes these feature maps as input and performs deconvolution to
create a semantic segmentation mask.
 Semantic segmentation is frequently employed in geospatial research to identify
significant areas in an image. For example, Zarco-Tejada et al. developed an image
segmentation model to separate crops from background for precision
agriculture. Land-use and land-cover analyses detect land-cover types and their
distributions in an image scene.
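The decoder's final step can be sketched as per-pixel classification: given one score map per class, each pixel takes the class with the highest score. The class names and scores here are hypothetical stand-ins for a decoder's output:

```python
import numpy as np

CLASSES = ["background", "road", "building"]   # hypothetical classes

def segment(score_maps):
    """Per-pixel classification: score_maps has shape (n_classes, H, W);
    each pixel is assigned the class with the highest score."""
    return np.argmax(score_maps, axis=0)

rng = np.random.default_rng(4)
scores = rng.normal(size=(len(CLASSES), 4, 4))  # stand-in decoder output
scores[1, :, 2] += 10.0        # make column 2 score strongly as "road"
mask = segment(scores)         # H x W array of class indices
```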

Semantic segmentation

Height/depth estimation of 3D object from 2D images


Understanding 3D geometry of objects within remotely sensed imagery is an
important technique to support varied research, such as 3D modeling, smart cities, and ocean
engineering.

 In general, two types of information can be extracted from remote sensing imagery
about a 3D object: height and depth.
 LiDAR data and its derived digital surface model (DSM) data could support the
generation of a height or depth map to provide such information.
 There are generally two methods in the computer vision field to extract height/depth
from 2D images: monocular estimation and stereo matching.
 For height/depth estimation, remotely sensed images and conventional computer
vision images have different characteristics and pose different challenges. For
example, remotely sensed images are often orthographic, containing limited
contextual information. They also usually combine limited spatial resolution and
large area coverage with tiny targets for height/depth prediction.

Image super resolution


The quality of images is an important concern in many applications, such as medical imaging,
remote sensing, and other vision tasks from optical images.

 However, high-resolution images are not always available, especially those for public
use and that cover a large geographical region, due partially to the high cost of data
collection. Therefore, super resolution, which refers to the reconstruction of high-
resolution (HR) images from a single or a series of low-resolution (LR) images, has been
a key technique to address this issue.

Spatial resolution in remote sensing

 Recently, the development of deep learning has contributed much to image super
resolution research. Related work has employed CNN-based methods or Generative
Adversarial Network (GAN)-based methods. Dong et al. utilized a CNN to map between
LR/HR image pairs.
 In more recent years, approaches such as EfficientNet have been proposed to enhance
Digital Elevation Model (DEM) images from LR to HR, increasing the resolution up to
16 times without requiring additional information. Qin et al. proposed an Unsupervised
Deep Gradient Network (UDGN) to model the recurring information within an image and
used it to generate images with higher resolution.
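For contrast with these learned methods, the classical baseline is simple interpolation of the LR image; learned super-resolution models are judged by how much real detail they recover beyond such a baseline. A nearest-neighbour upsampling sketch:

```python
import numpy as np

def upscale_nearest(lr, factor):
    """Nearest-neighbour upsampling: each LR pixel becomes a factor x factor
    block in the HR grid. No new detail is created, which is exactly the gap
    that learned super-resolution models try to fill."""
    return np.repeat(np.repeat(lr, factor, axis=0), factor, axis=1)

lr = np.array([[0.0, 1.0],
               [2.0, 3.0]])
hr = upscale_nearest(lr, 2)    # 2x2 image -> 4x4 image
```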

Object tracking
Object tracking is a challenging and complex task. It involves estimating the position
and extent of an object as it moves around a scene.

 Applications in many fields employ object tracking, such as vehicle tracking, automated
surveillance, video indexing, and human-computer interaction.
 There are many challenges in object tracking, for example abrupt object motion,
camera motion, and appearance change. Therefore, constraints such as constant
velocity are usually added to simplify the task when developing new algorithms. In
general, object tracking comprises three stages: object detection, object feature
selection, and movement tracking.
 In the remote sensing context, object tracking is even more challenging due to low
resolution objects in the target region, object rotation, and object-background
occlusions. To solve the issue of low target resolution, Du et al. proposed an optical
flow-based tracker. An optical flow shows the variations in image brightness in the
spatio-temporal domain; therefore, it provides information about the motion of an
object.
Change detection
“Change detection is the process of identifying areas that have experienced
modifications by jointly analyzing two or more registered images, whether the change is
caused by natural disasters or urban expansions.”

 Change detection has very important applications in land use and land cover analysis,
assessment of deforestation, and damage estimation. Normally, before detecting
changes, there are some important image preprocessing steps, such as geometric
registration, radiometric correction, and denoising, that need to be undertaken to
reduce unwanted artifacts. For change detection, earlier studies employed image
processing, statistical analysis, or feature extraction techniques to detect differences
among images.
 For example, image differencing is the most widely used method. It generates a
difference distribution by subtracting two registered images and finds a proper
threshold between change and no-change pixels. Other approaches, such as image
ratioing, image regression, PCA (Principal Component Analysis), and change vector
analysis, are also well developed.
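Image differencing as described above can be sketched directly. The images and threshold are illustrative; a real workflow would first apply the registration, radiometric correction, and denoising steps mentioned earlier:

```python
import numpy as np

def change_mask(img_t1, img_t2, threshold):
    """Image differencing: subtract two co-registered images and threshold
    the absolute difference into change / no-change pixels."""
    diff = np.abs(img_t2.astype(float) - img_t1.astype(float))
    return diff > threshold

before = np.zeros((4, 4))
after = before.copy()
after[1:3, 1:3] = 0.9            # a patch that changed between the two dates
mask = change_mask(before, after, threshold=0.5)
changed_pixels = int(mask.sum()) # number of pixels flagged as changed
```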

Deep learning for change detection in Remote Sensing


Importance of Deep Learning
It is the capability to process large numbers of features that makes deep learning extremely powerful when dealing with unstructured data. However, deep learning algorithms can be overkill for less complex problems, because to be effective they require access to a huge volume of data. ImageNet, for instance, the common benchmark for training deep learning models for large-scale image recognition, contains more than 14 million images.

If the data are too simple or too scarce, a deep learning model can easily become overfitted and fail to generalize to new data. As a result, deep learning models are often no more effective than other techniques (such as linear models and boosted decision trees) for many practical business problems, such as predicting customer churn or detecting fraudulent transactions, and in other cases with smaller data sets and few features. In certain cases, such as multiclass classification, deep learning can still work well on smaller, structured data sets.

Challenges of Deep Learning in GIS and Remote Sensing

Main challenges of remote sensing image scene classification


Different from object-oriented classification, scene classification is a considerably
more challenging problem because of the variance and complex spatial distributions of
the ground objects present in the scenes. Historically, extensive studies of remote
sensing image scene classification have been made. However, no algorithm has yet
achieved the goal of classifying remote sensing image scenes with satisfactory accuracy.

The challenges of remote sensing image scene classification include the following:

 large intra-class diversity;
 high inter-class similarity (also known as low between-class separability);
 large variance of object/scene scales;
 coexistence of multiple ground objects.

Faced with this situation, it is difficult for single-label remote sensing image scene
classification to provide a deep understanding of the contents of remote sensing images.

For height/depth estimation, remotely sensed images and images from the field of
computer vision have different characteristics and pose different challenges. For
example, remotely sensed images are often orthographic and contain limited contextual
information. They also usually combine limited spatial resolution with large area
coverage, while the targets for height/depth prediction are tiny.

In the remote sensing context, object tracking is even more challenging due to
low-resolution objects in the target region, object rotation, and object-background
occlusions.

Problems in general --- Training data


 In the remote sensing field, there is currently a lack of large open training
datasets, and the formats of the open datasets that do exist are diverse.
 Small numbers of samples, lacking coverage of different regions and different
time phases.
 Training data are made in the same way as for other computer vision tasks, not
specialized for remote sensing tasks.
 Training data cover only a narrow range of spectral bands (e.g., R, G, B);
multispectral and hyperspectral training data are needed.
 In general, the samples are in 2D, not 3D.
 Lack of a uniform classification system: existing training data follow different
classification standards, with the same class appearing under different names
(e.g., Grassland/Grass) and the same name applied to different types of objects.
 Lack of training data from diverse spectra and sensors: existing training data
typically come from remote sensing images in RGB; infrared, multispectral, and
hyperspectral training data are lacking, as are training data from multiple
sensors, stereo models, and multi-view data.
 Models with poor generalization ability: remote sensing data have strong spatial
and temporal characteristics, but existing training data lack sufficient spans in
time and space (e.g., location changes, season transitions).
 Existing training data have labels of varying quality that are difficult to
evaluate: labels by experts vs. amateurs, automatic vs. semi-automatic labels.

Problems in general --- Neural network


 Existing neural networks consider only the image geometry; it is difficult for
them to take spectral characteristics and geoscience knowledge into account.
 Lack of flexibility in: adaptive optimization of data channels, scalable memory,
and the creation of scalable channels.

Gaps and Future Trends

 DL approaches can generate classifications or predictions about a particular
target based on a collection of input attributes, and GIS can produce the necessary
geographical input variables for such a DL model. However, there are constraints when
they are applied independently. These can be avoided by combining GIS with DL.
 Big data technology allows for the integration of many predictors and variables,
which is not feasible with GIS alone. To carry out such modelling, however, DL requires
a huge spatial database, which may be collected via a GIS database. This will enhance
the accuracy of the modelling. Further study on such issues is necessary.
 Environmental degradation indicators, environmental monitoring, climate change,
hydrological models such as groundwater and surface water modelling, assessment of
soil erosion, soil moisture, urban sprawl analysis, water distribution network analysis,
disaster risk analysis, site suitability, coral reefs, and sustainability have all been
studied using remote sensing and GIS.
 Future applications of DL with GIS and remote sensing include susceptibility mapping of
various natural hazards and investigating more complex feature selection or dimension
reduction methods to improve the prediction performance of DL methods.
 Integrating DL with GIS and Remote sensing aids in the development of a better
decision-making tool. The approach may be utilized as a support tool for the rapid and
efficient generation of various maps by organizations involved in disaster management,
water resources, the environment, and land use planning.

Deep learning in the future


Conclusions

DL is a fast-developing discipline of machine learning that enables data scientists to
use cutting-edge research while utilizing industrial-strength GIS and remote sensing
tools. Because of this appealing characteristic, neural networks are increasingly being
used to model complicated physical processes, particularly where precise field data are
lacking. GIS and remote sensing technologies can assist with every phase of the data
science workflow, including data preparation and exploratory data analysis, training
the model, executing spatial analysis, and lastly, distributing results via online
layers and maps and driving field activities.

DL algorithms, on the other hand, can handle tasks with complex data structures
and modelling, yielding high accuracy with greater flexibility and generalization capability.
Scene classification of remote sensing images has seen major improvements over several
decades of development.

A detailed and comprehensive assessment of the approach used with the DL algorithm,
as well as its performance analysis, can be undertaken. Further investigations on real-time
geospatial intelligence employing DL and GIS and remote sensing to analyze the progression of
natural disasters, climate variability, and real-time rescue operations can be performed.
References:

 https://www.techopedia.com/definition/190/artificial-intelligence-ai
 https://www.investopedia.com/terms/a/artificial-intelligence-ai.asp
 https://www.simplilearn.com/tutorials/artificial-intelligence-tutorial/what-is-artificial-intelligence
 https://www.expert.ai/blog/machine-learning-definition/
 https://www.spiceworks.com/tech/artificial-intelligence/articles/what-is-ml/
 https://www.javatpoint.com/machine-learning
 https://www.researchgate.net/publication/346031981_Deep_Learning_An_overview_and_its_pr
actical_examples
 https://www.techtarget.com/searchenterpriseai/definition/deep-learning-deep-neural-network
 https://www.mathworks.com/discovery/deeplearning.html#:~:text=Deep%20learning%20is%20a
%20machine,a%20pedestrian%20from%20a%20lamppost.
 https://www.aiche.org/resources/publications/cep/2018/june/introduction-deep-learning-part-
1?gclid=CjwKCAiA-8SdBhBGEiwAWdgtcOZ8cllwO9LzkBplQ6ln3bh__IbZFw4jHRCwP-
N7v4_dH1fN0qUBaxoCi9oQAvD_BwE
 https://www.slideshare.net/MohamedYousif13/using-deep-learning-in-remote-sensing
 https://www.esri.com/about/newsroom/arcwatch/where-deep-learning-meets-
gis/#:~:text=Deep%20learning%20uses%20computer%2Dgenerated,solve%20problems%2
0and%20make%20predictions.&text=Machine%20learning%20has%20been%20a%20core
%20component%20of%20spatial%20analysis%20in%20GIS.
 file:///C:/Users/aliahmad/Downloads/ijgi-11-00385-v2.pdf
 file:///C:/Users/aliahmad/Downloads/A_Brief_Review_of_Recent_Developmen.pdf
 https://www.researchgate.net/publication/342541335_Remote_Sensing_Image_Scene_C
lassification_Meets_Deep_Learning_Challenges_Methods_Benchmarks_and_Opportuniti
es
 https://www.sciencedirect.com/science/article/abs/pii/S1389041717303546
 https://www.analyticsinsight.net/the-history-evolution-and-growth-of-deep-learning/
 https://www.sciencedirect.com/science/article/pii/B9780128154809000153
 https://www.forbes.com/sites/bernardmarr/2016/03/22/a-short-history-of-deep-
learning-everyone-should-read/?sh=2b8b9bec5561
 https://pro.arcgis.com/en/pro-app/latest/help/analysis/image-analyst/deep-learning-in-
arcgis-pro.htm
