CHAPTER 21

Artificial intelligence and data analytics for geosciences and remote sensing: theory and application
Feras Al-Obeidat (1), Farhi Marir (1), Fares M. Howari (2), Abdel-Mohsen O. Mohamed (3,4), Neil Banerjee (5)

(1) College of Technological Innovations, Zayed University, Abu Dhabi, United Arab Emirates; (2) College of Natural and Health Sciences, Zayed University, Abu Dhabi, United Arab Emirates; (3) Uberbinder, Inc., Seattle, WA, United States; (4) EX Scientific Consultants, Abu Dhabi, United Arab Emirates; (5) Faculty of Science, Western University, London, Canada

21.1 Introduction
Artificial intelligence (AI) is a form of computing that enables machines to perform intellectual
functions, such as acting on or responding to an input. Traditional computing applications also respond
to data, but all of their reactions must be hand coded. The combination of AI with big data (BD) can
handle large volumes of data, both structured and unstructured, to reveal patterns and trends in a timely
manner, and the two work well together. That is because AI needs information to build its knowledge;
the more data that are available to AI applications, the more accurate the results that can be achieved.
Previously, AI did not perform well because of slow processors and small amounts of data and
information. Innovations in computer technology have since made this kind of data processing possible.
BD and AI are two powerful modern technologies that enable machine learning (ML), which can
be used to continuously update and reinterpret data in different sectors such as banking, health care, and
insurance. Organizations can benefit greatly from these innovations, especially in data analysis. For
instance, BD can support an accurate assessment of how the market is progressing and inform
organizations' decisions about what should be done to improve their business. Related work has
suggested that AI is expected to reduce human work in the coming decades through robotic machines
that make decisions based on the facts gathered (Lawrence et al., 2004; Maxwell et al., 2018;
Al-Obeidat et al., 2016).
In this chapter, we use remotely sensed data, fieldwork, and spatial models to measure, map, and
monitor biophysical properties in terrestrial, atmospheric, and oceanic environments so that we can
understand and manage the world's surroundings and assets more readily. Our work provides private
and public sector organizations with strategies to turn satellite and airborne imagery and field survey
data into meaningful maps or information from one or multiple points in time.

These outcomes can then be used to understand where, how, and why conditions are changing and to
manage environmental changes caused by human activities.
We present two case studies on how we use AI in mineral exploration and remote sensing. In
particular, we use ML algorithms to classify types of soil on a selected terrain, and we use a set of AI
approaches to explore the soil and detect minerals.

21.2 Machine learning applications


ML algorithms are universal approximators: from a large set of training data, they learn the
fundamental behavior of a system. Another appealing feature of ML-based strategies is that they do not
require prior knowledge about the nature of the relationships within the data. ML can be used under the
following conditions (Lary, 2010): (1) a deterministic model is available but costly, and ML is adopted
as a code accelerator; (2) a deterministic model is unavailable, and an empirical ML-based model is
adopted for the available information; and (3) classification problems.

21.2.1 Mineral mining


AI models include Bayesian, functional, and metaensemble approaches, which are used for the
quantitative detection of high-risk areas for safe mineral mining (Oh et al., 2019). The data provided in
that study were used to prevent subsidence over coal mine areas. The authors reported
the use of BD and AI techniques to forecast vulnerable areas. Similarly, Cai et al. (2018) reported the
use of AI on Landsat data. In their study, they pointed out the high-performance advantage of Landsat
data because of its desired spatial resolution, which enables the easy and quick classification of crops
and other environmental factors related to mineral mining. Lavreniuk et al. (2016) also supported these
results and highlighted that using AI on Landsat data enables quick computations owing to intelligent
computation methods, thus enhancing the classification of geospatial characteristics.
Lary et al. (2016) highlighted some critical points on advancements regarding the use of BD and
ML, part of AI, in situations in which theoretical knowledge is unnecessary. For instance, mineral
mining implies the observation of parts of the earth, such as ocean and land systems. In this phase, the
use of ML provides the potential to classify data accurately using existing data (Karpatne et al., 2019).
According to Charou et al. (2010), the use of AI in remote sensing provides integrated database
monitoring and analysis, which is essential for impact assessment and management.

21.2.2 Environmental monitoring


Because mining operations have a direct impact on the environment, BD technology is used to
identify, delineate, and monitor active mining areas to detect potential sources of pollution. In
Charou et al. (2010), the authors presented a case study conducted on the Vegoritis hydrological
basin, in which they discovered that the use of Landsat-5 and Landsat-7 images could provide the
possibility of mapping the natural environment and assessing the impact of mining. The results of
this study are supported by Vasileiou et al. (2012), who conducted a case study in Evia Island and
discovered that remote sensing data (RSD) integrated with AI techniques could be used to analyze
and predict the environmental impact of mining in a geographical region. Therefore, such
information and techniques can be applied in the long-term management of the environment by
monitoring and rehabilitating the mining areas.

21.2.3 Mineral exploration


Saibi et al. (2018) introduced the application of RSD in geosciences. The authors conducted a case
study in the Aynak-Logar valley, where they efficiently applied RSD analysis techniques to improve
the ability to map the regional characteristics and geological structures of the area, facilitating
mineral exploration. The use of such techniques is enhanced through AI algorithms, which can
approximate the relationships among various factors even without prior knowledge (Shahin et al.,
2015). In this case, the use of AI has enabled geotechnical engineers to employ sophisticated methods
to estimate geological characteristics. In addition, Maxwell et al. (2018) highlighted that ML
provides the potential for efficient and effective classification of remotely sensed images. This means
that the use of BD in classifying images enables scientists to handle remotely sensed data of high
dimensionality as well as map regions that have complex characteristics.
The use of BD and AI in remote sensing, mineral exploration, and Landsat analysis has proven
powerful in the field of geosciences. AI strategies and models have attributes that enable the easy
organization of remote sensing, exploration, and Landsat data sets. The artificial intelligence system is
a basis for evaluating the ecological effect of mineral exploration, and consequently for its proper
management.

21.3 Satellite images and Landsat hyperspectral data processing


In the following sections, we first explain some specific terminology, such as Landsat, aerial
photography, satellite imagery, Advanced Spaceborne Thermal Emission and Reflection Radiometer
(ASTER), raster images, and hyperspectral imaging.
Landsat is a scientific satellite program that studies and photographs the earth's surface using remote-sensing
methods. Different satellites are used to collect data from photographs of the terrestrial and coastal
areas of the earth. These satellites are outfitted with sensors that respond to earth-reflected sunlight
and infrared radiation. The first Landsat satellite was launched in 1972, and the program has kept
satellites circling the earth ever since. Landsat sensors have a moderate spatial resolution: they cannot
show individual houses in a Landsat picture, but they can show large man-made objects, such as
highways and skyscrapers. This spatial resolution is significant because it is coarse enough for
worldwide coverage yet detailed enough to describe human-scale processes such as urban development.
Satellite photos and aerial photography both give a perspective of the earth from above, and both are
used to study geography, survey regions of land, and even keep an eye on countries.
Both satellite imagery and airborne imagery applications provide workflows that are increasingly
automated. Image analysis, modeling, vector extraction, and computer processing performance on the
gathered image data sets have improved and can deliver finished map products to users. The main
difference between airborne cameras and satellite imagery is that the former gives ultrahigh resolution
for city-scale regions that demand clarity and detail and require applications to resolve individual
objects within the image, whereas satellite imagery covers larger areas, sometimes countrywide.
Furthermore, airborne imagery has a higher resolution, making the images easier to interpret.
ASTER is a sensor imaging instrument flown on the Terra satellite, launched in Dec. 1999, and
intended to collect land surface temperature, reflectance, emissivity, and elevation data. ASTER is a
joint venture between the National Aeronautics and Space Administration (NASA) and the Japanese
Ministry of Economy, Trade, and Industry.
Hyperspectral imaging, or imaging spectroscopy, combines the intensity of digital imaging and
spectroscopy. For every pixel in a picture, a hyperspectral camera obtains the light intensity (radiance)
for a huge number (up to several hundreds) of adjoining spectral bands. In this way, each pixel in the
picture contains a continuous spectrum (in radiance or reflectance) and can be used to characterize
objects in the scene with extraordinary precision and detail. Hyperspectral pictures give substantially
more detailed information about the scene than an ordinary color camera, which acquires only three
spectral channels corresponding to the visual primary colors red, green, and blue. Hence,
hyperspectral imaging intends to characterize objects in the scene depending on their spectral
properties. Multispectral imagery is delivered by sensors that measure reflected energy inside a few
specific bands in the electromagnetic range.
Raster pictures use many colorful pixels or individual building blocks to frame a complete image.
JPEGs, PNGs, and GIFS are basic raster image types. Generally, most photos found on the Web and in
print records are raster pictures. Because raster pictures are built using a fixed number of colored
pixels, they cannot be significantly resized without compromising their resolution. Raster files are
generally hard to alter without losing some data although there is software that can convert raster to
vector files. Vector images are more flexible than raster images: they are defined by mathematical
paths rather than fixed pixels, which can make them more efficient to process with AI algorithms. AI
and BD have become the most effective approach for conducting the classification and regression of
nonlinear systems.
Airborne and satellite pictures, including ASTER, Landsat, and hyperspectral imagery, are
generally used in remote sensing and geographic information systems (GIS) to comprehend natural
processes, environmental change, and anthropogenic behavior. These kinds of data are typically
multispectral or hyperspectral, with individual spectral bands stored in a raster file. In such cases, band
numbers range between 200 and 250, requiring robust information processing algorithms to extract
data content and remove redundancy.
Two main issues face the analysis of such complex hyperspectral data with conventional statistical
methods: the need for explicit preparation to deal with the variety of information in hyperspectral data,
and the rigidity of classification algorithms in assigning objects to fixed classes. Hence, in this case
study, we used a well-known database composed of the multispectral values of pixels from satellite
pictures, in which the classification is based on the neighborhood connected to the central pixel, with
the objective of producing a classification based on the multispectral values. To overcome this
challenge, we used a hybrid classifier that combines two classification techniques chosen from
different classification viewpoints: (1) the decision tree (DT), which depends on ML (Al-Obeidat et al.,
2015); and (2) the multiple-criteria decision analysis (MCDA) approach Procédure d'Affectation Floue
Pour la Problématique du Tri Nominal (PROAFTN), which translates in English to Fuzzy Assignment
Procedure for Nominal Sorting.
The following sections discuss DT, the MCDA method (PROAFTN), and a case study that
demonstrates the success of the hybrid method.

21.3.1 Machine learning


ML algorithms are generally categorized into two major learning approaches: supervised and
unsupervised learning. Supervised learning refers to a classification problem, whereas unsupervised
learning refers to a clustering problem. To perform supervised learning, the training data must be
labeled before generating the classification model, which can be used later to assign new testing data.
The advantage of the supervised approach is that once we have a stable model, it can be used to
classify any new instances without the need to train on the data again. In unsupervised learning, the
data have no labels and are distributed into groups, where data with similar characteristics are clustered
together. The major advantage of clustering is that a training model is not required, and each new
data set object can be assigned to its closest cluster by comparing similarity.

21.3.2 Decision tree


In this analysis, we focus on the supervised learning approach, because the data we are testing are
already labeled into known categories. One well-known supervised learning approach is the DT. DT is
used in many applications and is known to be a good classifier because it generates high-accuracy
outcomes. Furthermore, the outcomes of a DT are understandable because they can be represented by
decision rules. Every classification algorithm has some shortcoming; the shortcoming of DT is that,
while building the model, it generates strict intervals on numeric values. The process of doing this is
called discretization. Such a process leads to a strict rule for assigning objects to a specific category.
The strengths of DT can be summarized as follows:
• It is easy to understand (i.e., through the generation of decision rules).
• The generated model uses a Boolean (yes/no) logic approach.
• The mathematical formulas used to build the tree are easy to understand and efficient.
• The outcomes and generated results are acceptable in most cases; however, this depends largely
on the correct choice of the data set.
The strengths of DTs make them a popular classifier in ML; therefore, they are used in research and
applications (Al-Obeidat et al., 2015).

21.3.3 Multiple-criteria decision analysis method PROAFTN


The MCDA method PROAFTN is a fuzzy classifier categorized as a supervised learning
approach to solve multicriteria classification problems (Al-Obeidat et al., 2010; Al-Obeidat and
Belacel, 2011); it has been applied, for example, in the health, business, and security fields
(Al-Obeidat et al., 2009, 2018; El-Alfy and Al-Obeidat, 2014). The major features of PROAFTN are that:
1. It provides a clear description of its classification approach and also offers access to increasingly
detailed information concerning the classification decision.
2. It can implement two learning approaches: inductive and deductive. In the deductive approach, the
decision-maker specifies the essential parameters for the examined problem, whereas in the
inductive approach, the boundaries and classification models are obtained automatically from
the data sets.
3. It performs the outranking match to determine the preference relations.

4. Pairwise evaluations between objects and prototypes in PROAFTN eliminate the need for
normalization, which is necessary in many ML algorithms.
5. Its prototypes are constructed based on a fuzzy approach. The fuzzy index gives a clue about how
weak or strong the membership is for the matching classes.
The PROAFTN classification methodology is presented in Fig. 21.1. The fuzzy methodology is
used to assign an object to its closest class in PROAFTN.
A limitation of PROAFTN is the number of parameters involved (e.g., intervals, discrimination
thresholds, and weights) that must be determined to perform the classification procedure. In the MCDA
paradigm, the parameters are usually obtained in two ways: (1) the direct approach, which interactively
elicits the parameters from the decision-maker; and (2) the indirect approach, which is based on
automatic procedures to find the required parameters from the available data set (Al-Obeidat and
Belacel, 2011). In addition, it is generally problematic to allocate exact quantitative values to these
parameters because they are time dependent. Therefore, an automatic method is mostly used to infer
these parameters.
Furthermore, to smooth the intervals and allow flexible bounds in DT, we used the fuzzy approach
embedded in the PROAFTN method. PROAFTN has the advantage of using a fuzzy approach, which
provides detailed information on an object assigned to a class. This benefit of PROAFTN (Fig. 21.1)
compensates for the loss of information that occurs when using strict intervals, as in the case of DTs.
Therefore, we first used DT to produce the decision rules. Then, PROAFTN was applied to assign each
point to the nearest class based on the MCDA methodology. In addition, we used attribute cleaning,
including attribute selection and discretization, to improve efficiency and accuracy.

21.3.4 Hybrid classification model


Classification problems involve the construction of a classification model that captures the behavior
and types of the available objects in order to assign new, unlabeled items to the right class. Researchers
in a variety of disciplines, such as statistics, economics, MCDA (Roy, 1996), and ML/AI, have tackled
the classification problem. Some previous studies used DT, classification, and remote sensing to
process Landsat data (Sebastian, 1989; Lawrence et al., 2004; Moran et al., 2002). However, no
previous study has proposed the procedure defined next to classify a Landsat data set.

FIGURE 21.1
PROAFTN computation of $C_j(a, b_i^h)$.
The DT algorithm is a broadly used routine in ML and data mining. It is a numerical methodology
that is mostly used to determine the pixel class of the given population of multispectral values of
Landsat multispectral scanner image data. The strengths of DT are the generation of an easily
understood tree and the computational speed of generating it. However, despite its strengths, the key
restriction of DT is that it produces inflexible intervals when assessing cases of a class (i.e., rigorous
principles for labeling objects to a class), which means that no bordering region has values other than
yes and no. PROAFTN, on the other hand, has interesting qualities, including creating reasonable rules
and using a degree of fuzzy membership, which gives comprehensive information on allotting an object
to a class. This improvement in PROAFTN compensates for the restriction of the DT approach, namely
the loss of information caused by using exact intervals.
To deal with the shortcomings of the two approaches (PROAFTN and DT), we combined both
approaches to develop an optimum solution for the classification problem, called the fusion
classification technique. Our prime objective is to have a novel and efficient data mining/classification
strategy in terms of accuracy and interpretability. Thus, the motivation in the following section is the
use of the MCDA method PROAFTN as the ML methodology (Al-Obeidat et al., 2010).
The preceding discussion shows that PROAFTN and DT can be described as white box models.
Both algorithms produce classification models that can be easily rationalized and understood.
However, when assessing any classification method, an additional factor, classification accuracy, must
be considered. According to our analysis of this case study, the PROAFTN technique produced greater
classification precision than both DT algorithms (ID3 and C4.5, the latter being an improvement on
Quinlan's earlier ID3 algorithm) (Quinlan, 1996; Al-Obeidat and Belacel, 2011).

21.4 Decision tree


21.4.1 Algorithm
As highlighted earlier, DT analyses (DTA) are part of data mining and ML, using the DT as a
classification approach that employs observations and features about an object to determine the object's
target category (Witten, 2005). DTA have three fundamentals: (1) nodes, which represent the attributes;
(2) branches, which denote the values of each attribute; and (3) leaf nodes, which represent the output
of the classification model (the class label). The classification model in DTA is created from the
training examples using mathematical formulas that include entropy, information gain, or the gain
ratio. For each attribute in the available or residual data, the entropy of the corresponding set of items is
computed, as discussed subsequently.
There are various versions of DTA, the most widely used of which is C4.5. The classification model
in DTA is built recursively from the training samples using, e.g., entropy, information gain, or the gain
ratio (used in C4.5) for each attribute in the available or residual data. The entropy is measured as:

$$H(N) = -\sum_{c \in C} p(c)\,\log_2 p(c) \qquad (21.1)$$

where N denotes the data set; p(c), the percentage of examples in the data set of class c; and C, the set
of classes.
The information gain is computed as:

$$IG(A) = H(N) - \sum_{t \in T} p(t)\,H(t) \qquad (21.2)$$

where T represents the subsets created by splitting N; p(t), the proportion of examples in subset t; and
H(t), the entropy of subset t.
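As a minimal illustration of Eqs. (21.1) and (21.2) in R, the following sketch computes the entropy of the Species labels of the Iris data set and the information gain of one illustrative split; the split threshold here is chosen only for demonstration.

# Entropy of a vector of class labels (Eq. 21.1)
entropy <- function(labels) {
  p <- table(labels) / length(labels)
  p <- p[p > 0]                 # drop empty classes to avoid 0 * log2(0)
  -sum(p * log2(p))
}

# Information gain of splitting the labels by a grouping variable (Eq. 21.2)
info_gain <- function(labels, groups) {
  parts <- split(labels, groups)
  weighted <- sapply(parts, function(s) (length(s) / length(labels)) * entropy(s))
  entropy(labels) - sum(weighted)
}

entropy(iris$Species)                              # log2(3) = 1.585 for three balanced classes
info_gain(iris$Species, iris$Petal.Length < 2.45)  # gain of an illustrative binary split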
DTA generally ends the training procedure when one of the following conditions is met: (1) all
leaves are purely classified (in certain circumstances, pure nodes can be reached using subsets of
attributes); or (2) no new features or cases exist for further splitting.
The foremost assets of DTA methods are that (1) they can produce rules that are easily understood;
(2) they can manage both quantitative and nominal variables (although some DTA classifiers, such as
ID3, generally require preprocessing steps such as discretization before learning is initiated); and
(3) they provide a clear indication of which fields or attributes are most important for classification.
Limitations of DTA methods are that (1) they are prone to errors, mainly in classification problems
in which many classes are involved or only a relatively small number of training examples is available
(Al-Obeidat and Belacel, 2011); (2) they are generally computationally expensive to train, because
they require comparisons over all possible splits, and the pruning process (prepruning or postpruning)
is also expensive; and (3) discretization causes a loss of information, which affects classification
accuracy.
To explain the concept of DT, here is an example of applying DT to the Iris data using R. First,
let us look at the Iris data. Iris is a multivariate flower data set introduced by the British statistician
Ronald Fisher. The data set has five attributes describing each flower: Sepal Width, Sepal Length,
Petal Width, Petal Length, and Species, which represents the variety of the flower. The data set
contains 150 samples distributed evenly among the species Iris versicolor, Iris setosa, and Iris virginica.
The lengths and widths of each sample are measured in centimeters.

21.4.2 Implementation in R
To know the volume of the data set (e.g., the number of attributes and objects [flowers]), we can run
this command:
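One minimal possibility, using the iris data frame that ships with R:

dim(iris)   # returns 150 (flowers) and 5 (attributes)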

In this case, we have 150 flowers and five attributes. To know the name and structure of each
attribute, we can run summary(), head(), or tail(). The following command shows the first six rows of
the Iris data set; this helps us see how the data look, the number of attributes, and their types.
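For example:

head(iris)   # first six rows: the four numeric measurements plus the Species label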

In the next stage, our target is to build the model based on our data set. To do that, it is essential to
know what the target of the model is. In the Iris example, we are looking to build a model that classifies
a flower as I. versicolor, I. setosa, or I. virginica. This model will consider the attributes Sepal Width,
Sepal Length, Petal Width, and Petal Length.
In R, we can do the following:
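As a minimal sketch, we can first confirm the distribution of the target variable; the built-in iris data set has 50 flowers of each species:

table(iris$Species)   # setosa: 50, versicolor: 50, virginica: 50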

Building the model in R:
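A sketch of one way to do this, assuming the ctree() function from the party package; this implementation is an assumption, chosen because its default splits on the Iris data match those described for Fig. 21.2:

library(party)   # conditional inference trees

# Fit a classification tree: Species as a function of the four measurements
iris_ctree <- ctree(Species ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width,
                    data = iris)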

By applying this command, we obtain the classification model, which can be used to test the model's
performance on historical data and can be applied to new data/flowers to determine their type.
Let us check the execution of the model on the available data. In R, we can do the following:
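A sketch of this check, comparing the predictions of the model fitted above with the actual species in a confusion matrix:

table(Predicted = predict(iris_ctree), Actual = iris$Species)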

These results show the model’s performance on each category. The model performs well: (1) for I.
setosa, all flowers are classified correctly; (2) for I. versicolor, 49 are correctly classified and one is
misclassified; and (3) for I. virginica, 45 are correctly classified and five are misclassified.
To check the model’s performance, we can perform the following calculation:
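For example, the overall accuracy is the proportion of correctly classified flowers; with 144 of the 150 flowers classified correctly, this gives 0.96:

mean(predict(iris_ctree) == iris$Species)   # (50 + 49 + 45) / 150 = 0.96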

As shown, the model generates 96% classification accuracy, which is considered high.

21.4.3 Model tree


Because the algorithm is called a decision tree, we will show the generated model in the form of a tree.
The figure generated from the output makes the classification process more interpretable and
self-explanatory. Using R, we can run the following command:
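A sketch of the plotting command for the model fitted earlier:

plot(iris_ctree)   # draws the tree structure, as in Fig. 21.2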

The model presented in Fig. 21.2 shows that any flower with a Petal Length of 1.9 cm or less is
classified as setosa. However, if Petal Length is greater than 1.9 cm, another question will be posed:
whether Petal Width is greater than 1.7 cm or less. If it is found to be greater than 1.7 cm, the type of
the flower is virginica.

The DT model is a good method of classification; it works with any type of data and is interpretable.
Classification is straightforward and can be read as decision rules in an IF-ELSE form.

FIGURE 21.2
Decision tree for Iris data.

21.5 PROAFTN method


PROAFTN involves using several parameters to compose its prototypes for classification. In this work,
an automatic ML procedure is used to obtain these parameters from data. With this approach, the
essential preferential information (prototypes) required for classification is first determined from the
labeled instances that form the training set; then, these data are used to allocate the new instances (i.e.,
the testing data). The automatic data-driven method used here is similar to the learning methodology
used by DTA and other ML classifiers, in that it exploits the training examples to construct the
classification model and perform classification.
PROAFTN models, also called prototypes, are assembled for each class; after construction, they are
presented again to PROAFTN to classify new cases. The following sections define the essential
parameters and describe the classification procedures and methodology used by PROAFTN.

21.5.1 Initialization
Using the notation in Table 21.1, and given a set of n items identified as the training set, assume $a$ is an
item that needs to be categorized; this item $a$ is described by a set of m features $\{g_1, g_2, \ldots, g_m\}$ and
z categories $\{C^1, C^2, \ldots, C^z\}$.
Given that item $a$ is described by its scores on the m attributes, the phases of the process are:
(1) for every category $C^h$, we define a set of $L_h$ prototypes; and (2) for each prototype $b_i^h$ and each
feature $g_j$, an interval $[S_j^1(b_i^h), S_j^2(b_i^h)]$ is defined, where $S_j^2(b_i^h) \geq S_j^1(b_i^h)$.
When assessing a specific quantity or measure against a strict (crisp) interval, two situations must be
avoided. The first is that an overly cautious assessment may be made, in which case the strict interval
would have to be made larger. The second is that a wrong judgment may be made, in which case the
measured value falls outside the bounds of the strict interval. In either case, the reliability of the
obtained results may be uncertain. Because PROAFTN uses fuzzy intervals, we do not anticipate these
problems: the fuzzy intervals simultaneously allow both pessimistic and optimistic models of the
considered measures (Belacel et al., 2007).

Table 21.1 PROAFTN parameters.

A: the set of objects {a_1, a_2, ..., a_n} to assign to different categories
m: the number of criteria or attributes {g_1, g_2, ..., g_m}
U: the set of categories or classes {C^1, C^2, ..., C^z}, with z ≥ 2
B^h: the prototypes of the hth category, with b_i^h denoting the ith prototype of the hth category
B: the set of all prototypes, B = B^1 ∪ B^2 ∪ ... ∪ B^z
[S_j^1(b_i^h), S_j^2(b_i^h)]: the interval of prototype b_i^h for each attribute g_j in each class C^h, with j = 1, 2, ..., m
d_j^1(b_i^h), d_j^2(b_i^h): the thresholds of prototype b_i^h for each attribute g_j in each class C^h
w_j^h: the weight of each attribute g_j in each class C^h

Accordingly, we introduce the thresholds $d_j^1(b_i^h)$ and $d_j^2(b_i^h)$ to define, at the same time:
(a) the pessimistic interval $[S_j^1(b_i^h), S_j^2(b_i^h)]$, and
(b) the optimistic interval $[S_j^1(b_i^h) - d_j^1(b_i^h), S_j^2(b_i^h) + d_j^2(b_i^h)]$.
Therefore, the fuzzy interval from $S_j^1(b_i^h) - d_j^1(b_i^h)$ to $S_j^2(b_i^h) + d_j^2(b_i^h)$ is chosen so that it does
not exceed the considered quantity beyond a critical boundary, and the values between $S^1$ and $S^2$
encompass the most plausible true values. The use of PROAFTN requires us to obtain the pessimistic
and optimistic intervals for each feature.
One of the foremost tasks of this study is to propose an indirect procedure to obtain these intervals from
the data during the learning phase. Then, after the intervals are defined, PROAFTN is ready to be used
for assigning objects to the nearest class. The following subsections describe the steps needed to
classify item $a$ to category $C^h$ using PROAFTN.

21.5.2 Fuzzy indifference relation


After determining the fuzzy pessimistic and optimistic intervals for each feature, the next step is to
compute the fuzzy membership degree between item $a$ and prototype $b_i^h$, denoted $I(a, b_i^h)$.
The computation of $I(a, b_i^h)$ is given by:

$$I(a, b_i^h) = \sum_{j=1}^{m} w_j^h\, C_j(a, b_i^h) \qquad (21.3)$$

where
(a) $w_j^h$ is the weight that measures the relative importance of attribute $g_j$ for class $C^h$,
(b) $w_j^h \in [0, 1]$ and $\sum_{j=1}^{m} w_j^h = 1$, for j = 1, ..., m and h = 1, ..., z, and
(c) $C_j(a, b_i^h)$ is the numerical value that assesses the nearness of item $a$ to prototype $b_i^h$ with
respect to attribute $g_j$.
To calculate $C_j(a, b_i^h)$, two positive thresholds $d_j^1(b_i^h)$ and $d_j^2(b_i^h)$ are required; the computation
is given by:

$$C_j(a, b_i^h) = \min\{C_j^1(a, b_i^h),\, C_j^2(a, b_i^h)\} \qquad (21.4)$$

where

$$C_j^1(a, b_i^h) = \frac{d_j^1(b_i^h) - \min\{S_j^1(b_i^h) - g_j(a),\; d_j^1(b_i^h)\}}{d_j^1(b_i^h) - \min\{S_j^1(b_i^h) - g_j(a),\; 0\}}$$

and

$$C_j^2(a, b_i^h) = \frac{d_j^2(b_i^h) - \min\{g_j(a) - S_j^2(b_i^h),\; d_j^2(b_i^h)\}}{d_j^2(b_i^h) - \min\{g_j(a) - S_j^2(b_i^h),\; 0\}}$$

21.5.3 Membership evaluation


The degree of membership between item $a$ and class $C^h$ depends on the degree of indifference between
$a$ and its closest neighbor in $B^h$. The equation used to identify the closest neighbor is:

$$\delta(a, C^h) = \max\{I(a, b_1^h),\, I(a, b_2^h),\, \ldots,\, I(a, b_{L_h}^h)\} \qquad (21.5)$$

21.5.4 Categorization
The final phase is to assign item $a$ to the correct class $C^h$; the computation needed to locate the right
class is direct:

$$a \in C^h \iff \delta(a, C^h) = \max\{\delta(a, C^i) \mid i \in \{1, \ldots, z\}\} \qquad (21.6)$$
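The following R sketch puts Eqs. (21.3) to (21.6) together for a small, entirely illustrative configuration: two classes with one prototype each, two attributes, and hand-picked intervals, thresholds, and weights. None of these values come from a learned model; they only demonstrate the computation.

# Partial indifference C_j(a, b) for one attribute (Eq. 21.4)
cj <- function(x, s1, s2, d1, d2) {
  c1 <- (d1 - min(s1 - x, d1)) / (d1 - min(s1 - x, 0))
  c2 <- (d2 - min(x - s2, d2)) / (d2 - min(x - s2, 0))
  min(c1, c2)
}

# Fuzzy indifference I(a, b) over all attributes (Eq. 21.3)
indifference <- function(a, proto, w) {
  sum(w * mapply(cj, a, proto$s1, proto$s2, proto$d1, proto$d2))
}

# Illustrative prototypes: two classes, one prototype each, two attributes
protos <- list(
  ClassA = list(list(s1 = c(4.9, 3.0), s2 = c(5.8, 3.6), d1 = c(0.3, 0.2), d2 = c(0.3, 0.2))),
  ClassB = list(list(s1 = c(6.0, 2.4), s2 = c(7.0, 3.0), d1 = c(0.4, 0.3), d2 = c(0.4, 0.3)))
)
w <- c(0.5, 0.5)   # attribute weights, summing to 1
a <- c(5.6, 3.1)   # item to classify

# Membership delta(a, C^h): best prototype per class (Eq. 21.5), then assignment (Eq. 21.6)
delta <- sapply(protos, function(ps) {
  max(sapply(ps, function(p) indifference(a, p, w)))
})
names(which.max(delta))   # the class with the highest membership degree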

21.5.5 PROAFTN learning


As discussed earlier, PROAFTN involves the elicitation of its boundaries for classification. In this
process, an automatic procedure is constructed in which (1) a set of instances is identified as the
training set; (2) the required preferential data (prototypes) needed to build the classification model are
obtained first; and (3) these data are used to assign the new instances (the testing data).

21.5.6 Determination of PROAFTN intervals


Because most of the data set types used in AI applications are numerical, it is necessary in some
algorithms to convert these data into categories or intervals. Some classification models (e.g., naive
Bayes) prefer this type of data preprocessing before the classification model is built.
In this study, the PROAFTN algorithm requires the discretization of numerical data as an
important step in building the model. Numerical data must be divided into a set of intervals to build the
PROAFTN prototypes, which are required later to build the classification model. As an example,
suppose we have 1000 patients with ages ranging from 0 to 100 years. These ages could be categorized
into intervals such as [0,10], [11,20], ..., [91,100]. These intervals are then mapped to each value
in the age column. Thus, the values 14, 15, 7, and 92, for example, are respectively mapped to [11,20],
[11,20], [0,10], and [91,100].
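A minimal R sketch of this mapping using cut(); note that cut() produces half-open intervals such as (10,20] rather than the integer-style [11,20] notation used above:

ages <- c(14, 15, 7, 92)
cut(ages, breaks = seq(0, 100, by = 10), include.lowest = TRUE)
# result: (10,20] (10,20] [0,10] (90,100]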
The discretization strategy is commonly used in some ML algorithms such as DTA, notably ID3, and
naive Bayes (Fayyad and Irani, 1993). During discretization, numerical-valued attributes are converted
to nominal ones by splitting the features' continuous range [min, max] into subintervals. Discretization
strategies can be primarily classified as supervised or unsupervised (Al-Obeidat et al., 2009).
Supervised discretization procedures involve the class or category during splitting, whereas
unsupervised discretization does not involve the class label. Several techniques, such as equal
frequency binning (EFB) and equal width binning (Al-Obeidat et al., 2009), are considered
unsupervised discretization methods. In addition, the k-means clustering procedure can also be used as
an unsupervised discretization method. In the context of supervised learning, entropy-based
(information gain) discretization is an example of a supervised method that uses class labels during
discretization.
Discretization techniques are exploited in this study with PROAFTN, which we compared with the
DTA (ID3 and C4.5) and other ML algorithms. The purposes of the discretization algorithms with
PROAFTN are fundamentally to (1) find the boundaries $[S_j^1(b_i^h), S_j^2(b_i^h)]$ automatically for each
feature in the training data set; and (2) fine-tune the obtained intervals to get the other fuzzy boundaries
$[S_j^1(b_i^h) - d_j^1(b_i^h), S_j^2(b_i^h) + d_j^2(b_i^h)]$, which will be used afterward to construct the
classification model.
The discretization used in DTA typically creates discrete and strict intervals; thus, it may produce
noise, vagueness, or loss of information, which ultimately could generate a poor classification
model (Peng, 2001). Because the proposed solution is to use both pessimistic and optimistic intervals,
the fuzzy intervals from $S_j^1(b_i^h) - d_j^1(b_i^h)$ to $S_j^2(b_i^h) + d_j^2(b_i^h)$ are employed with PROAFTN
to prevent the loss of information. Accordingly, the considered quantity will be within reasonable
boundaries, and the points within the range from $S^1$ to $S^2$ are considered to retain the highest
true values.
We also applied k-means and EFB as unsupervised discretization techniques to produce $S_j^1(b_i^h)$ and
$S_j^2(b_i^h)$. To determine the values of $d_j^1(b_i^h)$ and $d_j^2(b_i^h)$, a fine-tuning was applied to
$S_j^1(b_i^h)$ and $S_j^2(b_i^h)$ to provide greater flexibility in allocating items to the nearest classes. The
interval adjustment is obtained as:

$$d_j^1(b_i^h) = \beta\, S_j^1(b_i^h), \qquad d_j^2(b_i^h) = \beta\, S_j^2(b_i^h), \qquad \beta \in [0, 1] \qquad (21.7)$$

21.5.7 Classification model


Building the classification model is not a straightforward process. The following steps are required:
(1) a discovery stage, in which the problem at hand and the hypothesis are identified; (2) a data
collection and preparation stage, which is an important and time-consuming task; and (3) a data
cleaning, preprocessing, and normalization stage. Once these stages have been completed successfully,
a classifier such as a neural network or a decision tree can be used to build the classification model.
This learning process is completed by operating on a division of the data set called the training data set
and afterward using an unlabeled division called the test data set for validation.
As an example of classification, assume that we have a data set of bank customers. Each customer is
described by attributes such as age, salary, number of dependents, marital status, profession, and
education level. Based on each customer's behavior or history with the bank, the bank adds a column
labeling the customer as good or bad. The goal is to build an automatic way to predict whether a new
customer is good or bad based on attributes such as age, salary, and so on. In AI, the algorithms learn
from experience and apply it to new experience. Returning to our example, we can use DT to create a
model from previous customer data to classify new clients. The model helps us automatically select
and discriminate between potentially good and bad customers. Based on the output of the model, the
bank can decide to offer a better loan to a customer with good credit. We can also use this kind of
modeling in several applications such as education, image recognition, speech recognition, and health.
For example, in health, a model can predict the potential for developing diabetes in the future based on
criteria such as family history, insulin level, body weight, activity, and so on.
Throughout this work, we offer another methodology based on the well-known DT algorithm and
augmented with the multicriteria decision analysis method PROAFTN. The objective is to build a
strong classification technique that uses the best of both the decision tree and PROAFTN. An induction
approach inspired by DT is proposed, with some differences between them. The introduced induction
approach considers all features during learning, whereas the induction process used in DT may not
involve all features, because the DT algorithm stops learning once all leaves are classified by a
selective set of attributes. Also, DT commonly uses the information gain or gain ratio in a recursive
manner to determine the optimum attributes required to form the tree. In our approach, the induced tree
depends on the percentage of information in each interval that is representative of each attribute in
each class; hence, a threshold is applied. This selection process is obtained by measuring the proportion
of data and checking whether it exceeds or is equal to the proposed threshold; if so, the interval is
selected as part of a prototype.
The induction approach is given in Table 21.2, in which the tree is assembled in a recursive,
top-down, divide-and-conquer manner. Each branch in the tree denotes the values of an attribute,
generated as a set of intervals using a discretization technique. The choice of the best branches, which
compose the prototypes, is based on the choice of the threshold.

21.5.8 Hybrid DT and PROAFTN


As deliberated before, to employ PROAFTN we must find the parameters
$\{S_j^1(b_i^h), S_j^2(b_i^h), d_j^1(b_i^h), d_j^2(b_i^h)\}$ for each feature in each class. At first, DT is used as a
preliminary step to obtain these parameters through the discretization method presented by Fayyad and
Irani (1993). The discretization step basically uses the entropy and information gain described in
Eqs. (21.1) and (21.2). To define the fuzzy adjustment values $d_j^1(b_i^h)$ and $d_j^2(b_i^h)$, a tuning exercise
is performed on $S_j^1(b_i^h)$ and $S_j^2(b_i^h)$ to permit further flexibility in allocating patterns to the nearest
categories: $d_j^1(b_i^h) = S_j^1(b_i^h) - \beta S_j^1(b_i^h)$ and $d_j^2(b_i^h) = S_j^2(b_i^h) - \beta S_j^2(b_i^h)$, where $\beta \in [0, 1]$.

21.5.9 Classification model development


As an example of implementing DT on a data set, we showed in Fig. 21.2 how the classification model
is generated. In this example, the attributes are partitioned into intervals during the learning process.
The model output is shown below:

Table 21.2 Building PROAFTN model.


Algorithm 1

1 Indexes used: i for prototype; h for class
2 g for attribute or feature; j for attribute values
3 Compute the entropy of every attribute
4 Compose the tree by splitting the set N into subsets based on information gain
5 Select the node with the best information gain
6 Recursively continue on the remaining subsets g_j
7 If the value of greatest information gain is located then
8 Choose intervals for prototype b_i^h for class C^h
9 Else
10 Allocate w_j^h = 0 for this value
11 Go to the next subattribute
12 End if
13 Continue with learning by recursively calling step 5 for the remaining nodes

From the output, we notice that the cut points took place at 1.9 for Petal Length, 1.7 for Petal Width,
and 4.8 for Petal Length. The discretization in this case is implemented while building the model. With
the PROAFTN method, in contrast, the discretization of attributes is done in advance, i.e., before the
start of learning (building the classification model).
To further explain the discretization and induction approach with PROAFTN, we present a real
numerical example using the Iris data. In this example, the attributes are first discretized for each class
C using k-means (k: the number of bins or clusters). The lower and upper bounds of each cluster
represent the intervals $I_{jh}^r$, r = 1, 2, ..., k, for each attribute $g_{jh}$ in the class, as shown in Table 21.3.
After the discretization step, the induction procedure is applied to compose the final prototypes.
The induction tactic defines a threshold b to nominate the best intervals, where b is defined as the ratio
of the number of items belonging to each interval for each attribute in each class to the total number of
items in that class. To illustrate this concept, let us consider b = 5%, which means that for any interval
in the prototype structure, the ratio of the number of objects within the selected interval(s) to the total
number of objects in the class should be ≥ 5%. For instance, in Table 21.4, $b_1^1$ = {[5.30,5.80],
[2.30,3.20], [1.60,1.90], [0.10,0.20]} is chosen to be the first prototype representing the first class
(i.e., Setosa). The same procedure is then applied to find the sets of prototypes for the other classes.

Table 21.3 Structure of prototypes.

Attribute | b_1^h (Class C^h) | b_2^h (Class C^h) | ... | b_{L_h}^h (Class C^h)
g_1 | [S_{1h}^1, S_{1h}^2] | [S_{1h}^1, S_{1h}^2] | ... | [S_{1h}^1, S_{1h}^2]
g_2 | [S_{2h}^1, S_{2h}^2] | [S_{2h}^1, S_{2h}^2] | ... | [S_{2h}^1, S_{2h}^2]
g_3 | [S_{3h}^1, S_{3h}^2] | [S_{3h}^1, S_{3h}^2] | ... | [S_{3h}^1, S_{3h}^2]
... | ... | ... | ... | ...
g_m | [S_{mh}^1, S_{mh}^2] | [S_{mh}^1, S_{mh}^2] | ... | [S_{mh}^1, S_{mh}^2]

Each interval in a prototype is drawn from one of the k clusters produced by discretizing that attribute
for class C^h.
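A minimal R sketch of this per-class, per-attribute k-means discretization, producing the lower and upper bounds of each cluster for one class of the Iris data; the cluster boundaries will vary slightly between runs unless a random seed is fixed:

data(iris)
set.seed(1)

# Discretize one attribute of one class into k interval bounds via k-means
kmeans_intervals <- function(values, k = 3) {
  cl <- kmeans(values, centers = k)$cluster
  t(sapply(split(values, cl), range))   # one row per cluster: lower and upper bound
}

# Example: intervals of Sepal.Length for the Setosa class
setosa <- subset(iris, Species == "setosa")
kmeans_intervals(setosa$Sepal.Length, k = 3)

# Apply to all four numeric attributes of the Setosa class
lapply(setosa[, 1:4], kmeans_intervals, k = 3)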

21.6 Case study I: hybrid DT and PROAFTN method utilization for soil classification from Landsat satellite images
21.6.1 Data description
To examine and assess the performance of the described technique against other DT classifiers, a
comparative and analytical study focuses on the well-known Landsat data set, with the aim of
producing a classification based on the obtained multispectral values. The data set used in this case
study is publicly available in the Machine Learning Repository of the University of California at Irvine
(archive.ics.uci.edu/ml/index.html). (1) The data are composed of the multispectral values of pixels in
3 × 3 neighborhoods of a satellite picture; (2) the classification is related to the center pixel in each
neighborhood; (3) the data set is composed of 6435 instances, each of which is described by 36
attributes (four spectral bands times nine pixels in the neighborhood) and sorted into six classes: cotton
crop, red soil, gray soil, soil with vegetation stubble, damp gray soil, and very damp gray soil; (4) all
attribute values are integers ranging from 0 to 255; and (5) the pixel class is marked as a number in the
sample database.

Table 21.4 Sample outputs of applying k-means (for k = 3) to Iris data.

Interval | Sepal length (j = 1) | Sepal width (j = 2) | Petal length (j = 3) | Petal width (j = 4) | Class C
I_{j1}^1 | [5.30, 5.80] | [3.70, 4.40] | [1.50, 1.50] | [0.10, 0.20] | Setosa
I_{j1}^2 | [4.90, 5.20] | [3.30, 3.60] | [1.00, 1.40] | [0.30, 0.30] | Setosa
I_{j1}^3 | [4.30, 4.80] | [2.30, 3.20] | [1.60, 1.90] | [0.40, 0.60] | Setosa
I_{j2}^1 | [6.30, 7.00] | [2.00, 2.50] | [3.00, 3.80] | [1.50, 1.80] | Versicolor
I_{j3}^3 | [7.10, 7.90] | [2.90, 3.10] | [5.40, 5.90] | [1.90, 2.10] | Virginica

21.6.2 Results
The proposed learning system was implemented in Java on a Linux platform. In addition, the
following procedures were carried out: (1) an algorithm was developed to examine the Landsat data;
(2) a comparative study was conducted with the C4.5 and ID3 algorithms as implemented in Weka
(https://www.cs.waikato.ac.nz/ml/weka/); and (3) default settings were used for stratified 10-fold
cross-validation.
To explain cross-validation further: while building a classification model, it is important to have
separate data for testing and training. The idea behind this data separation is to determine the
efficiency of the model. One major step in determining model efficiency is to calculate the
classification accuracy. The accuracy of the model depends on how well the real category or label
matches the predicted category. In cross-validation, the data are separated into groups. In each
iteration, while building the model, one group is chosen for testing and the remaining groups are used
for training. The model is built from the training data, and the testing set is then used to assess the
performance of the model. In 10-fold cross-validation, the data set is separated into 10 groups. In each
iteration, one group is picked for testing and the remaining nine groups are used for training. The
model iterates 10 times, and in each iteration the accuracy is calculated for the held-out group. Finally,
the average of all accuracy results is calculated.
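A minimal sketch of 10-fold cross-validation in R, shown on the Iris data with the ctree() classifier used earlier; the same procedure applies to the Landsat data set, and this simple version does not stratify the folds:

library(party)
set.seed(42)

k <- 10
folds <- sample(rep(1:k, length.out = nrow(iris)))   # assign each row to one of 10 folds

accuracies <- sapply(1:k, function(i) {
  train <- iris[folds != i, ]
  test  <- iris[folds == i, ]
  model <- ctree(Species ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width,
                 data = train)
  pred  <- predict(model, newdata = test)
  mean(pred == test$Species)      # accuracy on the held-out fold
})

mean(accuracies)                   # average accuracy over the 10 folds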
As shown in Table 21.5, the overall precision achieved by this strategy (the hybrid approach) is
88.29%. In comparison, the DTs C4.5 and ID3 resulted in lower classification precision of 85.72% and
82.00%, respectively. In terms of runtime, the hybrid approach was comparable to the DT classifiers
(faster than ID3 and slightly slower than C4.5), and overall precision increased using the hybrid model.
In addition, other performance measures (Table 21.6) include (1) accuracy, as a measure of the
percentage of properly categorized instances; (2) the number of features, to evaluate the data set
attributes; (3) the true positive rate; (4) the true negative rate; (5) the false positive rate; (6) the false
negative rate; (7) recall, as a measure of the proportion of related instances that are recovered;
(8) precision, which is the positive predictive value; (9) the FMeasure, which combines both precision
and recall into a single measure, hence capturing both properties; and (10) the area under the receiver
operating characteristic curve.
The performance of DT (C4.5), DT (ID3), and the hybrid model is summarized in
Tables 21.7–21.9, respectively.

Table 21.5 Examination of accuracy versus time complexity for different approaches.

Approach | Accuracy (%) | Time (seconds)
Hybrid approach (PROAFTN and decision tree) described in this case study | 88.29 | 10
ID3 | 82.00 | 11
C4.5 | 85.71 | 9

Table 21.6 Various parameters and desired values.

Parameter name | Preferred value | Description
1. Accuracy | High | Percentage of properly categorized instances
2. Number of features | Low | Set of attributes for the data set
3. True positive rate | High | Refers to the true positive rate
4. True negative rate | High | Refers to the true negative rate
5. False positive rate | Low | Refers to the false positive rate
6. False negative rate | Low | Refers to the false negative rate
7. Recall | High | The proportion of recovered related instances
8. Precision | High | Positive predictive value

Table 21.7 Decision tree (C4.5) performance.

Categories | Count | True positive rate | False positive rate | Precision | Recall | FMeasure | Area under the curve
1. Red_Soil | 1533 | 0.9520 | 0.0220 | 0.9410 | 0.9520 | 0.9470 | 0.9670
2. Cotton_Crop | 703 | 0.9460 | 0.0080 | 0.9470 | 0.9460 | 0.9470 | 0.9690
3. Grey_Soil | 1358 | 0.8840 | 0.0370 | 0.8840 | 0.8840 | 0.8840 | 0.9250
4. Damp_Grey_Soil | 626 | 0.5540 | 0.0540 | 0.5550 | 0.5540 | 0.5550 | 0.7520
5. Soil_with_Vegetation_Stubble | 707 | 0.7920 | 0.0220 | 0.8350 | 0.7920 | 0.8130 | 0.8820
6. Very_Damp_Grey_Soil | 1508 | 0.8510 | 0.0580 | 0.8400 | 0.8510 | 0.8450 | 0.9000

Table 21.8 Decision tree (ID3) performance.

Categories | Count | True positive rate | False positive rate | Precision | Recall | FMeasure | Area under the curve
1. Red_Soil | 1533 | 0.9130 | 0.0230 | 0.9390 | 0.9130 | 0.9260 | 0.9400
2. Cotton_Crop | 703 | 0.8040 | 0.0210 | 0.8500 | 0.8040 | 0.8260 | 0.8880
3. Grey_Soil | 1358 | 0.8480 | 0.0410 | 0.8710 | 0.8480 | 0.8590 | 0.9000
4. Damp_Grey_Soil | 626 | 0.5430 | 0.0680 | 0.5010 | 0.5430 | 0.5210 | 0.7440
5. Soil_with_Vegetation_Stubble | 707 | 0.7780 | 0.0460 | 0.7160 | 0.7780 | 0.7460 | 0.8730
6. Very_Damp_Grey_Soil | 1508 | 0.8430 | 0.0600 | 0.8410 | 0.8430 | 0.8420 | 0.8930

Table 21.9 Hybrid model performance.

Categories | Count | True positive rate | False positive rate | Precision | Recall | FMeasure | Area under the curve
1. Red_Soil | 1533 | 0.9620 | 0.0220 | 0.9420 | 0.9620 | 0.9520 | 0.9740
2. Cotton_Crop | 703 | 0.9530 | 0.0070 | 0.9520 | 0.9530 | 0.9520 | 0.9730
3. Grey_Soil | 1358 | 0.9210 | 0.0190 | 0.9380 | 0.9210 | 0.9290 | 0.9490
4. Damp_Grey_Soil | 626 | 0.6500 | 0.0420 | 0.6450 | 0.6500 | 0.6480 | 0.8050
5. Soil_with_Vegetation_Stubble | 707 | 0.8200 | 0.0200 | 0.8530 | 0.8200 | 0.8360 | 0.8980
6. Very_Damp_Grey_Soil | 1508 | 0.8620 | 0.0500 | 0.8550 | 0.8620 | 0.8580 | 0.9080

21.6.3 Summary
The main objective of this case study was to develop a new data classification algorithm based on
a fuzzy approach to Landsat satellite image processing. The developed method uses a hybrid approach
combining a DT and the MCDA classifier PROAFTN. DT classifiers such as ID3 and C4.5 were shown
to be effective in terms of accuracy and speed. However, in some cases, particularly when managing
numerous numeric attributes, DT models could not reach the required accuracy. To improve precision,
it is recommended that fuzzy PROAFTN be used to smooth the strict intervals generated by the DT.
Based on the preceding discussion, the hybrid method was demonstrated to be a superior
classification tool; however, the following further enhancements are recommended: (1) refine
attributes and choose the best collection of features from the data set, (2) use other common satellite
remote sensing images such as hyperspectral (airborne visible/infrared imaging spectrometer) imagery,
and (3) extend the related examination to incorporate various classification methods from the ML field
(e.g., k-means clustering).

21.7 Case study II: Java-based analytical method for mineral exploration at Flin Flon, Saskatchewan, Canada
21.7.1 Site description
As described by Al-Obeidat et al. (2016), volcanic-associated massive sulfide (VMS) deposits are a
major source of base metals for the worldwide economy. The Flin Flon Domain (Fig. 21.3) includes an
array of Paleoproterozoic volcano-plutonic assemblages originating from various tectonic settings,
tectonically juxtaposed during the progressive Trans-Hudson Orogeny (1.84–1.69 Ga). This
polydeformed rock formation is well endowed with VMS deposits (e.g., Flin Flon, 62.4 Mt, production
plus reserves) contained in predominantly juvenile (mantle-derived), primitive arc rocks, ordinarily
associated with discrete felsic rock suites. However, the Flin Flon Domain rocks are still only partly
exposed. The Precambrian belts are bounded at the southern edge by Phanerozoic carbonate and clastic
units of the Western Canada sedimentary basin, which limits the analysis of VMS mineralization. In
this analysis, the western portion of the Flin Flon belt was included.

FIGURE 21.3
Study area: the western end of the Flin Flon belt in Saskatchewan, Canada.
As described by Al-Obeidat et al. (2016), the study region is situated in Saskatchewan, Canada, at the
western end of the Flin Flon domain. Figs. 21.3 and 21.4 describe the Flin Flon research region from
two viewpoints: (1) Fig. 21.3 portrays the mapped areas of mineral deposits and occurrences in the
Flin Flon field; based on the distribution of minerals, it can be seen that the geographic coverage is
one-sided, with the vast majority of occurrences and deposits (including Flin Flon) located in the
northern part, where the Flin Flon metavolcanic belt is exposed; and (2) Fig. 21.4 presents the Shuttle
Radar Topography Mission (SRTM) digital elevation map, acquired on a near-global scale from 56°S
to 60°N, which provides the most accurate high-resolution digital topographic data for the Flin Flon
region. The interpretation of discontinuities in topographic data is a complicated topic, because there is
significant ambiguity as to the cause of such features.
In general, from the SRTM data collection covering the Flin Flon region (Fig. 21.4), Al-Obeidat
et al. (2016) reported that a trend of progressively depressed elevations can be seen in the southern
portion of the map. In contrast, the northern end has sudden shifts in topographic height, with strong
indications of tectonic lineaments in a much higher elevation zone. The square outline shown in
Fig. 21.4 indicates the study area chosen for lineament extraction.

FIGURE 21.4
Shuttle Radar Topography Mission digital elevation map on a near-global scale from 56°S to 60°N.

21.7.2 Java systematic feature extraction tool and its structure


Shaded relief strategies have long been used in cartography to create the impression of a
three-dimensional (3D) relief map. The most recent advance has been the development of digital image
processing methods to display digital elevation models (DEMs) as shaded relief pictures. Using DEMs,
shaded relief images and terrain derivatives (aspect, slope, and curvature calculations) have proven to
be largely effective for lineament and fault mapping. Along with these advancements in image
processing, the digital revolution in the field of parallel computation has greatly increased our ability to
handle data, changing the way researchers manage and process data to derive correlations and
large-scale semantics. This use of Java has connected disciplines and saved time previously spent
reprocessing, for example, SRTM data at the project level in a large number of case studies, such as
the one briefly outlined here.
Edge enhancement sharpens the image, whereby the graphical features of the picture may be
changed and enhanced (Masoud and Koike, 2011; Jakob, 2001) (Figs. 21.5 and 21.6). Drainages,
lineaments, and certain landforms, for example, are frequently characterized by sudden shifts in
radiometric response.
Edge enhancement is a valuable and productive strategy for amplifying such features,
facilitating their visual and automatic identification. The Java-based analytical method created in
this investigation automates this procedure by computing the DEM shaded-relief response from
various angles of illumination. This adds an extra level of processing that is especially
effective in mitigating the shadowing of smaller-scale lineaments that occurs when a single
direction and angle of illumination is used to assess edge contrast (Al-Obeidat et al., 2016).

FIGURE 21.5
Application of hill shading and feature detection: azimuth = 225, elevation = 25.

FIGURE 21.6
Application of hill shading and feature detection: elevation = 45, azimuth = 225.
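To make the hill-shading step concrete, the following minimal Java sketch (not the implementation used in the study; the class and method names are illustrative) computes a single shaded-relief image from a DEM stored as a two-dimensional elevation array, given a solar azimuth and elevation such as those in Figs. 21.5 and 21.6. Slope and aspect are derived with Horn's 3 x 3 gradient method, the convention adopted by most GIS hill-shading tools.

// Minimal hill-shading sketch (illustrative only, not the study's code).
// dem holds elevations in meters; cellSize is the ground resolution
// (roughly 90 m for SRTM); azimuthDeg and altitudeDeg are the solar
// azimuth and elevation being tested.
public final class HillShade {

    public static float[][] shade(float[][] dem, double cellSize,
                                  double azimuthDeg, double altitudeDeg) {
        int rows = dem.length, cols = dem[0].length;
        float[][] out = new float[rows][cols];

        double zenith = Math.toRadians(90.0 - altitudeDeg);
        // Convert compass azimuth (clockwise from north) to the math convention.
        double azimuth = Math.toRadians(360.0 - azimuthDeg + 90.0);

        for (int r = 1; r < rows - 1; r++) {
            for (int c = 1; c < cols - 1; c++) {
                // Horn's method: gradients from the 3 x 3 neighborhood.
                double dzdx = ((dem[r - 1][c + 1] + 2 * dem[r][c + 1] + dem[r + 1][c + 1])
                             - (dem[r - 1][c - 1] + 2 * dem[r][c - 1] + dem[r + 1][c - 1]))
                              / (8.0 * cellSize);
                double dzdy = ((dem[r + 1][c - 1] + 2 * dem[r + 1][c] + dem[r + 1][c + 1])
                             - (dem[r - 1][c - 1] + 2 * dem[r - 1][c] + dem[r - 1][c + 1]))
                              / (8.0 * cellSize);

                double slope = Math.atan(Math.hypot(dzdx, dzdy));
                double aspect = Math.atan2(dzdy, -dzdx);

                // Standard hill-shade brightness, clamped at zero for self-shadowed cells.
                double v = Math.cos(zenith) * Math.cos(slope)
                         + Math.sin(zenith) * Math.sin(slope) * Math.cos(azimuth - aspect);
                out[r][c] = (float) Math.max(0.0, 255.0 * v);
            }
        }
        return out;
    }
}

Running this routine repeatedly with different azimuth and elevation pairs yields the multidirectional set of shaded images on which the edge detection stage then operates.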

Table 21.10 Algorithm and data flow.


Algorithm 2

1 Procedure extract lineaments according to azimuth (z) and elevation (e) values
2   Read SRTM data (GeoTiff images)
3   For each elevation value e in {25, 45, ...} do
4     For each azimuth value z in {45, 90, 225, 340, ...} do
5       Apply hill-shading code based on e and z (i.e., elevation and azimuth)
6       Generate hill-shaded image g
7       Apply edge detection code on g to discover topographic details
8       Generate edge-detection image from g
9     End for
10  End for
11  Classify and analyze the generated images according to their similarity
12 End procedure

To elaborate further on the framework of the suggested analytical tool, a specific algorithm and
data flow are shown in Table 21.10 (Algorithm 2). As reported by Al-Obeidat et al. (2016), it requires
the construction of a prototype application that executes two main image processing algorithms: (1) a
multidirectional hill-shading algorithm to transform two-dimensional raster images into pseudo-3D
raster images, and (2) a Canny algorithm to perform multidirectional edge detection on a specified
SRTM data collection. In the subsequent stage, the research study aims to exploit cloud computing to
reach a higher degree of integration and representation of the processed data set across different
scales, with the goal of continuously evaluating the whole NASA-SRTM data collection.
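As a rough illustration of step (2), the sketch below flags candidate edge pixels in a shaded-relief image. Note that it is a simplified stand-in for the Canny detector named in the study (Sobel gradient magnitude with a fixed threshold, with no non-maximum suppression or hysteresis), and the class and parameter names are illustrative.

// Simplified edge detection on a shaded-relief image (stand-in for Canny):
// Sobel gradient magnitude followed by a fixed threshold.
public final class EdgeSketch {

    public static boolean[][] detectEdges(float[][] shaded, float threshold) {
        int rows = shaded.length, cols = shaded[0].length;
        boolean[][] edges = new boolean[rows][cols];

        for (int r = 1; r < rows - 1; r++) {
            for (int c = 1; c < cols - 1; c++) {
                // Horizontal and vertical Sobel responses.
                double gx = (shaded[r - 1][c + 1] + 2 * shaded[r][c + 1] + shaded[r + 1][c + 1])
                          - (shaded[r - 1][c - 1] + 2 * shaded[r][c - 1] + shaded[r + 1][c - 1]);
                double gy = (shaded[r + 1][c - 1] + 2 * shaded[r + 1][c] + shaded[r + 1][c + 1])
                          - (shaded[r - 1][c - 1] + 2 * shaded[r - 1][c] + shaded[r - 1][c + 1]);
                // A pixel is an edge candidate if its gradient magnitude exceeds the threshold.
                edges[r][c] = Math.hypot(gx, gy) > threshold;
            }
        }
        return edges;
    }
}

In the full pipeline, a complete Canny implementation (Gaussian smoothing, non-maximum suppression, and double thresholding) would replace this simple thresholding step, but the data flow is the same: each shaded image yields one binary edge map.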
In this preliminary analysis, Al-Obeidat et al. (2016) demonstrated that the Java method created a
few shaded relief images; these facilitated testing and exposed some of the pitfalls encountered while
attempting the automatic edge extraction methodology. Various values for solar elevation and solar
azimuth were tried by adjusting the angle estimates of the two parameters, to obtain probabilistic
estimates of whether a lineament is present at a given pixel position in the SRTM raster tiles. The goal
was to limit the influence of directional illumination and improve the efficiency of the instrument in
capturing the distribution of observed linear features. A portion of the pitfalls encountered revolves
primarily around the interpretation of the origin of these linear elements, which frequently mimic either
human influence on the land or the presence of water, especially at the latitude considered for this
investigation in Canada. Further iterations of the methods should incorporate other information sources
(e.g., geophysical data such as those from the Gravity Recovery and Climate Experiment satellite) to
expand our capacity to automatically separate the different natural processes that give rise to
topographic discontinuities.
In addition, the algorithms were coded using two significant open source libraries: ImageJ and
JGrass. The analytical tool first reads the original SRTM data in GeoTiff image form, enabling the user
to adjust parameters such as elevation and azimuth for the investigation field. The analytical methods
then produce shaded relief and a progression of GeoTiff images that can be displayed in image viewers
(for example, ArcGIS), after which the tool recognizes edges and significant regions of the shaded
images produced. A detailed description of the procedure is given in Table 21.10, Algorithm 2.
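A minimal driver tying the two stages together might look like the sketch below (again illustrative: it reuses the HillShade and EdgeSketch classes sketched earlier, and it assumes the SRTM GeoTiff has already been decoded into a float array, for example via ImageJ or JGrass, which is not shown).

// Sketch of the Algorithm 2 loop: multidirectional hill shading followed by
// edge detection over a grid of solar elevation and azimuth values.
import java.util.ArrayList;
import java.util.List;

public final class LineamentDriver {

    public static List<boolean[][]> run(float[][] dem, double cellSize) {
        double[] elevations = {25.0, 45.0};              // e values from Algorithm 2
        double[] azimuths = {45.0, 90.0, 225.0, 340.0};  // z values from Algorithm 2
        List<boolean[][]> edgeMaps = new ArrayList<>();

        for (double e : elevations) {
            for (double z : azimuths) {
                float[][] shaded = HillShade.shade(dem, cellSize, z, e);   // pseudo-3D raster
                boolean[][] edges = EdgeSketch.detectEdges(shaded, 200f);  // candidate lineaments
                edgeMaps.add(edges);
            }
        }
        // Step 11 of Algorithm 2: the edge maps would then be classified and
        // compared for similarity, e.g., by stacking them into a per-pixel
        // frequency map of detected linear features.
        return edgeMaps;
    }
}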
The last stage in the advancement of the Java analytical method involves planning a common
database for product distribution at different scales of examination. This represents potential future
research in the project; furthermore, we anticipate progress by moving this method onto the computational
framework provided by the Hadoop ecosystem, because this platform is fully open source and has a huge
community of collaborators and developers. This architecture is especially appealing because of the
distributed storage of image data in HBase and the performance capabilities of the Spark environment
(e.g., the MLlib project). Although this project is in its earliest stages, a major advance is that it permits
the use of ML algorithms and other statistical classification/segmentation algorithms. Customizing
and fitting some of these resources, or newly created geoscience resources, would represent a significant
advance in managing BD geoscientific issues such as those frequently encountered in remote sensing.
In view of the size of the study area (hundreds of square kilometers), in this initial case investigation a
progression of tests was carried out on a tile of 90-m pixel resolution SRTM data. These DEMs have
a resolution of 90 m at the equator and are available in 5 × 5-degree mosaic tiles for quick download
and use (Al-Obeidat et al., 2016). Understanding the degree of discontinuity in topographic data is
a complicated issue, because there is considerable uncertainty about the origin of such characteristics, as
seen in Fig. 21.4. Lineaments typically form in response to sudden morphological variation caused by
hydrogeological or geological processes. Given this complexity, probabilistic simulations must be used
to reduce the variance associated with this uncertainty. Figs. 21.5–21.8 show preliminary outputs using
various directions of solar elevation and solar azimuth.

21.7.3 Data analysis


The initial results suggest that the visibility of the observed lineaments differs according to the direction
of sun illumination. The assessment of Canny's success in mapping edges reveals that the
algorithm behaves differently as a result of topographic relief: topographically depressed zones
contain linear features that are not distinguished by the Canny detection algorithm. At the illuminated
northwest corner of the map there is remarkable success owing to increased elevation, which is
consistent with the geological units in this region, represented by the exposed Archean shield rocks
of the Flin Flon Domain. In particular, the feature extraction algorithm tends to identify geological
contacts in this specific region. These fundamental results are encouraging considering the value of
contacts between felsic and mafic intervals in this region.

21.8 Summary and concluding remarks


In this chapter, we presented two case studies to address the limitations of conventional
statistics in dealing with hyperspectral data from satellite and airborne images. The first case study
presented the development of an AI and data analytics algorithm capable of classifying hyperspectral
data to support remote sensing and GIS researchers in understanding and predicting changes in natural
earth processes. The classification algorithm is based on a fuzzy approach combining a DT classifier
and a fuzzy MCDA classifier.

FIGURE 21.7
Location of minerals.

The second case study presented the development of an AI tool that extracts features from
hyperspectral data by transforming a 2D satellite or airborne image into a pseudo-3D image to enhance
edge contrast and produce multidirectional sun-shaded images and their edges. Such 3D images are
useful in supporting the discovery of ground favorable for mineral mining, the extraction of important
minerals or other geological materials from the earth (normally from an ore body, lode, vein, seam, reef,
or placer deposit), and the overall improvement of mining operations.

FIGURE 21.8
Location of minerals using the method described in this chapter.

References
Al-Obeidat, F., Al-Taani, A.T., Belacel, N., Feltrin, L., Banerjee, N., 2015. A fuzzy decision tree for processing
satellite images and Landsat data. Procedia Comput. Sci. 52, 1192–1197.
Al-Obeidat, F., Belacel, N., 2011. Alternative approach for learning and improving the MCDA method
PROAFTN. Int. J. Intell. Syst. 26 (5), 444–463.
Al-Obeidat, F., Belacel, N., Carretero, J.A., Mahanti, P., 2010. Automatic parameter settings for the PROAFTN
classifier using hybrid particle swarm optimization. In: Farzindar, A., Keselj, V. (Eds.), AI 2010. LNCS
(LNAI), vol. 6085. Springer, Heidelberg, pp. 184–195. https://doi.org/10.1007/978-3-642-13059-5_19.
Al-Obeidat, F., Belacel, N., Mahanti, P., Carretero, J., et al., 2009. Discretization techniques and genetic algorithm
for learning the classification method PROAFTN. In: International Conference on Machine Learning and
Applications (ICMLA 2009). IEEE, pp. 685–688.
Al-Obeidat, F., Feltrin, L., Marir, F., 2016. Cloud-based lineament extraction of topographic lineaments from
NASA Shuttle Radar Topography Mission data. Procedia Comput. Sci. 83, 1250–1255.
Al-Obeidat, F., Spencer, B., Kafeza, E., 2018. The opinion management framework: identifying and addressing
customer concerns extracted from online product reviews. Electron. Commer. Res. Appl. 27, 52–64.
Belacel, N., Raval, H.B., Punnen, A.P., 2007. Learning multicriteria fuzzy classification method PROAFTN
from data, vol. 34, pp. 1885–1898.
Cai, Y., Guan, K., Peng, J., Wang, S., Seifert, C., Wardlow, B., Li, Z., 2018. A high-performance and in-season
classification system of field-level crop types using time-series Landsat data and a machine learning
approach. Remote Sens. Environ. 210, 35–47.
Charou, E., Stefouli, M., Dimitrakopoulos, D., Vasiliou, E., Mavrantza, O.D., 2010. Using remote sensing to
assess the impact of mining activities on land and water resources. Mine Water Environ. 29, 45–52.
El-Alfy, E.-S.M., Al-Obeidat, F.N., 2014. A multicriterion fuzzy classification method with greedy attribute
selection for anomaly-based intrusion detection. Procedia Comput. Sci. 34, 55–62.
EROS. https://www.usgs.gov/centers/eros.
Fayyad, U., Irani, K., 1993. Multi-interval discretization of continuous-valued attributes for classification
learning. In: XIII International Joint Conference on Artificial Intelligence (IJCAI93), pp. 1022–1029.
Jakob, 2001. The Shuttle Radar Topography Mission (SRTM): a breakthrough in remote sensing of topography.
Acta Astronaut. 48 (5–12), 559–565.
Karpatne, A., Ebert-Uphoff, I., Ravela, S., Babaie, H.A., Kumar, V., 2019. Machine learning for the geosciences:
challenges and opportunities. IEEE Trans. Knowl. Data Eng. 31 (8), 1544–1554.
Lary, D.J., 2010. Artificial intelligence in geoscience and remote sensing. In: Imperatore, P., Riccio, D. (Eds.),
Geoscience and Remote Sensing, New Achievements. IN-TECH, Vukovar, Croatia, pp. 1–24.
Lary, D.J., Alavi, A.H., Gandomi, A.H., Walker, A.L., 2016. Machine learning in geosciences and remote sensing.
Geosci. Front. 7, 3–10.
Lavreniuk, M.S., Skakun, S.V., Shelestov, A.J., Yalimov, B.Y., Yanchevskii, S.L., Yaschuk, D.J., Kosteckiy, A.I.,
2016. Large-scale classification of land cover using retrospective satellite data. Cybern. Syst. Anal. 52 (1),
127–138.
Lawrence, R., Bunn, A., Powell, S., Zambon, M., 2004. Classification of remotely sensed imagery using stochastic
gradient boosting as a refinement of classification tree analysis. Remote Sens. Environ. 90 (3), 331–336.
Masoud, A.A., Koike, K., 2011. Auto-detection and integration of tectonically significant lineaments from SRTM
DEM and remotely-sensed geophysical data. ISPRS J. Photogramm. Remote Sens. 66, 818–832.
Maxwell, A.E., Warner, T.A., Fang, F., 2018. Implementation of machine-learning classification in remote
sensing: an applied review. Int. J. Remote Sens. 39 (9), 2784–2817.
Moran, C.J., Bui, E.N., 2002. Spatial data mining for enhanced soil map modelling. Int. J. Geogr. Inf. Sci. 16 (6),
533–549.
Oh, H.J., Syifa, M., Lee, C.W., Lee, S., 2019. Land subsidence susceptibility mapping using Bayesian, functional,
and meta-ensemble machine learning models. Appl. Sci. 9 (6), 1248.
Peng, H., 2001. Land use/cover change detection and analysis with remote sensing in Xiamen City. Geography.
Quinlan, J.R., 1996. Improved use of continuous attributes in C4.5. J. Artif. Intell. Res. 4, 77–90.
Roy, B., 1996. Multicriteria Methodology for Decision Aiding. Kluwer Academic.
Saibi, H., Bersi, M., Mia, M.B., Saadi, N.M., Al Bloushi, K.M.S., Avakian, R.W., 2018. Applications of Remote
Sensing in Geoscience. Recent Advances and Applications in Remote Sensing. IntechOpen.
Sebastian, R., 1989. A Survey of TeX and Graphics. Technical Report CSTR 89-7. Department of Electronics and
Computer Science, University of Southampton, UK.
Shahin, M.A., 2015. A review of artificial intelligence applications in shallow foundations. Int. J. Geotech. Eng. 9
(1), 49–60.
Vasileiou, E., Stathopoulos, N., Stefouli, M., Charou, E., Perrakis, A., 2012. Evaluating the environmental impacts
after the closure of mining activities using remote sensing methods: the case study of Aliveri mine area. In:
Annual Conference, International Mine Water Association, pp. 755–763.
Witten, I.H., Frank, E., 2005. Data Mining: Practical Machine Learning Tools and Techniques. Morgan Kaufmann
Series in Data Management Systems.
