
DEFINITION OF GEOLOGICAL DOMAINS WITH AN ENSEMBLE

IMPLEMENTATION OF SUPPORT VECTOR CLASSIFICATION

By

Kasimcan Koruk

A thesis submitted to the Department of Mining Engineering

in conformity with the requirements for

the degree of Master of Applied Science

Queen’s University

Kingston, Ontario, Canada

(August, 2022)

Copyright © Kasimcan Koruk, 2022


To my wife and my son.

Abstract

Building a resource model can be a challenging process because of the complex nature of geology.

Employing limited information to build the resource model inevitably gives rise to uncertainties,

and therefore uncertainty is an important phenomenon that needs to be taken into account while

building the model. In fact, quantifying the uncertainties can be a significant advantage for the

evaluation of mineral resources. Uncertainties can be quantified by traditional approaches, such as

geostatistical simulation methods. While traditional approaches generally focus directly on

domains to build geological models, today huge geochemical datasets obtained during exploration

campaigns are seen as an opportunity to better evaluate the geological models. In this context,

machine learning algorithms stand as a strong alternative to traditional approaches (explicit or

implicit modeling, or geostatistical techniques), because they can easily incorporate large and

multivariate datasets to reach consistent and semi-autonomous results.

This study proposes an ensemble learning approach to define geological domains and their inherent

uncertainty. The study adopts an unsupervised binary clustering method to label the domains of a

multivariate geochemical dataset, and the information obtained by this unsupervised method is

used to inform a supervised learning algorithm, which is Support Vector Classification. The

supervised learning step is applied to assign domains to the locations of a gridded model. The unsupervised and supervised learning steps are repeated with subsets of geochemical variables and with subsets of training samples to realize an ensemble model. These models are combined to define a strong model, and the uncertainty is quantified from the ensemble. Finally, a hierarchical application is performed on the results to build multiple domains. We demonstrate the proposed workflow with an application to a real database from a porphyry copper deposit. The workflow was capable of building a robust model with high accuracy and well-defined, curved boundaries.

Acknowledgements

I wish to express my gratitude to my supervisor Dr. Julian Ortiz for his guidance and supportive attitude throughout the research and in daily life. The ideas provided by Dr. Ortiz shaped my study.

I am also grateful for the scholarship funded by the Ministry of National Education of the Republic of

Turkey. The financial support provided by the Ministry is priceless, and it gave me confidence and

assurance to complete the degree.

I would like to thank Sebastian Avalos for his friendship and support during my education in

Canada.

I would like to offer my sincere gratitude to my mother Meftun and my brother Nevzat Cagatay

since they always supported all of my decisions throughout my career. I am also thankful to my

son Timur because he joined our lives during this period, and he gave us days full of joy and

happiness. Lastly and most importantly, I feel eternal gratitude to my wife Gorkem. Every process

was easier thanks to her unconditional support and sacrifices throughout this education.

Table of Contents
Abstract ................................................................................................................................................... ii
Acknowledgements...................................................................................................................................... iv
List of Figures .............................................................................................................................................. vii
List of Tables ............................................................................................................................................... xii
List of Abbreviations ...................................................................................................................................xiv
Chapter 1 Introduction .............................................................................................................................. 1
1.1 Evaluation of Mineral Resources .................................................................................................. 1
1.2 Objective and Approach................................................................................................................ 2
Chapter 2 Literature Review ...................................................................................................................... 5
2.1 Geological Modeling for Resource Estimation.............................................................................. 5
2.1.1 Explicit and Implicit Modeling ............................................................................................... 5
2.1.2 Deterministic and Stochastic Modeling ................................................................................ 7
2.2 Overview of Machine Learning Algorithms ................................................................................ 13
2.2.1 Support Vector Machine and Support Vector Classifier ..................................................... 20
2.2.2 Ensemble Learning .............................................................................................................. 24
Chapter 3 Methodology ........................................................................................................................... 26
3.1 Unsupervised Classification for the Definition of Geological Domains ...................................... 26
3.1.1 Exploratory Data Analysis ................................................................................................... 27
3.1.2 Optimization........................................................................................................................ 29
3.1.3 Domain Assignment ............................................................................................................ 34
3.2 Ensemble Learning with Support Vector Classification .............................................................. 37
3.2.1 Application of Support Vector Classification ...................................................................... 37
3.2.2 Ensemble Learning with 3-Combination of Variables and Bagging .................................... 39
3.2.3 Hierarchical Application ...................................................................................................... 41
3.3 Comparison with BlockSIS ........................................................................................................... 41
Chapter 4 Case Study ............................................................................................................................... 44
4.1 First Level of Hierarchical Application......................................................................................... 44
4.1.1 Unsupervised Clustering ..................................................................................................... 45
4.1.2 Supervised Learning and Ensemble Learning ..................................................................... 55
4.1.3 Comparison with BlockSIS ................................................................................................... 62
4.2 Second Level of Hierarchical Application .................................................................................... 71
4.2.1 Unsupervised Clustering ..................................................................................................... 71

4.2.2 Supervised Learning and Ensemble Learning ..................................................................... 80
4.2.3 Comparison with BlockSIS ................................................................................................... 86
Chapter 5 Conclusions and Recommendations ....................................................................................... 96
References ................................................................................................................................................. 99
Appendices ............................................................................................................................................... 104

List of Figures

Figure 2.1 (a) An example of a failed explicit model. The model requires phantom points to build

the boundary represented by orange color. (b) An implicit model with successful folding structure

......................................................................................................................................................... 7

Figure 2.2 Deterministic model (left) and stochastic model with alternate models (right) ............ 8

Figure 2.3 Two of the positive definite functions to define variogram models: nugget effect model

and spherical model ...................................................................................................................... 11

Figure 2.4 General workflow of machine learning systems ......................................................... 13

Figure 2.5 Linear regression: an example of a simple regression model...................................... 17

Figure 2.6 Algorithm of neural networks ..................................................................................... 18

Figure 2.7 Bias-variance tradeoff ................................................................................................. 20

Figure 2.8 Hard margin linear SVC .............................................................................................. 21

Figure 2.9 Soft margin linear SVC; encircled samples are misclassified samples ....................... 22

Figure 2.10 A simple example of bootstrapping ........................................................................... 24

Figure 3.1 Schematic summary of unsupervised classification .................................................... 27

Figure 3.2 Log-normally distributed global histogram of a variable containing two distinct domains

....................................................................................................................................................... 29

Figure 3.3 A near-perfect fit of a combined distribution (red line) to the global data (blue dots) and

the optimum distributions of two domains forming the combined distribution (green and yellow

lines).............................................................................................................................................. 31

Figure 3.4 Topographical view of SE according to change in parameter µ of domain 1 and domain

2..................................................................................................................................................... 32

Figure 3.5 An example of quadratic form problem with two controlling parameters .................. 33

Figure 3.6 Result of domain assignment according to optimal distributions ............................... 35

Figure 3.7 Result of domain assignment in tri-variate space ........................................................ 36

Figure 3.8 (a) Accuracy assessment with recurring loops and (b) accuracy assessment within one

loop. Notice that red circles show when the algorithm reaches to minimum Euclidean distance.

While the last points in graphs have the best fit between combined weighted CDFs and global data,

results with best clustering are obtained earlier. ........................................................................... 36

Figure 3.9 Main steps of SVC and technique of training-test split............................................... 39

Figure 3.10 Workflow of the study ............................................................................................... 40

Figure 3.11 Hierarchical application of domaining ...................................................................... 41

Figure 3.12 Parameter file of BlockSIS ........................................................................................ 42

Figure 3.13 Effect of image cleaning on the results ..................................................................... 43

Figure 4.1 Two-level hierarchical domaining ............................................................................... 44

Figure 4.2 Histograms - natural logarithms of variables .............................................................. 47

Figure 4.3 Cumulative distributions - natural logarithms of variables ........................................ 48

Figure 4.4 Alteration data obtained from field logging in tri-variate space ................................. 49

Figure 4.5 Optimization result of first case: cumulative probability plots of variables with initial

parameters of normal distribution on the left, cumulative probability plots of variables with

optimum parameters of distributions on the right ......................................................................... 51

Figure 4.6 Improvement in domain assignment ........................................................................... 53

Figure 4.7 Result of domain assignment - distribution of samples in tri-variate space ................ 53

Figure 4.8 Domain assignment result of first case: probability plots comparing achieved and

optimal domaining ........................................................................................................................ 54

Figure 4.9 Result of domain assignment process in coordinate system ....................................... 55

Figure 4.10 Node model of the variables Mg/Al, Mn/K and Zn .................................................. 57

Figure 4.11 Final ensemble model ................................................................................................ 58

Figure 4.12 Vertical cross-sections taken along North axis of the SVC Ensemble model ........... 58

Figure 4.13 Boxplots showing the distribution of accuracies for each model according to subset of

variables. Notice that vertical axis of the graph is limited from bottom and top, 0.89 and 0.92

respectively. .................................................................................................................................. 59

Figure 4.14 Categorization of Ensemble learning according to optimum proportion of domains 60

Figure 4.15 Categorization of Ensemble model – Horizontal profiles ......................................... 61

Figure 4.16 Experimental variograms created along horizontal direction .................................... 63

Figure 4.17 Variogram models built along chosen directions ...................................................... 63

Figure 4.18 Parameter file of BlockSIS ........................................................................................ 64

Figure 4.19 Block models of BlockSIS with light image cleaning and heavy image cleaning .... 65

Figure 4.20 Effect of image cleaning on BlockSIS – horizontal profiles ..................................... 66

Figure 4.21 BlockSIS with heavy image cleaning - Vertical cross-sections taken along North axis

....................................................................................................................................................... 67

Figure 4.22 Categorization of BlockSIS model with heavy image cleaning ................................ 68

Figure 4.23 Categorization of BlockSIS – horizontal profiles ..................................................... 69

Figure 4.24 Comparison of two models with vertical cross-sections ........................................... 70

Figure 4.25 Variables chosen for sub-domains of domain 1 ........................................................ 72

Figure 4.26 Variables chosen for sub-domains of domain 2 ........................................................ 74

Figure 4.27 Optimization of domain 1 with variables Y, Mo and Re: cum. probability plots of

variables with initial parameters of normal dist. on the left, cum. probability plots of variables with

optimum parameters of distributions on the right. ........................................................................ 76

Figure 4.28 Optimization of domain 2 with variables Zn, In and Te: cum. probability plots of

variables with initial parameters of normal dist. on the left, cum. probability plots of variables with

optimum parameters of distributions on the right. ........................................................................ 77

Figure 4.29 Hierarchical domain assignment of domain 1: probability plots comparing achieved

and optimal domaining for the case with variables Y, Mo and Re ............................................... 78

Figure 4.30 Hierarchical domain assignment of domain 2: probability plots comparing achieved

and optimal domaining for the case with variables Zn, In and Te ................................................ 79

Figure 4.31 Methodology followed to realize hierarchical ensemble model ................................ 81

Figure 4.32 Boxplots showing balanced accuracies of SVM models ........................................... 82

Figure 4.33 Categorization of ensemble model ............................................................................ 83

Figure 4.34 Result of second-level hierarchical domain assignment ........................................... 84

Figure 4.35 Vertical cross-sections taken from final ensemble model ......................................... 85

Figure 4.36 Two hierarchical ensemble models along horizontal benches .................................. 86

Figure 4.37 Variogram models built along chosen directions for each sub-domain .................... 87

Figure 4.38 Cumulative probability of E-Type BlockSIS model. Imposing local proportion on

BlockSIS increases the deterministic zones (red dashed circles). Assuming deterministic zones are

correctly estimated: correctly (a) and wrongly (b) assigned samples when global proportion is

neglected and considered, respectively. ........................................................................................ 90

Figure 4.39 Three consecutive cross-sections of BlockSIS model taken along north .................. 92

Figure 4.40 Comparison of BlockSIS models along horizontal benches with two categories (left)

and with four categories (right)..................................................................................................... 93

Figure 4.41 Final ensemble model (left) and BlockSIS model (right) along horizontal benches 94

Figure 4.42 Vertical cross-sections of final ensemble model (top) and BlockSIS model (bottom)

....................................................................................................................................................... 95

List of Tables

Table 2.1 A confusion matrix ....................................................................................................... 19

Table 2.2 Types of kernel functions ............................................................................................. 23

Table 3.1 Optimization of thirteen parameters ............................................................................. 30

Table 4.1 Descriptive statistics of sample coordinates and variables. Elements are in ppm, and

ratios are adimensional. ................................................................................................................ 46

Table 4.2 Combinations of 3 out of 6 variables ............................................................................ 49

Table 4.3 Functions used to define the optimization problem ...................................................... 50

Table 4.4 Initial parameters of variables....................................................................................... 50

Table 4.5 Optimum parameters for each subset of variables ........................................................ 52

Table 4.6 Functions used to perform support vector classifier ..................................................... 55

Table 4.7 Scores of SVC with different combination of parameters according to 5-fold cross-

validation....................................................................................................................................... 56

Table 4.8 Mesh grid along which predictions made ..................................................................... 57

Table 4.9 Confusion matrix of categorized ensemble model ....................................................... 62

Table 4.10 Confusion matrices of all realizations with balanced accuracy .................................. 68

Table 4.11 Confusion matrix of categorized BlockSIS model ..................................................... 69

Table 4.12 Descriptive statistics of variables chosen for domain 1. Elements are in ppm, and ratio

is adimensional.............................................................................................................................. 72

Table 4.13 Descriptive statistics of variables chosen for domain 2. Elements are in ppm, and ratio

is adimensional.............................................................................................................................. 73

Table 4.14 Combinations of 3 out of 4 variables chosen for domain 1 and domain 2 ................. 74

Table 4.15 Initial parameters of variables chosen for domain 1 ................................................... 75

Table 4.16 Initial parameters of variables chosen for domain 2 ................................................... 75

Table 4.17 Optimum parameters for each subset of variables chosen for domain 1 .................... 75

Table 4.18 Optimum parameters for each subset of variables chosen for domain 2 .................... 75

Table 4.19 Confusion matrix of final ensemble model................................................................. 84

Table 4.20 Confusion matrix of BlockSIS considering deterministic zones. ............................... 91

Table 4.21 Confusion matrix of BlockSIS considering global proportions.................................. 91

List of Abbreviations

CDF Cumulative Distribution Function

CG Conjugate Gradient Method

EDA Exploratory Data Analysis

KNN K-Nearest Neighbor

MLA Machine Learning Algorithms

Newton-CG Newton Conjugate Gradient Method

PCA Principal Component Analysis

RBF Radial Basis Function

SE Squared Error

SIS Sequential Indicator Simulation

SVC Support Vector Classifier

SVM Support Vector Machines

t-SNE t-distributed Stochastic Neighbor Embedding

Chapter 1 Introduction

1.1 Evaluation of Mineral Resources

Building a resource model to determine the value of ore deposits plays a crucial role in the mine

development stage. Sparse observations, both on the surface and underground, are gathered in resource models by geologists to predict the geological structure of the ore deposits at target locations. Traditionally, geologists make use of 2D cross-sections drawn across the target space according to the observations made on drillhole samples to build a geological model, and the extent of the orebody must be interpreted in detail to make the model accurate. While this traditional approach, called explicit modeling, is time demanding, implicit models, now widely used thanks to computers, semi-automate the prediction process and make it much faster by using particular interpolation functions.

Whether the model is explicit or implicit, one must interpret the continuity between the drillhole samples, and this inevitably gives rise to uncertainties. In fact, the complex nature of geological structures, caused by discontinuities like faults or periods of non-deposition, makes models uncertain (Boisvert, 2010), and therefore any uncertainty must be taken into account to make a

resource model reliable. At this point, geostatistics is utilized both for the prediction of unsampled

locations and for assessment of their level of uncertainty. Geostatistics is the field of study which

employs statistical tools to estimate the spatial features and uncertainties of a geological model

(Isaaks and Srivastava, 1989; Rossi and Deutsch, 2016). To better define the uncertainties,

geostatistical simulations are performed to determine a distribution of possible values rather than

a single estimation (Boisvert, 2010). The simulated resource models allow geologists to determine confidence intervals for the resources. This is termed the probabilistic approach.

Traditionally, kriging estimation and simulation methods are considered robust geostatistical

methods to make estimation and to define uncertainties, respectively. While the traditional

geostatistical methods like kriging and simulation methods remain the industry standard, the

mining industry is open to new technologies and methods to effectively manage big exploration

datasets (Hodkiewicz, 2014). In this context, machine learning algorithms (MLA) have been

utilized in several geostatistical studies to support the traditional methods (Cevik and Ortiz,

2020a). In fact, MLA can save significant amounts of time and cost spent on labor-intensive work like geological logging and modeling (Faraj and Ortiz, 2021; Gwynn et al., 2013). Moreover, MLA offer flexible options thanks to the several algorithms available, which can be adapted to solve different geostatistical problems. The applications of MLA performed so far have shown notable success (Ortiz, 2019; Shen et al., 2018). Today, machine learning keeps growing in geostatistics, just as in other fields of scientific study. The development of new learning algorithms and

theories has driven this ongoing growth (Jordan and Mitchell, 2015).

Although MLA have proven successful, they also have some problems when it comes to the accuracy of the modeling. To increase the accuracy of MLA, ensemble approaches have been developed. Basically, an ensemble approach combines several models to increase the accuracy of the final model. While one model can be a weak learner with restricted information or a restricted approach, combining multiple weak models can yield a strong model. Ensemble methods become robust especially when the knowledge informing them is also robust.

1.2 Objective and Approach

The main goal of the study is to form a workflow to build a semi-autonomous geological model

utilizing multiple geochemical variables. The statement of the research is that an ensemble

approach based on Support Vector Classifier (SVC), used to determine the extent of geological

domains based on sets of geochemical variables, can provide a reliable model to obtain domain

volumes and to determine their uncertainty. One of the main features proposed by the research is

that the model is capable to handle the uncertainty, and it realizes multiple scenarios with different

possible extent of the domains. To do this, the workflow utilizes geochemical information and

multiple variables are combined to realize each scenario. The key point of the work is that the

model can be applied on any geological environment as long as the variables are chosen with

geological expertise. The proposed workflow can be applicable during advanced exploration as a

part of feasibility stage starting after initial exploration campaigns.

The workflow consists of two steps: a binary-unsupervised clustering approach originally

proposed by Faraj and Ortiz (2021) is combined with the second step, an ensemble learning method

formed by SVC. In the first step, the geochemical dataset is analyzed and the variables relevant to the objective of clustering are determined by an exploratory data analysis (EDA). After EDA, the unsupervised clustering approach is applied to the variables to split the dataset into two distinct geological domains. In the second step, SVC is applied to the defined domains to predict the unknown target locations. The key parameters controlling the SVC models are validated and determined by the k-fold cross-validation method. An individual SVC model is a weak classifier

alone. A stronger classifier is built by the combination of weak classifiers, repeating the process

over subsets of samples and subsets of variables. The uncertainty is also quantified under the strong

classifier by combining the weak classifiers. Finally, a hierarchical model having multiple domains

is built by repeating the process on each domain individually.

The details of the study are explained in the following chapters. A broad overview of the types of geological modeling is given in chapter 2. How a general machine learning approach should be applied, and some traditional geostatistical tools that serve as alternatives to MLA, are also discussed in chapter 2. The details of the methodology followed in the study are explained in chapter 3. A case study, in which a geochemical dataset from a porphyry copper deposit is employed to apply the proposed workflow, is investigated in chapter 4. For comparison, a model is built using

a geostatistical simulation technique, also documented in chapter 4. Finally, conclusions and

recommendations related to future studies are outlined in the last chapter.

Chapter 2 Literature Review

Earth sciences are always in search of advancements when it comes to geological modeling. It is

well-understood that traditional geological modeling approaches must be combined with data

analysis techniques which take advantage of the data to learn features, so that the features can be

imposed on the models. In this context, this section covers the current state of traditional geological modeling techniques, and then focuses on MLA to clarify what improvements can be made to them.

2.1 Geological Modeling for Resource Estimation

Geological modeling is an important aspect of the resource estimation process. While early

modeling attempts were based on mostly explicit and deterministic applications due to

computational restrictions, today implicit and stochastic approaches are easily applicable to geological models thanks to advancements in computer technology. This section elaborates on the aforementioned concepts, which are explicit, implicit, deterministic and

stochastic modeling.

2.1.1 Explicit and Implicit Modeling

The earliest efforts to understand geology date back more than 200 years. Cross-sections drawn

by William Smith in the early 1800s to conceive the underground have been the foundation of

geological mapping and modeling (MacCormack et al., 2015). Traditional geological models have

been built combining 2D cross-sections with linear interpolation. A borehole sample is represented

by a point and polylines are obtained by combining points to represent boundaries between

geological units, finally combining polylines with linear functions yields volumetric models

(Newell, 2018). This type of modeling is termed explicit in the literature. While explicit modeling

is highly capable to represent geology based on a conceptual assumption, any change on the model

is challenging because it is time demanding. The need for changes can arise either because new

hard data become available, or when the functions used are not capable to build logical layers. A

new conceptual model can be required if a new hard data is discordant with the current conceptual

model. On the other hand, phantom points, which are manually added points in the space, are

sometimes required to let the functions build logical layers due to the insufficient number of points

available. Limited number of points lead to subjectivity in the model and therefore simple

geological structures like folding and stratigraphic continuum can be further challenges for the

explicit modeling approach. The two challenges expressed above make explicit models hard to

update. Even if the time limitation is neglected, the facts that explicit models are built based on

subjective decisions and not accounting for uncertainty make explicit models inefficient.

These challenges forced researchers to look for faster and more flexible alternatives to explicit

modeling. Contrary to explicit models, implicit models are quick and capable of representing the thickness and continuity of rock stratigraphy (Figure 2.1) (Gautier, 2016). While explicit

modeling analyzes data on a section-by-section basis, implicit modeling uses input data

simultaneously to interpolate 3-D scalar fields (Newell, 2018). Advanced mathematical functions

used in implicit models define surfaces successfully. A general approach in an implicit model is

that first a scalar field is described by a signed distance function, and then the surface is

interpolated tracing the field values with the help of another function along tetrahedral mesh points

(Hillier et al., 2017; Guo et al., 2020). The most used interpolation functions are the radial basis

function and its derivatives. Today, implicit modeling has been adopted by commercial software

packages like Leapfrog and GoCad/SKUA, and has been endorsed by the mining industry.

Figure 2.1 (a) An example of a failed explicit model. The model requires phantom points to build
the boundary represented by orange color. (b) An implicit model with successful folding structure

2.1.2 Deterministic and Stochastic Modeling

Besides the concepts of implicit and explicit modeling, two additional concepts, which are crucial

for a well-developed understanding on modeling, are the deterministic and stochastic approaches.

By definition, a deterministic approach is an assumption made about an aspect of a model which

is fixed and imposed by the user. On the other hand, a probabilistic aspect of a model is one guided by the random (stochastic) outcome of a probabilistic approach (Bentley and Ringrose, 2021). Early applications of geostatistics focused on deterministic approaches and were therefore devoid of uncertainty assessment (Galli and Beucher, 1997). In

geostatistics, the terms probability and stochasticity are frequently used interchangeably. Today

the requirement of a probabilistic approach by means of random processes, also known as

stochastic approach, is an accepted fact in geological modeling (Deutsch and Journel, 1997). The

fact that there is only limited information available to represent a model makes randomness a

natural part of geological models. Stochastic models are generally obtained as a collection of all

possible scenarios, while deterministic models depict only one average scenario (Figure 2.2). A

model considering all possible scenarios provides flexibility at the stage of decision-making.

Although focusing on only one scenario can cause overfitting, models with very high variation can also become inefficient. An optimum approach would be to develop some

deterministic approaches about geological features like faults, which can be identified by

geophysical applications, and then build the stochastic model under deterministic constraints (Galli

and Beucher, 1997).

Figure 2.2 Deterministic model (left) and stochastic model with alternate models (right)

Stochastic models can be reviewed under two methods, which are pixel-oriented and object-

oriented methods. As the name implies, pixel-oriented models give outputs at nodes that discretize

the working space, utilizing two-point variogram statistics (although more advanced techniques

with multiple-point statistics have been proposed). On the other hand, object-oriented models aim to identify geological objects without fixing them in a strict Cartesian grid system

(Hassanpour and Deutsch, 2010).

Sequential indicator simulation (SIS) can be considered one of the most well-known traditional

methods of pixel-oriented stochastic models. It is important to review SIS to better understand

traditional pixel-oriented probabilistic models. Indicator variables are generally used to define

geological rock types or facies prior to definition of petrophysical properties of units (Deutsch,

2006). An indicator is a function describing 𝐾 categories with binary values, such that the value is 1 if the location belongs to the category and 0 otherwise. The indicator function is

$$I_k(u; s_k) = \begin{cases} 1, & \text{if } s_k \text{ prevails at } u \\ 0, & \text{otherwise} \end{cases} \qquad (2.1)$$

where 𝑠𝑘 is the true category at location 𝑢. Indeed, the expected value of each indicator can be viewed as the probability of the corresponding category, and its variance is given by

$$\mathrm{Var}\{I(u; k)\} = p_k (1 - p_k) \qquad (2.2)$$

where 𝑝𝑘 is the prior probability of category 𝑘. To realize SIS, the spatial continuity of the indicator

variables should be defined by the corresponding indicator variograms:

$$\gamma_I(h; s_k) = \frac{1}{2|N(h)|} \sum_{N(h)} \left[ I(u; s_k) - I(u+h; s_k) \right]^2 \qquad (2.3)$$

where ℎ is the distance between two points, 𝑁(ℎ) is the number of pairs considered in the variogram calculation, and 𝑠𝑘 is the 𝑘-th category. The traditional approach to discover the spatial continuity of an indicator variable is to calculate variograms experimentally along certain directions at some lags, so that an understanding of the spatial continuity of the indicator

variable can be developed. Different directions in space are inspected to check if there is an

anisotropy of the spatial continuity. Finally, variogram models are built along the directions which

represent the anisotropy best, using nested structures of functions, so that SIS can use this spatial

continuity to determine the probability distribution of the categories at each point in space based

on the available information. The variogram model combines a nugget effect, to consider the short-range randomness of variables, and n other valid variogram functions (Deutsch, 2003). The nugget

effect model is given by

$$\gamma(h) = \begin{cases} 0 & \text{if } h = 0 \\ C & \text{otherwise} \end{cases} \qquad (2.4)$$

where h is the distance and C is the sill contribution of the function to the variogram. Other than

the nugget effect, one of the most common functions to define variogram models is the spherical

model:

$$\gamma(h) = \begin{cases} C\left(\dfrac{3h}{2a} - \dfrac{h^3}{2a^3}\right) & \text{if } h \le a \\ C & \text{otherwise} \end{cases} \qquad (2.5)$$

where h is the distance, a is the range, and C is the sill contribution of the function to the variogram

model (Figure 2.3). There are also other functions to define variogram models with different

characteristics, such as the exponential, Gaussian, power and sine cardinal models, among others.

For simplicity, the number of nested structures should not be more than three. A nested structure

of a variogram model with one nugget effect model and two more models is given by

$$\gamma(h) = C_N \cdot \mathrm{Nug}(h) + C_1 \cdot \mathrm{Model}_{a_1}(h) + C_2 \cdot \mathrm{Model}_{a_2}(h) \qquad (2.6)$$

Figure 2.3 Two of the positive definite functions to define variogram models: nugget effect model
and spherical model
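To make the indicator formalism and the nested variogram model above concrete, the following short Python sketch (a minimal illustration, not code from the thesis) computes indicator values as in Eq. (2.1), checks the proportion and variance of Eq. (2.2), and evaluates a nugget-plus-spherical nested model in the form of Eqs. (2.4) to (2.6). The category list, ranges and sill contributions are hypothetical.

```python
import numpy as np

def indicator(categories, k):
    """Indicator transform (Eq. 2.1): 1 where the sample belongs to category k, else 0."""
    return (np.asarray(categories) == k).astype(int)

def spherical(h, a, c):
    """Spherical variogram structure (Eq. 2.5) with range a and sill contribution c."""
    h = np.asarray(h, dtype=float)
    return np.where(h <= a, c * (1.5 * h / a - 0.5 * (h / a) ** 3), c)

def nested_model(h, nugget, structures):
    """Nested variogram model (Eq. 2.6): nugget effect plus a sum of spherical structures."""
    h = np.asarray(h, dtype=float)
    gamma = np.where(h == 0.0, 0.0, nugget)          # nugget effect (Eq. 2.4)
    for a, c in structures:
        gamma = gamma + spherical(h, a, c)
    return gamma

# Hypothetical example: two categories logged along a drillhole
cats = [1, 1, 2, 2, 1, 2, 2, 2, 1, 1]
i1 = indicator(cats, k=1)
p1 = i1.mean()                                        # prior proportion of category 1
print("p1 =", p1, " variance =", p1 * (1 - p1))       # Eq. (2.2)

# Evaluate a nested model with a 0.1 nugget and two spherical structures
lags = np.linspace(0.0, 120.0, 7)
print(nested_model(lags, nugget=0.1, structures=[(40.0, 0.5), (100.0, 0.4)]))
```

In practice the experimental indicator variogram of Eq. (2.3) would first be computed from the sample coordinates along chosen directions, and a nested model of this form would then be fitted to it.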

Point-based evaluations are performed ‘sequentially’ using Monte Carlo simulation based on the

knowledge of the spatial distribution obtained using variograms and considering the closest points,

which are either available samples or previously simulated locations. Different orders of point evaluation and different drawings by Monte Carlo simulation over the conditional cumulative distributions provide different realizations, and this ensemble of realizations provides the stochastic model. Although SIS is a simple and robust algorithm, the realizations obtained by SIS may contain short-scale variations which make the modeled geological bodies unrealistic. Post-processing algorithms for image cleaning may be used to address the problem of short-scale variations (Deutsch, 2006). While SIS is normally considered a point-oriented stochastic simulation method, when image cleaning techniques are applied, this renders the models akin to an object-oriented

approach. BlockSIS offered by Deutsch (2006) is one of the programs which makes use of this

image cleaning technique, providing smoothed models that are comparable with interpreted

geological models obtained by explicit or implicit modeling. There are also other point-based

simulation techniques like truncated Gaussian simulation (TGS), truncated pluriGaussian

simulation (TPGS) and multiple point statistics (MPS), which are in use today. The name Gaussian

is given because raw data is transformed to Gaussian, and then a truncation rule is applied on the

continuous Gaussian data to transform simulated Gaussian values into discrete values according

to thresholds. TPGS is applied in the same way except multiple Gaussian fields are used to define

the domains. All the techniques aforementioned here are variogram-based techniques. On the other

hand, relatively newer techniques called multiple point statistics MPS search for patterns in the

data rather than two-point statistics of variography (Bentley and Ringrose, 2021).

Although pixel-oriented models are simpler and more popular, object-oriented models have also

been used in recent years. The main motivation of using object-oriented models is to determine

the volume and location of a well-understood geological object (Hassanpour and Deutsch, 2010).

While pixel-oriented models can encounter problems when it comes to highly curved boundaries,

object-oriented models can handle such boundaries. Typical examples of geological objects for

which object-oriented modeling is ideal are fluvial systems like paleo-channels, or submarine fan

systems. Traditional object-oriented approaches model the geological objects with simple

volumetric formulations and interpolation methods. Interpolation methods can be critical to model

objects with specific geometries. The process which object-oriented models employ is called

marked point process (Bentley and Ringrose, 2021; Holden et al., 1998). Points are randomly

picked, and an object is defined according to the imposed proportions and volumetric parameters

like width and orientation. The objects which do not satisfy the prior data constraints are rejected,

and this process is repeated until the model is completed. While traditional object-oriented models are simpler in terms of formulation, they frequently require user intervention or correction. These

models are hard to condition when many data are available.

2.2 Overview of Machine Learning Algorithms

Géron (2017) describes machine learning as the science of programming which lets computers

learn from data. Indeed, the word “learning” describes the inspirational part of the technique. In a

broad sense, raw data is processed to extract knowledge, and the knowledge is trained with

machine learning algorithms (MLA) after the preparation process. Model prediction and validation are performed to check whether the output is accurate. Finally, the output feeds back into the environment to enable possible improvements. This process is repeated until a satisfactory output is obtained

from the algorithm (Figure 2.4).

Figure 2.4 General workflow of machine learning systems

MLA can be classified under two main types: supervised learning, where learning is performed to make predictions according to a given set of information where inputs and outputs are known, and unsupervised learning, where learning is performed without prior assumptions about the outputs, to detect the potential clusters of input samples and identify patterns. Unsupervised learning seeks samples which may form possible patterns or groups given a set of input variables. The main aim of unsupervised learning algorithms is to gather descriptive information rather than to model labels, or categories. On the other hand, supervised learning algorithms try to fit a model to the input based on the imposed response given by the known

outputs. Both supervised and unsupervised learning methods are frequently utilized in resource

modeling individually for specific purposes and sometimes an unsupervised learning method is

used to inform the subsequent supervised learning methods (Cevik and Ortiz, 2020b).

Unsupervised learning algorithms can be classified under three sub-topics, which are clustering,

dimensionality reduction and association rule learning (Géron, 2017). Generally, clustering is the

main goal of unsupervised learning algorithms. k-Means clustering is one of the traditional

unsupervised clustering algorithms. Basically, the k-Means clustering method assigns samples randomly to k clusters, and then the algorithm tries to increase the accuracy of the clustering by changing the assigned labels of randomly picked samples. The accuracy of each cluster is assessed by the squared Euclidean distance (James et al., 2013):

$$W(C_k) = \frac{1}{|C_k|} \sum_{i=1}^{|C_k|} \sum_{j=1}^{|C_k|} (x_i - x_j)^2 \qquad (2.7)$$

where 𝑊(𝐶𝑘 ) is the measure of how samples differ from each other, |𝐶𝑘 | is the number of

observations in the cluster 𝐶𝑘 , and 𝑥𝑖 and 𝑥𝑗 are a pair of samples which belong to cluster 𝐶𝑘 . The equation is in fact the sum of squared Euclidean distances of pairs of samples picked from the same cluster, divided by the number of observations in cluster 𝐶𝑘 . As 𝑊(𝐶𝑘 ) decreases, the accuracy of 𝐶𝑘 increases. The main controlling parameter of the algorithm is the number of clusters into which samples are assigned. Overall, k-Means clustering turns into the minimization problem given below. As a result of Eq. (2.8), samples are assigned to clusters in a way that the average pairwise distance of samples is minimized within each respective cluster.

$$\underset{C_1,\ldots,C_K}{\text{minimize}} \left\{ \sum_{k=1}^{K} W(C_k) \right\} \qquad (2.8)$$
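The within-cluster measure of Eq. (2.7) and the objective of Eq. (2.8) can be illustrated with a few lines of Python. The sketch below, which assumes scikit-learn is available and uses a synthetic two-blob dataset rather than the thesis data, fits k-Means with k = 2 and evaluates the summed within-cluster measure.

```python
import numpy as np
from sklearn.cluster import KMeans

def within_cluster(X, labels, k):
    """W(C_k) from Eq. (2.7): average pairwise squared distance within cluster k."""
    Xk = X[labels == k]
    diff = Xk[:, None, :] - Xk[None, :, :]          # all pairwise differences
    return (diff ** 2).sum() / len(Xk)

rng = np.random.default_rng(0)
# Synthetic bivariate data with two blobs, standing in for two geochemical populations
X = np.vstack([rng.normal(0.0, 1.0, (50, 2)), rng.normal(4.0, 1.0, (50, 2))])

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
total = sum(within_cluster(X, km.labels_, k) for k in range(2))   # objective of Eq. (2.8)
print("cluster sizes:", np.bincount(km.labels_), " total W:", round(total, 2))
```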

Dimensionality reduction techniques are seen as necessary when the number of variables is too

large to develop a logical relationship between variables. Dimensionality reduction techniques aim

at reducing the number of variables to a reasonable amount with minimum loss of information

from the input data. To do this, dimensionality reduction techniques check for correlations and

variations between variables and record the information with maximum variation in the minimum

number of variables possible. Principal Component Analysis (PCA), introduced by Hotelling (1933), can be considered a traditional example of a dimensionality reduction technique. PCA makes use of

eigenvalues and eigenvectors, and it detects the major linear directions of variation in a dataset.

While PCA is a linear dimensionality reduction technique, relationships between variables may be non-linear as well as linear. As an alternative to PCA, t-distributed

Stochastic Neighbor Embedding (t-SNE) is another dimensionality reduction technique which

accounts for non-linear relationships between the original variables. t-SNE constructs probability

distributions for both high dimensional data and low dimensional embeddings (van der Maaten

and Hinton, 2008), and it tries to match the two probability distributions to better represent the

data in the lower dimension. Finally, association rule learning is used to detect how variables are

associated with each other, and final groups with related variables are built accordingly (Dobilas, 2021). Association rule learning does not use prior knowledge about the variables; instead, it tries to extract co-occurring patterns in a dataset (Agrawal et al., 1993). While association rule methods were originally developed to discover customers' shopping habits, they are also applied in other fields

(Kumbhare and Chobe, 2014).
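Returning to dimensionality reduction, the short sketch below projects a synthetic multivariate table onto its first two principal components with scikit-learn; the data and the choice of two components are illustrative assumptions only, and an analogous non-linear embedding could be obtained with sklearn.manifold.TSNE.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
# Synthetic stand-in for a geochemical table: 200 samples, 6 correlated variables
base = rng.normal(size=(200, 2))
X = np.hstack([base, base @ rng.normal(size=(2, 4)) + 0.1 * rng.normal(size=(200, 4))])

# Standardize, then keep the two principal components with the largest variance
Xs = StandardScaler().fit_transform(X)
pca = PCA(n_components=2).fit(Xs)
scores = pca.transform(Xs)

print("explained variance ratio:", np.round(pca.explained_variance_ratio_, 3))
print("scores shape:", scores.shape)   # (200, 2): the reduced representation
```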

Some of the most common supervised learning algorithms are k-Nearest Neighbors (KNN), linear

regression, logistic regression, Naïve Bayes, Support Vector Machines (SVM), decision trees,

random forests, and neural networks (Cevik and Ortiz, 2020b). Supervised learning algorithms can

be divided into two sub-topics according to the objective: classification and regression.

Classification algorithms train on several variables based on prior knowledge to make predictions

on discrete output variables, e.g., 0s for the false labels and 1s for the true labels. On the other

hand, regression algorithms try to fit a model on several variables and make predictions on

continuous values (Figure 2.5). Linear regression and logistic regression are good examples of regression models, and KNN, SVM, and decision trees can be applied both for regression and for classification tasks. The algorithm of SVM utilized for classification tasks is called Support

Vector Classifier (SVC). SVC will be explained in detail later. KNN, which is one of the most frequently used supervised learning algorithms, makes predictions on test samples to assign a class by calculating the distance to the training samples (Christopher, 2021). The basic algorithm of KNN selects the K samples from the training data closest to the test sample, and the most frequent class among them is assigned to the test sample. Decision trees build decision rules to predict the value of a target (Pedregosa et al., 2011). Decision trees are built from roots, where general rules and features are located, to leaves, where more detailed rules and features are located. Decision trees are generally preferable

since they are relatively easy to understand and to visualize. On the other hand, neural networks are perceived as harder to understand because of their relatively complex structure. Neural

networks employ hidden layers made of tuning parameters between input and output layers to

make predictions. The tuning parameters of hidden layers are fit to the model to improve the

predictions (Figure 2.6), and fitting is controlled by a loss function. Neural network algorithms are

employed in complex artificial intelligence tasks like image recognition and search engines

(Géron, 2017).
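As a small illustration of the KNN prediction rule described above, the sketch below classifies a test point from its K nearest training samples using scikit-learn. The toy training set and the choice of K = 3 are hypothetical and serve only to show the mechanics.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical training set: two variables, binary domain labels
X_train = np.array([[0.2, 0.1], [0.4, 0.3], [0.3, 0.2],
                    [2.1, 1.9], [2.4, 2.2], [1.9, 2.0]])
y_train = np.array([0, 0, 0, 1, 1, 1])

# K = 3: the class most frequent among the 3 closest training samples is assigned
knn = KNeighborsClassifier(n_neighbors=3).fit(X_train, y_train)

x_test = np.array([[0.5, 0.4]])
print("predicted class:", knn.predict(x_test)[0])
print("class probabilities:", knn.predict_proba(x_test)[0])
```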

Logistic regression, another popular algorithm, models the samples’ probability of belonging to a

category rather than a direct estimate (James et al., 2013). Logistic regression builds on the linear regression function and limits probabilities to the range between 0 and 1 using the logistic function (2.9).

Figure 2.5 Linear regression: an example of a simple regression model

$$p(X) = \frac{e^{\beta_0 + \beta_1 X}}{1 + e^{\beta_0 + \beta_1 X}} \qquad (2.9)$$

The function takes its final form when the exponential term is isolated and the logarithm of both sides is taken,

$$\log\left( \frac{p(X)}{1 - p(X)} \right) = \beta_0 + \beta_1 X \qquad (2.10)$$

and hence the log-odds of the probability behaves linearly in 𝑋. The coefficients

𝛽0 and 𝛽1 are estimated using approaches like maximum likelihood to fit the model. Once the

coefficients are estimated, probability prediction can be done like a linear regression problem.
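A brief sketch of Eqs. (2.9) and (2.10) is given below: it fits a one-variable logistic regression on synthetic data with scikit-learn and reproduces the predicted probabilities from the fitted coefficients. The data and parameter values are assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
# Synthetic one-variable example: class 1 becomes more likely as X grows
X = rng.uniform(0, 10, size=(200, 1))
p_true = 1.0 / (1.0 + np.exp(-(-5.0 + 1.0 * X[:, 0])))       # Eq. (2.9) with beta0=-5, beta1=1
y = rng.binomial(1, p_true)

model = LogisticRegression().fit(X, y)
b0, b1 = model.intercept_[0], model.coef_[0, 0]

x_new = np.array([[4.0], [6.0]])
p_manual = 1.0 / (1.0 + np.exp(-(b0 + b1 * x_new[:, 0])))     # Eq. (2.9) applied manually
print("fitted beta0, beta1:", round(b0, 2), round(b1, 2))
print("sklearn P(y=1):", np.round(model.predict_proba(x_new)[:, 1], 3))
print("manual  P(y=1):", np.round(p_manual, 3))
```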

Figure 2.6 Algorithm of neural networks

A model always requires a validation process to understand whether it represents the data well or

not. Therefore, assessing the accuracy of the model is crucial for the final decisions made on the

models. The accuracy of an unsupervised learning algorithm is assessed either qualitatively or with mathematical measures applied to the task, since there is no prior knowledge related to the data. Mathematical measures applied to unsupervised learning tasks generally assess how well the clustering is performed.

On the other hand, a direct quantitative assessment of the final results is possible for supervised learning tasks since there is prior knowledge related to the data. To assess the accuracy of supervised learning tasks, the data are split into training and validation sets prior to modeling. While the training data are used to train the model, the performance of the model is evaluated with the validation set of the data (James et al., 2013). If the supervised learning task is a classification task, a confusion matrix can be built which compares predicted results with the validation set (Table 2.1), so that the performance of the model can be observed for each label.

Table 2.1 A confusion matrix


                             Predicted Label
                         Negative            Positive
True Label   Negative    True-Negative       False-Positive
             Positive    False-Negative      True-Positive

While it is possible to obtain an overall accuracy from the confusion matrix, a balanced accuracy

can be calculated, which is the arithmetic mean of the true-positive and true-negative rates,

when the proportion between binary labels is imbalanced (Pedregosa et al., 2011). The equation

for balanced accuracy is

$$\text{Balanced accuracy} = \frac{1}{2}\left( \frac{TP}{TP + FN} + \frac{TN}{TN + FP} \right) \qquad (2.11)$$

where TP is true-positives, FN is false-negatives, TN is true-negatives, and FP is false-positives

obtained from the model prediction.
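The train/validation split, the confusion matrix of Table 2.1 and the balanced accuracy of Eq. (2.11) can be illustrated as follows; the classifier and the synthetic, imbalanced data are placeholders chosen only to keep the example self-contained.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import confusion_matrix, balanced_accuracy_score

rng = np.random.default_rng(3)
# Imbalanced synthetic binary problem (roughly 80% class 0, 20% class 1)
X = rng.normal(size=(500, 3))
y = (X[:, 0] + 0.5 * X[:, 1] > 1.0).astype(int)

# Split the data: train on one part, validate on the held-out part
X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.3, random_state=0)
y_hat = SVC(kernel="rbf").fit(X_tr, y_tr).predict(X_va)

tn, fp, fn, tp = confusion_matrix(y_va, y_hat).ravel()
manual = 0.5 * (tp / (tp + fn) + tn / (tn + fp))        # Eq. (2.11)
print("confusion matrix [tn fp fn tp]:", tn, fp, fn, tp)
print("balanced accuracy:", round(balanced_accuracy_score(y_va, y_hat), 3), round(manual, 3))
```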

Basically, all MLA models should have low bias, which means a small difference between the training data and the model, and low variance, meaning little change in the model when the training data set changes (James et al., 2013). However, it is a general characteristic of models that bias and variance are inversely related, and this is called the bias-variance tradeoff (Rocca, 2019). Because of this tradeoff, an optimum model complexity should be determined that balances bias and variance (Figure 2.7).

Figure 2.7 Bias-variance tradeoff

2.2.1 Support Vector Machine and Support Vector Classifier

SVM is a powerful method for machine learning because it can handle complex regression and

classification problems (Bishop, 2006) and it has been popular for many years (Géron, 2017). The

name of SVM comes from the subset of training samples, known as support vectors, utilized in the

decision function. Support vectors let the computer use only a small subset of the training samples, and thus the model becomes memory efficient (Pedregosa et al., 2011). In the case of a classification problem, support vectors widen the class margins to the maximum possible span. To do this, SVC

utilizes a complex mathematical algorithm including sparse kernel techniques.

The mathematics behind SVC can be challenging for a learner. The best approach to understand

the logic of SVC is to start from a simple binary-class and two-parameter case. For this simple

case, the SVC model takes the linear form

$$y(x_n) = w^T \Phi(x_n) + b \qquad (2.12)$$

where 𝑥𝑛 is the n-th training input, with n ranging from 1 to N, 𝑦(𝑥𝑛 ) is the target class for the input 𝑥𝑛 , Φ(𝑥𝑛 ) is a fixed transformation function of 𝑥, 𝑤 is a coefficient vector which maximizes the distance between the binary classes, and b is the parameter that controls the bias (Bishop, 2006).

The equation is quite similar to the logistic regression technique; however, the main difference between

logistic regression and SVC is that SVC maximizes the distance between classes with the help of

margins at each side of the class boundary (Figure 2.8). The samples located on these margins are

called support vectors.

Going back to the equation, SVC tries to find the best values for the unknowns, 𝑤 and 𝑏, to maximize the margins of the boundary. The optimum condition is reached when 1/‖𝑤‖ is maximized. If the classification can be performed without any misclassification, it is called hard margin classification, as in Figure 2.8. In its simplest manner, the

problem takes the form

$$t_n \left( w^T \Phi(x_n) + b \right) = 1 \qquad (2.13)$$

where 𝑡𝑛 is the target of the n-th sample, for n from 1 to N.

Figure 2.8 Hard margin linear SVC


However, in most cases hard margin classification is not possible. At this point, soft margin

classification comes as a solution. Soft margin classification lets the model misclassify some

samples to reach the optimum condition (Figure 2.9). To make misclassification possible, slack

variables, ξ𝑛 , are introduced into the equation (Bishop, 2006). ξ𝑛 = 0 when a data point lies on or beyond the correct margin, and ξ𝑛 = |𝑡𝑛 − 𝑦(𝑥𝑛 )| for the rest, meaning that samples inside the margin or on the wrong side of the boundary are penalized. After the introduction of slack variables, the classification

problem takes the form:

$$t_n \, y(x_n) \ge 1 - \xi_n \qquad (2.14)$$

Theoretically, minimizing the sum of the slack variables over the N samples, together with a regularization term on 𝑤, provides the best classification:

$$\min \; \sum_{n=1}^{N} \xi_n + \frac{1}{2}\|w\|^2 \qquad (2.15)$$

Eq. (2.15) is scaled by a parameter C which, with the condition of being greater than 0, regularizes the complexity of the model.

Figure 2.9 Soft margin linear SVC; encircled samples are misclassified samples
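To make Eqs. (2.14) and (2.15) concrete, the sketch below computes the slack variables and the (unscaled) objective for a hypothetical linear decision function on a handful of points; the weights, bias and targets are invented purely for illustration.

```python
import numpy as np

# Hypothetical 2-D linear classifier y(x) = w^T x + b and targets t in {-1, +1}
w = np.array([1.0, -0.5])
b = -0.2
X = np.array([[2.0, 0.5], [1.5, 2.0], [-1.0, 0.3], [0.1, 0.0], [-2.0, -1.0]])
t = np.array([+1, +1, -1, -1, -1])

y = X @ w + b                                  # decision values y(x_n), Eq. (2.12)
xi = np.maximum(0.0, 1.0 - t * y)              # slack variables from t_n y(x_n) >= 1 - xi_n (Eq. 2.14)
objective = xi.sum() + 0.5 * np.dot(w, w)      # unscaled objective of Eq. (2.15)

print("t*y :", np.round(t * y, 2))             # >= 1 means outside the margin, no penalty
print("xi  :", np.round(xi, 2))                # between 0 and 1: inside margin; > 1: misclassified
print("objective:", round(objective, 2))
```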

So far, the mathematical explanation of SVC has been kept as simple as possible, with a focus on linearly separable class problems. However, most real scenarios require non-linear solutions formulated as quadratic optimization problems. Detailed explanations of these quadratic problems can be found in Bishop (2006). For the sake of easier comprehension, the problem is kept simple here. However, it is worth mentioning kernel functions to better understand the non-linear solutions. The kernel functions are built from the transformation functions \Phi(x_n) introduced earlier. Technically, kernel functions work as if more features were added to the sample space to make the classes separable (Géron, 2017). The most popular kernel functions are tabulated in Table 2.2 below:

Table 2.2 Types of kernel functions


Kernel Type            Function
Linear Kernel          k(x, x') = x^T x'
Polynomial Kernel      k(x, x') = (x^T x' + r)^d
Radial Basis Kernel    k(x, x') = exp(-\gamma \|x - x'\|^2)
Sigmoid Kernel         k(x, x') = tanh(\gamma (x^T x') + r)

The linear, polynomial and radial basis kernels are derived from the same equation. When d equals 1, the polynomial kernel reduces to the linear kernel. Increasing d can be seen as increasing the dimension of the feature space used in the training process. As d approaches infinity, the polynomial kernel approaches the radial basis function; this approximation can be shown using a Taylor series expansion. Fundamentally, employing the radial basis function can yield performance similar to that of a very high degree polynomial kernel, but much faster. Therefore, the radial basis function (RBF) generally yields better results in the implementation of SVC.
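To make the discussion concrete, the short sketch below fits an RBF-kernel SVC with scikit-learn on a synthetic two-class dataset; the data and parameter values are illustrative only and are not those of the case study.

```python
# Minimal sketch: RBF-kernel SVC on synthetic 2-D data (illustrative values only)
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1.0, 0.5, (50, 2)),   # class 0 cloud
               rng.normal(+1.0, 0.5, (50, 2))])  # class 1 cloud
y = np.array([0] * 50 + [1] * 50)

clf = SVC(kernel="rbf", C=10, gamma=5)  # C regularizes complexity, gamma sets the RBF width
clf.fit(X, y)
print(clf.n_support_)                   # number of support vectors per class
```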

2.2.2 Ensemble Learning

Ensemble learning methods basically combine several models to build complex models with

optimum bias and variance. The models used to build ensemble model are called weak learners.

Typically, weak learners have high bias or high variance. Combining all weak learners reduces

bias and variance, hence the process yields better models. The new models are called strong

learners. The most well-known ensemble learning algorithms are bagging, boosting, and stacking.

Bagging stands for bootstrap aggregating, meaning that models trained on bootstrapped subsets of the data are averaged. Bootstrapping is the process of sampling the available data with replacement up to a determined amount (Figure 2.10). After each random draw, the sample is replaced back into the global data, so that it can be sampled more than once within the same subset. While each model built using a subset of the data is considered weak, typically with high variance, averaging all models can yield a strong model with relatively low variance (James et al., 2013). Bagging can be applied to optimize the parameters of a single MLA, and it is applicable to all MLAs.
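As an illustration of bagging, the sketch below bootstraps the training data, fits one model per bootstrap sample, and averages the predictions; the helper name and the choice of SVC as the weak learner are assumptions made for this example only.

```python
# Minimal bagging sketch: bootstrap (sampling with replacement) + averaging
import numpy as np
from sklearn.svm import SVC

def bagged_predict(X_train, y_train, X_new, n_learners=20, seed=0):
    rng = np.random.default_rng(seed)
    predictions = []
    for _ in range(n_learners):
        idx = rng.integers(0, len(X_train), size=len(X_train))    # with replacement
        weak = SVC(kernel="rbf").fit(X_train[idx], y_train[idx])  # one weak learner
        predictions.append(weak.predict(X_new))
    return np.mean(predictions, axis=0)   # averaged votes of the weak learners
```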

Figure 2.10 A simple example of bootstrapping

Boosting, which is another ensemble learning method, is also performed to optimize the

parameters of a single MLA. While boosting is performed using resampling with replacement like

bagging, the difference is that models are combined sequentially. After each resampling, the model

is trained according to a resampled subset of data and testing is performed according to the original

data. The next subset of data is more likely to be chosen from the poorly represented part of the

previous data, so that this time bias can be decreased. This process continues up to a determined

number of weak learners, and the final output model is obtained by combining all sequential

models. The weak learners with higher accuracy get a higher weight in the combination to increase

the accuracy of the ensemble model. This type of boosting is called AdaBoost, standing for adaptive

boosting. In contrast to bagging, boosting utilizes weak learners with high bias to build a strong

model with low bias. Although the process decreases the bias substantially, it should be noted that setting the number of weak learners too high may cause overfitting.
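For reference, scikit-learn provides an implementation of this adaptive scheme; the sketch below uses an illustrative synthetic dataset and the library's default decision-tree weak learners, and is not part of the thesis workflow.

```python
# Minimal AdaBoost sketch (illustrative data, default decision-tree weak learners)
from sklearn.ensemble import AdaBoostClassifier
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=200, n_features=10, random_state=0)
# 50 sequential weak learners; each one focuses on previously misclassified
# samples, and more accurate learners receive a larger weight in the combination
ada = AdaBoostClassifier(n_estimators=50, random_state=0).fit(X, y)
print(ada.score(X, y))
```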

Stacking can be considered more complex than bagging and boosting since it utilizes different MLAs to build a strong model. The data are split into two halves, and one half is used to train several MLAs. The outputs of these MLAs are combined with different techniques, such as stacked models, blending or super ensembles. The combined outputs are used as inputs to a main MLA, which is called the meta-model (Rocca, 2019). Here, the MLAs whose outputs are used to train this meta-model are considered weak learners. Finally, training the meta-model on the combined outputs yields the final strong learner.

Chapter 3 Methodology

3.1 Unsupervised Classification for the Definition of Geological Domains

The unsupervised classification method adopted for the first step is the workflow outlined in Faraj

and Ortiz (2021). The workflow leans on the idea that certain geochemical elements of samples

belonging to a distinct geological unit show certain distribution characteristics in nature. The log-

normal distribution is frequently observed in geochemical elements. Moreover, the log-normal distribution approximates a normal distribution as its shape parameter (the standard deviation of the log values) becomes small. Furthermore, the ratio of two log-normally distributed elements also shows a log-normal distribution.

Based on the idea expressed above, the distinct geological domains in a dataset can be identified if the parameters of the log-normally distributed populations are determined correctly. These parameters are the mean (µ) and standard deviation (σ) of each population, and their corresponding proportion in the global data (w). The weighted combination of the cumulative distributions of these domains yields the global distribution. Considering that clustering more than two domains can be computationally costly, this study adopted a binary clustering approach. The

method can be applied hierarchically to define multiple domains. The workflow is formed by three

main steps, which are:

1. Exploratory data analysis to determine the relevant geochemical variables,

2. Optimization of the parameters of the log-normal distributions of variables, and

3. Domain assignment according to the determined ideal distribution.

These steps are illustrated in Figure 3.1. The three steps are repeated using different sets of 3

variables of data to later utilize on the second step of the study.

Figure 3.1 Schematic summary of unsupervised classification

The word “domain” used in the thesis is a generalized term which refers to a volume of material

that shows consistent properties. It is a concept used to help model a phenomenon in space.

Domains are usually inferred from samples and variables available at sample locations. If the

variables used to define domains are elements of rock-forming minerals, then the domains likely

refer to lithology. Domains can be attributed to mineralization or alteration if the variables allow the modeler to distinguish between populations related to these attributes. However, the information used to define domains is not constrained to geochemical elements. For instance,

quantified geotechnical or structural properties of samples like texture, stability, resistance to

weathering or permeability can be employed to define geotechnical or structural domains.

3.1.1 Exploratory Data Analysis

The most relevant geochemical variables identifying the distinct domains are chosen in this step.

The main criterion for selecting the geochemical variables is whether they are discriminatory enough to reveal the distinct log-normally distributed domains. Geochemical

elements for domaining can be chosen visually with the help of plots like histograms (Figure 3.2).

On the other hand, specific domain knowledge (in this case, geological knowledge) plays a critical

role in the selection of variables, and the geological literature can significantly assist to determine

the geochemical variables that are relevant to discriminate among geological domains. Domain

knowledge can be based on alteration, rock type or mineralization. While some elements give

information about the rock type and geological environments, others can be useful to define

alteration and mineralization. In the case of porphyry copper deposits, the composition of major

elements like Zn, Pb, Al, Mg, and K can be attributed to the chlorite-sericite, quartz-sericite,

potassic and argillic alteration groups (Sillitoe, 2010; Faraj and Ortiz, 2021). The major alteration

groups can be roughly split into binary domains as chlorite-sericite & potassic and quartz-sericite

& argillic.

Although one variable can be enough to identify distinct domains theoretically, utilizing a higher

number of variables increases the accuracy of the domaining. Three geochemical variables are

utilized to identify domains in the study, so that variables having fuzzy boundaries between

domains can be combined with other variables to contribute to a better definition of these domains.

Once a set of variables is determined out of all variables, it is going to be employed in the following

steps.

Figure 3.2 Log-normally distributed global histogram of a variable containing two distinct
domains

3.1.2 Optimization

The optimization process is critical to have an accurate domaining. Optimization is performed

based on the assumption that the geochemical variables belonging to a geological domain are log-

normally distributed. It is important to notice that if a variable is log-normally distributed, the

variable’s natural logarithm shows a normal distribution with parameters µ and σ. Considering this

assumption, all the work is done using the variables' natural logarithms.

When binary classification is performed using three variables, thirteen parameters in total are needed to describe the distributions of the two domains. Since each variable within a domain requires the parameters µ and σ to define its log-normal distribution, three variables with two geological domains require twelve parameters in total to define the
distribution of the global data. Adding the proportion of the two domains in the global data makes

the number of parameters required to be optimized thirteen in total (Table 3.1).

Table 3.1 Optimization of thirteen parameters


Domain 1 Domain 2
Variable 1 µ11, σ11 µ12, σ12
Variable 2 µ21, σ21 µ22, σ22
Variable 3 µ31, σ31 µ32, σ32
Proportion w (1-w)

The objective of the optimization is to fit a combined weighted cumulative distribution function

(CDF) of the two domains to the global distribution of the data for each one of the variables. The

level of the fit between combined CDF and the global distribution is assessed by calculating the

squared error (SE). An exact fitting combined CDF to the global distribution means that the

weighted combination of the CDFs assigned to the two domains matches exactly the global CDF

(Figure 3.3), and in that case, the SE converges to zero (Figure 3.4). Therefore, a minimization

approach is adopted to solve the optimization problem. Optimum parameters for each of the

variables are defined by solving the minimization problem expressed below:

minimize \left\{ \sum_{v=1}^{V} \sum_{n=1}^{N} \left( Glob_v(z_n) - Comb_{v,k}(z_n) \right)^2 \right\}    (3.1)

where Glob_v(z_n) is the percentile rank of sample n of variable v in the global data, and Comb_{v,k}(z_n) is the combined CDF of sample n of variable v according to the proportions of the k domains:

Comb_{v,k}(z_n) = w_1 \cdot CDF_{v,1}(z_n) + (1 - w_1) \cdot CDF_{v,2}(z_n)    (3.2)

where k is 2 for the binary classification case, w_1 is the weight of domain 1 in the global data, and CDF_{v,k}(z_n) is the cumulative distribution of sample n of variable v in domain k.

In case of a poor fit between the combined weighted CDF of two domains and the global

distribution, it can be inferred that either the assumption made about the type of populations’

distribution is wrong, or the number of distinct populations is greater than two. If the poor fit is

caused by a wrong assumption about the distributions, a better fit could be obtained when different

types of distribution, such as normal or Poisson distributions, are assumed for the populations.
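A single-variable version of this optimization could be sketched as follows, assuming the variable is stored in a hypothetical NumPy array named values; the thesis applies the same idea jointly to three variables sharing the weight w.

```python
# Minimal sketch of Eq. (3.1)-(3.2) for one variable: fit a weighted mixture of
# two normal CDFs (in log space) to the empirical CDF, minimizing squared error.
import numpy as np
from scipy import stats, optimize

logs = np.log(values)                                  # 'values' is a hypothetical array
glob = np.array([stats.percentileofscore(logs, z) / 100.0 for z in logs])

def squared_error(params):
    mu1, s1, mu2, s2, w = params
    comb = w * stats.norm.cdf(logs, mu1, s1) + (1 - w) * stats.norm.cdf(logs, mu2, s2)
    return np.sum((glob - comb) ** 2)

# initial guesses read off the histogram (illustrative values)
res = optimize.minimize(squared_error, x0=[7.0, 1.0, 4.5, 1.0, 0.4], method="CG")
mu1, s1, mu2, s2, w = res.x
```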

Figure 3.3 A near-perfect fit of a combined distribution (red line) to the global data (blue dots)
and the optimum distributions of two domains forming the combined distribution (green and
yellow lines)

Figure 3.4 Topographical view of SE according to change in parameter µ of domain 1 and
domain 2.

There are different approaches to solve minimization problems. Some of the methods include Powell's method, the Conjugate Gradient method (CG), the Newton Conjugate Gradient method (Newton-CG), and BFGS, which stands for the Broyden, Fletcher, Goldfarb and Shanno method (Adams, 1996). Of the methods mentioned, Powell's method can be considered the simplest. Powell's method is a quadratically convergent iterative method, meaning that it is effective on quadratic problems (Figure 3.5). However, although most functions are approximately quadratic near their extrema, the parameters defining the problem should be well chosen to let the method converge to an extremum. Powell's method is considered greedy because finalizing the optimization can be challenging when the parameters and constraints are not well determined. As an alternative, CG is also a simple method for quadratic problems that searches for the solution with zero gradient (Shewchuk, 1994). The CG method makes use of eigenvalues and eigenvectors to find the extrema. BFGS and Newton-CG can be considered improved versions of the CG method. However, while CG does not require any extra information, BFGS and Newton-CG require the first-order partial derivatives of the problem, which form the Jacobian matrix, and the second-order partial derivatives, which form the Hessian matrix. Considering all of the above, the CG method is utilized in the minimization problem.

Figure 3.5 An example of quadratic form problem with two controlling parameters

3.1.3 Domain Assignment

Once the optimum parameters are determined, each sample must be assigned to one of the

domains. Initially, domains of samples are randomly assigned according to the optimum weight of

domains determined during the optimization stage, and hence the optimum weight is already

provided before starting the domain assignment step. This initial assignment creates two

distributions that do not match the parametrized distributions obtained in the previous step. The

final domain assignment is performed utilizing an algorithm which checks if swapping randomly

picked samples from each domain yields a lower relative error, meaning better fit for both domains

CDFs (Figure 3.6 & Figure 3.7). Notice that each domain swapping passes the value for all three

variables from one domain to another for each of the samples swapped. The equation of the relative

error is

\sum_{k=1}^{K} \sum_{v=1}^{V} \frac{ \left| f_{v,k}(Z_n) - f_{v,k}(z_n) \right| }{ f_{v,k}(Z_n) }    (3.3)

where 𝑓𝑣,𝑘 (𝑍𝑛 ) is the ideal CDF of sample n of variable v in domain k, and 𝑓𝑣,𝑘 (𝑧𝑛 ) is the current

rank of sample n of variable v in domain k. If the relative error decreases after the potential swap,

then the swapping of the randomly picked samples is realized by the algorithm. The loop continues

running until there is no change in domains up to a determined threshold value.

At the end of the loop, the improvement of the accuracy is assessed using the equation applied for

k-Means Clustering method:

\sum_{k=1}^{K} \frac{1}{|C_k|} \sum_{i=1}^{|C_k|} \sum_{j=1}^{|C_k|} (x_i - x_j)^2    (3.4)

where 𝐶𝑘 is the domain, |𝐶𝑘 | is the number of observations in domain 𝐶𝑘 , K is 2 for the binary

clustering, and 𝑥𝑖 and 𝑥𝑗 are the pair of samples in domain 𝐶𝑘 . After computing the accuracy, the

loop of swapping samples starts again. This recurring system of loops stops when there is no

change in accuracy of the domains, and the domain assignment with the best K-Means Clustering

score is recorded at the end of the algorithm (Figure 3.8-a). As an alternative to the method of

accuracy assessment expressed above, accuracy can be calculated after certain number of

swapping attempts within only one loop, which ends when there is no swap after a determined

threshold value (Figure 3.8-b). While the first method of accuracy assessment provides better

clustering, the latter performs the clustering faster with an insignificant loss of accuracy. It is

important to notice that the K-Means Clustering scorer minimizes the pair-wise distance between

samples belonging to a cluster, while maximizing the distance between clusters in trivariate space

(Figure 3.7). Because the objective of K-Means Clustering and of searching for the best fit between

the combined weighted CDFs and global data are different, the best fit does not necessarily give

the best clustering result (Figure 3.8). The final output data of the unsupervised clustering

algorithm is used to inform the next step, Ensemble learning with Support Vector Classification.
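A compact sketch of the score in Eq. (3.4) is given below, assuming X holds the (log-transformed) variables of the samples and labels holds the current domain of each sample; both names are hypothetical placeholders rather than the thesis code.

```python
# Minimal sketch of the K-means style score of Eq. (3.4): for each domain, the
# sum of pair-wise squared distances divided by the number of samples in it.
import numpy as np

def within_domain_score(X, labels):
    score = 0.0
    for k in np.unique(labels):
        Ck = X[labels == k]                       # samples currently in domain k
        diff = Ck[:, None, :] - Ck[None, :, :]    # all pair-wise differences
        score += np.sum(diff ** 2) / len(Ck)
    return score                                  # lower values indicate tighter clusters
```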

Figure 3.6 Result of domain assignment according to optimal distributions

Figure 3.7 Result of domain assignment in tri-variate space

Figure 3.8 (a) Accuracy assessment with recurring loops and (b) accuracy assessment within
one loop. Notice that red circles show when the algorithm reaches to minimum Euclidean
distance. While the last points in graphs have the best fit between combined weighted CDFs and
global data, results with best clustering are obtained earlier.
3.2 Ensemble Learning with Support Vector Classification

The second stage of the study focuses on the uncertainty modeling utilizing an ensemble approach.

The knowledge obtained after repeating stage one on different sets of variables is used to inform SVC. SVC is applied on subsets of samples and the resulting models are combined; this process is known as bootstrap aggregating, or bagging. The process is also repeated with the different sets of geochemical variables. All the

models obtained employing different subsets of samples and different subsets of variables are

considered weak learners, and weak learners are combined to build a strong learner. This is the

principle of ensemble learning. At this point, it is important to notice that the number of variables

in the geochemical dataset should be large so that subsets of variables can be chosen to apply

ensemble learning. For example, if six variables are found to be useful for the classification, and three variables are used for each learner, then there will be twenty combinations with which to build weak

learners to be used for the ensemble learning approach.

3.2.1 Application of Support Vector Classification

As stated previously, SVC is a supervised MLA used generally for binary classification problems.

A certain portion of the input data is used to train SVC, and the remaining part of the input is

reserved for validation of the model. Since the objective of the approach is domaining, for this

study, the coordinates of samples in 3-D space are utilized as input data, and the domains are

employed as output of the model (labels). The goal is to classify points in the 3D space to determine

the extent of each of these two domains. First, data are split into training and test datasets before

modeling. Twenty five percent of the data is left for testing and the rest is used for training.

Secondly, two important parameters of SVC, C and gamma, are tuned to optimum values. To tune

the parameters, the GridSearchCV module is utilized on the training dataset. GridSearchCV

(Pedregosa et al., 2011) is a module of scikit-learn used to perform cross-validated grid search.

Different combinations of parameters are assessed using the balanced-accuracy presented in Eq.

(2.11), under the application of 5-fold cross-validation.

During the operation of grid search, the training data is split into five equal subsets, and model

performance is assessed five times by leaving one subset of samples out for validation each time

and training is performed using the remaining four fifths of the data. The mean accuracy of the

five splits yields overall accuracy of the model. The parameters providing the highest accuracy are

chosen from the grid search. However, choosing extreme values for parameters may cause

overfitting of the model, therefore the user can make a subjective decision about the choice of

parameters.
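A minimal sketch of this tuning step is shown below; the array names coords and labels are hypothetical, while the parameter grid mirrors the values later reported in Table 4.7.

```python
# Minimal sketch of the 5-fold cross-validated grid search over C and gamma
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

# coords: sample coordinates, labels: domains from the unsupervised step (hypothetical)
X_train, X_test, y_train, y_test = train_test_split(coords, labels,
                                                    test_size=0.25, random_state=0)

param_grid = {"C": [1, 10, 50, 100, 1000],
              "gamma": ["scale", 0.1, 1, 2, 5, 10]}
search = GridSearchCV(SVC(kernel="rbf"), param_grid,
                      scoring="balanced_accuracy", cv=5)
search.fit(X_train, y_train)
print(search.best_params_, search.best_score_)
```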

After determining the optimum parameters, one more split is applied to the training dataset to perform ensemble learning. Two thirds of the training data are selected to perform the actual training, and the rest is left out for that SVC model. The ratio of training data to all data becomes 50% after the second split. The model is trained, and predictions are performed along the target locations. This process is repeated for each subset of samples (Figure 3.9). This is generally called bagging; however, the key difference between bagging and the technique utilized in this study is that sampling is performed without replacement. The reason is that a duplicated sample does not add any information to the training data; it only causes the model to be informed by fewer distinct samples. While employing fewer samples during training may increase the bias of each weak learner, combining the weak learners in the ensemble compensates for this.
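One weak learner of this scheme could be sketched as follows, where the function name and arguments are illustrative; repeating it with different seeds produces the subsets of samples used for the ensemble.

```python
# Minimal sketch of one weak learner: two thirds of the training data are drawn
# without replacement, an SVC is fitted, and the grid nodes are predicted.
import numpy as np
from sklearn.svm import SVC

def weak_learner_predict(X_train, y_train, X_grid, C, gamma, seed):
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X_train), size=int(2 * len(X_train) / 3), replace=False)
    model = SVC(kernel="rbf", C=C, gamma=gamma).fit(X_train[idx], y_train[idx])
    return model.predict(X_grid)     # predicted domain (1 or 2) at every grid node
```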

Figure 3.9 Main steps of SVC and technique of training-test split

3.2.2 Ensemble Learning with 3-Combination of Variables and Bagging

Once SVC is performed for all the subsets of samples and subsets of variables, the ensemble model

can be built combining all the models. The total number of models to combine is the 3-combination

of n variables times the number of resamplings performed during bagging. For example, the total number of models for the case of 6 variables and 20 resamplings of SVC with bagging is \binom{6}{3} \times 20 = \frac{6!}{3! \, 3!} \times 20 = 400. Considering the outputs of all models are the predictions along the target locations of 3-D space, combining all the models yields a probabilistic model. The accuracy of

models can be assessed using Eq. (2.11). To assess the performance of the ensemble model, values

at each location are categorized back to binary values and assigned to one of the two domains,

according to the global proportions of domains obtained during the first step of the workflow, and

compared with the domains assigned using K-means clustering algorithm from the drillhole dataset

reserved for validation. In this way, a confusion matrix can be built, and the accuracy can be

assessed again using Eq. (2.11). A schematic summary of the methodology is illustrated in Figure

3.10.
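The bookkeeping of the ensemble can be sketched as below, reusing the weak_learner_predict() helper sketched in the previous section; the training_data_for() helper, X_grid, C and gamma are hypothetical placeholders introduced only for this illustration.

```python
# Minimal sketch of assembling the ensemble: 20 variable subsets x 20 resamplings
import itertools
import math
import numpy as np

variables = ["Mn/K", "Mg/Al", "Zn", "Pb", "Y", "Ga"]
subsets = list(itertools.combinations(variables, 3))
assert len(subsets) == math.comb(6, 3)                     # 20 subsets of 3 variables

all_predictions = []
for subset in subsets:                                     # loop over variable subsets
    X_train, y_train = training_data_for(subset)           # hypothetical helper
    for seed in range(20):                                 # 20 resamplings (bagging)
        all_predictions.append(
            weak_learner_predict(X_train, y_train, X_grid, C, gamma, seed))

ensemble = np.mean(all_predictions, axis=0)                # 400 weak learners averaged
```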

Figure 3.10 Workflow of the study

3.2.3 Hierarchical Application

The workflow defined above can be repeated hierarchically to define more than two domains.

Once an ensemble model is built with two domains, the application can be repeated for each

domain, so that subdomains of each domain can be built (Figure 3.11). The final model is built by

constraining the subdomains to the preceding domains.

Figure 3.11 Hierarchical application of domaining

3.3 Comparison with BlockSIS

To compare the result of the ensemble model, a sequential indicator simulation (SIS) model is

built. BlockSIS is the implementation used to create the SIS simulations. The model uses soft

secondary data, which constrain the traditional SIS to impose local proportions (Deutsch, 2006),

and image cleaning provides further smoothing of the results. BlockSIS is a program developed

within the GSLIB environment (Figure 3.12), which means it has a programming structure similar to that of the conventional algorithm in GSLIB, named SISIM. There are different options in BlockSIS

to use the secondary data. Simple kriging with locally varying mean probabilities (Figure 3.12,

Line1 - option 2) is chosen to obtain models that reproduce the local trends in the categories.

Figure 3.12 Parameter file of BlockSIS

BlockSIS is performed on the same target locations used for Ensemble learning, to make

comparison easy. To apply BlockSIS, a model of the local proportions, p(u), is generated using

the nearest neighbor method. First, the closest N composite drillhole samples within a certain

distance are recorded for each target location where BlockSIS is performed, and then p(u) is

calculated for each location with an estimation approach like inverse-distance weighting or simply the average of the nearest neighbors. The global proportions are assigned to the target locations that have no neighboring drillhole samples within a certain distance. Once the local proportions are obtained for one domain, the other domain is obtained by subtracting p(u) from 1, i.e., 1 - p(u), since the case is binary. Besides the local proportions, the rest of the requirements to run BlockSIS are the same as for SIS.

Variogram models are built to represent the spatial continuity of the model, and global proportion

of domains corresponding to their prior probabilities are defined. The number of realizations and

some other details related to kriging should also be determined. When the application is completed,

image cleaning is applied on the data according to the chosen level of intensity (Figure 3.13).
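A sketch of the local-proportion model p(u) described above is given below; the array names are hypothetical, and the default values (8 neighbors, 50 m search distance, global proportion of 0.41) correspond to those used later in the case study.

```python
# Minimal sketch of p(u): average the domain indicator of the closest composites,
# falling back to the global proportion where no composite lies within range.
import numpy as np
from scipy.spatial import cKDTree

def local_proportions(sample_xyz, sample_ind, grid_xyz,
                      n_neighbors=8, max_dist=50.0, global_p=0.41):
    # sample_ind: 1 if the composite belongs to domain 1, 0 otherwise
    tree = cKDTree(sample_xyz)
    dist, idx = tree.query(grid_xyz, k=n_neighbors, distance_upper_bound=max_dist)
    p = np.full(len(grid_xyz), global_p)              # default: global proportion
    for i in range(len(grid_xyz)):
        close = dist[i] <= max_dist                   # neighbors actually within range
        if close.any():
            p[i] = sample_ind[idx[i][close]].mean()   # simple average of neighbors
    return p                                          # p(u) for domain 1; 1 - p(u) for domain 2
```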

Figure 3.13 Effect of image cleaning on the results

For the accuracy assessment of BlockSIS, the same process applied for the Ensemble model is

followed. This time, the accuracy assessment of BlockSIS is performed for each realization and

for the E-type. First, the E-type model is categorized back to binary values according to the

optimum proportions of the domains, and then, the closest domains are assigned using the nearest

neighbor algorithm from the drillhole dataset reserved for validation. In this way, a confusion

matrix can be built, and the accuracy can be assessed again using Eq. (2.11).

Chapter 4 Case Study

A geochemical dataset is employed to test the implementation. The geochemical data is obtained

from a drillhole campaign of a Porphyry Copper deposit. Implementation is applied on the data

with a two-level hierarchical approach (Figure 4.1). The case study is explained under two sub-

sections for each level of the hierarchical application, and each application includes three sections,

which explain (1) unsupervised clustering, (2) supervised learning and Ensemble implementation

and (3) validation of the work, respectively. All the work is performed using the Python programming language.

Figure 4.1 Two-level hierarchical domaining

4.1 First Level of Hierarchical Application

The first step of the application divides the data into two distinct domains as domain 1 and domain

2. The following sub-sections explain how domaining is obtained step by step starting from

unsupervised clustering, which includes exploratory data analysis, optimization of the distribution

of selected variables and domain assignment of samples. This is followed by ensemble modeling

according to the obtained domaining, and finally SIS modeling using the domaining information

obtained from unsupervised clustering.

4.1.1 Unsupervised Clustering

Unsupervised clustering is explained under three main steps: data preparation and exploratory data

analysis, optimization of the parameters defining the distribution of domains, and domain

assignment according to the optimized parameters of log-normally distributed data.

4.1.1.1 Data Preparation and Exploratory Data Analysis

The data contain 1138 drillholes, and each drillhole consists of 15-meter-long composites.

Considering the composites are samples, the data have 21641 samples. The data contain 49

chemical elements, which are Au, Ag, Mo, Re, Cd, Pb, Zn, Te, Bi, Sb, Hg, Co, Ni, Se, Al, As, B,

Ba, Be, Ca, Ce, Cr, Cs, Cu, Ga, Ge, Hf, In, K, La, Li, Mg, Mn, Na, Nb, P, Rb, S, Sc, Sn, Sr, Ta,

Th, Ti, Tl, U, V, W, Y and the other available information are coordinates, lithology, alteration,

and mineralization.

Samples with unidentified alteration are removed from the data. Data are further constrained to a

certain area where the number of samples is enough to represent the space. After this, the number

of samples decreased to 17074. The middle coordinates of each composite are used for their spatial

location (Table 4.1). Domain knowledge obtained in the first step of the workflow is expected to

represent alteration minerals of Copper Porphyry Deposit. The relevant variables representing

alteration are determined with EDA and literature (Sillitoe, 2010). Out of 49 geochemical

elements, Mg, Al, Zn, Pb, Y and Ga showed clear binary clustering. Furthermore, ratios of

elements are used to highlight some specific geological features. The final variables to be used in

the unsupervised clustering are Mn/K, Mg/Al, Zn, Pb, Y, Ga (Table 4.1).

Table 4.1 Descriptive statistics of sample coordinates and variables. Elements are in ppm, and
ratios are adimensional.
Count Mean St. Dev. Min 25% 50% 75% Max
x 17074 16711.02 601.49 14971.13 16227.29 16781.55 17163.53 18050.26
y 17074 106896.80 357.86 106000.01 106601.30 106916.51 107199.63 107500.00
z 17074 2843.23 108.77 2640.03 2758.09 2839.45 2927.61 3138.85
Mn/K 17074 923.10 1616.95 9.83 96.00 189.16 1094.44 18642.86
Mg/Al 17074 0.25 0.27 0.01 0.05 0.08 0.48 3.39
Zn 17074 398.79 742.73 2.00 19.00 73.00 486.92 10000.00
Pb 17074 60.97 188.66 0.20 3.83 16.00 54.93 10000.00
Y 17074 4.18 5.53 0.05 0.49 0.92 7.47 52.20
Ga 17074 2.38 2.01 0.19 0.95 1.47 3.30 12.81

While the objective of the implementation is to define domaining using only geochemical data,

alteration is turned into a binary variable to see how field-logging is distributed in relation to sets

of three of the selected variables. For this purpose, chloritic-sericitic and potassic alteration are

combined under domain 1, and argillic alteration and quartz-sericitic alteration are combined in

domain 2, as suggested in Faraj and Ortiz (2021). The results obtained by the first level of the hierarchical application are expected to be similar to the binary domaining obtained by field logging.

It is important to notice that the natural logarithm of log-normally distributed data follows a normal

distribution. For practical purposes, natural logarithms of the variables are preferred when

displaying histograms and probability plots (Figure 4.2 & Figure 4.3). Notice that Figure 4.2 and

Figure 4.3 show that two distinct clusters of samples are clearly observable, which is reflected by

the bimodal distributions. Distribution of alteration along natural logarithms of Mn/K, Mg/Al and

Zn is shown in Figure 4.4. Figure 4.4 shows that clustering between two domains is clearly visible,

although the boundary between two clusters is not clear likely because of man-made errors. In the

following sections, optimization and domain assignment are performed for each 3-combination of

6 variables (Table 4.2).

Figure 4.2 Histograms - natural logarithms of variables

Figure 4.3 Cumulative distributions - natural logarithms of variables

Figure 4.4 Alteration data obtained from field logging in tri-variate space

Table 4.2 Combinations of 3 out of 6 variables

Variables Variables
1 2 3 1 2 3
1 ln(Mn/K) ln(Mg/Al) ln(Zn) 11 ln(Mg/Al) ln(Zn) ln(Pb)
2 ln(Mn/K) ln(Mg/Al) ln(Pb) 12 ln(Mg/Al) ln(Zn) ln(Y)
3 ln(Mn/K) ln(Mg/Al) ln(Y) 13 ln(Mg/Al) ln(Zn) ln(Ga)
4 ln(Mn/K) ln(Mg/Al) ln(Ga) 14 ln(Mg/Al) ln(Pb) ln(Y)
5 ln(Mn/K) ln(Zn) ln(Pb) 15 ln(Mg/Al) ln(Pb) ln(Ga)
6 ln(Mn/K) ln(Zn) ln(Y) 16 ln(Mg/Al) ln(Y) ln(Ga)
7 ln(Mn/K) ln(Zn) ln(Ga) 17 ln(Zn) ln(Pb) ln(Y)
8 ln(Mn/K) ln(Pb) ln(Y) 18 ln(Zn) ln(Pb) ln(Ga)
9 ln(Mn/K) ln(Pb) ln(Ga) 19 ln(Zn) ln(Y) ln(Ga)
10 ln(Mn/K) ln(Y) ln(Ga) 20 ln(Pb) ln(Y) ln(Ga)

4.1.1.2 Optimization of Parameters Defining the Distribution of Variables

As explained before, an optimization process for the parameters, which define the distribution of

variables, is required to have an accurate domaining. The library NumPy is utilized to manage the

data and functions of SciPy are used to define the minimization problem, normal distribution, and

global distribution (Table 4.3). The minimization problem is defined by Eq. (3.1), which is

expressed in the methodology section.

Table 4.3 Functions used to define the optimization problem

Library Package Function


NumPy - -
SciPy stats norm
SciPy stats percentileofscore
SciPy optimize minimize

To minimize the approximation errors, initial parameters are approximately determined by visually

inspecting the distribution of global data in histograms (Figure 4.2). Initial parameters assigned to

each variable are tabulated in Table 4.4. Because domain 2 has a slightly larger population size, the initial parameter w is assigned as 0.4. Then, optimizations are performed for the twenty subsets of variables tabulated in Table 4.2. The optimization of each case lasted approximately 5 hours.

Optimization results of the first case can be seen in Figure 4.5. Optimized parameters of each case

are tabulated in Table 4.5. All the optimum parameters are utilized in the following step of domain

assignment.

Table 4.4 Initial parameters of variables


Mn/K Mg/Al Zn Pb Y Ga
μ σ μ σ μ σ μ σ μ σ μ σ w
Domain1 8 1.5 -0.5 0.5 6.5 1 4 1 2.4 0.75 1.5 0.5 0.4
Domain2 4 1.5 -3 0.5 3 1 2 1 -0.5 0.75 0.1 0.5 0.6

Figure 4.5 Optimization result of first case: cumulative probability plots of variables with initial
parameters of normal distribution on the left, cumulative probability plots of variables with
optimum parameters of distributions on the right

Table 4.5 Optimum parameters for each subset of variables
Three - Combination of Variable 1 Variable 2 Variable 3
Variables Domain1 Domain2 Domain1 Domain2 Domain1 Domain2
# Variable1 Variable2 Variable3 μ σ μ σ μ σ μ σ μ σ μ σ w
1 ln(Mn/K) ln(Mg/Al) ln(Zn) 7.39 0.85 4.79 0.70 -0.57 0.37 -2.80 0.45 6.53 0.83 3.43 1.14 0.40
2 ln(Mn/K) ln(Mg/Al) ln(Pb) 7.42 0.83 4.81 0.71 -0.56 0.35 -2.79 0.46 4.21 1.00 1.77 1.36 0.39
3 ln(Mn/K) ln(Mg/Al) ln(Y) 7.35 0.89 4.77 0.69 -0.59 0.39 -2.82 0.43 2.11 0.55 -0.52 0.63 0.41
4 ln(Mn/K) ln(Mg/Al) ln(Ga) 7.43 0.82 4.81 0.72 -0.55 0.35 -2.79 0.46 1.36 0.50 0.07 0.43 0.39
5 ln(Mn/K) ln(Zn) ln(Pb) 7.23 1.01 4.72 0.65 6.40 0.92 3.29 1.02 4.11 1.04 1.61 1.27 0.45
6 ln(Mn/K) ln(Zn) ln(Y) 7.22 1.02 4.72 0.65 6.39 0.93 3.28 1.01 2.02 0.65 -0.59 0.56 0.45
7 ln(Mn/K) ln(Zn) ln(Ga) 7.32 0.93 4.76 0.67 6.47 0.87 3.36 1.08 1.30 0.54 0.05 0.42 0.43
8 ln(Mn/K) ln(Pb) ln(Y) 7.26 0.99 4.73 0.66 4.12 1.03 1.63 1.28 2.04 0.63 -0.57 0.58 0.44
9 ln(Mn/K) ln(Pb) ln(Ga) 7.46 0.78 4.83 0.73 4.23 0.99 1.81 1.38 1.38 0.48 0.09 0.44 0.38
10 ln(Mn/K) ln(Y) ln(Ga) 7.28 0.97 4.74 0.66 2.06 0.61 -0.56 0.59 1.28 0.56 0.04 0.41 0.44
11 ln(Mg/Al) ln(Zn) ln(Pb) -0.58 0.38 -2.81 0.44 6.51 0.84 3.41 1.12 4.18 1.01 1.73 1.33 0.41
12 ln(Mg/Al) ln(Zn) ln(Y) -0.61 0.41 -2.83 0.42 6.47 0.87 3.36 1.08 2.09 0.58 -0.54 0.61 0.42
13 ln(Mg/Al) ln(Zn) ln(Ga) -0.58 0.37 -2.81 0.45 6.52 0.84 3.42 1.13 1.34 0.52 0.06 0.42 0.41
14 ln(Mg/Al) ln(Pb) ln(Y) -0.60 0.40 -2.82 0.43 4.16 1.02 1.70 1.32 2.10 0.57 -0.53 0.62 0.42
15 ln(Mg/Al) ln(Pb) ln(Ga) -0.56 0.36 -2.79 0.46 4.20 1.00 1.76 1.35 1.35 0.51 0.07 0.43 0.40
16 ln(Mg/Al) ln(Y) ln(Ga) -0.60 0.40 -2.82 0.43 2.10 0.56 -0.53 0.63 1.32 0.53 0.05 0.42 0.42
17 ln(Zn) ln(Pb) ln(Y) 6.33 0.97 3.23 0.97 4.06 1.05 1.55 1.23 1.97 0.71 -0.62 0.53 0.47
18 ln(Zn) ln(Pb) ln(Ga) 6.19 1.08 3.11 0.86 3.96 1.09 1.41 1.16 1.14 0.65 -0.01 0.38 0.52
19 ln(Zn) ln(Y) ln(Ga) 6.35 0.96 3.25 0.98 1.99 0.69 -0.60 0.54 1.23 0.59 0.02 0.40 0.47
20 ln(Pb) ln(Y) ln(Ga) 4.09 1.04 1.60 1.26 2.01 0.66 -0.59 0.55 1.25 0.58 0.03 0.40 0.46

4.1.1.3 Domain Assignment

The same libraries and functions used for the optimization step are used for domain assignment

(Table 4.3). Moreover, the JIT compiler of the Numba library, which is specifically built to speed up code that relies on NumPy, is adopted at this step and employed to calculate Eq. (3.4). The methodology section can be reviewed for details of the domain assignment process. Domain assignment is performed for each subset of variables, and this process lasts on average 1.5 hours for each case. Figure 4.6 clearly shows that the domain assignment reached a certain maturity in terms of clustering for the first case. Figure 4.7 also shows that the domain assignment splits the data into two clearly distinct clusters. The quality of the achieved assignment for each variable can be seen in Figure 4.8. As seen in Figure 4.8, there is no perfect fit between the optimum distributions and the achieved domaining. The results observed in Figure 4.8 suggest that there are more than two sub-domains within the larger domains used for this binary split, which is an expected behavior for an ore deposit.

Figure 4.6 Improvement in domain assignment

Figure 4.7 Result of domain assignment - distribution of samples in tri-variate space

Figure 4.8 Domain assignment result of first case: probability plots comparing achieved and
optimal domaining

4.1.2 Supervised Learning and Ensemble Learning

Coordinates of composite samples and domain labels defined after the unsupervised learning step

are used as input and output data in SVC, respectively (Figure 4.9). The libraries scikit-learn and

NumPy are employed for all steps of SVC (Table 4.6). The kernel function used for SVC was

RBF. Combinations of possible SVC parameters are tested with the 5-fold cross-validation technique, and the parameters are determined accordingly. The GridSearchCV function of scikit-learn is utilized

to perform 5-fold cross-validation. The results of 5-fold cross-validation are tabulated in Table 4.7.

Figure 4.9 Result of domain assignment process in coordinate system

Table 4.6 Functions used to perform support vector classifier

Library Package Function


NumPy - -
Scikit-learn svm SVC
Scikit-learn preprocessing StandardScaler
Scikit-learn model_selection GridSearchCV
Scikit-learn model_selection train_test_split
Scikit-learn metrics confusion_matrix

Table 4.7 Scores of SVC with different combination of parameters according to 5-fold cross-
validation
Parameters Test Score Mean of St. Dev.
Index C Gamma Split1 Split2 Split3 Split4 Split5 Scores of Scores Rank
1 1 scale 0.86 0.847 0.861 0.858 0.859 0.857 0.005 27
2 1 10 0.905 0.896 0.904 0.897 0.903 0.901 0.004 6
3 1 5 0.894 0.883 0.892 0.898 0.897 0.893 0.005 12
4 1 2 0.88 0.872 0.883 0.883 0.886 0.881 0.005 18
5 1 1 0.873 0.859 0.871 0.873 0.874 0.87 0.006 21
6 1 0.1 0.852 0.84 0.852 0.852 0.852 0.85 0.005 30
7 10 scale 0.867 0.854 0.868 0.865 0.869 0.865 0.006 24
8 10 10 0.911 0.896 0.907 0.9 0.908 0.904 0.005 3
9 10 5 0.904 0.894 0.9 0.898 0.904 0.9 0.004 7
10 10 2 0.89 0.883 0.888 0.89 0.898 0.89 0.005 14
11 10 1 0.88 0.87 0.882 0.879 0.885 0.879 0.005 19
12 10 0.1 0.858 0.847 0.858 0.857 0.857 0.855 0.004 29
13 50 scale 0.872 0.856 0.867 0.87 0.873 0.868 0.006 23
14 50 10 0.91 0.9 0.906 0.905 0.903 0.905 0.004 1
15 50 5 0.909 0.895 0.909 0.902 0.904 0.904 0.005 4
16 50 2 0.894 0.884 0.887 0.893 0.897 0.891 0.005 13
17 50 1 0.886 0.875 0.884 0.885 0.889 0.884 0.005 17
18 50 0.1 0.861 0.846 0.86 0.86 0.857 0.857 0.005 28
19 100 scale 0.873 0.858 0.868 0.872 0.874 0.869 0.006 22
20 100 10 0.906 0.9 0.908 0.899 0.901 0.903 0.003 5
21 100 5 0.911 0.896 0.906 0.902 0.907 0.905 0.005 2
22 100 2 0.898 0.886 0.89 0.897 0.898 0.894 0.005 11
23 100 1 0.889 0.879 0.886 0.888 0.892 0.887 0.004 16
24 100 0.1 0.861 0.849 0.861 0.857 0.862 0.858 0.005 26
25 1000 scale 0.878 0.862 0.873 0.877 0.879 0.874 0.006 20
26 1000 10 0.898 0.892 0.905 0.893 0.889 0.895 0.005 10
27 1000 5 0.904 0.894 0.903 0.902 0.895 0.9 0.004 8
28 1000 2 0.901 0.889 0.899 0.899 0.902 0.898 0.005 9
29 1000 1 0.892 0.879 0.886 0.887 0.895 0.888 0.005 15
30 1000 0.1 0.866 0.85 0.864 0.865 0.867 0.862 0.006 25

While applying SVC, the data was split into training and validation sets with 25% validation ratio,

and then one third of the training data are left out for bagging. Predictions were made along a

determined mesh grid (Table 4.8), and each point of the mesh corresponds to a node of the model

centered in a volume of 15 x 15 x 10 cubic meters, for a total of 2.3 billion cubic meters for the entire model. Considering an average rock density of 2.7 tonne/m3, the modelled volume is equivalent to 6.3 billion tonnes of material. SVC is performed for each subset of variables, which are in total 20, and for each subset of samples, which are again 20. In total, SVC is performed 400 times. The node

model determined when applying SVC to the variables Mg/Al, Mn/K and Zn is illustrated in Figure

4.10, and the ensemble model, which is the average of all SVC models for all sets of variables, is

illustrated in Figure 4.11. Vertical cross-sections overlaid by composite samples (Figure 4.12)

confirmed that the ensemble model represents boundaries well.

Table 4.8 Mesh grid along which predictions made


Range Number of points
Easting 15000 18000 201
Northing 106000 107500 101
Elevation 2640 3140 51
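The grid of Table 4.8 can be reproduced with a few lines of NumPy; the sketch below only generates the node coordinates and is not the thesis code.

```python
# Minimal sketch of the prediction grid of Table 4.8 (201 x 101 x 51 nodes,
# i.e. a 15 m x 15 m x 10 m node spacing)
import numpy as np

x = np.linspace(15000, 18000, 201)
y = np.linspace(106000, 107500, 101)
z = np.linspace(2640, 3140, 51)
xx, yy, zz = np.meshgrid(x, y, z, indexing="ij")
X_grid = np.column_stack([xx.ravel(), yy.ravel(), zz.ravel()])  # one row per node
print(X_grid.shape)                                             # (1035351, 3)
```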

Figure 4.10 Node model of the variables Mg/Al, Mn/K and Zn

Figure 4.11 Final ensemble model

Figure 4.12 Vertical cross-sections taken along North axis of the SVC Ensemble model

The accuracy of all node models is tracked using the balanced-accuracy technique of Eq. (2.11). All models show a high accuracy of approximately 90% (Figure 4.13). The ensemble model is categorized back to binary values according to the global proportion of domains, which is 0.41/0.59, so that an overall accuracy can be quantified. To reproduce the original proportion of the domains, 1.4 is chosen as the threshold value (Figure 4.14), meaning that if the average of the domain labels is equal to or less than 1.4, then the block becomes domain 1; otherwise it becomes domain 2 (Figure

4.15). Total mass of domain 1 of the final model is equivalent to 2.6 billion tonnes of material, and

that of domain 2 is equivalent to 3.7 billion tonnes of material.
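This categorization step can be sketched as follows, assuming ensemble is a hypothetical array holding the node-wise average of the 400 weak-learner labels (values between 1 and 2); taking the threshold as the 0.41 quantile reproduces the global proportion, and is one way of obtaining a threshold such as the reported value of 1.4.

```python
# Minimal sketch: categorize the averaged ensemble back to two domains so that
# the global proportion of domain 1 (about 0.41) is reproduced.
import numpy as np

target_p1 = 0.41
threshold = np.quantile(ensemble, target_p1)        # 'ensemble' is hypothetical
domains = np.where(ensemble <= threshold, 1, 2)     # 1 = domain 1, 2 = domain 2
```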

The domains of nodes closest to the validation samples are found using nearest neighbor method,

so that the categorized ensemble model can be compared with the validation dataset. Finally, a

91.3% balanced accuracy of the proposed ensemble model is obtained from the resulting confusion

matrix (Table 4.9).
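The validation itself can be sketched as below, where X_valid and y_valid stand for the coordinates and unsupervised labels of the held-out composites (hypothetical names), and domains is the categorized ensemble model from the previous sketch.

```python
# Minimal sketch of the validation: nearest grid node per validation sample,
# then confusion matrix and balanced accuracy (Eq. (2.11)).
from scipy.spatial import cKDTree
from sklearn.metrics import balanced_accuracy_score, confusion_matrix

_, nearest = cKDTree(X_grid).query(X_valid)     # index of the closest grid node
predicted = domains[nearest]                    # domain of that node
print(confusion_matrix(y_valid, predicted, normalize="true"))
print(balanced_accuracy_score(y_valid, predicted))
```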

Figure 4.13 Boxplots showing the distribution of accuracies for each model according to subset
of variables. Notice that vertical axis of the graph is limited from bottom and top, 0.89 and 0.92
respectively.

Figure 4.14 Categorization of Ensemble learning according to optimum proportion of domains

Figure 4.15 Categorization of Ensemble model – Horizontal profiles

Table 4.9 Confusion matrix of categorized ensemble model

                          Predicted Domain
                            1          2
True Domain    1          0.898      0.102
               2          0.071      0.929

4.1.3 Comparison with BlockSIS

BlockSIS is applied on the labelled data and implements simple indicator kriging with local

proportions, to determine the conditional distribution of categories at every node. Firstly, the local

proportion within a neighborhood is calculated using the k-means clustering method along the grid

specified in Table 4.8. An average of the closest 8 samples within 50 meters distance is used as

local proportion. The global average proportion is assigned to points when there are no samples

within 50 meters distance. The global proportions imposed on the program were 0.41/0.59.

Secondly, the spatial continuity of the domains is defined with indicator variogram models.

Horizontal continuity is analyzed with experimental variograms created along 12 directions from

Azimuth 0° to Azimuth 165° (Figure 4.16). Azimuth 75°, Azimuth 165° and vertical directions are

determined as orthogonal directions with maximum anisotropy. Variogram models were built

fitting the experimental variograms (Figure 4.17) with a 0.05 nugget effect and 3 spherical structures (Eq. (4.1)). The number of realizations was defined as 20. The rest of the details can be seen in

Figure 4.18 showing the parameter file of BlockSIS.

Figure 4.16 Experimental variograms created along horizontal direction

Figure 4.17 Variogram models built along chosen directions

\gamma_{A-75°}(h) = 0.05 \cdot Nug(h) + 0.63 \cdot Sph_{850}(h) + 0.21 \cdot Sph_{55}(h) + 0.11 \cdot Sph_{600}(h)
\gamma_{A-165°}(h) = 0.05 \cdot Nug(h) + 0.63 \cdot Sph_{5000}(h) + 0.21 \cdot Sph_{110}(h) + 0.11 \cdot Sph_{20}(h)    (4.1)
\gamma_{Vert}(h) = 0.05 \cdot Nug(h) + 0.63 \cdot Sph_{1500}(h) + 0.21 \cdot Sph_{250}(h) + 0.11 \cdot Sph_{40}(h)

Figure 4.18 Parameter file of BlockSIS

Two scenarios were created with image cleaning; one is with light image cleaning and the other is

with heavy image cleaning options (Figure 4.19 & Figure 4.20). Pointy results occur because local

proportions are imposed on the simulation. Increasing the search neighborhood of the nearest neighbor estimate can smooth the pointy appearance, but leads to little uncertainty in the

boundaries between categories.

It can be observed from the results that as the intensity of image cleaning increases, models look

like object-based models. Moreover, the model obtained from BlockSIS with heavy image

cleaning is closer to that of the Ensemble model, therefore comparison and validation are

performed using heavy image cleaning.

Figure 4.19 Block models of BlockSIS with light image cleaning and heavy image cleaning

Figure 4.20 Effect of image cleaning on BlockSIS – horizontal profiles

Cross-sections overlaid by composite samples (Figure 4.21) show that BlockSIS with heavy image

cleaning has less uncertainty along boundaries in comparison to the Ensemble model. Moreover,

extrapolation in BlockSIS is limited, resulting in the global proportions in areas without data.

Accuracy is assessed with the balanced accuracy technique again for each realization (Table 4.10).

To make the final E-type model comparable, values are categorized back to domains considering

the original proportion of domains (Figure 4.22 & Figure 4.23). Just like with the Ensemble model,

domains of closest blocks are assigned to validation data samples using the nearest neighbor

method, so that the categorized BlockSIS model can be compared with the validation dataset. The

balanced accuracy of the categorized E-type model obtained from the confusion matrix (Table

4.11) is 92.5%.

Figure 4.21 BlockSIS with heavy image cleaning - Vertical cross-sections taken along North axis

Table 4.10 Confusion matrices of all realizations with balanced accuracy
Confusion Matrix (Row, Column) Balanced
Realizations (0,0) (0,1) (1,0) (1,1) Accuracy
1 1319 187 100 2642 0.920
2 1325 181 123 2619 0.917
3 1331 175 128 2614 0.919
4 1326 180 108 2634 0.921
5 1333 173 138 2604 0.917
6 1324 182 119 2623 0.918
7 1308 198 114 2628 0.913
8 1300 206 119 2623 0.910
9 1295 211 95 2647 0.913
10 1324 182 135 2607 0.915
11 1331 175 127 2615 0.919
12 1339 167 131 2611 0.921
13 1308 198 123 2619 0.912
14 1333 173 148 2594 0.916
15 1317 189 125 2617 0.914
16 1312 194 114 2628 0.915
17 1328 178 126 2616 0.918
18 1309 197 120 2622 0.913
19 1322 184 115 2627 0.918
20 1313 193 108 2634 0.916

Figure 4.22 Categorization of BlockSIS model with heavy image cleaning

Figure 4.23 Categorization of BlockSIS – horizontal profiles
Table 4.11 Confusion matrix of categorized BlockSIS model
                          Predicted Domain
                            1          2
True Domain    1          0.889      0.111
               2          0.046      0.954

It is clear that both the Ensemble model and the BlockSIS model have high accuracy, and the difference in accuracy between the two models is small. The accuracies are high because the validation set is determined by extracting individual composite samples from the original data. This means that the validation data are always very close to neighboring composite samples, which condition the block at the validation sample location very strongly. If the validation set were defined by removing entire drillholes instead of individual samples, the accuracy would be much lower.

While there is no clear difference between the two models, there are some details worth

mentioning. First, when the two models are compared (Figure 4.24), the Ensemble model depicts uncertainty better, especially at the boundaries, with smoother transitions. Second, the Ensemble model provides extrapolations that follow the geological shape of the orebodies. Third, the pointy appearance of BlockSIS caused by using a local mean can be diminished if the neighborhood of the nearest neighbor estimate is increased; however, this may reduce the local accuracy of the model and increase smoothing.

Figure 4.24 Comparison of two models with vertical cross-sections

4.2 Second Level of Hierarchical Application

Each domain obtained from the first level of the hierarchical application is further split into two domains. Domain 1 is split into domains 10 and 20, and domain 2 is split into domains 30 and 40. The methodology followed for the first level is also followed in the second level of the hierarchical application. While domains 10 and 20 represent the chlorite-sericite group and the potassic group, respectively, domains 30 and 40 represent the quartz-sericite and argillic alteration groups, respectively.

4.2.1 Unsupervised Clustering

Out of 17074 samples, 6142 samples were labelled as domain 1 and the rest were labelled as

domain 2 at the first level of the hierarchical application. Unsupervised clustering is performed using

samples of each domain separately.

4.2.1.1 Data Preparation and Exploratory Data Analysis

The best variables representing the sub-domains of domain 1 and domain 2 are determined with exploratory data analysis. For each domain, four variables with similar distribution characteristics are determined. The variables used for domain 1 are yttrium (Y), molybdenum (Mo), rhenium (Re) and magnesium divided by aluminum (Mg/Al) (Table 4.12 and Figure 4.25). The histograms of the variables in Figure 4.25 clearly show that the data have negative skewness, which can be the sign of two domains, the one located on the negative side having a smaller proportion.

Table 4.12 Descriptive statistics of variables chosen for domain 1. Elements are in ppm, and
ratio is adimensional
St.
Count Mean Dev. Min 25% 50% 75% Max
x 6142 16181.52 498.82 14971.13 15904.13 16120.12 16395.51 18050.26
y 6142 106931.66 370.18 106000.01 106632.29 106982.95 107254.12 107500.00
z 6142 2834.24 104.20 2640.03 2753.45 2828.08 2911.87 3106.07
Y 6142 9.83 5.43 0.88 6.28 9.54 12.22 52.20
Mo 6142 88.64 92.36 1.30 40.08 70.53 112.56 1870.00
Re 6142 0.34 0.36 0.00 0.12 0.25 0.44 5.73
Mg/Al 6142 0.57 0.19 0.09 0.44 0.59 0.71 1.09

Figure 4.25 Variables chosen for sub-domains of domain 1

The variables used for domain 2 are zinc (Zn), indium (In), tellurium (Te) and manganese divided

by potassium (Mn/K) (Table 4.13 and Figure 4.26). As with the variables chosen for domain 1, the histograms of the variables chosen for domain 2 also show skewness (Figure 4.26), which can be a sign of data consisting of two domains; this time the domain with the smaller proportion is located on the positive side. The optimization and domain assignment steps are performed for each 3-combination of the 4 variables (Table 4.14).

Table 4.13 Descriptive statistics of variables chosen for domain 2. Elements are in ppm, and
ratio is adimensional
St.
Count Mean Dev. Min 25% 50% 75% Max
x 10932 17008.52 423.36 15500.36 16749.28 17034.09 17319.83 17853.72
y 10932 106877.22 349.24 106000.01 106599.90 106900.23 107152.00 107500.00
z 10932 2848.29 110.95 2640.29 2761.28 2851.13 2935.05 3138.85
Zn 10932 92.64 257.88 2.00 14.00 26.00 64.00 8283.00
In 10932 0.22 0.37 0.01 0.06 0.12 0.24 6.15
Te 10932 1.16 2.51 0.03 0.26 0.48 1.01 61.18
Mn/K 10932 166.68 269.70 9.83 70.56 114.01 177.24 10437.81

Figure 4.26 Variables chosen for sub-domains of domain 2

Table 4.14 Combinations of 3 out of 4 variables chosen for domain 1 and domain 2

Variables of Domain 1 Variables of Domain 2


1 2 3 1 2 3
1 ln(Y) ln(Mo) ln(Re) 1 ln(Zn) ln(In) ln(Te)
2 ln(Y) ln(Mo) ln(Mg/Al) 2 ln(Zn) ln(In) ln(Mn/K)
3 ln(Y) ln(Re) ln(Mg/Al) 3 ln(Zn) ln(Te) ln(Mn/K)
4 ln(Mo) ln(Re) ln(Mg/Al) 4 ln(In) ln(Te) ln(Mn/K)

4.2.1.2 Optimization of Parameters Defining the Distribution of Variables

Initial parameters are approximately determined by visual inspection on the distribution of global

data. Initial parameters assigned to each variable and optimum parameters are tabulated in Table

4.15, Table 4.16, Table 4.17 and Table 4.18. One of the results is illustrated for both domain 1 and

domain 2 in Figure 4.27 and Figure 4.28 respectively.

Table 4.15 Initial parameters of variables chosen for domain 1
Y Mo Re Mg/Al
μ σ μ σ μ σ μ σ w
Domain10 0.7 0.5 2 0.5 -4.2 0.5 -2.5 0.5 0.15
Domain20 2.4 0.5 4.4 0.6 -1.5 1 -0.4 0.4 0.85

Table 4.16 Initial parameters of variables chosen for domain 2


Zn In Te Mn/K
μ σ μ σ μ σ μ σ w
Domain30 2.9 0.6 -2.2 0.7 -1 0.7 4.7 0.8 0.85
Domain40 6 0.5 0 0.5 0.5 2 6.5 0.5 0.15

Table 4.17 Optimum parameters for each subset of variables chosen for domain 1
Three - Combination of Variable 1 Variable 2 Variable 3
Variables Domain1 Domain2 Domain1 Domain2 Domain1 Domain2
# Variable1 Variable2 Variable3 μ σ μ σ μ σ μ σ μ σ μ σ w
1 ln(Y) ln(Mo) ln(Re) 1.36 0.50 2.38 0.32 3.25 1.29 4.40 0.59 -2.73 1.47 -1.18 0.72 0.26
2 ln(Y) ln(Mo) ln(Mg/Al) 1.09 0.91 2.35 0.34 3.60 1.28 4.40 0.56 -1.78 0.94 -0.46 0.23 0.39
3 ln(Y) ln(Re) ln(Mg/Al) 1.79 0.76 2.38 0.28 -2.03 1.46 -1.16 0.61 -0.91 0.46 -0.41 0.18 0.45
4 ln(Mo) ln(Re) ln(Mg/Al) 3.54 1.25 4.41 0.55 -2.34 1.53 -1.17 0.68 -1.06 0.42 -0.42 0.20 0.34

Table 4.18 Optimum parameters for each subset of variables chosen for domain 2
Three - Combination of Variable 1 Variable 2 Variable 3
Variables Domain1 Domain2 Domain1 Domain2 Domain1 Domain2
# Variable1 Variable2 Variable3 μ σ μ σ μ σ μ σ μ σ μ σ w
1 ln(Zn) ln(In) ln(Te) 2.81 0.62 4.26 1.23 -2.59 0.76 -1.51 0.95 -1.10 0.68 -0.04 1.14 0.53
2 ln(Zn) ln(In) ln(Mn/K) 2.79 0.55 3.98 1.29 -2.63 0.74 -1.69 0.99 4.79 0.43 4.66 1.02 0.42
3 ln(Zn) ln(Te) ln(Mn/K) 2.79 0.58 4.09 1.27 -1.12 0.65 -0.16 1.15 4.78 0.46 4.66 1.06 0.47
4 ln(In) ln(Te) ln(Mn/K) -2.63 0.74 -1.67 0.99 -1.12 0.63 -0.21 1.15 4.79 0.44 4.66 1.03 0.44

Figure 4.27 Optimization of domain 1 with variables Y, Mo and Re: cum. probability plots of
variables with initial parameters of normal dist. on the left, cum. probability plots of variables
with optimum parameters of distributions on the right.

Figure 4.28 Optimization of domain 2 with variables Zn, In and Te: cum. probability plots of
variables with initial parameters of normal dist. on the left, cum. probability plots of variables
with optimum parameters of distributions on the right.

4.2.1.3 Domain Assignment

Domain assignment is applied for each subset of variables of both domain 1 and domain 2. Domain

1 is split into two domains as domain 10 and domain 20 (Figure 4.29), and domain 2 is split into

two domains as domain 30 and domain 40 (Figure 4.30) according to optimum cumulative

distributions.

Figure 4.29 Hierarchical domain assignment of domain 1: probability plots comparing achieved
and optimal domaining for the case with variables Y, Mo and Re

Figure 4.30 Hierarchical domain assignment of domain 2: probability plots comparing achieved
and optimal domaining for the case with variables Zn, In and Te

4.2.2 Supervised Learning and Ensemble Learning

Supervised learning is applied separately on the subdomains of domain 1 and domain 2 obtained from the domain assignment step. First, Ensemble models are built for domain 1 and domain 2, and then the domaining of the ensemble models is constrained to the first-level categorized ensemble model (Figure 4.31). The accuracies of the SVM models built for domain 1 range from 0.72 to 0.81, and those of the SVM models built for domain 2 range from 0.65 to 0.71 (Figure 4.32).

The average of the global proportions obtained from the optimized distributions of the unsupervised classification step is used to determine the categorization threshold of the Ensemble model (Figure

4.33). Again, the average of the assigned categories for each sample is categorized into four

domains to assess the accuracy of the Ensemble model (Figure 4.34). Assuming a rock density of 2.7 tonne/m3, the total masses of domains 10, 20, 30 and 40 are 1 billion tonnes, 1.6 billion tonnes, 1.7 billion tonnes and 2 billion tonnes, respectively. The confusion matrix was built for the final

model, now considering the four categories (Table 4.19). While domain-wise accuracies range

between 0.56 and 0.90, the balanced accuracy of the final model is 0.73. Vertical and horizontal

cross-sections are illustrated in Figure 4.35 and Figure 4.36, respectively. As seen in Figure 4.35,

there are some misclassifications in the final model. Moreover, it can be observed from the cross-

sections that the continuity along the vertical direction is higher than along the horizontal direction.

Figure 4.31 Methodology followed to realize hierarchical ensemble model

Figure 4.32 Boxplots showing balanced accuracies of SVM models

Figure 4.33 Categorization of ensemble model

Figure 4.34 Result of second-level hierarchical domain assignment

Table 4.19 Confusion matrix of final ensemble model


                           Predicted Domain
                         10       20       30       40
 True Domain   10      0.557    0.255    0.083    0.106
               20      0.049    0.904    0.034    0.013
               30      0.026    0.015    0.799    0.160
               40      0.033    0.057    0.263    0.648

Figure 4.35 Vertical cross-sections taken from final ensemble model

Figure 4.36 Two hierarchical ensemble models along horizontal benches

4.2.3 Comparison with BlockSIS

The BlockSIS model is built following the same methodology used in the first level of the hierarchical application, but it is not constrained by the binary model initially built. The results of the domain assignment step are employed to condition the model. First, indicator variograms are built for each individual domain against the rest of the domains (Figure 4.37 and Eq. (4.2)). The BlockSIS model is built imposing local proportions and with heavy image cleaning. As was the case in the first-level hierarchical application, gridded prior proportions were computed using the Nearest Neighbor method with a 50 m distance. The global proportions determined for domains 10, 20, 30 and 40 were 0.158, 0.247, 0.274 and 0.321, respectively. Twenty realizations were performed.
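For reference, an experimental indicator variogram of one domain against the rest can be sketched as follows. This is an illustrative, omnidirectional estimator only (the study uses directional variograms along the directions shown in Figure 4.37), and the function name, array inputs and lag definition are hypothetical.

```python
import numpy as np
from scipy.spatial.distance import pdist

def indicator_variogram(coords, labels, domain, lag_edges):
    """Omnidirectional experimental indicator variogram of one domain vs. the rest:
    gamma(h) = 0.5 * E[(i(u) - i(u + h))^2], averaged over pairs in each lag bin.
    All pairs are formed explicitly, so this suits modest sample counts only."""
    ind = (np.asarray(labels) == domain).astype(float)
    dist = pdist(np.asarray(coords, dtype=float))              # pair separation distances
    gam = 0.5 * pdist(ind[:, None], metric='sqeuclidean')      # 0.5 * (i_a - i_b)^2 per pair
    bins = np.digitize(dist, lag_edges)
    return np.array([gam[bins == k].mean() if np.any(bins == k) else np.nan
                     for k in range(1, len(lag_edges))])

# e.g. indicator_variogram(sample_xyz, domain_labels, 10, lag_edges=np.arange(0, 600, 50))
```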

Figure 4.37 Variogram models built along chosen directions for each sub-domain

Domain 10 variogram models

\gamma_{A\text{-}30^{\circ}}(h) = 0.1\cdot\mathrm{Nug}(h) + 0.4\cdot\mathrm{Sph}_{1100}(h) + 0.3\cdot\mathrm{Sph}_{90}(h) + 0.2\cdot\mathrm{Sph}_{20}(h)

\gamma_{A\text{-}120^{\circ}}(h) = 0.1\cdot\mathrm{Nug}(h) + 0.4\cdot\mathrm{Sph}_{120}(h) + 0.3\cdot\mathrm{Sph}_{40}(h) + 0.2\cdot\mathrm{Sph}_{1000}(h)

\gamma_{\mathrm{Vert}}(h) = 0.1\cdot\mathrm{Nug}(h) + 0.4\cdot\mathrm{Sph}_{330}(h) + 0.3\cdot\mathrm{Sph}_{50}(h) + 0.2\cdot\mathrm{Sph}_{380}(h)

Domain 20 variogram models

\gamma_{A\text{-}15^{\circ}}(h) = 0.1\cdot\mathrm{Nug}(h) + 0.6\cdot\mathrm{Sph}_{6000}(h) + 0.2\cdot\mathrm{Sph}_{130}(h) + 0.1\cdot\mathrm{Sph}_{510}(h)

\gamma_{A\text{-}105^{\circ}}(h) = 0.1\cdot\mathrm{Nug}(h) + 0.6\cdot\mathrm{Sph}_{950}(h) + 0.2\cdot\mathrm{Sph}_{60}(h) + 0.1\cdot\mathrm{Sph}_{3000}(h)

\gamma_{\mathrm{Vert}}(h) = 0.1\cdot\mathrm{Nug}(h) + 0.6\cdot\mathrm{Sph}_{500}(h) + 0.2\cdot\mathrm{Sph}_{500}(h) + 0.1\cdot\mathrm{Sph}_{500}(h)

(4.2)

Domain 30 variogram models

\gamma_{A\text{-}75^{\circ}}(h) = 0.2\cdot\mathrm{Nug}(h) + 0.35\cdot\mathrm{Sph}_{700}(h) + 0.25\cdot\mathrm{Sph}_{60}(h) + 0.2\cdot\mathrm{Sph}_{20}(h)

\gamma_{A\text{-}165^{\circ}}(h) = 0.2\cdot\mathrm{Nug}(h) + 0.35\cdot\mathrm{Sph}_{30}(h) + 0.25\cdot\mathrm{Sph}_{260}(h) + 0.2\cdot\mathrm{Sph}_{5000}(h)

\gamma_{\mathrm{Vert}}(h) = 0.2\cdot\mathrm{Nug}(h) + 0.35\cdot\mathrm{Sph}_{450}(h) + 0.25\cdot\mathrm{Sph}_{500}(h) + 0.2\cdot\mathrm{Sph}_{80}(h)

Domain 40 variogram models

\gamma_{A\text{-}0^{\circ}}(h) = 0.1\cdot\mathrm{Nug}(h) + 0.4\cdot\mathrm{Sph}_{80}(h) + 0.3\cdot\mathrm{Sph}_{25}(h) + 0.2\cdot\mathrm{Sph}_{500}(h)

\gamma_{A\text{-}90^{\circ}}(h) = 0.1\cdot\mathrm{Nug}(h) + 0.4\cdot\mathrm{Sph}_{40}(h) + 0.3\cdot\mathrm{Sph}_{300}(h) + 0.2\cdot\mathrm{Sph}_{10}(h)

\gamma_{\mathrm{Vert}}(h) = 0.1\cdot\mathrm{Nug}(h) + 0.4\cdot\mathrm{Sph}_{330}(h) + 0.3\cdot\mathrm{Sph}_{50}(h) + 0.2\cdot\mathrm{Sph}_{380}(h)
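Each model in Eq. (4.2) is a nugget plus three nested spherical structures, with the subscripts presumably giving the ranges (in metres) of each structure along the stated direction. The helper below is a minimal illustrative sketch that evaluates the domain 10 vertical model; the function names are hypothetical.

```python
import numpy as np

def spherical(h, a):
    """Spherical variogram structure with unit sill and range a."""
    h = np.asarray(h, dtype=float)
    s = 1.5 * h / a - 0.5 * (h / a) ** 3
    return np.where(h < a, s, 1.0)

def gamma_domain10_vert(h):
    """Vertical indicator variogram model of domain 10 (Eq. 4.2):
    0.1 nugget + 0.4 Sph(330) + 0.3 Sph(50) + 0.2 Sph(380)."""
    h = np.asarray(h, dtype=float)
    nugget = np.where(h > 0, 0.1, 0.0)
    return nugget + 0.4 * spherical(h, 330) + 0.3 * spherical(h, 50) + 0.2 * spherical(h, 380)

# e.g. gamma_domain10_vert([0, 25, 100, 500]) -> approximately [0.0, 0.37, 0.65, 1.0]
```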

Next, the average of the global proportions is used to assign the four categories. Because local proportions are imposed on the model, some zones with drillhole samples were modelled deterministically, and the global proportions of the results were therefore also affected (Figure 4.38). The E-type model is categorized assuming that the deterministic values are correctly estimated, since the zones with deterministic values are densely populated by drillhole samples; however, it is important to notice that this choice neglects the global proportions of the model (Figure 4.38-a). The confusion matrix is built for the categorized model (Table 4.20). The accuracy of the model for each domain ranges between 0.482 and 0.854, and the balanced accuracy of the model is 0.694, which is slightly lower than that of the Ensemble model. If the E-type model were instead categorized into four domains honouring the global proportions, the balanced accuracy would decrease to 0.544 (Table 4.21 and Figure 4.38-b), a substantial decrease.
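For illustration, one simple way to honour the global proportions when categorizing the E-type values is to cut them at the corresponding quantiles. The sketch below uses the global proportions reported above, but the array name and the implementation are hypothetical and assume the E-type value increases monotonically from domain 10 to domain 40.

```python
import numpy as np

def categorize_etype(etype, proportions=(0.158, 0.247, 0.274, 0.321), codes=(10, 20, 30, 40)):
    """Cut the E-type values at the cumulative-proportion quantiles so that the
    categorized model reproduces the target global proportions of the four domains."""
    etype = np.asarray(etype, dtype=float)
    cuts = np.quantile(etype, np.cumsum(proportions)[:-1])        # three thresholds for four domains
    return np.asarray(codes)[np.searchsorted(cuts, etype, side='right')]
```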

Figure 4.38 Cumulative probability of the E-type BlockSIS model. Imposing local proportions on BlockSIS increases the deterministic zones (red dashed circles). Assuming the deterministic zones are correctly estimated, samples are assigned correctly (a) when the global proportions are neglected and wrongly (b) when they are considered.

Table 4.20 Confusion matrix of BlockSIS considering deterministic zones.

                           Predicted Domain
                         10       20       30       40
 True Domain   10      0.631    0.190    0.050    0.129
               20      0.164    0.810    0.013    0.013
               30      0.012    0.028    0.482    0.478
               40      0.022    0.048    0.076    0.854

Table 4.21 Confusion matrix of BlockSIS considering global proportions


                           Predicted Domain
                         10       20       30       40
 True Domain   10      0.691    0.131    0.050    0.129
               20      0.787    0.187    0.012    0.014
               30      0.016    0.024    0.447    0.513
               40      0.026    0.044    0.069    0.861

Vertical cross-sections taken along the north axis are illustrated in Figure 4.39. As seen in Figure 4.39, the effect of the simulation is observable, especially in zones where samples are scarce. Compared to the BlockSIS model of the first-level hierarchical application, the model gave similar results (Figure 4.40). As was the case in the first level of the application, the results of the BlockSIS model have more alternating zones than the Ensemble model (Figure 4.41). In zones with scarce drillhole samples, the BlockSIS model has pointy and discontinuous domains, whereas the Ensemble model remains continuous (Figure 4.42).

Figure 4.39 Three consecutive cross-sections of BlockSIS model taken along north

Figure 4.40 Comparison of BlockSIS models along horizontal benches with two categories (left)
and with four categories (right)

Figure 4.41 Final ensemble model (left) and BlockSIS model (right) along horizontal benches

Figure 4.42 Vertical cross-sections of final ensemble model (top) and BlockSIS model (bottom)

Chapter 5 Conclusions and Recommendations

The main goal of the study was to propose a workflow to build a semi-autonomous domain model for resource estimation using a geochemical dataset. The motivation behind this goal was to employ multiple variables that are linked to the geological domains in a consistent and semi-autonomous way. This supports the construction of geological models that constrain the estimation of the elements of interest in the resource model. This motivation was realized by adopting the unsupervised classification method proposed by Faraj and Ortiz (2021) as the first step of the study. Although the unsupervised classification proposed by Faraj and Ortiz (2021) is open to further development, it was capable of clustering the data into distinct geological domains successfully.

There are some points worth mentioning about the first step of the study. The global proportions obtained during the optimization of the parameters that define the distributions do not consider the spatial distribution of the samples, only the statistical distribution of the underlying variables. Using the global proportions obtained by optimization in the subsequent supervised classification step may therefore bias the final model towards oversampled zones. Correcting this issue can significantly improve the Ensemble model. A possible solution is to apply declustering techniques to the samples.
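As a concrete illustration of that recommendation, the sketch below computes cell-declustering weights in the spirit of Deutsch and Journel (1997). The cell size and array names are hypothetical; the weights would replace equal weighting of the samples when the global proportions are estimated.

```python
import numpy as np

def cell_declustering_weights(coords, cell_size):
    """Cell-declustering weights: samples in densely drilled cells receive lower weight.
    coords: (n, 3) array of sample coordinates; cell_size: cell edge length in metres."""
    cells = np.floor(np.asarray(coords, dtype=float) / cell_size).astype(int)
    _, inverse, counts = np.unique(cells, axis=0, return_inverse=True, return_counts=True)
    w = 1.0 / counts[inverse]          # weight inversely proportional to samples per cell
    return w * len(w) / w.sum()        # normalize so the weights average to 1

# Declustered global proportion of, e.g., subdomain 1 (hypothetical arrays):
# w = cell_declustering_weights(sample_xyz, cell_size=50.0)
# p_domain1 = np.average(labels == 1, weights=w)
```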

Another point related to the unsupervised classification is choosing the right variables and determining the importance of each variable. Investigating the data assuming the variables are log-normally distributed increases the accuracy of the unsupervised classification substantially. However, it is important to notice that some variables may affect the domaining more than others (Faraj and Ortiz, 2021). An unsupervised algorithm that accounts for the weight of each variable could further improve the domaining. A final consideration about the unsupervised classification is the development of an algorithm that can model more than two domains, so that the hierarchical application is no longer required. However, the optimization of multiple domains may require a complex algorithm, which can be the focus of future studies.

Successful results were also obtained from supervised learning and ensemble learning, the next step of the study. SVC was employed as the classifier, and the domaining obtained by SVC was curvy and well defined, which was another motivation of the study. SVC offers flexible models in terms of continuity and shape of the domains thanks to the two parameters controlling the algorithm. An optimum choice of parameters, balancing the accuracy of the model and the expected shape of the domains, can yield a successful final model. Reproducing the model with Ensemble learning further improved the result, and the results were thus able to define the spatial uncertainty. Accuracies of the Ensemble models were very high; however, it should be noticed that the accuracy would be lower if the training data were built considering entire drillholes rather than individual samples. One of the most important challenges in both the first and second steps of the study was the computing time. Especially as the number of variables increases, building an Ensemble model becomes inefficient in terms of time. Therefore, any improvement on the algorithms to decrease the time spent on each step would make the workflow more appealing for further studies.

The hierarchical application was another challenge for the proposed workflow. The most important issue with the hierarchical application is that subdomains are built under the assumption that the preceding domains are correct. Moreover, when there is an odd number of domains in a population, the hierarchical application cannot assign the true domaining using binary classification. In such conditions, user decisions to merge two subdomains that were previously split into different branches of the hierarchical tree can be a solution, but this may require labour-intensive work and, moreover, the model becomes prone to subjective errors. There is also an accuracy issue related to the hierarchical application: the accuracy decreases as the number of hierarchical levels increases. Another issue is that the variables used for a preceding hierarchical level may not be efficient for the descending level, and using a different subset of variables comes with other challenges. When a different subset of variables is employed for the descending level, the model must be built according to the first-level categorized ensemble model, which gives rise to inconsistencies in zones with high variation. Considering these two problems of the hierarchical application, a model capable of dealing with multiple domains in one step would be advantageous compared to the hierarchical application.

In conclusion, the workflow proposed in this study stands as a robust alternative to traditional modeling techniques, although there are issues that require further improvement. An ensemble model informed by an unsupervised classification can be a promising roadmap for the field of resource estimation. The results of the case study showed that the workflow is capable of building a robust model with high accuracy and with well-defined, curvy boundaries. Moreover, the applications adopted for each step can be replaced by other alternatives according to the objectives of future studies.

References

Adams, L. M., Nazareth, J. L., (1996). Linear and Nonlinear Conjugate Gradient-related Methods. AMS-IMS-SIAM Joint Summer Research Conference. Society for Industrial and Applied Mathematics, Philadelphia.

Agrawal, R., Imielinski, T., Swami, A., (1993). Mining Association Rules between Sets of Items in Large Databases. ACM SIGMOD Record 22, 207-216. https://doi.org/10.1145/170035.170072

Boisvert, J., (2010). Geostatistics with Locally Varying Anisotropy (Doctoral thesis, University of Alberta, Edmonton, Canada). Available from ERA: Education and Research Archive of University of Alberta. https://doi.org/10.7939/R31X5C

Bentley, M., Ringrose, P., (2021). The Rock Model. In: Reservoir Model Design. Springer, Cham. https://doi.org/10.1007/978-3-030-70163-5_2

Bishop, C. M., (2006). Kernel Methods & Sparse Kernel Machines. In: Pattern Recognition and Machine Learning. Springer Science+Business Media, LLC, Singapore.

Cevik, S. I., Ortiz, J. M., (2020-a). Machine learning applied in mineral resource sector: An overview. Predictive Geometallurgy and Geostatistics Lab, Queen's University, Annual Report 2020, paper 2020-07, 106-129.

Cevik, S. I., Ortiz, J. M., (2020-b). Machine learning applied in mineral resource sector: An overview. Predictive Geometallurgy and Geostatistics Lab, Queen's University, Annual Report 2020, paper 2020-07, 106-129.

Christopher, A., (2021). K-Nearest Neighbor. medium.com, February 2, 2021. Accessed June 2022, https://medium.com/swlh/k-nearest-neighbor-ca2593d7a3c4

Deutsch, C. V., (2003). Geostatistics. Encyclopedia of Physical Science and Technology, Third Edition, pp. 697-707.

Deutsch, C. V., (2006). A Sequential Indicator Simulation Program for Categorical Variables with Point and Block Data: BlockSIS. Computers & Geosciences, Volume 32, Issue 10, 1669-1681. https://doi.org/10.1016/j.cageo.2006.03.005

Deutsch, C. V., Journel, A. G., (1997). GSLIB Geostatistical Software Library and User's Guide, second edition. Oxford University Press, New York. 369 pages.

Dobilas, S., (2021). Apriori Algorithm for Association Rule Learning — How To Find Clear Links Between Transactions. Towards Data Science, July 10, 2021. Accessed April 2022, https://towardsdatascience.com/apriori-algorithm-for-association-rule-learning-how-to-find-clear-links-between-transactions-bf7ebc22cf0a

Faraj, F., Ortiz, J. M., (2021). A Simple Unsupervised Classification Workflow for Defining Geological Domains Using Multivariate Data. Mining, Metallurgy & Exploration 38, 1609-1623. https://doi.org/10.1007/s42461-021-00428-5

Galli, A., Beucher, H., (1997). Stochastic models for reservoir characterization: a user-friendly review. Paper presented at the Latin American and Caribbean Petroleum Engineering Conference, Rio de Janeiro, Brazil, August 1997. https://doi.org/10.2118/38999-MS

Gautier, L., (2016). Iterative Thickness Regularization of Stratigraphic Layers in Discrete Implicit Modeling. Mathematical Geosciences, Springer Verlag. https://doi.org/10.1007/s11004-016-9637-y. hal-01338981.

Géron, A., (2017). Support Vector Machines. In: Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, 2nd Edition. O'Reilly Media, Inc.

Gwynn, X. P., Brown, M. C., Mohr, P. J., (2013). Combined use of traditional core logging and televiewer imaging for practical geotechnical data collection. In: Proceedings of the 2013 International Symposium on Slope Stability in Open Pit Mining and Civil Engineering, 261-272. Australian Centre for Geomechanics.

Guo, J., Wang, J., Wu, L., Liu, C., Li, C., Li, F., Lin, M., Jessell, M. W., Li, P., Dai, X., Tang, J., (2020). Explicit-implicit-integrated 3-D geological modelling approach: A case study of the Xianyan Demolition Volcano (Fujian, China). Tectonophysics 795, 228648.

Hassanpour, R. M., Deutsch, C. V., (2010). An Introduction to Grid-Free Object-based Facies Modeling. Centre for Computational Geostatistics Report 12, 107. University of Alberta, Canada.

Hillier, M. J., de Kemp, E. A., Schetselaar, E. M., (2017). Implicit 3-D modelling of geological surfaces with the Generalized Radial Basis Functions (GRBF) algorithm. Geological Survey of Canada, Open File 7814, 15 p. https://doi.org/10.4095/301665

Hodkiewicz, P. F., (2014). Rapid Resource Modelling – A Vision for Faster and Better Mining Decisions. Ninth International Mining Geology Conference, Adelaide, SA, 18-20 August 2014.

Hotelling, H., (1933). Analysis of a complex of statistical variables into principal components. Journal of Educational Psychology, 24, 417-441. http://dx.doi.org/10.1037/h0071325

Isaaks, E. H., Srivastava, R. M., (1989). An Introduction to Applied Geostatistics. Oxford University Press, New York, USA.

James, G., Witten, D., Hastie, T., Tibshirani, R., (2013). An Introduction to Statistical Learning with Applications in R. Springer, New York, USA.

Jordan, M. I., Mitchell, T. M., (2015). Machine Learning: Trends, perspectives, and prospects. Science 349, 255-260. https://doi.org/10.1126/science.aaa8415

Kumbhare, T. A., Chobe, S. V., (2014). An Overview of Association Rule Mining Algorithms. International Journal of Computer Science and Information Technologies (IJCSIT), Vol. 5 (1), 927-930.

MacCormack, K. E., Thorleifson, L. H., Berg, R. C., Russell, H. A. J., ed., (2016). Three-Dimensional Geological Mapping: Workshop Extended Abstracts; Geological Society of America Annual Meeting, Baltimore, Maryland, October 31, 2015. Alberta Energy Regulator, AER/AGS Special Report 101, 99 p.

Newell, A. J., (2018). Implicit Geological Modelling: a new approach to 3D volumetric national-scale geological models. British Geological Survey Internal Report, OR/19/004. 37 pp.

Ortiz, J. M., (2019). Geometallurgical modeling framework. Predictive Geometallurgy and Geostatistics Lab, Queen's University, Annual Report 2019, paper 2019-01, 6-16.

Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., Duchesnay, E., (2011). Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research, vol. 12, pp. 2825-2830.

Platt, J. C., (1999). Probabilistic Outputs for Support Vector Machines and Comparisons to Regularized Likelihood Methods. MIT Press.

Rocca, J., (2019). Ensemble methods: bagging, boosting and stacking. Towards Data Science. Retrieved from https://towardsdatascience.com/ensemble-methods-bagging-boosting-and-stacking-c9214a10a205

Rossi, M. E., Deutsch, C. V., (2016). Mineral Resource Estimation. Springer.

Shen, C., Laloy, E., Albert, A., Chang, F.-J., Elshorbagy, A., Ganguly, S., Hsu, K.-L., Kifer, D., Fang, Z., Fang, K., Li, D., Li, X., Tsai, W.-P., (2018). HESS Opinions: Deep learning as a promising avenue towards knowledge discovery in water sciences. Hydrology and Earth System Sciences Discussions. https://doi.org/10.5194/hess-2018-168

Shewchuk, J. R., (1994). An Introduction to the Conjugate Gradient Method without the Agonizing Pain. School of Computer Science, Carnegie Mellon University, Pittsburgh.

Sillitoe, R. H., (2010). Porphyry copper systems. Economic Geology 105(1), 3-41.

Tipping, M. E., (2000). The relevance vector machine. Advances in Neural Information Processing Systems, pp. 652-658.

van der Maaten, L., Hinton, G., (2008). Visualizing data using t-SNE. Journal of Machine Learning Research, 9, 2579-2605.
Appendices

1. Optimization of the case with variables Mn/K, Mg/Al and Pb: cumulative probability plots of variables with initial parameters of normal distribution on the left, cumulative probability plots of variables with optimum parameters of distributions on the right.
2. Domain assignment of the case with variables Mn/K, Mg/Al and Pb: probability plots comparing achieved and optimal domaining.
3. Optimization of the case with variables Mn/K, Mg/Al and Y: cumulative probability plots of variables with initial parameters of normal distribution on the left, cumulative probability plots of variables with optimum parameters of distributions on the right.
4. Domain assignment of the case with variables Mn/K, Mg/Al and Y: probability plots comparing achieved and optimal domaining.
5. Optimization of the case with variables Mn/K, Mg/Al and Ga: cumulative probability plots of variables with initial parameters of normal distribution on the left, cumulative probability plots of variables with optimum parameters of distributions on the right.
6. Domain assignment of the case with variables Mn/K, Mg/Al and Ga: probability plots comparing achieved and optimal domaining.
7. Optimization of the case with variables Mn/K, Zn and Pb: cumulative probability plots of variables with initial parameters of normal distribution on the left, cumulative probability plots of variables with optimum parameters of distributions on the right.
8. Domain assignment of the case with variables Mn/K, Zn and Pb: probability plots comparing achieved and optimal domaining.
9. Optimization of the case with variables Mn/K, Zn and Y: cumulative probability plots of variables with initial parameters of normal distribution on the left, cumulative probability plots of variables with optimum parameters of distributions on the right.
10. Domain assignment of the case with variables Mn/K, Zn and Y: probability plots comparing achieved and optimal domaining.
11. Optimization of the case with variables Mn/K, Zn and Ga: cumulative probability plots of variables with initial parameters of normal distribution on the left, cumulative probability plots of variables with optimum parameters of distributions on the right.
12. Domain assignment of the case with variables Mn/K, Zn and Ga: probability plots comparing achieved and optimal domaining.
13. Optimization of the case with variables Mn/K, Pb and Y: cumulative probability plots of variables with initial parameters of normal distribution on the left, cumulative probability plots of variables with optimum parameters of distributions on the right.
14. Domain assignment of the case with variables Mn/K, Pb and Y: probability plots comparing achieved and optimal domaining.
15. Optimization of the case with variables Mn/K, Pb and Ga: cumulative probability plots of variables with initial parameters of normal distribution on the left, cumulative probability plots of variables with optimum parameters of distributions on the right.
16. Domain assignment of the case with variables Mn/K, Pb and Ga: probability plots comparing achieved and optimal domaining.
17. Optimization of the case with variables Mn/K, Y and Ga: cumulative probability plots of variables with initial parameters of normal distribution on the left, cumulative probability plots of variables with optimum parameters of distributions on the right.
18. Domain assignment of the case with variables Mn/K, Y and Ga: probability plots comparing achieved and optimal domaining.
19. Optimization of the case with variables Mg/Al, Zn and Pb: cumulative probability plots of variables with initial parameters of normal distribution on the left, cumulative probability plots of variables with optimum parameters of distributions on the right.
20. Domain assignment of the case with variables Mg/Al, Zn and Pb: probability plots comparing achieved and optimal domaining.
21. Optimization of the case with variables Mg/Al, Zn and Y: cumulative probability plots of variables with initial parameters of normal distribution on the left, cumulative probability plots of variables with optimum parameters of distributions on the right.
22. Domain assignment of the case with variables Mg/Al, Zn and Y: probability plots comparing achieved and optimal domaining.
23. Optimization of the case with variables Mg/Al, Zn and Ga: cumulative probability plots of variables with initial parameters of normal distribution on the left, cumulative probability plots of variables with optimum parameters of distributions on the right.
24. Domain assignment of the case with variables Mg/Al, Zn and Ga: probability plots comparing achieved and optimal domaining.
25. Optimization of the case with variables Mg/Al, Pb and Y: cumulative probability plots of variables with initial parameters of normal distribution on the left, cumulative probability plots of variables with optimum parameters of distributions on the right.
26. Domain assignment of the case with variables Mg/Al, Pb and Y: probability plots comparing achieved and optimal domaining.
27. Optimization of the case with variables Mg/Al, Pb and Ga: cumulative probability plots of variables with initial parameters of normal distribution on the left, cumulative probability plots of variables with optimum parameters of distributions on the right.
28. Domain assignment of the case with variables Mg/Al, Pb and Ga: probability plots comparing achieved and optimal domaining.
29. Optimization of the case with variables Mg/Al, Y and Ga: cumulative probability plots of variables with initial parameters of normal distribution on the left, cumulative probability plots of variables with optimum parameters of distributions on the right.
30. Domain assignment of the case with variables Mg/Al, Y and Ga: probability plots comparing achieved and optimal domaining.
31. Optimization of the case with variables Zn, Pb and Y: cumulative probability plots of variables with initial parameters of normal distribution on the left, cumulative probability plots of variables with optimum parameters of distributions on the right.
32. Domain assignment of the case with variables Zn, Pb and Y: probability plots comparing achieved and optimal domaining.
33. Optimization of the case with variables Zn, Pb and Ga: cumulative probability plots of variables with initial parameters of normal distribution on the left, cumulative probability plots of variables with optimum parameters of distributions on the right.
34. Domain assignment of the case with variables Zn, Pb and Ga: probability plots comparing achieved and optimal domaining.
35. Optimization of the case with variables Zn, Y and Ga: cumulative probability plots of variables with initial parameters of normal distribution on the left, cumulative probability plots of variables with optimum parameters of distributions on the right.
36. Domain assignment of the case with variables Zn, Y and Ga: probability plots comparing achieved and optimal domaining.
37. Optimization of the case with variables Pb, Y and Ga: cumulative probability plots of variables with initial parameters of normal distribution on the left, cumulative probability plots of variables with optimum parameters of distributions on the right.
38. Domain assignment of the case with variables Pb, Y and Ga: probability plots comparing achieved and optimal domaining.
39. Optimization of domain1 distribution with variables Y, Mo and Mg/Al: cum. probability plots of variables with initial parameters of normal dist. on the left, cum. probability plots of variables with optimum parameters of distributions on the right.
40. Hierarchical application of domain1: probability plots compare achieved and optimal domaining after domain assignment according to variables Y, Mo and Mg/Al.
41. Optimization of domain1 distribution with variables Y, Re and Mg/Al: cum. probability plots of variables with initial parameters of normal dist. on the left, cum. probability plots of variables with optimum parameters of distributions on the right.
42. Hierarchical application of domain1: probability plots compare achieved and optimal domaining after domain assignment according to variables Y, Re and Mg/Al.
43. Optimization of domain1 distribution with variables Mo, Re and Mg/Al: cum. probability plots of variables with initial parameters of normal dist. on the left, cum. probability plots of variables with optimum parameters of distributions on the right.
44. Hierarchical application of domain1: probability plots compare achieved and optimal domaining after domain assignment according to variables Mo, Re and Mg/Al.
45. Optimization of domain2 distribution with variables Zn, In and Mn/K: cum. probability plots of variables with initial parameters of normal dist. on the left, cum. probability plots of variables with optimum parameters of distributions on the right.
46. Hierarchical application of domain2: probability plots compare achieved and optimal domaining after domain assignment according to variables Zn, In and Mn/K.
47. Optimization of domain2 distribution with variables Zn, Te and Mn/K: cum. probability plots of variables with initial parameters of normal dist. on the left, cum. probability plots of variables with optimum parameters of distributions on the right.
48. Hierarchical application of domain2: probability plots compare achieved and optimal domaining after domain assignment according to variables Zn, Te and Mn/K.
49. Optimization of domain2 distribution with variables In, Te and Mn/K: cum. probability plots of variables with initial parameters of normal dist. on the left, cum. probability plots of variables with optimum parameters of distributions on the right.
50. Hierarchical application of domain2: probability plots compare achieved and optimal domaining after domain assignment according to variables In, Te and Mn/K.
