
Mizuno T. A.

Sequential Indicator Simulation applications and the importance of categorical variables modeling
Thiago Alduini Mizuno
University of Alberta, Edmonton, Canada

Abstract

Stochastic categorical simulation methods are widely used in complex non-stationary geological
conditions, in uncertainty analysis, and when data variability is relevant to the calculation. One of
the main stochastic simulation methods is Sequential Indicator Simulation (SIS). This article explains
the fundamentals of this methodology and presents some current applications: uncertainty
assessment of a copper deposit, soil contamination analysis, and facies simulation in fluvial
environments. We conclude that, although the method is subject to several pertinent criticisms,
when properly applied it proves to be a valid and straightforward technique for several cases where
the simulation of categorical variables is necessary.

1. Introduction

Sequential Indicator Simulation (SIS), created by Alabert (1987), is an essential technique for
categorical simulation. It is present in most commercial reservoir modeling software, and it is still
one of the most common choices for facies/rock type modeling. The method also has some recent
applications related to modeling mineralized zones (Sojdehee, 2015) and distributions of toxic
elements in soils (Wang, 2020).
To understand the principles of SIS, in this paper we review some concepts of geostatistics: the
definition of indicators, the indicator variogram, categorical modeling, and the differences between
estimation and simulation. We also include applications, pros, and cons of the method.
1.1. Why model facies/rock types?
We can ask ourselves why we should model facies if we can interpolate values directly. What changes
when we have a facies model? Pyrcz and Deutsch (2014) discuss the subject, and the
answer is the decision of stationarity. When a random function is stationary, the mean and covariance
are the same across the studied domain. Therefore, we assume a certain homogeneity, which may not be
accurate in most geological environments. For example, in geological contexts with substantial
differences between environments, such as fluvial systems, the floodplains have very different
distributions and variograms in comparison with fluvial channels. For this reason, they should be
treated as distinct domains. Characteristics such as porosity, for example, can change abruptly when
we move from a floodplain to a channel. Consequently, simple kriging estimates do not adequately
model this abrupt change. A facies model can help define these domains based on geological concepts.
1.2. Estimation and Simulation
One crucial point is to differentiate between estimation and simulation, a subject discussed by
several authors (Journel, 1974; Goovaerts, 1997; Pyrcz and Deutsch, 2014). A summary of the
discussions follows.
Kriging seeks the best possible estimate by minimizing the error variance of the random function to
find the best weights for the samples. As a result, the kriging estimate has a lower variance than the
original data. In simple kriging, for example, at positions far from the sampled data, beyond the range
of the variogram, the best estimate becomes the mean, thus producing highly smoothed maps in these
regions.
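The return to the mean described above can be checked numerically. The following is a minimal sketch, assuming a 1D setting, an exponential covariance with practical range `a`, and made-up sample values; the function names and all numbers are illustrative and do not come from the cited works.

```python
import numpy as np

def exp_cov(h, sill=1.0, a=10.0):
    """Exponential covariance with practical range a (illustrative parameters)."""
    return sill * np.exp(-3.0 * np.abs(h) / a)

def simple_kriging(x_data, z_data, x0, mean, sill=1.0, a=10.0):
    """Simple kriging estimate and kriging variance at a 1D location x0."""
    C = exp_cov(x_data[:, None] - x_data[None, :], sill, a)  # data-to-data covariances
    c0 = exp_cov(x_data - x0, sill, a)                        # data-to-target covariances
    lam = np.linalg.solve(C, c0)                              # kriging weights
    est = mean + lam @ (z_data - mean)
    var = sill - lam @ c0
    return est, var

# Hypothetical samples and stationary mean
x_data = np.array([0.0, 4.0, 9.0])
z_data = np.array([2.5, 0.4, 1.9])
mean = 1.0

near, var_near = simple_kriging(x_data, z_data, 4.5, mean)   # close to a sample
far, var_far = simple_kriging(x_data, z_data, 200.0, mean)   # far beyond the range
print(near, var_near)
print(far, var_far)
```

Far beyond the range the weights vanish, so the estimate collapses to the mean and the kriging variance approaches the sill, which is the smoothing effect discussed in the text.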
Simulation rests on geostatistical premises that differentiate it from kriging. A random function is
a set of regionalized random variables, and each realization from a simulation is a possible outcome
of this random function. The "real" values, of which we can access only a small sampled portion,
and the realizations are assumed to belong to the same random function.
Therefore, a realization has a structure and distribution similar to those of the real values, a
characteristic not seen in kriging. The drawback of this approach is that there are several
possible results, since the problem admits several solutions.
Journel (1974) presents the best application of each approach. Kriging is indicated when the
problem does not depend on the data variability, for example for resource or recovered
reserve estimates with limited influence of cut-offs. Simulation is used when the variability of the
data is relevant to the analysis, or when the transfer functions are non-linear. For example, it is
used in reservoir modeling because permeability variation strongly influences the flow simulation.
In mining, it is used in circumstances with complex metallurgy or where the cut-off influences the
measurement of resources.
Goovaerts (1997), using an example of the economic impact of soil pollution, shows how kriging and
simulation can lead to different results. In this case, the lack of variability in kriging
and the smoothing of high values lead to an underestimated cost. In pollutant analysis, it is
relevant to represent the abnormally high values related to contaminated areas. In the example, the
variance of the estimated map was six times smaller than the sample variance. Therefore, the cost of
remediating the contaminated area, estimated by kriging, was approximately 50% less than the actual
cost. In contrast, the simulation gave a cost similar to the actual one, only 7% lower.

2. Indicator Kriging

Indicator kriging (IK), a technique that precedes SIS, is vital to understanding the concepts
behind the local uncertainty model. Journel (1983) defines the indicator as follows: if a value does
not exceed a given threshold, the indicator receives 1; otherwise, it receives 0. The expected value
of I(u;z) is therefore the probability that Z(u) does not exceed the threshold z, equation 1
(Deutsch, 1992):

\[
I(\mathbf{u}; z) =
\begin{cases}
1, & \text{if } Z(\mathbf{u}) \le z \\
0, & \text{otherwise}
\end{cases}
\tag{1}
\]
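As an illustration of equation 1, the sketch below codes a continuous variable into indicators at several thresholds. The sample values and cut-offs are hypothetical, and `indicator_transform` is a name chosen here for illustration only.

```python
import numpy as np

def indicator_transform(z, thresholds):
    """Return an (n_samples, n_thresholds) array with 1 where z <= threshold, else 0."""
    z = np.asarray(z, dtype=float)
    thresholds = np.asarray(thresholds, dtype=float)
    return (z[:, None] <= thresholds[None, :]).astype(int)

grades = np.array([0.2, 0.7, 1.5, 3.1])   # hypothetical sample values
cutoffs = np.array([0.5, 1.0, 2.0])       # hypothetical thresholds

ind = indicator_transform(grades, cutoffs)
# The column means are the sample proportions at or below each threshold,
# i.e. an experimental cumulative distribution evaluated at the cut-offs.
props = ind.mean(axis=0)
print(ind)
print(props)
```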

IK interpolates these probabilities through kriging to generate probability maps for each
threshold and create a local uncertainty model. It is possible to obtain an estimate of the variable by
multiplying the mean of each class by its probability, according to equation 2:
\[
Z^*_{IK}(\mathbf{u}_0) = \sum_{k=1}^{K+1} \left[ i^*(\mathbf{u}_0; k) - i^*(\mathbf{u}_0; k-1) \right] m_k
\tag{2}
\]
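A small numerical sketch of equation 2 follows. It assumes the usual bounding conventions i*(u0;0) = 0 and i*(u0;K+1) = 1, which are not stated in the text, and it uses made-up probabilities and class means; `ik_etype` is a hypothetical name.

```python
import numpy as np

def ik_etype(ccdf, class_means):
    """Estimate from K kriged cumulative probabilities and K+1 class means.

    ccdf holds i*(u0;k) for k = 1..K (assumed already non-decreasing and in [0, 1]).
    The bounds i*(u0;0) = 0 and i*(u0;K+1) = 1 are assumptions of this sketch.
    """
    full = np.concatenate(([0.0], np.asarray(ccdf, dtype=float), [1.0]))
    class_probs = np.diff(full)   # probability of falling in each of the K+1 classes
    return float(class_probs @ np.asarray(class_means, dtype=float))

ccdf = [0.2, 0.6, 0.9]               # hypothetical kriged probabilities at 3 thresholds
class_means = [0.3, 0.8, 1.4, 3.0]   # hypothetical means of the 4 classes
z_ik = ik_etype(ccdf, class_means)
print(z_ik)
```

Here the class probabilities are 0.2, 0.4, 0.3, and 0.1, so the estimate is 0.06 + 0.32 + 0.42 + 0.30 = 1.10.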

As Journel (1983) pointed out, an advantage of IK is the possibility of working with data with
significant variability. In kriging, variograms can present inconsistencies, especially in the
presence of outliers; indicator kriging makes it possible to avoid this unwanted effect.
Another advantage is that a different variogram can be used for each threshold, which allows for
“risk-qualified spatial distributions.”
Nowadays, IK is not a popular method. Recent results (Vincent, 2020) show that
IK does not outperform other kriging methods as a reserve estimator. Additionally,
the need for several modeling steps, such as threshold definition and multiple variograms, makes it a
less practical method, which further discourages its use.
2.1. Indicator variogram and local uncertainty model
A key feature of SIS is that it is a kriging-based method, so it inherits some of kriging's features.
The main one is measuring the spatial correlation of the data through the variogram, which defines
the kriging estimates used in the local uncertainty model.
The previous section discussed the indicator kriging method, where a continuous variable is
converted into indicators using cutoffs, yielding a distribution of the probabilities that these values
are or are not exceeded. Similarly, we can use the indicator to define this relationship for
categorical variables, equation 3:

\[
i(\mathbf{u}; k) =
\begin{cases}
1, & \text{if category } k \text{ prevails at location } \mathbf{u} \\
0, & \text{otherwise}
\end{cases}
, \quad k = 1, \ldots, K.
\tag{3}
\]
At each sampled location, the probability that a given category occurs is known: i(u;k) = 1 if
category k prevails at that location, and 0 otherwise. The experimental indicator variogram follows
the same procedure used in any geostatistical method, equation 4:

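\[
\begin{aligned}
2\gamma(\mathbf{h}) &= \mathrm{Var}\{ Z(\mathbf{u} + \mathbf{h}) - Z(\mathbf{u}) \} \\
\gamma(\mathbf{h}) &= C(0) - C(\mathbf{h}), \quad \forall\, \mathbf{u}
\end{aligned}
\tag{4}
\]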
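Equations 3 and 4 can be illustrated together. The sketch below codes a hypothetical string of facies into indicators and computes an experimental indicator semivariogram along it, assuming regularly spaced 1D samples and the usual estimator (half the mean squared difference of pairs separated by lag h). Function names and data are illustrative.

```python
import numpy as np

def category_indicators(cats, K):
    """Indicator coding of equation 3: column k-1 is 1 where category k prevails."""
    cats = np.asarray(cats)
    return (cats[:, None] == np.arange(1, K + 1)[None, :]).astype(float)

def indicator_variogram(ind, max_lag):
    """Experimental semivariogram of one indicator along a regularly spaced 1D string."""
    gam = np.empty(max_lag)
    for h in range(1, max_lag + 1):
        diff = ind[h:] - ind[:-h]          # all pairs separated by h sample spacings
        gam[h - 1] = 0.5 * np.mean(diff ** 2)
    return gam

facies = np.array([1, 1, 1, 2, 2, 1, 1, 3, 3, 3])   # hypothetical facies codes, K = 3
ind = category_indicators(facies, K=3)
gamma_1 = indicator_variogram(ind[:, 0], max_lag=3)  # variogram of category 1
print(ind.sum(axis=1))   # each sample belongs to exactly one category
print(gamma_1)
```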

A variogram model needs to be defined. However, there is a restriction on using the Gaussian model:
indicators can only have linear increments, thus admitting only the spherical and exponential models
(Deutsch, 2006).
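For reference, the two admissible models can be written as short functions. This is a sketch using the common parameterization in which `a` is the range of the spherical model and the practical range of the exponential model; taking the sill as p_k(1 - p_k), the variance of an indicator with proportion p_k under stationarity, is an assumption added here for illustration.

```python
import numpy as np

def spherical(h, sill, a):
    """Spherical variogram: linear near the origin, reaches the sill exactly at range a."""
    r = np.minimum(np.abs(h) / a, 1.0)
    return sill * (1.5 * r - 0.5 * r ** 3)

def exponential(h, sill, a):
    """Exponential variogram: linear near the origin, about 95% of the sill at h = a."""
    return sill * (1.0 - np.exp(-3.0 * np.abs(h) / a))

p_k = 0.4                      # hypothetical global proportion of category k
sill = p_k * (1.0 - p_k)       # indicator variance under stationarity
lags = np.arange(0.0, 21.0, 5.0)
gam_sph = spherical(lags, sill, a=12.0)
gam_exp = exponential(lags, sill, a=12.0)
print(gam_sph)
print(gam_exp)
```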
Over the years, several authors (Gómez-Hernández and Srivastava, 1990; Goovaerts, 1994;
Goovaerts, 1997) implemented SIS solutions, with modifications related to the way the local
uncertainty model is built and to the use of secondary variables. Deutsch (2006)
combined these variations in a program that follows GSLIB (Deutsch and Journel, 1998) standards. It
offers nine SIS options with different indicator kriging variants. This study focuses on the most
used options: stationary simple kriging, ordinary kriging, and nonstationary simple kriging with
locally varying mean.
Stationary simple kriging is an option where the global probability is stationary. It is indicated in
geologically more homogeneous situations with a stationary domain. In this case, no secondary
variables are used. It is given by equation 5 (Deutsch, 2006):
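\[
\begin{aligned}
i^*_{SK}(\mathbf{u}; k) - p_k &= \sum_{\alpha=1}^{n} \lambda^{SK}_{\alpha}(\mathbf{u}; k) \left[ i(\mathbf{u}_{\alpha}; k) - p_k \right], \\
i^*_{SK}(\mathbf{u}; k) &= \sum_{\alpha=1}^{n} \lambda^{SK}_{\alpha}(\mathbf{u}; k)\, i(\mathbf{u}_{\alpha}; k) + \left[ 1 - \sum_{\alpha=1}^{n} \lambda^{SK}_{\alpha}(\mathbf{u}; k) \right] p_k
\end{aligned}
\tag{5}
\]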
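The second form of equation 5 makes explicit that the global proportion p_k receives the weight not given to the samples. A minimal 1D sketch follows, assuming an exponential covariance with practical range `a` and hypothetical data; `sk_indicator` is an illustrative name, not a function of any cited program.

```python
import numpy as np

def cov_exp(h, sill, a):
    """Exponential covariance with practical range a."""
    return sill * np.exp(-3.0 * np.abs(h) / a)

def sk_indicator(x_data, ind_data, x0, p_k, a):
    """Simple indicator kriging of one category at x0 (second form of equation 5)."""
    sill = p_k * (1.0 - p_k)                  # indicator variance, requires 0 < p_k < 1
    C = cov_exp(x_data[:, None] - x_data[None, :], sill, a)
    c0 = cov_exp(x_data - x0, sill, a)
    lam = np.linalg.solve(C, c0)              # simple kriging weights
    est = lam @ ind_data + (1.0 - lam.sum()) * p_k
    return est, lam

x_data = np.array([0.0, 3.0, 8.0])     # hypothetical sample positions
ind_k = np.array([1.0, 0.0, 1.0])      # indicator of category k at the samples
p_k = 0.4                              # hypothetical global proportion

est_mid, lam_mid = sk_indicator(x_data, ind_k, 5.0, p_k, a=10.0)
est_far, lam_far = sk_indicator(x_data, ind_k, 500.0, p_k, a=10.0)
print(est_mid, lam_mid.sum())
print(est_far, lam_far.sum())
```

Far from the data the sample weights vanish and the estimate returns to p_k, which is why this option suits a stationary domain.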

Ordinary kriging calculates the weights in such a way that they add up to 1. This option does not use
a global probability as in simple kriging. Thus, all weight is given to the samples, which could lead
to excessive extrapolation of the influence of the initial data. However, this can also be seen as
an advantage over simple kriging, since the mean is not assumed stationary, which favors local trends.
Deutsch (2006) gives equation 6:
\[
i^*_{OK}(\mathbf{u}; k) = \sum_{\alpha=1}^{n} \lambda^{OK}_{\alpha}(\mathbf{u}; k)\, i(\mathbf{u}_{\alpha}; k)
\tag{6}
\]
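The unit-sum constraint on the weights is commonly enforced with a Lagrange multiplier added to the kriging system. The sketch below shows this for one category in 1D, assuming a unit-sill exponential covariance with practical range `a` and hypothetical data; `ok_indicator` is an illustrative name.

```python
import numpy as np

def ok_indicator(x_data, ind_data, x0, a):
    """Ordinary indicator kriging of one category at x0 (equation 6)."""
    n = len(x_data)
    cov = lambda h: np.exp(-3.0 * np.abs(h) / a)   # unit-sill exponential covariance
    # Augmented system: covariances plus one row and column for the unit-sum constraint
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = cov(x_data[:, None] - x_data[None, :])
    A[n, n] = 0.0
    b = np.ones(n + 1)
    b[:n] = cov(x_data - x0)
    sol = np.linalg.solve(A, b)
    lam = sol[:n]                                  # weights; sol[n] is the multiplier
    return lam @ ind_data, lam

x_data = np.array([0.0, 3.0, 8.0])    # hypothetical sample positions
ind_k = np.array([1.0, 0.0, 1.0])     # indicator of category k at the samples

est_mid, lam_mid = ok_indicator(x_data, ind_k, 5.0, a=10.0)
est_far, lam_far = ok_indicator(x_data, ind_k, 500.0, a=10.0)
print(est_mid, lam_mid.sum())
print(est_far, lam_far.sum())
```

Even far from the data the weights still sum to 1, so the estimate is a local average of the nearby indicators rather than the global proportion.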

Nonstationary simple kriging with locally varying mean is similar to stationary simple
kriging, with weights calculated in the same way, but with the advantage of nonstationarity. A
secondary variable defines the local means, which serve to estimate the probabilities. One important
point is how to define these means. An alternative is to use seismic data to help define the local
facies probabilities through the relationship between impedance and the proportion of each facies. It
is also possible to build maps of local means from conceptual geological models. It is given by
equation 7 (Deutsch, 2006):
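\[
i^*_{LVM}(\mathbf{u}; k) - p_k(\mathbf{u}) = \sum_{\alpha=1}^{n} \lambda^{SK}_{\alpha}(\mathbf{u}; k) \left[ i(\mathbf{u}_{\alpha}; k) - p_k(\mathbf{u}_{\alpha}) \right]
\tag{7}
\]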
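Rearranging equation 7 in the same way as the second form of equation 5 may help to see the role of the local means. This is a direct algebraic rearrangement, written here with the subscript LVM for the locally varying mean estimate:

```latex
\[
i^*_{LVM}(\mathbf{u}; k) = \sum_{\alpha=1}^{n} \lambda^{SK}_{\alpha}(\mathbf{u}; k)\, i(\mathbf{u}_{\alpha}; k)
+ \left[ p_k(\mathbf{u}) - \sum_{\alpha=1}^{n} \lambda^{SK}_{\alpha}(\mathbf{u}; k)\, p_k(\mathbf{u}_{\alpha}) \right]
\]
```

When the local mean is constant, p_k(u) = p_k(u_α) = p_k, the bracket becomes [1 - Σ λ] p_k and the expression reduces to equation 5.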

Since each indicator is kriged independently, a correction of the estimated probabilities is necessary
so that none is negative and their sum equals one. Deutsch (2006) explained that the most common
procedure is resetting the negative values to zero and dividing all values by their sum.
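The correction just described takes only a few lines. In this sketch the fallback to equal probabilities when every value is non-positive is an assumption added for robustness and is not taken from Deutsch (2006); the input values are hypothetical.

```python
import numpy as np

def correct_proportions(probs):
    """Reset negative kriged probabilities to zero, then rescale so they sum to one."""
    p = np.clip(np.asarray(probs, dtype=float), 0.0, None)
    total = p.sum()
    if total <= 0.0:
        # Assumption of this sketch: fall back to equal probabilities
        return np.full(p.shape, 1.0 / p.size)
    return p / total

raw = [0.7, -0.1, 0.5]             # hypothetical kriged probabilities for 3 categories
fixed = correct_proportions(raw)
print(fixed)
```

For the example above the corrected values are 0.7/1.2, 0, and 0.5/1.2.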

3. Sequential Indicator Simulation

Once the local uncertainty model is defined, there are no significant differences between the facies
definition procedures. Gómez-Hernández and Srivastava (1990) describe the steps
used in the algorithm. SIS defines a random path along which nodes are visited sequentially. At each
node, the algorithm uses a search neighborhood to find the conditioning data, removes redundant data,
and limits the number of data used for operational efficiency. It then estimates the local
uncertainty model and defines the category using a random value between 0 and 1 drawn from a uniform
distribution, following Monte Carlo simulation. The simulated value is added to the conditioning data
for subsequent nodes. Therefore, the local uncertainty model is updated at each
stage.
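The steps above can be sketched for a 1D grid as follows. This is a simplified illustration, not the ISIM3D or BlockSIS implementation: it assumes simple indicator kriging with one exponential covariance shared by all categories, categories coded 0..K-1, and it omits the removal of redundant data. All names and parameters are hypothetical.

```python
import numpy as np

def cov_exp(h, a):
    """Unit-sill exponential covariance; simple kriging weights do not depend on the sill."""
    return np.exp(-3.0 * np.abs(h) / a)

def sis_1d(n_nodes, cond_idx, cond_cat, props, a, max_data=8, radius=None, seed=0):
    """One SIS realization on a 1D grid (a: practical range in grid units)."""
    rng = np.random.default_rng(seed)
    props = np.asarray(props, dtype=float)
    K = len(props)
    radius = a if radius is None else radius
    sim = np.full(n_nodes, -1)                       # -1 marks an unsimulated node
    sim[np.asarray(cond_idx, dtype=int)] = cond_cat  # hard data are honored
    path = rng.permutation(np.flatnonzero(sim < 0))  # random path over empty nodes
    for node in path:
        known = np.flatnonzero(sim >= 0)             # data plus already simulated nodes
        dist = np.abs(known - node)
        order = np.argsort(dist)[:max_data]          # keep at most max_data closest
        near = known[order][dist[order] <= radius]   # search neighborhood
        if near.size == 0:
            p = props.copy()                         # no information: global proportions
        else:
            C = cov_exp(near[:, None] - near[None, :], a)
            c0 = cov_exp(near - node, a)
            lam = np.linalg.solve(C, c0)
            ind = (sim[near][:, None] == np.arange(K)[None, :]).astype(float)
            p = lam @ ind + (1.0 - lam.sum()) * props    # simple kriging for each k
            p = np.clip(p, 0.0, None)                    # order relation correction
            p = p / p.sum() if p.sum() > 0 else props.copy()
        u = rng.random()                             # uniform draw in [0, 1)
        k = int(np.searchsorted(np.cumsum(p), u, side="right"))
        sim[node] = min(k, K - 1)                    # guard against rounding in the cdf
    return sim

real = sis_1d(50, cond_idx=[5, 20, 40], cond_cat=[0, 1, 0],
              props=[0.6, 0.4], a=10.0, seed=42)
print(real)
```

Changing `seed` gives another equally probable realization that honors the same conditioning data.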
The software by Deutsch (2006) performs an optional post-processing step based on the work of
Deutsch (1998). This step was defined to correct negative aspects related to facies transitions. A
technique called maximum a-posteriori selection (MAPS) cleans the image using a neighborhood
that passes through each grid node, where the most probable value replaces the value at each location.
The procedure preserves the original conditioning values and includes a correction so that the global
proportions do not change, creating a final image with smoother facies transitions.
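The idea of the cleaning step can be illustrated with a much simpler stand-in: a moving-window filter that assigns the most frequent category in the neighborhood while restoring the conditioning values. This is not the MAPS algorithm of Deutsch (1998); it omits the weighting and the correction of global proportions mentioned above, and every name in it is hypothetical.

```python
import numpy as np

def clean_realization(sim, K, cond_idx, half_window=1):
    """Replace each 1D node by the most frequent category within +/- half_window nodes."""
    sim = np.asarray(sim, dtype=int)
    out = sim.copy()
    n = sim.size
    for i in range(n):
        lo, hi = max(0, i - half_window), min(n, i + half_window + 1)
        counts = np.bincount(sim[lo:hi], minlength=K)   # category frequencies in window
        out[i] = np.argmax(counts)                      # ties go to the lowest code
    idx = np.asarray(cond_idx, dtype=int)
    out[idx] = sim[idx]                                 # conditioning values are preserved
    return out

noisy = np.array([0, 0, 1, 0, 0, 0, 1, 1, 1, 0, 1, 1])  # hypothetical realization, K = 2
cleaned = clean_realization(noisy, K=2, cond_idx=[2], half_window=1)
print(cleaned)
```

In this example the isolated 0 at node 9 is absorbed by its neighbors, while the isolated 1 at node 2 survives because it is a conditioning datum.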

4. Applications

There are several examples of the use of SIS. However, we will focus on some recent uses (Sojdehee,
2015; Zhou, 2018; Wang, 2020) for our discussion.
Sojdehee et al. (2015) use SIS in the Daralu copper deposit to assess the uncertainty in the
mineralized zones. They defined four categories according to the type of hydrothermal alteration.
The simulation applied four isotropic spherical variogram models, varying the nugget effect and
vertical range. They generated 100 realizations using the simple kriging option, chosen, according to
the authors, to avoid artifact-related effects when dealing with curved structures. Probability maps
obtained from the realizations allowed the definition of uncertainty in each mineralized zone for the
non-sampled areas, improving the understanding of the probability of finding the most (hypogene)
and least (leached) prevalent zones.
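Probability maps of this kind are obtained by counting, node by node, how often each category appears across the realizations. A minimal sketch with made-up realizations follows; the entropy function is one possible summary of local uncertainty added here for illustration and is not taken from the cited study.

```python
import numpy as np

def probability_maps(realizations, K):
    """(n_real, n_nodes) categories in 0..K-1 -> (K, n_nodes) probabilities."""
    r = np.asarray(realizations)
    return np.stack([(r == k).mean(axis=0) for k in range(K)])

def entropy(prob):
    """Shannon entropy per node; 0 where one category is certain."""
    with np.errstate(divide="ignore", invalid="ignore"):
        terms = np.where(prob > 0, prob * np.log(prob), 0.0)
    return -terms.sum(axis=0)

# Four hypothetical realizations on a 3-node grid, K = 2 categories
reals = np.array([[0, 1, 1],
                  [0, 1, 0],
                  [0, 0, 1],
                  [0, 1, 1]])
prob = probability_maps(reals, K=2)
unc = entropy(prob)
print(prob)
print(unc)
```

Node 0 is assigned category 0 in every realization, so its probability is 1 and its entropy is 0; the other two nodes carry a 0.25/0.75 split.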
In an application related to soil contamination, Wang et al. (2020) use SIS to determine the sources
and analyze the contamination of potentially toxic elements (PTEs) in soil (As, Cd, Co, Cr, Cu, Hg,
Mn, Ni, Pb, and Zn) in central Yiyuan county, Shandong Province. In this study, four classes defined
from percentiles (20, 40, 60, and 80) of PTE concentrations are simulated, resulting in one thousand
realizations, using spherical and exponential variograms adjusted for each class of the 10 PTEs. With
the results, it was possible to define probability maps and analyze the uncertainty. The spatial
correlation between elements established the possible sources of contamination for each set,
identifying three sources: natural, anthropogenic due to coal burning, and anthropogenic related
to the use of fertilizers. It was also possible to delimit a dangerous area related to Cd and Hg
contamination above the safety limits.
Zhou et al. (2018) present a comparison between facies modeling algorithms. They evaluate three
algorithms, multipoint statistics (MPS), object modeling, and SIS, using an image of the Amazon
River, in Brazil, to create a reference model that serves as a basis for evaluating the accuracy of
the models and their capacity to reproduce the continuity observed in the original image. SIS used an
exponential variogram with a range of 16 km and an anisotropy of approximately 1:2, oriented at 45°,
the direction of higher continuity of the fluvial channels. For object modeling, the authors defined
channels with geometries and orientations consistent with the reference image. In MPS, a version of
the reference image was used as a training image. Comparing the results to the reference image, the
conclusion is that MPS and SIS present similar accuracy
percentages, higher than object modeling. SIS, however, demonstrated poor ability to reproduce
the continuity and geometry observed in the fluvial channels in the real image. In this regard, MPS
and object modeling fared better. Moreover, as noted by the authors, this can make a big difference
in reservoir flow modeling.

5. Conclusion

As discussed earlier, the approach adopted must consider the problem we want to solve, starting with
whether we should opt for facies modeling. This is the case, for example, in situations where
non-stationarity is relevant, especially when we are interested in representing different geological
concepts. Another critical decision is the
option to adopt simulation instead of estimation. When variability affects the result, or we want to
assess the uncertainty behind our predictions, simulation becomes necessary. SIS can be a viable
option for these cases.
Despite being a consolidated and widely used technique, there are several criticisms of its use. One
corresponds to conceptual problems of the indicator formalism and difficulties linked to the
sequential paradigm (Emery, 2004). Another criticism is the algorithm's low capacity to generate
geologically consistent images. Deutsch (2006) mentioned that the models sometimes present
irregularities and low structuring. These criticisms relate to the conclusions observed by Zhou et al.
(2018) and the low capacity of the SIS to reproduce fluvial channels.
The conclusion is that SIS is a valid option for categorical simulation, even considering the
criticisms raised. It is a relatively simple solution, especially compared to multipoint statistics,
with parameters that are easy to infer from few data. Furthermore, it may be a good choice in
situations where geometries are not well defined, such as in diagenetic models (Pyrcz and Deutsch,
2014).

6. References

Alabert, F. Stochastic imaging of spatial distributions using hard and soft information (Master’s
thesis). Stanford University, Stanford, CA. 1987, 197 p.

Deutsch, Clayton V., and Andre G. Journel. "Geostatistical software library and user’s guide." New
York 119.147 (1992).

Deutsch, Clayton V. "Cleaning categorical variable (lithofacies) realizations with maximum
a-posteriori selection." Computers & Geosciences 24.6 (1998): 551-562.

Deutsch, Clayton V. "A sequential indicator simulation program for categorical variables with
point and block data: BlockSIS." Computers & Geosciences 32.10 (2006): 1669-1681.

Emery, Xavier. "Properties and limitations of sequential indicator simulation." Stochastic
Environmental Research and Risk Assessment 18.6 (2004): 414-424.

Gómez-Hernández, J. Jaime, and R. Mohan Srivastava. "ISIM3D: An ANSI-C three-dimensional
multiple indicator conditional simulation program." Computers & Geosciences 16.4 (1990): 395-440.

Goovaerts, P. "Comparative performance of indicator algorithms for modeling conditional
probability distribution functions." Mathematical Geology 26.3 (1994): 389-411.

Goovaerts, Pierre. "Kriging vs stochastic simulation for risk analysis in soil contamination."
geoENV I—Geostatistics for Environmental Applications. Springer, Dordrecht, 1997. 247-258.

Journel AG. Geostatistics for conditional simulation of ore bodies. Economic Geology. 1974 Aug
1;69(5):673-87.

Journel AG. Nonparametric estimation of spatial distributions. Journal of the International
Association for Mathematical Geology. 1983 Jun;15(3):445-68.

Pyrcz, Michael J., and Clayton V. Deutsch. Geostatistical reservoir modeling. Oxford university
press, 2014.

Sojdehee, Mona, et al. "Probabilistic modeling of mineralized zones in Daralu copper deposit (SE
Iran) using sequential indicator simulation." Arabian Journal of Geosciences 8.10 (2015): 8449-
8459.

Vincent J, Deutsch C. Local Variations in Multiple Indicator Kriging Class Means.

Wang, Yameng, et al. "Identifying quantitative sources and spatial distributions of potentially toxic
elements in soils by using three receptor models and sequential indicator simulation." Chemosphere
242 (2020): 125266.

Zhou, Fengde, et al. "Comparison of sequential indicator simulation, object modelling and
multiple-point statistics in reproducing channel geometries and continuity in 2D with two different
spaced conditional datasets." Journal of Petroleum Science and Engineering 166 (2018): 718-730.
