Geostatistics for Environmental and Geotechnical Applications (ASTM Special Technical Publication STP)
STP 1283
Geostatistics for
Environmental and
Geotechnical Applications
ASTM
100 Barr Harbor Drive
West Conshohocken, PA 19428-2959
Copyright © 1996 AMERICAN SOCIETY FOR TESTING AND MATERIALS, West Conshohocken,
PA. All rights reserved. This material may not be reproduced or copied, in whole or in part, in any
printed, mechanical, electronic, film, or other distribution and storage media, without the written
consent of the publisher.
Photocopy Rights
Authorization to photocopy items for internal or personal use, or the internal or personal
use of specific clients, is granted by the AMERICAN SOCIETY FOR TESTING AND MATERIALS
for users registered with the Copyright Clearance Center (CCC) Transactional Reporting
Service, provided that the base fee of $2.50 per copy, plus $0.50 per page is paid directly to
CCC, 222 Rosewood Dr., Danvers, MA 01923; Phone: (508) 750-8400; Fax: (508) 750-4744. For
those organizations that have been granted a photocopy license by CCC, a separate system of
payment has been arranged. The fee code for users of the Transactional Reporting Service is
0-8031-2414-7/96 $2.50 + .50
Overview Papers
Geostatistics for Environmental and Geotechnical Applications: A Technology Transfer

Marc V. Cromer 1
ABSTRACT: Although successfully applied during the past few decades for predicting the
spatial occurrences of properties that are cloaked from direct observation, geostatistical
methods remain somewhat of a mystery to practitioners in the environmental and
geotechnical fields. The techniques are powerful analytical tools that integrate numerical and
statistical methods with scientific intuition and professional judgment to resolve conflicts
between conceptual interpretation and direct measurement. This paper examines the
practicality of these techniques within the entitled fields of study and concludes by introducing
a practical case study in which the geostatistical approach is thoroughly executed.
INTRODUCTION
1 Principal Investigator, Sandia National Laboratories/Spectra Research Institute, MS 1324. P.O. Box 5800,
Albuquerque, NM 87185-1342
4 GEOSTATISTICAL APPLICATIONS
IT'S GEOSTATISTICS
The field of statistics is generally devoted to the analysis and interpretation of uncertainty
caused by limited sampling of a property under study. Geostatistical approaches deviate
from more "classical" methods in statistical data analyses in that they are not wholly tied to a
population distribution model that assumes samples to be normally distributed and
uncorrelated. Most earth science data sets, in fact, do not satisfy these assumptions as they
often tend to have highly skewed distributions and spatially correlated samples. Whereas
classical statistical approaches are concerned only with examining the statistical distribution
of sample data, geostatistics incorporates interpretation of both the statistical distribution
of the data and the spatial relationships (correlation) between the sample data. Because of these
differences, environmental and geotechnical problems are more effectively addressed using
geostatistical methods when interpretations derived from the spatial distribution of data have
an impact on decision-making risk.
Geostatistical methods provide the tools to capture, through rigorous examination, the
descriptive information on a phenomenon from sparse, often biased, and often expensive
sample data. The continued examination and quantitative rigor of the procedure provide a
vehicle for integrating qualitative and quantitative understanding by allowing the data to
"speak for themselves." In effect, the process produces the most plausible interpretation by
continued examination of the data in response to conflicting interpretations.
With environmental restoration projects, the information collected during the remedial
investigation is the sole basis for evaluating the applicability of various remedial strategies,
yet this information is often incomplete. Incomplete information translates to uncertainty in
bounding the problem and increases the risk of regulatory failure. While this type of
uncertainty can often be reduced with additional sampling, these benefits must be balanced
with increasing costs of characterization.
The probabilistic roots deeply entrenched into geostatistical theory offer a means to quantify
this uncertainty, while leveraging existing data in support of sampling optimization and risk-
based decision analyses. For example, a geostatistically-based, cost/risk/benefit approach to
sample optimization has been shown to provide a framework for examining the many trade-
offs encountered when juggling the risks associated with remedial investigation, remedial
CROMER ON A TECHNOLOGY TRANSFERRED 5
design, and regulatory compliance (Rautman et al., 1994). An approach such as this
explicitly recognizes the value of information provided by the remedial investigation, in that
additional measurements are only valuable to the extent that the information they provide
reduces total cost.
GEOSTATISTICAL PREDICTION
The ultimate goal of geostatistical examination and interpretation, in the context of risk
assessment, is to provide a prediction of the probable or possible spatial distribution of the
property under study. This prediction most commonly takes the form of a map or series of
maps showing the magnitude and/or distribution of the property within the study. There are
two basic forms of geostatistical prediction, estimation and simulation. In estimation, a
single, statistically "best" estimate of the spatial occurrence of the property is produced based
on the sample data and on the model determined to most accurately represent the spatial
correlation of the sample data. This single estimate (map) is produced by the geostatistical
technique commonly referred to as kriging.
With simulation, many equally-likely, high-resolution images of the property distribution can
be produced using the same model of spatial correlation as developed for kriging. The
images have a realistic texture that mimics an exhaustive characterization, while maintaining
the overall statistical character of the sample data. Differences between the many alternative
images (models) provide a measure of joint spatial uncertainty that allows one to resolve
risk-based questions ... an option not available with estimation. Like estimation, simulation
can be accomplished using a variety of techniques and the development of alternative
simulation methods is currently an area of active research.
Despite successful application during the past few decades, geostatistical methods remain
somewhat of a mystery to practitioners in the geotechnical and environmental fields. The
theoretical complexity and effort required to produce the intermediate analysis tools needed
to complete a geostatistical study has often deterred the novice from this approach.
Unfortunately, to many earth scientists, geostatistics is considered to be a "black box."
Although this is far from the truth, such perceptions are often the Achilles' heel of
mathematical and numerical procedures that could otherwise harness data to yield their true
worth, because these procedures require a commitment of time and training before the
practitioner develops some baseline proficiency.
Geostatistics is not a solution, only a tool. It cannot produce good results from bad data, but
it will allow one to maximize that information. Geostatistics cannot replace common sense,
good judgment, or professional insight; in fact, it demands that these skills be brought to bear.
The procedures often take one down a blind alley, only to force a redirection because of an
earlier misinterpretation. While these exercises are nothing more than cycling through the
scientific method, they are often more than the novice is willing to commit to. The time and
frustration associated with continually rubbing one's nose in the details of the data must also
be weighed against the risks to the decision maker. Given the tremendous level of financial
resources being committed to field investigation, data collection, and information management
to provide decision-making power, such exercises appear warranted.
This introductory paper attempts only a gross overview of geostatistical concepts, with some
hints at practical application of these tools within the entitled fields of scientific study.
Although geostatistics has been practiced for several decades, it has also evolved both
practically and theoretically with the advent of faster, more powerful computers. During this
time a number of practical methods and various algorithms have been developed and tested;
many still have merit and are practiced, but many have been left behind in favor of
promising research developments. Some of the concepts touched upon here will come to
better light in the context of the practical examination addressed in the following suite of
three overview papers provided by Srivastava (1996), Rouhani (1996), and Desbarats (1996).
In this case study, a hypothetical database has been developed that represents sampling of
two contaminants of concern: lead and arsenic. Both contaminants have been exhaustively
characterized as a baseline for comparison as shown in Figures 1 and 2. The example
scenario proposes a remedial action threshold (performance measure) of 500 ppm for lead
and 30 ppm for arsenic for the particular remediation unit or "VSR" (as discussed by
Desbarats, 1996). Examination of the exhaustive sample histograms and univariate statistics
in Figures 1 and 2 indicates that about one fifth of the area is contaminated with lead and one
quarter with arsenic.
The two exhaustive databases have been sampled in two phases, the first of which was on a
pseudo-regular grid (square symbols in Figure 3) at roughly a separation distance of 50 m. In
this first phase, only lead was analyzed. In the second sampling phase, each first-phase
sample location determined to have a lead concentration exceeding the threshold was targeted
with eight additional samples (circle symbols of Figure 3) to delineate the direction of
propagation of the contaminant. To mimic a problem often encountered in an actual field
investigation, during the second phase of sampling arsenic contamination was detected and
subsequently included in the characterization process. Arsenic concentrations are posted in
Figure 4 with accompanying sample statistics. The second phase samples, therefore, all have
recorded values for both arsenic and lead.
Correlation between lead and arsenic is explored by examining the co-located exhaustive data
which are plotted in Figure 5. This comparison indicates moderately good correlation
between the two constituents with a correlation coefficient of 0.66, as compared to the
slightly higher correlation coefficient of 0.70 derived from the co-located sample data plotted
in Figure 6.
There are a total of 77 samples from the first phase of sampling and 135 from the second
phase. The second sampling phase, though, has been biased because of its focus on "hot-spot"
delineation.

[Figure (exhaustive As histogram): 7700 samples; = 0 ppm: 1501 (19%); > 30 ppm: 1851 (24%);
minimum 0 ppm; lower quartile 1 ppm; median 6 ppm; upper quartile 29 ppm; maximum 550 ppm;
mean 22 ppm; standard deviation 35 ppm; x-axis As (ppm).]

FIGURE 3: SAMPLE PB DATA

[Figure (sample As histogram): 135 samples; = 0 ppm: 12 (9%); > 30 ppm: 51 (38%);
minimum 0 ppm; lower quartile 6 ppm; median 21 ppm; upper quartile 50 ppm; maximum 157 ppm;
mean 33 ppm; standard deviation 36 ppm; x-axis As (ppm).]

[Figure (scatterplot): co-located As (ppm) versus Pb (ppm) data.]

This poses some difficult questions/problems from the perspective of
spatial data analysis: What data are truly representative of the entire site and should be used
for variography or for developing distributional models? What data are redundant or create
bias? Have we characterized arsenic contamination adequately? These questions are
frequently encountered, especially in the initial phases of a project that has not exercised
careful pre-planning. The co-located undersampling of arsenic presents an interesting twist
to a hypothetical, yet realistic, problem from which we can explore the paths traveled by the
geostatistician.
REFERENCES
Rautman, C.A., M.A. McGraw, J.D. Istok, J.M. Sigda, and P.G. Kaplan, "Probabilistic
Comparison of Alternative Characterization Technologies at the Fernald Uranium-In-
Soils Integrated Demonstration Project", Vol. 3, Technology and Programs for
Radioactive Waste Management and Environmental Restoration, proceedings of the
Symposium on Waste Management, Tucson, AZ, 1994.
Rautman, C.A. and M.V. Cromer, 1994, "Three-Dimensional Rock Characteristics Models
Study Plan: Yucca Mountain Site Characterization Plan SP 8.3.1.4.3.2", U.S.
Department of Energy, Office of Civilian Radioactive Waste Management,
Washington, DC.
Ryti, R., "Superfund Soil Cleanup: Developing the Piazza Road Remedial Design," Journal of
the Air and Waste Management Association, Vol. 43, February 1993.
INTRODUCTION
Unlike most classical statistical studies, in which samples are commonly assumed to be
statistically independent, environmental and geotechnical studies involve data that are
not statistically independent. Whether we are studying contaminant concentrations in
soil, rock and fluid properties in an aquifer, or the physical and mechanical properties
of soil, data values from locations that are close together tend to be more similar than
data values from locations that are far apart. To most geologists, the fact that closely
1 Manager, FSS Canada Consultants, 800 Millbank, Vancouver, BC, Canada V5V 3K8
spaced samples tend to be similar is hardly surprising since samples from closely spaced
locations have been influenced by similar physical and chemical processes.
This overview paper addresses the description and analysis of spatial dependence in
geostatistical studies, the interpretation of the results and the development of a math-
ematical model that can be used in spatial estimation and simulation. More specific
guidance on the details of analysis, interpretation and modelling of spatial variation can
be found in the ASTM draft standard guide entitled Standard Guide for Analysis of
Spatial Variation in Geostatistical Site Investigations.
[Figure 1: sample variogram, rising to a plateau near 60 000; Figure 2: sample correlogram,
falling toward zero; x-axes: separation distance (in m).]
As can be seen by the examples in Figures 1 and 2, the variogram and the correlo-
gram are, in an approximate sense, mirror images. As the variogram gradually rises and
reaches a plateau, the correlogram gradually drops and also reaches a plateau. They are
not exactly mirror images of one another, however, and a geostatistical study of spatial
continuity often involves both types of plots. There are other tools that geostatisticians
use to describe spatial continuity, but they all fall into two broad categories: measures
of dissimilarity and measures of similarity. The measures of dissimilarity record how
different the data values are as a function of separation distance and tend to rise like the
variogram. The measures of similarity record how similar the data values are as a
function of separation distance and tend to fall like the correlogram.
Sill: The plateau that the variogram reaches; for the traditional definition of the vari-
ogram - the average squared difference between paired data values - the sill is
approximately equal to twice the variance of the data. 3
3The "semivariogram", which is simply the variogram divided by two, has a sill that is approximately
equal to the variance of the data.
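Both kinds of measure can be computed with a short sketch. The snippet below is an illustration only (it is not code from this STP; the function name and the synthetic moving-average data are invented for the example): for each lag it pairs up samples separated by roughly that distance, then computes the variogram as defined above (average squared difference) and the correlogram (correlation between paired values).

```python
import numpy as np

def variogram_and_correlogram(x, z, lags, tol=0.5):
    """For each lag h, pair samples separated by about h and compute:
    - variogram: average squared difference of paired values (dissimilarity)
    - correlogram: correlation coefficient of paired values (similarity)."""
    d = np.abs(x[:, None] - x[None, :])        # pairwise separation distances
    gamma, rho = [], []
    for h in lags:
        i, j = np.where((d >= h - tol) & (d <= h + tol))
        keep = i < j                           # count each pair only once
        zi, zj = z[i[keep]], z[j[keep]]
        gamma.append(np.mean((zi - zj) ** 2))  # "variogram" as defined in the text
        rho.append(np.corrcoef(zi, zj)[0, 1])
    return np.array(gamma), np.array(rho)

# Synthetic spatially correlated profile: a moving average of white noise
rng = np.random.default_rng(0)
x = np.arange(400.0)                           # sample locations along a line
z = np.convolve(rng.normal(size=409), np.ones(10) / 10.0, mode="valid")

gamma, rho = variogram_and_correlogram(x, z, lags=[1, 5, 10, 20, 40])
# gamma rises toward its sill with distance, while rho falls toward zero
```

On data like these, the two curves behave as the text describes: the variogram values rise with separation distance and the correlogram values fall.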
Range: The distance at which the variogram reaches the sill; this is often thought of
as the "range of influence" or the "range of correlation" of data values. Up to
the range, a sample will have some correlation with the unsampled values nearby.
Beyond the range, a sample is no longer correlated with other values.
Nugget Effect: The vertical height of the discontinuity at the origin. For a separation
distance of zero (i.e. samples that are at exactly the same location), the average
squared differences are zero. In practice, however, the variogram does not converge
to zero as the separation distance gets smaller. The nugget effect is a combination
of:
• short-scale variations that occur at a scale smaller than the closest sample
spacing
• sampling error due to the way that samples are collected, prepared and ana-
lyzed
[Figure: a variogram annotated with its range, sill, and nugget effect; x-axis separation
distance (in m).]
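These three characteristics map directly onto the parameters of the common variogram model functions. As a sketch (the spherical model is a standard choice in the geostatistical literature, though this paper does not single out any model; the parameter values below are illustrative only), a spherical variogram with a nugget effect can be evaluated as:

```python
import numpy as np

def spherical_variogram(h, nugget, sill, range_):
    """Spherical variogram model: exactly zero at h = 0, jumps to the nugget
    just above the origin, rises to the sill at h = range_, flat beyond."""
    h = np.asarray(h, dtype=float)
    r = np.clip(h / range_, 0.0, 1.0)
    g = nugget + (sill - nugget) * (1.5 * r - 0.5 * r ** 3)
    return np.where(h == 0.0, 0.0, g)

# Illustrative parameters: nugget 5000, sill 60000, range 80 m
model = spherical_variogram(np.array([0.0, 0.1, 40.0, 80.0, 120.0]),
                            nugget=5000.0, sill=60000.0, range_=80.0)
# model[0] is 0; model[1] sits just above the nugget; model[3:] sit at the sill
```

The discontinuity at the origin reproduces the nugget effect described above: the model is zero at exactly h = 0 but near the nugget value for any small positive separation.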
Of the three characteristics commonly used to summarize the variogram, it is the range
and the nugget effect that are most directly linked to our intuitive sense of whether the
phenomenon under study is "continuous" or "erratic". Phenomena whose variograms
have a long range of correlation and a low nugget effect are those that we think of as
"well behaved" or "spatially continuous"; attributes such as hydrostatic head, thickness
of a soil layer and topographic elevation typically have long ranges and low nugget
effects. Phenomena whose variograms have a short range of correlation and a high nugget
effect are those that we think of as erratic.
SRIVASTAVA ON SPATIAL VARIABILITY 17
Figure 4. Lead and arsenic variograms for the sample data described by Cromer (1996).
In many earth science data sets, the pattern of spatial variation is directionally dependent.
In terms of the variogram, the range of correlation often depends on direction.
Using the example presented earlier in this volume by Cromer, the lead values appear to
be more continuous in the NW-SE direction than in the NE-SW direction. Geostatisti-
cal studies typically involve the calculation of separate variograms and correlograms for
different directions. Figure 5 shows directional variograms for the sample lead data pre-
sented by Cromer. The range of correlation shown by the NW-SE variogram (Figure 5a)
is roughly 80 meters, but only 35 meters on the NE-SW variogram (Figure 5b). This
longer range on the NW-SE variogram provides quantitative support for the observation
that the lead values are, indeed, more continuous in this direction and more erratic in
the perpendicular direction.
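A directional variogram is computed exactly like an omnidirectional one, except that only pairs whose separation vector lies within an angular tolerance of the chosen azimuth are retained. The sketch below is an illustration only (not code from this paper; the synthetic field and names are invented): it builds a field that is perfectly continuous along NW-SE and confirms that the NW-SE variogram value is far lower than the NE-SW value.

```python
import numpy as np

def directional_variogram(xy, z, azimuth_deg, lag, lag_tol=0.3, ang_tol=5.0):
    """Average squared difference for pairs separated by ~lag along the given
    azimuth (degrees clockwise from north), within an angular tolerance."""
    dx = xy[:, 0][:, None] - xy[:, 0][None, :]   # easting differences
    dy = xy[:, 1][:, None] - xy[:, 1][None, :]   # northing differences
    dist = np.hypot(dx, dy)
    az = np.degrees(np.arctan2(dx, dy)) % 180.0  # pair azimuth, folded to 0-180
    diff = np.abs(az - (azimuth_deg % 180.0))
    dang = np.minimum(diff, 180.0 - diff)        # angular distance to azimuth
    i, j = np.where((np.abs(dist - lag) <= lag_tol) & (dang <= ang_tol))
    keep = i < j                                  # count each pair only once
    return np.mean((z[i[keep]] - z[j[keep]]) ** 2)

# Synthetic field that depends only on x + y, so it is constant along
# lines of constant x + y, i.e. perfectly continuous in the NW-SE direction
gx, gy = np.meshgrid(np.arange(30.0), np.arange(30.0))
xy = np.column_stack([gx.ravel(), gy.ravel()])
z = np.sin(0.3 * (xy[:, 0] + xy[:, 1]))

g_nw_se = directional_variogram(xy, z, azimuth_deg=135.0, lag=np.sqrt(8.0))
g_ne_sw = directional_variogram(xy, z, azimuth_deg=45.0, lag=np.sqrt(8.0))
# g_nw_se is ~0 (continuous direction); g_ne_sw is much larger (erratic direction)
```

The contrast between the two values is the quantitative signature of anisotropy described in the text: a lower variogram value (longer apparent range) in the direction of greater continuity.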
REFERENCES
ASTM, Standard Guide for Analysis of Spatial Variation in Geostatistical Site Investi-
gations, 1996, Draft standard from D18.01.07 Section on Geostatistics.
Cromer, M.V., 1996, "Geostatistics for Environmental and Geotechnical Applications:
A Technology Transfer," Geostatistics for Environmental and Geotechnical Appli-
cations, ASTM STP 1283, R. Mohan Srivastava, Shahrokh Rouhani, Marc V. Cro-
mer, A. Ivan Johnson, Ed., American Society for Testing and Materials, West
Conshohocken, PA.
Deutsch, C.V. and Journel, A.G., 1992, GSLIB: Geostatistical Software Library and
User's Guide, Oxford University Press, New York, 340 p.
Isaaks, E.H. and Srivastava, R.M., 1989, An Introduction to Applied Geostatistics,
Oxford University Press, New York, 561 p.
Journel, A.G. and Huijbregts, C., 1978, Mining Geostatistics, Academic Press, London,
600p.
Rouhani, S., 1996, "Geostatistical Estimation: Kriging," Geostatistics for Environmen-
tal and Geotechnical Applications, ASTM STP 1283, R. Mohan Srivastava, Shah-
rokh Rouhani, Marc V. Cromer, A. Ivan Johnson, Ed., American Society for Test-
ing and Materials, West Conshohocken, PA.
Srivastava, R.M. and Parker, H.M., 1988, "Robust measures of spatial continuity,"
Geostatistics, M. Armstrong (ed.), Reidel, Dordrecht, p. 295-308.
Geostatistical Estimation: Kriging

Shahrokh Rouhani 1
ABSTRACT: Geostatistics offers a variety of spatial estimation procedures which are known as
kriging. These techniques are commonly used for interpolation of point values at unsampled
locations and estimation of average block values. Kriging techniques provide a measure of
accuracy in the form of an estimation variance. These estimates are dependent on the model of
spatial variability and the relative geometry of measured and estimated locations. Ordinary
kriging is a linear minimum-variance interpolator that assumes a constant, but unknown global
mean. Other forms of linear kriging include simple and universal kriging, as well as co-kriging.
If measured data display non-Gaussian tendencies, more accurate interpolation may be obtained
through non-linear kriging techniques, such as lognormal and indicator kriging.
ROUHANI ON KRIGING 21
well-defined statistical conditions, and thus, are superior to subjective interpolation techniques.
Furthermore, the automatic declustering of data by kriging makes it a suitable technique to
process typical environmental and geotechnical measurements.
Kriging also yields a measure for the accuracy of its interpolated values in the form of
estimation variances. These variances have been used in the design of sampling plans because of
two factors: (1) each estimate comes with an estimation variance, and (2) the estimation variance
does not depend on the individual observations (Loaiciga et al., 1992). Therefore, the impact of
a new sampling location can be evaluated before any new measurements are actually conducted
(Rouhani, 1985). Rouhani and Hall (1988), however, noted that in most field cases the use of
estimation variance, alone, is not sufficient to expand a sampling plan. Such plans usually
require consideration of many factors in addition to the estimation variance.
To use the estimation variance as a basis for sampling design, additional assumptions
must be made about the probability density function of the estimation error. A common practice
is to assume that, at any location in the sampling area, the errors are normally distributed with a
mean of zero and a standard deviation equal to the square root of the estimation variance,
referred to as the kriging standard deviation. The normal distribution of the errors has been
supported by practical evidence (Journel and Huijbregts, 1978, pp. 50 and 60).
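Under this normality assumption, a kriging estimate and its kriging standard deviation can be converted directly into the probability of exceeding a regulatory threshold. A minimal sketch (the function name and the numbers are illustrative, not from this paper):

```python
import math

def exceedance_probability(estimate, krig_sd, threshold):
    """P(true value > threshold), assuming the estimation error is normally
    distributed with mean zero and standard deviation krig_sd."""
    zscore = (threshold - estimate) / krig_sd
    # standard normal survival function, written via the error function
    return 0.5 * (1.0 - math.erf(zscore / math.sqrt(2.0)))

# e.g., a 400 ppm lead estimate with a 100 ppm kriging standard deviation
# has roughly a 16% chance of actually exceeding a 500 ppm threshold
p = exceedance_probability(400.0, 100.0, 500.0)
```

Maps of such probabilities are one common way that kriging variances feed risk-based decisions, as discussed above.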
Ordinary Kriging
Among geostatistical estimation methods, ordinary kriging is the most widely used in
practice. This procedure produces minimum-variance estimates by taking into account: (1) the
distance vector between the estimated point and the data points; (2) the distance vector between
data points themselves; and (3) the statistical structure of the variable. This structure is
represented by either the variogram, the covariance or the correlogram function. Ordinary
kriging is also capable of processing data averaged over different volumes and sizes.
Ordinary kriging is a "linear" estimator. This means that its estimate, Z*, is computed as
a weighted sum of the nearby measured values, denoted as Z1, Z2, ..., Zn. The form of the
estimation is

    Z* = Σ (i = 1 to n) λi Zi                                            (1)

where the λi's are the estimation weights. Z* can either represent a point or a block-averaged
value, as shown in Fig. 1. Point kriging provides the interpolated value at an unsampled
location. Block kriging yields an areal or a volumetric average over a given domain.
The kriging weights, λi, are chosen so as to satisfy two suitable statistical conditions.
These conditions are:
(1) Non-bias condition: This condition requires that the estimator Z* be free of any
systematic error, which translates into
Fig. 1. Example of Spatial Estimation: (a) Point Kriging; (b) Block Kriging.
[Fig. 2: the exhaustive (simulated) soil lead data, in ppm.]
    Σ (i = 1 to n) λi = 1                                                (2)

(2) Minimum-variance condition: This requires that the estimator Z* have minimum
variance of estimation. The estimation variance of Z*, σ², is defined as

    σ² = 2 Σ (i = 1 to n) λi γi0  -  Σ (i = 1 to n) Σ (j = 1 to n) λi λj γij  -  γ00        (3)

where γi0 is the variogram between the i-th measured point and the estimated location, γij
is the variogram between the i-th and j-th measured points, and γ00 is the variogram value
at the estimated location itself (zero for point estimation).
The kriging weights are computed by minimizing the estimation variance (Eq. 3) subject
to the non-bias condition (Eq. 2). The computed weights are then used to calculate the
interpolated value (Eq. 1). As Delhomme (1978) notes: "the kriging weights are tailored to the
variability of the phenomenon. With regular variables, kriging gives higher weights to the
closest data points, precisely since continuity means that two points close to each other have
similar values. When the phenomenon is irregular, this does not hold true and the weights given
to the closest data points are dampened." Such flexibility does not exist in methods, such as
distance weighting, where the weights are pre-defined as functions of the distance between the
estimated point and the data point.
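The weight computation described above amounts to solving a small linear system: minimizing Eq. 3 subject to Eq. 2 with a Lagrange multiplier μ. The sketch below is an illustration, not code from this paper; it assumes a semivariogram convention (γ(0) = 0) and an invented exponential model, with synthetic coordinates and values:

```python
import numpy as np

def ordinary_kriging(coords, values, target, gamma):
    """Point ordinary kriging. gamma(h) is a semivariogram with gamma(0) = 0.
    Solves the OK system (Eq. 3 minimized subject to Eq. 2) with a Lagrange
    multiplier mu; returns the estimate, kriging variance, and weights."""
    n = len(values)
    h = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = gamma(h)          # data-to-data semivariogram terms
    A[n, n] = 0.0                 # Lagrange row/column
    b = np.ones(n + 1)
    b[:n] = gamma(np.linalg.norm(coords - target, axis=1))  # data-to-target
    sol = np.linalg.solve(A, b)
    lam, mu = sol[:n], sol[n]
    # kriging variance = sum(lam_i * gamma_i0) + mu
    return lam @ values, lam @ b[:n] + mu, lam

gamma = lambda h: 1.0 - np.exp(-np.asarray(h) / 30.0)   # assumed model
rng = np.random.default_rng(1)
coords = rng.uniform(0.0, 100.0, size=(12, 2))           # sample locations
values = rng.uniform(0.0, 500.0, size=12)                # e.g. Pb in ppm

est, var, lam = ordinary_kriging(coords, values, coords[0], gamma)
# kriging is exact: at a data location it returns that datum with ~zero variance
```

Note that the weights always sum to one (the non-bias condition of Eq. 2), and that they adapt to the variogram model rather than being a fixed function of distance, which is exactly the flexibility Delhomme describes.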
As noted in Cromer (1996), a soil lead field is simulated as a case study as shown in Fig.
2. The measured values are collected from this simulated field. Similar to most environmental
investigations, the sampling activities are conducted in two phases. During the first phase a
pseudo-regular grid of 50x50 m is used for soil sampling. In the second phase, locations with
elevated lead concentrations are targeted for additional irregular sampling, as indicated in Fig. 3.
The analysis of the spatial variability of the simulated field is presented in the previous
paper (Srivastava, 1996). Using this information, ordinary kriging is conducted. Fig. 4 displays
the kriging results of point estimations. The comparison of the original simulated field (Fig. 2)
and the kriged map (Fig. 4) shows that the kriged map captures the main spatial features of lead
contamination. This comparison, however, indicates a degree of smoothing in the kriged map
which is a consequence of the interpolation process. In cases where the preservation of the
spatial variability of the measured field is critical to the study objectives, then the use of kriging
for estimation alone is inappropriate and simulation methods are recommended (Desbarats,
1996).
Each kriged map is accompanied by its accuracy map. Fig. 5 displays the kriging
standard deviation map of the soil lead data. This latter map can be used to distinguish
between zones of high versus poor data coverage.

[Fig. 3: the two-phase soil lead sampling locations, in ppm.]

[Fig. 4: ordinary point kriging estimates of the soil lead field, in ppm.]
Block Kriging
In many instances, available measurements represent point or quasi-point values, but the
study requires the computation of the areal or volumetric value over a larger domain. For
instance, in environmental risk assessments, the desired concentration term should represent the
average contamination over an exposure domain. Depending on the computed average
concentration or its upper confidence limit, a block is declared impacted or not-impacted. This
shows that the decision is based on the estimated block value, and not its true value. So there is a
chance of making errors of two forms:
(1) Wrong Rejection: Certain blocks will be considered impacted, while their true average
concentration is below the target level, and
(2) Wrong Acceptance: Certain blocks will be considered not-impacted when their true
average concentrations are above the target level.
As shown in Journel and Huijbregts (1978, p. 459), the kriging block estimator, Z*, is the linear
estimator that minimizes the sum of the above two errors. Therefore, the block kriging
procedure is preferred to any other linear estimator for such selection problems.
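A block estimate can be obtained with the same machinery as point kriging by replacing the data-to-target variogram terms with averages over a discretization of the block, a standard device in the literature (e.g., Journel and Huijbregts, 1978). The sketch below is an illustration only, with an assumed semivariogram model and invented data, not code from this paper:

```python
import numpy as np

def block_kriging(coords, values, block_pts, gamma):
    """Ordinary block kriging, with the block represented by a set of
    discretization points. Data-to-block and within-block variogram terms
    are approximated by averages over those points; gamma(0) must be 0."""
    n = len(values)
    h = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = gamma(h)
    A[n, n] = 0.0
    hb = np.linalg.norm(coords[:, None, :] - block_pts[None, :, :], axis=-1)
    gbar = gamma(hb).mean(axis=1)            # average data-to-block variogram
    sol = np.linalg.solve(A, np.append(gbar, 1.0))
    lam, mu = sol[:n], sol[n]
    hvv = np.linalg.norm(block_pts[:, None, :] - block_pts[None, :, :], axis=-1)
    gvv = gamma(hvv).mean()                  # average within-block variogram
    # block kriging variance = sum(lam * gbar) + mu - gamma_bar(V, V)
    return lam @ values, lam @ gbar + mu - gvv

gamma = lambda h: 1.0 - np.exp(-np.asarray(h) / 30.0)    # assumed model
rng = np.random.default_rng(2)
coords = rng.uniform(0.0, 100.0, size=(10, 2))
values = rng.uniform(0.0, 500.0, size=10)
# 4 x 4 discretization of a 20 m x 20 m remediation block centered at (50, 50)
bx, by = np.meshgrid(np.linspace(42.5, 57.5, 4), np.linspace(42.5, 57.5, 4))
block_pts = np.column_stack([bx.ravel(), by.ravel()])

block_est, block_var = block_kriging(coords, values, block_pts, gamma)
```

Because the within-block average variogram is subtracted, the block kriging variance is smaller than the corresponding point kriging variance, reflecting the fact that an average over a block is easier to estimate than a single point value.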
Co-Kriging

Kriging can also be extended to incorporate a second, correlated variable. The co-kriging
estimator takes the form

    Z* = Σ (i = 1 to n) λi Zi  +  Σ (j = 1 to m) ωj Yj                   (4)

where Zi is the i-th measured value of the "primary" variable with a kriging weight of λi, and Yj is
the j-th "auxiliary" measured value with a kriging weight of ωj. Co-kriging is especially
advantageous in cases where the primary measurements are limited and expensive, while
auxiliary measurements are available at low cost. Ahmed and de Marsily (1987) enhanced their
limited transmissivity data based on pumping tests with the more abundant specific capacity data.
This resulted in an improved transmissivity map. The present STP provides examples of co-
kriging, such as Benson and Rashad (1996) and Wild and Rouhani (1996).
Non-linear Kriging

The above linear kriging techniques do not require any implicit assumptions about the
underlying distribution of the interpolated variable. If the investigated variable is multivariate
normal (Gaussian), then linear estimates have the minimum variance. In many cases where
the histogram of the measured values displays a skewed tendency, a simple transformation may
produce normally distributed values. After such a transformation, linear kriging may be used. If
the desired transformation is logarithmic, then the estimation process is referred to as lognormal
kriging. Although lognormal kriging can be applied to many field cases, its estimation process
requires back-transformation of the estimated values. These back-transformations are complicated
and must be performed with caution (e.g., Buxton, 1996).

Sometimes, the observed data clearly exhibit non-Gaussian characteristics, whose log-
transforms are also non-Gaussian. Examples of such data sets include cases of measurements
with multi-modal histograms, highly skewed histograms, or data sets with large numbers of
below-detection measurements. These cases have motivated the development of a set of
techniques to deal with non-Gaussian random functions. One of these methods is indicator
kriging. In this procedure, the original values are transformed into indicator values, such that
they are unity if the datum value is at or below a pre-defined cutoff level and zero if greater. The
estimated value by indicator kriging represents the probability of not-exceedance at a location.
This technique provides a simple, yet powerful procedure for generating probability maps
(Rouhani and Dillon, 1990).
:~~
:~i
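The indicator transform described above is straightforward to carry out in practice. The sketch below (the function name and sample values are illustrative, not from the text) uses the common convention in which the indicator is one at or below the cutoff, so that kriging the indicators estimates the probability of non-exceedance:

```python
import numpy as np

def indicator_transform(values, cutoff):
    """Indicator transform: 1 if the datum is at or below the cutoff,
    0 if above.  Kriging these indicators yields, at each unsampled
    location, an estimate of the probability of non-exceedance."""
    return (np.asarray(values, dtype=float) <= cutoff).astype(float)

# Hypothetical Pb measurements (ppm) against a 150 ppm cutoff
pb = [42.0, 310.0, 95.0, 151.0, 149.0]
print(indicator_transform(pb, 150.0))   # [1. 0. 1. 0. 1.]
```

A separate indicator transform is applied at each cutoff of interest, which is how the probability maps cited above are assembled.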
Recommended Sources
For more information on kriging, readers are referred to Journel and Huijbregts (1978),
de Marsily (1986), Isaaks and Srivastava (1989), and ASCE (1990). ASTM Standard D 5549,
titled "Standard Guide for Content of Geostatistical Site Investigations," provides
information on the various elements of a kriging report. ASTM D18.01.07 on Geostatistics has
also drafted a guide titled "Standard Guide for Selection of Kriging Methods in Geostatistical
Site Investigations." This guide provides recommendations for selecting appropriate kriging
methods based on study objectives and common situations encountered in geostatistical site
investigations.
30 GEOSTATISTICAL APPLICATIONS
References
ABSTRACT: This paper, the last in a four part introduction to geostatistics, de-
scribes the application of simulation to site investigation problems. Geostatistical
simulation is a method for generating digital representations or "maps" of a variable
that are consistent with its values at sampled locations and with its in situ spatial
variability, as characterized by histogram and variogram models. Continuing the syn-
thetic case study of the three previous papers, the reader is led through the steps of a
geostatistical simulation. The simulated fields are then compared with the exhaustive
data sets describing the synthetic site. Finally, it is shown how simulated fields can
be used to answer questions concerning alternative site remediation strategies.
INTRODUCTION
DESBARATS ON SPATIAL VARIABILITY
as inverse-distance weighting or, preferably, using one of the least-squares weighting
methods collectively known as kriging, discussed in Rouhani (this volume). Regardless
of the interpolation method that is selected, the result is a representation of our
variable in which its spatial variability has been smoothed compared to in situ reality.
Along with this map of estimated values, we can also produce a map of estimation
(or error) variances associated with the estimates at each unsampled location. This
map provides a qualitative or, at best, semi-quantitative measure of the degree of
uncertainty in our estimates and the corresponding level of smoothing we can expect.
Unfortunately, maps of estimated values, even when accompanied by maps of estima-
tion variances, are often an inadequate basis for decision-making in environmental or
geotechnical site investigations. This is because they fail to convey a realistic picture
of the uncertainty and the true spatial variability of the parameters that affect the
planning of remediation strategies or the design of engineered structures.
The alternative to estimation is simulation. Geostatistical simulation (Srivastava,
1994) is a Monte-Carlo procedure for generating outcomes of digital maps based on
the statistical models chosen to represent the probability distribution function and
the spatial variation structure of a regionalized variable. The simulated outcomes
can be further constrained to honor observed data values at sampled locations on the
map. Therefore, not only does geostatistical simulation allow us to produce a map
of our variable that more faithfully reproduces its true spatial variability, but we can
generate many equally probable alternative maps, each one consistent with our field
observations. A set of such alternative maps allows a more realistic assessment of the
uncertainty associated with sampling in heterogeneous geological media.
This paper presents an introduction to the geostatistical tool of simulation. Its
goals are to provide a basic understanding of the method and to illustrate how it
can be used in site investigation problems. To do this, we will continue the synthetic
soil contamination case study started in the three previous papers. We will proceed
step by step through the simulation study, pausing here and there to compare our
results with the underlying reality and the results of the kriging study (Rouhani, this
volume). Finally, we will use our simulated fields to answer some questions that can
arise in actual soil remediation studies.
STUDY OBJECTIVES
The objective of our simulation study is to generate digital images or maps of lead
(Pb ) and arsenic (As) concentrations in soil. We will then use these maps to de-
termine the proportion of the site area in which Pb or As concentrations exceed the
remediation thresholds of 150 ppm and 30 ppm, respectively. The maps are to repro-
duce the histograms and variograms of Pb and As in addition to observed measure-
ments at sampled locations. Although the full potential of the simulation method
is truly achieved only in sensitivity or risk analysis studies involving multiple out-
comes of the simulated maps, we will focus on the generation of a single outcome. In
many respects, even a single map of simulated concentrations is more useful than a
map of kriged values. This is because a realistic portrayal of in situ spatial variabil-
ity is often a sobering warning to planners whereas maps of kriged values are easily
HISTOGRAM MODELS
The first step in our simulation study is to decide what probability distribution func-
tions or, more prosaically, what histogram models are to be honored by our simulated
concentrations. We would like these histograms to be representative of the entire site.
Often, the raw histograms of sample data are the most appropriate choice. However,
here this isn't the case: The sampling of our contaminated site was carried out in
two stages. In the first stage, we obtained 77 measurements of Pb distributed on a
fairly regular grid. In the second stage, we focused our sampling on areas identified
in the first stage as having high Pb concentrations. Furthermore, by then we had
become aware that arsenic contamination was present and we analyzed an additional
135 samples for both Pb and As. Thus, our Pb data consist of 77 values that are
probably representative of the entire site area and another 135 values drawn from the
most contaminated region. As for arsenic, our 135 samples were obtained exclusively
from the most contaminated region and are probably not representative of the entire
site. The raw histograms of Pb and As shown in Cromer (this volume) reflect the
preferential or biased sampling procedure and do not provide adequate models for
our simulation.
The answer to this problem is to weight our sample data in such a way as to de-
crease the influence of clustered measurements while increasing that of more isolated
values. In geostatistics, this exercise is known as "declustering" and can be accom-
plished several ways (Isaaks and Srivastava, 1989; Deutsch and Journel, 1992). Here
we used a cell declustering scheme to find sample weights. This involved moving a
10 x 10 unit cell over N non-overlapping positions covering the study area. At each
cell position, the number n of samples within the cell was counted and each sample
was then assigned a relative weight of 1/(Nn). This procedure may be expected to work
well for Pb but for As there is no escaping the fact that our samples are restricted to
a few small, highly contaminated patches and are hardly representative of the site as
a whole. Obtaining a reasonably representative histogram is crucial for a simulation
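The cell declustering scheme just described can be sketched as follows; the function name and toy coordinates are illustrative, and the weights are normalized so that they sum to one:

```python
import numpy as np

def cell_decluster_weights(x, y, cell=10.0):
    """Cell declustering as described above: a cell x cell window covers
    the area in non-overlapping positions; a sample in a cell containing
    n samples receives a relative weight of 1/(N*n), where N is the
    number of occupied cells.  Clustered samples are thus down-weighted."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    ix = np.floor(x / cell).astype(int)          # cell indices
    iy = np.floor(y / cell).astype(int)
    cells = list(zip(ix, iy))
    counts = {c: cells.count(c) for c in set(cells)}
    n_occupied = len(counts)
    w = np.array([1.0 / (n_occupied * counts[c]) for c in cells])
    return w / w.sum()                           # normalize to sum to 1

# Two clustered points share a cell; the isolated point gets more weight
w = cell_decluster_weights([1.0, 2.0, 55.0], [1.0, 2.0, 55.0])
print(w)   # [0.25 0.25 0.5 ]
```

The declustered histogram is then built by weighting each sample value by its declustering weight rather than by 1/n.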
The next step of our study involves transforming our Pb and As sample values into
standard Normal deviates (Deutsch and Journel, 1992). This "normal-score" trans-
formation is required because the simulation algorithm we will be using is based on
the multivariate Normal (or Gaussian) distribution model and assumes that all
sample data are drawn from such a distribution. In simple terms, this transformation
is performed by replacing the value corresponding to a given quantile of the original
distribution with the value from a standard Normal distribution associated with the
same quantile. For example, a Pb value of 261 ppm corresponding to a quantile of
0.50 (i.e., the median) in the sample histogram is transformed into a value of 0 corre-
sponding to the median of a standard Normal distribution. In mathematical terms,
we seek the transformations Z1 and Z2 of Pb and As such that:

G(Z1) = F1(Pb)  and  G(Z2) = F2(As)    (1)

where G is the standard Normal cumulative distribution function and F1 and F2 are
the sample cumulative distribution functions of Pb and As, respectively.
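The normal-score transformation can be sketched as follows, assuming a simple rank/(n + 1) quantile convention to avoid infinite deviates; the function name and sample values are illustrative:

```python
import numpy as np
from statistics import NormalDist

def normal_score(values):
    """Normal-score transform: replace each datum with the standard
    Normal deviate at the same quantile, so the transformed data follow
    the standard Normal distribution G."""
    values = np.asarray(values, float)
    n = len(values)
    ranks = values.argsort().argsort() + 1       # ranks 1..n
    quantiles = ranks / (n + 1.0)                # rank/(n+1) convention
    g = NormalDist()                             # standard Normal
    return np.array([g.inv_cdf(q) for q in quantiles])

# The sample median maps to 0, the median of the standard Normal
z = normal_score([12.0, 261.0, 900.0])
print(z[1])   # 0.0
```

In practice the transform is tabulated so that it can be inverted later, when simulated deviates are mapped back to the data scale.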
Before we can proceed to the analysis of spatial variation, one last step is required.
Our simulation algorithm can only be used to generate fields of one variable at a time.
However, we wish to simulate two variables Z1 and Z2, reproducing not only their
respective spatial variation structures but also the relationship between them shown
in Figure 2. We must therefore "decouple" the variables Z1 and Z2 so that we can
simulate them independently. To do this, we use the following principal component
transformation, which yields the independent variables Y1 and Y2 from the correlated
variables Z1 and Z2:

Y1 = Z1  and  Y2 = (Z2 - r Z1) / (1 - r^2)^(1/2)    (2)

where r = 0.839 is the correlation coefficient between Z1 and Z2 shown in Figure 2.
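This decoupling can be sketched numerically as follows, assuming the transformation takes the form Y1 = Z1 and Y2 = (Z2 - r*Z1)/sqrt(1 - r^2), consistent with the statement Y1 = Z1 in the variogram section below; the function name and synthetic data are illustrative:

```python
import numpy as np

def decouple(z1, z2, r):
    """Decouple two correlated standard Normal variables: Y1 = Z1 and
    Y2 = (Z2 - r*Z1)/sqrt(1 - r**2), leaving Y1 and Y2 uncorrelated
    with unit variance."""
    z1, z2 = np.asarray(z1, float), np.asarray(z2, float)
    return z1, (z2 - r * z1) / np.sqrt(1.0 - r**2)

rng = np.random.default_rng(0)
r = 0.839                                     # correlation from Figure 2
z1 = rng.standard_normal(100_000)
z2 = r * z1 + np.sqrt(1 - r**2) * rng.standard_normal(100_000)
y1, y2 = decouple(z1, z2, r)
print(abs(np.corrcoef(y1, y2)[0, 1]) < 0.02)  # True: decorrelated
```

The transformation is linear, so it is trivially reversed after simulation to restore the correlation between the two fields.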
Figure 2: Scatter plot of Z1 and Z2 for 135 sample values (std. dev. 0.806; correlation
coefficient 0.839; rank correlation 0.869).
VARIOGRAM MODELS
In this section, we examine and model the spatial variation structure of the two
independent variables Y1 = Z1 and Y2. The jargon and the steps involved in an
analysis of spatial variation are described in more detail by Srivastava (this volume),
so that only a summary of results is given here.
Directional variograms, or more specifically correlograms, were calculated for Y1
using all 212 data values, and for Y2 using the 135 values of the second sampling cam-
paign. For each variable, eight directional correlograms were calculated at azimuth
intervals of 22.5° using overlapping angular tolerances of 22.5°. Lag intervals and
distance tolerances were 10 grid units and 5 grid units, respectively, for Y1, and 5 grid
units and 2.5 grid units, respectively, for Y2. The purpose of these directional correlo-
grams is to reveal general features of spatial variation such as directional anisotropies
and nested structures. Results for Y1 and Y2 are shown in Figures 3 a) and b), re-
spectively. These figures provide a planimetric representation of spatial correlation
structure, displaying correlogram values as a surface, a function of location in the plane
of East-West (x) and North-South (y) lag components.
For Y1, we observe, in addition to a significant nugget effect, what we interpret
as two nested structures with different principal directions of spatial continuity. The
first, shorter scale, structure has a direction of maximum continuity approximately
North North-West, a maximum range of about 20 grid units and an anisotropy ratio
of about 1.4 : 1. The second, larger scale structure has a direction of maximum
continuity approximately West North-West, an indeterminate maximum range and a
minimum range of at least 40 grid units.
"d
20.0 10.0
-
.~
~ 0.0 0.0
...:I
~
::s0
rn -20.0 -10.0
I
:E
Z -40.0
-40.0 -20.0 0.0 20.0 40.0
-20.0
-20.0 -10.0 0.0 10.0 20.0
East-West Lag (grid units) East-West Lag (grid units)
3. A large-scale structure accounting for 15% of the spatial variance. This struc-
ture is also represented by an exponential model with, however, maximum conti-
nuity in the West North-West direction. The model range parameter is 300 grid
units with an anisotropy ratio of 10 : 1. Such a large maximum range ensures
that the "sill" value of the structure is not reached in the direction of maximum
continuity within the limits of the site. What we have done here is model a
"zonal anisotropy" (Journel and Huijbregts, 1978) as a geometric anisotropy
with an arbitrarily large range in the direction of greatest continuity.

This model is also shown in Figures 4 a) and b) for comparison with the experimental
results.
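An exponential structure with geometric anisotropy, of the kind used in the model above, can be evaluated as follows; the rotation convention and function name are assumptions for illustration, while the range, azimuth, and anisotropy ratio are those quoted in the text:

```python
import numpy as np

def exp_correlogram(hx, hy, azimuth_deg, range_max, aniso_ratio):
    """Exponential correlogram structure with geometric anisotropy: the
    lag (hx, hy) is rotated into the principal axes, the minor-axis
    component is stretched by the anisotropy ratio, and the structure
    decays as exp(-3h/a), reaching about 0.05 at the range a."""
    az = np.radians(azimuth_deg)           # azimuth of maximum continuity
    h_major = hx * np.sin(az) + hy * np.cos(az)
    h_minor = hx * np.cos(az) - hy * np.sin(az)
    h = np.hypot(h_major, aniso_ratio * h_minor)
    return np.exp(-3.0 * h / range_max)

# Large-scale structure from the text: maximum continuity toward WNW
# (azimuth taken here as 292.5 degrees), range 300 grid units, ratio 10:1
az = np.radians(292.5)
along = exp_correlogram(10 * np.sin(az), 10 * np.cos(az), 292.5, 300.0, 10.0)
across = exp_correlogram(10 * np.cos(az), -10 * np.sin(az), 292.5, 300.0, 10.0)
print(along > across)   # True: correlation persists further along WNW
```

Making the range arbitrarily large in the major direction reproduces the zonal anisotropy described above, since the structure then barely decays along that axis within the site.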
For Y2, there are fewer data and we are careful not to over-interpret the directional
correlograms. Indeed, the apparent periodicity in the NNE direction is probably
an artifact of the sampling pattern. With somewhat more confidence, we note a
(Figure 4: directional correlograms for Y1 in the directions of maximum a) and
minimum b) continuity, with fitted model; distance in grid units.)
strong nugget effect and a structure with maximum continuity in the West North-
West direction, a maximum range of about 30 grid units and an anisotropy ratio of
about 3 : 1. Detailed directional correlograms were calculated for the directions of
maximum and minimum continuity and are shown in Figures 5 a) and b), respectively.
The model fitted to these correlograms is the sum of two components:
2. A structure accounting for 45% of the spatial variance. This structure is repre-
sented by an exponential model with greatest continuity in the West North-West
direction. The model has a range parameter of 8 grid units and an anisotropy
ratio of 3 : 1.
This model is shown in Figures 5 a) and b) for comparison with the experimental
results.
We are now ready to simulate fields of the two independent standard Normal variables,
Y1 and Y2. The simulations of Y1 and Y2 are to be conditioned on 212 and 135 sample
values, respectively. Both fields are simulated on the same 110 x 70 grid as the
exhaustive data sets for Pb and As.
To perform our simulations, we are going to use the Sequential Gaussian method.
This method is based on two important theoretical properties of the multivariate
Normal (or Gaussian) distribution: First, the conditional distribution of an unknown
(Figure 5: directional correlograms for Y2 in the directions of maximum a) and
minimum b) continuity, with fitted model; distance in grid units.)
1. Start with a set of conditioning data values at scattered locations over the field
to be simulated.
2. Select at random a point on the grid discretizing the field where there is not
yet any simulated or conditioning data value.
3. Using both conditioning data and values already simulated from the surrounding
area, calculate the Simple Kriging estimate and corresponding error variance.
These are the mean and variance of the conditional distribution of the unknown
value at the point given the set of known values from the surrounding area.
Thus, in many ways, the Sequential Gaussian simulation method is similar to the point
kriging process described by Rouhani (this volume). The difference is that we are
drawing our simulated value at random from a distribution having the kriged estimate
as its mean, rather than using the kriged estimate itself as a "simulated" value.
Intuitively, we see how this process leads to fields having greater spatial variability
than fields of kriged values.
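The sequential steps above can be sketched in one dimension as follows. The exponential covariance, zero mean, and unit sill are simplifying assumptions, and all names are illustrative; a production implementation such as GSLIB's sgsim (Deutsch and Journel, 1992) would also restrict the kriging to a search neighborhood:

```python
import numpy as np

def sgs_1d(xs, cond_x, cond_z, rng, cov_range=10.0):
    """Minimal 1-D Sequential Gaussian simulation sketch: visit unsampled
    points in random order, compute a Simple Kriging estimate and variance
    from the conditioning data plus any previously simulated values
    (exponential covariance, zero mean, unit sill), then draw the simulated
    value at random from that Normal conditional distribution."""
    cov = lambda h: np.exp(-3.0 * np.abs(h) / cov_range)
    known_x, known_z = list(cond_x), list(cond_z)
    sim = {}
    for x in rng.permutation(xs):            # random visiting order
        kx, kz = np.array(known_x), np.array(known_z)
        C = cov(kx[:, None] - kx[None, :])   # data-to-data covariances
        c = cov(kx - x)                      # data-to-point covariances
        w = np.linalg.solve(C, c)            # Simple Kriging weights
        mean, var = w @ kz, max(1.0 - w @ c, 0.0)
        z = rng.normal(mean, np.sqrt(var))   # random draw, not the estimate
        known_x.append(float(x)); known_z.append(float(z))
        sim[float(x)] = float(z)
    return sim

rng = np.random.default_rng(42)
sim = sgs_1d([2.0, 4.0, 6.0, 8.0], cond_x=[0.0, 10.0],
             cond_z=[1.0, -1.0], rng=rng)
print(len(sim))   # 4
```

Because each simulated value is fed back into the conditioning set, the drawn values reproduce the covariance model rather than the smoothness of kriged estimates.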
BACK-TRANSFORMATIONS
We now have simulated fields of the two independent standard Normal variables Y1
and Y2. In order to obtain the corresponding fields of Pb and As, we must reverse the
earlier transformations. First, we reverse equation (2) to get the correlated standard
Normal variables Z1 and Z2 from Y1 and Y2. Then we reverse equation (1) to get the
variables Pb and As from the standard Normal deviates Z1 and Z2. Finally, we are
left with simulated fields of Pb and As on a dense 110 x 70 grid discretizing the site.
Although here we are focusing on single realizations of each of these fields, multiple
realizations can be generated by repeating the simulation step using different seed
values for the random number generator.
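Reversing the normal-score transform can be sketched as follows, assuming linear interpolation between the sorted sample quantiles; the function name and sample values are illustrative:

```python
import numpy as np
from statistics import NormalDist

def back_transform(z_sim, sample_values):
    """Reverse a normal-score transform: map each simulated standard
    Normal deviate back to the data scale by matching quantiles,
    interpolating linearly between the sorted sample values."""
    srt = np.sort(np.asarray(sample_values, float))
    q_data = np.arange(1, len(srt) + 1) / (len(srt) + 1.0)
    g = NormalDist()
    q_sim = np.array([g.cdf(float(z)) for z in np.atleast_1d(z_sim)])
    return np.interp(q_sim, q_data, srt)

# A simulated deviate of 0 maps back to the sample median (here 261 ppm)
print(back_transform([0.0], [12.0, 95.0, 261.0, 480.0, 900.0]))   # [261.]
```

Note that this simple form clips simulated values to the range of the sample data; extrapolation beyond the sample extremes requires additional tail assumptions.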
(Figure: Q-Q plots for the Pb and As simulations; simulated concentrations in ppm.)
Cromer (this volume). The comparison shows that we did quite a respectable job of
reproducing the relationship between Pb and As in our simulated fields. Directional
correlograms calculated on our two simulated fields are shown in Figures 8 a) and
b). The main features of these correlograms compare favorably with those observed
in the correlograms presented by Srivastava (this volume). Given the limited number
of data and their spatial clustering, the models we fitted to the experimental correl-
ograms were quite successful in representing the true spatial variation structures of
Pb and As .
No comparison of true and simulated fields would be complete without looking at
images or maps of the simulated fields. Although qualitative, the visual comparison
of simulated and true fields is in fact the most stringent measure of the success of our
simulation. We must check how well we have captured the character of contaminant
spatial variability at the site, its "noisiness", the grain of any spatial patterns, and
any trends. We should also check to see what our simulated values are doing in areas
far from conditioning data values. The spatial variability in such areas should be
consistent in character with that observed in more densely sampled areas.
Grey-scale digital images of simulated Pb and As fields are shown in Figures 9
and 10, respectively. Comparison with the corresponding true images in Cromer (this
volume) shows that we have reason to be satisfied with our simulation. Discrepancies
between simulated and true fields exist; however, these are manifestations of the
uncertainty associated with our knowledge of site contamination as provided by the
rather limited sampling data. It should be emphasized that we are looking at but one
pair of images of contamination from amongst the many equally possible alternatives
that would be consistent with sampling information. We can also compare Figure 9
with the kriged field shown in Rouhani (this volume). We see that kriging smoothes
spatial variations in a non-uniform manner: less in regions with abundant sample
control, more in unsampled regions. This may lead the unsuspecting to conclude that
large portions of the site are quite homogeneous! Simulation, on the other hand,
preserves in-situ spatial variability regardless of the proximity of sampling points.
APPLICATION
Now that we have simulated fields of Pb and As that we confidently assume are
representative of the true yet unknown contamination at the site, we can use these
fields to answer some simple questions.
Perhaps the most basic question that we may ask is what fraction of the site
requires remediation given the contamination thresholds of 150 ppm and 30 ppm for
Pb and As, respectively? However, before we can attempt to answer that question,
we must decide on a "volume of selective remediation" or VSR. Note that the concept
of volume of selective remediation is identical to that of selective mining unit (smu)
described in the mining geostatistics literature (chapter 6 of Journel and Huijbregts,
1978; chapter 19 of Isaaks and Srivastava, 1989).
The volume, or in the present case, area of selective remediation is the smallest
portion of soil that can be either left in place or removed for treatment, based upon
its average contaminant concentration. The VSR may depend on several factors
including the size of equipment being used in the remediation and the sampling
information ultimately available for the selection process. It is an important design
parameter because the variance of spatially averaged concentrations decreases as the
VSR becomes larger. This reduces the spread and alters the shape of the histogram
of contaminant concentrations thereby affecting the proportion of values above a
given threshold and the fraction of the site requiring remediation. Here, the original
sample size or "support", as it is known in geostatistics, is a square of 1 x 1 grid units
(5m x 5m). The corresponding standard deviations of Pb and As concentrations are
218 ppm and 35 ppm, respectively. If we were to consider a VSR with a support of
10 x 10 grid units (50m x 50m), the standard deviations of VSR-averaged Pb and
As concentrations are reduced to 172 ppm and 18 ppm, respectively.
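The reduction in spread with increasing VSR support can be illustrated numerically. In the sketch below the field is uncorrelated noise with an assumed mean and the Pb-like standard deviation quoted above, so the variance reduction is stronger than for a spatially correlated field such as the one in the text; the function name is illustrative:

```python
import numpy as np

def block_average(field, block):
    """Average a 2-D field over non-overlapping block x block cells,
    mimicking the change of support from 1 x 1 samples to a larger VSR."""
    ny, nx = field.shape
    ny2, nx2 = ny // block, nx // block
    trimmed = field[:ny2 * block, :nx2 * block]
    return trimmed.reshape(ny2, block, nx2, block).mean(axis=(1, 3))

rng = np.random.default_rng(1)
field = rng.normal(200.0, 218.0, size=(110, 70))  # assumed mean, Pb-like spread
blocks = block_average(field, 10)                 # 10 x 10 unit VSRs
print(blocks.shape, bool(blocks.std() < field.std()))   # (11, 7) True
```

With spatial correlation, values within a block are similar, so averaging cancels less of the variability; this is why the text reports a reduction from 218 ppm to 172 ppm for Pb rather than the much larger drop seen with uncorrelated noise.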
In Tables 1 and 2, we compare fractions of the site requiring remediation for VSRs
of 1 x 1 grid units and 10 x 10 grid units, respectively. Within each table, we also
compare remediation fractions based on kriged, simulated and true values.
For selection based on Pb concentration alone, results for both VSR sizes show
good agreement between remediated fractions calculated on simulated and true fields.
Remediated fractions based on kriged fields are overestimated for the smaller VSR.
We note that the fraction of the site requiring remediation increases for the larger
VSR. This is because the spatial averaging of Pb concentrations over a VSR smears
high values over the entire block area thereby pushing its average over the remediation
threshold. The same phenomenon may also happen in reverse, with low values diluting
a few high values and thus lowering the average VSR concentration below threshold.
In either case, it is obvious that the choice of VSR will have a significant impact on
the fraction of the site requiring remediation.
For selection based on As values alone, remediated fractions calculated on the
simulated fields are almost half those calculated on the true fields. On the other
hand, remediated fractions based on the kriged fields are much larger than those
based on the true fields. The cause of the poor simulation results can be traced back
to our difficulties in obtaining a representative histogram for As concentrations. The
poor kriging results are due to smearing of As values from the densely sampled highly
contaminated zone to the surrounding area. Considering only the results for the true
field, we see an increase in remediated fraction with the larger VSR size, as we saw
previously with Pb. For selection based on either Pb or As threshold exceedance,
results are similar to those for Pb selection alone: VSRs that would otherwise be
misclassified based on their As value are correctly classified based on their Pb value.
Although the simulated Pb field gave remediation fractions close to those obtained
for the true field, this may be partly fortuitous and, in any case, does not ensure that
the blocks selected for remediation are the correct ones, i.e., the same as in the true
field. In practice, multiple simulations should be performed and, for each VSR within
the site, a contamination threshold exceedance probability should be calculated from
the resulting distribution of simulated concentrations for that location. The decision
on whether or not to remediate a given VSR would then be based on its threshold
exceedance probability and not on a single concentration value.
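The multiple-realization decision rule just described reduces to a simple frequency calculation per block; the realizations below are hypothetical, and the function name is illustrative:

```python
import numpy as np

def exceedance_probability(realizations, threshold):
    """Per-block probability of threshold exceedance, estimated as the
    fraction of simulated realizations exceeding the threshold there."""
    r = np.asarray(realizations, float)   # shape (n_realizations, ny, nx)
    return (r > threshold).mean(axis=0)

# Three hypothetical Pb realizations over a 2 x 2 grid of VSRs (ppm)
sims = [[[100., 400.], [160.,  90.]],
        [[120., 380.], [140.,  95.]],
        [[200., 420.], [155.,  80.]]]
p = exceedance_probability(sims, 150.0)
print(p[0, 1], p[1, 1])   # 1.0 0.0
```

A remediation decision can then be made by comparing each block's exceedance probability against an acceptable risk level, rather than against a single kriged or simulated concentration.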
CONCLUSIONS
ACKNOWLEDGMENTS
The author wishes to thank Doug Hartzell, who coined the term "VSR", and one
anonymous reviewer for their comments on the original manuscript. Geological Survey
of Canada contribution no. 20995.
REFERENCES
Cromer, M.V., C.A. Rautman and W.P. Zelinski, 1996, Geostatistical Simulation of
Rock Quality Designation (RQD) to Support Facilities Design at Yucca Moun-
tain, Nevada, Geostatistics for Environmental and Geotechnical Applications,
ASTM STP 1283, R. Mohan Srivastava, Shahrokh Rouhani, Marc V. Cromer,
A. Ivan Johnson, Eds. , American Society for Testing and Materials, Philadel-
phia.
Deutsch, C.V. and A.G. Journel, 1992, GSLIB : Geostatistical Software Library and
User's Guide, Oxford University Press, New York.
Journel, A.G. and C. Huijbregts, 1978, Mining Geostatistics, Academic Press, Lon-
don.
Rossi, R.E. and P.E. Evan Dresel, 1996, Declustering and Stochastic Simulation of
Ground- Water Tritium Concentrations at Hanford, Washington, Geostatistics for
Environmental and Geotechnical Applications, ASTM STP 1283, R. Mohan Sri-
vastava, Shahrokh Rouhani, Marc V. Cromer, A. Ivan Johnson, Eds. ,American
Society for Testing and Materials, Philadelphia.
Zuber, R.D. and R. Kulkarni, 1996, A Geostatistical Analysis of Lake Sediment Con-
taminants at a Superfund Site, Geostatistics for Environmental and Geotechni-
cal Applications, ASTM STP 1283, R. Mohan Srivastava, Shahrokh Rouhani,
Marc V. Cromer, A. Ivan Johnson, Eds. , American Society for Testing and
Materials, Philadelphia.
Environmental Applications
Bruce E. Buxton 1 , Darlene E. Wells 2 , Alan D. Pate 3
REFERENCE: Buxton, B. E., Wells, D. E., Pate, A. D., "Geostatistical Site Characterization of
Hydraulic Head and Uranium Concentration in Groundwater," Geostatistics for Environmen-
tal and Geotechnical Applications, ASTM STP 1283, R. Mohan Srivastava, Shahrokh Rouhani, Marc
V. Cromer, A. Ivan Johnson, and Alexander J. Desbarats, Eds., American Society for Testing and
Materials, 1996.
• After the available data were identified for each grid block,
the appropriate data weighting, estimated hydraulic head, and
estimation precision were calculated using the appropriate
semivariogram model.
(Figure: plan map of well locations; East and North coordinates in meters.)
(Figure: horizontal semivariograms of hydraulic head; separation distance in feet.)
Fig. 3--Temporal semivariogram from joint spatial-temporal analysis of
hydraulic head levels. Note that 1 foot = 30 cm.
BUXTON ET AL. ON SITE CHARACTERIZATION 57
TABLE 1--Fitted semivariogram models of spatial and temporal
correlation. (Note that 1 foot = 30 cm.)
Steady-State Analysis
One primary reason for performing the joint spatial-temporal
analysis in the previous section was to select, for the steady-state
analysis, a single month which was representative of average hydraulic
head levels during the 1990-1993 time period. Examining the results in
Figure 4, it appears that three months can be considered representative:
January, 1990; November, 1991; and June, 1993. Hydraulic heads in each
of these three months appear to be approximately equal to the average
head levels across the entire 1990-1993 time period. However, several
new wells were installed in the area in 1993, particularly in the
southeastern part of the modeling grid. Therefore, a significantly
greater number of head measurements were available for June, 1993 in
comparison with January, 1990 and November, 1991. As a result, June,
1993 was selected as the month to represent steady-state conditions.
Hydraulic head measurements for June, 1993 were available for the
steady-state kriging analysis in 202 wells at various depths. The
horizontal spatial semivariograms for these data are shown in Figure 5.
As in the joint spatial-temporal analysis (Figure 2), horizontal
semivariograms in Figure 5 were calculated in four primary directions.
The fitted model is also shown in Figure 5, where the bold line denotes
the model in the north direction and the dashed line denotes the model
in the east direction.
A kriging analysis was performed with the June, 1993 data and the
semivariogram model shown in Figure 5, as well as Table 1. This
analysis estimated steady-state head levels across the groundwater
modeling grid at regular 5 ft (1.5 m) vertical intervals from 390 to 540
ft (117 to 162 m) above sea level. The horizontal variability in
steady-state head levels at the 490 ft (147 m) elevation is shown in
(Figure 4: estimated hydraulic head, in feet, versus time in months.)
(Figure 5: horizontal semivariograms of the June, 1993 hydraulic head data with
fitted model; separation distance in feet.)
HEAD ± PREC

where HEAD is the estimated hydraulic head from Figure 6 and PREC is the
estimation uncertainty from Figure 7.
(Figure: map of kriged hydraulic head; legend classes < 515, < 518, < 521, < 524,
and >= 524 feet.)
Fig. 10--Vertical semivariograms for log-transformed 1990 average
uranium concentrations. Note that 1 foot = 30 cm.
CONCLUSION
REFERENCES
(Figure: map of kriging precision; legend classes < 150, < 180, < 210, < 240,
and >= 240.)
ABSTRACT: A case study is presented of building a map showing the probability that the
concentration in polycyclic aromatic hydrocarbons (PAH) exceeds a critical threshold. This
assessment is based on existing PAH sample data (direct information) and on an electrical
resistivity survey (indirect information). Simulated annealing is used to build a model of the
range of possible values for PAH concentrations and of the bivariate relationship between
PAH concentrations and electrical resistivity. The geostatistical technique of simple indicator
kriging is then used, together with the probabilistic model, to infer, at each node of a grid,
the range of possible values which the PAH concentration can take. The risk map is then
extracted from this characterization of the local uncertainty. The difference between this risk
map and a traditional iso-concentration map is then discussed in terms of decision-making.
Steelwork and coal processing sites are prone to contamination by polycyclic aromatic
hydrocarbons (PAH), some of which are known to be carcinogenic. Consequently, local and
state regulatory agencies require that all contaminated sites be characterized and that
remediation solutions be proposed. The traditional approach for delineating the horizontal and
vertical extent of the contamination is to use wells and boreholes to construct vertical profiles
of the contamination at several locations. This approach, however, is both time consuming
and expensive. Recent work has shown that, in some situations, electrical conductivity and
resistivity surveys could be used as a pathfinder for delineating the contaminated areas.
These geophysical surveys, which are both expedient and cost effective, could be used to
reduce the number of wells and boreholes to be drilled.
Geophysical data, however, do not provide direct information on soil chemistry. They are
indicative of the nature of the ground, which in turn may reflect human activities (backfill
material, tar tanks) and potential sources of ground pollution. These geophysical data have to
be treated as indirect and imprecise information. The mapping of the contamination therefore
requires that imprecise geophysical data be correctly integrated with precise chemical analyses
from wells and boreholes.
Geostatistics offers an ideal framework for addressing such problems. Different types of
information can be integrated in a manner which takes into consideration not only the
statistical correlation between the different types of information, but also the spatial
continuity characteristics of both. Using this approach, it is possible to provide maps showing
the probability that the PAH concentration exceeds some critical level.
A case study from an industrial site in Lorraine (northern France) is used to compare the
geostatistical approach to the traditional approach of directly contouring the data from wells
and boreholes.
AVAILABLE DATA
Although the physical phenomena that govern PAH transport, and the reactions between PAH
and soils of different nature, are not yet well understood (they are the subject of ongoing
research projects), the presence of PAH in significant amounts has been found to be
associated with low resistivity values (i.e., it increases, locally, the soil conductivity). This
relationship, however, remains site specific and cannot be considered as a general law.

COLIN ET AL. ON GEOPHYSICAL DATA 71
The available resistivity measurements (in ohm-m) come from 14 electrical lines tightly
criss-crossing the contaminated area. Gaps between electrical lines were filled by
sequential simulation to produce the full resistivity map shown on Figure 2.
Figure 1: PAH sample locations, classed as > 200 ppm, 40-200 ppm and < 40 ppm, over the 500 m by 600 m site.
Figure 2: Resistivity map, with classes ranging from 15 to 3000 ohm-m.
The problem at hand is to delineate (on a 10 by 10 meter grid) areas where the risk that the
PAH contamination is in excess of a critical threshold is deemed large enough to warrant
either remediation or further testing. The critical threshold used for this study is 200 ppm
PAH, and three classes of risk were considered:

Low Risk: The probability that the PAH concentration exceeds 200 ppm is less than
20 percent.
From a methodological point of view, assessing the risk of exceedance implies that, at each
node of the grid, the range of possible values for the PAH concentration, along with their
associated probabilities, be available.

Given the objectives of the study, we need to understand the following critical features from
the available data:

1- The range of possible values for the PAH concentrations which may be encountered away
from sampled locations;
- The number of data available to construct the histogram is fairly limited, resulting in a lack
of sufficient resolution: the class frequencies tend to jump up and down erratically and
there are gaps between data values.
- The histogram is extremely skewed to the right, the bulk of the values being below 300
ppm, with some erratic high values extending all the way up to 6500 ppm (Figure 3a). Not
surprisingly, the coefficient of variation is very high (3.03).
- The mean and variance of the data are severely affected by this high variability and they
cannot be established at any acceptable level of reliability: removing the two largest values
reduces the average by a factor of 3 and the variance by a factor of 65!
- If we use a logarithmic scale to visualize the same histogram (Figure 3b), we see clearly
the existence of three populations: a first one below 50 ppm and accounting for 61
percent of the total population, a second one in the range 50 to 500 ppm and including 34
percent of the population, and a third, small (5 percent) population characterized by extreme
PAH values ranging from 600 to 6500 ppm.
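The sensitivity of the mean and variance to the extremes can be checked numerically. The sample below is hypothetical, not the study data, but it has a similarly skewed shape, with a bulk of low values and two extreme highs:

```python
import numpy as np

# Illustration of the robustness problem with a hypothetical skewed sample
# (NOT the study data): mostly low values plus two extreme highs, mimicking
# the shape of the PAH histogram.
pah = np.array([2, 4, 9, 12, 20, 33, 40, 55, 80, 120, 206, 310, 2500, 6500],
               dtype=float)

cv = pah.std(ddof=1) / pah.mean()           # coefficient of variation
trimmed = np.sort(pah)[:-2]                 # drop the two largest values
mean_ratio = pah.mean() / trimmed.mean()    # inflation of the mean
var_ratio = pah.var(ddof=1) / trimmed.var(ddof=1)   # inflation of the variance
```

Even on this small toy sample, removing the two largest values changes the mean and variance by large factors, which is exactly why robust, non-parametric descriptions are preferred here.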
Figure 3: Histogram of PAH concentrations (frequency versus ppm). Summary statistics:
number of data: 51; mean: 382 ppm; standard deviation: 1159 ppm; coefficient of
variation: 3.03; minimum: 2 ppm; 1st quartile: 9 ppm; median: 33 ppm; 3rd quartile:
206 ppm; maximum: 6500 ppm.
The first population can be interpreted as representing the background concentration level on
the site. The second population seems clearly related to the contamination itself, with the bulk of
it above the critical threshold of 200 ppm. The third population is more difficult to
interpret, primarily because it is represented by only three samples. Although it is obviously
associated with the contamination, it is not entirely clear whether it represents a different
source of contamination or is merely the tail end of the second population.

From this analysis it is clear that the experimental histogram cannot be used as is as a model
of the distribution function of PAH concentrations over the site area. This probability
distribution function should, instead, be modelled and have the following features:

- It should not be based on parameters like the mean and variance, which are highly affected
by extreme values and are, as a result, not known with any degree of reliability;
- It should provide probabilities for the entire range of possible values, from the absolute
minimum to the absolute maximum, and fill the gaps between existing data values;
- It should reproduce the existence of the three populations and their respective frequencies.
Bivariate Distribution
The cross-plot shown on Figure 4 describes the relationship which exists between PAH
concentrations and electrical resistivity.
Figure 4: Cross-plot of PAH concentration (ppm) versus electrical resistivity (ohm-m),
both on logarithmic scale. Covariance: -0.799; correlation (Pearson): -0.360;
correlation (Spearman): -0.266.
The most important feature of this plot is the existence of two distinct clouds of points, which
is a direct consequence of the multi-modality of the PAH distribution and of the bimodality
of the electrical resistivity:

- An upper cloud where PAH values are in excess of 35 ppm and the electrical resistivity
ranges from 15 to 150 ohm-m. Within this cloud, the correlation between the two
attributes is positive.
- A lower cloud with PAH values below 35 ppm and with electrical resistivities in the 30
to 1600 ohm-m range. The correlation, again, appears to be positive, but less significantly
so.
From this cross-plot it seems that high concentrations of PAH (over 35 ppm) are associated
with rather low resistivity values. One possible explanation of this feature, which still remains
to be confirmed, is that PAH, which are viscous fluids, tend to flow down through backfill
materials until they reach the top of the natural soil. At this level they fill up the soil pore
volume, thus creating a flow barrier to water.
Traditionally, bivariate distributions are parametrized by the means and variances of their
marginal distributions together with the correlation coefficient. Such an approach is
inapplicable in our case, since it would completely fail to reflect the most important feature of
the cross-plot, which is the existence of the two populations.

The solution adopted for this study consists of using directly a bivariate histogram to describe
the bivariate distribution model. Because of the sparsity of the data, the experimental cross-plot
is not sufficient to inform all the possible bivariate probabilities: it is spiky and full of gaps.
The required bivariate histogram, therefore, will be obtained by an appropriate smoothing of
this experimental cross-plot, making sure that the two clouds of points are correctly
reproduced.
The variogram analysis performed on the natural logarithm of PAH concentrations (Figure
5) shows that the phenomenon is reasonably well structured, with a maximum correlation
distance (range) of approximately 70 meters. There was no evidence of anisotropy and the
shape of the variogram was exponential.
Figure 5: Variogram of the natural logarithm of PAH concentrations (gamma(h) versus distance h in meters, up to 150 m).
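The fitted structure can be sketched as follows. The sill value used here is an assumption, since only the exponential shape and the roughly 70 m range are reported:

```python
import numpy as np

# A sketch of the exponential variogram described above; the sill value (3.0)
# is an assumption, the ~70 m practical range comes from the analysis.
def exp_variogram(h, sill=3.0, practical_range=70.0):
    """gamma(h) = sill * (1 - exp(-3h/a)): ~95% of the sill is reached at h = a."""
    return sill * (1.0 - np.exp(-3.0 * np.asarray(h, dtype=float) / practical_range))

gamma = exp_variogram([0.0, 35.0, 70.0, 150.0])   # gamma(70) is ~95% of the sill
```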
Based on the results of the exploratory analysis, the probabilistic model to be used in
estimation and uncertainty assessment will consist of the following:

Several approaches have been proposed to produce smooth histograms and cross-plots:
quadratic programming (Xu 1994), fitting of kernel functions (Silverman 1986; Tran 1994)
and simulated annealing (Deutsch 1994). The technique selected for this study is simulated
annealing, because it was perceived to be the most flexible to accommodate all the
requirements of the probabilistic model. Simulated annealing is a constrained optimization
technique which is increasingly used in the earth sciences to produce models which reflect
complex multivariate statistics. A detailed discussion of the technique can be found in Press
et al. (1992), Deutsch and Journel (1992) and Deutsch and Cockerham (1994).

In this study the modelling of the bivariate probabilistic model was done in two steps: first the
marginal distribution of the PAH concentration was modelled, and then the cross-plot
between PAH and electrical resistivity (there was no need to model the marginal distribution
of resistivity, since it was directly available from the resistivity map).
The modelling of the marginal distribution of PAH via simulated annealing was implemented
as follows:

1- An initial histogram is created by subdividing the range of possible values into 100
classes, and assigning initial frequency values to each of these classes by performing a
moving average of the original experimental histogram and then rescaling these
frequencies so that they sum up to one.

2- An energy function (Deutsch 1994) is defined to measure how close the current histogram
is to the desired features of the final histogram. In the present case the energy function
takes into consideration the reproduction of the mean, variance and selected quantiles, and
a smoothing index devised to eliminate spurious spikes in the histogram.

3- The original probability values are then perturbed by choosing at random a pair of
classes, adding an incremental value Δp to the first class and subtracting it from the
second, hence ensuring that the sum of the frequencies is still one.

4- The perturbation is accepted if it improves the histogram, i.e. if the energy function
decreases. If not, the perturbation may still be accepted with a small probability. This
ensures that the process will not converge to some local minimum.
5- This perturbation procedure is repeated until the resulting histogram is deemed
satisfactory (the energy function has reached a minimum value) or until no further
progress is possible.
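Steps 1 to 5 above can be sketched in a few lines of code. The toy histogram, class count, perturbation size and simplified energy function (a smoothness term plus reproduction of the mean only) are illustrative assumptions, not the study's actual settings:

```python
import numpy as np

# A minimal sketch of the annealing loop for histogram smoothing (steps 1-5).
rng = np.random.default_rng(0)

raw = np.zeros(20)
raw[[2, 3, 7, 12, 18]] = [5.0, 1.0, 4.0, 2.0, 3.0]   # spiky histogram with gaps
raw /= raw.sum()

# Step 1: moving-average initialization, rescaled to sum to one.
f = np.convolve(raw, np.ones(3) / 3.0, mode="same")
f /= f.sum()

target_mean = np.sum(np.arange(20) * raw)            # statistic to reproduce

def energy(p):
    """Lower is better: roughness (squared second differences) + mean error."""
    return np.sum(np.diff(p, 2) ** 2) + (np.sum(np.arange(20) * p) - target_mean) ** 2

e = energy(f)
e_initial = e
best_f, best_e = f.copy(), e
for _ in range(20000):
    i, j = rng.integers(0, 20, size=2)
    dp = 0.001
    if f[j] < dp:                 # keep every class frequency non-negative
        continue
    g = f.copy()
    g[i] += dp                    # Step 3: a paired +dp/-dp perturbation keeps
    g[j] -= dp                    # the frequencies summing to one
    e_new = energy(g)
    if e_new < e or rng.random() < 0.01:   # Step 4: accept improvements, and
        f, e = g, e_new                    # occasionally a worse state
        if e < best_e:                     # Step 5: track the best histogram
            best_f, best_e = f.copy(), e
```

The bivariate version described next works the same way, with cells of a 2-D histogram in place of classes.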
The cross-plot between PAH concentrations and electrical resistivity was then modelled in a
similar fashion:

1- An initial bivariate histogram is created by subdividing the range of possible values along
both axes into 100 classes, and assigning initial bivariate frequency values to each of these
cells by performing a moving average of the original cross-plot followed by a rescaling
of the frequencies to ensure that they sum up to one.

2- An energy function is defined to measure the goodness of fit of the current bivariate
histogram to the desired features of the final one. In the present case the energy function
takes into consideration the reproduction of the marginal distributions defined previously,
the correct reproduction of some critical bivariate quantiles and, again, a smoothing
index devised to eliminate spurious spikes in the cross-plot.

3- The original bivariate frequencies are perturbed by randomly selecting a pair of cells,
adding an incremental probability Δp to the first cell and subtracting it from the
second, leaving therefore the sum of frequencies unchanged.

4- As before, the perturbation is accepted if it decreases the energy function, and accepted
with a certain probability if not.

5- The perturbation mechanism is iterated until the energy function has converged to some
minimum value.
A detailed discussion on how to use simulated annealing for modelling histograms and cross-plots
can be found in Deutsch (1994).
Having developed the bivariate probabilistic model for PAH and electrical resistivity, we will
now use it to infer the local conditional cumulative distribution function (ccdf) of the PAH
concentration.

1- The local a priori distribution function (cdf) of the PAH, given the local resistivity value,
is extracted from the bivariate histogram. This local cdf characterizes the uncertainty of
the PAH value based on the overall relationship existing between PAH and resistivity, but
not yet on the nearby PAH measurements.

2- The local a priori cdf is then conditioned to the nearby PAH data values via simple
indicator kriging (Journel 1989). This ccdf now describes the uncertainty on the PAH
concentration once the local conditioning information has been accounted for. Note that
simple indicator kriging calls for a model of the spatial continuity of the residual
indicators (see Appendix I). This model is shown on Figure 8.
Figure 8: Variogram model of the residual indicators (distance h in meters, up to 210 m).
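Step 1, extracting the local a priori cdf from the bivariate histogram, can be sketched as follows using a toy 10 x 10 histogram (the study used 100 x 100 classes; the frequencies here are synthetic):

```python
import numpy as np

# Toy bivariate histogram: f[i, j] approximates
# Prob{PAH in class i and resistivity in class j}; values are synthetic.
rng = np.random.default_rng(1)
f = rng.random((10, 10))
f /= f.sum()

def local_prior_cdf(f, j):
    """A priori cdf of PAH given that the local resistivity falls in class j:
    normalize column j into a conditional pmf, then cumulate."""
    pmf = f[:, j] / f[:, j].sum()
    return np.cumsum(pmf)

F0 = local_prior_cdf(f, j=3)   # a priori cdf at a node whose resistivity is in class 3
```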
In this approach, the two types of information (direct measurements of PAH concentrations
and electrical resistivities) are mixed in a smooth, transitional fashion: when there is abundant
nearby sample data, the simple indicator kriging system puts a lot of emphasis on this
conditioning information and downplays the influence of the indirect information, whereas
when there is little or no conditioning data, the range of possible outcomes for PAH is
primarily controlled by the local resistivity value.
PROBABILITY MAPS

Having inferred the local distribution functions of PAH concentrations, we can now build the
probability map (Figure 9c) showing the risk that this concentration exceeds the critical
threshold of 200 ppm.
Cross-plot of PAH concentration (ppm) versus resistivity (ohm-m), and modelled bivariate
histogram (frequencies from 0.01 to 0.09).
Figure 9: a) Iso-concentration map; c) Probability map (PAH and resistivity). Each map
shows high, medium and low risk zones over the 500 m by 600 m site.
This probability map is compared with:

- A probability map built by indicator kriging also, but taking into account the PAH
concentrations only (Figure 9b), and
- A map showing the area where the estimated PAH concentration (expected value) exceeds
the critical threshold. This map was obtained by ordinary kriging (Figure 9a).
If we look first at the estimated map (Figure 9a), we see that the area where the estimated
PAH value exceeds 200 ppm forms a rather homogeneous, smoothly contoured zone
concentrated around the cokeworks, where all the high sample values are located. If this map
were used for decision-making, one could come to the conclusion that this zone, representing
a surface of 135,900 square meters, is the only one requiring attention.

Looking now at the probability map produced by indicator kriging based on the PAH
concentrations only (Figure 9b), we see that the picture gets more complex: the center zone
is still high risk but less homogeneously so, and the medium risk zone extends further to the
south. The medium and high risk zones now represent 190,700 square meters. However,
because of the lack of direct information, the periphery appears mostly as a low risk zone.

Finally, by including the information provided by the electrical resistivity, we see that there
is a significant probability (medium risk) that peripheral areas to the south, but also to the
north and east, are contaminated. These areas would probably warrant further testing to
confirm the level of contamination. In this case, the medium and high risk areas total 271,900
square meters.
From these results it is clear that estimated maps are inadequate for delineating risks of
contamination: they provide information on the expected value of the concentration, but not
on the local uncertainty. And because of their intrinsic smoothing properties, they may either
vastly overestimate or underestimate the extent of the risk zone. In this case the contour map
indicates a risk area (estimated concentration greater than 200 ppm) which is half the size of
the medium and high risk area shown on the probability map inferred from both the PAH
concentrations and the electrical resistivity.

It is worth remembering also that the objective of such a study is to provide a classification of
whether or not the soil, at a particular location, is likely to be affected by the contamination,
and not to provide a good estimate of the concentration at that location. The challenge is to
come up with probabilities, not with estimated concentrations. One may even argue that
the latter are irrelevant to the task at hand: high estimated values may correspond to a low
probability of exceedance and, conversely, moderate estimated values may be associated with
high probabilities of exceedance.
CONCLUSION

When incorporating indirect information into the estimation process, it is crucial that the
bivariate relationship existing between the direct and indirect information be correctly
rendered. Very often this relationship cannot be captured by the classical parametric
description based on the means and variances of the marginal distributions and the
correlation coefficient. A more general alternative consists in using a full bivariate histogram
to describe the relationship between the two attributes, and it is proposed that this bivariate
histogram be modeled by simulated annealing. This bivariate probabilistic model can then be
used to infer the local a priori distribution function of the contaminant concentration given
the known value of the secondary attribute. This local a priori distribution function is finally
conditioned to the existing local contaminant data by simple indicator kriging.
This approach is very general and can be used to address many different situations. It should
be stressed, however, that the relevance of its results depends heavily on how physically
meaningful the relationship between the main attribute and the co-attribute is, as described
by the bivariate probabilistic model.
APPENDIX I

This paper is concerned with PAH concentration and electrical resistivity. The technique,
however, is completely general and can be used whenever secondary information is
provided in the form of a bivariate histogram.
Z(u)  random variable describing the attribute of main interest, informed by N data
      values z(u_α), α = 1, ..., N
Y(u)  random variable describing the secondary attribute, informed by M data values
      y(u_β), β = 1, ..., M
u     location coordinates vector

The bivariate histogram provides the frequencies

f(u; z_i, y_j) = Prob{ z_{i-1} ≤ Z(u) < z_i and y_{j-1} ≤ Y(u) < y_j }

with:
z_i, i = 1, ..., I the thresholds discretizing the range of values [z_min, z_max]
y_j, j = 1, ..., J the thresholds discretizing the range of values [y_min, y_max]
At a given location u, the a priori distribution function of the main attribute will be given by:

F0(u; z_k | y(u)) = Prob{ Z(u) ≤ z_k | y(u) } = Σ_{i=1}^{k} f(u; z_i, y(u)) / Σ_{i=1}^{I} f(u; z_i, y(u))    (1)

where f(u; z_i, y(u)) denotes the bivariate frequency read from the bivariate histogram for
the class z_i and the local secondary value y(u),

with

r(u_α; z_k) = i(u_α; z_k) - F0(u_α; z_k | y(u_α))    (2)

and

i(u_α; z_k) = 1 if z(u_α) ≤ z_k ; 0 otherwise    (3)
Simple kriging with a bivariate histogram, therefore, involves the following steps:

1- Select the number of thresholds z_k, k = 1, ..., K, required to provide an adequate
discretization of the local ccdf. This number K of thresholds depends on the goal of the
study and need not be as large as the number of thresholds z_i used to build the bivariate
histogram;

2- For each datum z(u_α), define the residual indicator values r(u_α; z_k), k = 1, ..., K, using
the local a priori distribution function F0(u_α; z_k);
3- Establish the variogram model γ_r(h; z_k) of the residual indicator values;

4- Determine the local a priori distribution of the main attribute Z(u), given the local
secondary attribute value y(u), using equation (1);

5- Process the estimated ccdf to extract the required probabilities, quantiles or estimated
values.

In principle, this involves solving, at each grid node u, K simple kriging systems, since the
variogram models may be different for each threshold z_k. If it can be shown that the variogram
model does not change significantly from one threshold to another, then it is sufficient to
solve a single system, and to use the same weighting scheme for every threshold. This
approach, called median indicator kriging, or "mosaic model" (Lemmer 1984; Journel
1984), can simplify and speed up the whole estimation process.
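The residual-indicator conditioning, with a single weight set shared by all thresholds as in median indicator kriging, can be sketched as follows. Every number below (data values, thresholds, a priori cdfs and kriging weights) is hypothetical:

```python
import numpy as np

# Hypothetical setting: 4 nearby data, 2 thresholds (40 and 200 ppm), and one
# set of simple kriging weights reused for both thresholds (median IK).
z_data = np.array([10.0, 120.0, 300.0, 45.0])     # PAH at nearby samples
zk = np.array([40.0, 200.0])                      # thresholds
F0_data = np.array([[0.6, 0.9],                   # a priori cdfs at the data,
                    [0.5, 0.8],                   # taken from the bivariate
                    [0.4, 0.7],                   # model (assumed values)
                    [0.6, 0.9]])
F0_node = np.array([0.5, 0.8])                    # a priori cdf at the grid node
lam = np.array([0.4, 0.3, 0.2, 0.1])              # simple kriging weights

ind = (z_data[:, None] <= zk[None, :]).astype(float)   # indicator coding
resid = ind - F0_data                                  # residual indicators
ccdf = F0_node + lam @ resid                           # conditioned local ccdf

prob_exceed_200 = 1.0 - ccdf[1]                        # risk above 200 ppm
```

Because simple kriging of the zero-mean residuals is used, the a priori cdf plays the role of the mean and takes over wherever the weights are small, which is the smooth mixing of direct and indirect information described in the text.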
REFERENCES

Xu, W., 1994, "Histogram and Scattergram Smoothing Using Convex Quadratic
Programming," SCRF proceedings, Stanford University.

Silverman, B., 1986, "Density Estimation for Statistics and Data Analysis," Chapman and
Hall, New York.

Tran, T., 1994, "Density Estimation Using Kernel Methods," SCRF proceedings, Stanford
University.

Deutsch, C.V., 1994, "Constrained Modelling of Histograms and Cross-Plots with Simulated
Annealing," SCRF proceedings, Stanford University.

Press, W., et al., 1992, "Numerical Recipes in C: The Art of Scientific Computing,"
Cambridge University Press.

Deutsch, C.V., and Journel, A.G., 1992, "GSLIB: Geostatistical Software Library and User's
Guide," Oxford University Press.

Deutsch, C.V., and Cockerham, P.W., 1994, "Practical Considerations in the Application
of Simulated Annealing to Stochastic Simulation," Mathematical Geology, Vol. 26,
No. 1, pp. 67-82.

Journel, A.G., 1989, "Fundamentals of Geostatistics in Five Lessons," Volume 8, Short
Course in Geology, American Geophysical Union, Washington, D.C.

Lemmer, I.C., 1984, "Estimating Local Recoverable Reserves via Indicator Kriging," in
G. Verly et al., Geostatistics for Natural Resources Characterization, pp. 349-364,
Reidel, Dordrecht, Holland.

Journel, A.G., 1984, "The Place of Non-Parametric Geostatistics," in G. Verly et al.,
Geostatistics for Natural Resources Characterization, pp. 307-335, Reidel, Dordrecht,
Holland.
Michael R. Wild¹ and Shahrokh Rouhani²
REFERENCE: Wild, M. R., Rouhani, S., "Effective Use of Field Screening Techniques in
Environmental Investigations: A Multivariate Geostatistical Approach," Geostatistics for
Environmental and Geotechnical Applications, ASTM STP 1283, R. Mohan Srivastava, Shahrokh
Rouhani, Marc Cromer, A. Ivan Johnson, Alexander J. Desbarats, Eds., American Society for
Testing and Materials, 1996.
ABSTRACT: Environmental investigations typically entail broad data gathering efforts which
include field screening surveys and laboratory analyses. Although usually collected extensively,
data from field screening surveys are rarely used in the actual delineation of media contamination.
On the other hand, laboratory analyses, which are used in the delineation, are minimized to avoid
potentially high cost. Multivariate geostatistical techniques, such as indicator cokriging, were
employed to incorporate volatile organic screening and laboratory data in order to better estimate
soil contamination concentrations at an underground storage tank site. In this work, the direct and
cross variography is based on a multi-scale approach. The results indicate that soil gas
measurements show good correlations with laboratory data at large scales. These correlations,
however, can be masked by poor correlations at micro-scale distances. Consequently, a classical
direct correlation analysis between the two measured values is very likely to fail. In contrast, the
presented multi-scale co-estimation procedure provides tools for a cost-effective and reliable
assessment of soil contamination based on a combined use of laboratory and field screening data.
Assessing the extent of soil contamination can be very costly. Laboratory analysis of common
environmental contaminants can range from $200 to $1000 per sample for standard method
testing. Consequently, many investigations first use field screening techniques to help identify
relative levels of contamination and then select a few samples for laboratory analysis. In many
cases, the validity of field data are questioned and such data are rarely used in the actual
delineation of source contamination. This paper presents a geostatistical technique for an optimal
and defensible incorporation of field screening and laboratory data. This approach is intended to
¹Project Environmental Engineer, Dames & Moore, Inc., Six Piedmont Center, 3525
Piedmont Road, Suite 500, Atlanta, Georgia 30305.
WILD AND ROUHANI ON FIELD SCREENING 89
accomplish the following objectives:
Several screening devices or kits are available to measure various contaminants, such as volatile
organics, metals and pesticides. The most commonly used devices are portable soil-gas probes,
x-ray fluorescence spectrometers, and immunoassay kits. The measured results of these tools are
mostly qualitative in nature and may not correlate well to actual laboratory measurements.
BACKGROUND INFORMATION
Screening tools for VOCs, such as photoionization detectors (PID) and organic vapor analyzers
(OVA), can provide an inexpensive alternative to laboratory testing, especially for large,
multi-layer investigations. Manufacturers of these instruments advocate the use of these devices
as an effective method of measuring VOCs for preliminary site characterization, including the
delineation of subsurface contamination. Unfortunately, the reliability of these devices has proven
to be dependent on weather conditions, soil type and actual contaminant concentrations.
Several published case studies exist that incorporate data gathered using these instruments. One
study performed by Marrin and Kerfoot (1988) used a portable gas chromatograph and PID to
predict the extent of groundwater contamination by measuring volatile organics in the soil gas.
Thompson and Marrin (1987) also measured soil gas concentrations at 49 locations to estimate
groundwater contamination. The results of this estimation, however, were verified by an
inadequate number of groundwater samples (five). Crouch (1990) used gas detection tubes to
estimate contaminant concentrations in soil vapor. According to Crouch, gas detection tubes were
used because the OVA and PID are not compound specific and only provide a total VOC
measurement.
None of the above investigations performed any correlation analyses between soil gas probe
readings and laboratory measurements, and therefore assumed that the measured results from these
screening devices were reasonably accurate. Seigrist (1991) compared gas chromatography to two
PIDs of varying ionization potentials to measure volatile organics in a controlled environment.
The results of the comparison showed poor correlation between both PIDs and the gas
chromatograph and demonstrated that the PIDs were very sensitive to water vapor and responded
to natural organics including methane, ethylenes and alcohols. Smith and Jensen (1987) tried
correlating OVA and PID readings to laboratory measurements for total petroleum hydrocarbons
(TPH) in soil. Again, poor correlation prohibited using screening tools to estimate actual TPH,
and the authors cautioned against using screening tools as a sole criterion for determining soil
contamination.
The above works clearly indicate that the screening tools which are widely used and accepted in
the environmental field provide only qualitative information on total VOCs. The use of these tools
in delineating contamination is cautioned against because their effectiveness and reliability remain
unverified. On the other hand, comprehensive laboratory analyses of environmental investigations
can be prohibitively expensive. This study presents geostatistical procedures, such as cokriging, to
link the two measurement techniques and produce accurate maps of contamination.
Cokriging is defined as the estimation of one variable using not only observations of that variable
but also data on one or more additional, related variables defined over the same field (Olea 1991).
Cokriging is suitable for cases where the targeted variable is not sampled sufficiently to provide
acceptably precise estimates of that variable over the entire investigated field (Journel and
Huijbregts 1978). Such estimates may then be improved by correlating this variable with
better-sampled auxiliary variables as a function of their separation distance. This approach is fully
compatible with actual field conditions where VOC screening data are extensively collected but
only relatively few samples are verified by laboratory analysis. However, it must be emphasized
that the utility of cokriging depends on the level of spatial cross-correlation between the primary
and auxiliary variables. To obtain an adequate cross-correlation between investigated variables, it
may become necessary to apply data transformations prior to the actual co-estimation process.
This may require use of non-linear cokriging techniques.
The use of non-linear geostatistical techniques is preferable if one or both of the variables exhibit
non-gaussian tendencies. Such distributions are commonly observed in contamination
assessments of VOCs in soil. VOC data sets are usually characterized by a few significant outliers
and a majority of very low or non-detectable samples. Indicator kriging has been found to be
useful for highly variant phenomena where data present long-tailed distributions (Journel 1983).
This characterization also applies with respect to mineral deposit data sets (Isaaks and Srivastava
1989). This type of kriging uses a non-parametric approach that does not suffer from the impact
of outliers since the original values are transformed to either a 0 or 1 based on cutoff or threshold
limits (Isaaks and Srivastava 1989). The transformed values can then be used to estimate the
spatial distribution of the data.
GEOSTATISTICAL METHODOLOGY
Geostatistics provides tools for the analysis of spatially correlated data and is well-suited to the
study of natural phenomena (Journel and Huijbregts 1978). The theory of geostatistics has been
well-documented over the years; therefore, only a general description of techniques applicable to
this investigation are provided. These techniques are cokriging and indicator kriging. For more
information on geostatistics, see Journel and Huijbregts (1978).
Geostatistics allows for the estimation of values at unsampled locations. This estimation approach
is commonly known as kriging and is a linear combination of known nearby values, as shown by

Z0* = Σ_{i=1}^{n} λ_i Z_i    (1)

where Z0* = the estimated value of Z (an arbitrary parameter) at location x0;
Z_i = the measured value at location x_i;
λ_i = the kriging weight of the parameter value at x_i; and
n = the number of nearby sample points to be used in the estimation.

The weights are calculated to produce the lowest estimation error or variance and to satisfy the
unbiasedness condition (Σλ_i = 1, i = 1 to n). The minimized variance for ordinary kriging can be
written as

V0 = Σ_{i=1}^{n} λ_i γ_{i0} + μ    (2)

where γ_{i0} is the variogram value between x_i and x0 and μ is the Lagrange multiplier.
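The ordinary kriging estimate and its variance can be illustrated by solving the kriging system in variogram form. The sample coordinates, values and variogram model below are assumed for illustration:

```python
import numpy as np

# A sketch of ordinary kriging at one point; coordinates, values and the
# variogram model (exponential, sill 1, range 70 m) are all hypothetical.
pts = np.array([[0.0, 0.0], [50.0, 0.0], [0.0, 50.0]])   # sample locations (m)
z = np.array([12.0, 30.0, 18.0])                          # measured values
x0 = np.array([20.0, 20.0])                               # estimation point

def gamma(h):
    return 1.0 - np.exp(-3.0 * h / 70.0)                  # exponential variogram

n = len(pts)
A = np.ones((n + 1, n + 1))                # kriging matrix, last row/column
A[n, n] = 0.0                              # enforce the unbiasedness condition
for i in range(n):
    for j in range(n):
        A[i, j] = gamma(np.linalg.norm(pts[i] - pts[j]))
b = np.append(gamma(np.linalg.norm(pts - x0, axis=1)), 1.0)

sol = np.linalg.solve(A, b)
lam, mu = sol[:n], sol[n]                  # weights and Lagrange multiplier
z0 = lam @ z                               # kriging estimate
v0 = lam @ b[:n] + mu                      # kriging variance
```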
Cokriging
Cokriging is the estimation of one variable based on measured values of two or more variables.
This procedure can be regarded as a generalization of kriging in the sense that, at every location,
there is a vector [Z(Xj), Y(Xj)' ... ] of many variables instead of a single variable Z(x) (Olea 1991).
The variable to be estimated is denoted as the target or primary variable while all other variables
are categorized as auxiliary or secondary variables. The secondary variable is cross-correlated
with the primary variable. The cokriging procedure is especially advantageous in cases where secondary values are more abundant than primary values. The cokriging estimator can be written as

Z_0* = Σ_{i=1..n} λ_i Z_i + Σ_{j=1..m} v_j Y_j     (3)

where v_j = the weight factor for the secondary variable, Y, measured at x_j; and
m = the number of secondary-variable measurements (which is typically much greater than n).
92 GEOSTATISTICAL APPLICATIONS
Minimizing the variance of estimation error, V_0', subject to the cokriging unbiasedness conditions (Σ λ_i = 1, Σ v_j = 0) results in

V_0' = Σ_{i=1..n} λ_i γ_i0^Z + Σ_{j=1..m} v_j γ_j0^ZY + μ     (4)
This technique of cokriging improves the estimation and reduces the variance of estimation error
(Ahmed and De Marsily 1987).
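A minimal numerical sketch of the cokriging estimator follows, assuming the weights have already been computed; the data and weights are hypothetical and are only meant to show the two unbiasedness conditions at work.

```python
# Hypothetical primary (lab) values and standardized secondary values.
z = [10.0, 14.0, 9.0]
y = [0.2, -0.1, 0.4, -0.3, 0.1]
# Hypothetical cokriging weights: primary weights sum to 1,
# secondary weights sum to 0 (the cokriging unbiasedness conditions).
lam = [0.5, 0.3, 0.2]
nu = [0.10, -0.05, 0.05, -0.15, 0.05]
assert abs(sum(lam) - 1.0) < 1e-9
assert abs(sum(nu)) < 1e-9
# Cokriging estimate: weighted primary values plus a zero-sum-weighted
# correction from the more abundant secondary values.
z0 = sum(l * v for l, v in zip(lam, z)) + sum(n * v for n, v in zip(nu, y))
```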
Indicator Kriging
Indicator kriging first transforms each measurement z(x) into an indicator value of 0 or 1 relative to a threshold z_k:

I(x; z_k) = 1 if z(x) ≤ z_k
I(x; z_k) = 0 if z(x) > z_k     (5)
By using kriging, the interpolated indicator variable at any point x_0 can be estimated by

I*(x_0; z_k) = Σ_{i=1..n} λ_i I(x_i; z_k)     (7)
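The indicator transform and the probabilistic reading of a kriged indicator can be sketched as follows; the threshold, readings, and weights are hypothetical.

```python
def indicator(z, z_k):
    """Indicator transform: 1 if the measurement is at or below z_k, else 0."""
    return 1 if z <= z_k else 0

ova = [5.0, 12.0, 300.0, 18.0, 750.0]          # hypothetical OVA readings (ppm)
flags = [indicator(z, 20.0) for z in ova]

# A weighted average of indicators, with weights summing to one, can be read
# as an estimate of Prob[Z(x0) <= z_k]; any set of (non-negative) kriging
# weights plays that role here.
weights = [0.4, 0.3, 0.1, 0.1, 0.1]
p0 = sum(w * f for w, f in zip(weights, flags))
```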
CASE STUDY
The case site has been in operation for over fifty years and is currently a commercial site. The
area under investigation has a total of nine underground storage tanks (USTs) situated around the
site. Some of the tanks have been in operation for up to 30 years and were suspected to have
leaked for an unknown number of years. The site is approximately 90 to 95 percent covered with
concrete and is relatively level. The investigated data and some site characteristics have been
altered to maintain confidentiality.
The horizontal and vertical contamination at the site were assessed by three boring programs. A
total of 82 borings were advanced. All USTs were eventually excavated and removed after the
environmental investigation. A confirmatory campaign was performed after the tanks were
removed and produced 12 additional sample locations. Figure 1 shows the locations of the 94
borings and the previous locations of the tanks. A Foxboro® OVA(3) was used to screen for VOCs in each boring for all four investigations. The OVA was used to screen over 300 samples from these borings.
[Figure 1: Site map showing soil borings, USTs, and the building (graphic not reproduced).]
In this case study, the boring campaigns collected VOC information over the entire site. Far too few laboratory samples were collected for the size of the site. From an agency standpoint, this site characterization would be incomplete and unacceptable. Figure 2 shows the locations of the collected laboratory samples in the surficial soils.
(3) A Foxboro® OVA uses the principle of hydrogen flame ionization for detection and measurement of total organic vapors. The OVA meter has a scale from 0 to 10 which can be set to read at 1X, 10X, or 100X, corresponding to 0-10 ppm, 10-100 ppm, and 100-1000 ppm, respectively. The OVA is factory-calibrated to methane.
[Figure 2: Locations and concentrations of samples sent to the laboratory in the surficial soils; legend: building, UST, soil boring (graphic not reproduced).]
Only benzene, ethylbenzene, toluene and xylene (BTEX) were investigated. All other compounds
detected in the samples, together, made up less than eight percent of the total volatile compounds
detected (EPA Method 8240 or Priority Pollutant compounds were tested) . These compounds,
which were mostly methylene chloride measurements, were consistently measured at low levels.
Both the OVA and VOC measurements were grouped into 3-foot intervals. Because the three
boring campaigns did not always collect data at consistent depths, the intervals had varying
amounts of data and spatial distribution. However, these measurements were distributed over
various depths. Figure 2 shows only 13 samples in the surficial layer which was the most
impacted layer in terms of horizontal extent.
Cross-Correlation Analysis
The correlation analysis performed for each BTEX compound versus the corresponding OVA
reading produced poor correlations (Figure 3). The highest correlation coefficient was for the
Figure 3 -- OVA to ethylbenzene correlation analysis (scatter plot, OVA in ppm versus ethylbenzene in ppb; graphic not reproduced).
complete OVA data set versus ethylbenzene, yielding an R² = 0.37. This low correlation coefficient indicates that direct correlation between laboratory and OVA measurements could not be justified. As discussed previously, similar results were also found by many investigators.
A structural analysis was performed on the surficial soil OVA measurements to determine their spatial correlation. Due to the qualitative nature of the OVA, the structural analysis exhibited a high degree of variability. It was therefore concluded that, given the non-Gaussian shape of the histogram of OVA measurements, an indicator transformation was preferable.
Two approaches were considered for this analysis. The first approach, suggested by Isaaks and
Srivastava (1989), uses the median value of the data as the cutoff. Given the qualitative nature of
the OVA data, the median cutoff value may have no real significance. Therefore, a second
approach was developed. This approach identifies an OVA cutoff that would provide a high
degree of confidence that the soil is less contaminated than an established regulatory threshold for
petroleum hydrocarbons. This threshold was based on a review of a number of government
guidelines on petroleum-contaminated soils. The threshold or target value could then be used to
develop the conditional probability
Prob[Ethylbenzene ≤ Target | OVA]     (8)
Calculation of the conditional probability for varying target levels (Figure 4) showed that there was
a greater than 95 percent chance that the ethylbenzene level in the soil was equal to or less than a
Figure 4 -- Conditional probability based on OVA readings and regulatory cleanup standards (probability versus OVA readings in ppm; graphic not reproduced).
20 parts per billion (ppb) target level, given an OVA reading of 20 parts per million (ppm) or less.
Therefore, a target level of 20 ppb for ethylbenzene was selected as a conservative cleanup
standard. The 20 ppm cutoff value for the OVA readings was similar to the median values for the
surficial soils.
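A conditional probability of this kind could be tabulated from paired field and laboratory data roughly as follows; the (OVA, ethylbenzene) pairs below are invented for illustration and are not the site data behind Figure 4.

```python
# Hypothetical paired data: (OVA reading in ppm, ethylbenzene in ppb).
pairs = [
    (2, 5), (5, 8), (8, 15), (12, 18), (15, 12),
    (18, 19), (40, 150), (120, 900), (300, 4000), (800, 20000),
]

def prob_below_target(pairs, ova_cutoff, target):
    """Empirical Prob[ethylbenzene <= target | OVA <= ova_cutoff]."""
    subset = [eb for ova, eb in pairs if ova <= ova_cutoff]
    if not subset:
        return None
    return sum(1 for eb in subset if eb <= target) / len(subset)

# Probability that soil meets a 20 ppb target given an OVA reading <= 20 ppm.
p = prob_below_target(pairs, ova_cutoff=20, target=20)
```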
Using the surficial soils data, the indicator variogram of OVA at 20 ppm (Y) is shown as Figure 5.
Figure 5 -- Indicator variogram of OVA measurements at 20 ppm threshold (variogram versus distance in meters; graphic not reproduced).
As shown by this figure, the variogram demonstrated a well-defined spatial structure.
Recalling Figure 2, only 13 surficial ethylbenzene measurements were available for mapping soil
contamination. As determined in exploratory data analysis, the ethylbenzene measurements exhibited a tendency toward a lognormal distribution. To account for this tendency, the natural logs of the ethylbenzene measurements were taken. Furthermore, in order to avoid the possibility of numerical errors in the cokriging process, the log-transformed values were then normalized (mean = 0, standard deviation = 1). This made the latter data set numerically consistent with the indicator OVA values, thus minimizing the chance of numerical errors in cokriging.
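The log-transform and normalization step amounts to the following; the ethylbenzene values are hypothetical.

```python
import math

eb = [3.0, 12.0, 45.0, 160.0, 900.0]            # hypothetical ethylbenzene, ppb
logs = [math.log(v) for v in eb]                 # natural logs
mean = sum(logs) / len(logs)
sd = math.sqrt(sum((v - mean) ** 2 for v in logs) / len(logs))
z = [(v - mean) / sd for v in logs]              # mean 0, standard deviation 1
```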
Unlike the OVA measurements, the variogram for the standardized, log-transformed ethylbenzene measurements (Z) demonstrated a relatively poor spatial structure (Figure 6). Its short range prohibited accurate mapping with the current ethylbenzene data set.
Figure 6 -- Direct variogram of normalized ethylbenzene measurements (variogram versus distance in meters; graphic not reproduced).
All of the direct variography described above was performed using the U.S. Environmental Protection Agency (EPA) public domain program, GEO-EAS (Englund and Sparks 1988).
Cross-variography between the above two variables was conducted based on the linear model of co-regionalization (Rouhani and Wackernagel 1990). These computations were conducted using EPA's program, GEOPACK (Yates and Yates 1989). In this approach, the relationship between the direct and cross variograms is defined as
γ_Z(h) = Σ_i b_i^Z g_i(h),  γ_Y(h) = Σ_i b_i^Y g_i(h),  γ_ZY(h) = Σ_i b_i^ZY g_i(h)     (9)

such that

(b_i^ZY)² ≤ b_i^Z b_i^Y     (10)

where the g_i(h) are the basic variogram structures. The ratio of the fitted (b_i^ZY)² to b_i^Z b_i^Y represents the correlation coefficient (R_i²) at a scale consistent with the range of the ith basic variogram.
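Under the linear model of co-regionalization, each cross-structure coefficient must satisfy a Cauchy-Schwarz-type admissibility condition, and a per-structure correlation coefficient can be formed from the fitted coefficients. A sketch with invented coefficients:

```python
# Hypothetical fitted coefficients (b_Z, b_Y, b_ZY) for each basic
# variogram structure of a linear model of co-regionalization.
structures = [
    (0.8, 0.30, -0.40),
    (0.4, 0.10, -0.15),
]

# Admissibility: the squared cross coefficient may not exceed the product
# of the direct coefficients for any structure.
for b_z, b_y, b_zy in structures:
    assert b_zy ** 2 <= b_z * b_y, "model of co-regionalization not admissible"

# Per-structure correlation coefficient R_i^2.
r2 = [b_zy ** 2 / (b_z * b_y) for b_z, b_y, b_zy in structures]
```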
Figure 7 depicts the cross-variogram between the standardized, log-transformed ethylbenzene and
Figure 7 -- Cross-variogram of standardized ethylbenzene and indicator OVA (cross-variogram versus distance in meters; graphic not reproduced).
the indicator OVA. Table 1 summarizes the variogram models used for all structural analyses.
TABLE 1 -- Summary of variography. [Table body not reproduced.]
CONCLUSIONS
The above results show that field screening of contaminated sites can provide valuable information
for characterization and mapping. This objective is accomplished by:
[Bulleted summary and accompanying figure (map of areas with ethylbenzene ≥ 20 ppb; legend: building, UST, soil boring) not reproduced.]
REFERENCES
Crouch, M. S., 1990, "Check soil contamination easily," Chemical Engineering Progress, pp 41-45

Isaaks, E. H. and R. M. Srivastava, 1989, Applied Geostatistics, Oxford University Press, New York

Marrin, D. L. and H. Kerfoot, 1988, "Soil gas surveying techniques," Environmental Science and Technology, 22(7), pp 740-745

Olea, R. A., 1991, Geostatistical Glossary and Multilingual Dictionary, International Association of Mathematical Geology Studies in Mathematical Geology No. 3, Oxford University Press, New York

Rouhani, S. and M. Dillon, 1989, "Geostatistical risk mapping for regional water resources studies," The Use of Computers in Water Management, in International Water Resources Association - Technical Session, Moscow, pp 216-228

Siegrist, R. L., 1991, "Volatile organic compounds in contaminated soil: The nature and validity of the measurement process," Conference - Characterization and Cleanup of Chemical Waste Sites, Washington D.C., Journal of Hazardous Materials, 29(1), pp 3-15

Smith, P. G., and S. Jensen, 1987, "Assessing the validity of field screening of soil samples for preliminary determination of hydrocarbon contamination," Superfund '87, Hazardous Materials Control Resources Institute, pp 101-103

Sullivan, J., 1984, "Conditional recovery estimation through probability kriging theory and practice," in G. Verly et al., eds., Geostatistics for Natural Resources Characterization, Part I, D. Reidel Publishing Co., Dordrecht, pp 365-384

Yates, S. R., and M. V. Yates, 1989, "Geostatistics for Waste Management: A User's Manual For the GEOPACK (Version 1.0) Geostatistical Software System," EPA, R.S. Kerr ERL, Ada, OK
Robert L. Johnson¹
INTRODUCTION
JOHNSON ON SAMPLING PROGRAMS 103
mobilization costs, drilling or bore hole expenses, and sample analysis costs are all included. For example, the Department of Energy (DOE) estimates that it will spend between $15 and $45 billion for analytical services alone over the next 30 years to support environmental restoration activities at its facilities (DOE 1992).
One of the primary products of a site characterization study is an estimate of the
extent of contamination. Traditional characterization methodologies rely on pre-planned
sampling grids, off-site sample analyses, and multiple sampling programs to determine
contamination extent. Adaptive sampling programs present the potential for substantial
savings in the time and cost associated with characterizing the extent of contamination.
Adaptive sampling programs rely on recent advances in field analytical methods (FAMs)
to generate real-time information on the extent and level of contamination (McDonald et
al. 1994). Adaptive sampling programs result in more cost-effective characterizations by
reducing the analytical costs per sample collected, by limiting the number of samples
collected by strategically locating samples in response to field data, and finally by
bringing characterization to closure in the course of one sampling program. Adaptive
sampling programs can result in characterization cost savings on the order of 50% to
80% (Johnson, 1993).
Supporting adaptive sampling programs requires the ability to estimate the extent
of contamination based on available information, to measure the uncertainty associated
with those estimates, to determine the reduction of uncertainty one might expect from
collecting additional samples, and to direct sample collection so that sample locations
maximize information gained. Two key characteristics of contaminated sites must be taken into account. The first is that spatial autocorrelation is often present when samples are collected. The second is that there may be abundant "soft" information regarding the location and extent of contamination, even if little "hard" sample data are initially available. Soft data refers to information such as historical records, non-intrusive geophysical survey results, preliminary fate and transport modeling results, aerial photographs, past experience with similar sites, etc.
A number of geostatistical approaches to the design of sampling programs for
characterizing hazardous waste sites have been proposed in the past. Early methods
focused on minimizing some form of kriging variance (e.g., Olea 1984 and Rouhani
1985). More recent work has centered on stochastic conditional simulation techniques,
Bayesian implementations of geostatistics and more complex decision rules (for example,
Englund and Heravi 1992; McLaughlin et al. 1993; James and Gorelick 1994). In
practice, site characterization sampling program designs tend to blend rigid sampling
grids with selective sampling based on best engineering judgement. Typically there is
little quantitative analysis to support the final sampling program design.
A combined Bayesian/geostatistical methodology is well suited to quantitative adaptive sampling program support. Bayesian analysis allows the quantitative integration of soft information with hard data. Geostatistical analysis provides a means for interpolating results from locations where hard data exist to areas where they do not. A general Bayesian/geostatistical approach to merging soft and hard data is the Markov
Bayes model described by Deutsch and Journel (1992). The Markov Bayes model
estimates conditional cumulative density functions by developing covariance
relationships between soft and hard data sets, and pooling the two different data sources
through a form of indicator cokriging.
The methodology described in this paper exploits the fact that environmental
indicator sampling resembles binomial sampling. Binomial sampling events allow for
the derivation of conjugate prior and posterior probability density functions, which in
turn greatly simplifies computational effort. By incorporating soft information into an
initial conceptual model that is subsequently updated as hard sampling data becomes
available, the development of covariance models between soft and hard data is avoided.
The classification of areas as clean or contaminated, and the selection of additional
sampling locations, is based on a form of Type I and Type II error analysis, an approach
consistent with the Environmental Protection Agency's (EPA) Data Quality Objectives
approach to environmental restoration decision making.
METHODOLOGY
Classical statistics estimates the most likely value for π, the probability of encountering contamination, by using hard sample data results. For example, if 20 random locations were sampled at a site and 5 of these samples returned contamination levels above an action threshold, then an unbiased estimator of the true probability of observing contamination above that threshold for any random location at the site would be the number of hits divided by the number of samples, or 0.25. In classical statistics
one could carry the analysis one step further and develop confidence intervals around this
estimator with some basic assumptions about the underlying probability distribution.
Kriging provides similar results for individual points in space, accommodating spatial
autocorrelation as well. Neither classical statistics nor geostatistics provide a means of
quantitatively accommodating soft information in the analysis. For the design of
sampling programs to characterize contamination extent and subsequent analysis of
sampling program results, soft information often plays a crucial role.
A Bayesian approach differs from classical statistics by assuming that parameters
(such as the presence of contamination at a node) are unknown initially, but have some
known probability distribution called the prior probability density function (pdf). As
additional information becomes available (such as results from new sampling locations),
these prior pdfs can be updated quantitatively using Bayes' rule to produce posterior
probability density functions:

P(X|Y) = P(Y|X) P(X) / P(Y)     (1)

where P(X|Y) is the posterior pdf for X, P(X) is the prior pdf for X, and P(Y|X) reflects the probability distribution associated with observing Y given the prior pdf of X.
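A small numerical illustration of Bayes' rule, with hypothetical probabilities:

```python
# Hypothetical values: prior P(X) = 0.3, likelihoods P(Y|X) = 0.8 and
# P(Y|not X) = 0.2, where X might be "contamination present" and Y an
# observed positive field screening result.
p_x, p_y_given_x, p_y_given_notx = 0.3, 0.8, 0.2

# Total probability of the observation, then Bayes' rule for the posterior.
p_y = p_y_given_x * p_x + p_y_given_notx * (1 - p_x)
posterior = p_y_given_x * p_x / p_y
```

The positive observation raises the probability of X above its prior, as expected.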
From a Bayesian perspective, a two-parameter beta distribution Be(α, β) is a conjugate prior in the context of Bernoulli trials and the binomial distribution (Lee 1989). Be(α, β) ranges between zero and one, and can assume a variety of shapes depending on the values of α and β. For a random variable π that follows a beta distribution, the expected value of π is given by:
E(π) = α / (α + β)     (2)

where

α, β = parameters associated with the beta pdf for π, α, β ≥ 0.

The variance of π is

Var(π) = αβ / [(α + β)²(α + β + 1)]     (3)
The probability of contamination at a location x_0, p*, is interpolated from the indicator values of nearby samples,

p*(x_0) = Σ_{i=1..N} w_i Z(x_i)     (4)

where

x_i = locations where samples have been collected;
Z(x_i) = 0 or 1, depending on whether the sample at x_i encountered contamination above the threshold.
Σ_{i=1..N} C_ij w_i + μ = C_j0     for j = 1, ..., N     (5)

Σ_{i=1..N} w_i = 1     (6)

where

C_ij = covariance between sample locations x_i and x_j;
C_j0 = covariance between sample location x_j and the point where the interpolation is taking place, x_0.

N* at x_0 can be tied to N, the number of samples taken, through the following relationship:

N* = N (C_00 / Var_estim - 1)     (7)

Var_estim = C_00 - (Σ_{i=1..N} w_i C_i0 + μ)     (8)
where

Var_estim = the estimation variance associated with the interpolation of p* at location x_0;
C_00 = the variance of the indicator values;
μ = the average of the indicator values for the sample locations involved in the updating.
Equation (7) is heuristically based. When the sampled locations are all "distant"
from the point of interest (i.e., greater than the spatial autocorrelation range), N* goes to
zero, implying that the sampled locations contribute no information at the point of
interest. As a sampled location comes close to the point of interest, N* goes to infinity,
indicating that the sample information has specified the probability at the point of interest
exactly.
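The limiting behavior just described can be reproduced by a relationship of the form N* = N(C_00/Var_estim - 1); this particular form is an assumption chosen to match the stated limits, not a formula quoted verbatim from the paper.

```python
# Hypothetical form for N*, the "effective number of samples" at a point.
# Limits match the text: when the samples are distant, Var_estim approaches
# the full variance C00 and N* -> 0; as a sample approaches the point of
# interest, Var_estim -> 0 and N* -> infinity.
def n_star(n, c00, var_estim):
    return n * (c00 / var_estim - 1.0)
```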
The methodology begins by defining a uniform grid over the region of interest.
Grid nodes are designated as Decision Points (DPs). At each DP, a pdf based on the two
parameter beta distribution Be(α, β) is defined. The beta pdf associated with each DP describes the probability of encountering contamination above a pre-selected threshold level at that DP. Initial values for α and β are selected to represent a synthesis of any soft information available for a site, using equations (2) and (3). For a particular DP, the values of α and β relative to each other determine the expected probability of contamination at that DP. The absolute sizes of α and β determine the certainty associated with the beta distribution at that DP. For example, both α = β = 0.4 and α = β = 40 result in an expected probability of contamination equal to 0.5. However, in the latter case, the variance as calculated in equation (3) is much less. In the unlikely case where no information is available at a particular DP, a "non-informative" prior can be selected that sets α and β equal to one at that DP.
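The conjugate beta/binomial machinery described above can be sketched as follows; the prior parameters and sample counts are hypothetical.

```python
def beta_mean_var(a, b):
    """Mean and variance of a Beta(a, b) distribution."""
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1.0))
    return mean, var

# Two priors with the same expected probability of contamination (0.5)
# but very different certainty: small parameters mean high variance.
m1, v1 = beta_mean_var(0.4, 0.4)
m2, v2 = beta_mean_var(40.0, 40.0)

# Conjugate update for binomial sampling: k "hits" in n indicator samples
# simply increments the parameters (posterior is Beta(a + k, b + n - k)).
def update(a, b, hits, n):
    return a + hits, b + (n - hits)

# Non-informative Beta(1, 1) prior updated with 5 hits in 20 samples.
a_post, b_post = update(1.0, 1.0, hits=5, n=20)
m_post, _ = beta_mean_var(a_post, b_post)
```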
Updating the set of decision points with hard sampling data requires knowledge of the variogram or covariance function for the site. Because the values of p* and N* at x_0 are independent of C_00, the primary covariance function parameters of concern are its shape, or functional form, and its range. If sufficient hard data exist, one can estimate the covariance function from an experimental variogram analysis.
A simple measure of the uncertainty associated with contamination extent is to categorize decision points as "clean", "contaminated", or state uncertain at a given certainty level, where the probability of contamination being present at any given decision point is based on equation (2) using the posterior beta pdf parameters associated with that decision point. For example, if one wishes to be 90% certain that the classification is correct when a decision point is classified as either clean or contaminated, then decision points with E(π) ranging between 0.1 and 0.9 would be classified as state uncertain. This definition of uncertainty parallels the use of uncertainty by the EPA in its Data Quality Objectives approach to decision-making.
This method for handling uncertainty also leads naturally to measures of benefit
one might expect from additional data collection. For example, one might wish to
sample those locations that would be expected to maximize the number of decision points
classified as "contaminated" or "clean" at a given certainty level, or to minimize the
number classified as state uncertain.
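The three-way classification at a given certainty level reduces to a simple thresholding rule; the posterior probabilities below are hypothetical.

```python
def classify(p, certainty=0.90):
    """Classify a decision point given its posterior probability of
    contamination E(pi) and a desired certainty level."""
    lo, hi = 1.0 - certainty, certainty
    if p <= lo:
        return "clean"
    if p >= hi:
        return "contaminated"
    return "uncertain"

# Hypothetical posterior means at four decision points, 90% certainty.
labels = [classify(p) for p in (0.02, 0.5, 0.95, 0.11)]
```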
EXAMPLE APPLICATION
The owner wants to avoid remediating soils that are actually clean, and also to minimize
his characterization costs. After negotiations, the regulator agrees to tolerate a 20%
chance that a soil volume identified as clean is actually contaminated. The owner will be
responsible for removing and remediating all areas that have greater than 20% chance of
contamination being present.
There is no initial hard sampling data for this site. The available soft information
includes the location of the lagoon, scattered survey points from which a terrain model
can be built to indicate the probable direction of overland flow and hence contaminant
migration, the location of a utility building on site that would have been a barrier to flow,
and the location of roads with embankments that would have also blocked flow . This
soft information is used to construct the initial conceptual image of where contamination
likely is, and where it likely is not.
A grid is superimposed over the site that consists of 625 decision points (Figure
[Figure 2: Grid of 625 decision points superimposed over the site (graphic not reproduced).]
2). At each decision point, a beta distribution is defined, with parameters selected to reflect the soft information available. For decision points that are in the building, α is set equal to zero and β to a very large number to reflect the fact that the interior of the building is known to be clean. For decision points within the lagoon, α is set equal to a very large number and β equal to zero, to reflect the fact that the lagoon is known to be contaminated. For the balance of the decision points, α and β are set to values less than 0.5, with their relative sizes selected so that equation (2) reflects the initial probability of the presence of contamination.
Figure 3 shows the gray-scale representation of the initial conceptual model once the beta distribution parameter values have been selected, along with a set of terrain contours based on the available survey points. As is shown in Figure 3, the initial conceptual image is faithful to the location of the lagoon, building, and land surface contours. The area demarcated with the heavy black line indicates soil with contamination probability greater than 0.2 based on this initial conceptual model. At this
[Figure 3: Initial conceptual model; gray scale shows contamination probability from 0 to 1.00 (graphic not reproduced).]
point, without any sampling, the owner would have to clean up 34 440 m² of soil, more than four times what is actually contaminated.
Before the adaptive sampling program can begin, the methodology requires a
covariance function. At the outset there is no hard data upon which to base a covariance
function choice. If the covariance function were selected to honor the initial
conceptualization, a range of approximately 200 meters would be used. The larger the
assumed range, however, the fewer the samples that would be required to characterize the
site. As a conservative start, for this example an isotropic exponential covariance
function is assumed with range of 50 meters.
A traditional sampling program for a site such as this would probably rely on a
pre-planned, regular sampling grid. As a point of comparison for the subsequent
adaptive sampling examples, Figure 4 shows an example pre-planned sampling program
based on a triangular grid pattern. The gray-shaded surface contained in Figure 4 shows
the results when a non-informative initial conceptual model is updated with the
[Figure 4: Pre-planned sampling program on a triangular grid; gray scale shows contamination probability from 0.01 to 0.99 (graphic not reproduced).]
information that would have been derived from this sampling program. The underlying beta distribution parameters for each decision point were set to α = β = 0.1. In this scenario, the 14 samples result in classifying 23 230 m² of soil as requiring remedial action (i.e., the probability of contamination for these soils is greater than 0.2). This captures 87% of the soils actually contaminated, and includes 16 230 m² of uncontaminated soil.
The classification of much of the clean area in Figure 4 as being contaminated is
a product of the uncertainty associated with the use of an "ignorant" or non-informative
prior during the updating process. If one uses the initial conceptual model shown in
Figure 3, and updates it with the results from the sampling program shown in Figure 4,
one obtains a different interpretation of the site. Figure 5 shows the results graphically.
Using an initial conceptual model that reflects what is known at the outset about the site results in classifying 22 000 m² of soil as requiring remedial action. This captures more than 98% of the soil actually contaminated, and includes 14 190 m² of uncontaminated
soil.
If one incorporates the underlying soft information available for the site, as
displayed in Figure 3, and then sequentially selects 14 sampling locations that maximize
the area that would be classified as clean at the 80% certainty level, then one obtains the
pre-planned sampling program shown in Figure 6. The sequential selection of sampling
locations proceeded as follows. First, a set of potential sampling locations based on a
tight grid was established. Second, each potential sampling point was evaluated based on
the impact sampling that point would have on the categorization of soils as requiring
remedial action or not. If a potential sampling location had already been selected for
sampling, then it was discarded. In this evaluation, it was assumed that the sampling
result observed would be the most likely result based on the initial conceptual model
conditioned with any locations either already sampled, or already selected for sampling.
The potential sampling location that provided the greatest increase in the area of soil
classified as clean would be added to the list of locations to sample. This process was
[Figure residue: sampling point locations and contamination probability gray scales from 0 to 1.00 (graphics not reproduced).]
used for identifying the next sampling location; the difference is that the decision is
conditioned on actual sample results, not assumed sample results as in the selection
process for the pre-planned program. An adaptive sampling program at this site, driven
by the objective of maximizing the area classified as clean at the 80% certainty level,
would initially follow the same course as the pre-planned program shown in Figure 6.
The reason is that all fourteen samples collected as part of the pre-planned sampling program encountered what was expected: no contamination.
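The greedy selection loop described above can be skeletonized as follows; the scoring function here is a simple stand-in (a real implementation would re-run the Bayesian update for each candidate, assuming its most likely result, and measure the area newly classified as clean).

```python
def select_samples(candidates, score, n_samples):
    """Greedy sequential selection: repeatedly pick the candidate location
    with the highest score given the locations already chosen."""
    chosen = []
    for _ in range(n_samples):
        remaining = [c for c in candidates if c not in chosen]
        if not remaining:
            break
        best = max(remaining, key=lambda c: score(c, chosen))
        chosen.append(best)
    return chosen

# Hypothetical stand-in score: prefer candidates far from chosen points.
def spread_score(c, chosen):
    if not chosen:
        return 0.0
    return min((c[0] - p[0]) ** 2 + (c[1] - p[1]) ** 2 for p in chosen)

grid = [(x, y) for x in range(0, 100, 20) for y in range(0, 100, 20)]
picks = select_samples(grid, spread_score, 4)
```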
In the case of an adaptive sampling program, one has the additional option of
continuing sampling until the goals of the program have been met. Figure 7 shows the
locations of an additional 14 samples for this site, along with the results from updating
the underlying conceptual model with their results. The additional 14 samples reduced the area classified as requiring remedial action to 10 070 m². This included 96% of the soil actually contaminated, and 2 460 m² of uncontaminated soils. Each sample reclassified, on average, 350 square meters of soil, a significantly smaller amount than
obtained from the first 14 samples. There are two reasons for this: first, there is simply less area available for reclassification to clean. The second is that the sampling has begun to encounter the unexpected: contaminated soil.
CONCLUSIONS
Adaptive sampling programs provide the opportunity for significant cost savings
during the characterization of a hazardous waste site. The challenge for adaptive
sampling programs is providing real-time sampling program support that both
incorporates the typically significant amounts of soft information available, and that
accounts for the spatial autocorrelation that is omnipresent. A joint Bayesian analysis/indicator geostatistical method can be used to guide the selection of sampling locations, to estimate the extent of contamination based on available data, and to determine the expected benefits to be gained from additional sampling.
The example illustrates how the addition of soft information to the design of a
sampling program can result in a more directed sampling strategy. When the ability to
guide the program while in the field is added, the potential for cost savings is great.
ACKNOWLEDGEMENTS

The work presented in this paper was funded through the Mixed Waste Landfill Integrated Demonstration, funded by the Office of Technology Development, Office of Environmental Restoration and Waste Management, U.S. Department of Energy, through contract W-31-109-ENG-38.
REFERENCES
Deutsch, C. V. and A. G. Journel, GSLIB: Geostatistical Software Library and User's Guide, Oxford University Press, New York, NY, 1992.

Johnson, R. L., Adaptive Sampling Strategy Support for the Unlined Chromic Acid Pit.

Lee, P. M., Bayesian Statistics: An Introduction, Oxford University Press, New York, NY, 1989.

McLaughlin, L. B., L. B. Reid, S.-G. Li, and J. Hyman, "A Stochastic Method for Characterizing Ground-Water Contamination", Ground Water, Vol. 31, No. 2, 1993, pp. 237-249.

Rouhani, S., "Variance Reduction Analysis", Water Resources Research, Vol. 21, No. 6, June 1985, pp. 837-846.
Kadri Dagdelen¹ and A. Keith Turner²

INTRODUCTION
This paper describes why a data set coming from an environmental site may not be suitable for analysis under the assumptions of a stationary random function model. It shows how ignoring this condition leads to biased kriged estimates, and documents approaches to address the stationarity issue, thereby producing more accurate estimates of contaminant distribution.
SITE DESCRIPTION
Geologic Framework
The site straddles the eastern margin of a portion of the Colorado Front
Range. The western portions are dominated by Precambrian high-grade
metamorphic and igneous intrusive rocks. Younger sandstone formations
are found to the east of the Precambrian rocks. These now dip away from
the mountain front at relatively steep angles, up to 50°. Consequently,
the eastern portions of the site are entirely restricted to the lower
and middle portions of the Fountain Formation. Large and small
fractures, faults, and shear zones, some over a mile wide and extending
for many tens of miles, are common in the Precambrian rocks. Renewed
movements along several of these zones of weakness introduced fractures
and faults within the younger sedimentary rocks. These sandstones are
partly covered by unconsolidated Quaternary and Holocene deposits,
composed of silty sandy gravels with substantial proportions of clay.
However, the older of these units represent pediment surface deposits
and are distinct from the younger units, which are valley-fill alluvium
deposited at lower elevations in more geographically restricted areas
following a period of valley down-cutting.
DAGDELEN AND TURNER ON STATIONARITY 119
Three distinct hydrologic regimes are obvious at the site: the older
Precambrian rocks, younger sedimentary rocks, and overlying
unconsolidated deposits. Each has distinctive characteristics and
interactions between these regimes are relatively complex.
Figure 1. Map of the site showing drill hole locations (the five heavier
circles represent the five largest-valued samples).
Variable
In those zones where upward flow from the bedrock into the alluvium
appears to dominate, most wells monitoring the alluvial units report TCE
contamination. Yet, in this same zone, the majority of wells monitoring
the bedrock ground-water flow system show no TCE contamination. In
contrast, in those zones where downward flow dominates, alluvial wells,
with only a few exceptions, report no TCE contamination at locations
where most bedrock wells report TCE contamination. In the zone where
neither upward nor downward flow appears to dominate, many bedrock and
alluvial wells report TCE contamination. TCE contamination of the
Fountain bedrock thus appears to be mostly restricted to those portions
of the site where downward ground-water movement from the alluvial units
may be occurring. In these locations, small groups of bedrock wells
reporting TCE contamination are surrounded by non-contaminated bedrock
wells. The reported TCE contaminants in the bedrock thus appear to be
directly related to downward movement of contaminants from the overlying
unconsolidated materials.
ANALYSIS PROCEDURES
Limitations to the use of ordinary kriging are illustrated with the data
from this site. For these analyses, a threshold limit of 3.0 ppm TCE
contamination was selected because it seemed to be about the lowest
reported value in any of the wells and was slightly less than the
drinking water standard of 5 mg/l. When data from the entire site were
combined and evaluated by ordinary kriging procedures, the resulting
bias produced over-estimation of the observed concentration values over
much of the site (Table 1). The data were then divided into subregions
defined by careful interpretation of geologic conditions at the site, as
described previously. These subregional data sets were individually
analyzed with ordinary kriging procedures, and although a lower degree
Figure 6 shows results of ordinary block kriging of the entire data set
using the above parameters, and a minimum of 3 and a maximum of 16
samples. The map suggests that almost all the areas covered by drill
holes are contaminated at levels exceeding 3.0 ppm, although examination
of the observational data revealed that only 43.5% of the drill holes
exceed this value. Considerable bias toward over-estimation has
apparently occurred (Table 1). Figure 7 shows the results observed by
cross-validation of kriged estimates and sampled values. Cross
validation allows testing of the estimation method at locations of
existing samples. The sample value at a particular location is
temporarily discarded from the sample data set; the value at the same
location is then estimated using the remaining samples. The procedure is
repeated for all available samples (Isaaks and Srivastava 1989). On
Figure 7 (and also in Figures 9, 11, and 13), a circle enclosing a plus
sign represents locations where the sample value is below 3.0 ppm, yet
the kriged estimate is greater than 3.0 ppm. Thirty-six percent of the
locations (25 of the 72 locations that had at least 3 samples within the
search window) were estimated as contaminated (over 3.0 ppm) when, in
reality, the sampled value was below 3.0 ppm. These results are
summarized in Table 1.
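The leave-one-out cross-validation loop described above can be sketched as follows. This is only an illustrative sketch, not the authors' code: it substitutes a simple inverse-distance estimator for the ordinary-kriging step, and the sample coordinates and TCE values are hypothetical; only the 3.0 ppm threshold comes from the text.

```python
import numpy as np

def cross_validate(coords, values, threshold=3.0, power=2.0):
    """Leave-one-out cross-validation: temporarily discard each sample,
    re-estimate it from the remaining samples, and count locations
    estimated as contaminated although the sample itself is below the
    threshold."""
    misclassified = 0
    for i in range(len(values)):
        mask = np.arange(len(values)) != i        # drop sample i
        d = np.linalg.norm(coords[mask] - coords[i], axis=1)
        w = 1.0 / d**power                        # inverse-distance weights
        est = np.sum(w * values[mask]) / np.sum(w)
        if est > threshold and values[i] <= threshold:
            misclassified += 1
    return misclassified

# hypothetical TCE samples (ppm) at arbitrary easting/northing coordinates
coords = np.array([[0., 0.], [10., 0.], [0., 10.], [10., 10.], [5., 5.]])
values = np.array([1.0, 2.0, 8.0, 9.0, 2.5])
print(cross_validate(coords, values))
```

In a real study the inverse-distance estimate would be replaced by the kriged estimate built from the fitted variogram and the same search parameters used for mapping.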
Figure 4. Contour map of the pairwise relative variogram surface for
the TCE data set for the entire site.
Figure 6. Map showing results of ordinary kriging using the data set for
the entire site.
Figure 7. Map showing cross-validations for ordinary kriging applied to
the data set for the entire site.
Figure 8. Map showing results of ordinary kriging applied to the data
sets for the subregions.
Figure 9. Map showing cross-validations for ordinary kriging applied to
the data sets for the subregions.
Indicator Kriging Using Data from Entire Site
CONCLUSIONS
Figure 11. Map showing cross-validation of indicator kriging applied to
the data set for the entire site.
Figure 12. Map showing estimated probability of threshold exceedance by
indicator kriging applied to the data sets for the subregions.
Figure 13. Map showing cross-validations for indicator kriging applied
to the data sets for the subregions.
REFERENCES
REFERENCE: Leonte, D. and Schofield, N., "Evaluation of a Soil Contaminated Site and Clean-
up Criteria: A Geostatistical Approach," Geostatistics for Environmental and Geotechnical Ap-
plications, ASTM STP 1283, R. Mohan Srivastava, Shahrokh Rouhani, Marc V. Cromer, A. Ivan John-
son, Alexander J. Desbarats, Eds., American Society for Testing and Materials, 1996.
INTRODUCTION
Lead 300
Arsenic (total) 100
Cadmium 20
Benzo(a)pyrene 1
Substance      Background     Env. investigation
Antimony       4 - 44         20
Arsenic        0.2 - 30       20
Barium         20 - 200
Cadmium        0.04 - 2       3
Chromium       0.5 - 110      50
Cobalt         2 - 170
Copper         1 - 190        60
Lead           <2 - 200       300
Mercury        0.001 - 0.1    1
Molybdenum     <1 - 20
Nickel         2 - 400        60
Site Description
also known that the whole area, being a long strip of land along a
wharf, is heavily contaminated. The contamination is associated
with old practices of dumping both domestic and industrial residues,
in times when legislation controlling waste disposal in Australia was
nonexistent and mudflat "reclamation" practices of this manner were
actually encouraged. These residues are known, from other nearby areas,
to be of both Australian and overseas sources.
A development proposal to use the site for a medium density
residential development comprising some 200 residential units and a
retirement village complex initiated a site assessment as an initial
evaluation of the potential for soil contamination.
The site was sampled by taking 54 mm diameter continuous cores
from boreholes located on a grid of approximately 25 x 25 m. The
boreholes were sampled every 500 mm to a depth of 3 m and submitted for
chemical analysis. The 1 000 - 1 500 mm and 1 500 - 2 000 mm layers
were alternated between boreholes. The depth to which samples were
taken was based on the depth to groundwater and natural soil, which was
recorded on the borelogs. Samples were taken by splitting the core
down the middle. The remaining core was retained for reference and in
the cases of "hotspots", was used for further testing.
Sampling of Hotspots
bores 76 and 75, located to the south and north of bore 109, indicated
a much lower concentration in layer 1 and a higher concentration in
layer 2, as shown below:

Depth               B 76        B 75
0 - 500 mm          470 ppm     14 ppm
500 - 1 000 mm      2 650 ppm   3 150 ppm
1 000 - 1 500 mm    180 ppm     not sampled
Statistics
Summary statistics, lead (ppm):
# data: 323
Mean: 693.8
Variance: 8190539
Coef. Var: 4.12
Minimum: 4.0
1st Quart: 20.0
Median: 105.0
3rd Quart: 430.0
Maximum: 40800.0
MAD: 665.5
measure used is the correlogram of Srivastava and Parker (1988).
Analysis of directional variograms did not indicate any preferred
orientation to the lead contamination. There is no reason to believe
that some structured pattern of dumping lead-contaminated waste would
have been used at the site.

cases, the sample variograms have been reasonably fitted with a nugget
and a single exponential model.
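A nugget plus single exponential variogram model of the kind fitted here can be evaluated as below. This is a hedged sketch: the practical-range convention (the variogram reaches about 95% of the sill at h = a) and the parameter values are assumptions for illustration, not the paper's fitted models.

```python
import numpy as np

def exp_variogram(h, c0, c1, a):
    """Nugget + single exponential variogram model.
    c0: nugget, c1: partial sill, a: practical range
    (gamma reaches ~95% of c0 + c1 at h = a)."""
    h = np.asarray(h, dtype=float)
    # gamma(0) = 0 by definition; the nugget appears for any h > 0
    return np.where(h > 0.0, c0 + c1 * (1.0 - np.exp(-3.0 * h / a)), 0.0)

# hypothetical standardized fit: nugget 0.1, partial sill 0.9, range 60 m
h = np.array([0.0, 10.0, 60.0, 200.0])
g = exp_variogram(h, c0=0.1, c1=0.9, a=60.0)
```

With standardized (sill ≈ 1) variograms such as these, c0 + c1 near 1 indicates the nugget fraction directly.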
Figure 2. Contour map of lead concentrations in the first soil layer
(0 - 500 mm) showing borehole and hotspot locations in grey shading.
LEONTE AND SCHOFIELD ON CONTAMINATED SITE 141
Figure 3. Lead variograms. Model parameters: C0: 0.11 nugget, C1: 0.91
exponential, range: 52; C0: 0.11 nugget, C1: 0.88 exponential, range: 61;
C0: 0.51 nugget, C1: 0.51 exponential, range: 72.
Figure 4. Contours of the conditional probability for lead
concentration to exceed the recommended level of 300 ppm (layers 1 and
2). Hotspots are shown by grey shading.
Figure 5 presents contour maps of the estimated probability for
lead concentration to exceed 500 ppm in soils for layers 1 and 2. On
these maps, the areas with very low probability of contamination are
clearly shown. The areas with potentially high contamination are also
clear, with the southern area (20 m to 120 m northing and 270 m to 320 m
easting) again standing out as unrecognised by the previous
investigation.
Figure 5. Contours of the conditional probability for lead
concentration to exceed the recommended level of 500 ppm. Hotspots are
shown by grey shading.
CONCLUSIONS
Amilcar O. Soares, Pedro J. Patinha, Maria J. Pereira
ABSTRACT: This study aims to develop a methodology to simulate the joint behaviour,
in space and time, of some water quality indicators of a river, resulting from a mine
effluent discharge, in order to enable the prediction of extreme scenarios for the entire
system. Considering one pollutant characteristic measured in N monitoring stations along
the time T, a random function X(e,t), e=1,...,N, t=1,...,T, can be defined. The proposed
methodology, a data-driven approach, intends to simulate the realisation of a variable
located in station e at time t, based on values located before e and t, and using a sequential
algorithm. To simulate one value from the cumulative distribution F{X(e,t) | X(e-1,t), ...,
X(1,t), X(e,t-1), ..., X(e,1)}, the basic idea of the proposed methodology is to replace the
conditioning values by a linear combination of those:

[X(e,t)]* = Σ(u=1..e-1) au X(u,t) + Σ(β=1..t-1) bβ X(e,β)

which allows the values to be drawn sequentially from bidistributions. The final
simulated time series of pH and dissolved oxygen reproduce the basic statistics and the
experimental time and spatial covariances calculated from historical data recorded over
15 months at a selected number of monitoring stations on a river with an effluent
discharge of a copper mine located in the south of Portugal.
Professor, CVRM - Centro de Valorização de Recursos Minerais, Instituto Superior Técnico, Av.
Rovisco Pais, 1096 Lisboa Codex, Portugal.
Research Fellow, CVRM - Centro de Valorização de Recursos Minerais, Instituto Superior
Técnico, Av. Rovisco Pais, 1096 Lisboa Codex, Portugal.
SOARES ET AL. ON SPACE-TIME SERIES 147
Monitoring Station N:
t=1: draw a value x1 from F(X(N,1))
t=2: draw a value x2 from F(X(N,2) | X(N,1)=x1)
...
t=T: draw a value xT from F(X(N,T) | X(N,1)=x1, ..., X(N,T-1)=xT-1)

The basic idea of the proposed methodology is to substitute the N×T
conditioning values by a linear combination of these:

[X(e,t)]* = Σ(u=1..e-1) au X(u,t) + Σ(β=1..t-1) bβ X(e,β)
Monitoring Station N:
t=1: draw a value x1 from F(X(N,1))
t=2: draw a value x2 from F(X(N,2) | [X(N,2)]* = X(N,1) = x1)
...
t=T: draw a value xT from F(X(N,T) | [X(N,T)]* = a1 X(1,T) + ... + bT-1 X(N,T-1))
= F(X(N,T) | a1 x1 + ... + bT-1 xT-1)

If the same pattern of neighbourhood values (in space and time) is used to
calculate all [X(e,t)]*, e=1,...,N, t=1,...,T, the bidistribution functions
F(X(e,t) | [X(e,t)]*) can be inferred from the historical data.
Using the same neighbourhood pattern one can calculate the pairs
(X(e,t), [X(e,t)]*) for all t=1,...,T and e=1,...,N. The bi-plots
(X(e,t), [X(e,t)]*) can thus be calculated for each monitoring station
and for homogeneous periods of time (see Fig. 1). To simulate one value
for X(e,t) conditioned on the known value [X(e,t)]* = x* (calculated with
the previously simulated values), first we need to select all pairs
which belong to the class of [X(e,t)]*; afterwards, one value of X(e,t)
is drawn randomly from them. Once the value xs is simulated, X(e,t)=xs
will be part of the next estimators [X(e+1,t)]* or [X(e,t+1)]*.
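The class-based draw just described can be sketched as follows. This is a minimal illustration under assumed inputs (the historical pair cloud, the class width, and the nearest-pair fallback are hypothetical choices), not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def draw_conditional(pairs, x_star, width):
    """Draw one value of X(e,t) given [X(e,t)]* = x_star:
    select all historical pairs whose estimator value falls in the
    class of x_star, then draw one X value from them at random."""
    x_vals, est_vals = pairs[:, 0], pairs[:, 1]
    in_class = np.abs(est_vals - x_star) <= width / 2.0
    candidates = x_vals[in_class]
    if candidates.size == 0:          # empty class: fall back to nearest pair
        candidates = x_vals[[np.argmin(np.abs(est_vals - x_star))]]
    return rng.choice(candidates)

# hypothetical historical pairs (X(e,t), [X(e,t)]*) for one station
pairs = np.array([[7.1, 7.0], [7.3, 7.2], [6.8, 6.9], [8.0, 7.9]])
x_s = draw_conditional(pairs, x_star=7.05, width=0.4)
```

Each simulated value would then feed into the estimator of the next station or time step, exactly as the text describes for [X(e+1,t)]* and [X(e,t+1)]*.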
Fig. 1. Bi-plot (X(e,t), [X(e,t)]*): the conditional distribution
F(X(e,t) | x* = [X(e,t)]*) is inferred from the pairs falling in the
class of x*.
With this type of data-driven approach one usually wishes to mimic the real data in
some statistics of fundamental variables, which measure their spatial and time structures.
The proposed algorithm generates a time series in each monitoring station with some
statistics identical to the historical data: means, marginal histograms, and spatial and time
correlations.
Consider, for each monitoring station e, the statistics of the historical data:
m(e) - mean of the time data
Fx(x;e) = Prob{X(e,t) < x} - the marginal cdf at e
Ce(h) = E{X(e,t)X(e,t+h)} - m(e)² - the time covariance at e, and the spatial
covariance C(h) = E{X(e,t)X(e+h,t)} - m(e)·m(e+h).
Ce(h) and C(h) are measures of the spatial and time structure of the variable X(e,t).
If the simulation sequence starts with a limited set of conditioning values with the
marginal distribution Fx(x;e), the bidistribution F(X(e,t) | [X(e,t)]*) will generate a
simulated time series with the same mean, m'(e) = m(e), and same marginal cdf
F'x(x;e) = Fx(x;e).
The simulated values xs(e,t) reproduce the covariances between X(e,t) and [X(e,t)]*:

Cov{X(e,t), [X(e,t)]*} = Σ(u=1..e-1) au C(e-u) + Σ(β=1..t-1) bβ Ce(t-β)    [1]
The simulated values xs(e,t) thus reproduce an average of the time and space
covariances. Consequently, the weights au and bβ must be chosen in such a way that the
individual covariances are represented in the final time series.

To define [X(e,t)]* = Σu au X(u,t) + Σβ bβ X(e,β), one could choose a centered
and minimum-variance estimator, E{X(e,t)} = E{[X(e,t)]*} and min var{X(e,t) - [X(e,t)]*},
which leads to a kriging system written in terms of time and space covariances. For
simplicity's sake, let us use the same notation for the weights and for the spatial and time
locations. The estimator at any location x0 is written:
[X(x0)]* = Σu λu X(xu)

with the ordinary kriging system:

Σβ λβ C(xu,xβ) + μ = C(xu,x0), for all u
Σu λu = 1
The solution vector λ combines two distinct effects:
i) The proximity of the samples to the estimated point x0 determines their influence
in terms of weights (second member of the kriging system).
ii) The declustering effect of the first member of the kriging system usually leads to
underweighting of clustered samples.

Now the problem is that samples in time series are usually clustered. Thus, due
to the declustering effect, this minimum-variance estimator tends to overweight the
influence of the nearest sample and underweight the influence of the others. Consequently,
the average covariance of [1] mainly represents the short-distance structures.
To avoid this drawback, an estimator has been chosen in this study which
accounts only for the proximity effect, i.e., the weights are directly proportional to the
correlation coefficient (ρ) between any sample xu and the estimated point x0:

λu = ρu,0 + (1/N)(1 - Σβ ρβ,0)

Note: this is the solution of the kriging system when there is null correlation between
the conditioning samples xu.
Note: This estimator was chosen for this particular case to avoid the overweighting of the
small-distance structures resulting from the ordinary kriging of clustered string data. The
solution consisted of setting all covariances between samples equal to zero. However,
other corrections of the covariances between samples can be adopted (for example,
Deutsch 1994 suggests a kriging estimator with Journel's redundancy measure
correction of the first member of the equation system, for a similar purpose).
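The proximity-only weights are straightforward to evaluate. The sketch below simply computes λu = ρu,0 + (1/N)(1 − Σβ ρβ,0) for hypothetical correlation values; the constant correction term is what makes the weights sum to one, preserving unbiasedness.

```python
import numpy as np

def proximity_weights(rho):
    """Weights directly proportional to the sample-to-target correlations,
    plus a constant correction so that the weights sum to one (the kriging
    solution when the conditioning samples are mutually uncorrelated)."""
    rho = np.asarray(rho, dtype=float)
    n = rho.size
    return rho + (1.0 - rho.sum()) / n

# hypothetical correlations between four conditioning samples and the target
lam = proximity_weights([0.9, 0.6, 0.3, 0.1])
```

Note that, unlike ordinary kriging weights, nothing here penalizes redundancy between samples; that is precisely the trade-off the text describes for clustered string data.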
The data set used to implement the stochastic simulation consisted of monitored
values of pH (daily analysis in lab) and dissolved oxygen at 4 monitoring stations located
along a river with an effluent discharge of a mine (Fig. 2), during a period of 15 months.
The monitored data are shown in Figs. 3a, 3b and Figs. 4a, 4b, representing the time series,
histograms and time variograms of the 15-month period for pH and dissolved oxygen.
Fig 2 - Layout of the mine site and the 4 monitoring stations along the river.
[Figs. 3a, 3b and 4a, 4b: time series, histograms and time variograms of pH
and dissolved oxygen at the monitoring stations.]
Conditional Simulation
The estimators [X(e,t)]* were calculated with four samples before time t and two
samples spatially located before e:

[X(e,t)]* = a1 X(e-1,t) + a2 X(e-2,t) + b1 X(e,t-1) + b2 X(e,t-2) + b3 X(e,t-3) + b4 X(e,t-4)

Based on the experimental bi-plots {X(e,t), [X(e,t)]*} for each monitoring station, the
simulation procedure was initialised with the real time series of the first station. The
resulting simulated time series of the remaining 3 stations are shown in Figs. 5, 6 and 7a,
7b for the two elements studied. The time correlations of the real data are quite
satisfactorily reproduced in the variograms of the simulated values.
[Figs. 5, 6 and 7a, 7b: simulated time series at the monitoring stations and
time variograms of the simulated values.]
Non-Conditional Simulation
The non-conditional simulation is presented only for pH. The time series of all
monitoring stations were simulated, including the first one corresponding to the mine
effluent, based on experimental bi-histograms {X(e,t), [X(e,t)]*}. The simulated time
series of the four stations are reproduced in Figs. 8a, 8b, as well as the time covariances
of the simulated values.
[Figs. 8a, 8b: non-conditionally simulated time series of pH at the four
monitoring stations and time variograms of the simulated values.]
CONCLUSIONS
REFERENCES
Deutsch, C. and Journel, A., 1992, GSLIB: Geostatistical Software Library and User's
Guide, Oxford University Press, New York.

Deutsch, C., 1994, "Kriging with Strings of Data", Mathematical Geology, Vol. 26,
No. 5, pp. 623-638.

Johnson, M., 1987, Multivariate Statistical Simulation, John Wiley & Sons, New York.

Law, A. and Kelton, D., 1982, Simulation Modeling and Analysis, McGraw-Hill
International Editions, New York.

Ripley, B., 1987, Stochastic Simulation, John Wiley & Sons, New York.
Gary N. Kuhn, Wayne E. Woldt, David D. Jones, Dennis D. Schulte

SOLID WASTE DISPOSAL SITE CHARACTERIZATION USING NON-INTRUSIVE
ELECTROMAGNETIC SURVEY TECHNIQUES AND GEOSTATISTICS
REFERENCE: Kuhn, G. N., Woldt, W. E., Jones, D. D., Schulte, D. D., "Solid Waste
Disposal Site Characterization Using Non-Intrusive Electromagnetic Survey Tech-
niques and Geostatistics," Geostatistics for Environmental and Geotechnical Applica-
tions, ASTM STP 1283, R. M. Srivastava, S. Rouhani, M. V. Cromer, A. I. Johnson, A. J.
Desbarats, Eds., American Society for Testing and Materials, Philadelphia, 1995.
KUHN ET AL. ON SOLID WASTE DISPOSAL 163
INTRODUCTION
Past landfill management and operational practices in the United States have created
environmental problems and are commonly associated with soil and groundwater
contamination. These landfills have either been upgraded to meet current State and
Federal legislation or have closed. As a result, the number of operational landfills has
decreased from over 20,000 in 1978 to approximately 3,300 in 1994. Because of strict
State and Federal legislation passed dealing with closure of these landfills, the severe
impact that past operations had on the environment is becoming more apparent.
For example, in 1987 the State of Nebraska required the Nebraska Department of
Environmental Quality (NDEQ) to conduct a comprehensive assessment of all
community solid waste disposal sites (SWDS). The purpose of this assessment was to
ascertain compliance of SWDS to standards established by the Nebraska Environmental
Protection Act (NEPA) and the Federal Resource Conservation and Recovery Act
Subtitle D (RCRA Subtitle D)(SCS 1991).
In 1991, nearly 210 landfills in Nebraska had ceased operations or were recommended for
closure by NDEQ. This recommendation was based on insufficient capacities or
significant constraints imposed on owners to maintain compliance with RCRA Subtitle D
requirements (SCS 1991). Due to the risk posed to Nebraska's surface and groundwater
by the existence of unlicensed landfills, the NDEQ focused on closure activities.
Information compiled by NDEQ identified sites that were located near public or private
drinking water sources, were underlain by a shallow water table, or were located in a
100-year flood plain.
The survey revealed that a significant number of SWDS warranted further
investigation. Based on this study, the NDEQ directed most of its efforts at the currently
unregulated sites utilized primarily by rural communities (NDEQ 1990). These sites had
not been subject to regulation since 1972, when the Whitney Amendment to the Nebraska
Environmental Protection Act specifically exempted all cities of the second class
(5,000 population or less) and villages from state solid waste rules and regulations
(NDEQ 1990). Although these sites were exempt from state solid waste regulations,
NDEQ revoked this amendment in 1991 when RCRA Subtitle D was reauthorized.
As solid waste disposal sites are forced to close over the next few years, hundreds of
millions of dollars will be spent in the United States to identify, characterize and
remediate sites contaminated with hazardous materials. Traditional site investigation
techniques typically include compiling hydrogeologic and contaminant fate and transport
information from testhole and groundwater monitoring well data. Commonly,
background information is limited until the results of the first round of groundwater
samples is available. Only then does it become apparent that the plume may not be
completely delineated and additional monitoring wells are required.
To address this problem, non-intrusive field methods and geostatistical analysis tools
were utilized to gather preliminary subsurface information pertaining to hydrogeologic
features, horizontal and vertical extents of wastes and suspected leachate plumes. This
information can be utilized to optimize and minimize testhole or permanent monitoring
well locations.
LITERATURE REVIEW
Surface electrical methods have been used successfully in many types of subsurface
investigations. Kelly (1976) showed that the d-c resistivity method can be effective in
delineating a plume moving off-site from a landfill. The use of EM data sources for
delineation of contaminated groundwater has been described by Greenhouse and Slaine
(1986). French et al. (1988) utilized geoelectric surveying to identify anomalous regions
to focus subsequent boring and sampling activities. Hagemeister (1993) identified
potential waste volumes and suspected contaminant migration present at an unregulated
landfill. In each case, differing electrical conductivity was interpreted as an indication of
changes in the systems being investigated.
DATA COLLECTION
A three-dimensional data set, developed by obtaining readings at several sounding depths
across a gridded area, was subjected to geostatistical analyses. This data set was utilized
in conjunction with available background data to identify pertinent subsurface features
and approximate their general locations. The background data included boring logs,
groundwater analytical reports and industrial waste disposal permits. These permits
allowed the disposal of industrial wastes until the late 1970s. The methods utilized
during this study consisted of establishing a sampling grid, completing an
electromagnetic survey, and performing a geostatistical analysis. Each procedure was
directed towards non-intrusive characterization of the subsurface environment. Existing
testhole data was correlated with the predicted locations of pertinent site features for
validation purposes.
Site Description
Based upon information obtained from the NDEQ, the study site was operated as a
"trench and fill" SWDS from 1975 to 1987. During this time, the owner accepted
domestic and industrial waste from nearby rural communities. Information pertaining to
the actual quantities received are not available.
The site and the surrounding areas are located near the easternmost edge of the Nebraska
Sandhills region. The topography of this region is mostly undulating to rolling. The
elevation of the site is approximately 460 to 466 meters above mean sea level (MSL) near
the northeast and southwest corners, respectively (NDEQ 1990). The surface geology
consists of approximately 38 to 42 meters of fine to medium grain sands interbedded with
coarse sand and fine gravel deposits, which is characteristic of this region.
The uppermost monitorable aquifer is located in sand and gravel deposits of the High
Plains aquifer system. The water table is approximately 11 and 15 meters below grade
level (BGL) in the northeast and southwest corners of the site, respectively, and the
saturated thickness of the unconfined aquifer is approximately 27 meters (NDEQ 1990).
Based on regional bedrock maps for this area, it appears that the top of the uppermost
confining unit is the Niobrara Shale formation that underlies the water table aquifer at an
approximate elevation ranging from 422 to 424 meters above MSL near the northeast and
southeast corners of the site, respectively.
Under a Multi-Site Cooperative Agreement with Region VII of the Environmental
Protection Agency (EPA), the NDEQ performed a Preliminary Assessment (PA) at the
site to assess the threat posed by the site to human health and the environment. The
NDEQ concluded that leachate from the site resulted in a leachate contaminant plume
migrating in an east-northeast direction towards a river 2.5 kilometers away. Because of
the low human and livestock population in the area, no evidence was found indicating
that the site posed an immediate threat to human health and the environment (NDEQ
1990).
Interviews with the site owner revealed that the standard operating procedures involved
excavating a 5 meter deep cell with a backhoe, depositing refuse at the toe of the working
face, compacting the refuse and providing 15 centimeters of daily cover material. After
each cell was completely filled, 1 to 1.5 meters of silty clay was placed on top of the
waste as a final cover. Based on this information, MSL elevations were assigned to the
pertinent subsurface features and are presented in Table 1.
166 GEOSTATISTICAL APPLICATIONS
Sampling Grid
Sampling point locations were established based on minimizing data collection efforts,
maintaining minimum measurement support volumes of each instrument, and spatially
defining the study area. The sampling point spacing utilized at the site was
approximately 30 meters in the north and east directions and extended nearly 30 meters
beyond all four property boundaries. The horizontal extent was selected based on the
geology and obtaining an adequate number of sampling points beyond the limits of the
suspected landfill cells to establish background subsurface conductivity levels.
Electromagnetic Survey
The EM receiver coil intercepts a portion of the magnetic field from each loop
generated by the transmitter coil and results in an output voltage which is also linearly
proportional to the terrain conductivity. The resulting reading is in milli-Siemens per
meter (mS/m).
The reading obtained from the EM instruments is a conductivity measurement averaged
over a volume of subsurface media. Because the effective depths of penetration are
small in comparison to the overall horizontal and vertical dimensions of the site, these
readings were interpreted as being representative of a sampling point at the calculated
effective depth.
The effective depth of penetration by the induced current is directly proportional to the
intercoil spacing and depends on the orientation of the instrument. By varying the
intercoil spacing, conductivity measurements can be collected at varying depths. Also,
operating the instrument in the horizontal dipole mode reduces the effective depth of
penetration by approximately one half that of the vertical dipole mode. Therefore, the
instrument was operated in the horizontal and vertical dipole positions at four different
intercoil spacings to obtain readings at eight different depths.
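The relationship described above (effective depth proportional to intercoil spacing, with the horizontal dipole reaching roughly half the vertical-dipole depth) can be sketched in a few lines. The proportionality factors and the spacings below are illustrative assumptions, not the 60%-signal values calculated by Hagemeister (1993):

```python
# Assumed depth/spacing ratios; the horizontal-dipole depth is taken as
# roughly one half the vertical-dipole depth, as described in the text.
FACTORS = {"vertical": 1.5, "horizontal": 0.75}

def effective_depth(intercoil_spacing_m, dipole_mode):
    """Effective depth of penetration, proportional to intercoil spacing."""
    return FACTORS[dipole_mode] * intercoil_spacing_m

# Four hypothetical spacings x two dipole modes -> eight sampling depths.
depths = sorted(effective_depth(s, m)
                for s in (10.0, 20.0, 30.0, 40.0)
                for m in ("horizontal", "vertical"))
```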
KUHN ET AL. ON SOLID WASTE DISPOSAL 167
Theoretically, the total instrument response represents a weighted average of subsurface
conductivities with a depth of infinity, but it does have practical limits. Interpretation, or
modeling, of geophysical data to determine a reasonable unique solution to the nonunique
problem was not performed. Although modeling this data provides a more
comprehensive interpretation of the data set, the preliminary nature of this research did
not warrant the level of effort involved with the modeling process. Instead, Hagemeister
(1993) calculated effective exploration depths of four intercoil spacings for both the
vertical and horizontal dipole modes. These calculations are based on the assumption that
60 percent of the total signal contribution over a volume of subsurface media is
associated with a discernible layer. Based on the small diameter and thickness of the
support volume, in relation to the overall area of the site and the preliminary nature of the
investigation, each instrument reading was assigned to a point located at the centroid of
the calculated effective depth of penetration. Table 2 depicts the exploration depths at
various intercoil spacings.
GEOSTATISTICAL ANALYSIS
Environmental professionals are often confronted with the problem of providing detailed
information about a site based on a minimum number of sampling points. Geostatistics
provides a means of utilizing spatial continuity for estimating the expected value at
unsampled locations. Geostatistics is commonly utilized to describe the spatial continuity
of earth science data and aims at understanding and modeling the spatial variability of the
data.
The geostatistical analytical process for this study consisted of 1) describing and
understanding the statistical distribution of the data, 2) modeling the spatial variability of
the data, 3) estimating expected values at unsampled locations and 4) computing
estimation variance values at the unsampled locations.
Univariate Description
Univariate description deals with organizing, presenting and summarizing data and
provides an effective means of describing the data by identifying outliers and extreme
values. The univariate descriptive tools utilized to analyze the conductivity data set were:
1) histograms, 2) probability plots and 3) summary statistics. Because the data set is
three dimensional, a descriptive scatter plot could not effectively be obtained.
Geo-EAS 1.2.1 Geostatistical Environmental Assessment Software (Englund and Sparks
1988) was utilized to prepare histograms and probability plots for both the observed data
set and logarithmic transformations of the observed data set. Characteristic of many
environmental data sets, the observed data exhibited a large number of low values which
offset the mean of the data distribution to the left of the median. The data were
transformed to prepare lognormal histograms and probability plots to determine if the
data exhibit a lognormal distribution. The tests for lognormality indicated that the data
do not approach this distribution. Therefore, the observed data set was utilized for the
analysis. Figure 1 presents a histogram of the observed conductivity data set.
[Figure 1. Histogram of observed conductivity values (mmhos/m).]
The summary statistics presented in Table 3 numerically describe the location, spread and
shape of the observed data distribution.
Experimental Variogram
A variogram is a plot of the variance, or one-half the mean squared difference, of paired
data points as a function of the distance between the two points (Deutsch and Journel
1992). An omnidirectional variogram can be developed to obtain a general understanding
of the spatial characteristics of the sample data. The omnidirectional variogram does not
take into account spatial continuity changes due to directional changes in the data.
Therefore, directional variograms are developed to identify these changes, if present. To
ensure a more realistic sample variogram, the window for the lag distance did not extend
greater than one-half the length or width of the data set. The lag distance between points
was also restricted such that a minimum of 30 pairs per lag distance were available to
increase the confidence of variogram calculations.
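The lag rules above (lag window no greater than half the extent of the data set, and at least 30 pairs per lag) can be sketched as follows. This is an illustrative omnidirectional computation, not the GSLIB routine used in the study:

```python
import numpy as np

# Sketch of an omnidirectional experimental variogram: half the mean squared
# difference of paired values, binned by separation distance, keeping only
# lags supported by at least 30 pairs, as the text requires.
def experimental_variogram(coords, values, lag_width, max_lag, min_pairs=30):
    coords = np.asarray(coords, float)
    values = np.asarray(values, float)
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    sq = (values[:, None] - values[None, :]) ** 2
    iu = np.triu_indices(len(values), k=1)        # count each pair once
    d, sq = d[iu], sq[iu]
    lags, gammas = [], []
    for lo in np.arange(0.0, max_lag, lag_width):
        mask = (d >= lo) & (d < lo + lag_width)
        if mask.sum() >= min_pairs:               # 30-pair confidence rule
            lags.append(d[mask].mean())
            gammas.append(0.5 * sq[mask].mean())
    return np.array(lags), np.array(gammas)
```

Directional variograms follow the same pattern with an additional angular tolerance on each pair's separation vector.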
GSLIB - Geostatistical Software Library and User's Guide (Deutsch and Journel 1992)
was utilized to develop directional and omnidirectional experimental variograms from the
three dimensional conductivity data set. Generally, two sets of directional variograms
were developed by restricting the paired data points to be either horizontally coplanar or
vertically cocolumnar.
Attempts to identify directions of maximum and minimum continuity within a horizontal
plane revealed the same structure as the omnidirectional variogram in all directions.
Therefore, isotropic conditions were considered for the horizontal plane.
[Figure 2. Experimental and model variograms (variogram value versus separation
distance in meters).]
Model Variogram
Once an acceptable experimental variogram was developed, the model variogram was
constructed. Variogram modeling entailed fitting a mathematical function, using visual
techniques, to the experimental variogram points by varying the model type and the
nugget effect, sill and range values until the model variogram closely resembled the
experimental variogram.
The exponential variogram model was fit to both the horizontal omnidirectional and the
vertical variograms (Figure 2) utilizing the parameters presented in Table 4. Although
the nugget and sill are identical, the range significantly decreases in the vertical direction.
This is characteristic of geometric anisotropy and is commonly encountered in earth
science.
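The geometric anisotropy described above (identical nugget and sill, shorter range vertically) can be folded into a single exponential model by scaling each component of the separation vector by its directional range. The parameter values below are placeholders, not the Table 4 values:

```python
import math

# Sketch of an exponential variogram with geometric anisotropy. The factor
# of -3 follows the common "practical range" convention (the model reaches
# about 95% of the sill at h equal to the range).
def exponential_variogram(hx, hy, hz, nugget, sill, range_h, range_v):
    if hx == hy == hz == 0.0:
        return 0.0                       # gamma(0) = 0 by definition
    # fold the anisotropy into an equivalent isotropic distance
    h_eq = math.sqrt((hx / range_h) ** 2 + (hy / range_h) ** 2
                     + (hz / range_v) ** 2)
    return nugget + (sill - nugget) * (1.0 - math.exp(-3.0 * h_eq))

# Same 50 m separation: gamma is higher vertically because the range is
# shorter there (placeholder nugget/sill/range values).
g_horiz = exponential_variogram(50.0, 0.0, 0.0, 1.0, 30.0, 250.0, 25.0)
g_vert = exponential_variogram(0.0, 0.0, 50.0, 1.0, 30.0, 250.0, 25.0)
```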
Cross Validation
Model variograms were cross validated to compare the sample point values to the
estimated values at those locations. It is important to develop a variogram model that
would minimize the standard deviation of the estimation error as determined by the cross
validation process. A variogram model that produces good results does not necessarily
indicate that the estimation at unknown locations will be accurate. However, good results
from cross validation will suggest, with more confidence, the effectiveness of the selected
model.
Cross validation consists of removing a data point from the data set and calculating an
estimated value utilizing the model variogram. Once the estimate is calculated, a
comparison can be made between the estimated and observed values at each sampling
point by calculating the difference between the two values.
The three summary statistics utilized to evaluate the cross validation results are: 1)
average kriging error (AKE), 2) mean squared error (MSE) and 3) standardized mean
squared error (SMSE) (Woldt 1990). The AKE provides a measure of the degree of bias
introduced by the kriging process and should equal 0 if the estimates are unbiased. The
MSE should be less than the variance of the measured values. The SMSE is a measure of
consistency and is satisfied if the SMSE is within the interval 1.0 ± 2(2/n)^(1/2). The results
are summarized in Table 5 along with their calculated expected values.
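The three statistics and the SMSE consistency interval can be sketched as follows. The error and variance arrays are hypothetical inputs, not the study's Table 5 values:

```python
import numpy as np

# Sketch of the cross-validation summary statistics (after Woldt 1990):
# errors are (estimate - observed) at each deleted point; variances are the
# kriging variances at those points.
def cross_validation_stats(errors, variances):
    errors = np.asarray(errors, float)
    variances = np.asarray(variances, float)
    n = len(errors)
    ake = errors.mean()                        # average kriging error, ideally 0
    mse = (errors ** 2).mean()                 # should be < variance of the data
    smse = ((errors ** 2) / variances).mean()  # standardized MSE, ideally ~1
    # consistency check: SMSE within 1.0 +/- 2(2/n)^(1/2)
    consistent = abs(smse - 1.0) <= 2.0 * (2.0 / n) ** 0.5
    return ake, mse, smse, consistent
```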
As depicted in Table 5, the results meet the recommended AKE and MSE criteria and are
just outside the range of expected SMSE values. These results are generally considered
acceptable.
Ordinary Kriging
Ordinary point kriging was selected to estimate expected values at unsampled locations.
This method was selected because it is a linear unbiased estimator that attempts to
minimize the error variance and generally has the lowest mean absolute error and mean
squared error in comparison to other estimation methods (i.e. polygonal, triangulation,
local sample mean, and inverse distance squared). GSLIB (Deutsch and Journel 1992)
was utilized to calculate the expected values at 8,000 unsampled locations, or nodes, on a
20 x 20 x 20 grid from the three dimensional conductivity data set. The nodes were
spaced at 25 meters in the north and east horizontal directions and 2 meters in the vertical
direction. These spacings were selected based on the anticipated spatial orientation and
depths of the pertinent subsurface features.
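Ordinary point kriging at a single node amounts to solving a small linear system built from the variogram. The sketch below stands in for the GSLIB run; the isotropic exponential variogram parameters are placeholders, not the Table 4 values:

```python
import numpy as np

def gamma(h, nugget=1.0, sill=30.0, rng=250.0):
    """Placeholder isotropic exponential variogram; gamma(0) = 0."""
    return np.where(h == 0.0, 0.0,
                    nugget + (sill - nugget) * (1.0 - np.exp(-3.0 * h / rng)))

def ordinary_krige(sample_xy, sample_vals, node_xy):
    n = len(sample_vals)
    d = np.linalg.norm(sample_xy[:, None, :] - sample_xy[None, :, :], axis=-1)
    # kriging system: variogram matrix bordered by the unbiasedness constraint
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = gamma(d)
    A[n, n] = 0.0
    b = np.ones(n + 1)
    b[:n] = gamma(np.linalg.norm(sample_xy - node_xy, axis=1))
    w = np.linalg.solve(A, b)          # n weights plus Lagrange multiplier
    estimate = w[:n] @ sample_vals     # linear unbiased estimate
    variance = w @ b                   # estimation (kriging) variance
    return estimate, variance
```

Because gamma(0) = 0, the estimator honors the data: kriging at a sampled location returns the sample value with zero estimation variance.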
Search Neighborhood
The search neighborhood was established based on the following criteria: 1) selecting the
greatest distance on the variogram model that closely fit the experimental variogram, 2)
requiring that at least 30 pairs be utilized to calculate the experimental variogram at each
point and 3) limiting the distance to half the length of the horizontal sampling grid diagonal.
The search neighborhood was defined in the horizontal plane at 250 meters. The
geometric anisotropy limited the search to within 25 meters in the vertical direction
which reflects the region of the variogram with the higher level of confidence.
Background Conductivities
Generally, the expected values located west and south of the site and outside the property
boundaries were considered to represent background subsurface conductivities, or
expected values assumed not to be impacted by past site activities. This was established
based on the groundwater flow direction. These background values generally ranged
from less than 0 mS/m to 12 mS/m in the vadose zone and from 12 mS/m to 24 mS/m
below the water table. These ranges of values established the basis of interpretation for
identifying hydrogeologic features, landfill cells and potential leachate migrating from
the cells.
DISCUSSION
The previous sections discussed a methodology that can be utilized to interpret surface
based electrical data in an effort to construct reliable maps of suspected subsurface
features. Based on the selected cross sections of expected conductivity values presented
in Figures 3 and 4 and limited knowledge of the site, interpretations of: 1) site specific
hydrogeology, 2) horizontal and vertical extents of waste and 3) potential sources for
leachate migration were developed.
Station 000 North (Figure 3) depicts a vertical cross section of the expected subsurface
conductivity values for background reference. This station is located
upgradient of the site and was utilized as an indication of subsurface conditions not
impacted by past landfill operations. Based on information obtained from NDEQ records, the
natural subsurface environment adjacent to the boreholes consists of fine to medium sand
with a static water table elevation near 450 meters. Therefore, 0 to 12 and 12 to 24 mS/m
were determined to represent unsaturated and saturated fine to medium sands,
respectively. A variance from these ranges was interpreted as an indication of differing
subsurface structures, or impact from the landfill operation.
Based on the nature of the site operations, landfill cells are expected to be located from
the ground surface down to an approximate depth of 5 meters. The selected vertical cross
sections (Figure 3) depicted expected conductivity values near the surface ranging in excess
of 24 mS/m within the 0 to 5 meter BGL depth range.
Generally, the landfill cells appear to cover the entire site. Based on the vertical cross
section maps (Figure 3), two areas exhibiting conductivity values in excess of 24 mS/m
were elongated in the north and south directions with the centerlines located near stations
150 East and 300 East. Based on a personal interview with the owner, it appears that
these two areas are actually several landfill cells spaced close together.
The primary concern with SWDS is the potential leachate contamination associated with
the nature of the operation. Leachate is a liquid that consists of refuse moisture and all
precipitation that mixes with this moisture as it migrates through the landfill. Leachate
migrating from a landfill naturally due to gravity or forced out as a result of the
consolidation of refuse may transport contaminants from the refuse to the groundwater
environment.
The presence of elevated conductivity values within the vadose zone and directly below the
landfill cells was interpreted as potential leachate or possible instrument interference
from the overlying waste. These conductivity values ranged from 12 mS/m to 24 mS/m
(Figure 3).
[Figure 3. Vertical cross sections of expected conductivity values (mS/m) at Stations 000
North, 100 North, 225 North and 425 North; elevation in meters above MSL versus easting
in meters.]
[Figure 4. Plan view of expected conductivity values at elevation 452 meters (unsaturated
conditions above the water table); northing versus easting in meters.]
Based on the information obtained from this study, it appears that leachate may have
migrated from the landfill and impacted the groundwater table. The plume appears to be
migrating horizontally in the northeast direction with a vertical component. Figure 5
consists of a plan view and cross section illustrating an interpreted leachate plume located
relative to the identified waste.
[Figure 5. Interpreted leachate plume relative to the identified waste (plan view and cross
section); property line shown.]
All data description and estimation efforts requiring computer software support were
completed utilizing an IBM 80386 processor. Software support utilized in this study
consisted of Geo-EAS, GSLIB and TecPlot Version 6.0 (TecPlot).
Probability and histogram plots describing the 1679 observed conductivity values were
prepared utilizing Geo-EAS. Geo-EAS generated on-screen plots within minutes
allowing efforts to be focused on the descriptive analyses.
The expected values are presented on cross section contour maps included as Figures 3
and 4. The three dimensional expected value data set was imported into TecPlot.
TecPlot utilizes linear interpolation to construct each contour line. TecPlot generated
cross section maps based on a three dimensional data set. By fixing one dimension, a
cross section at a desired location was generated within minutes.
CONCLUSIONS
Geostatistical analysis demonstrated that the data are spatially correlated, which allowed for
an interpreted subsurface model to be developed based on kriged estimated values. As an
alternative to traditional intrusive characterization techniques, surface based
electromagnetic surveying techniques proved to be a key non-intrusive, cost-effective
element in the refinement of the second phase of the hydrogeologic investigation.
Review of kriging error maps can further refine this second phase by focusing on the
areas with the largest error. This study demonstrated that this methodology, as a
preliminary field screening tool, can provide sufficient information to optimize the
placement and minimize the number of permanent groundwater monitoring wells.
REFERENCES
Cressie, N.A., 1989, "Geostatistics," American Statistician, Vol. 43, pp. 197-202.
Deutsch, C.V., Journel, A.G., 1992, "GSLIB - Geostatistical Software Library and User's
Guide," Oxford University Press.
Environmental Protection Agency, 1994, "EPA Criteria for Municipal Solid Waste
Landfills," The Bureau of National Affairs, Inc., 40 CFR Part 258.
Englund, E. and Sparks, A., 1988, "Geo-EAS 1.2.1 User's Guide," EPA Report
#600/8-91/008, EPA-EMSL, Las Vegas, Nevada.
French, R.B., Williams, T.R., Foster, A.R., 1988, "Geophysical Surveys at a Superfund
Site, Western Processing, Washington," Symposium on the Application of Geophysics to
Engineering and Environmental Problems, Golden, Colorado, pp. 747-753.
Isaaks E.H., Srivastava R.M., 1989, "An Introduction to Applied Geostatistics," Oxford
University Press, New York.
Journel, A., Huijbregts, C., 1978, "Mining Geostatistics," Academic Press, New York.
NDEQ, February 1990, "Ground Water Quality Investigation of Five Solid Waste
Disposal Sites in Nebraska", Nebraska Department of Environmental Quality.
Woldt, W.E., 1990, "Ground Water Contamination Control: Detection and Remedial
Planning," Ph.D. Dissertation, University of Nebraska - Lincoln.
Geotechnical and Earth Sciences Applications
Craig H. Benson1 and Salwa M. Rashad 2
ENHANCED SUBSURFACE CHARACTERIZATION FOR PREDICTION OF
CONTAMINANT TRANSPORT USING CO-KRIGING
INTRODUCTION
The objective of the project described in this paper was to evaluate how
characterizing the subsurface affects predictions of contaminant transport. Simulations of
1Assoc. Prof., Dept. of Civil & Environ. Eng., Univ. of Wisconsin, Madison, WI, 53706, USA.
2Asst. Scientist, Dept. of Civil & Environ. Eng., Univ. of Wisconsin, Madison, WI, 53706, USA.
SYNTHETIC AQUIFER
Characteristics
A "synthetic aquifer" was used in this study because it can be fully-defined; that
is, the hydraulic properties throughout the aquifer are defined with certainty. In this
particular application, fully-defined means that hydraulic conductivities and soil
classifications can be assigned to every cell in the finite-difference grid used in
simulating flow and transport in the aquifer. Thus, flow and transport simulations
conducted with the "fully-defined" aquifer are representative of its "true" behavior.
Comparisons can then be made between results obtained using the fully-defined case and
cases where the aquifer has been characterized with a limited amount of sub-surface data.
This comparison provides a direct means to evaluate the inherent inaccuracies associated
with estimating subsurface conditions from a limited amount of information.
[Figure 1. Synthetic aquifer: soil classifications (6 clay, 5 clayey silt, 4 silty sand, 3 fine
sand, 2 coarse to medium sand, 1 clean gravel), with constant head upstream and
downstream boundaries and an average hydraulic gradient of 0.01.]
long per side. Groundwater flow was induced by applying an average hydraulic gradient
of 0.01. Constant head boundary conditions were applied at the upstream and
downstream boundaries of the aquifer. No flow boundaries were applied along the
remaining surfaces of the aquifer.
An important feature of the aquifer is that soil types are layered to create
continuous and non-continuous soil lenses. Lenses with high hydraulic conductivity,
such as clean gravel and coarse to medium sand, simulate preferential flow paths that
might not be detected during a subsurface investigation. Low hydraulic conductivity
soils such as clayey silt and clay are layered to create pinches and stagnation points that
may cause the flow of groundwater to slow or even stop. These intricacies of the aquifer
also might not be detected during a subsurface investigation.
A soil classification was assigned to each geologic unit (i.e., the geologic facies)
in the fully-defined synthetic aquifer. The soil classifications used to describe geology of
the aquifer are: (1) clean gravel, (2) coarse to medium sand, (3) fine sand, (4) silty sand,
(5) clayey silt, and (6) clay. These soil classifications are represented numerically using
the integers 1-6. The writers note that the integer ordering of these classifications is
arbitrary. Consequently, results somewhat different than those described herein may
have been obtained had a different categorical scheme been used.
Each cell in a given geologic unit was assigned a single realization from the
distribution of hydraulic conductivity corresponding to the unit. Single realizations were
generated using Monte Carlo simulation via inversion. In addition, no spatial correlation
was assumed to exist within a geologic unit. Thus, the correlation structure inherent in
the aquifer is due primarily to the relative location and size of the geologic units.
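The "Monte Carlo simulation via inversion" step can be sketched as inverse-transform sampling: each cell draws one lnK realization by pushing a uniform deviate through the inverse CDF of its geologic unit's distribution. A normal lnK (lognormal K) and the unit parameters below are assumptions for illustration; the paper does not state them:

```python
import random
from statistics import NormalDist

# Hypothetical (mean, std) of lnK per geologic unit id; values illustrative.
UNIT_LNK_PARAMS = {1: (-2.0, 0.5),    # e.g. clean gravel: high conductivity
                   6: (-16.0, 1.0)}   # e.g. clay: low conductivity

def draw_lnK(unit, rng):
    mu, sigma = UNIT_LNK_PARAMS[unit]
    return NormalDist(mu, sigma).inv_cdf(rng.random())   # the inversion step

rng = random.Random(42)
cells = [1, 1, 6, 6]                  # geologic unit id for each grid cell
lnK_field = [draw_lnK(u, rng) for u in cells]
```

No spatial correlation is imposed within a unit, matching the text: the correlation structure arises only from the geometry of the units themselves.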
[Figure: Published ranges of hydraulic conductivity for each soil classification, compiled
from sources including Bowles (1984), Domenico and Schwartz (1990), Hough (1981),
Kovacs (1969), Lee et al. (1983), McCarthy (1982), Smith (1978) and Whitlow (1983).]
The spatial correlation structure inherent in the soil type and hydraulic
conductivity fields was characterized by computing directional experimental variograms
in three dimensions. A model was then fit to the experimental variograms. A similar
approach was also used to characterize the spatial cross-correlation structure between
hydraulic conductivity and soil type.
Experimental variograms were computed using the program GAM3 from the
GSLIB geostatistical library (Deutsch and Journel 1992). The experimental variograms
were computed by:

γ*lnK(h) = (1/2N(h)) Σi=1..N(h) [lnK(xi + h) − lnK(xi)]²   (1)
BENSON AND RASHAD ON CO-KRIGING 185
γ*S(h) = (1/2N(h)) Σi=1..N(h) [S(xi + h) − S(xi)]²   (2)
In Eqs. 1-2, γ*lnK(h) is the estimated variogram for lnKs separated by the vector h, γ*S(h)
is the estimated variogram for soil classifications (S), N(h) is the number of data pairs
separated approximately by the same vector h, and xi is a generic location in the aquifer.
The cross-variogram between lnK and S is computed as:

γ*lnK,S(h) = (1/2N(h)) Σj=1..N(h) [S(xj + h) − S(xj)] [lnK(xj + h) − lnK(xj)]   (3)
The principal axes for soil classification were identified by computing a series of
experimental variograms each having a different orientation relative to the traditional
Cartesian axes. The analysis showed that mild anisotropy exists in the X- Y plane, with
the principal axis oriented 45° counterclockwise from the X-axis. For the vertical
direction, the principal axis coincided with the vertical (Z) axis (Benson and Rashad
1994).
The principal axes for the hydraulic conductivity field were assumed to
correspond to the principal axes for soil type because the hydraulic conductivity field was
generated directly from the soil type field. A similar assumption was made regarding the
cross-variogram (InK-soil type).
A spherical model with no nugget was found to best represent the experimental
variograms. The spherical variogram is described by (Isaaks and Srivastava 1989):

γ(h) = C [1.5(h/a) − 0.5(h/a)³]   if h ≤ a   (4a)
γ(h) = C                          if h > a   (4b)
where C is the sill and a is the range. Table 2 provides a summary of C and a for each
variogram.
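Eq. 4 translates directly into code. This is a minimal sketch; the sill and range passed in would come from Table 2, which is not reproduced here:

```python
def spherical(h, C, a):
    """Spherical variogram of Eq. 4 with no nugget: C = sill, a = range."""
    if h >= a:
        return C                          # Eq. 4b: gamma = C beyond the range
    r = h / a
    return C * (1.5 * r - 0.5 * r ** 3)   # Eq. 4a: within the range

# e.g. spherical(0, C, a) == 0 and spherical(a, C, a) == C for any C, a
```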
The directional experimental variograms exhibit a mixture of geometric and zonal
anisotropies. Geometric anisotropy is characterized by directional variograms that have
approximately the same sill, but different range. In contrast, zonal anisotropy
corresponds to changes in the sill with direction, while the range remains nearly constant
(Isaaks and Srivastava 1989). The X'-Z' anisotropy is primarily geometric, whereas the
X'- Y' and Z'- Y' anisotropies are primarily zonal.
[Figure: (a) experimental variograms for soil classification, (b) experimental variograms
for lnK, and (c) cross-semivariograms for soil classification and lnK, plotted as
semivariance versus separation distance h (cm), where X', Y' and Z' are components of h
along the principal axes.]
A three-level nested structure was used to combine the geometric and zonal
anisotropies into a single variogram model. The model has the form:

γj(h) = w1,j γ1,j(h1) + w2,j γ2,j(hZ') + w3,j γ3,j(hY')   (5)

The model γ1,j is a spherical function (Eq. 4) having C = 1; it provides the basis for the
nested structure. The subscript j denotes the variable being described (S, lnK, or S-lnK
for the cross-variogram). The separation distance h1 is an "equivalent" distance:

h1 = [(hX'/aX')² + (hY'/aY')² + (hZ'/aZ')²]^(1/2)   (6)

The weight w1,S corresponds to the smallest sill for soil classification (X' axis: sill = w1,S
= 1.5 for S). The models γ2 and γ3 are also spherical models and they are used to ensure
that the sills corresponding to the Z' and Y' axes are preserved.
A summary of the weights used for the soil classifications, hydraulic conductivity, and
cross-variogram models is contained in Table 3.
It is important to note that data for hydraulic conductivity and soil classification
were available for each cell in the aquifer when computing the cross-variogram. In more
realistic cases, both types of data will probably not be available at each location.
Problems associated with this disparity can be resolved by using the pseudo-cross-
variogram (Myers 1991).
Table 3. Weights used in the nested variogram models.

    Variable                         w1       w2       w3
    Soil classification, S           1.5      0.25     0.5
    Hydraulic conductivity, lnK      25.0     4.0      3.0
    Cross-variogram, lnK vs. S      -5.75    -0.75    -1.2
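Putting Eqs. 5-6 and the Table 3 weights together gives the sketch below. The ranges aX', aY', aZ' are placeholder assumptions, and the exact arguments of the second and third structures are inferred from the text (they restore the Z' and Y' sills), so treat this as illustrative only:

```python
import math

# Table 3 weights (w1, w2, w3) per variable.
W = {"S": (1.5, 0.25, 0.5),
     "lnK": (25.0, 4.0, 3.0),
     "S-lnK": (-5.75, -0.75, -1.2)}

def sph(h, a):
    """Unit-sill spherical structure (Eq. 4 with C = 1)."""
    r = min(abs(h) / a, 1.0)
    return 1.5 * r - 0.5 * r ** 3

def nested_variogram(hx, hy, hz, var, ax=1000.0, ay=2000.0, az=200.0):
    w1, w2, w3 = W[var]
    # Eq. 6: "equivalent" distance folding the directional ranges together
    h1 = math.sqrt((hx / ax) ** 2 + (hy / ay) ** 2 + (hz / az) ** 2)
    return w1 * sph(h1, 1.0) + w2 * sph(hz, az) + w3 * sph(hy, ay)
```

At large separations all three structures reach their sills, so the total sill is w1 + w2 + w3 for each variable.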
MODFLOW
For each field of hydraulic conductivity that was simulated, MODFLOW was
used to compute the total heads at each node and the total flow rate (Q) emanating from
cells at the downstream end. The hydraulic head field is used by the advective transport
model for simulating contaminant transport.
PATH3D
The program PATH3D (Zheng 1988) was used to simulate advective contaminant
transport. PATH3D is a general particle-tracking program for calculating groundwater
paths and travel times in steady-state or transient, two- or three-dimensional flow fields.
A detailed description of PATH3D can be found in Zheng (1988). Changes in PATH3D
were required before it could be used in this study. These changes included modifying
the algorithm for time step adjustment and modifying the post-processor to describe
characteristics of the plume. Details of these changes can be found in Benson and
Rashad (1994).
Ground water flow and contaminant transport simulations were conducted for
three different conditions: (1) fully-defined aquifer, (2) partially defined aquifer using
only hydraulic conductivity data, and (3) partially-defined aquifer using hydraulic
conductivity and soil classification data. In the partially-defined cases, a co-kriging
program (based on the program COKB3D, Deutsch and Journel 1992) was used to
estimate hydraulic conductivity for each finite-difference cell in the aquifer using a linear
co-regionalization model for the variograms (see previous sections).
In this application, co-kriging was only used to estimate the primary variable,
hydraulic conductivity. In addition, point kriging was used instead of block kriging
because it was more easily implemented and the cells used to discretize the aquifer were
small. Nevertheless, a small error was introduced by point kriging.
The variogram models previously discussed were used to describe the spatial
correlation structure. Input for the co-kriging program included subsurface information
consisting of profiles of hydraulic conductivity or soil classifications. A description of
the co-kriging implementation can be found in Benson and Rashad (1994).
A comparison was made between the estimated hydraulic conductivity field and the
fully-defined hydraulic conductivity field along a transect (single row of cells; X = 50 cm,
Z = 50 cm, Y = 0 to 2500 cm) through the aquifer. In one case, the hydraulic conductivity
field was estimated with
kriging using only hydraulic conductivity data; co-kriging using hydraulic conductivity
and soil classification data was used for the other case.
The hydraulic conductivity fields estimated using kriging and co-kriging are
shown in Fig. 5. Hydraulic conductivities measured along three vertical profiles were
used as input when only kriging was conducted. The resulting estimated lnK field is a
smooth, nearly linear interpolation between the profiles at which measurements were
made. The estimated lnK field is very different from the "true" hydraulic conductivity
field obtained from the fully-defined synthetic aquifer. That is, the irregular spatial
variations in lnK are not preserved.
[Figure 5. Estimated lnK along the transect for kriging and co-kriging, compared with the
"true" lnK profile.]
The addition of the secondary variable greatly improves the estimated hydraulic conductivity
field. The estimated lnKs along the transect more closely resemble the "true" lnKs.
However, even with co-kriging, the estimated field is smoother than the "true" field.
A consistent method was needed to select locations for additional profiles. The
writers chose to select subsequent profiles at locations where the co-kriging variance is
largest. These locations have the greatest uncertainty in the estimated hydraulic
conductivity. The writers note, however, that locations where the co-kriging variance is
largest are not necessarily the critical locations where uncertainty in hydraulic
conductivity has the greatest impact on contaminant transport. However, these critical
locations cannot be identified a priori, because under normal circumstances the detailed
characteristics of the aquifer are unknown.
Precision of the Hydraulic Conductivity Field -- In this section, two statistics are
used to characterize the precision of the estimated field: the maximum co-kriging
variance and the mean co-kriging variance. The mean co-kriging variance (σ̄²_ck)
quantifies the global estimation error and is computed as:

    σ̄²_ck = (1/N_c) Σ_{i=1}^{N_c} σ²_ck,i                                  (9)

where σ²_ck,i is the co-kriging variance at the ith cell in the finite-difference grid and N_c is
the total number of grid points (N_c = 12,500). In each case, σ²_ck,i is the variance of the
primary variable (hydraulic conductivity) being estimated, as described in Isaaks and
Srivastava (1989, p. 404).
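Eq. 9 is a simple average of the per-cell estimation variances; a minimal sketch of the computation (function and variable names are illustrative, not from the paper):

```python
import numpy as np

def mean_cokriging_variance(cell_variances):
    """Eq. 9: average the per-cell co-kriging variances over the
    N_c cells of the finite-difference grid."""
    v = np.asarray(cell_variances, dtype=float)
    return float(v.sum() / v.size)

# Example with a tiny 2 x 2 grid of per-cell variances:
print(mean_cokriging_variance([[1.0, 2.0], [3.0, 4.0]]))  # 2.5
```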
The mean co-kriging variance is shown in Fig. 7 as a function of the number of soil
classification profiles. In this case, the error is significantly larger when five hydraulic
conductivity profiles are used instead of ten profiles.
BENSON AND RASHAD ON CO-KRIGING 191
FIG. 7 - Mean co-kriging variance vs. number of soil classification profiles (N_K = 5 and N_K = 10).
Figure 7 also shows that exploration schemes employing more soil classification
profiles with fewer hydraulic conductivity profiles can be as effective in reducing
uncertainty as schemes that simply use more hydraulic conductivity profiles. For
example, the scheme consisting of five hydraulic conductivity profiles and five soil
classification profiles has a mean co-kriging variance similar to that of the scheme using
ten hydraulic conductivity profiles and no soil classification profiles. Furthermore, the
scheme using more soil classification profiles and fewer hydraulic conductivity profiles is
likely to be less expensive. Thus, a similar reduction in uncertainty can be obtained at
less cost.
Total Flow
One means to evaluate how well the aquifer is characterized is to compare the
total flow rate across the compliance surface for the fully-defined condition with the total
flow rate when the aquifer is characterized using a limited amount of subsurface data.
For the synthetic aquifer, the compliance surface was defined as the downstream
boundary (Fig. 1). If the flow rates are not nearly equal, then the aquifer is not
adequately characterized. If the flow rate is too high, low conductivity regions blocking
flow have been missed. In contrast, a flow rate that is too low is indicative of missing
preferential pathways (Fogg 1986).
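The comparison described above amounts to summing per-cell Darcy fluxes over the downstream face and comparing the estimated total against the "true" total. A minimal sketch (names and the example values are illustrative assumptions, not data from the paper):

```python
import numpy as np

def total_flow_across_boundary(K_face, dh_dx_face, face_area):
    """Total flow across a compliance surface as the sum of per-cell
    Darcy fluxes: Q = sum_i K_i * (dh/dx)_i * A_i."""
    K = np.asarray(K_face, dtype=float)
    grad = np.asarray(dh_dx_face, dtype=float)
    A = np.asarray(face_area, dtype=float)
    return float(np.sum(K * grad * A))

# Compare characterizations by the ratio of estimated to "true" flow;
# values well below 1 suggest missed preferential pathways:
Q_true = total_flow_across_boundary([2.0, 0.5], [0.01, 0.01], [100.0, 100.0])
Q_est = total_flow_across_boundary([1.0, 1.0], [0.01, 0.01], [100.0, 100.0])
print(Q_est / Q_true)  # 0.8
```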
Figure 8 shows total flow rate when the aquifer is characterized with 5 or 10
profiles of hydraulic conductivity and a variable number of soil classification profiles.
When no profiles of soil classifications are used (kriging only), the flow rate is one-third
to one-half of the true total flow rate. Apparently, the sampling program has inadequately
defined the preferential pathways controlling true total flow. However, when more soil
classification profiles are added, the flow rate begins to rise and then becomes equal (i.e.,
> 10 profiles) to the flow rate for the fully-defined condition.
Two other characteristics of Fig. 8 are notable. First, similar flow rates were
obtained when five or ten profiles of hydraulic conductivity (but no soil classifications)
were used to characterize the aquifer. Apparently, neither set of measurements is of
sufficient extent to capture the key features controlling flow. Second, the aquifer was
better characterized (in terms of total flow rate) using five hydraulic conductivity profiles
and 15 soil classification profiles then 10 hydraulic conductivity profiles and 15 soil
classification profiles. This indicates that focusing on collecting a greater quantity of
index measurements (i.e., soil classifications) may be more useful in characterization than
collecting a fewer number of more precise measurements (i.e., hydraulic conductivities).
In this case, hydraulic conductivity inferred from a soil classification had a precision of
two to three orders of magnitude, whereas the hydraulic conductivity "measurements"
were exact. Thus, in this case, simply defining the existence of critical flow paths
apparently is more important than precisely defining their hydraulic conductivity.
FIG. 8 - Total flow rate vs. number of soil classification profiles.
At early times, the trajectory of the centroid does not depend greatly on the
exploration scheme. However, as the plume evolves, different trajectories of the centroid
are obtained. In particular, the plume moves more slowly in the down-gradient (X)
direction when the aquifer is characterized with a limited amount of subsurface data.
Apparently, the preferential pathways controlling down-gradient movement were
inadequately characterized.
Addition of soil classification data did not consistently improve the trajectory in
the X-direction. Adding five soil classification profiles improved the trajectory
significantly, but the worst trajectory was obtained when 22 soil classification profiles
were used. Adding even more soil classification profiles (NS = 32 or 125) improved the
trajectory only slightly. This is particularly discouraging because 125 soil classification
profiles corresponds to sampling 25% of the entire aquifer.
FIG. 9 - Centroid of plume: (a) X-coordinate, (b) Y-coordinate, and (c) Z-coordinate.
In the case where 125 profiles were used, the characterization did result in a subsurface
where the plume moved upward and to the rear of the aquifer. Unfortunately, the degree
of plume movement obtained when 125 soil classification profiles were used is still too
small to simulate the fully-defined condition.
For brevity, graphs of trajectory of the centroid are not shown for the exploration
schemes where 10 hydraulic conductivity profiles were used. These graphs can be found
in Benson and Rashad (1994).
Some general features of Fig. 10 are noteworthy. First, the variance is
larger in the X and Z-directions. The Z-variance is large because the particles are
uniformly distributed along a vertical profile (i.e., Z-direction) at the onset of the flow
and transport simulation. A large X-variance occurs because down-gradient movement
of the plume occurs in the X-direction and thus the X-variance corresponds to
longitudinal spreading of the plume. Accordingly, the Y-variance is much smaller
because it corresponds to lateral spreading orthogonal to the average hydraulic gradient,
which is generally smaller than spreading in the longitudinal direction.
The X-variance increases with time. At short times, the variance is small because
little spreading of the plume has occurred. However, as the plume moves down-gradient,
the variance increases as more spreading occurs. Furthermore, the ability to capture the
true amount of spreading depends on the amount of subsurface information used in
characterization (Fig. 10). When less information is used (e.g., kriging only, NK=5,
NS=0), the variance is smallest, and when more information is used (i.e., by adding soil
classification profiles) the variance increases. This is expected, because a smoother
subsurface containing fewer heterogeneities exists when less data are used in
characterization. However, adding more soil classification profiles does not consistently
improve the X-variance. In fact, the X-variance for NS=5 is closer to the X-variance in
the fully-defined case than the schemes having NS=22, 32, and 125.
Similar behavior was noted for the Y-variance. However, adding more soil
classification profiles had a more consistent effect on the Z-variance. Adding more soil
classification profiles consistently resulted in a Z-variance that was closer to the Z-
variance in the fully-defined case.
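The centroid and per-axis variance statistics discussed above can be computed directly from particle positions in the transport simulation; a minimal sketch (names are illustrative):

```python
import numpy as np

def plume_moments(xyz):
    """Centroid and per-axis spatial variance of a particle plume.
    xyz: (n_particles, 3) array of particle positions (X, Y, Z)."""
    xyz = np.asarray(xyz, dtype=float)
    centroid = xyz.mean(axis=0)       # bulk movement of the plume
    variance = xyz.var(axis=0)        # spreading about the centroid
    return centroid, variance

pts = np.array([[0.0, 0.0, 0.0], [2.0, 0.0, 4.0]])
c, v = plume_moments(pts)
print(c)  # [1. 0. 2.]
print(v)  # [1. 0. 4.]
```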
196 GEOSTATISTICAL APPLICATIONS
FIG. 10 - Variance of plume: (a) X-variance, (b) Y-variance, and (c) Z-variance.
When ten hydraulic conductivity profiles were used for characterization, the X-
variance in the estimated aquifers was similar to the X-variance in the fully-defined case
regardless of the number of soil classification profiles used in characterization (Benson
and Rashad 1994). Apparently, the ten hydraulic conductivity profiles used for
characterization resulted in a sufficiently heterogeneous subsurface such that spreading in
the down-gradient direction was preserved.
In contrast, spreading in the Y and Z-directions for the fully-defined case was
distinctly different from spreading that occurred in these directions when ten hydraulic
conductivity profiles were used for characterization (Benson and Rashad 1994). Adding
five soil classification profiles resulted in a Y-variance that was closer to the Y-variance
in the fully-defined case. However, as even more soil classification profiles were added
(NS=22, 32, 125), the Y-variance became much different from that observed in the fully-
defined case. Apparently, the heterogeneities causing spreading in the Y-direction were
inadequately represented when the subsurface was characterized with additional soil
classification profiles.
Kriging with only the ten hydraulic conductivity profiles resulted in a Z-variance that
differed greatly from the Z-variance in the fully-defined case. For larger times, the Z-
variance was much smaller than the Z-variance for the fully-defined condition. However,
when soil classifications were added, the Z-variance more closely resembled the Z-
variance for the fully-defined case. Thus, using soil classifications apparently resulted in
heterogeneities that were similar to those controlling spreading in the Z-direction in the
fully-defined aquifer.
Results of the flow and transport simulations show that soil classifications can be
used to augment or replace more costly hydraulic conductivity measurements while
maintaining similar accuracy in terms of total flow through the aquifer. However, the
geologic details that govern transport through the synthetic aquifer apparently were never
sufficiently characterized. Bulk movement of the plume (i.e., the centroid) and spreading
(i.e., variance) of the plume were never simulated accurately, regardless of the amount of
subsurface data (hard or soft) that were used for characterization.
ACKNOWLEDGMENT
The study described in this paper was sponsored by the U.S. Dept. of Energy
(DOE), Environmental Restoration and Waste Management Young Faculty Award
Program. This program is administered by Oak Ridge Associated Universities (ORAU).
Neither DOE nor ORAU has reviewed this paper, and no endorsement should be implied.
REFERENCES
Bowles, J. (1984), Physical and Geotechnical Properties of Soils, 2nd Edition, McGraw-
Hill, New York.
Deutsch, C. and A. Journel (1992), GSLIB: Geostatistical Software Library and User's
Guide, Oxford University Press, New York.
Hough, B. (1969), Basic Soils Engineering, 2nd Edition, Ronald Press Co., New York.
Isaaks, E. and R. Srivastava (1989), Applied Geostatistics, Oxford Univ. Press, New
York.
Istok, J., Smyth, J., and Flint, A. (1993), "Multivariate Geostatistical Analysis of Ground-
Water Contamination: A Case History," Ground Water, 31(3), 63-73.
Lee, I., White, W., and Ingles, O. (1983), Geotechnical Engineering, Pitman Co., Boston.
Means, R. and J. Parcher (1963), Physical Properties of Soils, Merrill Books, Columbus.
Myers, D. (1991), "Pseudo-Cross Variograms, Positive Definiteness and Co-Kriging,"
Mathematical Geology, 23, 805-816.
Mickelson, D. (1986), "Glacial and Related Deposits of Langlade County, Wisconsin,"
Information Circular 52, Wisc. Geologic and Natural History Survey, Madison, WI.
Mitchell, J. (1976), Fundamentals of Soil Behavior, John Wiley and Sons, New York.
Simpkins, W., McCartney, M., and D. Mickelson (1987), "Pleistocene Geology of Forest
County, Wisconsin," Information Circular 61, Wisconsin Geologic and Natural
History Survey, Madison, WI.
Smith, G. (1978), Elements of Soil Mechanics for Civil and Mining Engineers, 4th Ed.,
Granada Publishing, London.
Seo, D-J, Krajewski, W., and Bowles, D. (1990a), "Stochastic Interpolation of Rainfall
Data from Rain Gages and Radar Using Co-Kriging," Water Resources Research,
26(3), 469-477.
Seo, D-J, Krajewski, W., Azimi-Zonooz, A., and Bowles, D. (1990b), "Stochastic
Interpolation of Rainfall Data from Rain Gages and Radar Using Co-Kriging. Results,"
Water Resources Research, 26(5), 915-924.
Sowers, G. and G. Sowers (1970), Introductory Soil Mechanics and Foundations, 3rd
Ed., Macmillan Co., New York.
Zheng, C. (1988), "PATH3D, A Groundwater Path and Travel Time Simulator, User's
Manual," S. S. Papadopulos and Associates, Inc., Rockville, MD.
Stanley M. Miller 1 and Anja J. Kannengieser 2
MILLER AND KANNENGIESER ON CONDUCTIVITY 201
TENSION INFILTROMETER
White 1988; Clothier and Smettem 1990; Ankeny et al. 1991; Reynolds and
Elrick 1991). Field-capable devices for such work are characterized as
"tension infiltrometers" or "disk permeameters." They allow direct
measurement of in-situ infiltration (flow rate) as a function of tension,
which leads to estimation of the in-situ Ku value.
The tension infiltrometer used in this study was manufactured by
Soil Measurement Systems of Tucson, Arizona. It has a 20-cm diameter
infiltration head that supplies water to the soil under tension from a
Mariotte tube arrangement with a 5-cm diameter water tower and a 3.8-cm
diameter bubbling tower (Fig. 1). Three air-entry tubes in the stopper
on top of the bubbling tower are used to set the operating tension. All
major parts are constructed of polycarbonate plastic, with a very fine
nylon mesh fabric covering the infiltration head. Pressure transducers
installed at the top and bottom of the water tower are used to measure
accurately the infiltration rate. Output from the transducers is fed
electronically to a field datalogger for real-time data acquisition and
storage. Procedures for field setup and use of the instrument are given
in the SMS User Manual (1992).
Using the measured flow rates, Q (cm³/hr), from the field tests,
values of Ku can be obtained using formulae given by Ankeny et al.
(1988), Ankeny et al. (1991), and Reynolds and Elrick (1991). The first
step is to calculate the pore-size distribution parameter, a, for a
pair of tension settings:
    α = ln(Q₁/Q₂) / (h₁ - h₂)                                  (2)

where Q₁ and Q₂ are the steady flow rates measured at the two tension settings h₁ and h₂.
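The references cited above develop these formulae from Wooding's solution for a circular source together with Gardner's exponential model, K(h) = Ks·exp(αh). Under that assumption, a minimal sketch of the two-step calculation (function names are illustrative, not from the SMS manual):

```python
import math

def pore_size_parameter(q1, q2, h1, h2):
    """Alpha from steady flows q1, q2 at tension settings h1, h2,
    assuming Q is proportional to exp(alpha * h)."""
    return math.log(q1 / q2) / (h1 - h2)

def unsat_conductivity(q, r, alpha):
    """K(h) at the tension where q was measured, from Wooding's
    solution for a disc of radius r:
        Q = pi * r**2 * K(h) * (1 + 4 / (pi * r * alpha))."""
    return q / (math.pi * r ** 2 * (1.0 + 4.0 / (math.pi * r * alpha)))
```

Given flows at a pair of tensions, `pore_size_parameter` recovers α, which is then used in `unsat_conductivity` to back out Ku at each tension.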
FIG. 1--Schematic of the tension infiltrometer: bubble tower with three-hole stopper, water tower with pressure transducers, air tube, shut-off valve, nylon screen, and 20-cm infiltration disc (not to scale).
    v_B = Σ_{i=1}^{n} a_i x(u_i),  where the a_i's are the kriging weights.    (4)

    Σ_{j=1}^{n} a_j C_ij + λ = C_Bi,   i = 1, ..., n                           (5a)

    Σ_{j=1}^{n} a_j = 1                                                        (5b)
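Eqs. 5a-5b form a linear system in the n weights plus the Lagrange multiplier λ. A minimal sketch of its solution, assuming the covariances are already computed (names are illustrative):

```python
import numpy as np

def ordinary_kriging_weights(C, c_B):
    """Solve Eq. 5: sum_j a_j * C_ij + lam = C_Bi, with sum_j a_j = 1.
    C: (n, n) covariances among the data; c_B: (n,) data-to-target covariances."""
    n = len(c_B)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = C            # data-to-data covariance block
    A[n, n] = 0.0            # no multiplier in the unbiasedness row
    b = np.append(c_B, 1.0)  # right-hand side: c_B plus the constraint
    sol = np.linalg.solve(A, b)
    return sol[:n], sol[n]   # weights a_i, Lagrange multiplier lam

# Two symmetrically placed, equally informative data points get equal weights:
C = np.array([[1.0, 0.3], [0.3, 1.0]])
c_B = np.array([0.5, 0.5])
a, lam = ordinary_kriging_weights(C, c_B)
print(a)  # [0.5 0.5]
```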
CASE STUDY
The site selected for the case study was a portion of a heap
leaching pad at a base-metal mine in the Western U.S. Material at the
site consisted of blasted ore, with particle sizes ranging from several
microns up to several tens of millimeters. Although not a typical soil
in agricultural terms (i.e., one possessing the organic materials
necessary to support plant life), this material would be classified by engineers as a
coarse gravel with some sand and fines. This type of coarse material
would provide a rigorous test for the SMS tension infiltrometer, which
was designed to be used primarily for agricultural-type soils.
Data Analysis
(Figure: posting of the measured data values at the study site.)

FIG. 3--Sample semivariograms vs. lag distance (m): (a) and (b).
            n       mean    s.d.    var.      min.    max.
            20     183.0    35.9   1291.0    124.0   238.0
           875     183.2    11.8    140.1    127.2   237.8
           875     182.3    13.1    170.8    139.6   215.7
FIG. 5--Estimated cross semivariogram between Ku(-5) and PF2 with
fitted spherical model: γ(h) = 8.0 + 42.5 Sph12(h).
The isotropic semivariogram for the simulated values was similar to that
shown in Fig. 3a.
     Sequential Gaussian and Markov-Bayes simulations of Ku(-5) were
conducted on a 1-m grid using software from GSLIB (Deutsch and Journel
1992), based on the 20 known data values and on the semivariogram model
of Fig. 3a. The Markov-Bayes procedure also uses secondary information
(PF2 in this case), but when both the primary and secondary attributes
have data values at the same locations, the primary information is given
precedence over the secondary (Zhu 1991; Miller and Luark 1993). Thus,
results of the two different simulation methods for four trials (simu-
lation passes, or iterations) were quite similar, showing average mean-
square-errors on the order of 2,300 (cm/day)². These errors were
calculated as squared differences between the 875 simulated values and
the 875 values of the ground-truth image. It was not surprising to
observe that the largest of these errors occurred in the most sparsely
sampled areas of the study site, where uncertainties are greatest for
the simulated annealing approach and the other simulation methods.
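The error criterion above is a direct grid-to-grid average of squared differences; a minimal sketch (names are illustrative):

```python
import numpy as np

def mean_square_error(simulated, truth):
    """Mean squared difference between simulated values and the
    ground-truth image at the same grid nodes (units: (cm/day)^2)."""
    s = np.asarray(simulated, dtype=float)
    t = np.asarray(truth, dtype=float)
    return float(np.mean((s - t) ** 2))

print(mean_square_error([180.0, 200.0], [170.0, 210.0]))  # 100.0
```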
Advantages of a bivariate simulation method, such as the Markov-
Bayes (MB) procedure, become apparent when the primary attribute is
undersampled relative to the secondary attribute. To illustrate this,
we selected several subsets (reduced data sets) containing 10 of the
original 20 Ku sampling sites. The goal became one of simulating 875
Ku(-5) values, given 20 PF2 sites and 10 Ku(-5) sites. These results
then could be compared to those based on sequential Gaussian (SG)
simulation using only the 10 Ku(-5) data. Basic statistical information
for the three different subsets, A, B, and C, is presented in Table 2.
           Original    Set A     Set B     Set C
   n          20         10        10        10
   mean     183.0      177.4     194.4     176.5
   s.d.      35.9       35.6      30.3      40.1
   var.    1291.0     1264.0     918.6    1610.0
   min.     124.0      123.7     138.4     123.7
   max.     238.0      237.8     232.8     232.8
Note that Set B has smaller variance than the original data set, and
that Set C has a larger variance than the original data set.
In terms of overall mean squared error, SG simulation outperformed
MB simulation for Subsets A and B where the sample variance was
relatively small. However, when the subset had a larger variance, MB
simulation was the better procedure according to this criterion. The
beneficial influence of secondary data as used in the MB simulation
method is shown clearly by the sample means and standard deviations of
simulated sets based on the three different subsets. Note especially
the more consistent results for Subsets B and C shown by the MB
procedure compared to the more inconsistent results of the SG procedure.
Examples of MB simulation results for these two subsets are presented as
shaded contour maps in Fig. 9.
CONCLUSIONS
FIG. 9--Shaded contour maps of MB simulation results for the reduced data sets: (a) and (b).
ACKNOWLEDGEMENTS
REFERENCES
Ankeny, M.D., M. Ahmed, T.C. Kaspar, and R. Horton, 1991, "Simple Field
Method for Determining Unsaturated Hydraulic Conductivity," Soil Sci.
Soc. of America Jour., Vol. 55, No. 2, p. 467-470.
Ankeny, M.D., T.C. Kaspar, and R. Horton, 1988, "Design for Automated
Tension Infiltrometer," Soil Sci. Soc. of America Jour., Vol. 52,
p. 893-896.
Isaaks, E.H., 1984, "Risk Qualified Mappings for Hazardous Waste Sites:
A Case Study in Distribution-Free Geostatistics," M.S. Thesis, Stanford
Univ., Stanford, CA, 111 p.
Klute, A., 1986, "Methods of Soil Analysis, Part 1," Amer. Soc. of
Agronomy. Monograph 9.
ABSTRACT: The conceptual design of the proposed Yucca Mountain nuclear waste
repository facility includes shafts and ramps as access to the repository horizon, located
200 to 400 m below ground surface. Geostatistical simulation techniques are being
employed to produce numerical models of selected material properties (rock
characteristics) in their proper spatial positions. These numerical models will be used to
evaluate behavior of various engineered features, the effects of construction and operating
practices, and the waste-isolation performance of the overall repository system. The
work presented here represents the first attempt to evaluate the spatial character of the
rock strength index known as rock quality designation (RQD). Although it is likely that
RQD reflects an intrinsic component of the rock matrix, this component becomes difficult
to resolve given the frequency and orientation of data made available from vertical core
records. The constraints of the two-dimensional study along the axis of an exploratory
drift allow bounds to be placed upon the resulting interpretations, while the use of an
indicator transformation allows focus to be placed on specific details that may be of
interest to design engineers. The analytical process and subsequent development of
material property models are anticipated to become one of the principal means of
summarizing, integrating, and reconciling the diverse suite of earth-science data acquired
through site characterization and of recasting the data in formats specifically designed for
use in further modeling of various physical processes.
1 Principal Investigator, Sandia National Laboratories/Spectra Research Institute, MS 1324, P.O. Box 5800,
Albuquerque, NM 87185-1342
2 Principal Investigator and Senior Member Technical Staff, Sandia National Laboratories, MS 1324, P.O.
Box 5800, Albuquerque, NM 87185-1342
3 Principal Investigator, Sandia National Laboratories/Spectra Research Institute, MS 1324, P.O. Box 5800,
Albuquerque, NM 87185-1342
CROMER ET AL. ON ROCK QUALITY DESIGNATION 219
INTRODUCTION
The Yucca Mountain site consists of a gently eastward-dipping sequence of volcanic tuffs
(principally welded ash flows with intercalated nonwelded and reworked units). Various
types of alteration phenomena, including devitrification, zeolitization, and the formation
of clays, appear as superimposed upon the primary lithologies. The units are variably
fractured and faulted. This faulting has complicated characterization efforts by offsetting
the various units, locally juxtaposing markedly different lithologies. Most design interest
is focused on the Topopah Spring Member and immediately adjacent units. By
comparison, the waste-isolation performance of the repository system must be evaluated
within a larger geographic region termed the "controlled area" (Figure 1).
The region evaluated by this study is contained entirely within the controlled area. In
general, this study is further restricted to the location of the subsurface access drift known
as the North Ramp, in keeping with a general engineering orientation. This two-
dimensional study represents the first attempt to identify local uncertainty in the rock
structural index known as Rock Quality Designation (RQD).
CONCEPTUAL MODEL
The U.S. Geological Survey provided the original geological cross-section model along
the North Ramp (USGS, 1993). That model was subsequently modified by others, and
new cross-sections have also been prepared manually. For this study, the cross-section
shown in Figure 2 was recreated interactively using the Lynx GMS Geosciences
Modeling System, to ensure that all of the new bore hole data and corroborative surface
control (Scott and Bonk, 1984) were honored.
The cross-section shown in Figure 2 is consistent with the conventional assumption that
all faults in the repository area are generally down-thrown on the west side. This
interpretation requires a variable, but relatively steep, dip to the beds that can locally
exceed 6 degrees (10% grade). This cross-section also suggests the possible existence of
one or more faults with the east side down thrown. The eight bore holes noted in Figure
2 are of variable lengths and are shown in their proper orientation with respect to the
220 GEOSTATISTICAL APPLICATIONS
Figure 2 Cross-section along the North Ramp, showing bore holes NRG-1, NRG-4, NRG-5, and NRG-7A (not to scale).
During construction, emplacement, retrieval (if required), and closure phases of the
project, consideration of excavation stability must be incorporated into the design to
ensure worker health and safety, and to prevent development of potential pathways for
radionuclide migration during the post-closure period. In addition to the loads imposed
by the in-situ stress field, the repository drifts will be impacted by thermal loads
developed after waste emplacement and, periodically, by seismic loads. Rock mass
mechanical properties, which reflect the intact rock properties and the fracture joint
characteristics, are used in detailed mechanical analyses to evaluate the host rock
response to loading. The RQD index is widely used as an indicator of rock
quality/integrity in rock mechanics practice. The concept of RQD is that of a modified
core-recovery percentage that incorporates only nonfractured pieces of core that are 0.33
ft (0.10 m) or greater in length:

    Run RQD = [ Σ (piece lengths ≥ 0.33 ft) / run length (ft) ] × 100%
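The run-RQD formula above can be sketched as follows (function and variable names are illustrative):

```python
def run_rqd(piece_lengths_ft, run_length_ft, cutoff_ft=0.33):
    """RQD for a core run: percent of the run made up of intact,
    nonfractured pieces at least cutoff_ft (0.33 ft = 0.10 m) long."""
    intact = sum(p for p in piece_lengths_ft if p >= cutoff_ft)
    return 100.0 * intact / run_length_ft

# A 4-ft run with intact pieces of 0.5 and 1.0 ft (the 0.2- and 0.3-ft
# pieces fall below the cutoff):
print(run_rqd([0.5, 0.2, 1.0, 0.3], 4.0))  # 37.5
```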
Although other parameters of rock quality are available and widely accepted, e.g., the rock
mass rating system (RMR) and rock mass quality (Q), the RQD index is considered to be
a good indicator since it reflects a combined measure of joint frequency, degree of
alteration, and discontinuity filling, if these exist (Deere and Deere, 1989). In fact, both
RMR and Q measurements incorporate a factorial component of RQD in their derivation.
Common tunnelers' rock quality classifications (Deere & Deere, 1989) are correlated to
RQD values in Table 1, while the information provided in Table 2 summarizes expected
shotcrete and additional support requirements for a tunnel in rock which has been
excavated by a boring machine (Cecil, 1970).
This study found the recovery data for individual core runs to be highly variable.
Typically, core runs with poor or no recovery are short and numerous, while
intervals with high recovery are usually as long as the drillers could make them. RQD
was measured on individual core runs, but the high local variability in core recovery and
disparate lengths of core runs made analysis of core run data difficult. A weighted-
average composite of RQD values on 10-foot intervals provided useful information with
which to perform geostatistical analyses.
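The compositing step can be sketched as a length-weighted average of run RQDs within each 10-ft interval (names are illustrative; assumes the run lengths falling in the interval are given):

```python
def composite_rqd(runs):
    """Length-weighted RQD composite over one interval.
    runs: list of (run_length_ft, rqd_percent) pairs within the interval."""
    total_length = sum(length for length, _ in runs)
    weighted_sum = sum(length * rqd for length, rqd in runs)
    return weighted_sum / total_length

# A 10-ft interval covered by a 6-ft run at RQD 50 and a 4-ft run at RQD 25:
print(composite_rqd([(6.0, 50.0), (4.0, 25.0)]))  # 40.0
```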
TABLE 1

TABLE 2 -- SHOTCRETE SUPPORT/THICKNESS
At the heart of geostatistics is the concept of the regionalized variable (ReV). Without
expanding upon random function theory, the ReV can be considered to be a single-valued
function defined over a metric space that has properties intermediate between a truly
random variable and one that is deterministic. In practice, a ReV is preferentially used to
describe natural phenomena which are spread out in space (and/or time) and which
display a certain structure. This structure is typically characterized by fluctuations that
are smooth at a global scale but erratic enough at a local scale to preclude their analytical
modeling (Olea, 1991). Unlike true random variables, the ReV has continuity from point
to point, but the changes in the variable are complex.
Previous studies of RQD at Yucca Mountain (Lin et al., 1993) concluded that, in general,
fracture frequency increases with increasing degree of welding in the
volcanic tuffs. The use of an average RQD value to represent the rock quality of an
entire unit, though, was not deemed appropriate to account for its observed spatial
dispersion. The lateral variation of fracture frequencies and RQD observed by Lin led to
the recommendation that a range of values be considered in the drift design
methodology. While Lin's previous work recognized lateral variability of RQD within
units, this paper outlines the first attempt to model or further examine the nature of these
changes.
Although it would appear, at this point, that RQD could be considered a ReV in a manner
similar to other rock properties that can be expected to vary in space, e.g., porosity or
hydraulic conductivity, the dependence of RQD upon not only the structural fabric of
Yucca Mountain but also the relationship between the vertical borehole data and that
same structure produced some unanticipated problems.
Of the four drill cores available to Lin (1993) for evaluation, nearly 95% of the 4000
fractures measured occurred within the more densely welded units and possessed near-
vertical dip orientations. While this vertical nature of fracturing is consistent with most
of the faults and fractures in the Basin and Range geological province which characterizes
the Yucca Mountain area, it required Lin to make corrections when estimating the non-
directional volumetric fracture frequency for each unit. All the data available to this
study are also from drill cores and subject to similar considerations, i.e., there may be
question as to the validity of RQD measurements derived from vertical drill holes that
align themselves sub-parallel with the structural fabric. For example, intervals of good
core recovery and relatively high RQD may simply reflect an isolated block of intact rock
in pervasively fractured ground. It can also be shown that the orientation of the drill
cores and the sample volume analyzed can influence the interpretation of RQD and
distort characterization and modeling of the ReV. Fortunately, this study is focused in
two dimensions along the axis of a subsurface drift known as the North Ramp. The
definition of the ReV, and any interpretations made from its modeling, will therefore be
constrained and specific to this locale. Although limited, the RQD data are consistent in
their sample size (boring diameter and 10-foot composite lengths) and general orientation
(all vertical, except for drillhole NRG-3).
RQD data were developed along the North Ramp from eight boring cores. These borings
are shown in Figure 3 and are vertically exaggerated by five times their actual
displacement. The histogram in Figure 4 shows the frequency of RQD values, grouped
into 10 classes. A review of this histogram shows a positively skewed distribution
having a mean value of 25.3 and a standard deviation of 24.5. It is interesting to note that
75% of these values are below an RQD value of 44.0. The limited and sporadic
Figure 3 Posting of RQD data in the bore holes along the axis of the North Ramp. [Borings shown include NRG-7A, NRG-3, and NRG-5; horizontal axis: distance from NRG-1.]
Figure 4 Histogram of the RQD data along the North Ramp. [Axes: frequency (%) versus RQD, 0 to 100.]
CROMER ET AL. ON ROCK QUALITY DESIGNATION 225
There are many situations in which the pattern of spatial continuity of higher values is not
the same as that of lower values. For example, preferential groundwater flow resulting
from aquifer heterogeneity can often be attributed to the isolated occurrence of sand/gravel
stringers/lenses that possess the very highest permeability. When such marked disparities
exist within a distribution, the higher values tend to increase within-lag variability and
make variogram interpretation difficult. The focus of this study was, therefore, directed
more specifically at trying to understand the nature of the spatial structure from the lower
RQD values and predict their occurrence as they relate to specific design issues.
Indicator methods (Isaaks and Srivastava, 1989) are non-parametric and provide the
flexibility needed to focus this study on particular classes of data. This focus is
accomplished by transforming the raw data distribution into K mutually exclusive classes
of binary indicator variables. The indicator transform of the raw data is typically defined
as either zero or one, depending upon whether it falls above or below a particular data
value (cutoff threshold):
i(x; z_k) = 1 if z(x) ≤ z_k
i(x; z_k) = 0 if z(x) > z_k
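The indicator coding defined above reduces to a few lines. The sketch below is illustrative only: the function name and toy values are not from the paper, and the cutoffs are simply the four thresholds (10, 25, 50, and 75) used later in this study.

```python
def indicator_transform(values, cutoffs):
    """Return one binary indicator list per cutoff z_k:
    i(x; z_k) = 1 if z(x) <= z_k, else 0."""
    return [[1 if z <= zk else 0 for z in values] for zk in cutoffs]

# Toy RQD measurements coded at the study's four thresholds.
rqd = [5.0, 18.0, 44.0, 81.0]
indicators = indicator_transform(rqd, cutoffs=[10, 25, 50, 75])
```

Each row of `indicators` is the binary data set to which variography and simulation are then applied for that cutoff.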
Contrary to the vertical orientation, the horizontal spatial structure (Figure 6) at the 25
RQD cutoff is not as clearly definable. Although some structure is apparent in Figure 6,
there are many instances where one must rely on additional information, external to the
sample data, to formulate a professional judgment on the horizontal component of this
anisotropy. For example, geologic information collected from surface transects can often
provide details on principal directions of spatial correlation for the REV at a scale not
easily observed through isolated borings. An omni-directional variogram developed for
this cutoff would not, in an explicit fashion, draw attention to limitations and anisotropy
in the bore-hole data, making this an excellent case for emphasizing that exploratory data
analysis cannot be discounted as simply an exercise, nor should geometric anisotropy be
ignored.
The lower RQD values display a relatively stable, continuous structure consistent with
the conceptual geologic framework. This stability, however, does not persist when
examining the 50 RQD indicator threshold. At the 50 RQD threshold, the vertical
variogram in Figure 7 continues to exhibit good structure and was modeled with the
parameters shown. The horizontal variogram (Figure 8), on the other hand, has
degenerated with the inclusion of 78 additional indicator data previously assigned a value
of 0 at the 25 cutoff. Not many conclusions can be drawn from such a relationship and,
therefore, the horizontal component was modeled with an effective nugget at the
theoretical sill of 0.147. The observed degeneration of the variogram structure (higher
nugget, poorly defined range) reflects the erratic spatial occurrence of higher RQD
values, which is again consistent with the conceptual framework.
Since limited and sporadic high RQD data imposed the observed rapid degeneration of
the variogram structure at higher thresholds, the potential for order relations problems
Figure 5 Variogram along the vertical principal direction of anisotropy, for the 25 RQD indicator cutoff. [Fitted spherical model; nugget 0.08.]
Figure 6 Variogram along the horizontal principal direction of anisotropy, for the 25 RQD indicator cutoff. [Fitted spherical model; nugget 0.08.]
Figure 7 Variogram along the vertical principal direction of anisotropy, for the 50 RQD indicator cutoff. [Fitted spherical model.]
Figure 8 Variogram along the horizontal principal direction of anisotropy, for the 50 RQD indicator cutoff. [Modeled as an effective nugget at the theoretical sill of 0.147.]
existed. These problems threaten the capability of the simulation algorithm to represent
the higher two thresholds (50 and 75) because covariance reproduction is not constrained
where high nugget effects prevail. For this reason, quantitative inferences will not be
made from these upper thresholds.
The indicator transforms were simulated using the sequential indicator simulation
algorithm SISIM (Deutsch and Journel, 1992). Figure 9 shows three separate simulated
fields of RQD along the axis of the North Ramp. Each field is conditioned to the existing
bore-hole data and presents a plausible version of the "reality" defined by first- and
second-order statistical moments. These figures have been vertically exaggerated by five
times the horizontal dimension for detailed examination. The three images display some
similar textural characteristics, while the uncertainty in their representation is captured by
the differences between images. If each image were used, for example, as input data to
some process modeling code for design purposes (say, in a Monte Carlo fashion), the
variation in outcomes from the process model would explicitly account for uncertainty.
Geostatistical simulation was selected over estimation in this study because of its
robustness in addressing potential "downstream" application questions. Simulation
differs from estimation in two major respects: (1) simulation techniques provide high-
resolution models that strive for overall textural and statistical representation rather than
local accuracy, and (2) the differences among the alternative models provide a measure of
joint spatial uncertainty. For example, some uncertainty issues can be addressed simply
from the equiprobable images (models) prior to any downstream process modeling. In
Figure 10, a total of 100 conditional simulations have been processed using the crude
cumulative distribution function defined by the four indicator thresholds 10, 25, 50, and
75 to determine the distribution of expected (mean) RQD values. The gray scale limits
the displayed variability in the image to a range between a maximum of 50 and a minimum
of 0 to capture detail. Notice the limited occurrence of values that equal or exceed 50,
indicating that these are isolated occurrences that should not be expected to propagate
spatially.
Most areas of the expected value map shown in Figure 10 are dominated by values that
range between 20.0 and 30.0. Although this is consistent with the histogram of RQD
values shown in Figure 4, it may also be due, in part, to the selection of simple kriging as
the local estimator. Simple kriging was chosen over ordinary kriging because of the
scarcity of data and the risk of unwarranted data propagation (Deutsch and Journel,
1992). If ordinary kriging is used with sequential simulation, there may be a tendency to
propagate locally simulated values in a manner inconsistent with the conceptual model.
This characteristic becomes more problematic when there is a lack of constraining,
original data.
Since each pixel (model grid cell) is simulated 100 times, the statistical distribution of
each local outcome allows us to query characteristics of the outcome distribution of RQD
values on a pixel-by-pixel basis. Post-processing of several outcomes can provide
information such as the probability of exceeding a specified threshold or the average
Figure 9 Three alternative 2-D images (realizations) of RQD along the axis of the North Ramp.
The angular trace in the middle of each image represents the vertical orientation of the ramp
within the cross-section. Each image can be considered equally probable given the state of
existing knowledge, because each is conditioned to the same sample data and honors the same
spatial statistics. The differences between the images provide a measure of joint spatial
uncertainty.
value above, or below, a threshold. A map showing the value at which an individual
pixel reaches a specified cumulative probability, for example, would provide valuable
information for quantifying risk. Figure 11 shows the probability of exceeding an RQD
value of 25. Although this map looks very similar to the expected value map of Figure
10, it reveals very different information. The gray scale in Figure 11 ranges between
zero (0% probability) and one (100% probability), unlike the expected value map.
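The pixel-by-pixel post-processing described here can be sketched directly. This is not the SISIM post-processor itself; it is a minimal illustration, assuming the realizations are stored as a stack of 2-D grids, with all names and toy values invented for the example.

```python
def postprocess(realizations, threshold):
    """Per-pixel expected (mean) value and exceedance probability over a
    stack of equiprobable realizations (each a 2-D grid of simulated RQD)."""
    n = len(realizations)
    ny, nx = len(realizations[0]), len(realizations[0][0])
    # E-type (expected value) map: mean across realizations at each pixel.
    expected = [[sum(r[i][j] for r in realizations) / n for j in range(nx)]
                for i in range(ny)]
    # Probability map: fraction of realizations meeting or exceeding threshold.
    prob_exceed = [[sum(1 for r in realizations if r[i][j] >= threshold) / n
                    for j in range(nx)] for i in range(ny)]
    return expected, prob_exceed

# Toy stack of three 2 x 2 "realizations" of RQD.
sims = [[[10, 30], [50, 20]],
        [[20, 40], [60, 20]],
        [[30, 50], [40, 20]]]
mean_map, p25 = postprocess(sims, threshold=25)
```

With 100 realizations per pixel, the same two maps correspond to Figures 10 and 11.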
CONCLUSIONS
Unfortunately, the 2-D simulated images along the North Ramp cross-section do not
explicitly focus information on the expected variability to be encountered along the drift
itself. To evaluate anticipated conditions specifically along the drift, the designed
inclination of the drift has been projected from the tunnel entrance and is shown as the
trace superimposed on the images in Figure 9. The expected (mean) value of RQD along
the tunnel projection has been extracted on a pixel-by-pixel basis for comparison against
each of the three simulations presented in Figure 9.
The graphs shown in Figure 12 allow us to compare the variability in simulated RQD
along the three tunnel projections (taken from Figure 9) as a function of distance from the
right (east) edge of the cross-section. As a point of reference, the east edge of the cross-
section also corresponds to the location of boring NRG-1. The most immediate
observation in Figure 12 is the widespread, erratic fluctuation of simulated values about
their expected (mean) value. This was to be expected following our variography
exercises and discovery of the limited horizontal correlation range (approx. 800.0 ft
(243.8 m)) for lower RQD values and negligible spatial correlation in the higher RQD
values. What is not so apparent is the performance of the simulation in areas that are
conditioned by the available boring logs. At distances of less than 3200 ft (975.4 m) from
NRG-1, the simulations, in general, tend to deviate less from the expected value. Boring
log data in this region are available to constrain uncertainty and, therefore, reduce the
spread of likely outcomes for a local prediction.
FINAL THOUGHTS
Basic exploratory data analysis identified a great deal of local variability in RQD.
Although very low RQD (i.e., less than 25) can be anticipated periodically along the entire
length of the North Ramp, it would not be prudent to extrapolate this interpretation to the
entire mountain. Three factors were found to influence the interpretation of RQD: 1)
stratigraphic setting, 2) proximity to major fault/fracture zones, and 3) very local foot-
by-foot factors (likely due to individual high-angle fractures sub-parallel to the drill core).
The high degree of variability over very short distances may require design planning to
accommodate the worst rock conditions along the entire length of excavation.
Figure 10. Mean (expected) value map developed from 100 individual simulations of RQD.
Figure 11 Probability of exceeding an RQD value of 25. [Gray scale from 0 (0% probability) to 1.0 (100% probability).]
Figure 12 Simulated RQD values along the proposed North Ramp taken
from the three fields shown in Figure 9. For comparison, also shown (in
bold) are their expected values derived from the 100 simulations.
This study has demonstrated how the measurement and analysis of data may lead to
interpretations that are not obvious or apparent using other means of research. Although
many statistical tools are useful in developing insights into a wide variety of natural
phenomena, many others can be used to develop quantitative answers to specific
questions. Unfortunately, most classical statistical methods make no use of the spatial
information in earth science data sets. However, like classical statistical tests,
geostatistical techniques are based on the premise that information about a phenomenon
can be deduced from an examination of a small sample collected from a vastly larger set
of potential observations on the phenomenon. Geostatistics offers a way of describing the
spatial continuity that is an essential feature of many natural phenomena and provides
adaptations of classical regression techniques to take advantage of this continuity. The
quantitative methodology found in applications of geostatistical modeling techniques can
reveal the insufficiency of data, the tenuousness of assumptions, or the paucity of
information contained in most geologic studies.
REFERENCES

Cecil III, O. S., 1970, "Correlations of Rock Bolt-Shotcrete Support and Rock Quality
Parameters in Scandinavian Tunnels," Ph.D. Thesis, University of Illinois, Urbana.

Deere, D. U., and D. W. Deere, 1989, "Rock Quality Designation (RQD) after Twenty
Years," U.S. Army Corps of Engineers Contract Report GL-89-1.

Deutsch, C. V., and A. G. Journel, 1992, GSLIB: Geostatistical Software Library and User's
Guide, Oxford University Press, New York, New York.

Lin, M., M. P. Hardy, and S. J. Bauer, 1993, "Fracture Analysis and Rock Quality
Designation Estimation for the Yucca Mountain Site Characterization Project," Sandia
Report SAND92-0449, Sandia National Laboratories, Albuquerque, NM.

Scott, R. B., and J. Bonk, 1984, "Preliminary Geologic Map of Yucca Mountain, Nye
County, Nevada, with Geologic Sections," U.S. Geol. Survey Open-File Report 84-494.

US Geological Survey, 1993, "Methodology and Source Data Used to Construct the
Demonstration Lithostratigraphic Model: Second Progress Report."
CARR ON NORTHRIDGE EARTHQUAKE 237
Fig. 1. Seismic hazard model developed using Gumbel (1958); from Carr (1983) and
also published in Carr and Glass (1984). Contoured values are probabilities (%) of exceeding
intensity VI over a 50-year period. [Map axes: longitudes 119 and 118.]
were geographically registered during the kriging process. Because the rasters are
registered, the final step in the indicator kriging model is simply a summing of all
rasters to form one combined raster. A contour map of the combined raster shows
the frequency of exceeding a threshold of VI over a particular time period. This
frequency constitutes the seismic hazard for a particular geographic region. Carr and
Bailey (1985) applied the indicator kriging model to the New Madrid, Missouri
seismic zone in the time period 1811-1980.
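A typeset equation is missing from the reproduction at this point; the estimator described in the next paragraph is the standard ordinary kriging form, reconstructed here under that assumption:

```latex
Z^{*}(x_c) \;=\; \sum_{i=1}^{N} a_i \, Z(x_i),
\qquad \text{subject to} \qquad \sum_{i=1}^{N} a_i \;=\; 1 .
```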
wherein Z(x_i) are data values at the N nearest data locations to the estimation location, x_c;
Z*(x_c) is the estimated value at the estimation location, x_c; and the values, a_i, are
weights applied to the N data values to obtain the estimate. A restriction is placed on
the weights, a_i, in ordinary kriging such that their sum is 1; this assures unbiased
estimation.
as follows:

γ(h) = [1 / (2N)] Σ_{i=1}^{N} [Z(x_i) − Z(x_i + h)]²
[Figure: experimental semivariogram γ(h) versus lag distance h (km); fitted spherical model with sill = 165 and range = 2.3 km.]
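The experimental semivariogram defined above can be computed directly from paired data. The sketch below is a minimal 1-D illustration, assuming a simple lag tolerance for grouping pairs; the function name and toy values are not from the paper.

```python
def semivariogram(values, positions, lag, tol):
    """Experimental semivariogram gamma(h) = 1/(2N) * sum (Z(x_i) - Z(x_i+h))^2,
    over the N data pairs whose separation falls within lag +/- tol."""
    squared_diffs = []
    n = len(values)
    for i in range(n):
        for j in range(i + 1, n):
            h = abs(positions[j] - positions[i])
            if abs(h - lag) <= tol:
                squared_diffs.append((values[i] - values[j]) ** 2)
    if not squared_diffs:
        return float("nan")  # no pairs at this lag
    return sum(squared_diffs) / (2 * len(squared_diffs))

# Regularly spaced 1-D samples; gamma at a lag of 1 unit.
z = [2.0, 4.0, 3.0, 5.0]
x = [0.0, 1.0, 2.0, 3.0]
gamma_1 = semivariogram(z, x, lag=1.0, tol=0.01)
```

Repeating this over a series of lags and fitting a model (here, a spherical one) yields the plot described above.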
value, XII (12), represents total damage: landsliding, fissuring, liquefaction, and so
on. A value of VI (6) is that at which exterior structural damage is noticed, such
as cracked chimneys. Interior damage is noted with a value of V (5). Subsequent to an
earthquake, the United States Geological Survey distributes questionnaires to citizens
living within the region experiencing the earthquake. They are asked to describe what
they experienced during the earthquake. Examples include: 1) Did you observe
damage and, if so, what was the damage? 2) Did you feel the earthquake and, if so,
where were you when you felt it? Intensity values are then assigned (subjectively) to
each questionnaire.
That modified Mercalli intensity data are subjective is obvious. What is not
obvious is that geostatistics (kriging) is validly applied to grid (estimate) such data.
Clearly, Glass (1978) showed this empirically. Journel (1986) discusses the
application of geostatistics to "soft," or subjective, data in considerable detail.
Indicator kriging is a form of kriging that does not entail a change in the
equation for the kriging estimator, but does entail a change in the data to which
kriging is applied. With indicator kriging, a transform is applied to the data, in this
case modified Mercalli intensity values. This transform is a simple one: i(x) = 0 if
Z(x) < c; i(x) = 1 otherwise. This simple transform yields the indicator function, i.
Notice that the indicator function is a binary one, taking on only two possible values,
0 and 1. Because of this, the indicator function is said to be a nonparametric
function, because the notion of a probability distribution for such a function is not
pertinent. The nonparametric nature of the indicator function has certain advantages
in geostatistics (Journel 1983), chiefly the minimization of the influence of extreme
data values on the calculation of the semivariogram and in kriging. The value, c,
used to define the indicator function is called a threshold value. In this study of
seismic hazard, c is that critical ground motion value chosen to define the hazard. In
this study, c is chosen to be an intensity value of VI (6) because this intensity value is
that at which exterior structural damage is first noticed.
Van der Meer and Carr (1992) focused analytical attention on whether high
hazard correlated spatially with known, active faults. That study found that higher
hazard could not be directly related to any one active fault in southern California.
This study verifies this conclusion. Higher hazard does not directly correlate spatially
with known active faults (Fig. 3).
Fig. 3. Indicator kriging hazard map with major active faults superimposed. Regions
associated with at least 6 episodes of intensity VI or higher ground motion in the time period
1930-1971 are highlighted in gray. The faults are coded as follows: A) White Wolf Fault;
B) Garlock Fault; C) Big Pine; D) Santa Ynez; E) Oak Ridge; F) San Andreas; G) San Gabriel;
H) Newport-Inglewood; I) San Jacinto.
ground motion were experienced. But a higher hazard would not necessarily be
expected within this gray-patterned region because it is not near any one fault. Its
proximity to three active faults, however, makes it vulnerable to damage during
earthquakes occurring on all three faults.
As a test of the hypothesis (Fig. 4), the active faults shown in Figure 3 are
idealized as shown in Figure 5. A digital raster is developed for each of these faults
as follows: 1) an attenuation function was designed from a general formula given in
Cornell (1968): intensity = 5.4 + M − 3 ln R, where M is Richter magnitude and R is
the distance from the fault; 2) a typical Richter magnitude was chosen for each of the
nine (9) faults (Table 1); 3) a 34 x 34 digital raster (an arbitrary choice of size) was
[Figure 5: idealized fault map; legend: E = epicenter, dashed lines = faults.]
developed, geographically registered to the kriged seismic hazard rasters (note that
this grid size is smaller than that used for indicator kriging; both grid sizes, however,
are arbitrary and merely facilitate the construction of contour maps). An
intensity value was estimated for each cell of the raster using the foregoing
attenuation formula (not by indicator kriging in this case); 4) if the estimated intensity
was VI or greater, the raster cell was assigned a value of 1; otherwise the cell was
assigned the value 0. Once a digital raster was developed by this procedure for each
of the nine (9) active faults (Fig. 6), a composite raster was formed as a sum of all
nine rasters. Frequency of intensity VI or greater ground motion was then contoured
(Fig. 7). Gray shading highlights the geographic regions associated with the highest
frequency of damaging ground motion.
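Steps 1) through 4) can be sketched as follows. This is an assumption-laden illustration, not the study's code: each fault is idealized here as a single point rather than a mapped trace, and the magnitudes, grid size, and cell spacing are placeholder values.

```python
import math

def intensity(M, R):
    """Cornell (1968)-style attenuation: intensity = 5.4 + M - 3 ln R (R > 0)."""
    return 5.4 + M - 3.0 * math.log(R)

def hazard_raster(fault_xy, M, size, cell_km, threshold=6.0):
    """Binary raster: 1 where estimated intensity >= threshold (VI), else 0.
    Simplification: the fault is a single grid point, not a trace."""
    fx, fy = fault_xy
    raster = []
    for row in range(size):
        raster.append([])
        for col in range(size):
            # Distance in km from the fault; floor at one cell to avoid ln(0).
            R = max(cell_km, math.hypot(col - fx, row - fy) * cell_km)
            raster[row].append(1 if intensity(M, R) >= threshold else 0)
    return raster

# One binary raster per fault, then a cell-wise sum gives the composite
# frequency-of-damaging-ground-motion raster (all rasters share one grid).
faults = [((5, 5), 6.5), ((20, 20), 7.0)]  # (cell coordinates, magnitude)
rasters = [hazard_raster(xy, M, size=34, cell_km=5.0) for xy, M in faults]
composite = [[sum(r[i][j] for r in rasters) for j in range(34)]
             for i in range(34)]
```

Contouring `composite` would reproduce the style of map described in the text.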
[Table 1: fault and typical Richter magnitude assigned to each of the nine faults.]
CONCLUSION
Fig. 7. Resultant hazard map produced by hypothesizing earthquakes along the entire
spatial domain of active faults. Gray shading shows regions where the theoretical model predicts
the highest frequency of intensity VI or greater ground motion.
Fig. 8. The seismic hazard maps of Figures 4 and 7 with epicenters plotted for the
1987 Whittier Narrows, 1990 Upland, and 1994 Northridge earthquakes. [Left panel: indicator
kriging model; right panel: theoretical model.]
earthquakes occurring within a time window of interest, the rasters are simply
summed to yield a final composite map (e.g., Figure 3). No normalization is
performed subsequent to this summing process. Final maps therefore do not represent
probabilities, but instead represent the total frequency of experiencing ground motion
severe enough to cause damage. Moreover, this summing process assumes all rasters
are geographically registered, a condition easy to achieve with kriging because, when
used for gridding, the geographic coordinates defining the grid must be entered into the
computer program performing the kriging.
In summary, geostatistics offers spatial analysis tools that are quite useful for
producing maps of seismic activity. Ground motion for individual earthquakes is
readily gridded using kriging. Furthermore, what is presented herein is nothing more
than a raster-based geographic information system. Hence, GIS programs having a
raster-import capability, such as ARC/INFO, are capable of displaying the results given
herein once digital rasters have been formed using software such as is given in Carr
(1995).
References

Carr, J. R., 1995, Numerical Analysis for the Geological Sciences, Prentice-Hall,
Englewood Cliffs, New Jersey.

Carr, J. R., and Glass, C. E., 1984, "A Regionalized Variables Model for Seismic
Hazard Assessment," Eighth World Conf. on Earthquake Engineering, Prentice-Hall,
Englewood Cliffs, New Jersey, Vol. 1, pp. 207-213.

Carr, J. R., and Bailey, R. E., 1986, "An Indicator Kriging Model for the
Investigation of Seismic Hazard," Mathematical Geology, Vol. 18, No. 4, pp. 409-428.

Gumbel, E. J., 1958, Statistics of Extremes, Columbia University Press, New York.

Journel, A. G., and Huijbregts, Ch. J., 1978, Mining Geostatistics, Academic Press,
London.

Matheron, G., 1963, "Principles of Geostatistics," Economic Geology, Vol. 58, pp.
1246-1266.
REFERENCE: Goderya, F. S., Dahab, M. F., Woldt, W. E., Bogardi, I., "Spatial Patterns
of Field Measured Residual Soil Nitrate," Geostatistics for Environmental and
Geotechnical Applications, ASTM STP 1283, R. Mohan Srivastava, Shahrokh Rouhani,
Marc V. Cromer, A. Ivan Johnson, Alexander J. Desbarats, Eds., American Society for
Testing and Materials, 1996.
ABSTRACT: The purpose of this study was to assess the spatial variability of
residual soil nitrate, measured in three contiguous 16 ha fields. Available data for
residual soil nitrate were examined using conventional statistics. Data tended to be
skewed, with the mean greater than the median. Geostatistical methods were used to
characterize and model the spatial structure. Three-dimensional spatial variability was
examined using two semivariograms: horizontal-spatial and vertical. Two-
dimensional horizontal-spatial semivariograms were also computed for each 0.3 m (1 ft)
layer. Semivariogram analysis showed that there were similarities in the patterns of
spatial variability for all fields. The results suggest that the spatial patterns in
residual soil nitrate may be correlated with irrigation practices. Furthermore, a trend
was found to be present along the vertical direction, which may be related to the time
of sampling.
INTRODUCTION
The origin and nature of soil resource variability includes natural and
management induced soil parameters, and factors exhibiting variability in space and
1Graduate Research Assistant and 2Prof., Dept. of Civil Eng., University of Nebraska, Lincoln, NE 68588;
3Assistant Prof., Dept. of Biological Systems Engineering, University of Nebraska, Lincoln, NE 68583.
GODERYA ET AL. ON SOIL NITRATE 249
time (Bouma and Finke 1992). It is an outcome of many processes acting and
interacting over a continuum of spatial and temporal scales. Nitrate is a mobile
nutrient; moreover, soil resource and meteorological variability obscures the
characterization of its spatial structure. For example, soil nitrate concentrations from
individual samples are usually quite variable; in addition, the non-uniform distribution
of irrigation water complicates the issue.

To date, we are not aware of any attempts to characterize the spatial variability
of residual soil nitrate using three-dimensional spatial statistics. This information
is necessary since the spatial variability in residual soil nitrates has been considered a
major factor associated with inherent leaching of nitrate in many production
agriculture situations.
METHODOLOGY
Classical statistical parameters such as the mean, the standard deviation and
the coefficient of variation were calculated for each layer. Statistical parameters for
the overall three dimensional data sets (vertically averaged over core), as well as for
profile (vertically integrated nitrate content for each hole), were also calculated.
Structural analysis of the field data was used to evaluate the semivariogram
function using programs from GSLIB (Deutsch and Journel, 1993). Semivariograms
(Journel and Huijbregts 1978) were used to examine the spatial dependence between
measurements at pairs of locations as a function of distance of separation. Three-
dimensional spatial variability was examined for each of the fields using two
semivariograms: a horizontal-spatial semivariogram and a vertical semivariogram. The
semivariogram for horizontally (spatially) related data identifies the variability due to
distance and is combined over all depths, whereas the vertical semivariogram
describes the variability due to depth irrespective of horizontal location. Hence, for
the available data set of each field, two semivariograms were constructed.
Residual soil nitrate in the profile was highly variable, ranging from 64 to 650
kg/ha (57 to 580 lbs/acre) with a mean of 192 kg/ha (173 lbs/acre). Table 1 shows
the statistical parameters for the three fields. For each layer, data tended to be
skewed, with the mean greater than the median. The general trend was toward an
increase in the values of the coefficient of variation and a decrease in the values of
residual soil nitrogen with increasing depth. For the overall 3-dimensional measurement
values, the distribution of data was skewed, with a large coefficient of variation.
TABLE 1--Residual soil nitrate from three fields.

Field  Layer  Minimum (kg/ha)  Maximum (kg/ha)  Mean (kg/ha)  Median (kg/ha)  Std. dev. (kg/ha)  C.V. (%)
The vertical experimental semivariograms and the models fitted are shown in
Figures 1b, 2b, and 3b. The maximum distance considered in the computation of the
semivariogram cannot exceed half the maximum dimension of the field (i.e., 0.75 m
for the vertical semivariogram) (Journel and Huijbregts, 1975). Thus, only the first
two values of the vertical semivariogram are reliable. None of the vertical
semivariograms reaches a sill, indicating a trend in the property studied. If the information
contained in the semivariogram is to be used for kriging at unsampled locations, the
trend may need to be removed, or universal kriging may be used. A reason for this
trend is most probably related to the presence of high amounts of residual soil nitrate
in the surface layer. Figure 4 shows the average amount of nitrate-N in each layer
for the three fields. Significant differences between the top layer and subsequent
layers may be related to the time of sampling. The results probably exhibit the
influence of temporal dynamics due to the spring sampling of the fields. This may be
because high mineralization and almost no precipitation/irrigation occurred at the time
of sampling of these fields. For this reason, two different types of theoretical models
were fitted to the vertical semivariograms: power and spherical models. If the data
are to be used for simulation purposes, then the power model may not be used and,
hence, another model should be chosen.
Fitting a model to the experimental semivariogram is a significant step in the
geostatistical analysis. It is important to select an appropriate model for the
semivariogram because each model yields different values for the nugget effect and
range. A satisfactory fit to the sample variogram was accomplished by the trial and
error approach as described by Isaaks and Srivastava (1989). Due to resource
constraints, only omni-directional horizontal-spatial semivariograms and vertical
semivariograms were fit to the sample variogram for each field. Table 2 provides the
values of the semivariogram models for the above mentioned cases. Parameters for the
two types of theoretical semivariograms for the vertical direction are also provided in
Table 2. Good agreement was obtained between calculated semivariogram values and
the corresponding models, as shown in Figures 1, 2, and 3.
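The two theoretical models fitted here, spherical and power, have standard closed forms, sketched below for reference. The parameter names are illustrative; the study's fitted values are those reported in Table 2, not reproduced here.

```python
def spherical(h, nugget, sill, a):
    """Spherical semivariogram model with range a:
    gamma(h) = nugget + (sill - nugget) * (1.5*(h/a) - 0.5*(h/a)**3) for 0 < h < a,
    gamma(h) = sill for h >= a, and gamma(0) = 0 by convention."""
    if h == 0:
        return 0.0
    if h >= a:
        return sill
    r = h / a
    return nugget + (sill - nugget) * (1.5 * r - 0.5 * r ** 3)

def power_model(h, c0, w, p):
    """Power model (no sill): gamma(h) = c0 + w * h**p, with 0 < p < 2."""
    return 0.0 if h == 0 else c0 + w * h ** p
```

A trial-and-error fit, as described above, amounts to adjusting these parameters until the model curve tracks the experimental semivariogram points.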
There were no data available for lag distances less than 30 m (100 ft) in the
[Figures 1, 2, and 3: experimental semivariograms and fitted models for Fields 1, 2, and 3; panels (a) horizontal-spatial semivariance ((kg/ha)²) versus distance (m), panels (b) vertical semivariance versus distance (m).]
[Figure 4: average nitrate-N content (kg N/ha) in each layer versus depth (m) for Fields 1, 2, and 3.]
Spatial variability can also be investigated using the semivariogram and the
relative nugget effect, that is, the ratio of nugget to total semivariance expressed as a
percentage. A ratio of less than 25% indicates strong spatial dependence, between 25%
and 75% indicates moderate spatial dependence, and greater than 75% indicates weak
spatial dependence (Cambardella 1994). The horizontal-spatial semivariograms may
be described as having moderate spatial dependence for residual soil nitrate.
However, if one considers the spherical model for vertical semivariograms, then the
vertical semivariograms may be characterized by strong spatial dependence; exhibiting
ratios of less than 25 %. Strong to moderate spatially dependent structures may be
controlled by intrinsic and extrinsic variations as well as seasonal variations.
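The classification rule above is easy to encode. This helper is a sketch using the thresholds quoted from Cambardella et al. (1994); the function name and example values are hypothetical.

```python
def spatial_dependence(nugget, sill):
    """Classify spatial dependence from the relative nugget effect
    (nugget as a percentage of total semivariance), following the
    thresholds of Cambardella et al. (1994)."""
    ratio = 100.0 * nugget / sill
    if ratio < 25.0:
        return "strong"
    elif ratio <= 75.0:
        return "moderate"
    return "weak"

# e.g. a vertical spherical model with a small nugget: 100/800 = 12.5%
print(spatial_dependence(nugget=100.0, sill=800.0))   # -> strong
```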
There was less nitrogen in the soil profile in the third field, and there was less
variability in samples from different layers of this field, as compared to the other two
fields. However, overall (vertically averaged over core) sample variability was the
same or higher (see Table 1 and Figures 3 and 7). Further investigation indicated
that this field received more irrigation water in the previous two years than the other
two fields. It is probable that the excessive application of irrigation water leached
much of the nitrate from the profile and reduced the amount and spatial variability of
residual soil nitrate.
Six directional semivariograms were calculated for each field. All directions
corresponded to rotations in the horizontal plane only. The directions considered
were North, N30E, N60E, N90E, N120E, and N150E, with azimuth half tolerance of
45 degrees. Directional semivariograms are presented as contour maps of the sample
variogram surface (planimetric form) in Figures 8, 9, and 10 for Fields 1, 2, and 3,
respectively. The values contoured are the semivariance in each direction to a
distance of at least 200 meters, with contour intervals in (kg/ha)². Differences
between direction-dependent semivariograms for the fields studied could be the result
of the differences in geology, topography, and/or management of the area.
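A directional sample semivariogram with an azimuth tolerance, as used above, can be sketched as follows. The grid geometry and nitrate values are synthetic stand-ins for the field data; the function is an illustrative implementation, not the program used in the study.

```python
import numpy as np

def directional_semivariogram(xy, z, azimuth_deg, half_tol_deg, lag, nlags):
    """Experimental directional semivariogram: pair points whose
    separation azimuth (clockwise from north, folded to [0, 180))
    falls within +/- half_tol_deg of azimuth_deg, binned by lag."""
    n = len(z)
    sums = np.zeros(nlags)
    counts = np.zeros(nlags, dtype=int)
    target = azimuth_deg % 180.0
    for i in range(n):
        for j in range(i + 1, n):
            dx, dy = xy[j] - xy[i]
            dist = np.hypot(dx, dy)
            if dist == 0:
                continue
            az = np.degrees(np.arctan2(dx, dy)) % 180.0
            diff = min(abs(az - target), 180.0 - abs(az - target))
            if diff > half_tol_deg:
                continue
            k = int(dist // lag)
            if k < nlags:
                sums[k] += 0.5 * (z[i] - z[j]) ** 2
                counts[k] += 1
    return np.where(counts > 0, sums / np.maximum(counts, 1), np.nan)

# Hypothetical 30 m grid of nitrate values (kg/ha)
rng = np.random.default_rng(0)
xs, ys = np.meshgrid(np.arange(0, 300, 30), np.arange(0, 300, 30))
xy = np.column_stack([xs.ravel(), ys.ravel()]).astype(float)
z = rng.normal(190.0, 40.0, len(xy))

# North direction with the 45-degree azimuth half tolerance quoted above
gamma_ns = directional_semivariogram(xy, z, azimuth_deg=0.0,
                                     half_tol_deg=45.0, lag=30.0, nlags=6)
print(gamma_ns)
```

Repeating the call for azimuths 0, 30, 60, 90, 120, and 150 degrees reproduces the six directions considered in the text.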
FIGURE 5--2-D semivariograms for Field 1; (a) five depths; and (b) total in profile.
FIGURE 6--2-D semivariograms for Field 2; (a) five depths, and (b) total in profile.
FIGURE 7--2-D semivariograms for Field 3; (a) five depths, and (b) total in profile.
FIGURE 8--A contour map of the semivariogram values for Field 1. Contour interval
is 50 (kg/ha)².
FIGURE 9--A contour map of the semivariogram values for Field 2. Contour interval
is 50 (kg/ha)².
than in the north-south direction. In other words, the irrigation pattern seems to
result in high variability (larger sill values), with the variogram surface rising
rapidly in the north-south direction. Hence, the directional semivariograms indicate the
258 GEOSTATISTICAL APPLICATIONS
FIGURE 10--A contour map of the semivariogram values for Field 3. Contour interval
is 25 (kg/ha)².
Geostatistical analyses showed that residual soil nitrates in three fields were
spatially structured. This spatial structure is important to consider, both for fertilizer
application and for evaluation of potential pollutant transport to the groundwater. The
apparent spatial variability in the residual soil nitrate has the potential to seriously
limit the efficiency of fertilizer application according to traditional practices.
Conventional statistical analysis showed that the residual soil nitrate in the profile was
variable, ranging from 64 to 650 kg/ha (57 to 580 lbs/acre) with a mean of 192 kg/ha
(173 lbs/acre). Data tended to be skewed, with the mean greater than the median.
The two dimensional analysis showed a strong spatial pattern in the top layer,
which is displayed in the overall structure of the 2-dimensional semivariograms. The
analysis further revealed that the soil nitrates at 0.6 m to 1.5 m (2 to 5 ft) depths may
be sampled without great sensitivity to location, with similar resulting variance.
Direction-dependent semivariograms showed that residual soil nitrates apparently
followed trends in irrigation water supply. This pattern resulted in high variability in
the direction perpendicular to irrigation water flow.
ACKNOWLEDGEMENT
This paper was supported, in part, by the Center for Infrastructure Research,
the Water Center, and the University of Nebraska-Lincoln and, in part, by the
Cooperative State Research Service (CSRS) of the U.S. Department of Agriculture
(Grant Number 92-34214-7457). Assistance provided by Dr. T. A. Peterson from the
Department of Agronomy of the University of Nebraska-Lincoln is acknowledged.
REFERENCES
Beckett, P. H. T., and Webster, R., 1971, "Soil variability: A review", Soils and
Fertilizers, Vol. 34, No. 1, pp. 1-15
Berndtsson, R., Bhari, A., and Jinno, K., 1993, "Spatial dependence of geochemical
elements in a semiarid agricultural field: I. Geostatistical properties", Soil Science
Society of America J., Vol. 57, pp. 1323-1329
Bhatti, A. U., Mulla, D. J., Koehler, F. E., and Gurmani, A. H., 1991, "Identifying
and removing spatial correlation from yield experiment", Soil Science Society of
America J., Vol. 55, pp. 1523-1528
Biggar, J. W., Nielsen, D. R., and Erh, K. T., 1973, "Spatial variability of field-
measured soil-water properties", Hilgardia, Vol. 42, No. 7, pp. 214-259
Biggar, J. W., and Nielsen, D. R., 1976, "Spatial variability of the leaching
characteristics of a field soil", Water Resources Research, Vol. 12, No. 1, pp. 78-84
Bouma, J., and Finke, P. A., 1992, "Origin and nature of soil resource variability",
Proceedings of the Soil Specific Crop Management Conference, Minneapolis,
Minnesota, April 14-16
Bresler, E., 1989, "Estimation of statistical moments of spatial field averages for soil
properties and crop yields", Soil Science Society of America J., Vol. 53, pp. 1645-
1653
Cambardella, C.A., Moorman, T. B., Novak, J. M., Parkin, T. B., Karlen, D. L.,
Turco, R. F., and Konopka, A. E., 1994, "Field-scale variability of soil properties in
central Iowa soils", Soil Science Society of America J., Vol. 58 (In press)
Dahiya, I. S., Ritcher, J., and Malik, R. S., 1984, "Soil spatial variability: A
review", International Journal of Tropical Agriculture, Vol. 11, No. 1, pp. 1-102
Davis, J., 1986, "Statistics and Data Analysis in Geology", John Wiley & Sons, New
York, NY
Deutsch, C.V., and Journel, A. G., 1992, "GSLIB: Geostatistical Software Library
and User's Guide", Oxford University Press, New York, NY
Guarascio, M., David, M., and Huijbregts, C. J., 1975, "Advanced Geostatistics in
the Mining Industry", D. Reidel Publishing Company, Dordrecht, Holland
Jury, W. A., Russo, D., Sposito, G., and Elabd, H., 1987, "The spatial variability of
water and solute transport properties in unsaturated soil; I. Analysis of property
variation and spatial structure with statistical models", Hilgardia, Vol. 55, No.4, pp.
1-32
Kalinski, R.J., Kelly, W. E., Bogardi, I., and Pesti, G., 1993, "Electrical resistivity
measurements to estimate travel times through unsaturated ground water protective
layers", Journal of Applied Geophysics, Vol. 30, pp. 161-173
Mulla, D. J., 1988, "Estimating spatial patterns in water content, matric suction, and
hydraulic conductivity", Soil Science Society of America J., Vol. 52, pp. 1547-1553
Ovalles F. A., and Collins, M. E., 1988, "Evaluation of soil variability in northwest
Florida using geostatistics", Soil Science Society of America J., Vol. 52, pp. 1702-
1708
Peterson T. A., and Schepers J. S., 1992, "Spatial distribution of soil nitrate at the
Nebraska MSEA site", Agriculture Research to Protect Water Quality, Poster Paper,
USDA Agricultural Research Service, University of Nebraska, Lincoln, NE
Rolston, D. E., and Liss, H. J., 1989, "Spatial and temporal variability of water
soluble organic carbon in a cropped field", Hilgardia, Vol. 57, No.3, pp. 1-19
Sutherland, R. A., Kessel, C. V., and Pennock, D. J., 1991, "Spatial variability of
Nitrogen-15 natural abundance", Soil Science Society of America J., Vol. 55, pp.
1339-1347
Tabor, J. A., Warrick, A. W., Myers, D. E., and Pennington, D. A., 1985, "Spatial
variability of nitrate in irrigated cotton: II. Soil nitrate and correlated variables",
Soil Science Society of America J., Vol. 49, pp. 390-394
Webster, R., and Burgess, T. M., 1983, "Spatial variation in soil and the role of
Kriging", Agricultural Water Management, Vol. 6, pp. 111-122
Woldt, W., and Bogardi, I., 1992, "Ground water monitoring network design using
multiple criteria decision making and geostatistics", Water Resource Bulletin, Vol.
28, No.1, pp. 45-62
Woldt, W., Bogardi, I., Kelly, W. E., and Bardossy, A., 1992, "Evaluation of
uncertainties in a three-dimensional groundwater contamination plume", Journal of
Contaminant Hydrology, Vol. 9, pp. 271-288
Dae S. Young1
REFERENCE: Young, D. S., "Geostatistical Joint Modeling and Probabilistic Stability Anal-
ysis for Excavations," Geostatistics for Environmental and Geotechnical Applications, ASTM STP
1283, R. M. Srivastava, S. Rouhani, M. V. Cromer, A. I. Johnson, A. J. Desbarats, Eds., American
Society for Testing and Materials, 1996.
YOUNG ON ANALYSIS OF EXCAVATIONS 263
model, which will yield local structural stability in terms of the
probability of failure.
Since the block size distribution (i.e., blocks formed by the joints
in a rock mass) can describe or be related to these engineering
criteria, it is a pertinent characteristic parameter in numerous
engineering studies including tunneling and underground excavations,
rock bolting and other types of supporting systems, engineering
classifications of rock mass, key block analysis for structural
stability, drilling and blasting, and transmissibility of fluids through
fractured rock formations.
In this paper, a numerical method was developed to identify blocks
and calculate their sizes (or volumes), shapes, and locations, as well
as their stability. The connectivity matrix was introduced in this
numerical approach, which is equivalent to the stiffness matrix of the
finite element method of stress analysis. Then, the key block analysis
was extended for the probabilistic structural analysis based on the
connectivity matrix. Finally, the localized probabilistic structural analysis, the
key question, was achieved by applying the finite element approach for the key
block analysis to the discrete cell-block model of the joint system.
JOINT MODELS
GEOSTATISTICAL APPROACH
input data or they can be converted into the equivalent continuum media
that represents the joint systems effectively. In this discrete model,
the entire area (or rock mass) to be modeled is divided into uniform
cell-blocks and the characteristic parameters are inferred for each
cell-block from the sparse sample data measured in the field.
A few cases of geostatistics applications to geotechnology have been reported
in which rock mass characteristic parameters were found to be spatially correlated
random variables and geostatistical interpretations were a must to incorporate
these phenomena into the modeling (Chiles 1988, Young 1987a, 1987b, Miller 1979).
In most of these cases, the regionalized variables are in scalar
terms but the joint orientations or poles are considered unit vectors.
This means that the pole vectors should be kriged on the unit sphere
where they are projected and analyzed traditionally (a stereonet
projection for orientations) (Young 1987a, 1987b).
The estimation variance is defined from the vector differences of poles,

    2γ(h) = E[ |z(x) - z(x+h)|² ],

and minimizing it subject to unbiasedness gives the ordinary kriging system

    Σj λj γ(xi, xj) + μ = γ(xi, V),   i = 1, ..., n
    Σj λj = 1                         (μ = Lagrange multiplier)
As shown above, the kriging system of vector variables (or poles) is
the same as ordinary kriging (OK) of scalar variables, depending on the
definition of the estimation variance. In this vector kriging system,
the magnitude of the estimation error vector was optimized. The kriging
variance σk² is not a local conditional estimation variance, and its
application is limited (Journel and Huijbregts 1978).
The kriged mean vector represents the average orientation of joints
within the cell-block, V, and its accuracy can be measured by its
kriging variance. So, it creates a deterministic model and is good
enough for only deterministic analysis of geotechnology, but it is still
a localized model.
However, the full statistical distribution of pole vectors within a
cell-block V is needed to generate a stochastic model of poles for
probabilistic engineering analysis. This was achieved by using
Indicator Kriging (IK) (Young 1987b, Young and Hoerger 1988a).
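The reduction of a weighted combination of unit pole vectors to a mean orientation can be sketched as below. This shows only the final normalization step; the weights are hypothetical stand-ins for those that would come from solving the kriging system, and the function names are illustrative.

```python
import numpy as np

def pole_from_trend_plunge(trend_deg, plunge_deg):
    """Unit pole vector from trend (clockwise from north) and plunge (down)."""
    t, p = np.radians([trend_deg, plunge_deg])
    return np.array([np.sin(t) * np.cos(p), np.cos(t) * np.cos(p), -np.sin(p)])

def mean_pole(poles, weights):
    """Weighted mean orientation of unit pole vectors: sum the weighted
    vectors and renormalize back onto the unit sphere, mirroring how a
    kriged mean vector is reduced to an average joint orientation."""
    poles = np.asarray(poles, dtype=float)
    w = np.asarray(weights, dtype=float)
    resultant = (w[:, None] * poles).sum(axis=0)
    norm = np.linalg.norm(resultant)
    if norm == 0:
        raise ValueError("weights cancel; no mean orientation")
    return resultant / norm

# Hypothetical poles of one joint set clustered near trend 045, plunge 60
poles = [pole_from_trend_plunge(40, 58), pole_from_trend_plunge(50, 62),
         pole_from_trend_plunge(45, 60)]
mean = mean_pole(poles, weights=[0.3, 0.3, 0.4])
print(mean)
```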
The traditional key block theorem for the stability analysis of structures
excavated in a jointed rock mass is a deterministic method based on deterministic
infinite joint planes. This means that the location and frequency of joints and the
size of joint planes are excluded from the block failure analysis, and it provides a
worst-case analysis.
Consequently a numerical approach was developed, which is general
for both joint system models and any structure (their size and shape).
Also, it is an effective algorithm to computerize the entire block
failure analysis in probabilistic terms. Therefore, it can be combined
easily with the local stochastic model of joint systems to achieve the
localized probabilistic analysis of block failures.
The numerical algorithm was developed based on the connectivity
matrix, which is comparable to the stiffness matrix of the finite
element method of stress analysis in the continuum mechanics.
Connectivity Matrix Approach
The local area, where the joint systems were simulated and the key
block analysis desired, was replaced with the discrete finite element
model as used in the finite element method of engineering mechanics.
However, the elements were constructed by two-force bars as in truss
structures rather than by solid elements. Then, the local area can be
represented as a large truss structure with bars connected at nodal
points, whose continuity and immobility were secured.
When the rock mass in the local area is cut by joints, some elements
will be cut, as well as the connection bars within those elements.
Also, many independent small truss structures will be formed when the
rock mass is cut into many rock blocks by joints; that is, the whole
truss structure is cut into many parts corresponding to those rock
blocks. The connectivity matrix was introduced to define the connecting
condition of bars (or their continuity conditions), and the whole truss
structure was represented by the global connectivity matrix. Then, the
independent small size structure representing a block formed by joints
can be searched and identified as an independent block matrix system in
the global connectivity matrix. Each of the independent matrix
elemental blocks in the whole system matrix has its own size, shape, and
location. So, the complete information for a block geometry is known
and available from the nodal numbers of the matrix elemental block.
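The search for independent block matrix systems amounts to finding connected components of the surviving bar connections. The sketch below runs on a toy adjacency matrix rather than the paper's full 28-bar element bookkeeping; the function name and the toy system are assumptions for illustration.

```python
from collections import deque

def find_blocks(adjacency):
    """Group nodes into independent blocks: a breadth-first search over
    still-connected bars identifies the separate truss sub-structures
    (rock blocks) left after joints cut the connections."""
    n = len(adjacency)
    label = [-1] * n
    blocks = []
    for start in range(n):
        if label[start] != -1:
            continue
        comp, queue = [], deque([start])
        label[start] = len(blocks)
        while queue:
            u = queue.popleft()
            comp.append(u)
            for v in range(n):
                if adjacency[u][v] and label[v] == -1:
                    label[v] = len(blocks)
                    queue.append(v)
        blocks.append(sorted(comp))
    return blocks

# Toy 6-node system: a "joint" has severed all bars between nodes {0,1,2}
# and nodes {3,4,5}, leaving two independent blocks.
A = [[0,1,1,0,0,0],
     [1,0,1,0,0,0],
     [1,1,0,0,0,0],
     [0,0,0,0,1,1],
     [0,0,0,1,0,1],
     [0,0,0,1,1,0]]
print(find_blocks(A))   # -> [[0, 1, 2], [3, 4, 5]]
```

Each returned node list carries the geometry of one block, in the same way the nodal numbers of a matrix elemental block do in the text.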
Element Model
The element constructing the whole system of the rock mass is
replaced with a truss formed by simple bars connected between two nodes.
The element can be in any shape or size with different numbers of nodes.
For simplicity in this paper, a rectangular element with 8 nodal points
was used for the rock block calculations. Then, the 8-node equal-parameter truss
element appears like the usual 8-node solid element in the finite element method,
but it consists of 28 two-force bars, as shown in Figure 2. The volume of this
element is the same as that of a solid element and is distributed equally onto its
nodal points. Therefore, the nodal point system and the number of degrees of
freedom in the element model were not changed from the finite element model systems.
The global continuous truss structure for the entire rock mass was
developed by constructing this type of truss element on every element in
the model. The inner nodal points will have 26 bars connecting to
adjacent nodes around it.
    | 7 1 1 1 1 1 1 1 |
    | 1 7 1 1 1 1 1 1 |
    | 1 1 7 1 1 1 1 1 |
K = | 1 1 1 7 1 1 1 1 |
    | 1 1 1 1 7 1 1 1 |
    | 1 1 1 1 1 7 1 1 |
    | 1 1 1 1 1 1 7 1 |
    | 1 1 1 1 1 1 1 7 |
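The element bookkeeping can be checked quickly. This sketch (not the paper's code) builds the 8-node nodal connectivity pattern shown above, with 7 bars meeting at each node and one bar between each node pair:

```python
import numpy as np

# Nodal connectivity matrix of the 8-node truss element: every node is
# joined to each of the other 7 by a two-force bar, so the diagonal holds
# 7 (bars per node) and every off-diagonal entry is 1.
K = 6 * np.eye(8, dtype=int) + np.ones((8, 8), dtype=int)

# Counting each bar once (upper triangle) recovers the 28 bars, i.e. C(8, 2).
n_bars = int(np.triu(K, k=1).sum())
print(n_bars)   # -> 28
```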
CASE STUDIES
[Figures: plan view of the pit (EAST, METERS) with sample locations, and probability of failure (%) profiles comparing the global average with local estimates.]
FIG. 7--Spatial distribution of PF (IK) in the pit.
The spatial distribution of PF (IK) plotted in Figure 7 indicates
cell-blocks of higher and lower PFs than the marginal 50%. Regional
zones of higher and lower PFs formed and were scattered throughout the
pit, but it was noticed that block PFs were not scattered randomly in
the pit, showing that the analysis can identify local stability trends
in the mine.
A Subway Tunnel
A metropolitan subway tunnel was studied to illustrate the
difference between traditional key block theorem and the positional
probability of key block failures by the finite element approach for the
block failure. The joint systems and their statistical details were
published by Cording and Mahar (1974).
A unit length of the tunnel was isolated based on the discrete
cell-block model. The joint systems within this cell-block were modeled
and simulated for the stability analysis as done for pit slopes.
When the frequency of positional block failure was projected along
the unit length of tunnel, the cumulative probability of positional
failure could be plotted around the tunnel as shown in Figure 8. Compared
with the worst case type of analysis by the traditional key block
theorem, the positional probability analysis shows clearly the size and
frequency of key block occurrences in this projection.
",""""",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
1"" ......"fI'"''''''''''''''''''''''''''''''''''''
",""""",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
i"lliI""''''''II,,,,,,,,,''OEEctJ'-'II''''''''''''''''''
"""""",,,,,,,,,,-o71,,-,,,,,,,,,,,,,,""""
'1#'''''''''''''''''''''''.'
iN
'-.11""'"''''''''''''''
................ :------ IIII."""'U,,,,,'''' 10' E92''''''''''',,,''''''
'II'---------z
" , •••• " ••" ' - 1 9 C
lJKI
6-"-,."""""",
-,-.""""
,",,"'11 .... ',',2 11- - f l ' "
""UU'''.
,,,,---""'2211223
"""-'-"
' '-''' '2Jl
31'5'......."'"
,....."',,,
",,,,,,,,-11223 51-''''''''
6 1- ,. " " "
#"""'''''-'23'
n""",,'''-'37 82""''''''
n-,II"""
mm:m:m;~ ~;mmm:
Z~~::;;=m:m::""""",,,,,,,,,,,,,,,m:::m:::
a ) HaxiDnDI reJlO••bl. are. b) Positional probabiliti.s of failur.
" ~------------------------------------~
~ ~ ~ ~
block volume
~ ~ ~ - =
FIG. 9--The frequency distribution of block sizes formed by three joint
sets.
CONCLUSIONS
1. The most important conclusion that can be drawn from this work is
that a localized probabilistic stability analysis for geotechnical
structures can be made at the early stages of engineering design and
construction, when only sparse sample data are available. It leads to
the optimum design of geotechnical structures, optimum in their relative
locations and orientations with respect to other peripheral structures,
and in their shapes and sizes. This is achievable
through the geostatistical model of characteristic parameters of rock
masses. Then, it can be said that this is an ideal model of joint
systems for many engineering analyses in both rock mechanics and
geohydrology.
2. Comparing PF (IK) and PF (sample), the local probabilistic
analysis of pit slopes is more powerful for drawing a detailed and
realistic picture of slope stability conditions. The local variation of
joint orientations played an important role in the slope stability and should
REFERENCES
Baecher, G. B., Lanney, N. A., and Einstein, H. H., 1977, "Statistical
Description of Rock Properties and Sampling," Proceedings of the
Eighteenth Symposium on Rock Mechanics, Golden, Colorado.
Chiles, J. P., 1988, "Fractal and Geostatistical Methods for Modeling of
a Fracture Network," Mathematical Geology, Vol. 20, pp. 631-654.
Cording, E. J. and Mahar, J. W., 1974, "The Effects of Natural Geologic
Discontinuities on Behavior of Rock in Tunnels," Proceedings of 1974
Rapid Excavation and Tunneling Conference, San Francisco, CA, pp. 107-
138.
Golub, G. H. and Van Loan, C. F., 1983, "Matrix Computations," Johns
Hopkins University Press, Baltimore, MD.
Goodman, R. E. and Shi, G. H., 1984, "Block Theory and Its Application
to Rock Mechanics," Prentice-Hall, Englewood Cliffs.
Grossman, N. F., 1985, "The Bivariate Normal Distribution on the Tangent
Plane at the Mean Attitude," Proceedings of International Symposium on
Fundamentals of Rock Joints, Bjorkliden, Sweden, pp. 3-11.
Hoerger, S. F. and Young, D. S. 1987, "Predicting Local Rock Mass
Behavior Using Geostatistics," Proceedings of Twenty-eighth U.S.
Symposium on Rock Mechanics, Tucson, AZ, pp. 99-106.
Journel, A. G. and Huijbregts, Ch. J., 1978, "Mining Geostatistics,"
Academic Press, London.
Lemmer, 1. C., 1984, "Estimating Local Recoverable Reserves via
Indicator Kriging," Proceedings of Geostatistics for Natural Resources
Characterization (ed. by G. Verly), D. Reidel, Dordrecht, pp. 349-364.
Miller, S. M., 1979, "Geostatistical Analysis for Evaluating Spatial
Dependence in Fracture Set Characteristics," Proceedings of the
Sixteenth Symposium on Application of Computers and Operations Research
in the Mineral Industry, Tucson, AZ, pp. 537-545.
Shanley, R. J. and Mahtab, M. A., 1975, "FRACTAN: A Computer Code for
Analysis of Clusters Defined on Unit Hemisphere," U.S. Bureau of Mines,
IC 8671, Washington, DC.
Warburton, P. M., 1980, "Stereological Interpretation of Joint Trace
Data: Influence of Joint Shape and Implications for Geological
Surveys," International Journal of Rock Mechanics & Mining Science, Vol.
17, pp. 305-316.
Young, D. S., 1987a, "Random Vectors and Spatial Analysis by
Geostatistics for Geotechnical Applications," Mathematical Geology, Vol.
19, pp. 467-479.
Young, D. S., 1987b, "Indicator Kriging for Unit Vectors: Rock Joint
Orientations," Mathematical Geology, Vol. 19, pp. 481-502.
Young, D. S. and Hoerger, S. F., 1988a, "Non-Parametric Approach for
Localized Stochastic Model of Rock Joint Systems," Geostatistical,
Sensitivity, and Uncertainty Methods for Ground-Water Flow and
Radionuclide Transport Modeling (ed. by B. Buxton), Battelle Press,
Columbus, OH, pp. 361-385.
Young, D. S. and Hoerger, S. F., 1988b, "Geostatistics Applications to
Rock Mechanics," Proceedings of Twenty-ninth U.S. Symposium on Rock
Mechanics, Balkema, Brookfield, pp. 271-282.
Author Index

C
Carr, J. R., 236
Colin, P., 69
Cromer, M. V., 3, 218

D
Dagdelen, K., 117
Dahab, M. F., 248
Desbarats, A. J., 32

F
Froidevaux, R., 69

G
Garcia, M., 69
Goderya, F. S., 248

J
Johnson, R. L., 102
Jones, D. D., 162

K
Kannengieser, A. J., 200
Kuhn, G. N., 162

L
Leonte, D., 133

M
Miller, S. M., 200

P
Pate, A. D., 51
Patinha, P. J., 146
Pereira, M. J., 146

R
Rashad, S. M., 181
Rautman, C. A., 218
Rouhani, S., 20, 88

S
Schofield, N., 133
Schulte, D. D., 162
Soares, A. O., 146
Srivastava, R. M., 13

T
Turner, A. K., 117

W
Wells, D. E., 51
Wild, M. R., 88
Woldt, W. E., 162, 248

Z
Zelinski, W. P., 218
Subject Index

A
Annealing, 69
Arsenic, 13
ASTM standards, 13, 32

B
Bayesian analysis, 102
Bivariate distribution, 69
Block failure, 262
Block value estimation, 20

C
Conceptual interpretation, 3
Conditional probability, 133
Conductivity, 162
Contamination
  copper, 146
  delineation, 88, 102
  lead, 13, 133
  metal, 13, 51, 133, 146
  plume mapping, 162
  site mapping, 69
  soil, 133
  stationarity, assessment with, 117
  subsurface, 162
  transport, 181
  uranium, 51
Contouring, 133
Copper, 146
Core data, 218
Correlogram, 13

D
Design analysis, 3
Direct measurement, 3

E
Electromagnetics, 162
Estimation procedures, 20

F
Flow model, 51
Flow simulation, 181

H
Histogram, 32
Hotspots, 133
Hydraulic conductivity, 181, 200
Hydraulic head analysis, 51

I
Indicator simulation, 218
Infiltrometer, 200
Interpolation techniques, 20
Inverse-distance weighting, 51
Irrigation practices, 248

J
Joint model, 262

K
Kriging, 20, 32, 162, 200, 262
  cokriging, 88, 181
  indicator, 69, 102, 117, 133, 236, 262
  lognormal, 51

L
Leachate, 162
Lead, 13, 133

M