Geostatistics For Environmental and Geotechnical Applications (Astm Special Technical Publication STP)


STP 1283

Geostatistics for
Environmental and
Geotechnical Applications

Shahrokh Rouhani, R. Mohan Srivastava,


Alexander J. Desbarats, Marc V. Cromer,
and A. Ivan Johnson, editors

ASTM Publication Code Number (PCN):


04-012830-38

ASTM
100 Barr Harbor Drive
West Conshohocken, PA 19428-2959

Printed in the U.S.A.


Library of Congress Cataloging-in-Publication Data

Geostatistics for environmental and geotechnical applications/


Shahrokh Rouhani ... [et al.].
p. cm. - (STP: 1283)
Papers presented at the symposium held in Phoenix, Arizona on
26-27 Jan. 1995, sponsored by ASTM Committee D18 on Soil and Rock.
Includes bibliographical references and index.
ISBN 0-8031-2414-7
1. Environmental geology-Statistical methods-Congresses.
2. Environmental geotechnology-Statistical methods-Congresses.
I. Rouhani, Shahrokh. II. ASTM Committee D-18 on Soil and Rock.
III. Series: ASTM special technical publication: 1283.
QE38.G47 1996
628.5'015195-dc20 96-42381
CIP

Copyright © 1996 AMERICAN SOCIETY FOR TESTING AND MATERIALS, West Conshohocken,
PA. All rights reserved. This material may not be reproduced or copied, in whole or in part, in any
printed, mechanical, electronic, film, or other distribution and storage media, without the written
consent of the publisher.

Photocopy Rights
Authorization to photocopy items for internal or personal use, or the internal or personal
use of specific clients, is granted by the AMERICAN SOCIETY FOR TESTING AND MATERIALS
for users registered with the Copyright Clearance Center (CCC) Transactional Reporting
Service, provided that the base fee of $2.50 per copy, plus $0.50 per page is paid directly to
CCC, 222 Rosewood Dr., Danvers, MA 01923; Phone: (508) 750-8400; Fax: (508) 750-4744. For
those organizations that have been granted a photocopy license by CCC, a separate system of
payment has been arranged. The fee code for users of the Transactional Reporting Service is
0-8031-2414-7/96 $2.50 + .50

Peer Review Policy


Each paper published in this volume was evaluated by three peer reviewers. The authors
addressed all of the reviewers' comments to the satisfaction of both the technical editor(s) and the
ASTM Committee on Publications.
To make technical information available as quickly as possible, the peer-reviewed papers in this
publication were printed "camera-ready" as submitted by the authors.
The quality of the papers in this publication reflects not only the obvious efforts of the authors and
the technical editor(s), but also the work of these peer reviewers. The ASTM Committee on
Publications acknowledges with appreciation their dedication and contribution of time and effort on
behalf of ASTM.

Printed in Ann Arbor, MI


October 1996
Foreword

This publication, Geostatistics for Environmental and Geotechnical Applications, contains


papers presented at the symposium of the same name held in Phoenix, Arizona on 26-27 Jan.
1995. The symposium was sponsored by ASTM Committee D18 on Soil and Rock. The
symposium co-chairmen were: R. Mohan Srivastava, FSS International; Dr. Shahrokh Rouhani,
Georgia Institute of Technology; Marc V. Cromer, Sandia National Laboratories; and A. Ivan
Johnson, A. Ivan Johnson, Inc.
Contents

OVERVIEW PAPERS

Geostatistics for Environmental and Geotechnical Applications: A Technology Transferred-MARC V. CROMER 3

Describing Spatial Variability Using Geostatistical Analysis-R. MOHAN SRIVASTAVA 13

Geostatistical Estimation: Kriging-SHAHROKH ROUHANI 20

Modeling Spatial Variability Using Geostatistical Simulation-ALEXANDER J. DESBARATS 32

ENVIRONMENTAL APPLICATIONS

Geostatistical Site Characterization of Hydraulic Head and Uranium Concentration in Groundwater-BRUCE E. BUXTON, DARLENE E. WELLS, AND ALAN D. PATE 51

Integrating Geophysical Data for Mapping the Contamination of Industrial Sites by Polycyclic Aromatic Hydrocarbons: A Geostatistical Approach-PIERRE COLIN, ROLAND FROIDEVAUX, MICHEL GARCIA, AND SERGE NICOLETIS 69

Effective Use of Field Screening Techniques in Environmental Investigations: A Multivariate Geostatistical Approach-MICHAEL R. WILD AND SHAHROKH ROUHANI 88

A Bayesian/Geostatistical Approach to the Design of Adaptive Sampling Programs-ROBERT L. JOHNSON 102

Importance of Stationarity in Geostatistical Assessment of Environmental Contamination-KADRI DAGDELEN AND A. KEITH TURNER 117

Evaluation of a Soil Contaminated Site and Clean-Up Criteria: A Geostatistical Approach-DANIELA LEONE AND NEIL SCHOFIELD 133

Stochastic Simulation of Space-Time Series: Application to a River Water Quality Modelling-AMILCAR O. SOARES, PEDRO J. PATINHA, AND MARIA J. PEREIRA 146

Solid Waste Disposal Site Characterization Using Non-Intrusive Electromagnetic Survey Techniques and Geostatistics-GARY N. KUHN, WAYNE E. WOLDT, DAVID D. JONES, AND DENNIS D. SCHULTE 162

GEOTECHNICAL AND EARTH SCIENCES APPLICATIONS

Enhanced Subsurface Characterization for Prediction of Contaminant Transport Using Co-Kriging-CRAIG H. BENSON AND SALWA M. RASHAD 181

Geostatistical Characterization of Unsaturated Hydraulic Conductivity Using Field Infiltrometer Data-STANLEY M. MILLER AND ANJA J. KANNENGIESER 200

Geostatistical Simulation of Rock Quality Designation (RQD) to Support Facilities Design at Yucca Mountain, Nevada-MARC V. CROMER, CHRISTOPHER A. RAUTMAN, AND WILLIAM P. ZELINSKI 218

Revisiting the Characterization of Seismic Hazard Using Geostatistics: A Perspective after the 1994 Northridge, California Earthquake-JAMES R. CARR 236

Spatial Patterns Analysis of Field Measured Soil Nitrate-FARIDA S. GODERYA, M. F. DAHAB, W. E. WOLDT, AND I. BOGARDI 248

Geostatistical Joint Modeling and Probabilistic Stability Analysis for Excavations-DAE S. YOUNG 262

Indexes 277
Overview Papers
Marc V. Cromer1

Geostatistics for Environmental and Geotechnical Applications: A


Technology Transferred

REFERENCE: Cromer, M. V., "Geostatistics for Environmental and Geotechnical


Applications: A Technology Transferred," Geostatistics for Environmental and Geotechnical Applications, ASTM STP 1283, R. M. Srivastava, S. Rouhani, M. V. Cromer,
A. J. Desbarats, A. I. Johnson, Eds., American Society for Testing and Materials, 1996.

ABSTRACT: Although successfully applied during the past few decades for predicting the
spatial occurrences of properties that are cloaked from direct observation, geostatistical
methods remain somewhat of a mystery to practitioners in the environmental and
geotechnical fields. The techniques are powerful analytical tools that integrate numerical and
statistical methods with scientific intuition and professional judgment to resolve conflicts
between conceptual interpretation and direct measurement. This paper examines the
practicality of these techniques within the entitled fields of study and concludes by introducing
a practical case study in which the geostatistical approach is thoroughly executed.

KEYWORDS: Geostatistics, environmental investigations, decision analysis tool

INTRODUCTION

Although geostatistics is emerging on environmental and geotechnical fronts as an


invaluable tool for characterizing spatial or temporal phenomena, it is still not generally
considered "standard practice" in these fields. The technology is borrowed from the mining
and petroleum exploration industries, starting with the pioneering work of Danie Krige in the
1950's, and the mathematical formalization by Georges Matheron in the early 1960's. In
these industries, it has found acceptance through successful application to cases where
decisions concerning high capital costs and operating practices are based on interpretations
derived from sparse spatial data. The application of geostatistical methods has since
extended to many fields relating to the earth sciences. As many geotechnical and, certainly,
environmental studies are faced with identical "high-stakes" decisions, geostatistics appears
to be a natural transfer of technology. This paper outlines the unique characteristics of this
sophisticated technology and discusses its applicability to geotechnical and environmental
studies.

1 Principal Investigator, Sandia National Laboratories/Spectra Research Institute, MS 1324, P.O. Box 5800,
Albuquerque, NM 87185-1342

4 GEOSTATISTICAL APPLICATIONS

IT'S GEOSTATISTICS

The field of statistics is generally devoted to the analysis and interpretation of uncertainty
caused by limited sampling of a property under study. Geostatistical approaches deviate
from more "classical" methods in statistical data analyses in that they are not wholly tied to a
population distribution model that assumes samples to be normally distributed and
uncorrelated. Most earth science data sets, in fact, do not satisfy these assumptions as they
often tend to have highly skewed distributions and spatially correlated samples. Whereas
classical statistical approaches are concerned with only examining the statistical distribution
of sample data, geostatistics incorporates the interpretations of both the statistical distribution
of data and the spatial relationships (correlation) between the sample data. Because of these
differences, environmental and geotechnical problems are more effectively addressed using
geostatistical methods when interpretation derived from the spatial distribution of data have
impact on decision making risk.

Geostatistical methods provide the tools to capture, through rigorous examination, the
descriptive information on a phenomenon from sparse, often biased, and often expensive
sample data. The continued examination and quantitative rigor of the procedure provide a
vehicle for integrating qualitative and quantitative understanding by allowing the data to
"speak for themselves." In effect, the process produces the most plausible interpretation by
continued examination of the data in response to conflicting interpretations.

A GOAL-ORIENTED, PROJECT COORDINATION TOOL

The application of geostatistics to large geotechnical or environmental problems has also


proven to be a powerful integration tool, allowing coordination of activities from the
acquisition of field data to design analysis (Ryti, 1993; Rautman and Cromer, 1994; Wild and
Rouhani, 1995). Geostatistical methods encourage a clear statement of objectives to be set
prior to any study. With these study objectives defined, the flow of information, the
appropriate use of interpretations and assumptions, and the customer/supplier feedback
channels are defined. This type of coordination provides a desirable level of tractability that
is often not realized.

With environmental restoration projects, the information collected during the remedial
investigation is the sole basis for evaluating the applicability of various remedial strategies,
yet this information is often incomplete. Incomplete information translates to uncertainty in
bounding the problem and increases the risk of regulatory failure. While this type of
uncertainty can often be reduced with additional sampling, these benefits must be balanced
with increasing costs of characterization.

The probabilistic roots deeply entrenched into geostatistical theory offer a means to quantify
this uncertainty, while leveraging existing data in support of sampling optimization and risk-
based decision analyses. For example, a geostatistically-based, cost/risk/benefit approach to
sample optimization has been shown to provide a framework for examining the many trade-
offs encountered when juggling the risks associated with remedial investigation, remedial
CROMER ON A TECHNOLOGY TRANSFERRED 5
design, and regulatory compliance (Rautman et al., 1994). An approach such as this
explicitly recognizes the value of information provided by the remedial investigation, in that
additional measurements are only valuable to the extent that the information they provide
reduces total cost.
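The cost/risk/benefit trade-off described above can be sketched as a toy expected-cost calculation. All numbers below are hypothetical (they are not taken from Rautman et al.): sampling cost grows linearly, while the probability of an expensive regulatory failure is assumed to decay with each added sample.

```python
import math

def expected_total_cost(n_samples, cost_per_sample=500.0,
                        base_failure_prob=0.30, decay=0.05,
                        failure_cost=1_000_000.0):
    """Toy model: linear sampling cost plus the expected cost of a
    regulatory failure whose probability decays with sample count."""
    p_fail = base_failure_prob * math.exp(-decay * n_samples)
    return n_samples * cost_per_sample + p_fail * failure_cost

# Additional samples are worthwhile only while each one reduces the
# expected failure cost by more than it costs to collect.
best_n = min(range(0, 201), key=expected_total_cost)
```

Under these assumed parameters, the expected total cost is minimized at a few dozen samples; the point is the shape of the trade-off, not the particular numbers.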

GEOSTATISTICAL PREDICTION

The ultimate goal of geostatistical examination and interpretation, in the context of risk
assessment, is to provide a prediction of the probable or possible spatial distribution of the
property under study. This prediction most commonly takes the form of a map or series of
maps showing the magnitude and/or distribution of the property within the study. There are
two basic forms of geostatistical prediction, estimation and simulation. In estimation, a
single, statistically "best" estimate of the spatial occurrence of the property is produced based
on the sample data and on the model determined to most accurately represent the spatial
correlation of the sample data. This single estimate (map) is produced by the geostatistical
technique commonly referred to as kriging.
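As an illustration of the mechanics, a sketch under an assumed exponential covariance model and made-up coordinates (not the procedure applied to the case-study data), the ordinary kriging weights for a single target location come from solving a small linear system:

```python
import numpy as np

def ordinary_kriging_weights(sample_xy, target_xy, sill=1.0, a=50.0):
    """Solve the ordinary kriging system for one target location,
    using an exponential covariance model C(h) = sill * exp(-3h/a)."""
    n = len(sample_xy)
    cov = lambda h: sill * np.exp(-3.0 * h / a)
    # Sample-to-sample covariances, bordered by the Lagrange row/column
    # that forces the weights to sum to one (unbiasedness condition).
    lhs = np.ones((n + 1, n + 1))
    lhs[n, n] = 0.0
    d = np.linalg.norm(sample_xy[:, None, :] - sample_xy[None, :, :], axis=2)
    lhs[:n, :n] = cov(d)
    rhs = np.ones(n + 1)
    rhs[:n] = cov(np.linalg.norm(sample_xy - target_xy, axis=1))
    return np.linalg.solve(lhs, rhs)[:n]   # drop the Lagrange multiplier

# Hypothetical coordinates (in meters): three samples around a target point.
pts = np.array([[0.0, 0.0], [40.0, 0.0], [0.0, 40.0]])
w = ordinary_kriging_weights(pts, np.array([10.0, 10.0]))
```

The weights sum to one by construction, and the sample closest to the target receives the largest weight.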

With simulation, many equally-likely, high-resolution images of the property distribution can
be produced using the same model of spatial correlation as developed for kriging. The
images have a realistic texture that mimics an exhaustive characterization, while maintaining
the overall statistical character of the sample data. Differences between the many alternative
images (models) provide a measure of joint spatial uncertainty that allows one to resolve
risk-based questions ... an option not available with estimation. Like estimation, simulation
can be accomplished using a variety of techniques and the development of alternative
simulation methods is currently an area of active research.
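One simple way to produce such equally likely images, sketched here under an assumed multi-Gaussian model with an exponential covariance (a common textbook choice, not necessarily the algorithm used later in this volume), is to factor the covariance matrix once and reuse it for every realization:

```python
import numpy as np

def simulate_realizations(coords, n_real, sill=1.0, a=30.0, seed=0):
    """Unconditional Gaussian simulation via Cholesky factorization:
    every realization honors the same covariance model
    C(h) = sill * exp(-3h/a) but differs in its local detail."""
    rng = np.random.default_rng(seed)
    h = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    C = sill * np.exp(-3.0 * h / a) + 1e-10 * np.eye(len(coords))
    L = np.linalg.cholesky(C)          # C = L @ L.T
    return (L @ rng.standard_normal((len(coords), n_real))).T

# Hypothetical 5 x 5 grid at 10 m spacing, four alternative images.
xy = np.array([(i * 10.0, j * 10.0) for i in range(5) for j in range(5)])
reals = simulate_realizations(xy, n_real=4)
```

Differences among the rows of `reals` are what carry the joint spatial uncertainty mentioned above.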

NOT A BLACK BOX

Despite successful application during the past few decades, geostatistical methods remain
somewhat of a mystery to practitioners in the geotechnical and environmental fields. The
theoretical complexity and effort required to produce the intermediate analysis tools needed
to complete a geostatistical study has often deterred the novice from this approach.
Unfortunately, to many earth scientists, geostatistics is considered to be a "black box."
Although this is far from the truth, such perceptions are often the Achilles' heel of many
mathematical/numeric analytical procedures that harness data to yield their true worth
because they require a commitment in time and training from the practitioner to develop
some baseline proficiency.

Geostatistics is not a solution, only a tool. It cannot produce good results from bad data, but
it will allow one to maximize that information. Geostatistics cannot replace common sense,
good judgment, or professional insight; in fact, it demands that these skills be brought to bear.
The procedures often take one down a blind alley, only to cause a redirection to be made
because of an earlier misinterpretation. While these exercises are nothing more than
cycling through the scientific method, they are often more than the novice is willing to
commit to. The time and frustration associated with continually rubbing one's nose in the

details of data must also take into account the risks to the decision maker. Given the
tremendous level of financial resources being committed to field investigation, data
collection, and information management to provide decision making power, it appears that
such exercises are warranted.

CASE STUDY INTRODUCTION

This introductory paper only attempts to provide a gross overview of geostatistical concepts
with some hints to practical application for these tools within the entitled fields of scientific
study. Although geostatistics has been practiced for several decades, it has also evolved both
practically and theoretically with the advent of faster, more powerful computers. During this
time a number of practical methods and various algorithms have been developed and tested,
many of which still have merit and are practiced, but many have been left behind in favor of
promising research developments. Some of the concepts that I have touched upon will come
to better light in the context of the practical examination addressed in the following suite of
three overview papers provided by Srivastava (1996), Rouhani (1996), and Desbarats (1996).

In this case study, a hypothetical database has been developed that represents sampling of
two contaminants of concern: lead and arsenic. Both contaminants have been exhaustively
characterized as a baseline for comparison as shown in Figures 1 and 2. The example
scenario proposes a remedial action threshold (performance measure) of 500 ppm for lead
and 30 ppm for arsenic for the particular remediation unit or "VSR" (as discussed by
Desbarats, 1996). Examination of the exhaustive sample histograms and univariate statistics
in Figures 1 and 2 indicates that about one fifth of the area is contaminated with lead, and one
quarter is contaminated with arsenic.
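Those proportions follow directly from the exceedance counts reported in the figure summaries:

```python
# Counts taken from the exhaustive histograms of Figures 1 and 2.
n_total = 7700
pb_over_500 = 1426   # samples with Pb above the 500 ppm threshold
as_over_30 = 1851    # samples with As above the 30 ppm threshold

pb_frac = pb_over_500 / n_total   # about one fifth of the area
as_frac = as_over_30 / n_total    # about one quarter of the area
print(f"Pb > 500 ppm: {pb_frac:.0%}; As > 30 ppm: {as_frac:.0%}")
# prints "Pb > 500 ppm: 19%; As > 30 ppm: 24%"
```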

The two exhaustive databases have been sampled in two phases, the first of which was on a
pseudo-regular grid (square symbols in Figure 3) at roughly a separation distance of 50 m. In
this first phase, only lead was analyzed. In the second sampling phase, each first-phase
sample location determined to have a lead concentration exceeding the threshold was targeted
with eight additional samples (circle symbols of Figure 3) to delineate the direction of
propagation of the contaminant. To mimic a problem often encountered in an actual field
investigation, during the second phase of sampling arsenic contamination was detected and
subsequently included in the characterization process. Arsenic concentrations are posted in
Figure 4 with accompanying sample statistics. The second phase samples, therefore, all have
recorded values for both arsenic and lead.

Correlation between lead and arsenic is explored by examining the co-located exhaustive data
which are plotted in Figure 5. This comparison indicates moderately good correlation
between the two constituents with a correlation coefficient of 0.66, as compared to the
slightly higher correlation coefficient of 0.70 derived from the co-located sample data plotted
in Figure 6.
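The statistic quoted is the ordinary Pearson correlation coefficient computed on co-located pairs; a minimal sketch with made-up Pb/As pairs (not the site data):

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient of two co-located variables."""
    return float(np.corrcoef(np.asarray(x, float), np.asarray(y, float))[0, 1])

# Hypothetical co-located pairs (ppm): arsenic tends to rise with lead,
# but with scatter, giving a moderately strong positive correlation.
pb = [120, 260, 440, 610, 900, 150, 700, 330]
as_ppm = [5, 20, 25, 60, 140, 12, 45, 30]
r = pearson_r(pb, as_ppm)
```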

There are a total of 77 samples from the first phase of sampling and 135 from the second
phase. The second sampling phase, though, has been biased because of its focus on "hot-

FIGURE 1: EXHAUSTIVE PB DATA

[Histogram of the exhaustive lead data, 0-1000 ppm, with summary statistics:]
Number of samples: 7700
Number of samples = 0 ppm: 213 (3%)
Number of samples > 500 ppm: 1426 (19%)
Minimum: 0 ppm; Lower quartile: 120 ppm; Median: 261 ppm; Upper quartile: 439 ppm; Maximum: 1066 ppm
Mean: 297 ppm; Standard deviation: 218 ppm

FIGURE 2: EXHAUSTIVE AS DATA

[Histogram of the exhaustive arsenic data, 0-200 ppm, with summary statistics:]
Number of samples: 7700
Number of samples = 0 ppm: 1501 (19%)
Number of samples > 30 ppm: 1851 (24%)
Minimum: 0 ppm; Lower quartile: 1 ppm; Median: 6 ppm; Upper quartile: 29 ppm; Maximum: 550 ppm
Mean: 22 ppm; Standard deviation: 35 ppm
FIGURE 3: SAMPLE PB DATA

[Map of first-phase (square symbols) and second-phase (circle symbols) sample locations, with a histogram of the sampled lead data, 0-1000 ppm, and summary statistics:]
Number of samples: 212
Number of samples = 0 ppm: 1 (0%)
Number of samples > 500 ppm: 91 (43%)
Minimum: 0 ppm; Lower quartile: 239 ppm; Median: 449 ppm; Upper quartile: 613 ppm; Maximum: 1003 ppm
Mean: 431 ppm; Standard deviation: 237 ppm

FIGURE 4: SAMPLE AS DATA

[Map of sample locations, with a histogram of the sampled arsenic data, 0-200 ppm, and summary statistics:]
Number of samples: 135
Number of samples = 0 ppm: 12 (9%)
Number of samples > 30 ppm: 51 (38%)
Minimum: 0 ppm; Lower quartile: 6 ppm; Median: 21 ppm; Upper quartile: 50 ppm; Maximum: 157 ppm
Mean: 33 ppm; Standard deviation: 36 ppm

FIGURE 5: EXHAUSTIVE DATA

[Scatter plot of co-located exhaustive arsenic (ppm) versus lead (ppm) values. Correlation coefficient: 0.66]

FIGURE 6: SAMPLE DATA

[Scatter plot of co-located sample arsenic (ppm) versus lead (ppm) values. Correlation coefficient: 0.70]

spot" delineation. This poses some difficult questions/problems from the perspective of
spatial data analysis: What data are truly representative of the entire site and should be used
for variography or for developing distributional models? What data are redundant or create
bias? Have we characterized arsenic contamination adequately? These questions are
frequently encountered, especially in the initial phases of a project that has not exercised
careful pre-planning. The co-located undersampling of arsenic presents an interesting twist
to a hypothetical, yet realistic, problem from which we can explore the paths traveled by the
geostatistician.

REFERENCES

Desbarats, A.J., "Modeling Spatial Variability Using Geostatistical Simulation,"


Geostatistics for Environmental and Geotechnical Applications, ASTM STP 1283, R.
Mohan Srivastava, Shahrokh Rouhani, Marc V. Cromer, A. Ivan Johnson, Eds.,
American Society for Testing and Materials, Philadelphia, 1996.

Rautman, C.A., M.A. McGraw, J.D. Istok, J.M. Sigda, and P.G. Kaplan, "Probabilistic
Comparison of Alternative Characterization Technologies at the Fernald Uranium-In-
Soils Integrated Demonstration Project," Vol. 3, Technology and Programs for
Radioactive Waste Management and Environmental Restoration, proceedings of the
Symposium on Waste Management, Tucson, AZ, 1994.

Rautman, C.A. and M.V. Cromer, 1994, "Three-Dimensional Rock Characteristics Models
Study Plan: Yucca Mountain Site Characterization Plan SP 8.3.1.4.3.2", U.S.
Department of Energy, Office of Civilian Radioactive Waste Management,
Washington, DC.

Rouhani, S., "Geostatistical Estimation: Kriging," Geostatistics for Environmental and


Geotechnical Applications, ASTM STP 1283, R. Mohan Srivastava, Shahrokh
Rouhani, Marc V. Cromer, A. Ivan Johnson, Eds., American Society for Testing and
Materials, Philadelphia, 1996.

Ryti, R., "Superfund Soil Cleanup: Developing the Piazza Road Remedial Design," Journal of
the Air and Waste Management Association, Vol. 43, February 1993.

Srivastava, R.M., "Describing Spatial Variability Using Geostatistical Analysis,"


Geostatistics for Environmental and Geotechnical Applications, ASTM STP 1283, R.
Mohan Srivastava, Shahrokh Rouhani, Marc V. Cromer, A. Ivan Johnson, Eds.,
American Society for Testing and Materials, Philadelphia, 1996.

Wild, M. and S. Rouhani, "Taking a Statistical Approach: Geostatistics Brings Logic to


Environmental Sampling and Analysis," Pollution Engineering, February 1995.
R. Mohan Srivastava1

DESCRIBING SPATIAL VARIABILITY


USING GEOSTATISTICAL ANALYSIS

REFERENCE: Srivastava, R. M., "Describing Spatial Variability Using Geostatistical


Analysis," Geostatistics for Environmental and Geotechnical Applications, ASTM STP
1283, R. M. Srivastava, S. Rouhani, M. V. Cromer, A. J. Desbarats, A. I. Johnson, Eds.,
American Society for Testing and Materials, 1996.

ABSTRACT: The description, analysis and interpretation of spatial variability is one of


the cornerstones of a geostatistical study. When analyzed and interpreted properly, the
pattern of spatial variability can be used to plan further sampling programs, to improve
estimates and to build geologically realistic models of rock, soil and fluid properties.
This paper discusses the tools that geostatisticians use to study spatial variability. It
focuses on two of the most common measures of spatial variability, the variogram and
the correlogram, and describes their appropriate uses, their strengths, and their weaknesses.
The interpretation and modelling of experimental measures of spatial variability are
discussed and demonstrated with examples based on a hypothetical data set consisting
of lead and arsenic measurements collected from a contaminated soil site.

KEYWORDS: Spatial variation, variogram, correlogram.

INTRODUCTION
Unlike most classical statistical studies, in which samples are commonly assumed to be
statistically independent, environmental and geotechnical studies involve data that are
not statistically independent. Whether we are studying contaminant concentrations in
soil, rock and fluid properties in an aquifer, or the physical and mechanical properties
of soil, data values from locations that are close together tend to be more similar than
data values from locations that are far apart. To most geologists, the fact that closely
1Manager, FSS Canada Consultants, 800 Millbank, Vancouver, BC, Canada V5V 3K8


spaced samples tend to be similar is hardly surprising since samples from closely spaced
locations have been influenced by similar physical and chemical processes.
This overview paper addresses the description and analysis of spatial dependence in
geostatistical studies, the interpretation of the results and the development of a math-
ematical model that can be used in spatial estimation and simulation. More specific
guidance on the details of analysis, interpretation and modelling of spatial variation can
be found in the ASTM draft standard guide entitled Standard Guide for Analysis of
Spatial Variation in Geostatistical Site Investigations.

DESCRIBING AND ANALYZING SPATIAL VARIATION


Using the sample data set presented earlier in this volume in the paper by Cromer,
Figure 1 shows an example of a "variogram", the tool that is most commonly used in
geostatistical studies to describe spatial variation. A variogram is a plot of the average
squared differences between data values as a function of separation distance. If the phe-
nomenon being studied was very continuous over short distances, then the differences
between closely spaced data values would be small, and would increase gradually as we
compared pairs of data further and further apart. On the other hand, if the phenomenon
was completely erratic, then pairs of closely spaced data values might be as wildly dif-
ferent as pairs of widely spaced data values. By plotting the average squared differences
between data values (the squaring just makes everything positive so that large negative
differences do not cancel out large positive ones) against the separation distance, we can
study the general pattern of spatial variability in a spatial phenomenon.
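That recipe, group the pairs into separation-distance classes and average the squared differences within each class, can be sketched directly. The data below are synthetic one-dimensional values (not the lead samples):

```python
import numpy as np

def empirical_variogram(x, values, lag_width, n_lags):
    """Variogram as defined in the text: the average squared difference
    between paired data values, grouped by separation-distance class."""
    x = np.asarray(x, float)
    values = np.asarray(values, float)
    i, j = np.triu_indices(len(values), k=1)   # each pair counted once
    d = np.abs(x[j] - x[i])                    # pair separation distances
    sq = (values[j] - values[i]) ** 2          # pair squared differences
    bins = (d / lag_width).astype(int)         # distance class per pair
    return np.array([sq[bins == b].mean() if np.any(bins == b) else np.nan
                     for b in range(n_lags)])

# Hypothetical transect: a smooth trend plus a little noise, sampled every 5 m.
x = np.arange(0.0, 100.0, 5.0)
z = np.sin(x / 30.0) + 0.1 * np.random.default_rng(1).standard_normal(x.size)
g = empirical_variogram(x, z, lag_width=10.0, n_lags=5)
```

Because this synthetic phenomenon is continuous over short distances, `g` rises with separation distance, the behavior described above.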
Figure 2 shows an example of another tool that can be used to describe spatial variation,
the "correlogram" or "correlation function". On this type of plot, we again group all
of the available data into different classes according to their separation distance, but
rather than plotting the average squared difference between the paired data values, we
plot their correlation coefficient. If the phenomenon under study was very continuous
over short distances, then closely spaced data values would correlate very well, and
would gradually decrease as we compared pairs of data further and further apart. On
the other hand, if the phenomenon was completely erratic, then pairs of closely spaced
data values might be as uncorrelated as pairs of widely spaced data values. A plot of
the correlation coefficient between pairs of data values as a function of the separation
distance provides a description of the general pattern of spatial continuity.
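The same pairing logic gives the correlogram: within each distance class, compute the correlation coefficient of the paired values rather than their average squared difference (again a sketch on synthetic one-dimensional data):

```python
import numpy as np

def empirical_correlogram(x, values, lag_width, n_lags):
    """Correlation coefficient between paired data values, grouped by
    separation-distance class, as described in the text."""
    x = np.asarray(x, float)
    values = np.asarray(values, float)
    i, j = np.triu_indices(len(values), k=1)
    bins = (np.abs(x[j] - x[i]) / lag_width).astype(int)
    rho = []
    for b in range(n_lags):
        m = bins == b
        # Correlation of the "head" and "tail" values of each pair in class b.
        rho.append(float(np.corrcoef(values[i[m]], values[j[m]])[0, 1]))
    return np.array(rho)

# Hypothetical transect: a smooth trend plus a little noise, sampled every 5 m.
x = np.arange(0.0, 100.0, 5.0)
z = np.sin(x / 30.0) + 0.1 * np.random.default_rng(1).standard_normal(x.size)
rho = empirical_correlogram(x, z, lag_width=10.0, n_lags=5)
```

For a continuous phenomenon like this one, `rho` starts high at short separations and drops with distance, mirroring the rising variogram.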
SRIVASTAVA ON SPATIAL VARIABILITY 15

[Figures 1 and 2: a variogram (average squared difference vs. separation distance in m) and a correlogram (correlation coefficient vs. separation distance in m) computed from the sample lead data.]

Figure 1. An example of a variogram using the sample lead data set described by Cromer (1996).

Figure 2. An example of a correlogram using the sample lead data set described by Cromer (1996).

As can be seen by the examples in Figures 1 and 2, the variogram and the correlo-
gram are, in an approximate sense, mirror images. As the variogram gradually rises and
reaches a plateau, the correlogram gradually drops and also reaches a plateau. They are
not exactly mirror images of one another, however, and a geostatistical study of spatial
continuity often involves both types of plots. There are other tools that geostatisticians
use to describe spatial continuity, but they all fall into two broad categories: measures
of dissimilarity and measures of similarity. The measures of dissimilarity record how
different the data values are as a function of separation distance and tend to rise like the
variogram. The measures of similarity record how similar the data values are as a
function of separation distance and tend to fall like the correlogram.

INTERPRETING SPATIAL VARIATION


Variograms are often summarized by the three characteristics shown in Figure 3:

Sill: The plateau that the variogram reaches; for the traditional definition of the vari-
ogram - the average squared difference between paired data values - the sill is
approximately equal to twice the variance of the data. 3
3The "semivariogram", which is simply the variogram divided by two, has a sill that is approximately
equal to the variance of the data.

Range: The distance at which the variogram reaches the sill; this is often thought of
as the "range of influence" or the "range of correlation" of data values. Up to
the range, a sample will have some correlation with the unsampled values nearby.
Beyond the range, a sample is no longer correlated with other values.

Nugget Effect: The vertical height of the discontinuity at the origin. For a separation
distance of zero (i.e. samples that are at exactly the same location), the average
squared differences are zero. In practice, however, the variogram does not converge
to zero as the separation distance gets smaller. The nugget effect is a combination
of:

• short-scale variations that occur at a scale smaller than the closest sample
spacing
• sampling error due to the way that samples are collected, prepared and analyzed
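The stated relationship between the sill and the sample variance can be checked numerically with purely uncorrelated synthetic data (a pure-nugget sketch, not the lead data):

```python
import numpy as np

rng = np.random.default_rng(42)
z = rng.standard_normal(4000)   # uncorrelated values: pure nugget effect

# For the variogram as defined here (average squared difference between
# paired values), the value at any nonzero lag is about twice the variance.
i = rng.integers(0, z.size, 20000)
j = rng.integers(0, z.size, 20000)
m = i != j                       # keep only genuinely distinct pairs
gamma = np.mean((z[i][m] - z[j][m]) ** 2)
print(gamma, 2 * z.var())        # the two agree closely
```

Dividing `gamma` by two gives the semivariogram value, which matches the variance itself, as the footnote above notes.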

[Figure 3: a variogram sketch (variance vs. separation distance in m) annotated with its range, sill, and nugget effect.]

Figure 3. Terminology commonly used to describe the main features of a variogram.

Of the three characteristics commonly used to summarize the variogram, it is the range
and the nugget effect that are most directly linked to our intuitive sense of whether the
phenomenon under study is "continuous" or "erratic". Phenomena whose variograms
have a long range of correlation and a low nugget effect are those that we think of as
"well behaved" or "spatially continuous"; attributes such as hydrostatic head, thickness
of a soil layer and topographic elevation typically have long ranges and low nugget
effects. Phenomena whose variograms have a short range of correlation and a high nugget

effect are those that we think of as "spatially erratic" or "discontinuous"; contaminant


concentrations and permeability typically have short ranges and high nugget effects.
Figure 4 compares the lead and arsenic variograms for the data set presented earlier in
this volume by Cromer. For these two attributes, the higher nugget effect and shorter
range on the arsenic variogram could be used as quantitative support for the view that
the lead concentrations are somewhat more continuous than the arsenic concentrations.

[Figure: (a) lead and (b) arsenic sample variograms; vertical axes: variogram (0 to 60000 for
lead, 0 to 1600 for arsenic); horizontal axes: separation distance (in m), 0 to 120.]
Figure 4. Lead and arsenic variograms for the sample data described by Cromer (1996).

[Figure: directional sample variograms of lead, (a) Northwest-Southeast and (b) Northeast-
Southwest; vertical axes: variogram (0 to 60000); horizontal axes: separation distance (in m),
0 to 120.]
Figure 5. Directional variograms for the sample lead data described by Cromer (1996).

In many earth science data sets, the pattern of spatial variation is directionally dependent.
In terms of the variogram, the range of correlation often depends on direction.

Using the example presented earlier in this volume by Cromer, the lead values appear to
be more continuous in the NW-SE direction than in the NE-SW direction. Geostatisti-
cal studies typically involve the calculation of separate variograms and correlograms for
different directions. Figure 5 shows directional variograms for the sample lead data pre-
sented by Cromer. The range of correlation shown by the NW-SE variogram (Figure 5a)
is roughly 80 meters, but only 35 meters on the NE-SW variogram (Figure 5b). This
longer range on the NW-SE variogram provides quantitative support for the observation
that the lead values are, indeed, more continuous in this direction and more erratic in
the perpendicular direction.
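
Before moving on, it may help to see how such sample variograms are computed. The following
is a minimal, omnidirectional sketch in Python (a directional version would additionally
restrict pairs to an angular tolerance about each direction); it is an illustration of the
standard calculation, not the program behind Figures 4 and 5.

```python
from itertools import combinations
from math import dist

def experimental_variogram(points, values, lag, n_lags, tol=None):
    """Omnidirectional experimental variogram: for each lag bin,
    average 0.5*(z_i - z_j)^2 over all data pairs whose separation
    distance falls within +/- tol of that lag."""
    tol = lag / 2 if tol is None else tol
    sums = [0.0] * n_lags
    counts = [0] * n_lags
    for (p1, z1), (p2, z2) in combinations(zip(points, values), 2):
        h = dist(p1, p2)
        k = round(h / lag)                  # nearest lag bin
        if k < n_lags and abs(h - k * lag) <= tol:
            sums[k] += 0.5 * (z1 - z2) ** 2
            counts[k] += 1
    # None marks lags with no contributing pairs
    return [s / c if c else None for s, c in zip(sums, counts)]
```

Plotting the returned values against lag distance gives a sample variogram of the kind shown
in Figures 4 and 5.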

MODELLING SPATIAL VARIATION


Once the pattern of spatial variation has been described using directional variograms
or correlograms, this information can be used in geostatistical estimation or simulation
procedures. Unfortunately, variograms and correlograms based on sample data cannot
provide information on the degree of spatial continuity for every possible distance and
in every possible direction. The directional variograms shown in Figure 5, for example,
provided information on the spatial continuity every 10 m in two specific directions.
The estimation and simulation algorithms used by geostatisticians require information
on the degree of spatial continuity for every possible distance and direction. To create a
model of spatial variation that can be used for estimation and simulation, it is necessary
to fit a mathematical curve to the sample variograms.

[Figure: fitted model curves overlaid on the directional sample variograms of lead,
(a) Northwest-Southeast and (b) Northeast-Southwest; vertical axes: variogram (0 to 60000);
horizontal axes: separation distance (in m), 0 to 120.]
Figure 6. Variogram models for the directional sample variograms shown in Figure 5.

The traditional practice of variogram modelling makes use of a handful of mathematical


functions whose shapes approximate the general character of most sample variograms.
The basic functions - the "spherical", "exponential" and "gaussian" variogram models
- can be combined to capture the important details of almost any sample variogram.
Figure 6 shows variogram models for the directional variograms of lead (Figure 5). Both
of these use a combination of two spherical variogram models, one to capture short range
behavior and the other to capture longer range behavior, along with a small nugget effect
to model the essential details of the sample variograms. In kriging algorithms such as
those described later in this volume by Rouhani, it is these mathematical models of
the spatial variation that are used to calculate the variogram value between any pair of
samples, and between any sample and the location being estimated.
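
To make the shape of these basic functions concrete, here is a small Python sketch of a
nested model of the kind shown in Figure 6: a nugget effect plus two spherical structures.
The sills and ranges below are hypothetical placeholders, not the coefficients actually
fitted to the lead data.

```python
def spherical(h, c, a):
    """Spherical structure: rises from 0 to its sill contribution c at range a."""
    if h >= a:
        return c
    r = h / a
    return c * (1.5 * r - 0.5 * r ** 3)

def variogram_model(h, nugget, structures):
    """Nugget effect plus a sum of spherical structures given as (c, a) pairs.
    By convention the model is exactly zero at h = 0; the nugget effect is the
    jump just off the origin."""
    if h == 0:
        return 0.0
    return nugget + sum(spherical(h, c, a) for c, a in structures)

# hypothetical coefficients: a short-range and a long-range structure
model = lambda h: variogram_model(h, 5000.0, [(20000.0, 30.0), (35000.0, 80.0)])
```

Beyond the longest range (80 m here) the model levels off at its total sill, the nugget plus
the two sill contributions.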

REFERENCES
ASTM, Standard Guide for Analysis of Spatial Variation in Geostatistical Site Investi-
gations, 1996, Draft standard from D18.01.07 Section on Geostatistics.
Cromer, M.V., 1996, "Geostatistics for Environmental and Geotechnical Applications:
A Technology Transfer," Geostatistics for Environmental and Geotechnical Appli-
cations, ASTM STP 1283, R. Mohan Srivastava, Shahrokh Rouhani, Marc V. Cro-
mer, A. Ivan Johnson, Eds., American Society for Testing and Materials, West
Conshohocken, PA.
Deutsch, C.V. and Journel, A.G., 1992, GSLIB: Geostatistical Software Library and
User's Guide, Oxford University Press, New York, 340 p.
Isaaks, E.H. and Srivastava, R.M., 1989, An Introduction to Applied Geostatistics,
Oxford University Press, New York, 561 p.
Journel, A.G. and Huijbregts, C., 1978, Mining Geostatistics, Academic Press, London,
600 p.
Rouhani, S., 1996, "Geostatistical Estimation: Kriging," Geostatistics for Environmen-
tal and Geotechnical Applications, ASTM STP 1283, R. Mohan Srivastava, Shah-
rokh Rouhani, Marc V. Cromer, A. Ivan Johnson, Eds., American Society for Test-
ing and Materials, West Conshohocken, PA.
Srivastava, R.M. and Parker, H.M., 1988, "Robust measures of spatial continuity,"
Geostatistics, M. Armstrong (ed.), Reidel, Dordrecht, p. 295-308.
Shahrokh Rouhani¹

GEOSTATISTICAL ESTIMATION: KRIGING

REFERENCE: Rouhani, S., "Geostatistical Estimation: Kriging," Geostatistics for Envi-


ronmental and Geotechnical Applications, ASTM STP 1283, R. Mohan Srivastava, Shahrokh
Rouhani, Marc V. Cromer, A. Ivan Johnson, Alexander J. Desbarats, Eds., American Society
for Testing and Materials, 1996.

ABSTRACT: Geostatistics offers a variety of spatial estimation procedures which are known as
kriging. These techniques are commonly used for interpolation of point values at unsampled
locations and estimation of average block values. Kriging techniques provide a measure of
accuracy in the form of an estimation variance. These estimates are dependent on the model of
spatial variability and the relative geometry of measured and estimated locations. Ordinary
kriging is a linear minimum-variance interpolator that assumes a constant, but unknown global
mean. Other forms of linear kriging include simple and universal kriging, as well as co-kriging.
If measured data display non-Gaussian tendencies, more accurate interpolation may be obtained
through non-linear kriging techniques, such as lognormal and indicator kriging.

KEYWORDS: Geostatistics, kriging, spatial variability, mapping, environmental investigations.

Many environmental and geotechnical investigations are driven by biased or preferential
sampling plans. Such plans usually generate correlated, and often clustered, data. Geostatistical
procedures recognize these difficulties and provide tools for various forms of spatial estimations.
These techniques are collectively known as kriging in honor of D. G. Krige, a South African
mining engineer who pioneered the use of weighted moving averages in the assessment of ore
bodies. Common applications of kriging in environmental and geotechnical engineering include:
delineation of contaminated media, estimation of average concentrations over exposure domains,
as well as mapping of soil parameters and piezometric surfaces (Journel and Huijbregts, 1978;
Delhomme, 1978; ASCE, 1990). The present STP offers a number of papers that cover various
forms of geostatistical estimation, such as Benson and Rashad (1996), Buxton (1996), Goderya et
al. (1996), and Wild and Rouhani (1996).
Comparison of kriging to other commonly used interpolation techniques, such as distance-
weighting functions, reveals a number of advantages (Rouhani, 1986). Kriging directly
incorporates the model of the spatial variability of data. This allows kriging to produce site-
specific and variable-specific interpolation schemes. Estimation criteria of kriging are based on

¹Associate Professor, School of Civil and Environmental Engineering, Georgia Institute of
Technology, Atlanta, GA 30332-0355.

ROUHANI ON KRIGING 21
well-defined statistical conditions, and thus, are superior to subjective interpolation techniques.
Furthermore, the automatic declustering of data by kriging makes it a suitable technique to
process typical environmental and geotechnical measurements.
Kriging also yields a measure for the accuracy of its interpolated values in the form of
estimation variances. These variances have been used in the design of sampling plans because of
two factors: (1) each estimate comes with an estimation variance, and (2) the estimation variance
does not depend on the individual observations (Loaiciga et al., 1992). Therefore, the impact of
a new sampling location can be evaluated before any new measurements are actually conducted
(Rouhani, 1985). Rouhani and Hall (1988), however, noted that in most field cases the use of
estimation variance, alone, is not sufficient to expand a sampling plan. Such plans usually
require consideration of many factors in addition to the estimation variance.
To use the estimation variance as a basis for sampling design, additional assumptions
must be made about the probability density function of the estimation error. A common practice
is to assume that, at any location in the sampling area, the errors are normally distributed with a
mean of zero and a standard deviation equal to the square root of the estimation variance,
referred to as the kriging standard deviation. The normal distribution of the errors has been
supported by practical evidence (Journel and Huijbregts, 1978, p. 50 and 60).

Ordinary Kriging

Among geostatistical estimation methods, ordinary kriging is the most widely used in
practice. This procedure produces minimum-variance estimates by taking into account: (1) the
distance vector between the estimated point and the data points; (2) the distance vector between
data points themselves; and (3) the statistical structure of the variable. This structure is
represented by either the variogram, the covariance or the correlogram function. Ordinary
kriging is also capable of processing data averaged over different volumes and sizes.
Ordinary kriging is a "linear" estimator. This means that its estimate, Z*, is computed as
a weighted sum of the nearby measured values, denoted as z_1, z_2, ..., z_n. The form of the
estimate is

    Z* = Σ_{i=1}^n λ_i z_i                                                      (1)

where the λ_i are the estimation weights. Z* can represent either a point or a block-averaged
value, as shown in Fig. 1. Point kriging provides the interpolated value at an unsampled location.
Block kriging yields an areal or a volumetric average over a given domain.
The kriging weights, λ_i, are chosen so as to satisfy two suitable statistical conditions.
These conditions are:
(1) Non-bias condition: This condition requires that the estimator Z* be free of any
systematic error, which translates into

[Figure: data points z_1 through z_4 surrounding (a) an estimated point Z* and (b) an
estimated block.]
Fig. 1. Example of Spatial Estimation: (a) Point Kriging; (b) Block Kriging.
[Fig. 2. Simulated Soil Lead Concentration Field in ppm (rotated full-page figure).]

    Σ_{i=1}^n λ_i = 1                                                           (2)

(2) Minimum-variance condition: This requires that the estimator Z* have minimum variance
of estimation. The estimation variance of Z*, σ², is defined as

    σ² = 2 Σ_{i=1}^n λ_i γ_i0 - Σ_{i=1}^n Σ_{j=1}^n λ_i λ_j γ_ij - γ_00         (3)

where γ_i0 is the variogram between the i-th measured point and the estimated location, and
γ_ij is the variogram between the i-th and j-th measured points.

The kriging weights are computed by minimizing the estimation variance (Eq. 3) subject
to the non-bias condition (Eq. 2). The computed weights are then used to calculate the
interpolated value (Eq. 1). As Delhomme (1978) notes: "the kriging weights are tailored to the
variability of the phenomenon. With regular variables, kriging gives higher weights to the
closest data points, precisely since continuity means that two points close to each other have
similar values. When the phenomenon is irregular, this does not hold true and the weights given
to the closest data points are dampened." Such flexibility does not exist in methods, such as
distance weighting, where the weights are pre-defined as functions of the distance between the
estimated point and the data point.
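
The mechanics described above can be sketched compactly: build the matrix of variogram values
between data points, append a Lagrange row and column enforcing the non-bias condition (Eq. 2),
and solve for the weights. The Python sketch below handles point kriging of small data sets only,
with gamma standing for any valid variogram model; it is illustrative, not production code.

```python
from math import dist

def solve(A, b):
    """Gaussian elimination with partial pivoting (small systems only)."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def ordinary_kriging(points, values, target, gamma):
    """Point ordinary kriging.  gamma(h) is a variogram model; returns
    the estimate (Eq. 1) and the estimation variance."""
    n = len(points)
    # Left-hand side: variograms between data points, plus a Lagrange
    # row and column enforcing the non-bias condition (Eq. 2).
    A = [[gamma(dist(points[i], points[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    A.append([1.0] * n + [0.0])
    b = [gamma(dist(p, target)) for p in points] + [1.0]
    w = solve(A, b)
    lam, mu = w[:n], w[n]
    estimate = sum(l * z for l, z in zip(lam, values))
    variance = sum(l * g for l, g in zip(lam, b[:n])) + mu
    return estimate, variance
```

Note that the returned variance depends only on the data configuration and the variogram
model, never on the measured values themselves, which is why it can be used to evaluate
candidate sampling locations before any new measurements are made.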

Case Study: Kriging of Lead Data

As noted in Cromer (1996), a soil lead field is simulated as a case study as shown in Fig.
2. The measured values are collected from this simulated field. Similar to most environmental
investigations, the sampling activities are conducted in two phases. During the first phase a
pseudo-regular grid of 50x50 m is used for soil sampling. In the second phase, locations with
elevated lead concentrations are targeted for additional irregular sampling, as indicated in Fig. 3.
The analysis of the spatial variability of the simulated field is presented in the previous
paper (Srivastava, 1996). Using this information, ordinary kriging is conducted. Fig. 4 displays
the kriging results of point estimations. The comparison of the original simulated field (Fig. 2)
and the kriged map (Fig. 4) shows that the kriged map captures the main spatial features of lead
contamination. This comparison, however, indicates a degree of smoothing in the kriged map
which is a consequence of the interpolation process. In cases where the preservation of the
spatial variability of the measured field is critical to the study objectives, the use of kriging
for estimation alone is inappropriate and simulation methods are recommended (Desbarats,
1996).
Each kriged map is accompanied by its accuracy map. Fig. 5 displays the kriging
[Fig. 3. Two-Phase Soil Lead Sampling Locations (rotated full-page figure; second-phase
samples are clustered in areas of elevated lead concentration).]
[Fig. 4. Soil Lead Concentration Map by Ordinary Kriging in ppm (blank spaces are not
estimated); legend scale 0, 500, 1000.]

[Fig. 5. Kriging Standard Deviation of Soil Lead Concentration in ppm; legend scale
0, 120, 240.]

standard deviation map of soil lead data. This latter map can be used to distinguish between
zones of high versus poor data coverage.

Block Kriging

In many instances, available measurements represent point or quasi-point values, but the
study requires the computation of the areal or volumetric value over a larger domain. For
instance, in environmental risk assessments, the desired concentration term should represent the
average contamination over an exposure domain. Depending on the computed average
concentration or its upper confidence limit, a block is declared impacted or not-impacted. This
shows that the decision is based on the estimated block value, and not its true value, so there is
a chance of error of two kinds:
(1) Wrong Rejection: Certain blocks will be considered impacted, while their true average
concentration is below the target level, and
(2) Wrong Acceptance: Certain blocks will be considered not-impacted when their true
average concentrations are above the target level.
As shown in Journel and Huijbregts (1978, p. 459), the kriging block estimator, Z*, is the linear
estimator that minimizes the sum of the above two errors. Therefore, the block kriging
procedure is preferred to any other linear estimator for such selection problems.

Alternative Forms of Kriging

As noted before, ordinary kriging is a linear minimum-variance estimator. There are
other forms of linear kriging. For example, if the global mean of the variable is known, the non-
bias condition (Eq. 2) is not required. This leads to simple kriging. If, on the other hand, the
global mean is not constant and can be expressed as a polynomial function of spatial coordinates,
then universal kriging may be used.
In many instances, added information is available whenever more than one variable is
sampled, provided that some relationship exists between these variables. Co-kriging uses a
linear estimation procedure to estimate Z* as

    Z* = Σ_{i=1}^n λ_i z_i + Σ_{j=1}^m ω_j y_j                                  (4)

where z_i is the i-th measured value of the "primary" variable with a kriging weight of λ_i, and
y_j is the j-th "auxiliary" measured value with a kriging weight of ω_j. Co-kriging is especially
advantageous in cases where the primary measurements are limited and expensive, while
advantageous in cases where the primary measurements are limited and expensive, while
auxiliary measurements are available at low cost. Ahmed and de Marsily (1987) enhanced their
limited transmissivity data based on pumping tests with the more abundant specific capacity data.
This resulted in an improved transmissivity map. The present STP provides examples of co-
kriging, such as Benson and Rashad (1996) and Wild and Rouhani (1996).

Non-linear Kriging

The above linear kriging techniques do not require any implicit assumptions about the
underlying distribution of the interpolated variable. If the investigated variable is multivariate
normal (Gaussian), then linear estimates have the minimum variance. In many cases where
the histogram of the measured values displays a skewed tendency, a simple transformation may
produce normally distributed values. After such a transformation, linear kriging may be used. If
the desired transformation is logarithmic, then the estimation process is referred to as lognormal
kriging. Although lognormal kriging can be applied to many field cases, its estimation process
requires back-transformation of the estimated values. These back-transformations are complicated
and must be performed with caution (e.g. Buxton, 1996).
Sometimes, the observed data clearly exhibit non-Gaussian characteristics, whose log-
transforms are also non-Gaussian. Examples of such data sets include cases of measurements
with multi-modal histograms, highly skewed histograms, or data sets with a large number of
below-detection measurements. These cases have motivated the development of a set of
techniques to deal with non-Gaussian random functions. One of these methods is indicator
kriging. In this procedure, the original values are transformed into indicator values, such that
they are unity if the datum value is at or below a pre-defined cutoff level and zero if it exceeds it.
The estimated value by indicator kriging represents the probability of not-exceedence at a
location. This technique provides a simple, yet powerful, procedure for generating probability
maps (Rouhani and Dillon, 1990).
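
As a sketch of the first step, the indicator transform at a single cutoff is a one-liner in
Python; ordinary kriging of the transformed values then maps the probability of not exceeding
that cutoff. The 150 ppm cutoff and measurements below are hypothetical examples, not values
from the case study.

```python
def indicator_transform(values, cutoff):
    """Indicator coding: 1 if the datum is at or below the cutoff,
    0 if it exceeds it.  Kriging these indicators estimates, at each
    unsampled location, the probability of not exceeding the cutoff."""
    return [1 if v <= cutoff else 0 for v in values]

lead_ppm = [10.0, 480.0, 150.0, 32.0]      # hypothetical measurements
indicators = indicator_transform(lead_ppm, 150.0)
```

Because the transform discards the magnitudes of the original values, a change of cutoff
requires recoding the data and re-kriging.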
Recommended Sources
For more information on kriging, readers are referred to Journel and Huijbregts (1978),
de Marsily (1986), Isaaks and Srivastava (1989), and ASCE (1990). ASTM Standard D 5549,
titled "Standard Guide for Content of Geostatistical Site Investigations," provides
information on the various elements of a kriging report. ASTM D18.01.07 on Geostatistics has
also drafted a guide titled "Standard Guide for Selection of Kriging Methods in Geostatistical
Site Investigations." This guide provides recommendations for selecting appropriate kriging
methods based on study objectives and common situations encountered in geostatistical site
investigations.

References

(1) ASCE Task Committee on Geostatistical Techniques in Geohydrology, "Review of
Geostatistics in Geohydrology, 1. Basic Concepts, 2. Applications," ASCE Journal of
Hydraulic Engineering, 116(5), 612-658, 1990.
(2) Ahmed, S., and G. de Marsily, "Comparison of geostatistical methods for estimating
transmissivity using data on transmissivity and specific capacity," Water Resources
Research, 23(9), 1717-1737, 1987.
(3) Benson, C.H., and S.M. Rashad, "Using Co-kriging to Enhance Subsurface
Characterization for Prediction of Contaminant Transport," Geostatistics for
Environmental and Geotechnical Applications, ASTM STP 1283, R. Mohan Srivastava,
Shahrokh Rouhani, Marc V. Cromer, A. Ivan Johnson, Eds., American Society for
Testing and Materials, Philadelphia, 1996.
(4) Buxton, B.E., "Two Geostatistical Studies of Environmental Site Assessments,"
Geostatistics for Environmental and Geotechnical Applications, ASTM STP 1283, R.
Mohan Srivastava, Shahrokh Rouhani, Marc V. Cromer, A. Ivan Johnson, Eds.,
American Society for Testing and Materials, Philadelphia, 1996.
(5) Cromer, M., "Geostatistics for Environmental and Geotechnical Applications,"
Geostatistics for Environmental and Geotechnical Applications, ASTM STP 1283, R.
Mohan Srivastava, Shahrokh Rouhani, Marc V. Cromer, A. Ivan Johnson, Eds.,
American Society for Testing and Materials, Philadelphia, 1996.
(6) Delhomme, J.P., "Kriging in the hydrosciences," Advances in Water Resources, 1(5),
251-266, 1978.
(7) Desbarats, A., "Modeling of Spatial Variability Using Geostatistical Simulation,"
Geostatistics for Environmental and Geotechnical Applications, ASTM STP 1283, R.
Mohan Srivastava, Shahrokh Rouhani, Marc V. Cromer, A. Ivan Johnson, Eds.,
American Society for Testing and Materials, Philadelphia, 1996.
(8) Goderya, F.S., M.F. Dahab, and W.E. Woldt, "Geostatistical Mapping and Analysis of
Spatial Patterns for Farm Fields Measured Residual Soils Nitrates," Geostatistics for
Environmental and Geotechnical Applications, ASTM STP 1283, R. Mohan Srivastava,
Shahrokh Rouhani, Marc V. Cromer, A. Ivan Johnson, Eds., American Society for
Testing and Materials, Philadelphia, 1996.
(9) Isaaks, E.H. and R.M. Srivastava, An Introduction to Applied Geostatistics, Oxford
University Press, New York, 561 p., 1989.
(10) Journel, A.G. and C. Huijbregts, Mining Geostatistics, Academic Press, London,
600 p., 1978.
(11) Loaiciga, H.A., R.J. Charbeneau, L.G. Everett, G.E. Fogg, B.F. Hobbs, and S.
Rouhani, "Review of Ground-Water Quality Monitoring Network Design," ASCE
Journal of Hydraulic Engineering, 118(1), 11-37, 1992.
(12) Marsily, G. de, Quantitative Hydrogeology, Academic Press, Orlando, 1986.
(13) Rouhani, S., "Variance Reduction Analysis," Water Resources Research, 21(6),
837-846, 1985.
(14) Rouhani, S., "Comparative study of ground water mapping techniques," Ground
Water, 24(2), 207-216, 1986.
(15) Rouhani, S., and M.E. Dillon, "Geostatistical Risk Mapping for Regional Water
Resources Studies," Use of Computers in Water Management, Vol. 1, pp. 216-228, V/O
"Syuzvodproekt", Moscow, USSR, 1989.
(16) Rouhani, S., and Hall, T.J., "Geostatistical Schemes for Groundwater Sampling,"
Journal of Hydrology, Vol. 103, 85-102, 1988.
(17) Srivastava, R.M., "Describing Spatial Variability Using Geostatistical Analysis,"
Geostatistics for Environmental and Geotechnical Applications, ASTM STP 1283, R.
Mohan Srivastava, Shahrokh Rouhani, Marc V. Cromer, A. Ivan Johnson, Eds.,
American Society for Testing and Materials, Philadelphia, 1996.
(18) Wild, M.R., and S. Rouhani, "Effective Use of Field Screening Techniques in
Environmental Investigations: A Multivariate Geostatistical Approach," Geostatistics for
Environmental and Geotechnical Applications, ASTM STP 1283, R. Mohan Srivastava,
Shahrokh Rouhani, Marc V. Cromer, A. Ivan Johnson, Eds., American Society for
Testing and Materials, Philadelphia, 1996.
Alexander J. Desbarats¹

MODELING SPATIAL VARIABILITY USING GEOSTATISTICAL SIMULATION

REFERENCE: Desbarats, A. J., "Modeling Spatial Variability Using Geostatistical
Simulation," Geostatistics for Environmental and Geotechnical Applications, ASTM STP
1283, R. M. Srivastava, S. Rouhani, M. V. Cromer, A. I. Johnson, A. J. Desbarats, Eds.,
American Society for Testing and Materials, 1996.

ABSTRACT: This paper, the last in a four part introduction to geostatistics, de-
scribes the application of simulation to site investigation problems. Geostatistical
simulation is a method for generating digital representations or "maps" of a variable
that are consistent with its values at sampled locations and with its in situ spatial
variability, as characterized by histogram and variogram models. Continuing the syn-
thetic case study of the three previous papers, the reader is led through the steps of a
geostatistical simulation. The simulated fields are then compared with the exhaustive
data sets describing the synthetic site. Finally, it is shown how simulated fields can
be used to answer questions concerning alternative site remediation strategies.

KEYWORDS: Geostatistics, kriging, simulation, variogram

INTRODUCTION

In a geostatistical site investigation, after we have performed an exploratory analysis
of our data and we have modeled its spatial variation structure, the next step is
usually to produce a digital image or "map" of the variables of interest from a set of
measurements at scattered sample locations. We are then faced with a choice between
two possible approaches, estimation and simulation. This choice is largely dictated
by study objectives. Detailed guidance for selecting between these two approaches
and among the various types of simulation is provided in the draft ASTM Guide for
the Selection of Simulation Approaches in Geostatistical Site Investigations.
Producing a map from scattered measurements is a classical spatial estimation
problem that can be addressed using a non-geostatistical interpolation method such
¹Geological Survey of Canada, 601 Booth St., Ottawa, ON K1A 0E8, Canada

DESBARATS ON SPATIAL VARIABILITY 33

as inverse-distance weighting or, preferably, using one of the least-squares weighting
methods collectively known as kriging discussed in Rouhani (this volume). Regard-
less of the interpolation method that is selected, the result is a representation of our
variable in which its spatial variability has been smoothed compared to in situ reality.
Along with this map of estimated values, we can also produce a map of estimation
(or error) variances associated with the estimates at each unsampled location. This
map provides a qualitative or, at best, semi-quantitative measure of the degree of
uncertainty in our estimates and the corresponding level of smoothing we can expect.
Unfortunately, maps of estimated values, even when accompanied by maps of estima-
tion variances, are often an inadequate basis for decision-making in environmental or
geotechnical site investigations. This is because they fail to convey a realistic picture
of the uncertainty and the true spatial variability of the parameters that affect the
planning of remediation strategies or the design of engineered structures.
The alternative to estimation is simulation. Geostatistical simulation (Srivastava,
1994) is a Monte-Carlo procedure for generating outcomes of digital maps based on
the statistical models chosen to represent the probability distribution function and
the spatial variation structure of a regionalized variable. The simulated outcomes
can be further constrained to honor observed data values at sampled locations on the
map. Therefore, not only does geostatistical simulation allow us to produce a map
of our variable that more faithfully reproduces its true spatial variability, but we can
generate many equally probable alternative maps, each one consistent with our field
observations. A set of such alternative maps allows a more realistic assessment of the
uncertainty associated with sampling in heterogeneous geological media.
This paper presents an introduction to the geostatistical tool of simulation. Its
goals are to provide a basic understanding of the method and to illustrate how it
can be used in site investigation problems. To do this, we will continue the synthetic
soil contamination case study started in the three previous papers. We will proceed
step by step through the simulation study, pausing here and there to compare our
results with the underlying reality and the results of the kriging study (Rouhani, this
volume). Finally, we will use our simulated fields to answer some questions that can
arise in actual soil remediation studies.

STUDY OBJECTIVES

The objective of our simulation study is to generate digital images or maps of lead
(Pb ) and arsenic (As) concentrations in soil. We will then use these maps to de-
termine the proportion of the site area in which Pb or As concentrations exceed the
remediation thresholds of 150 ppm and 30 ppm, respectively. The maps are to repro-
duce the histograms and variograms of Pb and As in addition to observed measure-
ments at sampled locations. Although the full potential of the simulation method
is truly achieved only in sensitivity or risk analysis studies involving multiple out-
comes of the simulated maps, we will focus on the generation of a single outcome. In
many respects, even a single map of simulated concentrations is more useful than a
map of kriged values. This is because a realistic portrayal of in situ spatial variabil-
ity is often a sobering warning to planners whereas maps of kriged values are easily

misinterpreted as showing much smoother spatial variations.


For our study, we have chosen the concentrations of Pb and As as the two region-
alized variables to work with. This may seem like an obvious choice; however, we could
have taken another approach based on an indicator or binary transformation of our
original variables. The new indicator variables corresponding to each contaminant
would take a value of 1 if the concentration exceeds the remediation threshold and a
value of 0 otherwise. Proceeding in a somewhat different manner than shown here, we
could then generate maps of simulated indicator variables for the two contaminants.
From such maps, the proportion of the site requiring remediation is readily deter-
mined. The drawback with an indicator approach is that we have sacrificed detailed
knowledge of contaminant concentrations in exchange for simplicity and conciseness.
Should the remediation thresholds change, new indicator variables would have to be
defined and the study repeated. Here, we will stick with the more involved but also
more flexible approach of simulating contaminant concentrations. An application of
indicator simulation is described in Cromer et al. (this volume).

HISTOGRAM MODELS

The first step in our simulation study is to decide what probability distribution func-
tions or, more prosaically, what histogram models are to be honored by our simulated
concentrations. We would like these histograms to be representative of the entire site.
Often, the raw histograms of sample data are the most appropriate choice. However,
here this isn't the case: The sampling of our contaminated site was carried out in
two stages. In the first stage, we obtained 77 measurements of Pb distributed on a
fairly regular grid. In the second stage, we focused our sampling on areas identified
in the first stage as having high Pb concentrations. Furthermore, by then we had
become aware that arsenic contamination was present and we analyzed an additional
135 samples for both Pb and As. Thus, our Pb data consist of 77 values that are
probably representative of the entire site area and another 135 values drawn from the
most contaminated region. As for arsenic, our 135 samples were obtained exclusively
from the most contaminated region and are probably not representative of the entire
site. The raw histograms of Pb and As shown in Cromer (this volume) reflect the
preferential or biased sampling procedure and do not provide adequate models for
our simulation.
The answer to this problem is to weight our sample data in such a way as to de-
crease the influence of clustered measurements while increasing that of more isolated
values. In geostatistics, this exercise is known as "declustering" and can be accom-
plished several ways (Isaaks and Srivastava, 1989; Deutsch and Journel, 1992). Here
we used a cell declustering scheme to find sample weights. This involved moving a
10 x 10 unit cell over N non-overlapping positions covering the study area. At each
cell position, the number n of samples within the cell was counted and each sample
was then assigned a relative weight of 1/(Nn). This procedure may be expected to work
well for Pb but for As there is no escaping the fact that our samples are restricted to
a few small, highly contaminated patches and are hardly representative of the site as
a whole. Obtaining a reasonably representative histogram is crucial for a simulation
study; therefore, desperate measures are called for.

DESBARATS ON SPATIAL VARIABILITY 35

[Figure 1 statistics: Pb — number of data 212, mean 300.67, std. dev. 225.51,
coef. of var. 0.75, maximum 1003.0, upper quartile 454.0, median 274.57, lower
quartile 103.93, minimum 0.0; As — mean 14.38, std. dev. 28.48, coef. of var. 1.98,
maximum 157.0, upper quartile 11.0, median 1.92, lower quartile 0.0, minimum 0.0.]

Figure 1: Declustered histograms of a) Pb and b) As.

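The cell-declustering procedure just described can be sketched as follows (a minimal sketch with hypothetical names and toy coordinates; weights are normalized to sum to one, which matches the 1/(Nn) weighting up to the constant factor 1/N):

```python
import numpy as np
from collections import Counter

def cell_decluster_weights(x, y, cell=10.0):
    """Cell declustering: each sample in a cell holding n samples gets a
    weight proportional to 1/n; weights are normalized to sum to 1."""
    ix = np.floor(np.asarray(x, dtype=float) / cell).astype(int)
    iy = np.floor(np.asarray(y, dtype=float) / cell).astype(int)
    counts = Counter(zip(ix, iy))  # samples per occupied cell
    w = np.array([1.0 / counts[(i, j)] for i, j in zip(ix, iy)])
    return w / w.sum()

# toy data: four clustered samples in one cell, one isolated sample
x = [1.0, 2.0, 3.0, 4.0, 55.0]
y = [1.0, 2.0, 3.0, 4.0, 55.0]
w = cell_decluster_weights(x, y, cell=10.0)
```

The isolated sample ends up carrying four times the weight of each clustered sample, which is exactly the de-emphasis of clustered measurements the text calls for.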

Although no geostatistical method can truly compensate for lack of data, the fol-
lowing "fix" was attempted here: Using our knowledge of the correlation between
Pb and As provided by the 135 samples of the second sampling campaign (Cromer,
this volume), we filled in the missing As values at the 77 locations of the first cam-
paign. For each of the 77 Pb values, we looked up the closest Pb value from the
second campaign and read off the corresponding As value. Thus, all 212 sample lo-
cations have both Pb and As measurements and the same declustering weights can
be used for both variables. The resulting histograms of weighted Pb and As samples
are shown in Figures 1 a) and b), respectively. They should be compared with the
un-declustered histograms shown in Cromer (this volume). We now have histograms
that, we think, provide reasonable models of the exhaustive distributions of Pb and
As that we are trying to replicate in our simulated fields. A peek at the true exhaus-
tive distributions (Cromer, this volume) shows that our declustered Pb histogram
does a fairly good job of reproducing the main statistical parameters whereas our
As histogram does a rather mediocre job despite our best efforts. Further discussion
of the declustering issue can be found in Rossi and Dresel (this volume).

NORMAL-SCORE TRANSFORMATION OF VARIABLES

The next step of our study involves transforming our Pb and As sample values into
standard Normal deviates (Deutsch and Journel, 1992). This "normal-score" trans-
formation is required because the simulation algorithm we will be using is based on
the multivariate Normal (or Gaussian) distribution model and assumes that all
36 GEOSTATISTICAL APPLICATIONS
sample data are drawn from such a distribution. In simple terms, this transformation
is performed by replacing the value corresponding to a given quantile of the original
distribution with the value from a standard Normal distribution associated with the
same quantile. For example, a Pb value of 261 ppm corresponding to a quantile of
0.50 (i.e., the median) in the sample histogram is transformed into a value of 0
corresponding to the median of a standard Normal distribution. In mathematical terms,
we seek the transformations Z1 and Z2 of Pb and As such that:

G(Z1) = F1(Pb)  and  G(Z2) = F2(As)    (1)

where G( ) is the cumulative distribution function (cdf) of a standard Normal
distribution and F1( ) and F2( ) are the sample cdfs for lead and arsenic, respectively.
Implementation of this transformation is fairly straightforward except when identical
sample values are encountered. In such cases, ties are broken by adding a small
random perturbation to each sample value and ranking them accordingly (Deutsch
and Journel, 1992). Here, this "despiking" procedure was required to deal with a
large number of below-detection As values. In general, however, it is good practice to
avoid extensive recourse to this procedure. If, for example, large numbers of samples
have values below detection limits, it is better to subdivide the data set into two
populations, above and below detection, and analyze each group separately, or adopt
an indicator approach (Zuber and Kulkarni, this volume).
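In code, the quantile-matching transformation with despiking might look like this minimal sketch, using Python's standard-library NormalDist for the Normal quantile function (the function name, plotting-position convention, and tie-breaking noise scale are illustrative choices, not the paper's exact implementation):

```python
import numpy as np
from statistics import NormalDist

def normal_score_transform(values, rng=None):
    """Replace each value by the standard Normal deviate at the same
    quantile; tiny random "despiking" noise breaks ties."""
    v = np.asarray(values, dtype=float)
    if rng is None:
        rng = np.random.default_rng(0)
    v = v + rng.normal(0.0, 1e-9 * (v.std() + 1.0), size=v.size)  # despike
    ranks = v.argsort().argsort()          # 0-based rank of each sample
    p = (ranks + 0.5) / v.size             # plotting-position quantiles in (0, 1)
    nd = NormalDist()
    return np.array([nd.inv_cdf(q) for q in p])

scores = normal_score_transform([5.0, 2.0, 9.0, 2.0, 7.0])
# the median sample (5.0) maps to a normal score of 0
```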

PRINCIPAL COMPONENT TRANSFORMATION

Before we can proceed to the analysis of spatial variation, one last step is required.
Our simulation algorithm can only be used to generate fields of one variable at a time.
However, we wish to simulate two variables Z1 and Z2, reproducing not only their
respective spatial variation structures but also the relationship between them shown
in Figure 2. We must therefore "decouple" the variables Z1 and Z2 so that we can
simulate them independently. To do this, we use the following principal component
transformation, which yields the independent variables Y1 and Y2 from the correlated
variables Z1 and Z2:

Y1 = Z1,   Y2 = (Z2 - ρ Z1) / √(1 - ρ²)    (2)

where ρ is the correlation coefficient between Z1 and Z2, which is found to be 0.839.
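Given only the correlation coefficient ρ, the standard two-variable decoupling consistent with the text (Y1 identical to Z1) and its inverse can be sketched as follows (hypothetical function names; the sketch checks itself on synthetic correlated normals):

```python
import numpy as np

def decorrelate(z1, z2, rho):
    """Decouple two correlated standard Normal variables into independent
    ones (assumed form: Y1 = Z1, Y2 = (Z2 - rho*Z1) / sqrt(1 - rho**2))."""
    y1 = np.asarray(z1, dtype=float)
    y2 = (np.asarray(z2, dtype=float) - rho * y1) / np.sqrt(1.0 - rho**2)
    return y1, y2

def recorrelate(y1, y2, rho):
    """Inverse transform, used later at the back-transformation stage."""
    z1 = np.asarray(y1, dtype=float)
    z2 = rho * z1 + np.sqrt(1.0 - rho**2) * np.asarray(y2, dtype=float)
    return z1, z2

rho = 0.839
rng = np.random.default_rng(0)
z1 = rng.standard_normal(5000)
z2 = rho * z1 + np.sqrt(1.0 - rho**2) * rng.standard_normal(5000)
y1, y2 = decorrelate(z1, z2, rho)  # y1, y2 are (nearly) uncorrelated
```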


[Figure 2 statistics: Pb vs As, declustered normal scores; number of data 135,
number trimmed 77; X variable mean 0.887, std. dev. 0.783; Y variable mean 0.805,
std. dev. 0.806; correlation 0.839, rank correlation 0.869.]

Figure 2: Scatter plot of Z1 and Z2 for 135 sample values. The correlation coefficient
is 0.839.

VARIOGRAM MODELS

In this section, we examine and model the spatial variation structure of the two
independent variables Y1 ≡ Z1 and Y2. The jargon and the steps involved in an
analysis of spatial variation are described in more detail by Srivastava (this volume),
so only a summary of results is given here.
Directional variograms, or more specifically correlograms, were calculated for Y1
using all 212 data values, and for Y2 using the 135 values of the second sampling
campaign. For each variable, eight directional correlograms were calculated at azimuth
intervals of 22.5° using overlapping angular tolerances of 22.5°. Lag intervals and
distance tolerances were 10 grid units and 5 grid units, respectively, for Y1, and 5 grid
units and 2.5 grid units, respectively, for Y2. The purpose of these directional correlograms
is to reveal general features of spatial variation such as directional anisotropies
and nested structures. Results for Y1 and Y2 are shown in Figures 3 a) and b),
respectively. These figures provide a planimetric representation of the spatial correlation
structure, displaying correlogram values as a surface, a function of location in the plane
of East-West (x) and North-South (y) lag components.
For Y1, we observe, in addition to a significant nugget effect, what we interpret
as two nested structures with different principal directions of spatial continuity. The
first, shorter scale, structure has a direction of maximum continuity approximately
North North-West, a maximum range of about 20 grid units and an anisotropy ratio
of about 1.4 : 1. The second, larger scale structure has a direction of maximum
continuity approximately West North-West, an indeterminate maximum range and a
minimum range of at least 40 grid units.
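While the study works with directional correlograms in two dimensions, the basic correlogram computation is easy to illustrate on a regularly spaced 1-D transect (illustrative function name; the moving-average series is synthetic and merely provides short-range correlation that decays with lag):

```python
import numpy as np

def correlogram_1d(values, max_lag):
    """Sample correlogram rho(h) along a regular 1-D transect."""
    v = np.asarray(values, dtype=float)
    v = v - v.mean()
    c0 = (v * v).mean()  # lag-0 covariance (variance)
    return np.array([(v[:-h] * v[h:]).mean() / c0 if h else 1.0
                     for h in range(max_lag + 1)])

rng = np.random.default_rng(4)
# moving-average series: correlated at short lags, uncorrelated beyond lag 4
noise = rng.standard_normal(5000)
series = np.convolve(noise, np.ones(5) / 5, mode="valid")
rho = correlogram_1d(series, max_lag=10)
# rho starts at 1.0 and decays toward 0 with increasing lag
```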

Figure 3: Directional correlograms (planimetric view) for a) Yi and b) Y2.

Detailed experimental correlograms were calculated in the directions of maximum
and minimum continuity of the larger scale structure. They are shown in Figures 4
a) and b), respectively. These correlograms were used in the fitting of a model to the
spatial variation structure of Y1. The fitted model is the sum of three components:

1. A nugget effect accounting for 35% of the spatial variance.

2. A short-scale structure accounting for 50% of the spatial variance. It is repre-


sented by an exponential model with maximum continuity in the North North-
West direction. It has a range parameter of 6 grid units and an anisotropy ratio
of 1.43 : 1.

3. A large-scale structure accounting for 15% of the spatial variance. This structure is also represented by an exponential model with, however, maximum continuity in the West North-West direction. The model range parameter is 300 grid
units with an anisotropy ratio of 10 : 1. Such a large maximum range ensures
that the "sill" value of the structure is not reached in the direction of maximum
continuity within the limits of the site. What we have done here is model a
"zonal anisotropy" (Journel and Huijbregts, 1978) as a geometric anisotropy
with an arbitrarily large range in the direction of greatest continuity.

This model is also shown in Figures 4 a) and b) for comparison with the experimental
results.
For Y2, there are fewer data and we are careful not to over-interpret the directional
correlograms. Indeed, the apparent periodicity in the NNE direction is probably
an artifact of the sampling pattern. With somewhat more confidence, we note a

Figure 4: Experimental correlograms for Y1 in a) direction WNW; b) direction NNE.
The fitted model is shown as dashed lines.

strong nugget effect and a structure with maximum continuity in the West North-West
direction, a maximum range of about 30 grid units and an anisotropy ratio of
about 3 : 1. Detailed directional correlograms were calculated for the directions of
maximum and minimum continuity and are shown in Figures 5 a) and b), respectively.
The model fitted to these correlograms is the sum of two components:

1. A nugget effect accounting for 55% of the spatial variance.

2. A structure accounting for 45% of the spatial variance. This structure is represented by an exponential model with greatest continuity in the West North-West
direction. The model has a range parameter of 8 grid units and an anisotropy
ratio of 3 : 1.

This model is shown in Figures 5 a) and b) for comparison with the experimental
results.

SEQUENTIAL GAUSSIAN SIMULATION

We are now ready to simulate fields of the two independent standard Normal variables,
Y1 and Y2. The simulations of Y1 and Y2 are to be conditioned on 212 and 135 sample
values, respectively. Both fields are simulated on the same 110 x 70 grid as the
exhaustive data sets for Pb and As.
To perform our simulations, we are going to use the Sequential Gaussian method.
This method is based on two important theoretical properties of the multivariate
Normal (or Gaussian) distribution: First, the conditional distribution of an unknown

Figure 5: Experimental correlograms for Y2 in a) direction WNW; b) direction NNE.
The fitted model is shown as dashed lines.

variable at a particular location, given a set of known values at nearby locations, is


Normal. Second, the mean and variance of this conditional distribution are given
by the simple kriging (SK) estimate of the unknown value and its associated error
variance. Simple kriging is a variant of ordinary kriging (OK) described in Rouhani
(this volume). Then, it follows that since the conditional distribution is Normal, it
is completely determined by the mean and variance provided by simple kriging. The
Sequential Gaussian Simulation algorithm is described in detail elsewhere (Deutsch
and Journel, 1992; Srivastava, 1994); however, because of its simplicity, it is briefly
outlined here:

1. Start with a set of conditioning data values at scattered locations over the field
to be simulated.

2. Select at random a point on the grid discretizing the field where there is not
yet any simulated or conditioning data value.

3. Using both conditioning data and values already simulated from the surrounding
area, calculate the Simple Kriging estimate and corresponding error variance.
These are the mean and variance of the conditional distribution of the unknown
value at the point given the set of known values from the surrounding area.

4. Select at random a value from this conditional distribution.

5. Add this value to the set of already simulated values.


6. Return to step 2 and repeat these steps recursively until all points of the dis-
cretized field have been assigned simulated values.
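The six-step loop above can be sketched in a toy 1-D setting (a simplified sketch, not the paper's implementation: simple kriging with a zero global mean, an isotropic exponential covariance, and all previously known points used as neighbors instead of a local search; all names are illustrative):

```python
import numpy as np

def exp_cov(h, range_param=6.0, sill=1.0):
    """Isotropic exponential covariance model."""
    return sill * np.exp(-np.abs(h) / range_param)

def sgs_1d(n, cond_idx, cond_val, range_param=6.0, seed=0):
    """Toy 1-D Sequential Gaussian Simulation on a regular grid."""
    rng = np.random.default_rng(seed)
    values = {i: v for i, v in zip(cond_idx, cond_val)}   # step 1: conditioning data
    path = rng.permutation([i for i in range(n) if i not in values])  # step 2
    for i in path:
        known = np.array(sorted(values))
        z = np.array([values[k] for k in known])
        C = exp_cov(known[:, None] - known[None, :], range_param)
        c0 = exp_cov(known - i, range_param)
        lam = np.linalg.solve(C, c0)        # step 3: simple kriging weights
        mean = lam @ z                      # SK estimate (zero global mean)
        var = max(1.0 - lam @ c0, 0.0)      # SK error variance
        values[i] = mean + np.sqrt(var) * rng.standard_normal()  # steps 4-5
    return np.array([values[i] for i in range(n)])        # step 6: all assigned

field = sgs_1d(50, cond_idx=[10, 30], cond_val=[1.5, -0.5])
```

Note that the conditioning values are honored exactly, while values elsewhere fluctuate with the full conditional variance rather than collapsing to the smooth kriged estimate.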

Thus, in many ways, the Sequential Gaussian simulation method is similar to the point
kriging process described by Rouhani (this volume). The difference is that we are
drawing our simulated value at random from a distribution having the kriged estimate
as its mean, rather than using the kriged estimate itself as a "simulated" value.
Intuitively, we see how this process leads to fields having greater spatial variability
than fields of kriged values.

BACK-TRANSFORMATIONS

We now have simulated fields of the two independent standard Normal variables Y1
and Y2. In order to obtain the corresponding fields of Pb and As, we must reverse the
earlier transformations. First, we reverse equation (2) to get the correlated standard
Normal variables Z1 and Z2 from Y1 and Y2. Then we reverse equation (1) to get the
variables Pb and As from the standard Normal deviates Z1 and Z2. Finally, we are
left with simulated fields of Pb and As on a dense 110 x 70 grid discretizing the site.
Although here we are focusing on single realizations of each of these fields, multiple
realizations can be generated by repeating the simulation step using different seed
values for the random number generator.
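The reversal of the normal-score transformation can be sketched as a quantile-matching interpolation (a minimal sketch; `back_transform` is a hypothetical helper, and linear interpolation between sample quantiles is one of several possible conventions for values falling between data quantiles):

```python
import numpy as np
from statistics import NormalDist

def back_transform(y, sample_values):
    """Map standard Normal deviates back to the data scale by matching
    quantiles against the sample distribution."""
    s = np.sort(np.asarray(sample_values, dtype=float))
    p_table = (np.arange(s.size) + 0.5) / s.size   # quantile of each sorted sample
    nd = NormalDist()
    p = np.array([nd.cdf(v) for v in np.atleast_1d(y)])
    return np.interp(p, p_table, s)  # interpolate between sample quantiles

vals = back_transform([0.0], sample_values=[2.0, 2.0, 5.0, 7.0, 9.0])
# a deviate of 0 (the Normal median) maps back to the sample median, 5.0
```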

COMPARISON OF TRUE AND SIMULATED FIELDS

In addition to honoring values of Pb and As at sampled locations, the simulated fields
should reproduce the histogram and correlogram models that we used to characterize
contaminant spatial variability. These fields should also reproduce the correlation
between Pb and As. Because this is a synthetic case study, we have exhaustive
knowledge of Pb and As contamination levels over the entire site, something we would
never have in practice. Therefore, we can conduct a postmortem of our study, comparing
our simulated fields with the exhaustive fields described in Cromer (this volume).
In order to compare the distributions of true and simulated values, we will use
what is known as a Q-Q plot. This involves plotting the quantiles of one data set
against the same quantiles of the other data set. Thus, we would plot the median (0.5
quantile) of our simulated values against the median of our true values. If the
histograms of the two data sets are similar, all points should plot close to the
45° line. Figures 6 a) and b) show Q-Q plots between exhaustive and simulated
values of Pb and As, respectively. These results show that while we did a rather good
job of reproducing the exhaustive histogram of Pb, we can claim no great success
for As. Although we did our best to correct for the effects of a grossly unrepresentative
sampling of As, in the end, this was not good enough. This failure serves
as a reminder that geostatistics alone cannot compensate for a biased site sampling
campaign. Next, we check how well our simulation reproduced the correlation between
Pb and As concentrations. Figure 7 shows a scatter plot of simulated Pb versus
As values. This figure is to be compared with the scatter plot of true values given by

Figure 6: Q-Q quantile plots of exhaustive and simulated data: a) Pb; b) As.

Cromer (this volume). The comparison shows that we did quite a respectable job of
reproducing the relationship between Pb and As in our simulated fields. Directional
correlograms calculated on our two simulated fields are shown in Figures 8 a) and
b). The main features of these correlograms compare favorably with those observed
in the correlograms presented by Srivastava (this volume). Given the limited number
of data and their spatial clustering, the models we fitted to the experimental
correlograms were quite successful in representing the true spatial variation structures of
Pb and As.
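The Q-Q construction described above can be sketched as follows (illustrative function name; the two lognormal samples merely stand in for the exhaustive and simulated concentration sets):

```python
import numpy as np

def qq_points(a, b, n_quantiles=99):
    """Matched quantiles of two data sets; points fall near the 45-degree
    line when the two distributions are similar."""
    p = np.linspace(0.01, 0.99, n_quantiles)
    return np.quantile(a, p), np.quantile(b, p)

rng = np.random.default_rng(1)
true_vals = rng.lognormal(5.0, 1.0, 2000)   # stand-in for exhaustive values
sim_vals = rng.lognormal(5.0, 1.0, 2000)    # stand-in for simulated values
qa, qb = qq_points(true_vals, sim_vals)
# plotting qb against qa would give the Q-Q plot of Figure 6
```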
No comparison of true and simulated fields would be complete without looking at
images or maps of the simulated fields. Although qualitative, the visual comparison
of simulated and true fields is in fact the most stringent measure of the success of our
simulation. We must check how well we have captured the character of contaminant
spatial variability at the site, its "noisiness", the grain of any spatial patterns, and
any trends. We should also check to see what our simulated values are doing in areas
far from conditioning data values. The spatial variability in such areas should be
consistent in character with that observed in more densely sampled areas.
Grey-scale digital images of simulated Pb and As fields are shown in Figures 9
and 10, respectively. Comparison with the corresponding true images in Cromer (this
volume) shows that we have reason to be satisfied with our simulation. Discrepancies
between simulated and true fields exist; however, these are manifestations of the
uncertainty associated with our knowledge of site contamination as provided by the
rather limited sampling data. It should be emphasized that we are looking at but one
pair of images of contamination from amongst the many equally possible alternatives
that would be consistent with sampling information. We can also compare Figure 9
[Figure 7 statistics: Pb vs As, simulated values; number of data 7700; X variable
(Pb) mean 296.479, std. dev. 225.313; Y variable (As) mean 15.619, std. dev. 30.076;
correlation 0.722, rank correlation 0.881.]

Figure 7: Scatter plot of simulated Pb and As values.


Figure 8: Directional correlograms (planimetric view) for simulated a) Pb and b) As.


Figure 9: Grey-scale digital image of the simulated Pb field.

with the kriged field shown in Rouhani (this volume). We see that kriging smooths
spatial variations in a non-uniform manner: less in regions with abundant sample
control, more in unsampled regions. This may lead the unsuspecting to conclude that
large portions of the site are quite homogeneous! Simulation, on the other hand,
preserves in-situ spatial variability regardless of the proximity of sampling points.

APPLICATION

Now that we have simulated fields of Pb and As that we confidently assume are
representative of the true yet unknown contamination at the site, we can use these
fields to answer some simple questions.
Perhaps the most basic question that we may ask is what fraction of the site
requires remediation given the contamination thresholds of 150 ppm and 30 ppm for
Pb and As , respectively? However, before we can attempt to answer that question,
we must decide on a "volume of selective remediation" or VSR. Note that the concept
of volume of selective remediation is identical to that of selective mining unit (smu)
described in the mining geostatistics literature (chapter 6 of Journel and Huijbregts,
1978; chapter 19 of Isaaks and Srivastava, 1989).
The volume, or in the present case, area of selective remediation is the smallest
portion of soil that can be either left in place or removed for treatment, based upon
its average contaminant concentration. The VSR may depend on several factors,
including the size of equipment being used in the remediation and the sampling
information ultimately available for the selection process. It is an important design
parameter because the variance of spatially averaged concentrations decreases as the

Figure 10: Grey-scale digital image of the simulated As field.

VSR becomes larger. This reduces the spread and alters the shape of the histogram
of contaminant concentrations thereby affecting the proportion of values above a
given threshold and the fraction of the site requiring remediation. Here, the original
sample size or "support", as it is known in geostatistics, is a square of 1 x 1 grid units
(5m x 5m). The corresponding standard deviations of Pb and As concentrations are
218 ppm and 35 ppm, respectively. If we were to consider a VSR with a support of
10 x 10 grid units (50m x 50m), the standard deviations of VSR-averaged Pb and
As concentrations are reduced to 172 ppm and 18 ppm, respectively.
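The support-averaging effect described above, the spread of the histogram shrinking as the VSR grows, can be illustrated by block-averaging a toy field (the 70 x 110 grid shape mirrors the site grid, but the lognormal field itself is synthetic and the function name is hypothetical):

```python
import numpy as np

def block_average(field, b):
    """Average a 2-D field over non-overlapping b x b blocks
    (field dimensions assumed divisible by b)."""
    ny, nx = field.shape
    return field.reshape(ny // b, b, nx // b, b).mean(axis=(1, 3))

rng = np.random.default_rng(2)
point_field = rng.lognormal(0.0, 1.0, size=(70, 110))  # point (1 x 1) support
blocks = block_average(point_field, 10)                # 10 x 10 VSR support
# the block-support histogram has the same mean but a smaller spread
```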
In Tables 1 and 2, we compare fractions of the site requiring remediation for VSRs
of 1 x 1 grid units and 10 x 10 grid units, respectively. Within each table, we also
compare remediation fractions based on kriged, simulated and true values.
For selection based on Pb concentration alone, results for both VSR sizes show
good agreement between remediated fractions calculated on simulated and true fields.
Remediated fractions based on kriged fields are overestimated for the smaller VSR.
We note that the fraction of the site requiring remediation increases for the larger
VSR. This is because the spatial averaging of Pb concentrations over a VSR smears
high values over the entire block area thereby pushing its average over the remediation
threshold. The same phenomenon may also happen in reverse, with low values diluting
a few high values and thus lowering the average VSR concentration below threshold.
In either case, it is obvious that the choice of VSR will have a significant impact on
the fraction of the site requiring remediation.
For selection based on As values alone, remediated fractions calculated on the
simulated fields are almost half those calculated on the true fields. On the other
hand, remediated fractions based on the kriged fields are much larger than those

Table 1: Fraction of site requiring remediation based on a VSR of 1 x 1 grid units.
The Pb threshold is 150 ppm and the As threshold is 30 ppm.

Field       Pb cutoff   As cutoff   Combined cutoff
Kriged      0.7956      0.4327      0.8360
Simulated   0.6793      0.1692      0.6796
True        0.6998      0.2474      0.7026

Table 2: Fraction of site requiring remediation based on a VSR of 10 x 10 grid units.
The Pb threshold is 150 ppm and the As threshold is 30 ppm.

Field       Pb cutoff   As cutoff   Combined cutoff
Kriged      0.7975      0.4248      0.8011
Simulated   0.7922      0.1688      0.7922
True        0.7792      0.3116      0.7792

based on the true fields. The cause of the poor simulation results can be traced back
to our difficulties in obtaining a representative histogram for As concentrations. The
poor kriging results are due to smearing of As values from the densely sampled, highly
contaminated zone to the surrounding area. Considering only the results for the true
field, we see an increase in remediated fraction with the larger VSR size, as we saw
previously with Pb. For selection based on either Pb or As threshold exceedance,
results are similar to those for Pb selection alone: VSRs that would otherwise be
misclassified based on their As value are correctly classified based on their Pb value.
Although the simulated Pb field gave remediation fractions close to those obtained
for the true field, this may be partly fortuitous and, in any case, does not ensure that
the blocks selected for remediation are the correct ones, i.e., the same as in the true
field. In practice, multiple simulations should be performed and, for each VSR within
the site, a contamination threshold exceedance probability should be calculated from
the resulting distribution of simulated concentrations for that location. The decision
on whether or not to remediate a given VSR would then be based on its threshold
exceedance probability and not on a single concentration value.
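The multiple-realization decision rule suggested above can be sketched as follows (hypothetical function and toy realizations; the 150 ppm threshold echoes the Pb threshold in the tables, and the 50% decision cutoff is purely illustrative):

```python
import numpy as np

def exceedance_probability(realizations, threshold):
    """Per-cell probability that concentration exceeds a threshold,
    estimated across multiple simulated realizations."""
    r = np.asarray(realizations)   # shape (n_realizations, ny, nx)
    return (r > threshold).mean(axis=0)

rng = np.random.default_rng(3)
sims = rng.lognormal(5.0, 0.5, size=(100, 7, 11))  # 100 toy realizations
p_exceed = exceedance_probability(sims, threshold=150.0)
remediate = p_exceed > 0.5   # e.g., remediate where exceedance prob. > 50%
```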
CONCLUSIONS

In this paper, we have described the steps involved in a geostatistical simulation
study, going from a small and possibly biased sample data set to a detailed numerical
representation of contamination levels at a hypothetical site.
We have shown that geostatistical simulation is a tool for producing maps of a
variable that honor data values at sampled locations and models of the histogram
and spatial variation structure that characterize the phenomenon.
We have shown that maps of a variable produced by simulation are, in general,
more useful than maps produced by kriging or other spatial interpolation methods
because they provide a more faithful representation of in situ variability.
We have shown that geostatistical theory and conditional simulation provide a
powerful means of studying alternative remediation strategies based on the concept
of volume of selected remediation.
We have shown, perhaps unintentionally, that geostatistical methods are not a
panacea. They cannot compensate for insufficient or grossly unrepresentative sam-
pling. In the end, a geostatistical study is only as good as the sampling data that it
is based on.
Hopefully, through the case study, we have shown that the geostatistical approach
is flexible, allowing for the incorporation of much collateral information and expert
judgement concerning a site that might otherwise be neglected. Indeed, the tailoring
of a geostatistical approach to specific site conditions is the hallmark of a successful
study.
With the overview of geostatistical simulation provided here, the reader should
now be able to fully appreciate the subsequent papers on the topic contained in these
proceedings.

ACKNOWLEDGMENTS

The author wishes to thank Doug Hartzell, who coined the term "VSR", and one
anonymous reviewer for their comments on the original manuscript. Geological Survey
of Canada contribution no. 20995.

REFERENCES

ASTM Standard Guide for the Selection of Simulation Approaches in Geostatistical
Site Investigations, American Society for Testing and Materials, Philadelphia,
draft submitted for Society approval by section D18.01.07.

Cromer, M., 1996, Geostatistics for Environmental and Geotechnical Applications:
A Technology Transfer, Geostatistics for Environmental and Geotechnical
Applications, ASTM STP 1283, R. Mohan Srivastava, Shahrokh Rouhani, Marc
V. Cromer, A. Ivan Johnson, Eds., American Society for Testing and Materials,
Philadelphia.

Cromer, M.V., C.A. Rautman and W.P. Zelinski, 1996, Geostatistical Simulation of
Rock Quality Designation (RQD) to Support Facilities Design at Yucca Mountain,
Nevada, Geostatistics for Environmental and Geotechnical Applications,
ASTM STP 1283, R. Mohan Srivastava, Shahrokh Rouhani, Marc V. Cromer,
A. Ivan Johnson, Eds., American Society for Testing and Materials, Philadelphia.

Deutsch, C.V. and A.G. Journel, 1992, GSLIB: Geostatistical Software Library and
User's Guide, Oxford University Press, New York.

Isaaks, E.H. and R.M. Srivastava, 1989, An Introduction to Applied Geostatistics,
Oxford University Press, New York.

Journel, A.G. and C. Huijbregts, 1978, Mining Geostatistics, Academic Press, London.

Rossi, R.E. and P.E. Evan Dresel, 1996, Declustering and Stochastic Simulation of
Ground-Water Tritium Concentrations at Hanford, Washington, Geostatistics for
Environmental and Geotechnical Applications, ASTM STP 1283, R. Mohan
Srivastava, Shahrokh Rouhani, Marc V. Cromer, A. Ivan Johnson, Eds., American
Society for Testing and Materials, Philadelphia.

Rouhani, S., 1996, Spatial Variability and Geostatistical Estimation: Kriging,
Geostatistics for Environmental and Geotechnical Applications, ASTM STP
1283, R. Mohan Srivastava, Shahrokh Rouhani, Marc V. Cromer, A. Ivan Johnson,
Eds., American Society for Testing and Materials, Philadelphia.

Srivastava, R.M., 1994, An Overview of Stochastic Methods for Reservoir
Characterization, in Stochastic Modeling and Geostatistics: Principles, Methods
and Case Studies, J. Yarus and R. Chambers, Eds., American Association of
Petroleum Geologists, Computer Applications 3, p. 3-16, Tulsa.

Srivastava, R.M., 1996, Describing Spatial Variability Using Geostatistical Analysis,
Geostatistics for Environmental and Geotechnical Applications, ASTM STP
1283, R. Mohan Srivastava, Shahrokh Rouhani, Marc V. Cromer, A. Ivan Johnson,
Eds., American Society for Testing and Materials, Philadelphia.

Zuber, R.D. and R. Kulkarni, 1996, A Geostatistical Analysis of Lake Sediment
Contaminants at a Superfund Site, Geostatistics for Environmental and
Geotechnical Applications, ASTM STP 1283, R. Mohan Srivastava, Shahrokh Rouhani,
Marc V. Cromer, A. Ivan Johnson, Eds., American Society for Testing and
Materials, Philadelphia.
Environmental Applications
Bruce E. Buxton,1 Darlene E. Wells,2 Alan D. Pate3

GEOSTATISTICAL SITE CHARACTERIZATION OF HYDRAULIC HEAD AND URANIUM


CONCENTRATION IN GROUNDWATER

REFERENCE: Buxton, B. E., Wells, D. E., Pate, A. D., "Geostatistical Site Characterization of
Hydraulic Head and Uranium Concentration in Groundwater," Geostatistics for Environmental
and Geotechnical Applications, ASTM STP 1283, R. Mohan Srivastava, Shahrokh Rouhani, Marc
V. Cromer, A. Ivan Johnson, and Alexander J. Desbarats, Eds., American Society for Testing and
Materials, 1996.

ABSTRACT: The first case study presented in this paper describes an


assessment of the spatial distribution and temporal changes in hydraulic
head pressure in the groundwater beneath a retired federal government
uranium processing facility. Analysis of the hydraulic heads involved
ordinary kriging, which was found to be a better mapping method than such
alternatives as inverse-distance weighting, mainly because kriging
provides measures of estimation uncertainty. The objective of this
kriging was to provide estimated steady-state head values for use in
calibrating a groundwater flow model for the site. In the second case
study, the spatial distribution of potential uranium contamination in
the aquifer was assessed with lognormal kriging. Uranium measurements
for this analysis were available at roughly three-month intervals across
a four-year time period. The objective of the analysis was to assess
where the uranium concentrations were highest. A second objective, not
addressed in this paper, was to determine if the concentrations were
changing significantly during the four-year time period.

KEYWORDS: ordinary kriging, lognormal kriging, joint spatial temporal


analysis

Kriging is a statistical interpolation method for analyzing


spatially and temporally varying data. It is used to estimate
groundwater hydraulic heads (or any other important parameter) on a
dense grid of spatial and temporal locations covering the region of
interest. At each location, two values are calculated with the kriging
procedure: the estimate of hydraulic head (in feet above sea level),
and the precision of the estimate (also in feet above sea level). The
precision can be interpreted as the half-width of a 95% confidence
interval for the estimated head.
The kriging approach includes two primary analysis steps:

1. Estimate and model temporal and spatial correlations in the


available monitoring data using a semivariogram analysis.

1Program Manager, Battelle, 505 King Avenue, Columbus, OH 43201.

2Senior Data Analyst, Battelle, 505 King Avenue, Columbus, OH 43201.

3Research Scientist, Battelle, 505 King Avenue, Columbus, OH 43201.

51
52 GEOSTATISTICAL APPLICATIONS

2. Use the resulting semivariogram model and the available


monitoring data to interpolate (i.e., estimate) hydraulic head values at
unsampled times and locations; calculate the statistical precision
associated with each estimated value.

Spatial Correlation Analysis

The objective of the spatial correlation analysis is to


statistically determine the extent to which measurements taken at
different locations and/or times are similar or different. This section
is written in terms of hydraulic head measurements; however, the
analysis approach is similar for any measured parameter of interest.
Generally, the degree to which head measurements taken at two locations
are different is a function of the distance and direction between the
two sampling locations. Also, for the same separation distance between
two sampling locations, the spatial correlation may vary as a function
of the direction between the sampling locations. For example, head
values measured at each of two locations, a certain distance apart, are
often more similar when the locations are at the same depth, than when
they are at the same distance apart but at very different depths.
Spatial/temporal correlation is statistically assessed with the
semivariogram function, γ(h), which is defined as follows (Journel and
Huijbregts, 1981):

    γ(h) = (1/2) E{ [Z(x) - Z(x+h)]² }

where Z(x) is the hydraulic head measured at location x, h is the vector
of separation between locations x and x+h, and E represents the expected
value or average over the region of interest. Note that the location x
might be defined by an easting, northing, and depth coordinate, or for
joint spatial/temporal data by an easting, northing, and time
coordinate. Similarly, the vector of separation might be defined as a
three-dimensional shift in space, or for joint spatial/temporal data as
a shift in both space and time. The semivariogram is a measure of
spatial differences, so that small semivariogram values correspond to
high spatial correlation, and large semivariogram values correspond to
low correlation.
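For illustration, the averaging in this definition can be sketched as a
simple omnidirectional estimator (a much-simplified stand-in for the
GSLIB routine used in this study; the function name and binning scheme
here are illustrative, not the study's actual parameters):

```python
import math

def experimental_semivariogram(coords, values, lag_width, n_lags):
    """Omnidirectional experimental semivariogram: within each separation
    distance class, average 0.5*(Z(x) - Z(x+h))**2 over all data pairs."""
    n = len(values)
    sums = [0.0] * n_lags
    counts = [0] * n_lags
    for i in range(n):
        for j in range(i + 1, n):
            h = math.dist(coords[i], coords[j])  # separation distance
            k = int(h // lag_width)              # distance class index
            if k < n_lags:
                sums[k] += 0.5 * (values[i] - values[j]) ** 2
                counts[k] += 1
    lags = [(k + 0.5) * lag_width for k in range(n_lags)]
    gamma = [s / c if c else float("nan") for s, c in zip(sums, counts)]
    return lags, gamma, counts
```

Small values of gamma at short lags indicate high spatial correlation,
consistent with the definition above; a directional version would also
screen each pair by its separation direction and an angular tolerance.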
As an initial hypothesis, it is always wise to assume that the
strength of spatial correlation is a function of both distance and
direction between the sampling locations. When the spatial correlation
is found to depend on both separation distance and direction it is said
to be anisotropic. In contrast, when the spatial correlation is the
same in all directions, and therefore depends only on separation
distance, it is said to be isotropic.
The spatial correlation analysis is conducted in the following
steps using all available measured hydraulic head data:

• Experimental semivariogram curves are generated by organizing


all pairs of data locations into various separation distance
and direction classes (e.g., all pairs separated by 500-1500 ft
(150-450 m) in the east-west direction ± 22.5°), and then
calculating within each class the average squared-difference
between the head measurements taken at each pair of locations.
The results of these calculations are plotted against
separation distance and by separation direction.

• To help fully understand the spatial correlation structure, a


variety of experimental semivariogram curves are generated by
subsetting the data into discrete zones, such as different
depth horizons or time periods. If significant differences are
found in the semivariograms they are modeled separately; if
not, the data are pooled together into a single semivariogram.
BUXTON ET AL. ON SITE CHARACTERIZATION 53
• After the data have been pooled or subsetted accordingly, and
the associated experimental semivariograms have been calculated
and plotted, a positive-definite analytical model is fitted to
each experimental curve. The fitted semivariogram model is
then used to input the spatial correlation structure into the
subsequent kriging interpolation step.

In this study, the computer software used to perform the


geostatistical calculations was the GSLIB software written by the
Department of Applied Earth Sciences at Stanford University, and
documented and released by Prof. Andre Journel and Dr. Clayton Deutsch
(Deutsch and Journel, 1992). The primary subroutine used to calculate
experimental semivariograms was GAMV3, which is used for three-
dimensional, irregularly spaced data.

• For three-dimensional spatial analyses, horizontal separation


distance classes were defined in increments of 1000 ft (300 m)
with a tolerance of 500 ft (150 m), while vertical distances
were defined in increments of 20 ft (6 m) with a tolerance of
10 ft (3 m). Horizontal separation directions were defined in
the four primary directions of north, northeast, east, and
northwest with a tolerance of 22.5°.

• For the joint spatial/temporal analysis, spatial separation


distances and directions were defined in the same way as
described immediately above, although there was no vertical
direction associated with this analysis. For the temporal
portion of this analysis, separation distance classes were
defined in increments of 30 days with a tolerance of 15 days.

Interpolation Using Ordinary Kriging

Ordinary kriging is a linear geostatistical estimation method


which uses the semivariogram function to determine the optimal weighting
of the measured hydraulic head values to be used for the required
estimates, and to calculate the estimation precision associated with the
estimates (Journel and Huijbregts, 1981). In a sense, kriging is no
different from other classical interpolation and contouring algorithms.
However, kriging is different in that it produces statistically optimal
estimates and associated precision measures. It should be noted that
the ordinary kriging variance, while easy to calculate and readily
available from most standard geostatistical software packages, may have
limited usefulness in cases where the data probability distribution is
highly skewed or non-gaussian. The ordinary kriging variance provides a
precision measure associated with the data density and spatial data
arrangement relative to the point or block being kriged. However, the
ordinary kriging variance is independent of the data values themselves,
and therefore may not provide an accurate measure of local estimation
precision (e.g., appropriate width of estimation confidence interval).
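In outline, ordinary kriging solves one small linear system per
estimation point. The following sketch (pure Python, with an assumed
single-structure isotropic spherical model rather than the nested models
fitted in this study) shows how the weights, the estimate, and the
ordinary kriging variance are obtained:

```python
import math

def spherical(h, sill, a):
    """Spherical semivariogram structure: rises from 0 to `sill` at range `a`."""
    if h >= a:
        return sill
    return sill * (1.5 * h / a - 0.5 * (h / a) ** 3)

def solve(A, b):
    """Gauss-Jordan elimination with partial pivoting for the small system."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        p = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[p] = M[p], M[col]
        for r in range(n):
            if r != col and M[r][col] != 0.0:
                f = M[r][col] / M[col][col]
                for c in range(col, n + 1):
                    M[r][c] -= f * M[col][c]
    return [M[i][n] / M[i][i] for i in range(n)]

def ordinary_krige(coords, values, target, sill=1.0, rng=10.0):
    """Ordinary kriging of one point: semivariogram system bordered by the
    unbiasedness constraint (the data weights must sum to one)."""
    n = len(values)
    g = lambda p, q: spherical(math.dist(p, q), sill, rng)
    A = [[g(coords[i], coords[j]) for j in range(n)] + [1.0] for i in range(n)]
    A.append([1.0] * n + [0.0])
    b = [g(coords[i], target) for i in range(n)] + [1.0]
    sol = solve(A, b)
    w, mu = sol[:n], sol[n]
    estimate = sum(wi * zi for wi, zi in zip(w, values))
    variance = sum(wi * bi for wi, bi in zip(w, b[:n])) + mu  # OK variance
    return estimate, variance
```

With a zero nugget the estimator is exact at data locations (zero
kriging variance there), and, as noted above, the variance depends only
on the data configuration, not on the data values themselves.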
The kriging analysis was conducted in this study using the GSLIB
computer software (subroutine KTB3D). The primary steps involved in
this analysis were as follows:

• A three-dimensional grid was defined, specifying the locations


at which estimated head values were required. The network
included 112 blocks in the northern direction and 120 blocks in
the eastern direction, and all blocks were 125 ft (37.5 m)
square. For three-dimensional spatial kriging, the network
included 30 vertical blocks 5 ft (1.5 m) thick. For joint
spatial/temporal kriging, the network included 43 monthly
blocks in increments of 30 days, starting at January 20, 1990.

• At each block in the grid, the average hydraulic head across


the block was estimated using all measured data found within a

pre-defined search radius. For three-dimensional spatial


kriging of steady-state hydraulic head, the search radius was
6000 ft (1800 m) in all directions. For joint spatial/temporal
kriging, the search radius was anisotropic and extended 6000 ft
(1800 m) in space and 72 days in time.

• After the available data were identified for each grid block,
the appropriate data weighting, estimated hydraulic head, and
estimation precision were calculated using the appropriate
semivariogram model.

• Output from the kriging process was typically displayed in the


form of contour maps, to represent spatial variations, and
time-series graphs to represent temporal variations.

ANALYSIS OF HYDRAULIC HEAD

Steady-state hydraulic heads were needed for calibration of a


steady-state groundwater flow model. A two-step data analysis approach
was used to estimate the steady-state heads.

1. A joint spatial-temporal kriging analysis was performed to


estimate monthly hydraulic head changes at one depth horizon, and to
select a single month representative of steady-state conditions.

2. A three-dimensional spatial kriging analysis was performed with


data from the selected month at all available depth horizons to estimate
steady-state hydraulic heads.

Joint Spatial-Temporal Analysis

The joint spatial-temporal kriging analysis was performed using


monthly hydraulic head measurements in 177 wells (Figure 1) collected
during the period from January, 1990 through July, 1993. Figure 1 shows
the well locations where data were available for only the joint spatial-
temporal analysis (denoted "JST" in the figure), for only the three-
dimensional steady-state analysis (denoted "SS"), and for both of the
analyses (denoted "Both"). There were a total of 3791 joint spatial-
temporal measurements analyzed; and the minimum, maximum, mean, and
standard deviation of these data were 493.7, 568.9, 519.9, and 5.3 feet
(148.1, 170.7, 156.0, 1.6 m), respectively.
quantifying spatial and temporal correlation in these data, are shown in
Figures 2 and 3. The spatial semivariograms in Figure 2 were calculated
for four standard directions -- north, northeast, east, and northwest.
These semivariograms show clear anisotropy with the highest
variabilities directed north along the predominant flow direction, and
the lowest variabilities directed east in the direction perpendicular to
predominant flow. The corresponding temporal semivariogram for the
monthly hydraulic heads is shown in Figure 3. Note that the units for
separation distances between data locations are in days in Figure 3 and
in feet in Figure 2. The semivariograms in Figures 2 and 3 were modeled
with an anisotropic mathematical model containing three nested variance
structures; the parameters of the model are listed in Table 1. Note in
this table that three types of semivariogram models were used in various
parts of these analyses: spherical, gaussian, and linear models. These
models are fully described by Journel and Huijbregts (1981). In Figure
2 the bold line denotes the model in the north direction; the dashed
line denotes the model in the northeast or northwest direction; and the
dotted line denotes the model in the east direction.
Fig. 1--Well locations where hydraulic head levels were monitored
from January, 1990 through July, 1993. (Scatter map, East vs. North
in meters; symbols distinguish "Both," "JST," and "SS" wells.)

The monthly head measurements were used along with the

semivariogram model to estimate via kriging monthly changes in the heads
across the entire groundwater modeling grid. The time period for this
kriging analysis was taken as every 30 days starting January 20, 1990
and ending July 2, 1993. The monthly heads (in feet above mean sea
level) are depicted in Figure 4 for six locations uniformly spaced
across the groundwater modeling grid. This figure shows that hydraulic
heads during this period were relatively high in 1990, decreased in
1991, were relatively low in 1992, and were increasing in 1993.
Fig. 2--Spatial semivariograms from joint spatial-temporal analysis of
hydraulic head (semivariogram vs. separation distance in feet).
Note that 1 foot = 30 cm.

Fig. 3--Temporal semivariogram from joint spatial-temporal analysis of
hydraulic head levels (semivariogram vs. separation distance in days).
Note that 1 foot = 30 cm.
TABLE 1--Fitted semivariogram models of spatial and temporal
correlation. (Note that 1 foot = 30 cm.)

Joint Spatial-Temporal Hydraulic Head Pressure in ft.:
  K = 3 nested structures, nugget variance = 0 ft.²
  1. Geometric anisotropic spherical: variance = 1.5 ft.², spatial
     range = 1200 ft., temporal range = 30 days
  2. Geometric anisotropic spherical: variance = 5.5 ft.², spatial
     range = 7000 ft., temporal range = 700 days
  3. Zonal Gaussian in spatial NS direction: variance = 22 ft.²,
     spatial NS range = 6060 ft.

Steady-State Hydraulic Head Pressure in ft. (June, 1993 data):
  K = 2 nested structures, nugget variance = 0 ft.²
  1. Isotropic linear: slope = 0.00045 ft.²/ft.
  2. Zonal Gaussian in horizontal NS direction: variance = 13 ft.²,
     NS range = 8660 ft.

1990 Uranium Levels in µg/L:
  K = 1 structure, nugget variance = 0.3 [ln(µg/L)]²
  1. Geometric anisotropic spherical: variance = 2.7 [ln(µg/L)]²,
     horizontal range = 3000 ft., vertical range = 120 ft.
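The nested structures in a model such as those of Table 1 combine
additively. The sketch below evaluates the joint spatial-temporal head
model at given spatial, north-south, and temporal separations; the
range-scaling treatment of the geometric anisotropy and the
"practical range" Gaussian convention (about 95% of the sill at the
stated range) are assumed conventions, not taken from the paper:

```python
import math

def spherical(h, sill, a):
    """Spherical structure: reaches `sill` exactly at range `a`."""
    return sill if h >= a else sill * (1.5 * h / a - 0.5 * (h / a) ** 3)

def gaussian(h, sill, a):
    """Gaussian structure, practical-range convention (~95% of sill at h = a)."""
    return sill * (1.0 - math.exp(-3.0 * (h / a) ** 2))

def joint_st_head_model(h_space, h_ns, h_time):
    """Table 1 joint spatial-temporal head model: nugget 0 ft^2, two
    geometric anisotropic spherical structures (space in ft, time in days),
    and a zonal Gaussian acting only on the north-south separation h_ns."""
    # geometric anisotropy: scale each separation by its directional range
    h1 = math.hypot(h_space / 1200.0, h_time / 30.0)   # structure 1
    h2 = math.hypot(h_space / 7000.0, h_time / 700.0)  # structure 2
    return (spherical(h1, 1.5, 1.0)
            + spherical(h2, 5.5, 1.0)
            + gaussian(h_ns, 22.0, 6060.0))
```

At very large spatial and temporal separations the model approaches its
total sill of 1.5 + 5.5 + 22 = 29 ft².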

Steady-State Analysis
One primary reason for performing the joint spatial-temporal
analysis in the previous section was to select, for the steady-state
analysis, a single month which was representative of average hydraulic
head levels during the 1990-1993 time period. Examining the results in
Figure 4, it appears that three months can be considered representative:
January, 1990; November, 1991; and June, 1993. Hydraulic heads in each
of these three months appear to be approximately equal to the average
head levels across the entire 1990-1993 time period. However, several
new wells were installed in the area in 1993, particularly in the
southeastern part of the modeling grid. Therefore, a significantly
greater number of head measurements were available for June, 1993 in
comparison with January, 1990 and November, 1991. As a result, June,
1993 was selected as the month to represent steady-state conditions.
Hydraulic head measurements for June, 1993 were available for the
steady-state kriging analysis in 202 wells at various depths. The
horizontal spatial semivariograms for these data are shown in Figure 5.
As in the joint spatial-temporal analysis (Figure 2), horizontal
semivariograms in Figure 5 were calculated in four primary directions.
The fitted model is also shown in Figure 5, where the bold line denotes
the model in the north direction; and the dashed line denotes the model
in the east direction.
A kriging analysis was performed with the June, 1993 data and the
semivariogram model shown in Figure 5, as well as Table 1. This
analysis estimated steady-state head levels across the groundwater
modeling grid at regular 5 ft (1.5 m) vertical intervals from 390 to 540
ft (117 to 162 m) above sea level. The horizontal variability in
steady-state head levels at the 490 ft (147 m) elevation is shown in
Fig. 4--Temporal profile of six selected estimation locations
(estimated head in feet vs. time in months; grid locations (10,10),
(60,10), (40,30), (80,30), (10,90), and (60,90)).
Note that 1 foot = 30 cm.

Figure 6. This figure shows a general trend of decreasing head pressure

to the south associated with the predominant flow direction, along with
a hydraulic head depression caused by two pumping wells in the eastern
portion of the grid. Note in the extreme south-central part of Figure 6
that an unrealistically abrupt transition is shown between the uniform
head values to the west and the pumping depression to the east.
Unrealistic kriging features like this are possible when estimates are
calculated for areas beyond the coverage of the data locations (see
Figure 1). Figure 7, which presents the statistical uncertainty (in
feet) associated with the estimates in Figure 6, shows that the steady-
state heads are generally estimated to within 1 or 2 ft (0.3 or 0.6 m).
For areas beyond the spatial coverage of the data, the uncertainties
increase to 3 ft (0.9 m) or more. These uncertainties can be
interpreted as half-widths of a 95% confidence interval for the
estimates. That is, the confidence interval for the hydraulic head at
any location in the grid is
Fig. 5--Horizontal spatial semivariograms for steady-state hydraulic
head analysis using June, 1993 data. Note that 1 foot = 30 cm.

HEAD ± PREC

where HEAD is the estimated hydraulic head from Figure 6 and PREC is the
estimation uncertainty from Figure 7.

ANALYSIS OF URANIUM LEVELS

Estimated uranium levels were needed for calibration of a

groundwater solute transport model. Separate three-dimensional spatial
kriging analyses, similar to that for steady-state hydraulic head, were
performed using average uranium levels measured during 1990, 1991, and
1992. One important difference between these uranium analyses and the
head data analysis was that a logarithmic transformation of the uranium
data was performed prior to the semivariogram and kriging analyses.
This transformation was required to reduce the extreme variability seen
in the uranium concentrations, making the semivariogram analysis more
reliable. That is, the semivariograms calculated with untransformed
data showed extreme variability which would have been difficult to
model, while the semivariograms calculated with transformed data
exhibited less variability to which a model could more reliably be fit.
However, this transformation of the data leads to possible
complications when the subsequent kriging results are back-transformed.
The most direct back-transformation is a simple inverse-logarithmic
transformation of the kriging estimate and estimation uncertainty.
However, this back-transform corresponds to estimation of the median
uranium concentration across a grid block, instead of the mean uranium
concentration. In addition, the 95% confidence intervals for uranium
Fig. 6--Estimated steady-state hydraulic head (feet), mapped by grid
block with classes < 515, < 518, < 521, < 524, and >= 524 ft.
Note that 1 foot = 30 cm.

concentrations are multiplicative, rather than additive, in format.


That is, the confidence interval at any location in the grid is

[CONC / PREC, CONC*PREC]

where CONC is the back-transformed estimated median uranium


concentration and PREC is the back-transformed estimation uncertainty.
In the alternative approach, the kriging estimate and estimation
uncertainty can be back-transformed using analytical relationships
between the mean and variance of the normal and lognormal probability
distributions (Journel and Huijbregts, 1981, p. 572), resulting in an
Fig. 7--Statistical uncertainty (width of 95% confidence interval)
in feet for estimated steady-state hydraulic head, mapped by grid
block with classes < 1, < 2, < 3, and >= 3 ft.
Note that 1 foot = 30 cm.

estimate of the mean uranium concentration rather than the median. In


this case, as with the analysis of water levels discussed earlier, the
confidence intervals are additive, although there is no guarantee that
the lower confidence bounds will be greater than zero.
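The two back-transforms can be sketched as follows, assuming a kriged
log-mean m and kriging standard deviation s for a given block. This is a
simplified illustration: the study's mean back-transform follows the
lognormal relationships in Journel and Huijbregts (1981, p. 572), which
include additional ordinary-kriging correction terms not shown here.

```python
import math

def median_backtransform(m, s):
    """Direct inverse-log back-transform: median estimate CONC with a
    multiplicative 95% factor PREC, interval [CONC/PREC, CONC*PREC]."""
    conc = math.exp(m)
    prec = math.exp(1.96 * s)
    return conc, prec

def mean_backtransform(m, s):
    """Lognormal-moment back-transform: mean estimate with an additive
    95% half-width PREC, interval [CONC - PREC, CONC + PREC]; the lower
    bound is not guaranteed to stay above zero."""
    conc = math.exp(m + 0.5 * s ** 2)              # lognormal mean
    sd = conc * math.sqrt(math.exp(s ** 2) - 1.0)  # lognormal std. dev.
    return conc, 1.96 * sd
```

The multiplicative interval from the first transform is always positive,
while the additive interval from the second can dip below zero when s is
large, mirroring the complication discussed above.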

1990 Uranium Levels

The spatial kriging analysis was performed using average uranium


concentrations (µg/L) measured during 1990 in 169 wells at various
depths (Figure 8). The mean of these measurements was 29.3 µg/L,
although the maximum concentration (691 µg/L) was considerably higher.
Fig. 8--Well locations where uranium concentrations were measured
in 1990 (scatter map, East vs. North).

The overall variability in the uranium data, as measured by the

coefficient of variation, was also relatively high (2.90), particularly
in comparison with that of the hydraulic head data (0.01).
Horizontal semivariograms were calculated with the log-transformed
data in the four primary directions (Figure 9), using horizontal
separation distance classes defined in increments of 500 ft (150 m) with
a tolerance of 250 ft (75 m), and vertical distance classes in
increments of 20 ft (6 m) with a tolerance of 10 ft (3 m). The
horizontal semivariograms indicated no significant anisotropy; that is,
all four directional curves exhibited the same shape and variability.
The vertical semivariogram (Figure 10) was found to plateau at the same
overall variance as the horizontal semivariograms; however, the vertical
semivariogram reaches its plateau at a separation distance of about 120
ft (36 m) while the horizontal semivariograms reach their plateau at a
separation distance of about 3000 ft (900 m). As a result, a geometric
anisotropic semivariogram model was fitted to these curves, as shown in
Figures 9 and 10, as well as Table 1.
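Under geometric anisotropy of this kind, each separation component is
scaled by its directional range so that a single standardized model
applies in every direction. A sketch with the Table 1 uranium
parameters, assuming the common range-scaling convention (this is an
illustration, not the GSLIB implementation used in the study):

```python
import math

def uranium_semivariogram(dx, dy, dz):
    """Table 1 uranium model sketch: nugget 0.3 plus a spherical structure
    of variance 2.7 [ln(ug/L)]^2, horizontal range 3000 ft and vertical
    range 120 ft, handled by scaling each component by its range."""
    h = math.sqrt((dx / 3000.0) ** 2 + (dy / 3000.0) ** 2 + (dz / 120.0) ** 2)
    if h == 0.0:
        return 0.0          # zero separation: semivariogram is zero
    if h >= 1.0:
        return 0.3 + 2.7    # at or beyond the standardized range: full sill
    return 0.3 + 2.7 * (1.5 * h - 0.5 * h ** 3)
```

A 3000-ft horizontal separation and a 120-ft vertical separation reach
the same sill, matching the equal plateaus described above.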
Fig. 9--Horizontal semivariograms for log-transformed 1990 average
uranium concentrations (omnidirectional and N, NE, E, NW directional
curves with fitted model). Note that 1 foot = 30 cm.

A three-dimensional kriging analysis was next performed using the

log-transformed 1990 average uranium concentrations and the
semivariogram model discussed above. In this analysis, a data search
radius of 12,000 ft (3600 m) was used. The resulting estimated spatial
distribution of the median uranium concentrations is depicted in Figure
11, which is a horizontal cross-section at a depth of 512 ft (153.6 m)
above sea level. The most significant uranium concentrations, those
above 70 µg/L, occur in a northeast-oriented area extending about 2500
ft (750 m) by 900 ft (270 m) horizontally, and about 40 ft (12 m)
vertically. A surrounding area, about 5-10 times larger, contains lower
uranium concentrations between 10 and 70 µg/L. Figure 12, which
presents the statistical uncertainty associated with the estimates in
Figure 11, indicates that these uranium concentrations are typically
estimated to within a multiplicative factor of 10; that is, the true
concentrations could be 10 times higher or lower.
In contrast, Figure 13 presents the mean 1990 uranium
concentrations calculated from the same lognormal kriging, but using the
second back-transform described above. Note that because the estimated
mean concentration is more strongly affected by high uranium data values
than is the estimated median concentration, the mean estimates in Figure
13 exhibit greater spatial variability than the median estimates in
Figure 11. This is particularly true in the northwest, northeast, and
southeast corners of the figure where the kriging is extrapolating
beyond the spatial coverage of the available data. Kriging estimates in
those areas should not be trusted, and are probably best excluded from
the final map. However, they have been retained in Figure 13 to point
out this common problem. The corresponding statistical uncertainty
associated with the mean estimates is presented in Figure 14; the
uncertainties range from about 150 µg/L to 240 µg/L. Qualitatively, the
uncertainty results in Figures 12 and 14 are similar and result in
Fig. 10--Vertical semivariograms for log-transformed 1990 average
uranium concentrations. Note that 1 foot = 30 cm.

similar upper confidence bounds. However, as noted earlier, the

uncertainties for the mean estimates (Figure 14) imply lower confidence
bounds which are often below zero µg/L, while the uncertainties for the
median estimates (Figure 12) are not plagued by this problem.

CONCLUSION

This paper presents four variations of the ordinary kriging


methodology which was found useful for environmental characterization of
groundwater at a potentially contaminated site. When estimating
hydraulic head pressure in the groundwater aquifer, three-dimensional
ordinary kriging was used in two different ways: (1) to assess temporal
changes in the two-dimensional spatial distribution of head pressures,
and (2) to estimate the three-dimensional spatial distribution of head
pressures at a fixed point in time. In both cases ordinary kriging was
applied directly to the head data, and the resulting kriging variances
were used to construct statistical confidence intervals for the
estimated head values. Ordinary kriging could be used directly in these
cases because the head data exhibited relatively low overall variability
and a symmetric probability distribution.
In contrast, kriging of uranium concentrations in the groundwater,
which exhibited much greater variability and a skewed probability
distribution, required modification of the standard ordinary kriging
procedure. In this case, ordinary kriging was performed after making a
natural logarithmic transformation of the uranium data to help reduce
the variability and make the subsequent semivariogram analysis more
reliable. The major complication with this approach is related to the
back-transformation which must be performed after kriging to convert the
estimates back into the original scale of measurement. Two approaches
Fig. 11--Median estimate of 1990 uranium concentrations (µg/L), mapped
by grid block with classes < 10, < 20, < 30, < 40, < 50, and >= 50 µg/L.

were presented in this paper. The first approach leads to estimated

median (rather than mean) uranium concentrations, and multiplicative
(rather than additive) confidence bounds. The second approach results
in estimated mean uranium concentrations and additive confidence bounds.
However, there is no guarantee that the lower confidence bounds will be
non-negative, and the widths of the confidence intervals are independent
of the local data, which may not be appropriate for highly skewed data
distributions.
Fig. 12--Multiplicative uncertainty factor (width of 95% confidence
interval) for median estimates of 1990 uranium concentrations (classes
< 5, < 10, < 20, and >= 20).

REFERENCES

Deutsch, Clayton V., and Andre G. Journel, GSLIB: Geostatistical


Software Library and User's Guide, Oxford University Press, New
York, 1992, 340 pp.

Journel, A. G., and Ch. J. Huijbregts, Mining Geostatistics, Academic

Press, reprinted with corrections, 1981, 600 pp.
Fig. 13--Mean estimate of 1990 uranium concentrations (µg/L), mapped
by grid block with classes < 10, < 20, < 30, < 40, < 50, and >= 50 µg/L.


Fig. 14--Additive uncertainty factor (width of 95% confidence interval)
in µg/L for mean estimates of 1990 uranium concentrations (classes
< 150, < 180, < 210, < 240, and >= 240 µg/L).
Pierre Colin1, Roland Froidevaux2, Michel Garcia3, and Serge Nicoletis4

INTEGRATING GEOPHYSICAL DATA FOR MAPPING THE CONTAMINATION


OF INDUSTRIAL SITES BY POLYCYCLIC AROMATIC HYDROCARBONS: A
GEOSTA TISTICAL APPROACH

REFERENCE: Colin, P., Froidevaux, R., Garcia, M., and Nicoletis, S., "Integrating Geophysical
Data for Mapping the Contamination of Industrial Sites by Polycyclic Aromatic Hydrocarbons:
A Geostatistical Approach," Geostatistics for Environmental and Geotechnical Applications,
ASTM STP 1283, R. Mohan Srivastava, Shahrokh Rouhani, Marc V. Cromer, A. Ivan Johnson,
and Alexander J. Desbarats, Eds., American Society for Testing and Materials, 1996.

ABSTRACT: A case study is presented of building a map showing the probability that the
concentration in polycyclic aromatic hydrocarbons (PAH) exceeds a critical threshold. This
assessment is based on existing PAH sample data (direct information) and on an electrical
resistivity survey (indirect information). Simulated annealing is used to build a model of the
range of possible values for PAH concentrations and of the bivariate relationship between
PAH concentrations and electrical resistivity. The geostatistical technique of simple indicator
kriging is then used, together with the probabilistic model, to infer, at each node of a grid,
the range of possible values which the PAH concentration can take. The risk map is then
extracted from this characterization of the local uncertainty. The difference between this risk
map and a traditional iso-concentration map is then discussed in terms of decision-making.

KEYWORDS: polycyclic aromatic hydrocarbons contamination, geostatistics, integration


of geophysical data, uncertainty characterization, probability maps, simulated annealing,
bivariate distributions, indicator kriging.

1Head, Environmental and Industrial Risks, Geostock, Rueil-Malmaison, France

2Manager, FSS Consultants SA, Geneva, Switzerland

3Manager, FSS International r&d, Chaville, France

4Head, Geophysical Services, Geostock, Rueil-Malmaison, France


Steelwork and coal processing sites are prone to contamination by polycyclic aromatic
hydrocarbons (PAH), some of which are known to be carcinogenic. Consequently, local and
state regulatory agencies require that all contaminated sites be characterized and that
remediation solutions be proposed. The traditional approach for delineating the horizontal and
vertical extent of the contamination is to use wells and boreholes to construct vertical profiles
of the contamination at several locations. This approach, however, is both time consuming
and expensive. Recent work has shown that, in some situations, electrical conductivity and
resistivity surveys could be used as a pathfinder for delineating the contaminated areas.
These geophysical surveys, which are both expedient and cost effective, could be used to
reduce the number of wells and boreholes to be drilled.

Geophysical data, however, do not provide direct information on soil chemistry. They are
indicative of the ground nature, which in turn may reflect human activities (backfill material,
tar tanks) and potential sources of ground pollution. These geophysical data have to be
treated as indirect and imprecise information. The mapping of the contamination therefore
requires that imprecise geophysical data be correctly integrated with precise chemical analyses
from wells and boreholes.

Geostatistics offers an ideal framework for addressing such problems. Different types of
information can be integrated in a manner which takes into consideration not only the
statistical correlation between the different types of information, but also the spatial
continuity characteristics of both. Using this approach, it is possible to provide maps showing
the probability that the PAH concentration exceeds some critical level.

A case study from an industrial site in Lorraine (northern France) is used to compare the
geostatistical approach to the traditional approach of directly contouring the data from wells
and boreholes.

AVAILABLE DATA

The available information consisted of chemical measurements of PAH concentrations (in
ppm) from 51 boreholes. Figure 1 shows the location of these boreholes. As can be seen, the
coverage is not even, and the lower right quadrant of the map is undersampled. In terms of
distribution of the concentration values, it is seen that the highest concentrations are located
toward the middle, where the cokeworks were located.

The geophysical information included both conductivity measurements (electromagnetic
survey) and resistivity measurements (dipole-dipole electrical measurements). The
electromagnetic survey made it possible to investigate the overall site area and to identify
anomalous zones where more accurate resistivity measurements were carried out. These
resistivity measurements are average values over large volumes of soil and are directly related
to the soil nature (recent alluvium, slag deposits and other backfill material). They also
depend on soil heterogeneities and on the spatial arrangement of these heterogeneities.

COLIN ET AL. ON GEOPHYSICAL DATA 71

Although the physical phenomena that govern PAH transport and the reactions between PAH
and soils of different nature are not yet well understood (they are the subject of ongoing
research projects), the presence of PAH in significant amounts has been found to be
associated with low resistivity values (i.e., it increases, locally, the soil conductivity). This
relationship, however, remains site-specific and cannot be considered a general law.

The available resistivity measurements (in ohm·m) come from 14 electrical lines, tightly
criss-crossing the contaminated area. Gaps between electrical lines were filled by sequential
simulation to produce the full resistivity map shown on Figure 2.

Figure 1 : Sample Location Map (symbols show PAH concentration classes: > 200 ppm,
40 - 200 ppm, < 40 ppm; map area approximately 600 m by 500 m)

Figure 2 : Electrical Resistivity Map (scale from 15 to 3000 ohm·m)



OBJECTIVE OF THE STUDY

The problem at hand is to delineate (on a 10 by 10 meter grid) areas where the risk that the
PAH contamination is in excess of a critical threshold is deemed large enough to warrant
either remediation or further testing. The critical threshold used for this study is 200 ppm
PAH, and three classes of risk were considered:

Low Risk : The probability that the PAH concentration exceeds 200 ppm is less than
20 percent.

Medium Risk: The probability of exceedance is in the 20 to 50 percent range.

High Risk: The probability of exceedance is over 50 percent.
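These three classes amount to a simple decision rule applied to the exceedance probability estimated at each grid node. A minimal sketch in Python (the function name and the handling of the exact boundary values are our own assumptions, as the paper does not specify which class the boundaries fall into):

```python
def risk_class(p_exceed):
    """Map the probability of exceeding 200 ppm PAH to one of the
    three risk classes (boundaries at 20 and 50 percent)."""
    if p_exceed < 0.20:
        return "low"       # Low Risk: probability below 20 percent
    if p_exceed <= 0.50:
        return "medium"    # Medium Risk: 20 to 50 percent range
    return "high"          # High Risk: over 50 percent
```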

From a methodological point of view, assessing the risk of exceedance implies that, at each
node of the grid, the range of possible values for the PAH concentration, along with their
associated probabilities, be available.

The challenge therefore is to take advantage of both the direct measurements of PAH
concentration and the indirect information provided by electrical resistivity to infer, at each
grid node, the range of possible values that the PAH concentration could take. These ranges
of possible values, also called local conditional distribution functions, can be viewed as
measures of the local uncertainty in the PAH concentration. Once these local uncertainties
are established, the risk maps can be produced.

EXPLORATORY DATA ANALYSIS

Given the objectives of the study, we need to understand the following critical features from
the available data:

1- The range of possible values for the PAH concentrations which may be encountered away
from sampled locations;

2- The relationship which exists between PAH concentrations and electrical resistivity. In
other words, knowing the resistivity value at a particular location, what can we say about
the possible range of PAH concentration values at the same location?

3- The spatial correlation structure of PAH concentrations.

Univariate Distribution of PAH Concentrations

Any geostatistical estimation or simulation algorithm requires a model describing the
probability distribution function of the variable under consideration, i.e. an enumerated list
of possible values with their associated probabilities. Traditionally, this probability distribution
function is based on the experimental histogram built on the available data. Figure 3 shows
the experimental histogram, and the corresponding summary statistics, of the available PAH
concentration data. We can see that:

- The number of data available to construct the histogram is fairly limited, resulting in a lack
of sufficient resolution: the class frequencies tend to jump up and down erratically and
there are gaps between data values.

- The histogram is extremely skewed, the bulk of the values being below 300 ppm, with a
long right tail of erratic high values extending all the way up to 6500 ppm (Figure 3a). Not
surprisingly, the coefficient of variation is very high (3.03).

- The mean and variance of the data are severely affected by this high variability and they
cannot be established at any acceptable level of reliability: removing the two largest values
reduces the average by a factor of 3 and the variance by a factor of 65!

- If we use a logarithmic scale to visualize the same histogram (Figure 3b), we see clearly
the existence of three populations: a first one below 50 ppm and accounting for 61
percent of the total population, a second one in the range 50 to 500 ppm and including 34
percent of the population, and a third, small (5 percent) population characterized by
extreme PAH values ranging from 600 to 6500 ppm.
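The sensitivity of the mean and variance to a handful of extreme values can be illustrated on a small synthetic sample (the numbers below are invented for illustration and are not the site data):

```python
def mean_var(xs):
    """Sample mean and (population) variance of a list of values."""
    n = len(xs)
    m = sum(xs) / n
    return m, sum((x - m) ** 2 for x in xs) / n

# many small background values plus two extreme ones, mimicking a
# strongly skewed contaminant distribution
sample = [2, 5, 9, 15, 33, 60, 120, 206, 300, 4000, 6500]
m_all, v_all = mean_var(sample)
m_trim, v_trim = mean_var(sorted(sample)[:-2])  # drop the two largest
# dropping 2 of 11 values shrinks both the mean and the variance dramatically
```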

Summary statistics
Number of data : 51
Mean : 382 ppm
Standard deviation : 1159 ppm
Coef. of variation : 3.03
Minimum : 2 ppm
1st quartile : 9 ppm
Median : 33 ppm
3rd quartile : 206 ppm
Maximum : 6500 ppm

Figure 3a : Histogram and summary statistics for PAH concentrations (in ppm)



Figure 3b : Histogram of PAH concentrations (logarithmic scale)

The first population can be interpreted as representing the background concentration level on
the site. The second population seems clearly related to contamination itself, with the bulk of
it above the critical threshold of 200 ppm. The third population is more difficult to
interpret, primarily because it is represented by only three samples. Although it is obviously
associated with the contamination, it is not entirely clear whether it represents a different
source of contamination or is merely the tail end of the second population.

From this analysis it is clear that the experimental histogram cannot be used as is as a model
of the distribution function of PAH concentrations over the site area. This probability
distribution function should, instead, be modelled and have the following features:

- It should not be based on parameters like the mean and variance, which are highly affected
by extreme values and are, as a result, not known with any degree of reliability;

- It should provide probabilities for the entire range of possible values, from the absolute
minimum to the absolute maximum, and fill the gaps between existing data values;

- It should reproduce the existence of the three populations and their respective frequencies.

Bivariate Distribution

The cross-plot shown on Figure 4 describes the relationship which exists between PAH
concentrations and electrical resistivity.
Covariance : -0.799
Correlation (Pearson) : -0.360
Correlation (Spearman) : -0.266

Figure 4 : Cross plot and bivariate statistics of PAH vs Electrical Resistivity

The most important feature of this plot is the existence of two distinct clouds of points, which
is a direct consequence of the multi-modality of the PAH distribution and of the bimodality
of the electrical resistivity:

- An upper cloud where PAH values are in excess of 35 ppm and the electrical resistivity
ranges from 15 to 150 ohm·m. Within this cloud, the correlation between the two
attributes is positive.

- A lower cloud with PAH values below 35 ppm and with electrical resistivities in the 30
to 1600 ohm·m range. The correlation, again, appears to be positive, but less significantly
so.

From this cross-plot it seems that high concentrations of PAH (over 35 ppm) are associated
with rather low resistivity values. One possible explanation of this feature, which still remains
to be confirmed, is that PAH, which are viscous fluids, tend to flow down through backfill
materials until they reach the top of the natural soil. At this level they fill up the soil pore
volume, thus creating a flow barrier to water.

Traditionally, bivariate distributions are parametrized by the mean and variance of their
marginal distributions together with the correlation coefficient. Such an approach is
inapplicable in our case, since it would fail completely to reflect the most important feature
of the cross-plot, which is the existence of the two populations.

The solution adopted for this study consists of directly using a bivariate histogram as the
bivariate distribution model. Because of the sparsity of data, the experimental cross plot
is not sufficient to inform all the possible bivariate probabilities: it is spiky and full of gaps.
The required bivariate histogram, therefore, will be obtained by an appropriate smoothing of
this experimental cross plot, making sure that the two clouds of points are correctly
reproduced.

Spatial Continuity Analysis

The variogram analysis performed on the natural logarithm of PAH concentrations (Figure
5) shows that the phenomenon is reasonably well structured, with a maximum correlation
distance (range) of approximately 70 meters. There was no evidence of anisotropy and the
shape of the variogram was exponential.

Figure 5 : Experimental variogram for Ln(PAH) concentrations



BUILDING THE PROBABILISTIC MODEL

Based on the results of the exploratory analysis, the probabilistic model to be used in
estimation and uncertainty assessment will consist of the following:

- A smooth univariate histogram approximating the marginal distribution of the PAH
concentration, and

- A smooth bivariate histogram describing the bivariate distribution of PAH concentration
and electrical resistivity.

Several approaches have been proposed to produce smooth histograms and cross-plots:
quadratic programming (Xu 1994), fitting of kernel functions (Silverman 1986; Tran 1994)
and simulated annealing (Deutsch 1994). The technique selected for this study is simulated
annealing, because it was perceived to be the most flexible to accommodate all the
requirements of the probabilistic model. Simulated annealing is a constrained optimization
technique which is increasingly used in the earth sciences to produce models which reflect
complex multivariate statistics. A detailed discussion of the technique can be found in Press
et al. (1992), Deutsch and Journel (1992), and Deutsch and Cockerham (1994).

In this study the modelling of the bivariate probabilistic model was done in two steps: first the
marginal distribution of the PAH concentration was modelled, and then the cross-plot
between PAH and electrical resistivity (there was no need to model the marginal distribution
of resistivity, since it was directly available from the resistivity map).

The modelling of the marginal distribution of PAH via simulated annealing was implemented
as follows:

1- An initial histogram is created by subdividing the range of possible values into 100
classes, and assigning initial frequency values to each of these classes by performing a
moving average of the original experimental histogram and then rescaling these
frequencies so that they sum up to one.

2- An energy function (Deutsch 1994) is defined to measure how close the current histogram
is to the desired features of the final histogram. In the present case the energy function
takes into consideration the reproduction of the mean, variance and selected quantiles, and
a smoothing index devised to eliminate spurious spikes in the histogram.

3- The original probability values are then perturbed by choosing at random a pair of
classes, adding an incremental value Δp to the first class and subtracting it from the
second, hence ensuring that the sum of the frequencies is still one.

4- The perturbation is accepted if it improves the histogram, i.e. if the energy function
decreases. If not, the perturbation may still be accepted with a small probability. This will
ensure that the process will not converge in some local minimum.
5- This perturbation procedure is repeated until the resulting histogram is deemed
satisfactory (the energy function has reached a minimum value) or until no further
progress is possible.
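The five steps above can be sketched in a few lines of Python. This is an illustrative toy, not the authors' implementation: the energy is reduced to the smoothing index alone, degrading moves are accepted with a fixed small probability rather than through a cooling schedule, and all parameter values are arbitrary.

```python
import random

def smooth_histogram(freqs, n_iter=20000, dp=1e-3, p_accept_up=0.01, seed=1):
    """Annealing-style smoothing of class frequencies (steps 1-5 above).

    `freqs` plays the role of the initial histogram: a moving average of
    the experimental histogram, rescaled to sum to one. Each perturbation
    moves an increment dp from one class to another, so the frequencies
    keep summing to one throughout.
    """
    rng = random.Random(seed)
    f = list(freqs)
    n = len(f)

    def energy(h):
        # smoothing index only: penalize jumps between adjacent classes
        return sum((h[k + 1] - h[k]) ** 2 for k in range(n - 1))

    e = energy(f)
    for _ in range(n_iter):
        i, j = rng.randrange(n), rng.randrange(n)
        if i == j or f[j] < dp:
            continue  # keep every frequency non-negative
        f[i] += dp
        f[j] -= dp
        e_new = energy(f)
        # accept improvements; accept a degradation with a small
        # probability so the search can escape local minima
        if e_new <= e or rng.random() < p_accept_up:
            e = e_new
        else:  # undo the rejected perturbation
            f[i] -= dp
            f[j] += dp
    return f
```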

The modelling of the cross plot followed a similar general approach:

1- An initial bivariate histogram is created by subdividing the range of possible values along
both axes into 100 classes, and assigning initial bivariate frequency values to each of these
cells by performing a moving average of the original cross-plot followed by a rescaling
of the frequencies to ensure that they sum up to one.

2- An energy function is defined to measure the goodness of fit of the current bivariate
histogram to the desired features of the final one. In the present case the energy function
takes into consideration the reproduction of the marginal distributions defined previously,
the correct reproduction of some critical bivariate quantiles and, again, a smoothing
index devised to eliminate spurious spikes in the cross plot.

3- The original bivariate frequencies are perturbed by randomly selecting a pair of cells,
and adding an incremental probability Δp to the first cell and subtracting it from the
second, leaving therefore the sum of frequencies unchanged.

4- As before, the perturbation is accepted if it decreases the energy function, and accepted
with a certain probability if not.

5- The perturbation mechanism is iterated until the energy function has converged to some
minimum value.

A detailed discussion on how to use simulated annealing for modelling histograms and cross
plots can be found in Deutsch (1994).
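Relative to the univariate case, the main change is in the energy function, which must now also penalize departures of the smoothed cross plot from the previously modelled marginals and roughness in two dimensions. A sketch of such an energy term (the equal weighting of the two components is our own arbitrary choice; the study's energy also involves bivariate quantiles):

```python
def bivariate_energy(F, target_row_marginal):
    """Energy of a bivariate histogram F (a list of rows of frequencies):
    squared mismatch between the row sums and a target marginal, plus a
    2-D smoothing index penalizing jumps between neighbouring cells."""
    nr, nc = len(F), len(F[0])
    row_sums = [sum(r) for r in F]
    marginal_term = sum((row_sums[i] - target_row_marginal[i]) ** 2
                        for i in range(nr))
    smooth_term = (
        sum((F[i][j] - F[i][j + 1]) ** 2          # jumps along each row
            for i in range(nr) for j in range(nc - 1)) +
        sum((F[i][j] - F[i + 1][j]) ** 2          # jumps along each column
            for i in range(nr - 1) for j in range(nc))
    )
    return marginal_term + smooth_term
```

A perfectly flat histogram whose row sums match the target marginal has zero energy; any spike or marginal mismatch increases it, so the same perturbation loop as before can be reused unchanged.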

The result of this modelling is shown on Figure 6 (experimental histogram of PAH
concentrations and smooth model) and Figure 7 (experimental cross plot of PAH versus
resistivity and corresponding smooth bivariate histogram). As can be seen, all the important
features appear to be well reproduced: the multi-modality of PAH, the bi-modality of the
resistivity and the existence of the two clouds on the cross plot.

ESTIMATING THE LOCAL DISTRIBUTION FUNCTIONS

Having developed the bivariate probabilistic model for PAH and electrical resistivity, we will
now use it to infer the local conditional cumulative distribution function (ccdf) of the PAH
concentration.

This inference (see Appendix I) involves two steps:

1- The local a priori distribution function (cdf) of the PAH, given the local resistivity value,
is extracted from the bivariate histogram. This local cdf characterizes the uncertainty of
the PAH value based on the overall relationship existing between PAH and resistivity, but
before using the local PAH data values themselves.

Figure 6 : PAH concentration experimental histogram and smooth histogram model
(logarithmic scale)

2- The local a priori cdf is then conditioned to the nearby PAH data values via simple
indicator kriging (Journel 1989). This ccdf now describes the uncertainty on the PAH
concentration once the local conditioning information has been accounted for. Note that
simple indicator kriging calls for a model of the spatial continuity of the residual
indicators (see Appendix I). This model is shown on Figure 8.

γ(h) = 0.01 + 0.25 Gaus_70(h)

Figure 8 : Variogram model for the indicator residuals
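The fitted model can be evaluated numerically. The sketch below assumes the common GSLIB convention for the Gaussian structure with practical range a, Gaus_a(h) = 1 - exp(-3h²/a²); the paper does not state which convention was used, so this is an assumption:

```python
import math

def gamma_residual(h):
    """Indicator-residual variogram model, gamma(h) = 0.01 + 0.25 Gaus_70(h),
    with the Gaussian structure written as 1 - exp(-3 h^2 / a^2), a = 70 m."""
    if h == 0.0:
        return 0.0  # a variogram is zero at zero separation by definition
    return 0.01 + 0.25 * (1.0 - math.exp(-3.0 * h * h / 70.0 ** 2))
```

Under this convention the model rises to about 95 percent of its total sill of 0.26 at the 70 m practical range.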

In this approach, the two types of information (direct measurements of PAH concentrations
and electrical resistivities) are mixed in a smooth, transitional fashion: when there is abundant
nearby sample data, the simple indicator kriging system will put a lot of emphasis on this
conditioning information and downplay the influence of the indirect information, whereas
when there is little or no conditioning data, the range of possible outcomes for PAH will be
primarily controlled by the local resistivity value.

PROBABILITY MAPS

Having inferred the local distribution functions of PAH concentrations, we can now build the
probability map (Figure 9c) showing the risk that this concentration exceeds the critical
threshold of 200 ppm.
Figure 7 : PAH vs Electrical Resistivity (experimental cross-plot and bivariate histogram
model; frequency scale from 0 to 0.09)

500 m
a) Iso-concentration map

II PAH > 200 ppm


[j PAH <200 ppm

500m ~1I1I1I~------------~ b) Probability map (PAH only)

High risk

Mediwn risk

Low risk

0 600 m

500 m
c) Probability map (PAH & ResistivM

II High risk

~ Mediwn ri sk

II Low risk

0
0 600m

Figure 9 :Iso-concentration map and probability maps


This probability map is compared with two other maps:

- A probability map also built by indicator kriging, but taking into account the PAH
concentrations only (Figure 9b), and

- A map showing the area where the estimated PAH concentration (expected value) exceeds
the critical threshold. This map was obtained by ordinary kriging (Figure 9a).

If we look first at the estimated map (Figure 9a), we see that the area where the estimated
PAH value exceeds 200 ppm forms a rather homogeneous, smoothly contoured zone
concentrated around the cokeworks, where all the high sample values are located. If this map
were used for decision-making, one could come to the conclusion that this zone, representing
a surface of 135,900 square meters, is the only one requiring attention.

Looking now at the probability map produced by indicator kriging based on the PAH
concentrations only (Figure 9b), we see that the picture gets more complex: the center zone
is still high risk but less homogeneously so, and the medium risk zone extends further to the
south. The medium and high risk zones now represent 190,700 square meters. However,
because of the lack of direct information, the periphery appears mostly as a low risk zone.

Finally, by including the information provided by the electrical resistivity, we see that there
is a significant probability (medium risk) that peripheral areas to the south, but also to the
north and east, are contaminated. These areas would probably warrant further testing to
confirm the level of contamination. In this case, the medium and high risk areas total 271,900
square meters.

From these results it is clear that estimated maps are inadequate for delineating risks of
contamination: they provide information on the expected value of the concentration, but not
on the local uncertainty. And because of their intrinsic smoothing properties, they may either
vastly overestimate or underestimate the extent of the risk zone. In this case the contour map
indicates a risk area (estimated concentration greater than 200 ppm) which is half the size of
the medium and high risk area shown on the probability map inferred from both the PAH
concentrations and the electrical resistivity.

It is also worth remembering that the objective of such a study is to provide a classification
of whether or not the soil, at a particular location, is likely to be affected by the contamination,
and not to provide a good estimate of the concentration at that location. The challenge is to
come up with probabilities and not with estimated concentrations. One may even argue that
the latter are irrelevant to the task at hand: high estimated values may correspond to a low
probability of exceedance and, conversely, moderate estimated values may be associated with
high probabilities of exceedance.

CONCLUSION

Prediction of contaminant concentration can be improved by taking into consideration
indirect, related information such as geophysical data. To integrate this indirect information
effectively into the estimation process, it is crucial that the bivariate relationship existing
between the direct and indirect information be correctly rendered. Very often this relationship
cannot be captured by the classical parametric description based on the means and variances
of the marginal distributions and the correlation coefficient. A more general alternative
consists in using a full bivariate histogram to describe the relationship between the two
attributes, and it is proposed that this bivariate histogram be modeled by simulated annealing.

This bivariate probabilistic model can then be used to infer the local a priori distribution
function of the contaminant concentration given the known value of the secondary attribute.
This local a priori distribution function is finally conditioned to the existing local contaminant
data by simple indicator kriging.

This approach is very general and can be used to address many different situations. It should
be stressed, however, that the relevance of its results depends heavily on how physically
meaningful the relationship between the main attribute and the co-attribute is, as described
by the bivariate probabilistic model.

APPENDIX I

Simple Indicator Kriging with Bivariate Histogram

This paper is concerned with PAH concentration and electrical resistivity. The technique,
however, is completely general and can be used whenever secondary information is
provided under the form of a bivariate histogram.

We will use the following notation:

Z(u)   random variable describing the attribute of main interest, informed by N data
       values z(u_α), α = 1, ..., N

Y(u)   random variable describing the secondary attribute, informed by M data values
       y(u_β), β = 1, ..., M

u      location coordinates vector

f(u; z_i; y_j) = Prob{ z_{i-1} ≤ Z(u) < z_i and y_{j-1} ≤ Y(u) < y_j }
               = bivariate histogram frequency

with:

z_i, i = 1, ..., N_z   thresholds discretizing the range of values [z_min, z_max]
y_j, j = 1, ..., N_y   thresholds discretizing the range of values [y_min, y_max]

At a given location u, the a priori distribution function of the main attribute will be given by:

    F⁰(u; z_k | y(u)) = Prob{ Z(u) ≤ z_k | y(u) }
                      = Σ_{i=1}^{k} f(u; z_i; y(u)) / Σ_{i=1}^{N_z} f(u; z_i; y(u))        (1)

and the estimated local ccdf (posterior distribution) by:

    F*(u; z_k | z(u_α), α = 1, ..., n)
        = Σ_{α=1}^{n} λ_α(u; z_k) · r(u_α; z_k) + F⁰(u; z_k | y(u))        (2)

with

    z(u_α), α = 1, ..., n being the n local conditioning data

and

    r(u_α; z_k) being the residual indicator value:

    r(u_α; z_k) = i(u_α; z_k) − F⁰(u_α; z_k | y(u_α)),
    where i(u_α; z_k) = 1 if z(u_α) ≤ z_k ; 0 otherwise        (3)

Simple kriging with a bivariate histogram, therefore, involves the following steps:

1- Select the number of thresholds z_k, k = 1, ..., K required to provide an adequate
discretization of the local ccdf. This number K of thresholds depends on the goal of the
study and need not be as large as the number of thresholds N_z used to build the bivariate
histogram;

2- For each datum z(u_α), define the residual indicator values r(u_α; z_k), k = 1, ..., K, using
the local a priori distribution function F⁰(u_α; z_k);


3- Establish the variogram model γ_r(h; z_k) of the residual indicator values;

4- Then, at each grid node u and for each threshold z_k:

- determine the local a priori distribution of the main attribute Z(u), given the local
secondary attribute value y(u), using equation 1;

- estimate, by simple kriging, the local ccdf of Z(u) using equation 2;

- check and correct for potential order relation problems;

5- Process the estimated ccdf to extract the required probabilities, quantiles or estimated
values.

In principle, this involves solving, at each grid node u, K simple kriging systems, since the
variogram models may be different for each threshold z_k. If it can be shown that the
variogram model does not change significantly from one threshold to another, then it is
sufficient to solve a single system, and to use the same weighting scheme for every threshold.
This approach, called Median Indicator Kriging, or "mosaic model" (Lemmer 1984; Journel
1984), can simplify and speed up the whole estimation process.
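The two building blocks of these steps, extracting the local a priori cdf from the bivariate histogram (equation 1) and forming the residual indicators (equation 3), can be sketched as follows. The simple kriging weights λ_α are taken as given here; in practice they come from a kriging system built with the residual-indicator variogram. The function names and the final clipping step are our own:

```python
def prior_cdf(biv_freq, j):
    """Equation 1: local a priori cdf of Z for a location whose resistivity
    falls in class j. biv_freq[i][j] is the modelled bivariate histogram
    frequency for PAH class i and resistivity class j."""
    column = [row[j] for row in biv_freq]
    total = sum(column)
    cdf, running = [], 0.0
    for f in column:
        running += f
        cdf.append(running / total)
    return cdf

def residual_indicators(z_datum, z_thresholds, prior):
    """Equation 3: residuals r = i(z <= z_k) - F0(z_k) for one datum,
    given its local a priori cdf `prior` at the same thresholds."""
    return [(1.0 if z_datum <= zk else 0.0) - F0
            for zk, F0 in zip(z_thresholds, prior)]

def posterior_cdf(prior, weights, residuals_per_datum):
    """Equation 2: F* = F0 + sum_a lambda_a * r_a, threshold by threshold."""
    est = list(prior)
    for lam, res in zip(weights, residuals_per_datum):
        for k in range(len(est)):
            est[k] += lam * res[k]
    # clip to [0, 1] as a crude correction of order relation problems
    # (a full correction would also restore monotonicity across thresholds)
    return [min(1.0, max(0.0, p)) for p in est]
```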

REFERENCES

Xu, W., 1994, "Histogram and Scattergram Smoothing Using Convex Quadratic
Programming," SCRF proceedings, Stanford University.

Silverman, B., 1986, "Density Estimation for Statistics and Data Analysis," Chapman and
Hall, New York.

Tran, T., 1994, "Density Estimation using Kernel Methods," SCRF proceedings, Stanford
University.

Deutsch, C.V., 1994, "Constrained Modelling of Histogram and Cross-Plots with Simulated
Annealing," SCRF proceedings, Stanford University.

Press, W., et al., 1992, "Numerical Recipes in C: The Art of Scientific Computing,"
Cambridge University Press.

Deutsch, C.V. and Journel, A.G., 1992, "GSLIB: Geostatistical Software Library and User's
Guide," Oxford University Press.

Deutsch, C.V., and Cockerham, P.W., 1994, "Practical Considerations in the Application
of Simulated Annealing to Stochastic Simulation," Mathematical Geology, Vol. 26,
No.1, pp. 67-82.
Journel, A.G., 1989, "Fundamentals of Geostatistics in Five Lessons," Volume 8, Short
Course in Geology, American Geophysical Union, Washington, D.C.

Lemmer, I.C., 1984, "Estimating Local Recoverable Reserves via Indicator Kriging," in
G. Verly et al., Geostatistics for Natural Resources Characterization, pp. 349-364,
Reidel, Dordrecht, Holland.

Journel, A.G., 1984, "The Place of Non-Parametric Geostatistics," in G. Verly et al.,
Geostatistics for Natural Resources Characterization, pp. 307-335, Reidel, Dordrecht,
Holland.

Michael R. Wild1 and Shahrokh Rouhani2

EFFECTIVE USE OF FIELD SCREENING TECHNIQUES IN ENVIRONMENTAL
INVESTIGATIONS: A MULTIVARIATE GEOSTATISTICAL APPROACH

REFERENCE: Wild, M. R., Rouhani, S., "Effective Use of Field Screening Techniques in
Environmental Investigations: A Multivariate Geostatistical Approach," Geostatistics for
Environmental and Geotechnical Applications, ASTM STP 1283, R. Mohan Srivastava, Shahrokh
Rouhani, Marc Cromer, A. Ivan Johnson, Alexander J. Desbarats, Eds., American Society for
Testing and Materials, 1996.

ABSTRACT: Environmental investigations typically entail broad data gathering efforts which
include field screening surveys and laboratory analyses. Although usually collected extensively,
data from field screening surveys are rarely used in the actual delineation of media contamination.
On the other hand, laboratory analyses, which are used in the delineation, are minimized to avoid
potentially high cost. Multivariate geostatistical techniques, such as indicator cokriging, were
employed to incorporate volatile organic screening and laboratory data in order to better estimate
soil contamination concentrations at an underground storage tank site. In this work, the direct and
cross variographies are based on a multi-scale approach. The results indicate that soil gas
measurements show good correlations with laboratory data at large scales. These correlations,
however, can be masked by poor correlations at micro-scale distances. Consequently, a classical
direct correlation analysis between the two measured values is very likely to fail. In contrast, the
presented multi-scale co-estimation procedure provides tools for a cost-effective and reliable
assessment of soil contamination based on a combined use of laboratory and field screening data.

KEYWORDS: geostatistics, cokriging, multivariate, field screening, volatile organics

Assessing the extent of soil contamination can be very costly. Laboratory analysis of common
environmental contaminants can range from $200 to $1000 per sample for standard method
testing. Consequently, many investigations first use field screening techniques to help identify
relative levels of contamination and then select a few samples for laboratory analysis. In many
cases, the validity of field data is questioned and such data are rarely used in the actual
delineation of source contamination. This paper presents a geostatistical technique for an optimal
and defensible incorporation of field screening and laboratory data. This approach is intended to

1Project Environmental Engineer, Dames & Moore, Inc., Six Piedmont Center, 3525
Piedmont Road, Suite 500, Atlanta, Georgia 30305.

2Associate Professor, School of Civil and Environmental Engineering, Georgia Institute
of Technology, Atlanta, Georgia 30332-0355.

WILD AND ROUHANI ON FIELD SCREENING 89
accomplish the following objectives:

Perform site characterization in a cost-effective and information-efficient manner,


Minimize the need for additional environmental investigations, and
Employ defensible approaches based on rigorous mathematical techniques for the
analysis of spatial data.

Several screening devices or kits are available to measure various contaminants, such as volatile
organics, metals and pesticides. The most commonly used devices are portable soil-gas probes, x-
ray fluorescence spectrometers, and immunoassay kits. The measured results of these tools are
mostly qualitative in nature and may not correlate well to actual laboratory measurements.

Almost all environmental investigations require procedures to determine the extent of
contamination. Investigators would prefer to employ all types of available information, including
field and laboratory data, to perform this task. However, federal and state regulatory agencies
discourage the direct use of field data. Consequently, a large portion of useful information is
either neglected or under-utilized. This paper provides a timely solution to extract the maximum
amount of information from available data. For this purpose, multivariate, non-linear techniques,
such as indicator cokriging, are used. This paper applies these concepts to a site whose soil has
been contaminated by several underground storage tanks over a period of 20 to 30 years. The
tanks were used primarily for storage of kerosene, gasoline and diesel fuels and various industrial
solvents. Although the site was extensively investigated with over 300 samples, limited laboratory
confirmation of volatile organic compounds (VOCs) was performed. This attempt to save money
in laboratory cost actually prohibited the delineation of contamination extent.

BACKGROUND INFORMATION

Screening tools for VOCs, such as photoionization detectors (PID) and organic vapor analyzers
(OVA), can provide an inexpensive alternative to laboratory testing, especially for large, multi-
layer investigations. Manufacturers of these instruments advocate the use of these devices as an
effective method of measuring VOCs for preliminary site characterization, including the
delineation of subsurface contamination. Unfortunately, the reliability of these devices has proven
to be dependent on weather conditions, soil type and actual contaminant concentrations.

Several published case studies exist that incorporate data gathered using these instruments. One
study performed by Marrin and Kerfoot (1988) used a portable gas chromatograph and PID to
predict the extent of groundwater contamination by measuring volatile organics in the soil gas.
Thompson and Marrin (1987) also measured soil gas concentrations at 49 locations to estimate
groundwater contamination. The results of this estimation, however, were verified by an
inadequate number of groundwater samples (five). Crouch (1990) used gas detection tubes to
estimate contaminant concentrations in soil vapor. According to Crouch, gas detection tubes were
used because the OVA and PID are not compound specific and only provide a total VOC
measurement.

None of the above investigations performed any correlation analyses between soil gas probe
readings and laboratory measurements and therefore assumed that the measured results from these
screening devices were reasonably accurate. Siegrist (1991) compared gas chromatography to two
PIDs of varying ionization potentials to measure volatile organics in a controlled environment.
The results of the comparison showed poor correlation between both PIDs and the gas
chromatograph and demonstrated that the PIDs were very sensitive to water vapor and responded
to natural organics including methane, ethylenes and alcohols. Smith and Jensen (1987) tried
correlating OVA and PID readings to laboratory measurements for total petroleum hydrocarbons
(TPH) in soil. Again, poor correlation prohibited using screening tools to estimate actual TPH,
and the authors cautioned against using screening tools as a sole criterion for determining soil
contamination.

The above works clearly indicate that the screening tools which are widely used and accepted in
the environmental field provide only qualitative information on total VOCs. The use of these tools
in delineating contamination is cautioned against because their effectiveness and reliability remain
unverified. On the other hand, comprehensive laboratory analyses of environmental investigations
can be prohibitively expensive. This study presents geostatistical procedures, such as cokriging, to
link the two measurement techniques and produce accurate maps of contamination.

Cokriging is defined as the estimation of one variable using not only observations of the variable
but also data on one or more additional, related variables defined over the same field (Olea 1991).
Cokriging is suitable for cases where the targeted variable is not sampled sufficiently to provide
acceptably precise estimates of that variable over the entire investigated field (Journel and
Huijbregts 1978). Such estimates may then be improved by correlating this variable with better-
sampled auxiliary variables as a function of their separation distance. This approach is fully
compatible with actual field conditions where VOC screening data are extensively collected but
only relatively few samples are verified by laboratory analysis. However, it must be emphasized
that the utility of cokriging depends on the level of spatial cross-correlation between the primary
and auxiliary variables. To obtain an adequate cross-correlation between investigated variables, it
may become necessary to apply data transformations prior to the actual co-estimation process.
This may require use of non-linear cokriging techniques.

The use of non-linear geostatistical techniques is preferable if one or both of the variables exhibit
non-gaussian tendencies. Such distributions are commonly observed in contamination
assessments of VOCs in soil. VOC data sets are usually characterized by a few significant outliers
and a majority of very low or non-detectable samples. Indicator kriging has been found to be
useful for highly variant phenomena where data present long-tailed distributions (Journel 1983).
This characterization also applies with respect to mineral deposits data sets (Isaaks and Srivastava
1989). This type of kriging uses a non-parametric approach that does not suffer from the impact
of outliers since the original values are transformed to either a 0 or 1 based on cutoff or threshold
limits (Isaaks and Srivastava 1989). The transformed values can then be used to estimate the
spatial distribution of the data.

GEOSTATISTICAL METHODOLOGY

Geostatistics provides tools for the analysis of spatially correlated data and is well-suited to the
study of natural phenomena (Journel and Huijbregts 1978). The theory of geostatistics has been
well-documented over the years; therefore, only a general description of techniques applicable to
this investigation is provided. These techniques are cokriging and indicator kriging. For more
information on geostatistics, see Journel and Huijbregts (1978).

Geostatistics allows for the estimation of values at unsampled locations. This estimation approach
is commonly known as kriging and is a linear combination of known nearby values, as shown by

    Z0* = Σ(i=1 to n) λi Zi                                        (1)

where Z0* = the estimated value of Z (an arbitrary parameter) at location x0;
Zi = the measured value at location xi;
λi = the kriging weight of the parameter value at xi; and
n = the number of nearby sample points to be used in the estimation.

The weights are calculated to produce the lowest estimation error or variance and to satisfy the
unbiasedness condition (Σλi = 1, i = 1 to n). The minimized variance for ordinary kriging can be
written as

    V0* = Σ(i=1 to n) λi γi0^Z + μ                                 (2)

where V0* = the minimum variance of estimation error;
γi0^Z = the variogram between Zi and Z0; and
μ = the Lagrange multiplier.
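The kriging weights λi in Eqs. (1) and (2) are obtained by solving a small linear system that includes the Lagrange multiplier μ. A minimal sketch in Python (not the authors' implementation; the spherical model parameters and data values are invented for illustration):

```python
import numpy as np

def spherical(h, sill=1.0, rng=30.0):
    """Spherical variogram model; the sill and range here are illustrative."""
    h = np.asarray(h, dtype=float)
    return np.where(h < rng, sill * (1.5 * h / rng - 0.5 * (h / rng) ** 3), sill)

def ordinary_kriging_weights(coords, x0):
    """Solve the ordinary kriging system; the extra row and column enforce
    the unbiasedness condition sum(lambda_i) = 1 via a Lagrange multiplier."""
    n = len(coords)
    A = np.ones((n + 1, n + 1))
    A[n, n] = 0.0
    dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    A[:n, :n] = spherical(dists)
    b = np.ones(n + 1)
    b[:n] = spherical(np.linalg.norm(coords - x0, axis=1))
    sol = np.linalg.solve(A, b)
    return sol[:n], sol[n]  # weights lambda_i and Lagrange multiplier mu

coords = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
lam, mu = ordinary_kriging_weights(coords, np.array([3.0, 3.0]))
z0_star = lam @ np.array([12.0, 8.0, 15.0])  # Eq. (1): Z0* = sum(lambda_i Zi)
```

The weights sum to one, and the sample nearest the estimation point receives the largest weight.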

Cokriging

Cokriging is the estimation of one variable based on measured values of two or more variables.
This procedure can be regarded as a generalization of kriging in the sense that, at every location,
there is a vector [Z(xi), Y(xi), ...] of many variables instead of a single variable Z(x) (Olea 1991).
The variable to be estimated is denoted as the target or primary variable while all other variables
are categorized as auxiliary or secondary variables. The secondary variable is cross-correlated
with the primary variable. The cokriging procedure is especially advantageous in cases where
secondary values are more abundant than primary values.

The co-estimation of the primary variable is calculated as

    Z0* = Σ(i=1 to n) λi Zi + Σ(j=1 to m) νj Yj                    (3)

where νj = the weight factor for the secondary variable, Y, measured at xj; and
m = the number of secondary-variable measurements (which is typically much greater than n).

Minimizing the variance of estimation error, V0*, subject to the cokriging unbiasedness conditions
(Σλi = 1, Σνj = 0) results in

    V0* = Σ(i=1 to n) λi γi0^Z + Σ(j=1 to m) νj γj0^ZY + μ         (4)

where γi0^Z = the variogram of the primary variable; and
γj0^ZY = the cross-variogram of the primary and secondary variables.

This technique of cokriging improves the estimation and reduces the variance of estimation error
(Ahmed and de Marsily 1987).
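In code, the co-estimate of Eq. (3) is simply a weighted sum over both data types, with the weights constrained as in Eq. (4). A sketch with invented weights and data (a real application would obtain the weights by solving the full cokriging system):

```python
import numpy as np

# Invented weights standing in for a cokriging solution (not from the paper)
lam = np.array([0.5, 0.3, 0.2])       # primary-variable weights, sum to 1
nu = np.array([0.15, -0.05, -0.10])   # secondary-variable weights, sum to 0

z = np.array([120.0, 95.0, 240.0])    # primary data Zi (e.g., lab values)
y = np.array([1.0, 0.0, 1.0])         # secondary data Yj (e.g., indicator OVA)

# Unbiasedness conditions from Eq. (4): sum(lambda_i) = 1 and sum(nu_j) = 0
assert abs(float(lam.sum()) - 1.0) < 1e-12 and abs(float(nu.sum())) < 1e-12

# Eq. (3): Z0* = sum(lambda_i Zi) + sum(nu_j Yj)
z0_star = float(lam @ z + nu @ y)
```

Because the secondary weights sum to zero, the secondary data adjust the estimate without biasing it.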

Indicator Kriging

Indicator kriging is a non-parametric technique used in probability theory when parametric
assumptions are not appropriate to describe the distribution of the data set. Instead, indicator
kriging provides an estimate of the cumulative distribution of the data set by calculating
conditional probabilities. These probabilities can be estimated by transforming the variables to a
one or zero, depending upon whether they fall below or above a cutoff level (Sullivan 1984):

    i(x; zk) = 1 if z(x) ≤ zk
             = 0 if z(x) > zk                                      (5)

where zk is the cutoff level.

By using kriging, the interpolated indicator variable at any point x0 can be estimated by

    I*k(x0) = Σ(i=1 to n) λi ik(xi)                                (6)

where I*k(x0) = the estimate of the conditional probability at x0; and
λi = the kriging weight for the indicator value at point xi (Rouhani and Dillon 1989).

The conditional probability in this case is defined as

    I*k(x0) = Prob[z(x0) ≤ zk]                                     (7)

By varying zk, the cumulative probability can then be constructed.
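The indicator transform and the effect of sweeping zk can be sketched in a few lines; the concentrations below are invented to mimic a long-tailed VOC data set:

```python
import numpy as np

def indicator(z, zk):
    """Eq. (5): 1 where z(x) <= zk, 0 otherwise."""
    return (np.asarray(z) <= zk).astype(float)

# Invented data: mostly non-detects plus a few large outliers
z = np.array([0.0, 0.0, 1.2, 0.5, 3.0, 0.0, 250.0, 0.8, 0.0, 1200.0])

# With equal kriging weights, Eq. (6) reduces to a sample proportion;
# varying zk then traces out the cumulative distribution of Eq. (7)
cutoffs = [0.0, 1.0, 10.0, 1000.0]
cdf = [float(indicator(z, zk).mean()) for zk in cutoffs]
```

Because the indicators are 0 or 1, an outlier such as the 1200 value influences the estimate no more than any other sample above the cutoff.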

CASE STUDY

The case site has been in operation for over fifty years and is currently a commercial site. The
area under investigation has a total of nine underground storage tanks (USTs) situated around the
site. Some of the tanks have been in operation for up to 30 years and were suspected to have
leaked for an unknown number of years. The site is approximately 90 to 95 percent covered with
concrete and is relatively level. The investigated data and some site characteristics have been
altered to maintain confidentiality.

The horizontal and vertical contamination at the site were assessed by three boring programs. A
total of 82 borings were advanced. All USTs were eventually excavated and removed after the
environmental investigation. A confirmatory campaign was performed after the tanks were
removed and produced 12 additional sample locations. Figure 1 shows the locations of the 94
borings and the previous locations of the tanks. A Foxboro® OVA³ was used to screen for VOCs
in each boring for all four investigations. The OVA was used to screen over 300 samples from
these borings.


[Site map graphic: soil boring locations, UST locations, and building footprints.]

Figure 1-- Site Map.


In addition to the OVA screening, laboratory analysis was performed on 35 of the 300 samples.
Typically, laboratory testing is performed on high OVA readings if one wants to identify the
highest concentrations for risk assessment. Alternatively, when OVA readings are neither high
nor low, laboratory results can be used to determine potential contamination of a sample.

In this case study, the boring campaigns collected VOC information over the entire site.
Laboratory samples were severely under-collected for the size of the site. From an agency
standpoint, this site characterization would be incomplete and unacceptable. Figure 2
demonstrates the collected laboratory samples in the surficial soils.

³A Foxboro® OVA uses the principle of hydrogen flame ionization for detection and
measurement of total organic vapors. The OVA meter has a scale from 0 to 10 which can
be set to read at 1X, 10X or 100X, corresponding to 0-10 ppm, 10-100 ppm and
100-1000 ppm, respectively. The OVA is factory-calibrated to methane.


[Site map graphic: samples sent to the laboratory, labeled with ethylbenzene concentrations
in ppb; soil borings, USTs, and buildings shown for reference.]

Figure 2-- Laboratory analyzed sample locations;
ethylbenzene concentrations are shown.
This paper salvages this extensive investigation by providing means to extract the maximum level
of information from the existing data.

ANALYSIS OF DATA SET

Cross-correlations were computed between the laboratory analyzed compounds in order to
determine applicable indicator compounds, if any. The available data were also analyzed for
correlation between the laboratory-analyzed data and the OVA samples. Next, the characteristic
distribution of the data was determined to select the most appropriate geostatistical technique for
the spatial or structural analysis.

Only benzene, ethylbenzene, toluene and xylene (BTEX) were investigated. All other compounds
detected in the samples, together, made up less than eight percent of the total volatile compounds
detected (EPA Method 8240 or Priority Pollutant compounds were tested). These compounds,
which were mostly methylene chloride measurements, were consistently measured at low levels.

Both the OVA and VOC measurements were grouped into 3-foot intervals. Because the three
boring campaigns did not always collect data at consistent depths, the intervals had varying
amounts of data and spatial distribution. However, these measurements were distributed over
various depths. Figure 2 shows only 13 samples in the surficial layer which was the most
impacted layer in terms of horizontal extent.

Cross-Correlation Analysis

A cross-correlation analysis was performed between each variable to determine applicability of
indicator compounds. For this purpose, the entire data set of 35 samples, which spanned various
depths, was used. Ethylbenzene had the highest average correlation value with the other BTEX
compounds, R² = 0.92 (R² is the squared correlation coefficient). In addition, ethylbenzene
correlated well with total BTEX (R² = 0.94). As mentioned previously, the soil-gas probes measure
total VOCs. Therefore, ethylbenzene was used as an indicator of the other three parameters and for
total VOCs.

The correlation analysis performed for each BTEX compound versus the corresponding OVA
reading produced poor correlations (Figure 3). The highest correlation coefficient was for the

[Scatter plot graphic: OVA readings (ppm) versus ethylbenzene (ppb).]

Figure 3-- OVA to ethylbenzene correlation analysis.

complete OVA data set versus ethylbenzene, yielding an R² = 0.37. This low correlation
coefficient indicates that direct correlation between laboratory and OVA measurements could not
be justified. As discussed previously, similar results were also found by many investigators.

Structural Analysis of OVA Measurements

A structural analysis was performed on the surficial soil OVA measurements to determine their
spatial correlation. Due to the qualitative nature of the OVA, the structural analysis exhibited a
high degree of variability. It was therefore concluded that, given the non-gaussian shape of the
histogram of OVA measurements, an indicator transformation was preferable.

Two approaches were considered for this analysis. The first approach, suggested by Isaaks and
Srivastava (1989), uses the median value of the data as the cutoff. Given the qualitative nature of
the OVA data, the median cutoff value may have no real significance. Therefore, a second
approach was developed. This approach identifies an OVA cutoff that would provide a high
degree of confidence that the soil is less contaminated than an established regulatory threshold for
petroleum hydrocarbons. This threshold was based on a review of a number of government
guidelines on petroleum-contaminated soils. The threshold or target value could then be used to
develop the conditional probability
    Prob[Ethylbenzene ≤ Target | OVA]                              (8)

where Target = the cleanup or suggested maximum-allowable hydrocarbon contamination level.

Calculation of the conditional probability for varying target levels (Figure 4) showed that there was
a greater than 95 percent chance that the ethylbenzene level in the soil was equal to or less than a

[Line graph graphic: conditional probability (70-100%) versus OVA readings (0-1000 ppm)
for several target levels.]

Figure 4-- Conditional probability based on OVA readings and
regulatory cleanup standards.
20 parts per billion (ppb) target level, given an OVA reading of 20 parts per million (ppm) or less.
Therefore, a target level of 20 ppb for ethylbenzene was selected as a conservative cleanup
standard. The 20 ppm cutoff value for the OVA readings was similar to the median values for the
surficial soils.
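The conditional probability of Eq. (8) can be estimated empirically from paired screening and laboratory samples. A sketch with invented pairs (not the site's data):

```python
import numpy as np

# Invented paired measurements: OVA readings (ppm) and lab ethylbenzene (ppb)
ova = np.array([5.0, 12.0, 18.0, 25.0, 40.0, 150.0, 300.0, 8.0, 15.0, 600.0])
eth = np.array([2.0, 10.0, 35.0, 60.0, 120.0, 900.0, 4000.0, 5.0, 18.0, 22000.0])

def cond_prob(target_ppb, cutoff_ppm):
    """Empirical Prob[ethylbenzene <= target | OVA <= cutoff], as in Eq. (8)."""
    low_ova = ova <= cutoff_ppm
    return float((eth[low_ova] <= target_ppb).mean())

# Fraction of low-OVA samples that also meet a 20 ppb ethylbenzene target
p = cond_prob(target_ppb=20.0, cutoff_ppm=20.0)
```

Repeating this calculation over a range of targets and cutoffs produces curves like those in Figure 4, from which a defensible OVA cutoff can be read off.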

Using the surficial soils data, the indicator variogram of OVA at 20 ppm (Y) is shown as Figure 5.

[Variogram graphic: indicator variogram of OVA (20 ppm threshold) versus distance in meters.]

Figure 5-- Indicator variogram of OVA measurements at 20
ppm threshold.
As shown by this figure, the variogram demonstrated a well-defined spatial structure.

Structural Analysis of Ethylbenzene

Recalling Figure 2, only 13 surficial ethylbenzene measurements were available for mapping soil
contamination. As determined in exploratory data analysis, the ethylbenzene measurements
exhibited a tendency toward a lognormal distribution. To account for this tendency, the natural
log of the ethylbenzene measurements was taken. Furthermore, in order to avoid the possibility
of numerical errors in the cokriging process, the log-transformed values were then normalized
(mean = 0, standard deviation = 1). This made the latter data set numerically consistent with the
indicator OVA values, thus minimizing the chance of numerical errors in cokriging.
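The transformation described above is straightforward to reproduce; the concentrations here are invented, and real data containing non-detects would need an offset or censoring treatment before taking logs:

```python
import numpy as np

# Invented ethylbenzene concentrations (ppb)
eth = np.array([3.0, 12.0, 45.0, 150.0, 900.0, 2600.0, 26.0, 7.0])

logs = np.log(eth)                      # natural-log transform
z = (logs - logs.mean()) / logs.std()   # normalize to mean 0, std 1
```
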

Unlike the OVA measurements, the variogram for the standardized, log-transformed ethylbenzene
measurements (Z) demonstrated a relatively poor spatial structure (Figure 6). This short range
prohibited accurate mapping with the current ethylbenzene data set.

[Variogram graphic: direct variogram of normalized ethylbenzene versus distance (10-50 meters).]

Figure 6-- Direct variogram of normalized ethylbenzene
measurements.
All the above direct variographies were performed using the U.S. Environmental Protection
Agency (EPA) public domain program, GEO-EAS (Englund and Sparks 1988).

Cross-Variography of OVA and Ethylbenzene Measurements

Cross-variography between the above two variables was conducted based on the linear model of
co-regionalization (Rouhani and Wackernagel 1990). These computations were conducted using
EPA's program, GEOPACK (Yates and Yates 1989). In this approach, the relationship between
the direct and cross variograms is defined as

    γ^Z  = a1 g1 + a2 g2 + a3 g3 + ... + ak gk
    γ^Y  = b1 g1 + b2 g2 + b3 g3 + ... + bk gk                     (9)
    γ^ZY = c1 g1 + c2 g2 + c3 g3 + ... + ck gk

such that

    ci² ≤ ai bi, for i = 1 to k                                    (10)

where γ^Z = the direct variogram of standardized, log-transformed ethylbenzene, Z;
γ^Y = the direct variogram of indicator OVA, Y;
γ^ZY = the cross-variogram of Z and Y;
gi = the ith basic variogram model (sill = 1);
ai = the sill of gi in the nested variogram model of Z;
bi = the sill of gi in the nested variogram model of Y;
ci = the sill of gi in the nested cross-variogram model of Z,Y; and
k = the number of basic models used in the nested variograms.

The ratio of the fitted ci²/(ai bi) represents the correlation coefficient (Ri²) at a scale consistent with
the range of the ith basic variogram.
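Using the sills reported in Table 1, the scale-specific correlations can be checked directly; the bound applied below is the standard positive semi-definiteness condition for a two-variable linear model of co-regionalization:

```python
# Sills from Table 1: nugget, 21 m spherical, and 40 m spherical structures
a = [0.6, 0.4, 0.25]      # direct variogram of Z (log-ethylbenzene)
b = [0.03, 0.07, 0.2]     # direct variogram of Y (indicator OVA)
c = [0.05, 0.167, 0.223]  # cross-variogram of Z and Y

# Scale-specific correlation coefficient: Ri^2 = ci^2 / (ai * bi)
r2 = [ci ** 2 / (ai * bi) for ai, bi, ci in zip(a, b, c)]

# Each coregionalization matrix [[ai, ci], [ci, bi]] must be positive
# semi-definite, which for two variables reduces to ci^2 <= ai * bi
assert all(v <= 1.0 for v in r2)
```

The computed values reproduce the reported coefficients: roughly 0.14 at the nugget scale and above 0.99 for the 21 m and 40 m structures.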

Figure 7 depicts the cross-variogram between the standardized, log-transformed ethylbenzene and
the indicator OVA.

[Cross-variogram graphic: cross-variogram of Z and Y versus distance (0-60 meters).]

Figure 7-- Cross-variogram of standardized ethylbenzene and
indicator OVA.

Table 1 summarizes the variogram models used for all structural analyses.
TABLE 1 -- Summary of variography.

Variable   Model       Nugget   Sill    Range (m)
Y          Spherical   0.03     0.07    21
                                0.2     40
Z          Spherical   0.6      0.4     21
                                0.25    40
Z,Y        Spherical   0.05     0.167   21
                                0.223   40

The scale-specific correlation coefficients displayed in Table 1 indicate that at micro-scales
associated with the nugget effect, the correlation between the log-transformed ethylbenzene and
the indicator OVA values is rather low (R1² = 0.14). This can be attributed to measurement
fluctuations. However, at larger scales, associated with the spherical models with ranges of 21 and
40 meters, the correlation between the above variables improves significantly (R2² = 0.996, R3² =
0.995). These correlations, however, can be masked by the poor correlations at the micro-scale
distances. Consequently, a classical direct correlation analysis between the two measured values is
very likely to fail.

CONCLUSIONS

The mapping of the under-sampled ethylbenzene measurements is made feasible by incorporating
the well-sampled OVA measurements. Figure 8 demonstrates the target 20 ppb contour for
ethylbenzene. While the ethylbenzene measurements alone do not provide a basis for delineation
of the contaminated soil, the cokriged map allows us to define the extent of contamination.
However, specific areas of the site still require additional laboratory confirmation. These areas are
south of the 26 ppb ethylbenzene measurement and between the two major tank pits.

The above results show that field screening of contaminated sites can provide valuable information
for characterization and mapping. This objective is accomplished by:

- Indicator transformation of OVA measurements based on a site-specific and
regulatory-dependent conditional probability analysis;
- Multi-scale direct variography of transformed data; and
- Multi-scale cross-variography of the transformed OVA and laboratory
measurements.




[Site map graphic: cokriged 20 ppb ethylbenzene contour shown with soil borings, USTs,
and buildings.]

Figure 8-- Results of co-estimation; ethylbenzene
contamination extent.
In conclusion, attempting to directly correlate field screening and laboratory data is usually prone
to failure. Instead, the auxiliary data are subjected to an indicator transformation, which is
consistent with the qualitative nature of field screening data. This transformation, however, requires that a
cutoff or threshold value be determined. In this work, the cutoff value is computed based on
analysis of conditional probabilities of soil samples passing various regulatory criteria. Such an
approach provides a flexible, site-specific algorithm for the transformation of field screening data
and their eventual use for co-estimation. This combined information can then be cokriged with
laboratory measurements to produce an information-efficient assessment of the extent of
contamination.

REFERENCES

Ahmed, S. and G. de Marsily, 1987, "Comparison of geostatistical methods for estimating
transmissivity using data on transmissivity and specific capacity," Water Resources Research,
23(9), pp 1717-1734

Crouch, M. S., 1990, "Check soil contamination easily," Chemical Engineering Progress, pp 41-45

Englund, E., and A. Sparks, 1988, "GEO-EAS (Geostatistical Environmental Assessment
Software) User's Guide," EPA 600/4-88/033, EMSL, Environmental Protection Agency, Las
Vegas, NV

Isaaks, E. H. and R. M. Srivastava, 1989, Applied Geostatistics, Oxford University Press, New
York

Journel, A. G. and C. J. Huijbregts, 1978, Mining Geostatistics, Academic Press, London


Journel, A. G., 1983, "Non-parametric estimation of spatial distributions," Mathematical Geology,
15(3), pp 445-468.

Marrin, D. L. and H. Kerfoot, 1988, "Soil gas surveying techniques," Environmental Science and
Technology, 22(7), pp 740-745

Olea, R. A., 1991, Geostatistical Glossary and Multilingual Dictionary, International Association
of Mathematical Geology Studies in Mathematical Geology No.3, Oxford University Press, 1991

Rouhani, S. and M. Dillon, 1989, "Geostatistical risk mapping for regional water resources
studies," The Use of Computers in Water Management, in International Water Resources
Association - Technical Session, Moscow, pp 216-228

Rouhani, S. and H. Wackernagel, 1990, "Multivariate geostatistical approach to space-time data
analysis," Water Resources Research, 26(4), pp 585-591

Siegrist, R. L., 1991, "Volatile organic compounds in contaminated soil. The nature and validity
of the measurement process," Conference - Characterization and Cleanup of Chemical Waste
Sites, Washington D.C., Journal of Hazardous Materials, 29(1), pp 3-15

Smith, P. G., and S. Jensen, 1987, "Assessing the validity of field screening of soil samples for
preliminary determination of hydrocarbon contamination," Superfund '87, Hazardous Materials
Control Resources Institute, pp 101-103

Sullivan, J., 1984, "Conditional recovery estimation through probability kriging theory and
practice," in G. Verly et al., eds., Geostatistics for Natural Resources Characterization, Part 1, D.
Reidel Publishing Co., Dordrecht, pp 365-384

Thompson, G. M. and D. L. Marrin, 1987, "Soil gas contaminant investigations: a dynamic
approach," Groundwater Monitoring Review, 7(3), pp 88-93

Yates, S.R., and M.V. Yates, 1989, "Geostatistics for Waste Management: A User's Manual For
the GEOPACK (Version 1.0) Geostatistical Software System," EPA, R.S. Kerr ERL, Ada, OK
Robert L. Johnson¹

A Bayesian/Geostatistical Approach to the Design of Adaptive Sampling Programs

REFERENCE: R. L. Johnson, "A Bayesian/Geostatistical Approach to the Design of
Adaptive Sampling Programs," Geostatistics for Environmental and Geotechnical Ap-
plications, ASTM STP 1283, R. Mohan Srivastava, Shahrokh Rouhani, Marc V. Cromer,
A. Ivan Johnson, and Alexander J. Desbarats, Eds., American Society for Testing and
Materials, 1996.

ABSTRACT: Traditional approaches to the delineation of subsurface contamination
extent are costly and time consuming. Recent advances in field screening technologies
present the possibility for adaptive sampling programs: programs that adapt or change
to reflect sample results generated in the field. A coupled Bayesian/geostatistical
methodology can be used to guide adaptive sampling programs. A Bayesian approach
quantitatively combines "soft" information regarding contaminant location with "hard"
sampling results. Soft information can include historical information, non-intrusive
geophysical survey data, preliminary transport modeling results, past experience with
similar sites, etc. Soft information is used to build an initial conceptual image of where
contamination is likely to be. As samples are collected and analyzed, indicator kriging is
used to update the initial conceptual image. New sampling locations are selected to
minimize the uncertainty associated with contaminant extent. An example is provided
that illustrates the methodology.

KEYWORDS: adaptive sampling program, indicator kriging, Bayesian analysis, site
characterization, sampling strategy

INTRODUCTION

Characterizing the nature and extent of contamination at hazardous waste sites is
an expensive and time-consuming process that typically involves successive sampling
programs. The total cost per sample can be prohibitive when sampling program

¹Staff engineer, Environmental Assessment Division, Argonne National Laboratory,
Bldg. 900, 9700 S. Cass Ave., Argonne, IL 60439.

JOHNSON ON SAMPLING PROGRAMS 103
mobilization costs, drilling or bore hole expenses, and sample analysis costs
are all included. For example, the Department of Energy (DOE) estimates that it will
spend between $15 billion and $45 billion for analytical services alone over the next 30
years to support environmental restoration activities at its facilities (DOE 1992).
One of the primary products of a site characterization study is an estimate of the
extent of contamination. Traditional characterization methodologies rely on pre-planned
sampling grids, off-site sample analyses, and multiple sampling programs to determine
contamination extent. Adaptive sampling programs present the potential for substantial
savings in the time and cost associated with characterizing the extent of contamination.
Adaptive sampling programs rely on recent advances in field analytical methods (FAMs)
to generate real-time information on the extent and level of contamination (McDonald et
al. 1994). Adaptive sampling programs result in more cost-effective characterizations by
reducing the analytical costs per sample collected, by limiting the number of samples
collected by strategically locating samples in response to field data, and finally by
bringing characterization to closure in the course of one sampling program. Adaptive
sampling programs can result in characterization cost savings on the order of 50% to
80% (Johnson, 1993).
Supporting adaptive sampling programs requires the ability to estimate the extent
of contamination based on available information, to measure the uncertainty associated
with those estimates, to determine the reduction of uncertainty one might expect from
collecting additional samples, and to direct sample collection so that sample locations
maximize information gained. Two key characteristics of contaminated sites must be 1".,
I·r'
taken into account. The first is that spatial autocorrelation is often present when samples
are collected. The second is that there may be abundant "soft" information regarding the 'Ii
'II
location and extent of contamination, even if little "hard" sample data are initially :r,
available. Soft data refers to information such as historical records, non-intrusive
geophysical survey results, preliminary fate and transport modeling results, aerial
photographs, past experience with similar sites, etc.
A number of geostatistical approaches to the design of sampling programs for
characterizing hazardous waste sites have been proposed in the past. Early methods
focused on minimizing some form of kriging variance (e.g., Olea 1984 and Rouhani
1985). More recent work has centered on stochastic conditional simulation techniques,
Bayesian implementations of geostatistics and more complex decision rules (for example,
Englund and Heravi 1992; McLaughlin et al. 1993; James and Gorelick 1994). In
practice, site characterization sampling program designs tend to blend rigid sampling
grids with selective sampling based on best engineering judgement. Typically there is
little quantitative analysis to support the final sampling program design.
A combined Bayesian/geostatistical methodology is well suited to quantitative
adaptive sampling program support. Bayesian analysis allows the quantitative
integration of soft information with hard data. Geostatistical analysis provides a means
for interpolating results from locations where hard data exist to areas where they do not.
A general Bayesian/geostatistical approach to merging soft and hard data is the Markov
Bayes model described by Deutsch and Journel (1992). The Markov Bayes model
estimates conditional cumulative density functions by developing covariance
relationships between soft and hard data sets, and pooling the two different data sources
through a form of indicator cokriging.

The methodology described in this paper exploits the fact that environmental
indicator sampling resembles binomial sampling. Binomial sampling events allow for
the derivation of conjugate prior and posterior probability density functions, which in
turn greatly simplifies computational effort. By incorporating soft information into an
initial conceptual model that is subsequently updated as hard sampling data becomes
available, the development of covariance models between soft and hard data is avoided.
The classification of areas as clean or contaminated, and the selection of additional
sampling locations, is based on a form of Type I and Type II error analysis, an approach
consistent with the Environmental Protection Agency's (EPA) Data Quality Objectives
approach to environmental restoration decision making.

METHODOLOGY

Classical statistics estimates the most likely value for π, the probability of
encountering contamination, by using hard sample data results. For example, if 20
random locations were sampled at a site and 5 of these samples returned contamination
levels above an action threshold, then an unbiased estimator of the true probability of
observing contamination above that threshold for any random location at the site would
be the number of hits divided by the number of samples, or 0.25. In classical statistics
one could carry the analysis one step further and develop confidence intervals around this
estimator with some basic assumptions about the underlying probability distribution.
Kriging provides similar results for individual points in space, accommodating spatial
autocorrelation as well. Neither classical statistics nor geostatistics provide a means of
quantitatively accommodating soft information in the analysis. For the design of
sampling programs to characterize contamination extent and subsequent analysis of
sampling program results, soft information often plays a crucial role.
A Bayesian approach differs from classical statistics by assuming that parameters
(such as the presence of contamination at a node) are unknown initially, but have some
known probability distribution called the prior probability density function (pdf). As
additional information becomes available (such as results from new sampling locations),
these prior pdfs can be updated quantitatively using Bayes' rule to produce posterior
probability density functions:

P(X|Y) ∝ P(X)P(Y|X) (1)

where P(X|Y) is the posterior pdf for X, P(X) is the prior pdf for X, and P(Y|X) reflects the
probability distribution associated with observing Y given the prior pdf of X.
From a Bayesian perspective, a two parameter beta distribution Be(α,β) is a
conjugate prior in the context of Bernoulli trials and the binomial distribution (Lee
1989). Be(α,β) ranges between zero and one, and can assume a variety of shapes
depending on the values of α and β. For a random variable π that follows a beta
distribution, the expected value of π is given by:
JOHNSON ON SAMPLING PROGRAMS 105

E(π) = α / (α + β) (2)

where:

α, β = parameters associated with the beta pdf for π, α, β >= 0.

The variance of π is given by:

Var(π) = αβ / [(α + β)²(α + β + 1)] (3)

Binomial distributions provide the probability of observing a specified number of
successes within a specified number of trials. Conjugate priors are priors that retain their
same underlying pdf after the application of Bayes' rule. In the case of a binomial trial
with an unknown underlying probability π of seeing a success in any given trial, if X
successes are obtained in N trials, a prior for π of the form Be(α,β) becomes the
posterior Be(α+X, β+N-X). N functions as the total amount of additional information
supplied to the prior. As N grows large, E(π) approaches the classical maximum
likelihood estimator for π, X/N, and Var(π) decreases monotonically.
When one considers only the presence or absence of contamination above some
threshold, environmental sampling resembles a binomial trial---N samples collected, X of
which encounter contamination above the threshold. The primary difference is that
environmental samples are not independent, as required in a traditional binomial
sampling sequence. Sample values, even at an indicator level, are spatially
autocorrelated. The issue is how to update a prior beta distribution at a given point in
space with results from samples nearby in a way that is consistent with the derivation of beta
distributions as conjugate priors for binomial distributions and that recognizes their
spatial autocorrelation.
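The conjugate beta-binomial update just described can be sketched in a few lines; the function names here are illustrative, not from the paper:

```python
# Sketch of the conjugate beta-binomial update described in the text:
# a Be(alpha, beta) prior updated with X "hits" in N samples becomes
# Be(alpha + X, beta + N - X).

def beta_mean(alpha, beta):
    return alpha / (alpha + beta)

def beta_var(alpha, beta):
    s = alpha + beta
    return (alpha * beta) / (s * s * (s + 1.0))

def update(alpha, beta, hits, n):
    """Posterior parameters after observing `hits` successes in `n` trials."""
    return alpha + hits, beta + n - hits

# Non-informative prior, then the 5-hits-in-20-samples example from the text:
a, b = update(1.0, 1.0, 5, 20)
print(beta_mean(a, b))   # close to the classical estimate 5/20 = 0.25
```

As N grows, the posterior mean converges to X/N and the posterior variance shrinks, matching the limiting behavior stated above.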
Two pieces of information are required from the set of samples: N*(x0), the total
amount of information represented by the set of samples appropriate for that point in
space, x0, and p*(x0), the probability of encountering contamination at x0 based on the
samples' results. Indicator kriging provides a means for deriving these two pieces of
information. An unbiased estimator of p* at x0 is given by:

p*(x0) = Σ_{i=1}^{N} w_i Z(x_i) (4)

where
x_i = locations where samples have been collected;
Z(x_i) = 0 or 1, depending on whether the sample at x_i encountered
contamination below or above the threshold;
w_i = kriging weights.
The set of kriging weights, w_i, can be derived by solving the following set of
simultaneous linear equations:

Σ_{i=1}^{N} C_ij w_i + μ = C_j0 for j = 1, ..., N (5)

Σ_{i=1}^{N} w_i = 1 (6)

where
C_ij = covariance between sample locations x_i and x_j;
C_j0 = covariance between sample location x_j and the point where the
interpolation is taking place, x0.

N* at x0 can be tied to N, the number of samples taken, through the following
relationship:

N* = C_00 / Var_estim - 1 (7)

Var_estim = C_00 - (Σ_{i=1}^{N} w_i C_i0 + μ) (8)

where
Var_estim = the estimation variance associated with the interpolation of p* at location
x0;
C_00 = the variance of the indicator values;
μ = the Lagrange multiplier arising from the unbiasedness condition in
equation (6).

Equation (7) is heuristically based. When the sampled locations are all "distant"
from the point of interest (i.e., greater than the spatial autocorrelation range), N* goes to
zero, implying that the sampled locations contribute no information at the point of
interest. As a sampled location comes close to the point of interest, N* goes to infinity,
indicating that the sample information has specified the probability at the point of interest
exactly.
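Equations (4) through (8) can be exercised with a small numerical sketch. The exponential covariance, its sill, and the sample layout below are illustrative assumptions, not data from the paper:

```python
# Sketch of equations (4)-(8): indicator kriging weights from the ordinary
# kriging system, the estimation variance, and the heuristic N*.
import numpy as np

def exp_cov(h, sill=0.25, rng=50.0):
    # Exponential covariance with a "practical range" convention.
    return sill * np.exp(-3.0 * h / rng)

# Illustrative sample locations, indicator values, and estimation point:
xy = np.array([[10.0, 0.0], [0.0, 20.0], [40.0, 30.0]])
z = np.array([0.0, 1.0, 0.0])              # indicator data Z(x_i)
x0 = np.array([15.0, 10.0])

n = len(xy)
# Build the ordinary kriging system (equations 5 and 6):
A = np.ones((n + 1, n + 1))
A[n, n] = 0.0
for i in range(n):
    for j in range(n):
        A[i, j] = exp_cov(np.linalg.norm(xy[i] - xy[j]))
b = np.append(exp_cov(np.linalg.norm(xy - x0, axis=1)), 1.0)

sol = np.linalg.solve(A, b)
w, mu = sol[:n], sol[n]                    # weights and Lagrange multiplier

p_star = float(w @ z)                                    # equation (4)
var_estim = exp_cov(0.0) - (float(w @ b[:n]) + mu)       # equation (8)
n_star = exp_cov(0.0) / var_estim - 1.0                  # equation (7)
print(p_star, var_estim, n_star)
```

As the heuristic requires, moving x0 far from all samples drives Var_estim toward C_00 and N* toward zero, while moving it onto a sample drives N* toward infinity.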
The methodology begins by defining a uniform grid over the region of interest.
Grid nodes are designated as Decision Points (DPs). At each DP, a pdf based on the two
parameter beta distribution Be(α,β) is defined. The beta pdf associated with each DP
describes the probability of encountering contamination above a pre-selected threshold
level at that DP. Initial values for α and β are selected to represent a synthesis of any
soft information available for a site, using equations (2) and (3). For a particular DP, the
values of α and β relative to each other determine the expected probability of
contamination at that DP. The absolute sizes of α and β determine the certainty
associated with the beta distribution at that DP. For example, both α = β = 0.4 and
α = β = 40 result in an expected probability of contamination equal to 0.5. However, in the
latter case, the variance as calculated in equation (3) is much less. In the unlikely case
where no information is available at a particular DP, a "non-informative" prior can be
selected that sets α and β equal to one at that DP.
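The certainty contrast in the α = β example above can be checked directly from equations (2) and (3); the helper name is illustrative:

```python
# Equations (2) and (3) applied to the two priors mentioned in the text:
# alpha = beta = 0.4 and alpha = beta = 40 share E(pi) = 0.5 but differ
# sharply in variance.

def beta_moments(alpha, beta):
    s = alpha + beta
    return alpha / s, (alpha * beta) / (s ** 2 * (s + 1.0))

m1, v1 = beta_moments(0.4, 0.4)
m2, v2 = beta_moments(40.0, 40.0)
print(m1, v1)   # mean 0.5, variance ~0.139
print(m2, v2)   # mean 0.5, variance ~0.003
```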
Updating the set of decision points with hard sampling data requires knowledge
of the variogram or covariance function for the site. Because the values of p* and N* at x0
are independent of C_00, the primary covariance function parameters of concern are its
shape, or functional form, and its range. If sufficient hard data exist, one can estimate
the covariance function from an experimental variogram analysis.
A simple measure of the uncertainty associated with contamination extent is to
categorize decision points as "clean", "contaminated", or state uncertain at a given
certainty level, where the probability of contamination being present at any given
decision point is based on equation (2) using the posterior beta pdf parameters that are
associated with that decision point. For example, if one wishes to be 90% certain that the
classification is correct when a decision point is classified as either clean or
contaminated, then decision points with E(π) ranging between 0.1 and 0.9 would be
classified as state uncertain. This definition of uncertainty parallels the use of
uncertainty by the EPA in its Data Quality Objectives approach to decision-making.
This method for handling uncertainty also leads naturally to measures of benefit
one might expect from additional data collection. For example, one might wish to
sample those locations that would be expected to maximize the number of decision points
classified as "contaminated" or "clean" at a given certainty level, or to minimize the
number classified as state uncertain.
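The classification rule described above can be sketched directly from the posterior beta parameters; the 90% certainty level and the test parameter values are taken as illustrative:

```python
# Sketch of the decision-point classification described in the text:
# at a 90% certainty level, E(pi) <= 0.1 -> "clean", E(pi) >= 0.9 ->
# "contaminated", otherwise state uncertain.

def classify(alpha, beta, certainty=0.9):
    e = alpha / (alpha + beta)          # equation (2), posterior mean
    if e <= 1.0 - certainty:
        return "clean"
    if e >= certainty:
        return "contaminated"
    return "uncertain"

print(classify(1.0, 30.0))   # low posterior mean: "clean"
print(classify(30.0, 1.0))   # high posterior mean: "contaminated"
print(classify(4.0, 6.0))    # E(pi) = 0.4: "uncertain"
```

A sampling-worth measure then follows naturally: count how many decision points a candidate sample would move out of the "uncertain" class.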

EXAMPLE APPLICATION

A simple example illustrates this methodology in action. Figure 1 provides a plan
view of a hypothetical site with surface soil contamination. The site contains a waste
lagoon that was breached during a storm. The owner's property is bounded by two
secondary roads. The demarcated area indicates where surface soil contamination
actually exists (7 940 m²), an area unknown to the site owner. The owner acknowledges
that contamination exists, and that portions of the site will require remedial action. The
purpose of the characterization effort is to determine the extent of contamination so that
the soils can be removed and treated off-site.
The responsible regulator wants all contaminated soils identified and removed.
The regulator wants to ensure that the sampling program is designed so that soils that are
contaminated are not erroneously classified as clean. The owner will have to pay for the
characterization, excavation and remediation of all soils believed to be contaminated.
FIG. 1--Example site

The owner wants to avoid remediating soils that are actually clean, and also to minimize
his characterization costs. After negotiations, the regulator agrees to tolerate a 20%
chance that a soil volume identified as clean is actually contaminated. The owner will be
responsible for removing and remediating all areas that have greater than 20% chance of
contamination being present.

There is no initial hard sampling data for this site. The available soft information
includes the location of the lagoon, scattered survey points from which a terrain model
can be built to indicate the probable direction of overland flow and hence contaminant
migration, the location of a utility building on site that would have been a barrier to flow,
and the location of roads with embankments that would have also blocked flow. This
soft information is used to construct the initial conceptual image of where contamination
likely is, and where it likely is not.
A grid is superimposed over the site that consists of 625 decision points (Figure
2). At each decision point, a beta distribution is defined, with parameters selected to
reflect the soft information available. For decision points that are in the building, α is set
equal to zero and β to a very large number to reflect the fact that the interior of the
building is known clean. For decision points within the lagoon, α is set equal to a very
large number and β equal to zero, to reflect the fact that the lagoon is known to be
contaminated. For the balance of the decision points, α and β are set to values less than
0.5, with their relative sizes selected so that equation (2) reflects the initial probability of
the presence of contamination.

FIG. 2--Decision point grid
Figure 3 shows the gray-scale representation of the initial conceptual model once
the beta distribution parameter values have been selected, along with a set of terrain
contours based on the available survey points. As is shown in Figure 3, the initial
conceptual image is faithful to the location of the lagoon, building, and land surface
contours. The area demarcated with the heavy black line indicates soil with
contamination probability greater than 0.2 based on this initial conceptual model. At this
point, without any sampling, the owner would have to clean up 34 440 m² of soil, more
than four times what is actually contaminated.

FIG. 3--Initial conceptual model
Before the adaptive sampling program can begin, the methodology requires a
covariance function. At the outset there is no hard data upon which to base a covariance
function choice. If the covariance function were selected to honor the initial
conceptualization, a range of approximately 200 meters would be used. The larger the
assumed range, however, the fewer the samples that would be required to characterize the
site. As a conservative start, for this example an isotropic exponential covariance
function is assumed with a range of 50 meters.
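The assumed covariance model can be written out explicitly; the unit sill and the practical-range convention below are illustrative assumptions, only the 50 m range comes from the text:

```python
# The isotropic exponential covariance assumed for the example site
# (range 50 m).  Sill of 1.0 and the practical-range convention are
# illustrative choices.
import math

def exp_cov(h, sill=1.0, rng=50.0):
    """Covariance at separation h; correlation decays to ~0.05 at h = rng."""
    return sill * math.exp(-3.0 * h / rng)

print(exp_cov(0.0))    # sill at zero separation
print(exp_cov(50.0))   # near-zero correlation at the range
```

A longer assumed range would let each sample inform more distant decision points, which is why fewer samples would be needed, and why 50 m is the conservative choice.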
A traditional sampling program for a site such as this would probably rely on a
pre-planned, regular sampling grid. As a point of comparison for the subsequent
adaptive sampling examples, Figure 4 shows an example pre-planned sampling program
based on a triangular grid pattern. The gray-shaded surface contained in Figure 4 shows
the results when a non-informative initial conceptual model is updated with the
information that would have been derived from this sampling program. The underlying
beta distribution parameters for each decision point were set to α = β = 0.1. In this
scenario, the 14 samples result in classifying 23 230 m² of soil as requiring remedial
action (i.e., the probability of contamination for these soils is greater than 0.2). This
captures 87% of the soils actually contaminated, and includes 16 230 m² of
uncontaminated soil.

FIG. 4--Standard sampling program results
The classification of much of the clean area in Figure 4 as being contaminated is
a product of the uncertainty associated with the use of an "ignorant" or non-informative
prior during the updating process. If one uses the initial conceptual model shown in
Figure 3, and updates it with the results from the sampling program shown in Figure 4,
one obtains a different interpretation of the site. Figure 5 shows the results graphically.
Using an initial conceptual model that reflects what is known at the outset about the site
results in classifying 22 000 m² of soil as requiring remedial action. This captures more
than 98% of the soil actually contaminated, and includes 14 190 m² of uncontaminated
soil.

FIG. 5--Standard sampling grid with initial conceptual model
If one incorporates the underlying soft information available for the site, as
displayed in Figure 3, and then sequentially selects 14 sampling locations that maximize
the area that would be classified as clean at the 80% certainty level, then one obtains the
pre-planned sampling program shown in Figure 6. The sequential selection of sampling
locations proceeded as follows. First, a set of potential sampling locations based on a
tight grid was established. Second, each potential sampling point was evaluated based on
the impact sampling that point would have on the categorization of soils as requiring
remedial action or not. If a potential sampling location had already been selected for
sampling, then it was discarded. In this evaluation, it was assumed that the sampling
result observed would be the most likely result based on the initial conceptual model
conditioned with any locations either already sampled, or already selected for sampling.
The potential sampling location that provided the greatest increase in the area of soil
classified as clean would be added to the list of locations to sample. This process was
used iteratively until 14 locations had been selected.

FIG. 6--Preplanned sampling program with initial conceptual model


Figure 6 also shows the results from updating the underlying conceptual model
with the results that would actually have been obtained from this pre-planned sampling
program. These 14 samples reduce the amount of soil classified as requiring remedial
action from 34 440 m² in the original conceptual model to 15 120 m², a reduction, on
average, of 1 380 m² of soil reclassified as clean per sample collected. This captures
more than 97% of the soil that is actually contaminated, and includes 7 395 m² of
uncontaminated soil.
The selection of sampling locations for the pre-planned program was based on
what was assumed would have been the results from sampling each of those locations.
As the number of sampling locations included in a pre-planned program increases, the
probability that at least one sample will encounter results that are unexpected also grows.
In an adaptive sampling program, the results from previously selected sampling locations
are available when the decision is made where to sample next.

FIG. 7--Extended adaptive sampling program

While the same process is
used for identifying the next sampling location, the difference is that the decision is
conditioned on actual sample results, not assumed sample results as in the selection
process for the pre-planned program. An adaptive sampling program at this site, driven
by the objective of maximizing the area classified as clean at the 80% certainty level,
would initially follow the same course as the pre-planned program shown in Figure 6.
The reason is that for the fourteen samples collected as part of the pre-planned sampling
program, all encountered what was expected---no contamination.
In the case of an adaptive sampling program, one has the additional option of
continuing sampling until the goals of the program have been met. Figure 7 shows the
locations of an additional 14 samples for this site, along with the results from updating
the underlying conceptual model with their results. The additional 14 samples reduced
the area classified as requiring remedial action to 10 070 m². This included 96% of the
soil actually contaminated, and 2 460 m² of uncontaminated soils. Each sample
reclassified, on average, 350 square meters of soil, a significantly smaller amount than
obtained from the first 14 samples. There are two reasons for this: first, there is simply
less area available for reclassification to clean; second, the sampling has
begun to encounter the unexpected---contaminated soil.

CONCLUSIONS

Adaptive sampling programs provide the opportunity for significant cost savings
during the characterization of a hazardous waste site. The challenge for adaptive
sampling programs is providing real-time sampling program support that both
incorporates the typically significant amounts of soft information available and
accounts for the spatial autocorrelation that is omnipresent. A joint Bayesian
analysis/indicator geostatistical method can be used to guide the selection of sampling
locations, to estimate the extent of contamination based on available data, and to
determine the expected benefits to be gained from additional sampling.
The example illustrates how the addition of soft information to the design of a
sampling program can result in a more directed sampling strategy. When the ability to
guide the program while in the field is added, the potential for cost savings is great.


ACKNOWLEDGEMENTS

The work presented in this paper was funded through the Mixed Waste Landfill
Integrated Demonstration, funded by the Office of Technology Development, Office of
Environmental Restoration and Waste Management, U.S. Department of Energy through
contract W-31-109-ENG-38.

REFERENCES

Department of Energy, Analytical Services Program Five-Year Plan, Laboratory
Management Division, Office of Environmental Restoration and Waste Management,
Washington, D.C., January 29, 1992.

Deutsch, C. V. and A. G. Journel, GSLIB: Geostatistical Software Library and User's
Guide, Oxford University Press, New York, NY, 1992.

Englund, E. J. and N. Heravi, "Conditional Simulation: Practical Application for
Sampling Design Optimization", Geostatistics Troia '92, A. Soares, ed., Kluwer
Academic Publishers, Dordrecht, 1992, pp. 613-624.

James, B. R. and S. M. Gorelick, "When Enough is Enough: The Worth of Monitoring
Data in Aquifer Remediation Design", Water Resources Research, Vol. 30, No. 12,
December, 1994, pp. 3499-3514.

Johnson, R. L., Adaptive Sampling Strategy Support for the Unlined Chromic Acid Pit,
Chemical Waste Landfill, Sandia National Laboratories, Albuquerque, New Mexico,
Report ANL/EAD/TM-2, Argonne National Laboratory, Argonne, IL, November, 1993.

Lee, P. M., Bayesian Statistics: An Introduction, Oxford University Press, New York,
NY, 1989.

McDonald, W. C., M. D. Erickson, B. M. Abratam, and A. R. Robbat, "Developments
and Applications of Field Mass Spectrometers", Environmental Science & Technology,
Vol. 28, No. 7, 1994, pp. 336-343.

McLaughlin, D., L. B. Reid, S.-G. Li, and J. Hyman, "A Stochastic Method for
Characterizing Ground-Water Contamination", Ground Water, Vol. 31, No. 2, 1993, pp.
237-249.

Olea, R. A., "Sampling Design Optimization for Spatial Functions", Mathematical
Geology, Vol. 16, No. 4, 1984, pp. 369-392.

Rouhani, S., "Variance Reduction Analysis", Water Resources Research, Vol. 21, No. 6,
June, 1985, pp. 837-846.
Kadri Dagdelen 1 and A. Keith Turner 2

IMPORTANCE OF STATIONARITY FOR GEOSTATISTICAL ASSESSMENT OF ENVIRONMENTAL
CONTAMINATION

REFERENCE: Dagdelen, K., Turner, A. K., "Importance of Stationarity for Geostatistical As-
sessment of Environmental Contamination," Geostatistics for Environmental and Geotechnical
Applications, ASTM STP 1283, R. M. Srivastava, S. Rouhani, M. V. Cromer, A. I. Johnson, A. J.
Desbarats, Eds., American Society for Testing and Materials, 1996.

ABSTRACT: This paper describes a geostatistical case study to assess TCE
contamination from multiple point sources that is migrating through
geologically complex conditions with several aquifers. The paper
highlights the importance of the stationarity assumption by
demonstrating how biased assessments of TCE contamination result when
ordinary kriging is applied to data that violate stationarity assumptions.
Division of the data set into more homogeneous geologic and hydrologic
zones improves the accuracy of the estimates. Indicator kriging offers
an alternate method for providing a stochastic model that is more
appropriate for the data. Further improvement in the estimates results
when indicator kriging is applied to individual subregional data sets
that are based on geological considerations. This further enhances the
data homogeneity and makes the use of a stationary model more appropriate. By
combining geological and geostatistical evaluations, more realistic maps
may be produced that reflect the hydrogeological environment and provide
a sound basis for future investigations and remediation.

KEYWORDS: Geostatistics, environmental contamination, kriging, second
order stationarity.

INTRODUCTION

Determination of the extent of contamination at a site is usually based
on the collection and analysis of a limited number of samples. Accurate
assessment of these sample values requires knowledge of the geologic
conditions and correct application of geostatistical methods to extend
the sample values over the entire site area. These two requirements are
mutually supportive.

Geological conditions control the movement of contaminants; therefore


evaluations of existing information concerning contamination are
dependent on a clear and unambiguous understanding of the geologic
framework. Site contamination patterns may be affected in significant
ways by the geologic framework in regions surrounding the site. Thus,
geologic studies should extend appropriate distances beyond the
immediate site boundaries.

Geostatistical methods are frequently employed to convert sampled values


into a complete description of the contamination pattern at a site.

Assistant Professor, Mining Engineering Dept., Colorado School of


Mines, Golden, CO. 80401.

Professor, Geological Engineering Dept., Colorado School of Mines,


Golden, CO. 80401.


Ordinary kriging is recognized as the best linear unbiased estimator
(B.L.U.E.) that minimizes the variance of error in determining the
average contaminant concentration at unsampled locations. The mechanics
of the kriging process are relatively straightforward. However, kriging
is based on several assumptions concerning the character of the model,
and kriging produces a B.L.U.E. only as long as these assumptions are
not violated. Violation of these assumptions may result in strongly
biased kriged estimates and a flawed site assessment.

Kriging procedures are based on a random function model that is second
order stationary. The stationarity of the model is the chief assumption
of the kriging procedure that is often violated. A random function is
said to be stationary if the probability distribution of each of its
random variables is the same. A random function is first order
stationary if the expected mean value of each of its random variables is
the same. A random function is said to be second order stationary if, in
addition, the covariance between each pair of random variables exists and is
the same for all pairs of points separated by a given distance h (Journel and
Huijbregts 1975).
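One practical probe of these assumptions is the experimental semivariogram, whose stability across the site is a prerequisite for a stationary model. A minimal sketch follows, with synthetic one-dimensional data standing in for real samples (the TCE data of this study are not reproduced here):

```python
# Minimal experimental semivariogram sketch for probing stationarity:
# gamma(h) = mean of 0.5*(z_i - z_j)^2 over pairs separated by ~lag h.
# Data are synthetic stand-ins; locations, lags, and tolerance are
# illustrative choices.
import numpy as np

def semivariogram(x, z, lags, tol):
    gam = []
    d = np.abs(x[:, None] - x[None, :])        # pairwise separations
    sq = 0.5 * (z[:, None] - z[None, :]) ** 2  # half squared differences
    for h in lags:
        mask = (np.abs(d - h) <= tol) & (d > 0)
        gam.append(sq[mask].mean() if mask.any() else np.nan)
    return np.array(gam)

rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0, 100, 60))
z = rng.normal(size=60)          # uncorrelated data: a flat variogram
g = semivariogram(x, z, lags=[5, 10, 20, 40], tol=2.5)
print(g)
```

For a second-order stationary random function the experimental values depend only on the lag h; systematic differences between subregions (as at this site) signal that the data should be split into more homogeneous zones before kriging.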

The unbiasedness condition of kriged estimates is based on a random
function model that is first order stationary. The ability to view a
particular sample data set as an outcome of a first order stationary
random function is directly related to the ability of a set of samples
to represent a local population whose expected value is the same at all
locations of the search neighborhood.

This paper describes why a data set coming from an environmental site
may not be suitable for analysis under the assumptions of a stationary
random function model. It shows how ignoring this condition leads to
biased kriged estimates and documents approaches to address the
stationarity issue, thereby producing more accurate estimates of
contaminant distribution.

SITE DESCRIPTION

The site is located on 464 acres of land in the foothills of the


Colorado Front Range 20 miles south-southwest of the city of Denver.
Since 1957, activities at the site consisted of missile assembly, engine
testing, and research and development for the Titan I, II, and III
missile programs, and included fuels development, purification, and
testing in support of the Titan III program.

Geologic Framework

The site straddles the eastern margin of a portion of the Colorado Front
Range. The western portions are dominated by Precambrian high-grade
metamorphic and igneous intrusive rocks. Younger sandstone formations
are found to the east of the Precambrian rocks. These now dip away from
the mountain front at relatively steep angles, up to 50°. Consequently,
the eastern portions of the site are entirely restricted to the lower
and middle portions of the Fountain Formation. Large and small
fractures, faults, and shear zones, some over a mile wide and extending
for many tens of miles are common in the Precambrian rocks. Renewed
movements along several of these zones of weakness introduced fractures
and faults within the younger sedimentary rocks. These sandstones are
partly covered by unconsolidated Quaternary and Holocene deposits,
composed of silty sandy gravels with substantial proportions of clay.
However, the older of these units represent pediment surface deposits
and are distinct from the younger units, which are valley-fill alluvium
deposited at lower elevations in more geographically restricted areas
following a period of valley down-cutting.

DAGDELEN AND TURNER ON STATIONARITY 119

The Data Set

The bedrock at the site is penetrated by about 80 drill holes extending
to various depths (Figure 1). Figure 1 shows these borehole locations
and highlights the five samples with the highest TCE concentrations. TCE
concentrations were reported for 111 samples, but 31 of these samples
were duplicates. The sampled TCE concentrations range between 0 and
10,000 ppm and are skewed, with an arithmetic average of 328.8 ppm and a
coefficient of variation of 3.96. As shown in Figure 2, considerable
numbers of samples show "non detect" conditions, and only 50% of the
samples exceed 3.0 ppm TCE.

Evaluation of Hydrogeologic Conditions

Three distinct hydrologic regimes are obvious at the site: the older
Precambrian rocks, younger sedimentary rocks, and overlying
unconsolidated deposits. Each has distinctive characteristics and
interactions between these regimes are relatively complex.

Ground-water flow in the Precambrian rocks may be characterized as a


system governed by fracture flow. The intrinsic permeability of these
rocks is so low as to be negligible, but fractures and foliation planes
are common and pervasive. Numerous studies have demonstrated the
importance of fractures in controlling ground-water flow within these
otherwise relatively impermeable Precambrian rocks. A large zone of
sheared rock is mapped along one major trend that crosses the western
.. ,
',I':
Jill
,
boundary of the site. Within such regions, there may be substantial
hydraulic interconnection between surface water, ground water in the iii'
relatively thin, spatially-confined and discontinuous alluvial deposits, Iii'
and the regional ground-water flow systems. i""

Water movement through the Fountain sandstone is primarily controlled by


its relatively low matrix permeability. These rocks contain
considerable silt and clay which reduces and clogs the pores between the
sand and gravel particles. Within the Fountain, the highest
permeabilities are generally oriented parallel to the inclined bedding.
Fractures are often sealed by calcite, which further reduces their
ability to transmit water. The Fountain thus appears to have a
relatively consistent and generally low value of effective hydraulic
conductivity, especially in the direction normal to the Precambrian-
Fountain contact. This value is lower than the effective rock-mass
permeability of the fractured portions of the Precambrian terrane. At
least in some areas, it seems probable that the Fountain sediments may
act as a "permeability blanket" to the regional groundwater flow system
in the Precambrian terrane. The presence of a leaky hydraulic barrier
in the lower Fountain would be manifested by artesian or confined
ground-water conditions within the lower Fountain sediments, and springs
and seeps or recharging streams along or near the Precambrian-Fountain
contact. The presence of such seeps and recharging streams has been
reported.

The unconsolidated deposits have a generally similar texture and


hydraulic conductivity values, but their recharge and discharge
characteristics and inter-connections are highly variable throughout the
site. The ability of these deposits to act as a single shallow aquifer
system is uncertain. Movement of contaminants through upper portions of
weathered Fountain bedrock may provide hydrologic connections, but the
existence of such flow paths does not yet appear convincing.
120 GEOSTATISTICAL APPLICATIONS
Figure 1. Map of the site showing drill hole locations (the five heavier
circles represent the five largest-valued samples).

Figure 2. Histogram and descriptive statistics defining the sample
distribution of the TCE values in the bedrock (log scale; number of
data: 111; mean: 328.77; std. dev.: 1304.33; coef. of var.: 3.97;
maximum: 10000; median: 3.0; minimum: 0.0).
DAGDELEN AND TURNER ON STATIONARITY 121
Available well measurements were adequate to allow the construction of
two potentiometric surface maps, one showing the distribution of heads
in the bedrock aquifers and the second showing conditions within the
shallow "alluvial" aquifer represented by the entire suite of
unconsolidated deposits. Potentiometric contours for the "bedrock"
groundwater system suggest water flows from the west and discharges
toward the east and southeast. Shallow aquifer contours also show a
general west to east flow direction over much of the site. A difference
map was created by subtracting values of the heads in the alluvial
aquifer from those in the bedrock. On this map positive difference
values correspond to areas where the potential flow is upward, from
bedrock to the alluvial aquifer. Similarly, negative difference values
correspond to areas where the heads in the alluvial aquifer are higher
than in the bedrock aquifer and the potential for downward flow exists.
Areas of upward flow are found mostly in the lower portions of the
Fountain Formation, supporting the concept that significant ground water
flow from the Precambrian rocks along fracture systems may be partially
blocked, causing increased pressures within the lower Fountain. From
this difference map, three distinct zones were defined:
• a zone of downward flow where ground water may flow from the shallow
alluvial units into the bedrock;
• a zone of upward flow where ground water may flow from the bedrock
units into alluvium; and
• a zone where neither upward nor downward flow gradients are strong
and there are no apparent preferred directions of vertical ground-
water movements.
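The three-zone definition from the head-difference map amounts to a sign test on the difference value at each mapped node. A minimal sketch in Python; the function name and the tolerance band for the "neutral" zone are illustrative assumptions, not values from the study:

```python
def classify_flow_zone(head_bedrock, head_alluvial, tol=0.5):
    """Classify one map node by the sign of (bedrock head - alluvial head).

    A positive difference indicates potential upward flow (bedrock to
    alluvium); a negative difference indicates potential downward flow.
    Differences within +/- tol (an assumed band, in the same length
    units as the heads) are treated as having no preferred direction.
    """
    diff = head_bedrock - head_alluvial
    if diff > tol:
        return "upward"
    if diff < -tol:
        return "downward"
    return "neutral"
```

Applying such a rule node by node over the gridded difference map yields the three zones directly.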

Definition of Subregions at Site

In those zones where upward flow from the bedrock into the alluvium
appears to dominate, most wells monitoring the alluvial units report TCE
contamination. Yet, in this same zone, the majority of wells monitoring
the bedrock ground-water flow system show no TCE contamination. In
contrast, in those zones where downward flow dominates, alluvial wells,
with only a few exceptions, report no TCE contamination at locations
where most bedrock wells report TCE contamination. In the zone where
neither upward nor downward flow appears to dominate, many bedrock and
alluvial wells report TCE contamination. TCE contamination of the
Fountain bedrock thus appears to be mostly restricted to those portions
of the site where downward ground-water movement from the alluvial units
may be occurring. In these locations, small groups of bedrock wells
reporting TCE contamination are surrounded by non-contaminated bedrock
wells. The reported TCE contaminants in the bedrock thus appear to be
directly related to downward movement of contaminants from the overlying
unconsolidated materials.

Based on such hydrogeologic evidence, the site was divided into
subregions. Four initial subregions were defined by examining the
shallow unconsolidated deposits in terms of: (a) the hydrogeological
setting, (b) the directions of ground-water flow, and (c) the location
of known contaminant sources. Their boundaries were largely defined by
interpreted ground-water flow directions and ground-water divides
identified by analysis of the potentiometric contours. Comparison of
these initial subregions with the zones of potential vertical ground
water movements, defined by the methods described earlier, and with
known major contaminant source areas, yielded six subregions (Figure 3).
Figure 3. Site map showing the six subregions.

Each subregion represents an area that is believed to contain a
distinct combination of surface and bedrock geologic conditions, and a
common contaminant source or sources. Thus each should have a distinct
population of contaminant values, and sample values from within each
subregion should be considered as an outcome of a stationary random
field. Each subregion should be evaluated independently by
geostatistical methods, when there are sufficient samples within the
area, or by visual inspection when there are too few samples to allow
for geostatistical analysis.

THE ASSUMPTION OF STATIONARITY IN ESTIMATION

The theoretical derivation of the kriging procedure is based on the
assumption that the data observations can be conceptualized as an
outcome of a second-order stationary random function. That is, the
variable being measured has the same mean value at all locations and the
same spatial covariance or variogram function between all points
separated by a distance h.
In practice, the assumption concerning stationarity of the mean values
requires the sample set being evaluated to be derived from the domain
under study in such a way that, at any point in the domain, the expected
values of samples surrounding this point should be the same. In other
words, the probability of sampling high values within any local region
will be the same throughout the domain, as will the probability of
sampling low values. In a similar fashion, second-order stationarity
requires that the expected value of the squared difference between pairs
of points h apart in a given direction remain the same throughout
the domain.
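The squared-difference quantity invoked here is exactly what an experimental variogram estimates. A minimal, self-contained Python sketch of that computation (the isotropic case; sample layout and lag tolerance are assumed for illustration):

```python
import math

def semivariance(samples, lag, tol=0.5):
    """Experimental semivariogram value at one lag: half the mean
    squared difference over all sample pairs whose separation distance
    falls within tol of the lag. 'samples' is a list of (x, y, value).
    Returns None if no pairs fall in the lag class."""
    sq_diffs = []
    for i, (xi, yi, vi) in enumerate(samples):
        for xj, yj, vj in samples[i + 1:]:
            if abs(math.hypot(xi - xj, yi - yj) - lag) <= tol:
                sq_diffs.append((vi - vj) ** 2)
    return 0.5 * sum(sq_diffs) / len(sq_diffs) if sq_diffs else None
```

Repeating this for several lags (and restricting pairs to a direction window) produces the directional experimental variograms discussed later.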

Site-wide sampling campaigns may provide data sets that are
inappropriate for analysis by stationary random function models,
especially if sampled locations preferentially represent locations of
contaminated zones within a larger domain (Isaaks and Srivastava 1989).
When data sets representing zones of different concentration levels are
mixed to form a single data set, the stationary random function model
may no longer be justified. When this combined data set is analyzed by
kriging, samples coming from one stationary domain will influence the
estimation of unknown concentrations in other domains, violating the
assumptions of the stationary random function model and resulting in
biased estimates.

Indicator kriging determines, by using the samples in the neighborhood,
the probability of data values in a given area being greater than a
defined threshold value (Journel 1983; Isaaks 1984). To conduct
indicator kriging, data values are transformed into indicator values:
original values which exceed the chosen threshold value are coded 1, and
those below the threshold value are coded 0. These indicators are then
analyzed to determine their spatial directional variability with a
series of experimental variograms. By inspection of these variograms,
orientations of greatest and least spatial continuity are selected.
Variogram models are fitted to the experimental variograms corresponding
to these two directions. Then the indicator data are kriged using these
variogram models to determine the probability of exceeding the threshold
value at a series of desired grid locations. Though actual values from
multiple local zones of contamination cannot be combined to provide
unbiased estimates of the average contamination at a given unsampled
location, experience has shown that it may be appropriate to combine the
median indicator values and treat them as an outcome of a stationary
random function model.
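The indicator transform described above is a one-line coding step. A small sketch, using the 3.0 ppm threshold adopted later in the paper and assuming strict exceedance codes a 1:

```python
def indicator_transform(values, threshold=3.0):
    """Code each sample 1 if it exceeds the threshold, else 0
    (values at or below the threshold are coded 0)."""
    return [1 if v > threshold else 0 for v in values]

# Illustrative TCE concentrations in ppm (not the site data):
tce = [0.0, 2.5, 3.0, 10.0, 4500.0]
print(indicator_transform(tce))  # [0, 0, 0, 1, 1]
```

The resulting 0/1 data are then treated like any other variable: experimental variograms are computed on the indicators, modeled, and kriged, and the kriged value at a node is read as a probability of exceedance.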

ANALYSIS PROCEDURES

Limitations to the use of ordinary kriging are illustrated with the data
from this site. For these analyses, a threshold limit of 3.0 ppm TCE
contamination was selected, because it seemed to be about the lowest
reported value in any of the wells and was slightly less than the
drinking water standard of 5 mg/l. When data from the entire site were
combined and evaluated by ordinary kriging procedures, the resulting
bias produced over-estimation of the observed concentration values over
much of the site (Table 1). The data were then divided into subregions
defined by careful interpretation of geologic conditions at the site, as
described previously. These subregional data sets were individually
analyzed with ordinary kriging procedures, and although a lower degree
of over-estimation of the observed values was observed (Table 1), the
bias in these estimates was still considered unacceptable. Thus
indicator kriging methods were used to determine if assumptions of
stationarity could be achieved by this method, thereby providing more
accurate ("unbiased") estimates (Table 1).

Table 1. Summary of Cross-Validation Results, Showing Over- and
Under-Estimation Rates Achieved by Different Analysis Methods.

Procedure                        # False Positives   # False Negatives

Ordinary Kriging on entire
  data set (Fig. 7)              25 (35%)            5 (7%)
Ordinary Kriging on
  Subregions (Fig. 9)            21 (29%)            4 (5%)
Indicator Kriging on entire
  data set (Fig. 11)             19 (26%)            17 (24%)
Indicator Kriging on
  Subregions (Fig. 13)           13 (18%)            12 (17%)

Ordinary Kriging Using Data from Entire Site

Pairwise relative variograms were created using routines in GSLIB
(Deutsch and Journel 1992) to determine directional anisotropies within
the entire data set. Figure 4 shows the contour map of the resulting
variogram surface. The main axis of anisotropy is aligned along the
azimuth of 112.5° (see Figure 4), and the anisotropy ratio is 0.5. Figure
5 shows eight directional variograms oriented at 22.5° intervals. The
modeled variogram uses a spherical model with a range of 700 ft, a sill
of 1.2, and a nugget of 0.2 (both dimensionless, as is usual for a
relative variogram).

Figure 6 shows results of ordinary block kriging of the entire data set
using the above parameters, and a minimum of 3 and a maximum of 16
samples. The map suggests that almost all the areas covered by drill
holes are contaminated at levels exceeding 3.0 ppm, although examination
of the observational data revealed that only 43.5% of the drill holes
exceed this value. Considerable bias toward over-estimation has
apparently occurred (Table 1). Figure 7 shows the results observed by
cross-validation of kriged estimates and sampled values. Cross
validation allows testing of the estimation method at locations of
existing samples. The sample value at a particular location is
temporarily discarded from the sample data set; the value at the same
location is then estimated using the remaining samples. The procedure is
repeated for all available samples (Isaaks and Srivastava 1989). On
Figure 7 (and also in Figures 9, 11, and 13), a circle enclosing a
plus-sign represents locations where the sample value is below 3.0 ppm,
yet the kriged estimate is greater than 3.0 ppm. Thirty-five percent of
the locations (25 of the 72 locations that had at least 3 samples
within the search window) were estimated as contaminated (over 3.0 ppm)
when, in reality, the sampled value was below 3.0 ppm. These results
are summarized in Table 1.
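The leave-one-out loop described above can be sketched in a few lines. Inverse-distance weighting stands in for kriging here so the sketch stays self-contained; the sample coordinates and values are made up, and distinct locations are assumed:

```python
def idw_estimate(x, y, samples, power=2.0):
    """Inverse-distance-weighted estimate at (x, y) from (sx, sy, value)
    samples; a stand-in for the kriging estimator in the paper."""
    num = den = 0.0
    for sx, sy, sv in samples:
        d2 = (x - sx) ** 2 + (y - sy) ** 2   # assumes no exact duplicates
        w = 1.0 / d2 ** (power / 2.0)
        num += w * sv
        den += w
    return num / den

def cross_validate(samples):
    """Leave-one-out: re-estimate each sample from all the others."""
    estimates = []
    for i, (x, y, _) in enumerate(samples):
        others = samples[:i] + samples[i + 1:]
        estimates.append(idw_estimate(x, y, others))
    return estimates
```

Comparing each returned estimate against its held-out sample value relative to the 3.0 ppm threshold yields the false-positive and false-negative counts of Table 1.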

Figure 4. Contour map of the pairwise relative variogram surface for
the TCE data set for the entire site.

Such over-estimation has important consequences; the kriging suggested
that 86% of the area may be considered contaminated while only 43.5% of
the samples showed such contamination. If the kriged values are
accepted as correct, substantial remediation costs can be expected.

Figure 7 also shows that 5 sample locations were estimated as not
contaminated (under 3.0 ppm) when, in reality, the sampled value was
above 3.0 ppm.
Figure 5. Eight directional experimental relative variograms and the
fitted model for the data set for the entire site.

Ordinary Kriging Using Data by Subregions

In order to produce a data set that can be viewed as an outcome of a
second-order stationary model for kriging purposes, the entire data
set was partitioned according to the six subregions described
previously. Ordinary kriging procedures were applied to these
individual subregions using the global variogram model given earlier.
Figure 8 shows the results of this process, while Figure 9 shows the
cross-validation plot of these results. By comparison with Figure 7, it
can be seen that the overestimation bias has been somewhat reduced, yet
21 (29%) sample locations remain over-estimated (Table 1).
Figure 6. Map showing results of ordinary kriging using the data set for
the entire site.

Figure 7. Map showing cross-validations for ordinary kriging applied to
the data set for the entire site.
Figure 8. Map showing results of ordinary kriging applied to the data
sets for the subregions.

Figure 9. Map showing cross-validations for ordinary kriging applied to
the data sets for the subregions.
Indicator Kriging Using Data from Entire Site

To further explore applicability of estimators based on second-order
stationary models for the data, indicator kriging was used to analyze
the entire data set. The TCE data values were transformed into 0 and 1
values, depending on their values relative to the 3.0 ppm TCE threshold
(the median of the sample values). Directional variograms were produced.
Indicator kriging was then applied to produce a map of probabilities of
any location exceeding the 3.0 ppm threshold (Figure 10). Figure 11
shows the cross-validation plot of these probability estimates against
the actual occurrence of sample values greater than 3.0 ppm (using 50%
or greater probabilities). There are 19 locations (26%) with false
positives and 17 locations (24%) with false negatives (Table 1). The
bias toward over-estimation has been further reduced, but additional
reduction in the numbers of false positive and false negative locations
appeared desirable.

Indicator Kriging Using Data by Subregions

The indicator kriging process was then independently applied to the
individual subregional data sets. Figure 12 shows the estimated
probability of exceeding 3 ppm in each block. Figure 13 shows the
cross-validation plot for these resulting probability estimates against
the actual occurrence of sample values greater than 3.0 ppm. A further
reduction in the degree of overestimation of the area of contamination
exceeding the 3.0 ppm threshold is evident. There are 13 false
positives (18%) and 12 false negatives (17%) (see also Table 1).

CONCLUSIONS

Mis-interpretation of the extent and degree of contamination at a site
is likely to occur when traditional kriging is applied to a sample data
set that does not reflect the geological complexity. This result is
likely because:

• Kriging should not be applied to data sets having a coefficient of
variation greater than 1.0, since a few high concentration samples
in such skewed data sets make the model assumptions inappropriate for
the data at hand, resulting in biased estimation of local averages.

• One of the important assumptions of geostatistics is second-order
stationarity. In order to be able to apply kriging, a given data set
must combine samples so that they can be conceptualized as an outcome
of a second-order stationary random function. This means that the
data being processed by geostatistical kriging should come from a
single consistent population. Only data from similar contaminant
sources and geologic environments are likely to satisfy the
stationarity assumption of the model.
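The coefficient-of-variation screen in the first point above is easy to compute. A short check with illustrative numbers (not the site data):

```python
import math

def coef_of_variation(data):
    """Standard deviation divided by the mean (population form)."""
    m = sum(data) / len(data)
    var = sum((x - m) ** 2 for x in data) / len(data)
    return math.sqrt(var) / m

skewed = [1, 1, 2, 2, 3, 500]              # a few extreme highs
print(coef_of_variation(skewed) > 1.0)     # True: ordinary kriging suspect
```

A CV above 1.0 flags the kind of strongly skewed distribution for which, by the authors' rule of thumb, ordinary kriging of raw values is inadvisable.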

For the example site discussed in this paper, the coefficient of
variation for the bedrock TCE data is approximately 3.96, much greater
than the limit of 1.0 defined above. At this site, because the TCE
contamination appears to come from multiple point sources and to be
Figure 10. Map showing estimated probability of threshold exceedance
by indicator kriging applied to the data set for the entire site.


Figure 11. Map showing cross-validation of indicator kriging applied to
the data set for the entire site.
Figure 12. Map showing estimated probability of threshold exceedance
by indicator kriging applied to the data sets for the subregions.

Figure 13. Map showing cross-validations for indicator kriging applied
to the data sets for the subregions.

controlled by different ground-water flow regimes, the assumption of
stationarity was not satisfied. Hence, application of the ordinary
kriging technique to the entire site without subdivisions gave biased
and erroneously high estimates of local TCE values, "smearing" high TCE
values into locations where they do not actually occur. Such a
"smearing" effect presents a false impression of widespread TCE
contamination throughout the site, and suggests the presence of a large
contaminant plume. Cross validation analysis provides a means for
assessing the degree of bias and therefore the appropriateness of the
kriged estimates at existing sample locations.

Indicator kriging was used to analyze bedrock TCE contamination data
with a threshold limit of 3.0 ppm (the median of the data values). This
method indicates that many areas within the site have low probabilities
of being contaminated with TCE above this very low threshold level.
Application of indicator kriging at higher threshold levels will define
even more restricted areas as having significant probabilities for
higher levels of TCE contamination. Analysis of the entire data set by
indicator kriging procedures still resulted in slightly biased
estimation; better results were obtained when indicator kriging was
applied to subregional data sets. These results are summarized in Table
1.

Indicator kriging is thus proposed as an appropriate method for
developing realistic estimates of contamination levels at many
geologically complex sites. It provides a mechanism for substantially
meeting the underlying assumptions of stationarity in the model. Coupled
with a complete conceptualization of the geological and hydrological
framework for the site, optimal estimates may be achieved by applying
indicator kriging methods to subregional data sets that reflect geologic
controls. This approach will identify the locations of
misclassification bias, both with respect to overestimation and
underestimation, and provide a more accurate assessment of
contamination limits.

REFERENCES

Journel, A.G., and Huijbregts, C.J., 1975, Mining Geostatistics,
Academic Press, New York, NY.

Isaaks, E., and Srivastava, R., 1989, An Introduction to Applied
Geostatistics, Oxford University Press, New York, NY.

Journel, A.G., 1983, "Non-Parametric Estimation of Spatial
Distribution," Mathematical Geology, Vol. 15, No. 3, pp. 445-468.

Isaaks, E., 1984, "Risk Qualified Mappings for Hazardous Waste Sites: A
Case Study in Distribution-Free Geostatistics," Master's thesis,
Stanford University, CA.

Deutsch, C.V., and Journel, A.G., 1992, GSLIB: Geostatistical Software
Library and User's Guide, Oxford University Press, New York.
Daniela Leonte1 and Neil Schofield2

EVALUATION OF A SOIL CONTAMINATED SITE AND CLEAN-UP CRITERIA:
A GEOSTATISTICAL APPROACH

REFERENCE: Leonte, D. and Schofield, N., "Evaluation of a Soil
Contaminated Site and Clean-up Criteria: A Geostatistical Approach,"
Geostatistics for Environmental and Geotechnical Applications, ASTM STP
1283, R. Mohan Srivastava, Shahrokh Rouhani, Marc V. Cromer, A. Ivan
Johnson, Alexander J. Desbarats, Eds., American Society for Testing and
Materials, 1996.

ABSTRACT: A case study of soil contamination assessment and clean-up
in a site proposed for residential development is presented in this
paper. The contamination consists mainly of heavy metals, of which lead
is the most important contaminant. The site has been sampled on an
approximately 25 m x 25 m grid to between 1 and 3 meters depth
to evaluate the extent of the contamination. Three hotspots were
identified based on eyeballing the lead sample values and a crude
contouring. A geostatistical approach is proposed to map the lead
contamination and provide an alternate evaluation. The results suggest a
significantly different evaluation of the area for clean-up based on the
probability of the lead concentration exceeding allowable levels.

KEYWORDS: soil contamination, hotspots, thresholds, geostatistics,


indicator kriging, conditional probability

INTRODUCTION

The issue of contaminated land has only recently become of
importance in Australia, although chemical contamination of land and
groundwater has a long history, going back to the first years of
European settlement. The actual extent of the problem is yet to be
accurately determined, with some predictions placing the number of

lSenior Environmental Scientist, McLaren Hart Environmental


Engineering (Australia), 54 Waterloo Road, North Ryde, NSW 2113.
2Manager, FSSI Consultants (Australia) Pty. Ltd., Suite 6, 3
Trelawney Street, Eastwood NSW 2122.


contaminated sites around 10 000 [1]. Much of the regulatory framework
dealing with the management of contaminated sites has been developed
during the last decade. The Australian and New Zealand Guidelines for
the Assessment and Management of Contaminated Sites, prepared jointly by
the Australian and New Zealand Environment and Conservation Council
(ANZECC) and the National Health and Medical Research Council (NHMRC),
were released in January 1992 [2]. This document provides the unified
framework within which individual States are developing their own
legislation and guidance. Pollution control requirements are
administered directly by the Environment Protection Authorities in New
South Wales, Western Australia and Victoria, by the Department of
Environment and Planning in South Australia, and by the Departments of
Environment in Queensland and Tasmania.
Specifically for the soil medium, the present lack of a unified
legislative approach results from a combination of factors, the most
important being:

1. More than twenty different soil profiles exist in Australia,
including many where there is a sharp distinction between various
horizons; as a result the natural levels and range of chemical
components vary significantly throughout the country.
2. A myriad of different plant and animal species are unique to
this continent [3].
3. The value of land is still driven by commercial rather than
environmental factors.

Consequently, criteria-based standards which involve predetermined
clean-up levels are not entirely favoured by either the public or
various regulatory bodies.
The ANZECC/NHMRC document, recognising the need for flexibility,
concluded that "the most appropriate approach for Australia is to adopt
a combination of the site-specific and criteria-based standards". This
methodology incorporates, at a national level, a general set of
management principles and soil quality guidelines which guide site
assessment and clean-up action, obviating, where appropriate, the need
to develop costly site-specific criteria. However, this approach also
recognises that "every site is different" and that "in many cases, site-
specific acceptance criteria and clean-up technologies will need to be
developed to reflect local conditions".
As a result, the national guidelines provide a set of criteria to
assist in judging whether investigation of a site is necessary. Soil
quality guidelines, based on a review of overseas information and
Australian conditions, give investigation threshold criteria for human
health (Table 1) and the environment (Table 2). Levels refer to the
total concentrations of contaminants in dry soil, and have been defined
from a small number of investigations in both urban and rural areas.
Background criteria pertaining to the level of natural occurrence for
various chemical components in soils are also specified in the
guidelines.
Site data with levels less than the criteria indicate that the
quality of soil may be considered as acceptable irrespective of land
use, and that no further contamination assessment of the site is
required. In cases where the contaminant concentration exceeds the
LEONTE AND SCHOFIELD ON CONTAMINATED SITE 135
criteria, a contamination problem may exist at the site and requires
further assessment. As the guidelines do acknowledge that Table 2 is
"conservative and has been set for the protection of groundwater", most
state environmental regulatory agencies use these levels as a starting
point for further investigation, and determination of clean-up levels
specifically for each site.

TABLE 1--Proposed Health Investigation Guidelines [2].

Substance            Health Level, mg/kg

Lead 300
Arsenic (total) 100
Cadmium 20
Benzo(a)pyrene 1


TABLE 2--Proposed Environmental Soil Quality Guidelines [2].
Substance       Background      Env. investigation

Antimony        4 - 44          20
Arsenic         0.2 - 30        20
Barium          20 - 200
Cadmium         0.04 - 2        3
Chromium        0.5 - 110       50
Cobalt          2 - 170
Copper          1 - 190         60
Lead            <2 - 200        300
Mercury         0.001 - 0.1     1
Molybdenum      < 1 - 20
Nickel          2 - 400         60
Zinc            2 - 180         200
These criteria have been applied to the geostatistical case study
discussed below.

THE DATA SET

Site Description

The site considered in this study is an almost rectangular parcel
of land of some 70 000 m2, having a general flattish topography with a
slight fall to the centre. Its entire history is not well recorded and
the site is only known to have been occupied by a brewery from 1885 to
1910. The site is now vacant with all buildings having been removed
between 1984 and 1986. The majority of the land in the region was used
by timber merchants for milling of timber from 1928 to 1980. It is

also known that the whole area, being a long strip of land along a
former wharf, is heavily contaminated. The contamination is associated
with old practices of dumping both domestic and industrial residues,
in times when legislation controlling waste disposal in Australia was
nonexistent and mudflat "reclamation" practices of this manner were
actually encouraged. These residues are known, from other nearby areas,
to be of both Australian and overseas sources.
A development proposal to use the site for a medium density
residential development comprising some 200 residential units and a
retirement village complex initiated a site assessment as an initial
evaluation of the potential for soil contamination.
The site was sampled by taking 54 mm diameter continuous cores
from boreholes located on a grid of approximately 25 m x 25 m. The
boreholes were sampled every 500 mm to a depth of 3 m and submitted for
chemical analysis. The 1 000 - 1 500 mm and 1 500 - 2 000 mm layers
were alternated between boreholes. The depth to which samples were
taken was based on the depth to groundwater and natural soil, which was
recorded on the borelogs. Samples were taken by splitting the core
down the middle. The remaining core was retained for reference and in
the cases of "hotspots", was used for further testing.

Sampling of Hotspots

All samples collected initially were analysed for a suite of
parameters which included pH, Cadmium (Cd), Chromium (Cr), Copper (Cu),
Nickel (Ni), Zinc (Zn), Lead (Pb), Mercury (Hg), Sulphur (S), Arsenic
(As), Cobalt (Co), phenols, cyanides and total hydrocarbons. The
chemical analysis revealed certain boreholes where concentrations of
heavy metals, of which lead was the most important, were significantly
higher than the global mean for the site. These groups of holes
defined hotspots which were investigated in detail to map the extent of
the contamination. Four additional boreholes were drilled around each
hotspot at approximately 12.5 m spacing.

Discussion of Chemical Analyses

Chemical analysis was carried out on a total of 378 samples.
Elevated levels of lead were found in approximately 150 samples.
Calculation of the arithmetic mean across the site for various
contaminants, showed that in the case of lead, the mean was 572 ppm,
with a high standard deviation which reflects the highly variable
nature of the contamination. In hotspot areas, lead levels as high as
3.7% (37 000 ppm) were measured. Analysis of the lead levels across
the site and down the individual boreholes showed the following
arithmetic mean and standard deviation values (in ppm) for the number
of data points (n) in each layer:

                            Mean      St. deviation    n

layer 1 (0 - 500 mm)        963.96    4 269.92         126
layer 2 (500 - 1 000 mm)    589.55    1 876.59         126
layer 3 (1 000 - 1 500 mm)  387.44    746.76           88
layer 4 (1 500 - 2 000 mm)  83.16     148.34           52
layer 5 (2 000 - 2 500 mm)  11.00     8.32             18
These statistics suggest a decrease in contamination
concentration with depth, especially below 1.5 m. The number of
samples in each layer also decreases with depth.

Findings of the Initial Site Assessment

Following chemical analysis of lead concentrations, results were
used to delineate the hotspots by considering the midpoint between two
nearby samples which generally followed the "rule" of one sample
showing a lead concentration above the acceptable limit and the other
one below this limit. However, this rule has not been obeyed for all
sampling points.
Specifically in bore 109 (see Figure 2), lead levels of 2 450 ppm
in the first 500 mm and 1 640 ppm in the 500 - 1 000 mm soil depth
intervals were considered as being isolated, and mixing during the
excavation of soil was recommended as a clean-up method. However,
bores 76 and 75, located to the south and north of bore 109, indicated
a much lower concentration in layer 1 and a higher concentration in
layer 2, as shown below:

                     B 76         B 75

0 - 500 mm           470 ppm      14 ppm
500 - 1 000 mm       2 650 ppm    3 150 ppm
1 000 - 1 500 mm     180 ppm      not sampled
Furthermore, Figure 2 shows that 2 of the hotspots on site are
located immediately to the east of bores 76 and 75.
These details indicate that the extent of contamination in the
north could be larger than that estimated in the first site assessment
study.
A similar situation is encountered in the south-east corner of
the site, where boreholes on easting 295 m and 320 m showed very high
lead concentrations at one or several depth levels. Because high
concentrations did not appear clustered on all levels, it was concluded
in the earlier study that no contamination existed there.

GEOSTATISTICAL ANALYSIS OF LEAD DATA

A file of the lead data was created to be used for a
geostatistical study. The analysis was restricted to samples from the
first three layers, where sampling has been relatively uniform.

Statistical Analysis of Lead Data

Figure 1 shows the declustered histogram of lead concentration in
some 335 samples from the boreholes. The data have been declustered
using the method of Schofield 1992 [7] to account for the clustering of
sampling in the hotspots.
Lead shows a strongly positively skewed histogram with a very
high coefficient of variation related to the presence of a few extreme
values in excess of 10 000 ppm. The mean of lead is 793 ppm, well in
excess of the third quartile and over twice the limit of 300 ppm above
which investigation is recommended.

138 GEOSTATISTICAL APPLICATIONS


For lead concentrations between 100 and 1 200 ppm, the histogram
may be well fitted with a lognormal distribution model. Table 3
compares the cumulative probabilities for the declustered data and for
a lognormal distribution with the same mean and log variance.

[Figure 1 summary statistics: # data: 323; Mean: 693.8;
Variance: 8 190 539; Coef. Var: 4.12; Minimum: 4.0; 1st Quart.: 20.0;
Median: 105.0; 3rd Quart.: 430.0; Maximum: 40 800.0; MAD: 665.5;
x-axis: lead ppm, 0 - 1 000.]

Figure 1--Histogram of declustered lead concentrations in 500 mm
samples.

Table 3--Comparison of the cumulative histogram of the declustered data
and a lognormal distribution with the same mean and log variance.

Lead, ppm    Cumulative Prob.      Cumulative Prob.
             Lognormal Distn.      Declustered Data

  100             0.48                  0.49
  300             0.70                  0.68
  500             0.78                  0.78
  700             0.83                  0.84
  900             0.86                  0.86
1 200             0.89                  0.90
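The lognormal comparison in Table 3 can be sketched with the standard normal error function. The parameters below are illustrative assumptions only (a log-mean of ln(105), matching the reported median, and an assumed log-std of 2.0); the paper's fitted log mean and log variance are not fully reported here.

```python
import math

def lognormal_cdf(x, mu, sigma):
    # P(Z <= x) for a lognormal Z: Phi((ln x - mu) / sigma), via the error function
    return 0.5 * (1.0 + math.erf((math.log(x) - mu) / (sigma * math.sqrt(2.0))))

# Assumed parameters for illustration, not the fitted site values:
mu, sigma = math.log(105.0), 2.0
for z in (100, 300, 500, 700, 900, 1200):
    print(z, round(lognormal_cdf(z, mu, sigma), 2))
```

With these assumed parameters the computed probabilities happen to track the Table 3 column closely, which illustrates how such a comparison is made.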

Spatial Distribution of Lead and Local Hotspots

Figure 2 presents a contour map of lead concentration generated
from a moving average of the lead sample values. The moving average
method is described by Isaaks and Srivastava 1989 [5] as a useful tool
for identifying local anomalies in the variability of data values. The
map also shows the locations of the boreholes and the three hotspots
shaded in grey that were previously identified for clean-up. The
contours show three or possibly four areas of significant lead
contamination. The contours do not show any preferred directional
structure to the lead contamination, but the lead concentration in the
most easterly area is significantly higher than that in the other
areas. The five samples with lead concentrations above 10 000 ppm all
occur in the most eastern hotspot.
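A moving average surface of the kind shown in Figure 2 can be sketched as a simple moving-window mean over scattered samples. The coordinates and values below are hypothetical, not the site data, and the window shape (circular, fixed radius) is an assumption.

```python
def moving_average(samples, x0, y0, radius):
    """Average all sample values within `radius` of the grid node (x0, y0)."""
    vals = [v for (x, y, v) in samples
            if (x - x0) ** 2 + (y - y0) ** 2 <= radius ** 2]
    return sum(vals) / len(vals) if vals else None

# Hypothetical (easting, northing, lead ppm) samples:
samples = [(10.0, 10.0, 400.0), (15.0, 12.0, 600.0), (80.0, 80.0, 50.0)]
print(moving_average(samples, 12.0, 11.0, 10.0))  # averages the two nearby samples
```

Evaluating this at every node of a regular grid and contouring the result gives a map of local average concentration.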

Global Lognormal Model for Lead

Swane et al. [9] have suggested that some regulatory agencies in
Australia may favour defining acceptable clean-up criteria in terms of
the 75th percentile of a lognormal model. This means that if the 75th
percentile of a lognormal distribution model applied to the global
histogram of lead concentrations is higher than the recommended limit
of 300 ppm lead, further investigation and possible clean-up would be
recommended. For the present lead data set, the 75th percentile of a
lognormal model is 470 ppm. By removing the five samples with the
highest lead values above 10 000 ppm, the 75th percentile is reduced to
296 ppm which may indicate that an acceptable site clean-up has been
achieved.
Therefore the global lognormal model would only require
remediation of small areas of extreme contamination in order to reduce
the global level of contamination and satisfy the acceptance threshold
of 300 ppm lead for the site.
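The 75th percentile of a lognormal model follows directly from its log mean and log standard deviation. The parameters below are assumed purely for illustration (they are not the values fitted in the paper):

```python
import math

Z75 = 0.6744897501960817  # 75th-percentile deviate of the standard normal

def lognormal_p75(mu, sigma):
    # 75th percentile of a lognormal model: exp(mu + z_0.75 * sigma)
    return math.exp(mu + Z75 * sigma)

# Assumed log-mean ln(105) and log-std 2.0, for illustration only:
print(round(lognormal_p75(math.log(105.0), 2.0), 1))
```

Removing a handful of extreme samples lowers both the fitted log mean and log variance, which is why the site's modelled 75th percentile drops from 470 ppm to 296 ppm in the text.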

Variogram Analysis of Lead


The spatial continuity of lead concentration in the soil has been
characterised by a set of omni-directional indicator variograms for a
range of relevant indicator thresholds. The actual spatial continuity
measure used is the correlogram of Srivastava and Parker, 1988 [8].
Analysis of directional variograms did not indicate any preferred
orientation to the lead contamination. There is no reason to believe
that some structured pattern of dumping lead contaminated waste would
have been used at the site.
Figure 3 shows a plot of the omni-directional horizontal
variograms. The limited vertical extent of sampling does not allow
reasonable inference of the vertical continuity of lead contamination.
The horizontal variograms indicate an omni-directional structure at all
thresholds and an increasing nugget with increasing threshold. In all
cases, the sample variograms have been reasonably fitted with a nugget
and a single exponential model.
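A nugget-plus-exponential variogram model of this kind can be evaluated as below. This uses one common parameterisation (the structure reaches about 95% of its sill at the practical range); the paper does not state which range convention was used, so treat the convention and the sample parameters as assumptions.

```python
import math

def exp_variogram(h, c0, c1, a):
    # nugget c0 plus an exponential structure reaching ~95% of c1 at range a
    return 0.0 if h == 0 else c0 + c1 * (1.0 - math.exp(-3.0 * h / a))

# Parameters in the style of the 300 ppm indicator fit in Figure 3
# (nugget 0.11, sill contribution 0.88, range 61 m):
print(round(exp_variogram(61.0, 0.11, 0.88, 61.0), 3))
```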

Indicator Kriging Model

The Indicator Kriging approach was used to map locally the
probability that the lead contamination in soil exceeds certain
threshold concentrations, some of which are used as clean-up criteria.
The approach follows that of Isaaks 1984 [4] in mapping lead
concentration in soil around a lead smelter in Dallas, Texas, and that
discussed by Journel 1988 [6].
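The first step of such an approach is the indicator transform of the samples at each threshold. A minimal sketch is below; note that classical indicator kriging often codes 1{Z <= z}, while the exceedance coding used here matches the exceedance probabilities mapped in this paper. The sample values are illustrative.

```python
def indicator(values, threshold):
    # 1 where the sample exceeds the threshold, 0 otherwise; kriging these
    # indicators estimates Prob{Z > threshold} at unsampled locations
    return [1 if v > threshold else 0 for v in values]

# Illustrative lead values (e.g. in the style of bores 76 and 75):
lead_ppm = [470.0, 14.0, 2650.0, 3150.0, 180.0]
print(indicator(lead_ppm, 300.0))
```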

[Map axes: Easting (m) vs. Northing (m), 20 - 320 and 20 - 220.]

Figure 2--Contour map of lead concentrations in the first soil layer
(0 - 500 mm) showing borehole and hotspot locations in grey shading.

[Figure 3 fitted model parameters (x-axis: lag distance h, 0 - 180.7) --
Lead variogram: C0: 0.11 nugget; C1: 0.91 exponential, range 52.
Indicator 300 ppm: C0: 0.11 nugget; C1: 0.88 exponential, range 61.
Indicator 900 ppm: C0: 0.51 nugget; C1: 0.51 exponential, range 72.]

Figure 3--Omni-directional horizontal sample variograms for lead and
for several indicator thresholds.

Estimates of the local conditional probability for the lead
concentration to exceed given thresholds were made using indicator
point kriging on a 10 meter square grid. Contours of the probability
of exceedance for the 300 ppm lead threshold for layer 1 and layer 2
are shown in Figure 4. The previously identified hotspots of high lead
contamination are also shown with grey shading. The maps indicate

areas with a probability of at least 70 percent that the lead
concentration in samples exceeds 300 ppm. The hotspots identified for
clean-up lie close to the contaminated centres of two of these areas.
However, a large area of high lead contamination in the south-eastern
part of the site (20 m x 120 m northing and 270 m x 320 m easting) has
been ignored completely, most likely because remediation of the
previously identified hotspots would ensure compliance under the 75th
percentile of a global lognormal criterion for lead at the site.

[Maps: Layer 1, Pr (Pb > 300 ppm) and Layer 2, Pr (Pb > 300 ppm);
Easting (m) 20 - 320, Northing (m) 20 - 220.]

Figure 4--Contours of the conditional probability for lead
concentration to exceed the recommended level of 300 ppm. Hotspots are
shown by grey shading.
Figure 5 presents contour maps of the estimated probability for
lead concentration to exceed 500 ppm in soils for layers 1 and 2. On
these maps, the areas with very low probability of contamination are
clearly shown. The areas with potentially high contamination are also
clear, with the southern area (20 m x 120 m northing and 270 m x 320 m
easting) again standing out as unrecognised by the previous
investigation.

[Maps: Layer 1, Pr (Pb > 500 ppm) and Layer 2, Pr (Pb > 500 ppm);
Easting (m) 20 - 320, Northing (m) 20 - 220.]

Figure 5--Contours of the conditional probability for lead
concentration to exceed the recommended level of 500 ppm. Hotspots are
shown by grey shading.

CONCLUSIONS

The recommendation of the ANZECC/NHMRC document for the use of
both criteria-based and site-specific standards to assess soil
contamination and clean-up is supported by the authors of this paper.
The use of a universal or blanket standard for assessment of all sites
appears inappropriate. This conclusion is supported by the outcome of
applying the 75th percentile of a lognormal model criterion to the site
in question in this paper. The cleaning of small areas of extreme
contamination may often reduce the global level of contamination below
some acceptance threshold. However, large areas carrying a significant
risk of contamination above the acceptance threshold may go
unrecognised and uncleaned.
The application of geostatistical methods to analyse and model
the lead contamination at this site appears appropriate. The dumping
of lead contaminated material at the site does not seem to have been
highly organised, introducing considerable uncertainty as to the exact
location of the contamination. Subsequent migration of the lead in
soil due to natural processes has likely modified the spatial
distribution of lead, introducing greater complexity and uncertainty
into its spatial geometry.
Indicator kriging has enabled a mapping of the lead contamination
at a local scale which permits an assessment of the risk associated
with certain levels of contamination. When compared to previous
attempts to identify areas of significant contamination (hotspots), the
IK mapping indicates much larger areas associated with those hotspots
where the risk of contamination is high. In addition, a large area of
significant contamination which had previously gone unrecognised due to
a naive decision rule has been identified through geostatistical
analysis.
Although other techniques would have enabled estimation of the
global lead contamination at the site by accounting for its specific
directions of spatial continuity, the IK tool uniquely introduces the
risk factor through the quantification of the uncertainty associated
with the estimation process. Making decisions on the extent and nature
of remedial action therefore becomes a more informed process in which
clean-up cost and the associated potential liability can be evaluated.
,
'"
,,'
REFERENCES

[1] M.G. Knight. "Scale of the hazardous waste problem in Australia
and disposal practice," Symposium on Environmental Geotechnics
and Problematic Soils and Rocks, Bangkok: Asian Institute of
Technology, South-east Asian Geomechanics Society, 1985.
[2] Australian and New Zealand Environment and Conservation Council,
and National Health and Medical Research Council (ANZECC/NHMRC).
Australian and New Zealand Guidelines for the Assessment and
Management of Contaminated Sites, January 1992.
[3] J. Daffern, C.M. Gerard and R. McFarland. "Regulatory and non-
regulatory control of contaminated sites," Geotechnical
Management of Waste and Contamination, Fell, Phillips and Gerrard
(editors), Balkema, Rotterdam, 1993.
[4] E.H. Isaaks. Risk Qualified Mappings for Hazardous Waste Sites: A
Case Study in Distribution Free Geostatistics. Master's thesis,
Stanford University, 1984.
[5] E.H. Isaaks and R.M. Srivastava. An Introduction to Applied
Geostatistics. Oxford University Press, 1989.
[6] A.G. Journel. "Non-parametric geostatistics for risk and
additional sampling assessment," Principles of Environmental
Sampling, L. Keith (ed.), American Chemical Society, 1988.
[7] N.A. Schofield. "Using the entropy statistic to infer population
parameters from spatially clustered sampling," Proceedings of
the 4th International Geostatistical Congress, Troia 92, pages
109-120, Kluwer, Holland, 1992.
[8] R.M. Srivastava and H. Parker. "Robust measures of spatial
continuity," M. Armstrong (ed.), Third International
Geostatistics Congress, D. Reidel, Dordrecht, Holland, 1988.
[9] M. Swane, I.C. Dunbavan and P. Riddell. "Remediation of
contaminated sites in Australia," Fell, Phillips and Gerrard
(editors), Geotechnical Management of Waste and Contamination,
Balkema, Rotterdam, 1993, pp. 127-141.
Amilcar O. Soares1, Pedro J. Patinha2, Maria J. Pereira

STOCHASTIC SIMULATION OF SPACE-TIME SERIES: APPLICATION TO A RIVER
WATER QUALITY MODELLING

REFERENCE: Soares, A. O., Patinha, P. J., Pereira, M. J., "Stochastic Simulation of
Space-Time Series: Application to a River Water Quality Modelling," Geostatistics
for Environmental and Geotechnical Applications, ASTM STP 1283, R. Mohan Srivastava,
Shahrokh Rouhani, Marc V. Cromer, A. Ivan Johnson, Alexander J. Desbarats, Eds., Amer-
ican Society for Testing and Materials, 1996.

ABSTRACT: This study aims to develop a methodology to simulate the joint behaviour,
in space and time, of some water quality indicators of a river, resulting from a mine
effluent discharge, in order to enable the prediction of extreme scenarios for the entire
system. Considering one pollutant characteristic measured in N monitoring stations along
the time T, a random function X(e,t), e=1,...,N, t=1,...,T, can be defined. The proposed
methodology, a data driven approach, intends to simulate the realisation of a variable
located in station e in time t, based on values located before e and t, and using a sequential
algorithm. To simulate one value from the cumulative distribution F{X(e,t) | x(e-1,t), ...,
x(1,t), x(e,t-1), ..., x(e,1)}, the basic idea of the proposed methodology is to replace the
conditioning values by a linear combination of those:

   [x(e,t)]* = Σ_{u=1}^{e-1} a_u x(u,t) + Σ_{τ=1}^{t-1} b_τ x(e,τ)

which allows the values to be drawn sequentially

from bidistributions. The final simulated time series of pH and dissolved oxygen
reproduce the basic statistics and the experimental time and spatial covariances calculated
from historical data recorded over 15 months at a selected number of monitoring stations
on a river with an effluent discharge of a copper mine located in the south of Portugal.

KEYWORDS: stochastic simulation, space-time series, water quality

1 Professor, CVRM - Centro de Valorização de Recursos Minerais, Instituto Superior Tecnico, Av.
Rovisco Pais, 1096 Lisboa Codex, Portugal.
2 Research Fellow, CVRM - Centro de Valorização de Recursos Minerais, Instituto Superior
Tecnico, Av. Rovisco Pais, 1096 Lisboa Codex, Portugal.

146
SOARES ET AL. ON SPACE-TIME SERIES 147

RIVER WATER QUALITY MODELLING

The main objective of this study is to develop a methodology to simulate the
behaviour, in space and time, of some water quality indicators of a river resulting from a
mine effluent discharge, in order to enable the prediction of extreme scenarios for the
entire system. The first results of the model application, in a case study of a mine located
in the south of Portugal, will be presented.
The time series of pH and dissolved oxygen were simulated, using historical data
recorded over time at a selected number of monitoring stations on a river at the Neves
Corvo mine. The occurrence of extreme situations in the joint simulated time series -
simultaneous high spikes of the parameters and the durability in time and space of one
extreme situation - can be visualised.

MODELLING METHODOLOGY: STOCHASTIC SIMULATION OF SPACE-TIME SERIES

The proposed methodology is a data driven approach of sequential simulation of a
random vector. Defining N dependent random variables X1, X2, ..., XN, the simulation of
the cdf F(X1, X2, ..., XN) can be generated by a conditional distribution approach (Law and
Kelton 1982, Johnson 1987, Ripley 1987), involving the following sequential procedure:
· draw the first value x1 from the marginal distribution F(X1) of the RV X1
· draw the second value x2 from the distribution of RV X2 conditioned on X1=x1:
F(X2 | X1=x1)
· draw the Nth value from the conditional distribution: F(XN | X1=x1, X2=x2, ..., XN-1=xN-1)
The random variables X1, X2, ..., XN could be the same attribute of water quality
measured in different time and spatial locations. The main limitation of the practical
implementation of this algorithm is the calculation of all these conditional distributions,
considering that, in most applications of earth and environmental sciences, just one
realisation of X1, X2, ..., XN is available.
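The sequential decomposition above can be sketched for the tractable Gaussian case, where each conditional distribution is known in closed form. This is an illustrative stand-in (a Gaussian AR(1)-style chain), not the paper's method:

```python
import random

def sequential_gaussian_chain(n, rho, seed=0):
    # Draw x1 from the marginal N(0, 1), then each subsequent value from the
    # conditional N(rho * previous, 1 - rho^2): the sequential decomposition of
    # a joint distribution into a chain of conditional draws.
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0)]
    for _ in range(n - 1):
        xs.append(rng.gauss(rho * xs[-1], (1.0 - rho ** 2) ** 0.5))
    return xs

print(len(sequential_gaussian_chain(100, 0.8)))
```

In general the conditional distributions are not available in closed form, which is exactly the limitation the paper's bidistribution approximation addresses.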
Journel and Gomez Hernandez (1989) presented one solution for spatial
realisations of F(X1, X2, ..., XN), in the geostatistical framework, using the indicator
formalism (Sequential Indicator Simulation - SIS) or the multigaussian approach
(Sequential Gaussian Simulation - SGS) (Deutsch and Journel 1992).
Both methods assume spatial stationarity: the stationarity of the indicator
random function in the SIS method or the stationarity of the gaussian transform function in
SGS. Some cases, like a pollutant along a river coming from one point source, cannot
be considered spatially stationary. As a matter of fact, in the case study presented here,
some sharp drops in the pollutant concentration are found between two monitoring
stations. Thus, the application of the mentioned methodologies is not straightforward.
Considering the pollutant characteristic measured in N monitoring stations over
time T, the random function X(e,t), e=1,...,N, t=1,...,T, can be defined. The previous
sequential procedure is written for X(e,t):

Monitoring Station 1
t=1  draw a value x1 of F(X(1,1))
t=2  draw a value x2 of F(X(1,2) | X(1,1)=x1)
...
t=T  draw a value xT of F(X(1,T) | X(1,1)=x1, ..., X(1,T-1)=xT-1)

Monitoring Station N
t=1  draw a value x1 of F(X(N,1))
t=2  draw a value x2 of F(X(N,2) | X(N,1)=x1)
...
t=T  draw a value xT of F(X(N,T) | X(N,1)=x1, ..., X(N,T-1)=xT-1)
The basic idea of the proposed methodology is to substitute the conditioning
values by a linear combination of these:

   [X(e,t)]* = Σ_{α=1}^{e-1} a_α X(α,t) + Σ_{β=1}^{t-1} b_β X(e,β)
With this approximation, to simulate one value of X in monitoring station e in
time t, instead of using the cdf F[X(e,t) | X(1,1), ..., X(e-1,t-1)], one uses the
bidistribution:

   F( X(e,t) | [X(e,t)]* = Σ_{α=1}^{e-1} a_α X(α,t) + Σ_{β=1}^{t-1} b_β X(e,β) )

Thus, for all monitoring stations we have:

Monitoring Station 1
t=1  draw a value x1 of F(X(1,1))
t=2  draw a value x2 of F(X(1,2) | [X(1,2)]* = X(1,1) = x1)
...
t=T  draw a value xT of F(X(1,T) | [X(1,T)]* = b1 X(1,1) + ... + bT-1 X(1,T-1))
     = F(X(1,T) | b1 x1 + ... + bT-1 xT-1)

Monitoring Station N
t=1  draw a value x1 of F(X(N,1))
t=2  draw a value x2 of F(X(N,2) | [X(N,2)]* = X(N,1) = x1)
...
t=T  draw a value xT of F(X(N,T) | [X(N,T)]* = a1 X(1,T) + ... + bT-1 X(N,T-1))
     = F(X(N,T) | a1 x1 + ... + bT-1 xT-1)

Practical Implementation of the Algorithm

If the same pattern of neighbourhood values (in space and time) is used to
calculate all [X(e,t)]*, e=1,...,N, t=1,...,T, the bidistribution functions F(X(e,t) | [X(e,t)]*)
can be inferred from the historical data.
Using the same neighbourhood pattern one can calculate the pairs (X(e,t), [X(e,t)]*)
for all t=1,...,T and e=1,...,N. The bi-plots (X(e,t), [X(e,t)]*) can thus be calculated for each
monitoring station and for homogeneous periods of time (see Fig. 1). To simulate one
value for X(e,t) conditioned on the known value [X(e,t)]* = x* (calculated with the
previously simulated values X(e-1,t-1)), first we need to select all pairs which belong to
the class of [X(e,t)]* and, afterwards, one value of X(e,t) is drawn randomly from them.
Once the value xs is simulated, X(e,t)=xs will be part of the next estimators
[X(e+1,t)]* or [X(e,t+1)]*.
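The class-selection-and-draw step above can be sketched as follows. The pairs, class width, and values are hypothetical, purely for illustration:

```python
import random

def draw_from_class(pairs, x_star, width, rng):
    # keep the historical pairs (x, estimate) whose estimate falls in the class
    # centred on x_star, then draw the corresponding real value at random
    pool = [x for (x, est) in pairs if abs(est - x_star) <= width / 2.0]
    return rng.choice(pool) if pool else None

# Hypothetical (X(e,t), [X(e,t)]*) pairs, e.g. observed vs. estimated pH:
pairs = [(7.1, 7.0), (7.3, 7.2), (9.5, 9.4), (9.8, 9.6)]
print(draw_from_class(pairs, 7.1, 0.5, random.Random(1)))
```

Here only the two pairs whose estimate lies within 0.25 of 7.1 enter the pool, and one of their observed values is drawn at random.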

Fig 1. Illustrative representation of the simulation procedure from the
bi-distribution F(X(e,t) | x* = [X(e,t)]*).

STATISTICAL CHARACTERISATION OF SIMULATED TIME-SERIES

With this type of data driven approach one usually wishes to mimic the real data in
some statistics of fundamental variables, which measure their spatial and time structures.
The proposed algorithm generates a time series in each monitoring station with some
statistics identical to the historical data: means, marginal histograms and spatial and time
correlations.
Considering for each monitoring station e, the statistics of historical data:

   m(e) - mean of the time data
   Fx(x;e) = Prob{X(e,t) < x} - the marginal cdf of e
   Ce(h) = E{X(e,t) X(e,t+h)} - m(e)^2 - the time covariance in e
   C(h) = E{X(e,t) X(e+h,t)} - m(e) m(e+h) - the spatial covariance

Ce(h) and C(h) are measures of the spatial and time structure of the variable X(e,t).
If the simulation sequence starts with a limited set of conditioning values with the
marginal distribution Fx(x;e), the bidistribution F(X(e,t) | [X(e,t)]*) will generate a
simulated time series with the same mean, m'(e) = m(e), and same marginal cdf
Fx'(x;e) = Fx(x;e).
The simulated values Xs(e,t) reproduce the covariances between X(e,t) and [X(e,t)]*:

   Σ_{α=1}^{e-1} a_α Cov[X(e,t), X(α,t)] + Σ_{β=1}^{t-1} b_β Cov[X(e,t), X(e,β)]    [1]

The simulated values Xs(e,t) reproduce an average of the time and space
covariances. Consequently, the weights a_α and b_β must be chosen in such a way that the
individual covariances are represented in the final time series.

DEFINITION OF THE LINEAR COMBINATION [X(e,t)]*

To define [X(e,t)]* = Σ_{α=1}^{e-1} a_α X(α,t) + Σ_{β=1}^{t-1} b_β X(e,β) one could choose a centered
and minimum variance estimator: E{X(e,t)} = E{[X(e,t)]*} and min(var{X(e,t) - [X(e,t)]*}),
which leads to a kriging system written in terms of time and space covariances. For
simplicity's sake let us use the same notation for the weights and for the spatial and time
locations. The estimator of any location x0 is written:

   [X(x0)]* = Σ_α λ_α X(x_α)

with the kriging system:

   Σ_β λ_β C(x_α, x_β) + μ = C(x_α, x0)
   Σ_β λ_β = 1

The solution vector λ combines two distinct effects:
i) The proximity of the samples to the estimated point x0 determines their influence
in terms of weights (2nd member of the kriging system).
ii) The declustering effect of the first member of the kriging system usually leads to
underweighting of clustered samples.
Now the problem is that the samples in time series are usually clustered. Thus, due
to the declustering effect, this minimum variance estimator tends to overweight the
influence of the nearest sample and underweight the influence of the others. Consequently,
the average covariance of [1] represents mainly the short-distance structures.
To avoid this drawback, an estimator has been chosen in this study which
accounts only for the proximity effect, i.e., the weights are directly proportional to the
correlation coefficient (ρ) between any sample x_α and the estimated point x0:

   λ_α = ρ_{α,0} + (1/N) (1 - Σ_β ρ_{β,0})

Note: this is the solution of the kriging system when there is a null correlation between
the conditioning samples x_α.
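The proximity-only weights can be sketched directly; the correlations below are hypothetical. Note that spreading the leftover (1 - Σρ) evenly guarantees the weights sum to one, as required of the estimator:

```python
def proximity_weights(rhos):
    # weight = correlation with the estimated point, plus an equal share of the
    # leftover (1 - sum of correlations) so that the weights sum to one
    n = len(rhos)
    slack = (1.0 - sum(rhos)) / n
    return [r + slack for r in rhos]

w = proximity_weights([0.8, 0.5, 0.3])  # hypothetical correlations rho_alpha,0
print([round(v, 3) for v in w], round(sum(w), 6))
```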

Obviously, this is no longer a minimum variance estimator, but it assures the
representativity of all covariances (in space and time) in the simulated time series.
In summary, the estimator [X(e,t)]* = Σ_{α=1}^{e-1} a_α X(α,t) + Σ_{β=1}^{t-1} b_β X(e,β) is defined
by the weights:

   a_α = ρ_{α,e} + (1/(N·T)) [1 - (Σ_{α=1}^{e-1} ρ_{α,e} + Σ_{β=1}^{t-1} ρ_{β,e})]

   b_β = ρ_{β,e} + (1/(N·T)) [1 - (Σ_{α=1}^{e-1} ρ_{α,e} + Σ_{β=1}^{t-1} ρ_{β,e})]

Note - This estimator was chosen for this particular case to avoid the overweighting of the
small-distance structures resulting from the ordinary kriging of clustered string data. The
solution consisted of putting all covariances between samples equal to zero. However,
other corrections of the covariances between samples can be adopted (for example,
Deutsch 1994 suggests a kriging estimator with a Journel's redundancy measure
correction of the 1st member of the equations system, for a similar purpose).

MODEL VALIDATION WITH A CASE STUDY

The data set used to implement the stochastic simulation consisted of monitored
values of pH (daily analysis in lab) and dissolved oxygen at 4 monitoring stations located
along a river with an effluent discharge of a mine (Fig. 2), during a period of 15 months.
The monitored data are shown in Figs. 3a, 3b and Figs. 4a, 4b, representing the time
series, histograms and time variograms of the 15 month period for pH and dissolved
oxygen.

Fig 2 - Layout of the mine site and the 4 monitoring stations along the river.

[Time series (pH vs. days, 0 - 400) and histograms for Monitoring
Stations 1 to 4.]

Fig. 3a - Time series, histograms of pH historical data.


[Time variograms γ(d) vs. days for Monitoring Stations 1 to 4.]

Fig. 3b - Time variograms of pH historical data.

Two different simulations were implemented: a conditional simulation in which
the input is the real time series of the first station corresponding to a mine effluent, and a
non-conditional simulation where all time series (including the mine effluent) were
simulated.

Conditional Simulation

The estimators [X(e,t)]* were calculated with four samples before time t and two
samples spatially located before e:

   [X(e,t)]* = a1 X(e-1,t) + a2 X(e-2,t) + b1 X(e,t-1) + b2 X(e,t-2) + b3 X(e,t-3) + b4 X(e,t-4)
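The six-term estimator above can be evaluated as a plain weighted sum; the weights and pH values below are hypothetical, purely for illustration:

```python
def estimate_xstar(a, b, station_vals, past_vals):
    # six-term linear combination: a1, a2 weight X(e-1,t), X(e-2,t);
    # b1..b4 weight X(e,t-1) .. X(e,t-4)
    return (sum(ai * v for ai, v in zip(a, station_vals))
            + sum(bi * v for bi, v in zip(b, past_vals)))

# Hypothetical weights and pH values:
xstar = estimate_xstar([0.3, 0.1], [0.3, 0.15, 0.1, 0.05],
                       [7.2, 7.0], [7.4, 7.3, 7.1, 7.0])
print(round(xstar, 3))
```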
Based on experimental bi-plots {X(e,t), [X(e,t)]*} for each monitoring station, the
simulation procedure has been initialised with the real time series of the first station. The
resulting simulated time series of the remaining 3 stations are shown in Figs. 5, 6 and 7a,
7b for the two elements studied. The time correlations of the real data are quite
satisfactorily reproduced in the variograms of the simulated values.

[Time series (dissolved oxygen vs. days) and histograms for Monitoring
Stations 1 to 4.]

Fig. 4a - Time series, histograms of Dissolved Oxygen historical data.


[Time variograms γ(d) vs. days for Monitoring Stations 1 to 4.]

Fig. 4b - Time variograms of Dissolved Oxygen historical data.

Non-Conditional Simulation

The non-conditional simulation is presented only for pH. The time series of all
monitoring stations were simulated, including the first one corresponding to the mine
effluent, based on experimental bi-histograms {X(e,t), [X(e,t)]*}.
The simulated time series of the four stations are reproduced in Figs. 8a, 8b as
well as the time covariances of the simulated values.

Discussion of the Results

The proposed simulation methodology presented very satisfactory results
regarding the main objectives: generation of a time series with the same basic statistics,
time and spatial correlation as the observed historical data. The first simulation, as it
is conditioned to experimental data of the first monitoring station, reproduces at the
remaining stations not only the variograms but also the main features of the experimental
time series.
The non-conditional simulation generated a time series of pH with excellent
reproduction of the spatial and time correlation of historical data.

[Simulated time series (pH vs. days, 0 - 500) and histograms for
Monitoring Stations 2 to 4.]

Fig. 5 - Conditional Simulation of pH time series and histograms.



[Simulated time series (dissolved oxygen vs. days, 0 - 500) and
histograms for Monitoring Stations 2 to 4.]

Fig. 6 - Conditional Simulation of Dissolved Oxygen time series and histograms.



[Time variograms γ(d) vs. days of the simulated pH series.]

Fig. 7a - Conditional Simulation of pH time variograms.

[Time variograms γ(d) vs. days of the simulated Dissolved Oxygen series.]

Fig. 7b - Conditional Simulation of Dissolved Oxygen time variograms.



[Simulated time series (pH vs. days) and histograms for Monitoring
Stations 1 to 4.]

Fig. 8a - Non Conditional Simulation of pH time series and histograms.


Fig. 8b - Non-Conditional Simulation of pH time variograms.

CONCLUSIONS

The presented stochastic, data-driven approach aims to generate a set of
realisations reproducing some basic statistics regarding the contiguity in space and time of
relevant river water quality variables. With the stochastic realisations of time-series,
one can visualise the joint behaviour of the water quality characteristics and predict
extreme scenarios in the environmental system.
The crucial point of the proposed methodology is the definition of the
bidistributions between real and estimated values. These should generate posterior time-
series that reproduce the same space and time covariances and marginal histograms for each
monitoring station as the equivalent statistics of the historical data. This means that the
pattern of neighbourhood values of a space and time location to be simulated must be
chosen so as to represent the relevant covariances in space and time.
This simple and easy-to-implement data driven approach has one limitation: one
can generate time-series only at the spatial locations (and with the time periodicity) of
the historical data, corresponding to the monitoring stations of the presented case study.
In these situations, however, it is typically unnecessary to simulate the time behaviour of a
pollutant between two monitoring stations. Unless there is another pollutant source
between them, any expected value in the middle of two stations belongs to the interval of
their values.

REFERENCES

Deutsch, C., and Journel, A., 1992, GSLIB: Geostatistical Software Library and User's
Guide, Oxford University Press, New York.

Deutsch, C., 1994, "Kriging with Strings of Data", Mathematical Geology, Vol. 26, No. 5,
pp. 623-638.

Johnson, M., 1987, Multivariate Statistical Simulation, John Wiley & Sons, New York.

Journel, A., and Gomez-Hernandez, J., 1989, Stochastic Imaging of the Wilmington
Clastic Sequence, SPE paper #19857.

Law, A., and Kelton, D., 1982, Simulation Modeling and Analysis, McGraw Hill Int. Ed.,
New York.

Ripley, B., 1987, Stochastic Simulation, John Wiley & Sons, New York.
Gary N. Kuhn1, Wayne E. Woldt2, David D. Jones2, Dennis D. Schulte3

SOLID WASTE DISPOSAL SITE CHARACTERIZATION USING NON-INTRUSIVE
ELECTROMAGNETIC SURVEY TECHNIQUES AND GEOSTATISTICS
REFERENCE: Kuhn G. N., Woldt W. E., Jones D. D., Schulte D. D., "Solid Waste
Disposal Site Characterization Using Non-Intrusive Electromagnetic Survey Tech-
niques and Geostatistics," Geostatistics for Environmental and Geotechnical Applica-
tions, ASTM STP 1283, R. M. Srivastava, S. Rouhani, M. V. Cromer, A. I. Johnson, A. J.
Desbarats, Eds., American Society for Testing and Materials, Philadelphia, 1995.

ABSTRACT: Prior to the research reported in this paper, a site-specific hydrogeologic
investigation was developed for a closed solid waste facility in Eastern Nebraska using
phased subsurface characterizations. Based on the findings of this prior investigation, a
surface based geoelectric survey using electromagnetic induction to measure subsurface
conductivity was implemented to delineate the vertical and horizontal extent of buried
waste and subsurface contamination. This technique proved to be a key non-intrusive,
cost-effective element in the refinement of the second phase of the hydrogeologic
investigation.

Three-dimensional ordinary kriging was used to estimate conductivity values at
unsampled locations. These estimates were utilized to prepare a contaminant plume map
and a cross section depicting interpreted subsurface features. Pertinent subsurface
features were identified by associating a unique range of conductivity values to that of
solid waste, saturated and unsaturated soils and possible leachate migrating from the
identified disposal areas.

KEYWORDS: Geoelectrics, Electromagnetics, Geostatistics, Conductivity, Leachate,
Hydrogeology, Vadose

1Graduate Student, University of Nebraska - Lincoln, Department of Biological Systems
Engineering
2Assistant Professor, University of Nebraska - Lincoln, Department of Biological Systems
Engineering
3Professor, University of Nebraska - Lincoln, Department of Biological Systems Engineering

KUHN ET AL. ON SOLID WASTE DISPOSAL 163

INTRODUCTION

Past landfill management and operational practices in the United States have created
environmental problems and are commonly associated with soil and groundwater
contamination. These landfills have either been upgraded to meet current State and
Federal legislation or have closed. As a result, the number of operational landfills has
decreased from over 20,000 in 1978 to approximately 3,300 in 1994. Because of strict
State and Federal legislation passed dealing with closure of these landfills, the severe
impact that past operations had on the environment is becoming more apparent.

For example, in 1987 the State of Nebraska required the Nebraska Department of
Environmental Quality (NDEQ) to conduct a comprehensive assessment of all
community solid waste disposal sites (SWDS). The purpose of this assessment was to
ascertain compliance of SWDS to standards established by the Nebraska Environmental
Protection Act (NEPA) and the Federal Resource Conservation and Recovery Act
Subtitle D (RCRA Subtitle D)(SCS 1991).

In 1991, nearly 210 landfills in Nebraska had ceased operations or were recommended for
closure by NDEQ. This recommendation was based on insufficient capacities or
significant constraints imposed on owners to maintain compliance with RCRA Subtitle D
requirements (SCS 1991). Due to the risk posed to Nebraska's surface and groundwater
by the existence of unlicensed landfills, the NDEQ focused on closure activities.
Information compiled by NDEQ identified sites located near public or private drinking
water sources, that were underlain by a shallow water table surface, or were located in a
100-year flood plain.

The survey revealed that a significant number of SWDS warranted further
investigation. Based on this study, the NDEQ directed most of its efforts at the currently
unregulated sites utilized primarily by rural communities (NDEQ 1990). These sites had
not been subject to regulation since 1972, when the Whitney Amendment to the Nebraska
Environmental Protection Act specifically exempted all cities of the second class
(5,000 population or less) and villages from state solid waste rules and regulations
(NDEQ 1990). Although these sites were exempt from state solid waste regulations,
NDEQ revoked this amendment in 1991 when RCRA Subtitle D was reauthorized.

As solid waste disposal sites are forced to close over the next few years, hundreds of
millions of dollars will be spent in the United States to identify, characterize and
remediate sites contaminated with hazardous materials. Traditional site investigation
techniques typically include compiling hydrogeologic and contaminant fate and transport
information from testhole and groundwater monitoring well data. Commonly,
background information is limited until the results of the first round of groundwater
samples are available. Only then does it become apparent that the plume may not be
completely delineated and that additional monitoring wells are required.

To address this problem, non-intrusive field methods and geostatistical analysis tools
were utilized to gather preliminary subsurface information pertaining to hydrogeologic
features, horizontal and vertical extents of wastes and suspected leachate plumes. This
information can be utilized to optimize and minimize testhole or permanent monitoring
well locations.

LITERATURE REVIEW

Electromagnetic (EM) surveying techniques, combined with appropriate geostatistical
analyses, are rapid and non-intrusive methods of characterizing subsurface environments.
The non-intrusive nature of this technique reduces the need for drilling or other intrusive
investigative tools. The technique of EM surveying is based on the principle of utilizing
varying subsurface conductivity measurements as an indication of differing geologic
and/or other subsurface constructs.

Surface electrical methods have been used successfully in many types of subsurface
investigations. Kelly (1976) showed that the d-c resistivity method can be effective in
delineating a plume moving off-site from a landfill. The use of EM data sources for
delineation of contaminated groundwater has been described by Greenhouse and Slaine
(1986). French et al. (1988) utilized geoelectric surveying to identify anomalous regions
to focus subsequent boring and sampling activities. Hagemeister (1993) identified
potential waste volumes and suspected contaminant migration present at an unregulated
landfill. In each case, differing electrical conductivity was interpreted as an indication of
changes in the systems being investigated.

Geostatistics has been utilized in numerous investigations to estimate expected values at
unsampled locations. Cooper and Istok (1988) utilized geostatistics to estimate and map
contaminant concentrations and estimate errors in a groundwater plume from a set of
measured contaminant concentrations. Cressie et al. (1989) prepared kriged estimate and
error maps to predict a migration pathway of radionuclide contaminants from a potential
high-level nuclear waste repository site. Woldt (1990) mapped the location of a
suspected contaminant plume based on observed geoelectric measurements and
geostatistics. Hagemeister (1993) utilized geostatistics to map subsurface electrical
conductivity in two-dimensional cross sections across a site. In each case, kriged estimate
and error maps were prepared to assist in the interpretation of the measured data.

DATA COLLECTION

A three dimensional data set, developed by obtaining readings at several sounding depths
across a gridded area, was subjected to geostatistical analyses. This data set was utilized
in conjunction with available background data to identify pertinent subsurface features
and approximate their general locations. The background data included boring logs,
groundwater analytical reports and industrial waste disposal permits. These permits

allowed the disposal of industrial wastes until the late 1970's. The methods utilized
during this study consisted of establishing a sampling grid, completing an
electromagnetic survey, and performing a geostatistical analysis. Each procedure was
directed towards non-intrusive characterization of the subsurface environment. Existing
testhole data was correlated with the predicted locations of pertinent site features for
validation purposes.

Site Description

Based upon information obtained from the NDEQ, the study site was operated as a
"trench and fill" SWDS from 1975 to 1987. During this time, the owner accepted
domestic and industrial waste from nearby rural communities. Information pertaining to
the actual quantities received is not available.

The site and the surrounding areas are located near the easternmost edge of the Nebraska
Sandhills region. The topography of this region is mostly undulating to rolling. The
elevation of the site is approximately 460 to 466 meters above mean sea level (MSL) near
the northeast and southwest corners, respectively (NDEQ 1990). The surface geology
consists of approximately 38 to 42 meters of fine to medium grain sands interbedded with
coarse sand and fine gravel deposits, which is characteristic of this region.

The uppermost monitorable aquifer is located in sand and gravel deposits of the High
Plains aquifer system. The water table is approximately 11 and 15 meters below grade
level (BGL) in the northeast and southwest corners of the site, respectively, and the
saturated thickness of the unconfined aquifer is approximately 27 meters (NDEQ 1990).
Based on regional bedrock maps for this area, it appears that the top of the uppermost
confining unit is the Niobrara Shale formation that underlies the water table aquifer at an
approximate elevation ranging from 422 to 424 meters above MSL near the northeast and
southwest corners of the site, respectively.

Under a Multi-Site Cooperative Agreement with Region VII of the Environmental
Protection Agency (EPA), the NDEQ performed a Preliminary Assessment (PA) at the
site to assess the threat posed by the site to human health and the environment. The
NDEQ concluded that leachate from the site resulted in a leachate contaminant plume
migrating in an east-northeast direction towards a river 2.5 kilometers away. Because of
the low human and livestock population in the area, no evidence was found indicating
that the site posed an immediate threat to human health and the environment (NDEQ
1990).

Interviews with the site owner revealed that the standard operating procedures involved
excavating a 5 meter deep cell with a backhoe, depositing refuse at the toe of the working
face, compacting the refuse and providing 15 centimeters of daily cover material. After
each cell was completely filled, 1 to 1.5 meters of silty clay was placed on top of the
waste as a final cover. Based on this information, MSL elevations were assigned to the
pertinent subsurface features and are presented in Table 1.
166 GEOSTATISTICAL APPLICATIONS

Table 1. Approximate Elevation of Pertinent Site Features (meters)

Site Feature        Southwest Corner    Northeast Corner
Surface             466                 460
Bottom of Trench    461                 455
Water Table         451                 449
Top of Bedrock      424                 422

Sampling Grid

Sampling point locations were established based on minimizing data collection efforts,
maintaining minimum measurement support volumes of each instrument, and spatially
defining the study area. The sampling point spacing utilized at the site was
approximately 30 meters in the north and east directions and extended nearly 30 meters
beyond all four property boundaries. The horizontal extent was selected based on the
geology and obtaining an adequate number of sampling points beyond the limits of the
suspected landfill cells to establish background subsurface conductivity levels.

Electromagnetic Survey

Electromagnetic techniques measure terrain conductivity to identify geologic and other


subsurface formations. In most environmental EM applications, differing conductivity
measurements are interpreted as a change in geologic formations or subsurface conditions
(McNeill 1980). The EM instrument operates by generating alternating current loops
with a transmitter coil (Tx). A time-varying magnetic field arising from the alternating
current induces secondary currents sensed by a receiver coil (Rx) along with the primary
field. The EM receiver coil intercepts a portion of the magnetic field from each loop
generated by the transmitter coil and results in an output voltage which is linearly
proportional to the terrain conductivity. The resulting reading is in milli-Siemens per
meter (mS/m).

The reading obtained from the EM instruments is a conductivity measurement averaged
over a volume of subsurface media. Because the effective depths of penetration are
small in comparison to the overall horizontal and vertical dimensions of the site, these
readings were interpreted as being representative of a sampling point at the calculated
effective depth.
The effective depth of penetration by the induced current is directly proportional to the
intercoil spacing and depends on the orientation of the instrument. By varying the
intercoil spacing, conductivity measurements can be collected at varying depths. Also,
operating the instrument in the horizontal dipole mode reduces the effective depth of
penetration to approximately one half that of the vertical dipole mode. Therefore, the
instrument was operated in the horizontal and vertical dipole positions at four different
intercoil spacings to obtain readings at eight different depths.
Theoretically, the total instrument response represents a weighted average of subsurface
conductivities to a depth of infinity, but it does have practical limits. Interpretation, or
modeling, of geophysical data to determine a reasonable unique solution to the nonunique
problem was not performed. Although modeling this data provides a more
comprehensive interpretation of the data set, the preliminary nature of this research did
not warrant the level of effort involved with the modeling process. Instead, Hagemeister
(1993) calculated effective exploration depths of four intercoil spacings for both the
vertical and horizontal dipole modes. These calculations are based on the assumption that
60 percent of the total signal contribution over a volume of subsurface media is
associated with a discernible layer. Based on the small diameter and thickness of the
support volume, in relation to the overall area of the site and the preliminary nature of the
investigation, each instrument reading was assigned to a point located at the centroid of
the calculated effective depth of penetration. Table 2 depicts the exploration depths at
various intercoil spacings.

Table 2. Exploration Depths

Intercoil Spacing    Exploration Depth (meters)
                     Horizontal Mode    Vertical Mode
3.7 meters           1.0                2.5
10.0 meters          3.5                6.5
20.0 meters          8.0                13.0
40.0 meters          15.5               26.0
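The depth-assignment rule described above (each reading placed at the centroid of its calculated effective depth of penetration) can be sketched as a small lookup keyed by intercoil spacing and dipole mode. The depths come from Table 2; the function name and data layout are illustrative choices, not part of the original study.

```python
# Effective exploration depths (meters) from Table 2, keyed by intercoil
# spacing (meters) and dipole mode. Each EM reading is assigned to a
# point at this depth below the sounding location.
EXPLORATION_DEPTH = {
    (3.7, "horizontal"): 1.0,   (3.7, "vertical"): 2.5,
    (10.0, "horizontal"): 3.5,  (10.0, "vertical"): 6.5,
    (20.0, "horizontal"): 8.0,  (20.0, "vertical"): 13.0,
    (40.0, "horizontal"): 15.5, (40.0, "vertical"): 26.0,
}

def reading_to_point(east, north, surface_elev, spacing, mode):
    """Map one EM reading to an (east, north, elevation) sample point."""
    return (east, north, surface_elev - EXPLORATION_DEPTH[(spacing, mode)])
```

Operating at four spacings in both dipole modes thus yields eight sample elevations per grid location, which is what makes the data set three dimensional.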

GEOSTATISTICAL ANALYSIS

Environmental professionals are often confronted with the problem of providing detailed
information about a site based on a minimum number of sampling points. Geostatistics
provides a means of utilizing spatial continuity for estimating the expected value at
unsampled locations. Geostatistics is commonly utilized to describe the spatial continuity
of earth science data and aims at understanding and modeling the spatial variability of the
data.

The geostatistical analytical process for this study consisted of 1) describing and
understanding the statistical distribution of the data, 2) modeling the spatial variability of
the data, 3) estimating expected values at unsampled locations and 4) computing
estimation variance values at the unsampled locations.

Univariate Description

Univariate description deals with organizing, presenting and summarizing data and
provides an effective means of describing the data by identifying outliers and extreme
values. The univariate descriptive tools utilized to analyze the conductivity data set were:

1) histograms, 2) probability plots and 3) summary statistics. Because the data set is
three dimensional, a descriptive scatter plot could not effectively be obtained.

Geo-EAS 1.2.1 Geostatistical Environmental Assessment Software (Englund and Sparks
1988) was utilized to prepare histograms and probability plots for both the observed data
set and logarithmic transformations of the observed data set. Characteristic of many
environmental data sets, the observed data exhibited a large number of low values which
offset the mean of the data distribution to the left of the median. The data were
transformed to prepare lognormal histograms and probability plots to determine if the
data exhibits a lognormal distribution. The tests for lognormality indicated that the data
does not approach this distribution. Therefore, the observed data set was utilized for the
analysis. Figure 1 presents a histogram plot for the observed conductivity data set.

Figure 1. Histogram

The summary statistics presented in Table 3 numerically describe the location, spread and
shape of the observed data distribution.

Table 3. Observed Value Summary Statistics

Number               1679     Minimum            0.5
Mean                 23.8     25th Percentile    10.2
Variance             263.6    Median             22.7
Standard Deviation   16.2     75th Percentile    31.9
Skewness             1.1      Maximum            125.0
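The study used Geo-EAS for this step; as a rough sketch of how the Table 3 measures of location, spread and shape can be reproduced, the following Python function computes the same statistics for an arbitrary set of readings. The sample values below are hypothetical, not the 1679 actual survey values, and the skewness convention (moment coefficient) is an assumption.

```python
import numpy as np

def summarize(values):
    """Location, spread and shape measures in the style of Table 3."""
    v = np.asarray(values, dtype=float)
    mean = v.mean()
    var = v.var(ddof=1)                        # sample variance
    std = np.sqrt(var)
    skew = np.mean(((v - mean) / std) ** 3)    # moment coefficient of skewness
    return {
        "number": v.size, "mean": mean, "variance": var,
        "std_dev": std, "skewness": skew,
        "minimum": v.min(), "p25": np.percentile(v, 25),
        "median": np.median(v), "p75": np.percentile(v, 75),
        "maximum": v.max(),
    }

# Hypothetical readings in mS/m -- not the survey data.
stats = summarize([0.5, 10.2, 22.7, 31.9, 125.0, 18.4, 25.1, 9.8])
```

A positive skewness, as reported in Table 3, is the numerical counterpart of the long right tail visible in the histogram of Figure 1.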

Experimental Variogram

A variogram is a plot of the variance, or one-half the mean squared difference, of paired
data points as a function of the distance between the two points (Deutsch and Journel
1992). An omnidirectional variogram can be developed to obtain a general understanding
of the spatial characteristics of the sample data. The omnidirectional variogram does not
take into account spatial continuity changes due to directional changes in the data.
Therefore, directional variograms are developed to identify these changes, if present. To
ensure a more realistic sample variogram, the window for the lag distance did not extend
greater than one-half the length or width of the data set. The lag distance between points
was also restricted such that a minimum of 30 pairs per lag distance were available to
increase the confidence of variogram calculations.
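A minimal sketch of the experimental variogram calculation described above (one-half the mean squared difference per lag bin, keeping only bins with at least 30 pairs) might look like the following. GSLIB's programs are what the study actually used; the coordinates and values below are synthetic and purely illustrative.

```python
import numpy as np

def experimental_variogram(coords, values, lag_width, max_lag, min_pairs=30):
    """Omnidirectional experimental variogram: for each lag bin, one-half
    the mean squared difference of all data pairs whose separation falls
    in that bin. Bins with fewer than `min_pairs` pairs are dropped,
    following the 30-pair rule of thumb described in the text."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    edges = np.arange(0.0, max_lag + lag_width, lag_width)
    sums = np.zeros(len(edges) - 1)
    counts = np.zeros(len(edges) - 1, dtype=int)
    for i in range(len(values) - 1):
        dists = np.linalg.norm(coords[i + 1:] - coords[i], axis=1)
        half_sq = 0.5 * (values[i + 1:] - values[i]) ** 2
        bins = np.digitize(dists, edges) - 1
        for b, s in zip(bins, half_sq):
            if 0 <= b < len(sums):
                sums[b] += s
                counts[b] += 1
    keep = counts >= min_pairs
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers[keep], sums[keep] / counts[keep]

# Synthetic illustration only -- not the survey data.
rng = np.random.default_rng(0)
coords = rng.uniform(0.0, 100.0, size=(60, 2))
values = rng.normal(20.0, 5.0, size=60)
lags, gammas = experimental_variogram(coords, values, lag_width=10.0, max_lag=50.0)
```

Directional variograms follow the same recipe, with pairs additionally filtered by the orientation of their separation vector (e.g. horizontally coplanar or vertically cocolumnar).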

GSLIB - Geostatistical Software Library and User's Guide (Deutsch and Journel 1992)
was utilized to develop directional and omnidirectional experimental variograms from the
three dimensional conductivity data set. Generally, two sets of directional variograms
were developed by restricting the paired data points to be either horizontally coplanar or
vertically cocolumnar.
Attempts to identify directions of maximum and minimum continuity within a horizontal
plane revealed the same structure as the omnidirectional variogram in all directions.
Therefore, isotropic conditions were considered for the horizontal plane. Paired data
points within a 250 meter horizontal search region generally followed the pattern of the
omnidirectional variogram. This indicates that the kriging estimation process was not
significantly influenced by the orientation of the principal axis of the search
neighborhood and data orientation within this horizontal search region. The experimental
variograms for the horizontal plane and the vertical direction are presented in Figure 2.

Figure 2. Experimental Variograms (EM Conductivity Data Set)

Model Variogram

Once an acceptable experimental variogram was developed, the model variogram was
constructed. Variogram modeling entailed fitting a mathematical function, using visual
techniques, to the experimental variogram points by varying the model type and the
nugget effect, sill and range values until the model variogram closely resembled the
experimental variogram.

The exponential variogram model was fit to both the horizontal omnidirectional and the
vertical variograms (Figure 2) utilizing the parameters presented in Table 4. Although
the nugget and sill are identical, the range significantly decreases in the vertical direction.
This is characteristic of geometric anisotropy and is commonly encountered in earth
science.

Table 4. Variogram Model Parameters

Parameter         Horizontal Model    Vertical Model
Model Structure   Exponential         Exponential
Range             250                 50
Sill              180                 180
Nugget Effect     130                 130
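The fitted model of Table 4 can be written down directly. Two assumptions are made in this sketch that the paper does not state explicitly: the GSLIB "practical range" convention for the exponential model, gamma(h) = c0 + c(1 - exp(-3h/a)), and that the tabulated sill of 180 is the total sill, so the exponential structure contributes 180 - 130 = 50. The geometric anisotropy is handled by stretching the vertical lag by the ratio of the ranges.

```python
import numpy as np

# Parameters from Table 4 (assumed conventions noted in the text above).
NUGGET, SILL = 130.0, 180.0
RANGE_H, RANGE_V = 250.0, 50.0  # geometric anisotropy: shorter vertical range

def exponential_variogram(dx, dy, dz):
    """Model variogram for a separation vector (dx, dy, dz) in meters.
    The vertical lag is stretched by the range ratio so one isotropic
    1-D exponential model handles the geometric anisotropy."""
    h = np.sqrt(dx ** 2 + dy ** 2 + (dz * RANGE_H / RANGE_V) ** 2)
    if h == 0.0:
        return 0.0  # gamma(0) = 0 by definition
    return NUGGET + (SILL - NUGGET) * (1.0 - np.exp(-3.0 * h / RANGE_H))
```

Under this parameterization a 50 meter vertical separation carries the same variogram value as a 250 meter horizontal one, which is exactly what "same sill, different range" means.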

Cross Validation

Model variograms were cross validated to compare the sample point values to the
estimated values at those locations. It is important to develop a variogram model that
minimizes the standard deviation of the estimation error as determined by the cross
validation process. A variogram model that produces good results does not necessarily
indicate that the estimation at unknown locations will be accurate. However, good results
from cross validation suggest, with more confidence, the effectiveness of the selected
model.

Cross validation consists of removing a data point from the data set and calculating an
estimated value utilizing the model variogram. Once the estimate is calculated, a
comparison can be made between the estimated and observed values at each sampling
point by calculating the difference between the two values.

The three summary statistics utilized to evaluate the cross validation results are: 1)
average kriging error (AKE), 2) mean squared error (MSE) and 3) standardized mean
squared error (SMSE) (Woldt 1990). The AKE provides a measure of the degree of bias
introduced by the kriging process and should equal 0 if the data is unbiased. The MSE
should be less than the variance of the measured values. The SMSE is a measure of
consistency and is satisfied if the SMSE is within the interval 1.0 ± [2(2/n)1/2]. The results
are summarized in Table 5 along with their calculated expected values.

Table 5. Cross Validation Summary Statistics

                            AKE      MSE       SMSE
Expected Value              0.0      <263.6    0.931 to 1.069
Cross Validation Results    -0.03    131.0     0.89

As depicted in Table 5, the results meet the recommended AKE and MSE criteria, and the
SMSE falls just outside the range of expected values. These results are generally considered
acceptable.
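The three cross validation statistics are simple to compute once leave-one-out estimates and kriging variances are in hand. The following sketch follows the definitions given above (after Woldt 1990); the four observed/estimated pairs are toy numbers for illustration, not the study's cross validation run.

```python
import numpy as np

def cross_validation_stats(observed, estimated, kriging_var):
    """AKE, MSE and SMSE for leave-one-out cross validation results.
    `kriging_var` holds the kriging variance reported at each
    held-out sample point."""
    err = np.asarray(estimated, float) - np.asarray(observed, float)
    n = err.size
    ake = err.mean()                      # near 0 => unbiased
    mse = np.mean(err ** 2)               # should be below the sample variance
    smse = np.mean(err ** 2 / np.asarray(kriging_var, float))
    half_width = 2.0 * np.sqrt(2.0 / n)   # consistency interval 1 +/- 2*(2/n)^(1/2)
    return ake, mse, smse, (1.0 - half_width, 1.0 + half_width)

# Toy numbers for illustration only.
ake, mse, smse, (lo, hi) = cross_validation_stats(
    [1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8], [1.0, 1.0, 1.0, 1.0])
```

With n = 1679 held-out points, the interval 1.0 ± 2(2/n)^(1/2) evaluates to roughly 0.931 to 1.069, which is the expected SMSE range shown in Table 5.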

Ordinary Kriging

Ordinary point kriging was selected to estimate expected values at unsampled locations.
This method was selected because it is a linear unbiased estimator that attempts to
minimize the error variance and generally has the lowest mean absolute error and mean
squared error in comparison to other estimation methods (i.e. polygonal, triangulation,
local sample mean, and inverse distance squared). GSLIB (Deutsch and Journel 1992)
was utilized to calculate the expected values at 8,000 unsampled locations, or nodes, on a
20 x 20 x 20 grid from the three dimensional conductivity data set. The nodes were
spaced at 25 meters in the north and east horizontal directions and 2 meters in the vertical
direction. These spacings were selected based on the anticipated spatial orientation and
depths of the pertinent subsurface features.
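Ordinary point kriging at a single node amounts to solving a small linear system in which the weights are constrained to sum to one via a Lagrange multiplier. The sketch below is a teaching illustration, not a substitute for GSLIB's kriging programs; the three sample points, their values and the covariance model (total sill minus the Table 4 horizontal variogram, assuming second-order stationarity) are all made up for the example.

```python
import numpy as np

def ordinary_krige(coords, values, target, cov):
    """Ordinary point kriging of a single target location. `cov(p, q)`
    returns the covariance between two points; the weights are forced
    to sum to 1 through a Lagrange multiplier row/column."""
    n = len(values)
    A = np.ones((n + 1, n + 1))
    A[n, n] = 0.0
    for i in range(n):
        for j in range(n):
            A[i, j] = cov(coords[i], coords[j])
    b = np.ones(n + 1)
    for i in range(n):
        b[i] = cov(coords[i], target)
    weights = np.linalg.solve(A, b)[:n]
    return float(weights @ np.asarray(values, float)), weights

def cov(p, q):
    # Covariance = total sill - model variogram, using the horizontal
    # exponential model of Table 4 (nugget 130, sill 180, range 250 m).
    h = np.linalg.norm(np.asarray(p, float) - np.asarray(q, float))
    gamma = 0.0 if h == 0.0 else 130.0 + 50.0 * (1.0 - np.exp(-3.0 * h / 250.0))
    return 180.0 - gamma

# Three made-up conductivity samples (mS/m) and one unsampled node.
est, w = ordinary_krige([(0.0, 0.0), (30.0, 0.0), (0.0, 30.0)],
                        [10.0, 20.0, 15.0], (10.0, 10.0), cov)
```

The large nugget relative to the sill spreads the weights nearly evenly among the neighbors, which is why the kriged surface at this site is comparatively smooth.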

Search Neighborhood

The search neighborhood was established based on the following criteria: 1) selecting the
greatest distance on the variogram model that closely fit the experimental variogram, 2)
at least 30 pairs were utilized to calculate the experimental variogram at each point and 3)
the distance did not exceed half the length of the horizontal sampling grid diagonal.

The search neighborhood was defined in the horizontal plane at 250 meters. The
geometric anisotropy limited the search to within 25 meters in the vertical direction
which reflects the region of the variogram with the higher level of confidence.

Background Conductivities

Generally, the expected values located west and south of the site and outside the property
boundaries were considered to represent background subsurface conductivities, or
expected values assumed not to be impacted by past site activities. This was established
based on the groundwater flow direction. These background values generally ranged
from less than 0 mS/m to 12 mS/m in the vadose zone and from 12 mS/m to 24 mS/m
below the water table. These ranges of values established the basis of interpretation for
identifying hydrogeologic features, landfill cells and potential leachate migrating from
the cells.

DISCUSSION

The previous sections discussed a methodology that can be utilized to interpret surface
based electrical data in an effort to construct reliable maps of suspected subsurface
features. Based on the selected cross sections of expected conductivity values presented
in Figures 3 and 4 and limited knowledge of the site, interpretations of: 1) site specific
hydrogeology, 2) horizontal and vertical extents of waste and 3) potential sources for
leachate migration were developed.

Hydrogeology

Station 000 North (Figure 3) depicts a vertical cross section of the expected subsurface
conductivity values for background reference. This station is located upgradient of the
site and was utilized as an indication of subsurface conditions not impacted by past
landfill operations. Based on information obtained from NDEQ records, the natural
subsurface environment adjacent to the boreholes consists of fine to medium sand with a
static water table elevation near 450 meters. Therefore, 0 to 12 and 12 to 24 mS/m
were determined to represent unsaturated and saturated fine to medium sands,
respectively. A variance from these ranges was interpreted as an indication of differing
subsurface structures, or impact from the landfill operation.

Landfill Cell Identification

Based on the nature of the site operations, landfill cells are expected to be located from
the ground surface down to an approximate depth of 5 meters. The selected vertical cross
sections (Figure 3) depicted expected conductivity values near the surface in excess
of 24 mS/m within the 0 to 5 meter BGL depth range.

Generally, the landfill cells appear to cover the entire site. Based on the vertical cross
section maps (Figure 3), two areas exhibiting conductivity values in excess of 24 mS/m
were elongated in the north and south directions with the centerlines located near stations
150 East and 300 East. Based on a personal interview with the owner, it appears that
these two areas are actually several landfill cells spaced close together.

Potential Leachate Migration

The primary concern with SWDS is the potential leachate contamination associated with
the nature of the operation. Leachate is a liquid that consists of refuse moisture and all
precipitation that mixes with this moisture as it migrates through the landfill. Leachate
migrating from a landfill naturally due to gravity or forced out as a result of the
consolidation of refuse may transport contaminants from the refuse to the groundwater
environment.

The presence of elevated conductivity values within the vadose zone and directly below the
landfill cells was interpreted as potential leachate or possible instrument interference
from the overlying waste. These conductivity values ranged from 12 mS/m to 24 mS/m
(Figure 3).
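The interpretation rules built up over the last few sections can be collected into a small decision function. The thresholds (0-12 mS/m unsaturated sand, 12-24 mS/m saturated sand, >24 mS/m in the upper 5 meters read as waste, 12-24 mS/m in the vadose zone read as possible leachate) come from the text; the function itself, its "anomaly" catch-all, and the 11 meter default water table depth are simplifications for illustration and are site-specific, not general.

```python
def interpret_conductivity(ms_per_m, depth_bgl_m, water_table_bgl_m=11.0):
    """Classify one kriged conductivity estimate using the site-specific
    interpretation ranges described in the text. Depths are meters
    below grade level (BGL)."""
    if ms_per_m > 24.0:
        return "waste" if depth_bgl_m <= 5.0 else "anomaly"
    if depth_bgl_m < water_table_bgl_m:  # vadose zone
        return "possible leachate" if ms_per_m >= 12.0 else "unsaturated sand"
    return "saturated sand" if ms_per_m >= 12.0 else "anomaly"
```

Applying such rules node by node over the 20 x 20 x 20 kriged grid is what turns the conductivity estimates into the interpreted plume and waste maps of Figures 3 through 5.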
Figure 3. Selected Vertical Cross Sections of Expected Conductivity Values
(Stations 000, 100, 225 and 425 North)

Figure 4. Selected Plan Sections of Expected Conductivity Values
(Elevation 440 meters, saturated conditions; Elevation 452 meters, unsaturated
conditions above the water table)
Interpreted Subsurface Environment

Based on the information obtained from this study, it appears that leachate may have
migrated from the landfill and impacted the groundwater table. The plume appears to be
migrating horizontally in the northeast direction with a vertical component. Figure 5
consists of a plan view and cross section illustrating an interpreted leachate plume located
relative to the identified waste.

Figure 5. Schematic of Interpreted Subsurface Environment

Computer Software Support

All data description and estimation efforts requiring computer software support were
completed using an IBM 80386-based computer. Software support utilized in this study
consisted of Geo-EAS, GSLIB, and TecPlot Version 6.0 (TecPlot).

Probability and histogram plots describing the 1679 observed conductivity values were
prepared utilizing Geo-EAS. Geo-EAS generated on-screen plots within minutes
allowing efforts to be focused on the descriptive analyses.

The expected values are presented on cross section contour maps included as Figures 3
and 4. The three-dimensional expected value data set was imported into TecPlot, which
utilizes linear interpolation to construct each contour line. TecPlot generated cross
section maps based on the three-dimensional data set; by fixing one dimension, a cross
section at a desired location was generated within minutes.

CONCLUSIONS

Geostatistical analysis demonstrated that the data are spatially correlated, which allowed
an interpreted subsurface model to be developed from kriged estimates. As an
alternative to traditional intrusive characterization techniques, surface-based
electromagnetic surveying proved to be a key non-intrusive, cost-effective
element in refining the second phase of the hydrogeologic investigation.
Review of kriging error maps can further refine this second phase by focusing on the
areas with the largest error. This study demonstrated that this methodology, as a
preliminary field screening tool, can provide sufficient information to optimize the
placement and minimize the number of permanent groundwater monitoring wells.

REFERENCES

Barlow, P.M., Ryan, B.J., 1985, "An Electromagnetic Method of Delineating Ground-
Water Contamination, Wood River Junction, Rhode Island," Selected Papers in
Hydrologic Sciences, U.S. Geological Survey Water-Supply Paper 2270, pp. 35-49.

Cooper, R.M., Istok, J.D., 1988, "Geostatistics Applied to Groundwater Contamination. I:
Methodology," Journal of Environmental Engineering, Vol. 114, No. 2, pp. 270-285.

Cressie, N.A., 1989, "Geostatistics," American Statistician, Vol. 43, pp. 197-202.

Deutsch, C.V., Journel, A.G., 1992, "GSLIB - Geostatistical Software Library and User's
Guide," Oxford University Press.

Environmental Protection Agency, 1994, "EPA Criteria for Municipal Solid Waste
Landfills," The Bureau of National Affairs, Inc., 40 CFR Part 258.

Englund, E. and Sparks, A., 1988, "Geo-EAS 1.2.1 User's Guide," EPA Report
#600/8-91/008, EPA-EMSL, Las Vegas, Nevada.

French, R.B., Williams, T.R., Foster, A.R., 1988, "Geophysical Surveys at a Superfund
Site, Western Processing, Washington," Symposium on the Application of Geophysics to
Engineering and Environmental Problems, Golden, Colorado, pp. 747-753.

Greenhouse, J.P., Slaine, D.D., 1986, "Geophysical Modeling and Mapping of
Contaminated Groundwater Around Three Waste Disposal Sites in Southern Ontario,"
Canadian Geotechnical Journal, Vol. 23, pp. 372-384.

Hagemeister, M.E., 1993, "Systems Approach to Landfill Hazard Assessment with
Geophysics (SALHAG)," Unpublished Masters Thesis, University of Nebraska - Lincoln.

1994, "Handbook of Solid Waste Management," McGraw-Hill Publishing.
Isaaks, E.H., Srivastava, R.M., 1989, "An Introduction to Applied Geostatistics," Oxford
University Press, New York.

Journel, A., Huijbregts, C., 1978, "Mining Geostatistics," Academic Press, New York.

McNeill, J.D., October 1980, "Electromagnetic Terrain Conductivity Measurement at
Low Induction Numbers," Geonics Limited Technical Note TN-6.

NDEQ, February 1990, "Ground Water Quality Investigation of Five Solid Waste
Disposal Sites in Nebraska," Nebraska Department of Environmental Quality.

SCS Engineers, December 1991, "Volume 1 - Recommendations to State and Local
Governments," Nebraska Solid Waste Management Plan, Nebraska Department of
Environmental Quality.

Woldt, W.E., 1990, "Ground Water Contamination Control: Detection and Remedial
Planning," Ph.D. Dissertation, University of Nebraska - Lincoln.

"

,,
:.,
"
,
,I

'.
I,:
"

Ii'
Ii'
'i,

,,,
Geotechnical and Earth Sciences Applications
Craig H. Benson1 and Salwa M. Rashad 2
ENHANCED SUBSURFACE CHARACTERIZATION FOR PREDICTION OF
CONTAMINANT TRANSPORT USING CO-KRIGING

REFERENCE: Benson, C. H. and Rashad, S. M., "Enhanced Subsurface Characterization
for Prediction of Contaminant Transport Using Co-Kriging," Geostatistics for
Environmental and Geotechnical Applications, ASTM STP 1283, R. Mohan Srivastava,
Shahrokh Rouhani, Marc V. Cromer, A. Ivan Johnson, Alexander J. Desbarats, Eds.,
American Society for Testing and Materials, 1996.

ABSTRACT: Groundwater flow and advective transport were simulated in a


heterogeneous synthetic aquifer. These simulations were conducted when the aquifer
was fully defined and when it was characterized using a limited amount of hard and soft
data (hydraulic conductivity data and soil classifications). Co-kriging was used to
combine the data types when estimating the hydraulic conductivity field throughout the
aquifer. Results of the flow and transport simulations showed that soil classifications
were useful in characterizing the hydraulic conductivity field and reducing errors in
statistics describing the plume.
KEYWORDS: kriging, co-kriging, ground water, contaminant transport, hydraulic
conductivity, soil classifications

INTRODUCTION

Simulating flow and contaminant transport is often an essential feature of


remediation projects dealing with contaminated groundwater. In recent years, numerous
sophisticated groundwater models have been developed to conduct such simulations. The
complexity of these models allows one to realistically simulate the fate of contaminants
provided properties of the aquifer affecting transport are adequately characterized.
Unfortunately, what level of characterization is "adequate" is unknown, especially at sites
where the subsurface is heterogeneous. Thus, when limited data are available to describe
subsurface conditions, predictions of contaminant transport can be uncertain even when
sophisticated models are used.
Although many factors affect the fate of groundwater contaminants, the spatial
distribution of hydraulic conductivity is the primary factor affecting which pathways are
active in transport (Webb and Anderson 1996). To better define these pathways,
additional data must be collected and analyzed. The most useful data are hydraulic
conductivity measurements. However, "hard" data such as hydraulic conductivity
measurements are expensive to obtain, especially if the data are to be collected from a
site that is contaminated. It is advantageous, therefore, to investigate the effectiveness of
using less expensive "soft" data, such as soil classifications, to reduce uncertainty. Soft
data can be readily collected using less expensive exploration techniques such as ground-
penetrating radar, terrain resistivity surveys, or cone penetrometer soundings.

The objective of the project described in this paper was to evaluate how
characterizing the subsurface affects predictions of contaminant transport. Simulations of

1Assoc. Prof., Dept. of Civil & Environ. Eng., Univ. of Wisconsin, Madison, WI, 53706, USA.
2Asst. Scientist, Dept. of Civil & Environ. Eng., Univ. of Wisconsin, Madison, WI, 53706, USA.


groundwater flow and advective transport were conducted in a heterogeneous "synthetic


aquifer." The aquifer was characterized using various amounts of hard data (hydraulic
conductivities) and soft data (soil classifications). Co-kriging was used to combine the
two data types when estimating the hydraulic conductivity field. Similar uses of co-
kriging have been described by Seo et al. (1990a,b) and Istok et al. (1993).

SYNTHETIC AQUIFER

Characteristics

A "synthetic aquifer" was used in this study because it can be fully-defined; that
is, the hydraulic properties throughout the aquifer are defined with certainty. In this
particular application, fully-defined means that hydraulic conductivities and soil
classifications can be assigned to every cell in the finite-difference grid used in
simulating flow and transport in the aquifer. Thus, flow and transport simulations
conducted with the "fully-defined" aquifer are representative of its "true" behavior.
Comparisons can then be made between results obtained using the fully-defined case and
cases where the aquifer has been characterized with a limited amount of sub-surface data.
This comparison provides a direct means to evaluate the inherent inaccuracies associated
with estimating subsurface conditions from a limited amount of information.

A schematic illustration of the aquifer is shown in Fig. 1. It is extremely


heterogeneous, as might be encountered in a supra-glacial depositional environment such
as those occurring in the upper midwestern United States (Mickelson 1986, Simpkins et
al. 1987). Details of the method used to design the aquifer are in Cooper and Benson
(1993). Although an attempt was made to create a realistic aquifer, the synthetic aquifer
was created without any site-specific data and thus may not be "geologically correct."
The reader should keep this limitation in mind when considering the results and
conclusions described later.

[Figure 1 legend: 6 Clay; 5 Clayey Silt; 4 Silty Sand; 3 Fine Sand; 2 Coarse-Medium Sand; 1 Clean Gravel. Constant-head upstream and downstream boundaries; average hydraulic gradient = 0.01.]

FIG. 1 - Synthetic aquifer.


The aquifer is discretized into 12,500 cells that comprise a finite-difference grid
used in simulating flow and transport. The aquifer is segregated into 25 layers. Each
layer contains 20 rows and 25 columns of finite-difference cells. Each cell is 100 cm
long per side. Groundwater flow was induced by applying an average hydraulic gradient
of 0.01. Constant head boundary conditions were applied at the upstream and
downstream boundaries of the aquifer. No flow boundaries were applied along the
remaining surfaces of the aquifer.

An important feature of the aquifer is that soil types are layered to create
continuous and non-continuous soil lenses. Lenses with high hydraulic conductivity,
such as clean gravel and coarse to medium sand, simulate preferential flow paths that
might not be detected during a subsurface investigation. Low hydraulic conductivity
soils such as clayey silt and clay are layered to create pinches and stagnation points that
may cause the flow of groundwater to slow or even stop. These intricacies of the aquifer
also might not be detected during a subsurface investigation.

Hydraulic Conductivity of Geologic Units

A soil classification was assigned to each geologic unit (i.e., the geologic facies)
in the fully-defined synthetic aquifer. The soil classifications used to describe geology of
the aquifer are: (1) clean gravel, (2) coarse to medium sand, (3) fine sand, (4) silty sand,
(5) clayey silt, and (6) clay. These soil classifications are represented numerically using
the integers 1-6. The writers note that the integer ordering of these classifications is
arbitrary. Consequently, results somewhat different than those described herein may
have been obtained had a different categorical scheme been used.

Each cell in a given geologic unit was assigned a single realization from the
distribution of hydraulic conductivity corresponding to the unit. Single realizations were
generated using Monte Carlo simulation via inversion. In addition, no spatial correlation
was assumed to exist within a geologic unit. Thus, the correlation structure inherent in
the aquifer is due primarily to the relative location and size of the geologic units.

The triangular distribution (Fig. 2) was used to describe spatial variability in


hydraulic conductivity for a given soil type. The distribution is defined using an upper
bound (Kmax), a lower bound (Kmin), and the peak of the density function (Kp). To select
Kmax, Kmin, and Kp for each soil classification, a chart was developed that summarizes
hydraulic conductivities assigned to various soil types in thirteen publications (Fig. 3).
The hydraulic conductivities recommended by others were synthesized into a single
"composite chart" having the six different soil types that comprise the synthetic aquifer,
each with a corresponding range of hydraulic conductivities (Table 1).
FIG. 2 - Distribution of hydraulic conductivity in a geologic unit (triangular density with peak fp at Kp, bounded by Kmin and Kmax).


[Composite chart (not reproduced) of hydraulic conductivity ranges reported for soil types from clean gravel to clay in thirteen publications, including Bowles (1984), Domenico and Schwartz (1990), Hough (1981), Kovacs (1969), Lee et al. (1983), McCarthy (1982), Smith (1978), and Whitlow (1983).]

FIG. 3 - Range in hydraulic conductivities for different soil types.

TABLE 1 - Parameters describing hydraulic conductivity distributions.

Soil Type            Kmin (cm/sec)   Kp (cm/sec)   Kmax (cm/sec)
Clean Gravel         5 x 10^-1       5 x 10^0      5 x 10^2
Coarse - Med. Sand   1 x 10^-3       5 x 10^-2     1 x 10^0
Fine Sand            1 x 10^-4       5 x 10^-3     5 x 10^-2
Silty Sand           5 x 10^-5       5 x 10^-4     5 x 10^-3
Clayey Silt          1 x 10^-7       1 x 10^-6     5 x 10^-5
Clay                 1 x 10^-10      1 x 10^-8     1 x 10^-6
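The inversion step described above has a closed form, because the triangular CDF is piecewise quadratic. The sketch below samples the fine-sand row of Table 1, assuming for illustration only that the triangle is defined on a log10(K) scale (the paper does not state the sampling scale):

```python
import math
import random

def triangular_inverse(u, a, c, b):
    """Invert the triangular CDF (min a, mode c, max b) at probability u."""
    f = (c - a) / (b - a)              # CDF value at the mode
    if u < f:
        return a + math.sqrt(u * (b - a) * (c - a))
    return b - math.sqrt((1.0 - u) * (b - a) * (b - c))

# Fine-sand row of Table 1, sampled on a log10(K) scale (an assumption made
# here for illustration; the original triangle may be defined on K itself).
a, c, b = math.log10(1e-4), math.log10(5e-3), math.log10(5e-2)
random.seed(1)
ks = [10.0 ** triangular_inverse(random.random(), a, c, b) for _ in range(1000)]
print(len(ks), min(ks) >= 9.9e-5, max(ks) <= 5.1e-2)  # -> 1000 True True
```

Each finite-difference cell in a unit would receive one such draw, with no spatial correlation imposed inside the unit.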

Spatial Correlation Structure

The spatial correlation structure inherent in the soil type and hydraulic
conductivity fields was characterized by computing directional experimental variograms
in three dimensions. A model was then fit to the experimental variograms. A similar
approach was also used to characterize the spatial cross-correlation structure between
hydraulic conductivity and soil type.

Experimental variograms were computed using the program GAM3 from the
GSLIB geostatistical library (Deutsch and Journel 1992). The experimental variograms
were computed by:

$$\gamma^*_{\ln K}(h) = \frac{1}{2N(h)} \sum_{i=1}^{N(h)} \left[ \ln K(x_i + h) - \ln K(x_i) \right]^2 \qquad (1)$$

$$\gamma^*_{S}(h) = \frac{1}{2N(h)} \sum_{i=1}^{N(h)} \left[ S(x_i + h) - S(x_i) \right]^2 \qquad (2)$$

In Eqs. 1-2, $\gamma^*_{\ln K}(h)$ is the estimated variogram for lnKs separated by the vector h, $\gamma^*_S(h)$
is the estimated variogram for soil classifications (S), N(h) is the number of data pairs
separated approximately by the same vector h, and $x_i$ is a generic location in the aquifer.
The cross-variogram between lnK and S is computed as:

$$\gamma^*_{\ln K,S}(h) = \frac{1}{2N(h)} \sum_{i=1}^{N(h)} \left[ S(x_i + h) - S(x_i) \right] \left[ \ln K(x_i + h) - \ln K(x_i) \right] \qquad (3)$$
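Eqs. 1-3 can be sketched in code. The fragment below is a deliberately simplified 1-D stand-in for GAM3 (which handles 3-D lag vectors, directions, and tolerances); the collocated profiles are made up for illustration:

```python
import numpy as np

def variogram_1d(z, lag):
    """Eqs. 1-2: experimental semivariogram on a unit-spaced 1-D transect."""
    d = z[lag:] - z[:-lag]
    return 0.5 * float(np.mean(d * d))

def cross_variogram_1d(s, lnk, lag):
    """Eq. 3: experimental cross-semivariogram between S and lnK."""
    return 0.5 * float(np.mean((s[lag:] - s[:-lag]) * (lnk[lag:] - lnk[:-lag])))

# Made-up collocated profiles: soil classes 1-6 and a lnK field that
# decreases with soil class, mimicking the negative cross-correlation.
rng = np.random.default_rng(0)
s = rng.integers(1, 7, size=2000).astype(float)
lnk = -2.0 * s + rng.normal(0.0, 0.5, size=2000)
print(variogram_1d(lnk, 1) > 0.0)           # -> True
print(cross_variogram_1d(s, lnk, 1) < 0.0)  # -> True (negative, as in Fig. 4c)
```

Because lnK tends to drop as the soil-class integer rises, the cross-semivariogram comes out negative, consistent with the negative cross-sills reported below.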

The principal axes for soil classification were identified by computing a series of
experimental variograms each having a different orientation relative to the traditional
Cartesian axes. The analysis showed that mild anisotropy exists in the X- Y plane, with
the principal axis oriented 45° counterclockwise from the X-axis. For the vertical
direction, the principal axis coincided with the vertical (Z) axis (Benson and Rashad
1994).
The principal axes for the hydraulic conductivity field were assumed to
correspond to the principal axes for soil type because the hydraulic conductivity field was
generated directly from the soil type field. A similar assumption was made regarding the
cross-variogram (InK-soil type).

Experimental directional variograms for soil type and hydraulic conductivity


corresponding to the principal axes are shown in Figs. 4a and 4b. The experimental cross-
variogram (lnK vs. S) is shown in Fig. 4c. For each set of variograms, the range is largest
in the Y' direction and smallest in the Z' direction, which is consistent with the size and
shape of the geologic units shown in Fig. 1. In contrast, the sill is essentially the same for
the Y' and Z' directions, but is smaller in the X' direction.

A spherical model with no nugget was found to best represent the experimental
variograms. The spherical variogram is described by (Isaaks and Srivastava 1989):

$$\gamma(h) = C\left[ 1.5\left(\frac{h}{a}\right) - 0.5\left(\frac{h}{a}\right)^3 \right] \quad \text{if } h < a \qquad (4a)$$

$$\gamma(h) = C \quad \text{if } h \geq a \qquad (4b)$$

where C is the sill and a is the range. Table 2 provides a summary of C and a for each
variogram.

The directional experimental variograms exhibit a mixture of geometric and zonal
anisotropies. Geometric anisotropy is characterized by directional variograms that have
approximately the same sill but different ranges. In contrast, zonal anisotropy
corresponds to changes in the sill with direction, while the range remains nearly constant
(Isaaks and Srivastava 1989). The X'-Z' anisotropy is primarily geometric, whereas the
X'-Y' and Z'-Y' anisotropies are primarily zonal.
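Eq. 4 translates directly to code; the check below uses the lnK sill and range for the Z' direction from Table 2:

```python
def spherical(h, c, a):
    """Eq. 4: spherical semivariogram with sill c and range a (no nugget)."""
    if h >= a:
        return c                         # Eq. 4b: at or beyond the range
    r = h / a
    return c * (1.5 * r - 0.5 * r ** 3)  # Eq. 4a: inside the range

# lnK in the Z' direction (Table 2): sill 27, range 500 cm.
print(spherical(250.0, 27.0, 500.0))  # half the range: 27 * 0.6875 = 18.5625
print(spherical(500.0, 27.0, 500.0))  # -> 27.0 (model reaches its sill)
```

The model rises near-linearly at short lags and flattens exactly at the range, which is what makes the sill and range parameters in Table 2 directly readable off the fitted curves.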
FIG. 4 - Experimental variograms: (a) soil classifications, (b) hydraulic conductivities,
and (c) soil classification-hydraulic conductivity. (X', Y', and Z' are the components of h
along the principal axes; the cross-semivariograms in panel (c) are negative.)
TABLE 2 - Sill (C) and range (a) for variograms.

                           Direction
Parameter                  X'     Y'     Z'
lnK - Sill                 25     32     27
lnK - Range (cm)           925    1840   500
Soil Class. - Sill         1.5    2.25   1.75
Soil Class. - Range (cm)   925    1840   500
Cross - Sill               -5.8   -7.7   -6.5
Cross - Range (cm)         925    1840   500

A three-level nested structure was used to combine the geometric and zonal
anisotropies into a single variogram model. The model has the form:

$$\gamma_j(h) = w_{1,j}\,\gamma_1(h_1) + w_{2,j}\,\gamma_2(h_2) + w_{3,j}\,\gamma_3(h_3) \qquad (5)$$

The model $\gamma_1$ is a spherical function (Eq. 4) having C = 1; it provides the basis for the
nested structure. The subscript j denotes the variable being described (S, lnK, or S-lnK
for the cross-variogram). The separation distance $h_1$ is an "equivalent" distance:

$$h_1 = \sqrt{\left(\frac{h_{x'}}{a_{x'}}\right)^2 + \left(\frac{h_{y'}}{a_{y'}}\right)^2 + \left(\frac{h_{z'}}{a_{z'}}\right)^2} \qquad (6)$$

The weight $w_{1,S}$ corresponds to the smallest sill for soil classification (X' axis: sill = $w_{1,S}$
= 1.5 for S). The models $\gamma_2$ and $\gamma_3$ are also spherical models and they are used to ensure
that the sills corresponding to the Z' and Y' axes are preserved. That is:

$$w_{2,S} = 1.75 - w_{1,S} = 0.25 \qquad (7a)$$

and

$$w_{3,S} = 2.25 - (w_{1,S} + w_{2,S}) = 0.50 \qquad (7b)$$

The equivalent distances $h_2$ and $h_3$ are:

$$h_2 = \sqrt{\left(\frac{h_{y'}}{a_{y'}}\right)^2 + \left(\frac{h_{z'}}{a_{z'}}\right)^2} \qquad (8a)$$

and

$$h_3 = \frac{h_{y'}}{a_{y'}} \qquad (8b)$$

A summary of the weights used for the soil classifications, hydraulic conductivity, and
cross-variogram models is contained in Table 3.

It is important to note that data for hydraulic conductivity and soil classification
were available for each cell in the aquifer when computing the cross-variogram. In more
realistic cases, both types of data will probably not be available at each location.
Problems associated with this disparity can be resolved by using the pseudo-cross
variogram (Myers 1991).

TABLE 3 - Summary of variogram weights.

Variable                        w1      w2      w3
Soil classification, S          1.5     0.25    0.5
Hydraulic conductivity, lnK     25.0    4.0     3.0
Cross-variogram, lnK vs. S      -5.75   -0.75   -1.2
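The nested model of Eq. 5 can be checked in code. The equivalent-distance definitions for h2 and h3 below are a reconstruction (h1 spans all three axes, h2 the Y'-Z' pair, h3 the Y' axis alone), chosen so that the directional sills in Tables 2 and 3 are recovered; the ranges and soil-classification weights come from those tables:

```python
import math

def spherical(h, c, a):
    """Eq. 4: spherical semivariogram with sill c and range a."""
    if h >= a:
        return c
    r = h / a
    return c * (1.5 * r - 0.5 * r ** 3)

AX, AY, AZ = 925.0, 1840.0, 500.0   # ranges from Table 2 (cm)

def nested_variogram(hx, hy, hz, w1, w2, w3):
    """Eq. 5 with equivalent distances (Eqs. 6-8, as reconstructed here)."""
    h1 = math.sqrt((hx / AX) ** 2 + (hy / AY) ** 2 + (hz / AZ) ** 2)
    h2 = math.sqrt((hy / AY) ** 2 + (hz / AZ) ** 2)
    h3 = hy / AY
    return (spherical(h1, w1, 1.0) + spherical(h2, w2, 1.0)
            + spherical(h3, w3, 1.0))

# Soil-classification weights from Table 3; far beyond the range in each
# principal direction, the model should return the Table 2 sills.
w = (1.5, 0.25, 0.5)
print(nested_variogram(5 * AX, 0.0, 0.0, *w))  # X' sill -> 1.5
print(nested_variogram(0.0, 0.0, 5 * AZ, *w))  # Z' sill -> 1.75
print(nested_variogram(0.0, 5 * AY, 0.0, *w))  # Y' sill -> 2.25
```

The same structure with the lnK or cross-variogram weights from Table 3 gives the corresponding directional models.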

FLOW AND ADVECTIVE TRANSPORT MODELS

MODFLOW

The three-dimensional finite difference program MODFLOW was used to


simulate steady-state saturated flow in the aquifer. MODFLOW uses a block-centered
finite difference scheme to solve the groundwater flow equation. A detailed description
of MODFLOW can be found in McDonald and Harbaugh (1988). MODFLOW was
modified for use in this study by adding subprograms and changing the existing data
collection and storage procedures. A detailed description of these modifications can be
found in Benson and Rashad (1994).

For each field of hydraulic conductivity that was simulated, MODFLOW was
used to compute the total heads at each node and the total flow rate (Q) emanating from
cells at the downstream end. The hydraulic head field is used by the advective transport
model for simulating contaminant transport.

PATH3D

The program PATH3D (Zheng 1988) was used to simulate advective contaminant
transport. PATH3D is a general particle-tracking program for calculating groundwater
paths and travel times in steady-state or transient, two- or three-dimensional flow fields.
A detailed description of PATH3D can be found in Zheng (1988). Changes in PATH3D
were required before it could be used in this study. These changes included modifying
the algorithm for time step adjustment and modifying the post-processor to describe
characteristics of the plume. Details of these changes can be found in Benson and
Rashad (1994).

Transport was initiated by releasing contaminant fluid particles along a vertical
profile located at X = 0, Y = 1000 cm, and Z = 0 to 2500 cm. Eight particles were placed
in each cell along this profile (200 total particles in the aquifer).
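A minimal forward-Euler advection step illustrates the particle-tracking idea. This is a toy stand-in for PATH3D, which uses adaptive time stepping in the velocity field derived from the MODFLOW heads; the uniform velocity here is hypothetical:

```python
# Toy advective particle step: repeatedly move a particle along the local
# velocity (a stand-in for PATH3D's adaptive-step particle tracking).
def advect(p, velocity, dt, nsteps):
    x, y, z = p
    for _ in range(nsteps):
        vx, vy, vz = velocity(x, y, z)
        x, y, z = x + vx * dt, y + vy * dt, z + vz * dt
    return (x, y, z)

# Hypothetical uniform seepage velocity of 1 cm/day in +X.
uniform = lambda x, y, z: (1.0, 0.0, 0.0)
print(advect((0.0, 1000.0, 1250.0), uniform, dt=1.0, nsteps=365))
# -> (365.0, 1000.0, 1250.0)
```

In the real simulations the velocity varies cell to cell, so each of the 200 particles traces its own pathline through the heterogeneous conductivity field.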

RESULTS AND ANALYSIS

Ground water flow and contaminant transport simulations were conducted for
three different conditions: (1) fully-defined aquifer, (2) partially defined aquifer using
only hydraulic conductivity data, and (3) partially-defined aquifer using hydraulic
conductivity and soil classification data. In the partially-defined cases, a co-kriging
program (based on the program COKB3D, Deutsch and Journel 1992) was used to
estimate hydraulic conductivity for each finite-difference cell in the aquifer using a linear
co-regionalization model for the variograms (see previous sections).

In this application, co-kriging was only used to estimate the primary variable,
hydraulic conductivity. In addition, point kriging was used instead of block kriging
because it was more easily implemented and the cells used to discretize the aquifer were
small. Nevertheless, a small error was introduced by point kriging.
The variogram models previously discussed were used to describe the spatial
correlation structure. Input for the co-kriging program included subsurface information
consisting of profiles of hydraulic conductivity or soil classifications. A description of
the co-kriging implementation can be found in Benson and Rashad (1994).
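The co-kriging system itself (COKB3D) is not reproduced here, but its single-variable analogue, point ordinary kriging, shows the structure of the estimate: variogram-based weights constrained to sum to one, solved with a Lagrange multiplier. The 1-D data below are hypothetical:

```python
import numpy as np

def spherical(h, c, a):
    """Spherical semivariogram (Eq. 4), vectorized over lags h."""
    r = np.clip(np.asarray(h, dtype=float) / a, 0.0, 1.0)
    return c * (1.5 * r - 0.5 * r ** 3)

def ordinary_krige(xs, zs, x0, sill, rng_a):
    """Point ordinary kriging in 1-D: build the variogram system with a
    Lagrange multiplier enforcing sum(weights) = 1, then estimate z(x0)."""
    n = len(xs)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = spherical(np.abs(xs[:, None] - xs[None, :]), sill, rng_a)
    A[n, n] = 0.0
    b = np.ones(n + 1)
    b[:n] = spherical(np.abs(xs - x0), sill, rng_a)
    w = np.linalg.solve(A, b)[:n]
    return float(w @ zs)

xs = np.array([0.0, 500.0, 1000.0])
zs = np.array([-4.0, -6.0, -5.0])   # hypothetical lnK data
est = ordinary_krige(xs, zs, 500.0, sill=25.0, rng_a=925.0)
print(round(est, 6))                # -> -6.0 (exact interpolation at a datum)
```

With no nugget, kriging honors the data exactly at sampled locations; co-kriging extends the same system with cross-variogram terms so soft soil-classification data also carry weight.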

Estimating the Hydraulic Conductivity Field

Initial Comparison: Kriging vs. Co-Kriging -- An initial comparison was made
between the estimated hydraulic conductivity field and the fully-defined hydraulic
conductivity field along a transect (single row of cells; X = 50 cm, Z = 50 cm, Y = 0 to 2500
cm) through the aquifer. In one case, the hydraulic conductivity field was estimated with
kriging using only hydraulic conductivity data; co-kriging using hydraulic conductivity
and soil classification data was used for the other case.

The hydraulic conductivity fields estimated using kriging and co-kriging are
shown in Fig. 5. Hydraulic conductivities measured along three vertical profiles were
used as input when only kriging was conducted. The resulting estimated InK field is a
smooth, nearly linear interpolation between the profiles at which measurements were
made. The estimated InK field is very different from the "true" hydraulic conductivity
field obtained from the fully-defined synthetic aquifer. That is, the irregular spatial
variations in lnK are not preserved.

FIG. 5 - Estimated hydraulic conductivities along a transect (kriging vs. co-kriging, lnK versus Y in cm).

Co-kriging was conducted using three profiles of hydraulic conductivity (primary
variable) and 11 profiles of soil classifications (secondary variable). Figure 5 shows that
addition of the secondary variable greatly improves the estimated hydraulic conductivity
field. The estimated lnKs along the transect more closely resemble the "true" lnKs.
However, even with co-kriging, the estimated field is smoother than the "true" field.

Selecting Profiles -- Various exploration schemes (i.e., collections of hydraulic


conductivity and soil classification profiles) were selected to evaluate the relative
effectiveness of different types of subsurface data in improving the accuracy of the estimated
hydraulic conductivity field. The initial exploration scheme consisted of five profiles of
hydraulic conductivity and five profiles of soil classifications. Subsequent schemes
incorporating more data were constructed by adding more profiles of either hydraulic
conductivities or soil classifications.

A consistent method was needed to select locations for additional profiles. The
writers chose to select subsequent profiles at locations where the co-kriging variance is
largest. These locations have the greatest uncertainty in the estimated hydraulic
conductivity. The writers note, however, that locations where the co-kriging variance is
largest are not necessarily the critical locations where uncertainty in hydraulic
conductivity has the greatest impact on contaminant transport. However, these critical
locations cannot be identified a priori, because under normal circumstances the detailed
characteristics of the aquifer are unknown.

The aforementioned methodology was used to select 14 exploration schemes.
The first seven schemes consist of five profiles of hydraulic conductivity (NK = 5) and a
varying number of soil classification profiles (NS = 0, 5, 9, 15, 22, 32, 125). The second
set of seven schemes was similar, except ten hydraulic conductivity profiles (NK = 10)
were used. The layout of each exploration scheme is contained in Benson and Rashad
(1994).

Precision of the Hydraulic Conductivity Field -- In this section, two statistics are
used to characterize the precision of the estimated field. These statistics are the
maximum co-kriging variance and the mean co-kriging variance. The mean co-kriging
variance ($\bar{\sigma}_{ck}^2$) is used to quantify the global estimation error. The mean co-kriging
variance is computed as:

$$\bar{\sigma}_{ck}^2 = \frac{1}{N_c} \sum_{i=1}^{N_c} \sigma_{ck,i}^2 \qquad (9)$$

where $\sigma_{ck,i}^2$ is the co-kriging variance at the ith cell in the finite-difference grid and $N_c$ is
the total number of grid points ($N_c$ = 12,500). In each case, $\sigma_{ck,i}^2$ is the variance of the
primary variable (hydraulic conductivity) being estimated, as described in Isaaks and
Srivastava (1989, p. 404).
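Eq. 9 and its companion maximum statistic reduce to a mean and a max over the grid of co-kriging variances; the four-cell grid below is made up for illustration (a real run would average over all 12,500 cells):

```python
import numpy as np

# Hypothetical co-kriging variances, one value per grid cell.
var_grid = np.array([12.0, 18.5, 25.0, 9.5])
mean_ck = var_grid.mean()   # Eq. 9: global estimation error
max_ck = var_grid.max()     # local worst-case uncertainty
print(mean_ck, max_ck)      # -> 16.25 25.0
```

The distinction matters for the results that follow: the maximum responds only to data near the worst cell, while the mean responds to data added anywhere.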

The maximum co-kriging variance is shown in Fig. 6. The maximum co-kriging
variance is essentially the same for the schemes using five and ten profiles of hydraulic
conductivity; the co-kriging variance is only slightly larger for the scheme using five
profiles. More importantly, however, addition of soil classification profiles results in a
significant reduction in the maximum co-kriging variance.

FIG. 6 - Maximum co-kriging variance for lnK.

The mean co-kriging variance is shown in Fig. 7 as a function of the number of soil
classification profiles. In this case, the error is significantly larger when five hydraulic
conductivity profiles are used instead of ten. A significant difference is expected,
because the mean co-kriging variance represents a global measure of uncertainty,
whereas the maximum co-kriging variance is a local measure of uncertainty. Adding
more hydraulic conductivity profiles will have a significant effect on the maximum
co-kriging variance only if the profiles are located near or directly at the location where
the maximum co-kriging variance exists, because the co-kriging variance is a point
measure. In contrast, because the mean co-kriging variance is a global measure of
uncertainty, it will be reduced by adding more profiles, regardless of their location.

FIG. 7 - Mean co-kriging variance for lnK.

Figure 7 also shows that exploration schemes employing more soil classification
profiles with fewer hydraulic conductivity profiles can be as effective in reducing
uncertainty as schemes that simply use more hydraulic conductivity profiles. For
example, the scheme consisting of five hydraulic conductivity profiles and five soil
classification profiles has a mean co-kriging variance similar to that of the scheme using
ten hydraulic conductivity profiles and no soil classification profiles. Furthermore, the
scheme using more soil classification profiles and fewer hydraulic conductivity profiles is
likely to be less expensive. Thus, a similar reduction in uncertainty can be obtained at
less cost.

Total Flow

One means to evaluate how well the aquifer is characterized is to compare the
total flow rate across the compliance surface for the fully-defined condition with the total
flow rate when the aquifer is characterized using a limited amount of subsurface data.
For the synthetic aquifer, the compliance surface was defined as the downstream
boundary (Fig. 1). If the flow rates are not nearly equal, then the aquifer is not
adequately characterized. If the flow rate is too high, low conductivity regions blocking
flow have been missed. In contrast, a flow rate that is too low is indicative of missing
preferential pathways (Fogg 1986).

Figure 8 shows the total flow rate when the aquifer is characterized with 5 or 10
profiles of hydraulic conductivity and a variable number of soil classification profiles.
When no profiles of soil classifications are used (kriging only), the flow rate is one-third
to one-half the true total flow rate. Apparently, the sampling program inadequately
defined the preferential pathways controlling true total flow. However, when more soil
classification profiles are added, the flow rate begins to rise and then becomes equal (i.e.,
> 10 profiles) to the flow rate for the fully-defined condition.
Two other characteristics of Fig. 8 are notable. First, similar flow rates were
obtained when five or ten profiles of hydraulic conductivity (but no soil classifications)
were used to characterize the aquifer. Apparently, neither set of measurements is of
sufficient extent to capture the key features controlling flow. Second, the aquifer was
better characterized (in terms of total flow rate) using five hydraulic conductivity profiles
and 15 soil classification profiles than 10 hydraulic conductivity profiles and 15 soil
classification profiles. This indicates that collecting a greater quantity of index
measurements (i.e., soil classifications) may be more useful in characterization than
collecting fewer, more precise measurements (i.e., hydraulic conductivities). In this case,
hydraulic conductivity inferred from a soil classification had a precision of two to three
orders of magnitude, whereas the hydraulic conductivity "measurements" were exact.
Thus, in this case, simply defining the existence of critical flow paths apparently is more
important than precisely defining their hydraulic conductivity.
FIG. 8 - Total flow rate through the synthetic aquifer.

Trajectory of the Plume - Centroid

Trajectory of the plume can be characterized by the coordinates (X, Y, Z) of its


centroid. Trajectories for several different exploration schemes are shown in Fig. 9. In
each case, the trajectory is recorded for only four years. For longer times, a portion of the
plume has passed the downstream edge of the aquifer. Consequently, the statistics used
to describe the plume (centroid and variance) are ambiguous.

At early times, the trajectory of the centroid does not depend greatly on the
exploration scheme. However, as the plume evolves, different trajectories of the centroid
are obtained. In particular, the plume moves more slowly in the down-gradient (X-
direction) when the aquifer is characterized with a limited amount of subsurface data.
Apparently, the preferential pathways controlling down-gradient movement were
inadequately characterized.
Addition of soil classification data did not consistently improve the trajectory in
the X-direction. Adding five soil classification profiles improved the trajectory
significantly, but the worst trajectory was obtained when 22 soil classification profiles
were used. Adding even more soil classification profiles (NS = 32 or 125) improved the
trajectory only slightly. This is particularly discouraging because 125 soil classification
profiles corresponds to sampling 25% of the entire aquifer.

The cause of this discrepancy is inadequate representation of subsurface


anomalies that affect movement of the plume. At approximately 0.2 years, the centroid
of the plume moves dramatically as the particles flow around a low conductivity region.
The Y-coordinate increases and the Z-coordinate decreases (i.e., the plume moves
upward and towards the rear face of the aquifer). None of the exploration schemes
provided enough information to adequately characterize this movement. However,
adding soil classification profiles did improve the prediction. When only hydraulic
conductivity profiles were used (kriging only), the plume moved downward and to the
front, which is exactly opposite to the behavior occurring in the fully-defined case. Adding
soil classification profiles did prevent the centroid from moving in the opposite direction
194 GEOSTATISTICAL APPLICATIONS

[Figure: X-, Y-, and Z-coordinates of the plume centroid plotted against time (0.01 to 4 years) for kriging only, for schemes adding NS = 5, 22, 32, and 125 soil classification profiles, and for the fully-defined condition]

FIG. 9 - Centroid of plume: (a) X-coordinate, (b) Y-coordinate, and (c) Z-coordinate

and, in the case where 125 profiles were used, did result in a subsurface where the plume
moved upward and to the rear of the aquifer. Unfortunately, the degree of plume
movement existing when 125 soil classification profiles were used is still too small to
simulate the fully-defined condition.

For brevity, graphs of trajectory of the centroid are not shown for the exploration
schemes where 10 hydraulic conductivity profiles were used. These graphs can be found
in Benson and Rashad (1994).

Smaller errors in the predicted trajectory occurred when ten hydraulic


conductivity profiles were used in the exploration scheme. In this case, addition of soil
classification profiles also had a smaller impact on the predicted down-gradient
movement of the plume. However, adding soil classification profiles did improve the Y
and Z-coordinates of the centroid. When only ten hydraulic conductivity profiles were
used (kriging only), the plume moved in the opposite direction, as was observed in Fig. 9
(NK=5, NS=0). But, when soil classification profiles were added, the plume moved in
the correct direction. Nevertheless, the movement occurred more slowly than in the
fully-defined case, which caused the down-gradient movement of the plume in the
estimated aquifers to lag behind the down-gradient movement of the plume in the fully-
defined aquifer.

Spreading - Variance of the Plume

Spreading of the plume is characterized by the variance (or second central


moment); a larger variance corresponds to a greater amount of spreading. Evolution of
the variance of the plume is shown for various exploration schemes in Fig. 10.

Some general features of Fig. 10 are noteworthy. First, the variance is
larger in the X and Z-directions. The Z-variance is large because the particles are
uniformly distributed along a vertical profile (i.e., Z-direction) at the onset of the flow
and transport simulation. A large X-variance occurs because down-gradient movement
of the plume occurs in the X-direction and thus the X-variance corresponds to
longitudinal spreading of the plume. Accordingly, the Y-variance is much smaller
because it corresponds to lateral spreading orthogonal to the average hydraulic gradient,
which is generally smaller than spreading in the longitudinal direction.

The X-variance increases with time. At short times, the variance is small because
little spreading of the plume has occurred. However, as the plume moves down-gradient,
the variance increases as more spreading occurs. Furthermore, the ability to capture the
true amount of spreading depends on the amount of subsurface information used in
characterization (Fig. 10). When less information is used (e.g., kriging only, NK=5,
NS=O), the variance is smallest and when more information is used (i.e., by adding soil
classification profiles) the variance increases. This is expected, because a smoother
subsurface containing fewer heterogeneities exists when less data are used in
characterization. However, adding more soil classification profiles does not consistently
improve the X-variance. In fact, the X-variance for NS=5 is closer to the X-variance in
the fully-defined case than the schemes having NS=22, 32, and 125.

Similar behavior was noted for the Y-variance. However, adding more soil
classification profiles had a more consistent effect on the Z-variance. Adding more soil
classification profiles consistently resulted in a Z-variance that was closer to the Z-
variance in the fully-defined case.
[Figure: X-, Y-, and Z-variances of the plume plotted against time (0.01 to 4 years) for kriging only (NK=5), for NK=5 with NS = 5, 22, 32, and 125, and for the fully-defined condition]

FIG. 10 - Variance of plume: (a) X-variance, (b) Y-variance, and (c) Z-variance
When ten hydraulic conductivity profiles were used for characterization, the X-
variance in the estimated aquifers was similar to the X-variance in the fully-defined case
regardless of the number of soil classification profiles used in characterization (Benson
and Rashad 1994). Apparently, the ten hydraulic conductivity profiles used for
characterization resulted in a sufficiently heterogeneous subsurface such that spreading in
the down-gradient direction was preserved.
In contrast, spreading in the Y and Z-directions for the fully-defined case was
distinctly different from spreading that occurred in these directions when ten hydraulic
conductivity profiles were used for characterization (Benson and Rashad 1994). Adding
five soil classification profiles resulted in a Y-variance that was closer to the Y-variance
in the fully-defined case. However, as even more soil classification profiles were added
(NS=22, 32, 125), the Y-variance became much different from that observed in the fully-
defined case. Apparently, the heterogeneities causing spreading in the Y-direction were
inadequately represented when the subsurface was characterized with additional soil
classification profiles.

Kriging with only ten hydraulic conductivity profiles resulted in a Z-variance that
differed greatly from the Z-variance in the fully-defined case. For larger times, the Z-
variance was much smaller than the Z-variance for the fully-defined condition. However,
when soil classifications were added, the Z-variance more closely resembled the Z-
variance for the fully-defined case. Thus, using soil classifications apparently resulted in
heterogeneities that were similar to those controlling spreading in the Z-direction in the
fully-defined aquifer.

SUMMARY AND CONCLUSIONS

The objective of this study was to illustrate how predictions of contaminant


transport differ as the quantity and type of information used to characterize the
subsurface changes. Groundwater flow and advective contaminant transport were
simulated through a heterogeneous synthetic aquifer that was fully defined. The aquifer
was highly heterogeneous, as might be encountered in supraglacial sediments, such as
those found in the upper mid-western United States. Additional flow and transport
simulations were conducted using versions of the aquifer that were characterized using a
limited amount of subsurface data. Comparisons were then made between the true
movement of the plume (in the fully defined aquifer) and movement of the plume in
versions of the aquifer that were characterized with limited subsurface data.

Results of the flow and transport simulations show that soil classifications can be
used to augment or replace more costly hydraulic conductivity measurements while
maintaining similar accuracy in terms of total flow through the aquifer. However, the
geologic details that govern transport through the synthetic aquifer apparently were never
sufficiently characterized. Bulk movement of the plume (i.e., the centroid) and spreading
(i.e., variance) of the plume were never simulated accurately, regardless of the amount of
subsurface data (hard or soft) that were used for characterization.

ACKNOWLEDGMENT

The study described in this paper was sponsored by the U.S. Dept. of Energy
(DOE), Environmental Restoration and Waste Management Young Faculty Award
Program. This program is administered by Oak Ridge Associate Universities (ORAU).
Neither DOE nor ORAU has reviewed this paper, and no endorsement should be implied.

REFERENCES

Benson, C. and S. Rashad (1994), "Using Co-Kriging to Enhance Hydrogeologic


Characterization," Environmental Geotechnics Report No. 94-1, Dept. of Civil and
Environmental Engineering, University of Wisconsin, Madison, WI.

Bowles, J. (1984), Physical and Geotechnical Properties of Soils, 2nd Edition, McGraw-
Hill, New York.

Cooper, S. and C. Benson (1993), "An Evaluation of How Subsurface Characterization


Using Soil Classifications Affects Predictions of Contaminant Transport,"
Environmental Geotechnics Report No. 93-1, Dept. of Civil and Environmental
Engineering, University of Wisconsin, Madison, WI.

Das, B. (1985), Principles of Geotechnical Engineering, PWS-Kent Publishing, Boston.

Deutsch, C. and A. Journel (1992), GSLIB: Geostatistical Software Library and User's
Guide, Oxford University Press, New York.

Domenico, P. and F. Schwartz (1990), Physical and Chemical Hydrogeology, John


Wiley, New York.

Fogg, G. (1986), "Groundwater Flow and Sand Body Interconnectedness in a Thick,


Multiple-Aquifer System," Water Resources Research, 22(5), 679-694.

Holtz, R. and W. Kovacs (1981), An Introduction to Geotechnical Engineering, Prentice-


Hall, Englewood Cliffs, NJ.

Hough, B. (1969), Basic Soils Engineering, 2nd Edition, Ronald Press Co., New York.

Isaaks, E. and R. Srivastava (1989), Applied Geostatistics, Oxford Univ. Press, New
York.

Istok, J., Smyth, J., and Flint, A. (1993), "Multivariate Geostatistical Analysis of Ground-
Water Contamination: A Case History," Ground Water, 31(3), 63-73.

Lee, I., White, W., and Ingles, O. (1983), Geotechnical Engineering, Pitman Co., Boston.

McCarthy, D. (1982), Essentials of Soil Mechanics and Foundations, Basic Geotechnics,


Reston Publishing, Reston, VA.

McDonald, M. and A. Harbaugh (1988), "A Modular Three-Dimensional Finite


Difference Ground-Water Flow Model," Techniques of Water-Resources
Investigations of the United States Geological Survey, USGS, Reston, VA.

Means, R. and J. Parcher (1963), Physical Properties of Soils, Merrill Books, Columbus.
Myers, D. (1991), "Pseudo-Cross Variograms, Positive Definiteness and Co-Kriging,"
Mathematical Geology, 23, 805-816.
Mickelson, D. (1986), "Glacial and Related Deposits of Langlade County, Wisconsin,"
Information Circular 52, Wisc. Geologic and Natural History Survey, Madison, WI.
Mitchell, J. (1976), Fundamentals of Soil Behavior, John Wiley and Sons, New York.

Scott, C. (1980), An Introduction to Soil Mechanics and Foundations, 3rd Edition,


Applied Science Publishers, London.

Simpkins, W., McCartney, M., and D. Mickelson (1987), "Pleistocene Geology of Forest
County, Wisconsin," Information Circular 61, Wisconsin Geologic and Natural
History Survey, Madison, WI.

Smith, G. (1978), Elements of Soil Mechanics for Civil and Mining Engineers, 4th Ed.,
Granada Publishing, London.

Seo, D-J, Krajewski, W., and Bowles, D. (1990a), "Stochastic Interpolation of Rainfall
Data from Rain Gages and Radar Using Co-Kriging," Water Resources Research,
26(3), 469-477.

Seo, D-J, Krajewski, W., Azimi-Zonooz, A., and Bowles, D. (1990b), "Stochastic
Interpolation of Rainfall Data from Rain Gages and Radar Using Co-Kriging. Results,"
Water Resources Research, 26(5), 915-924.

Sowers, G. and G. Sowers (1970), Introductory Soil Mechanics and Foundations, 3rd
Ed., Macmillan Co., New York.

Webb, E. and Anderson, M. (1996), "Simulation of Preferential Flow in Three-


Dimensional, Heterogeneous Conductivity Fields with Realistic Internal Architecture,"
Water Resources Research, 31(3), 63-73.
Whitlow, R. (1983), Basic Soil Mechanics, Construction Press, New York.

Zheng, C. (1988), "PATH3D, A Groundwater Path and Travel Time Simulator, User's
Manual," S. S. Papadopulos and Associates, Inc., Rockville, MD.
Stanley M. Miller¹ and Anja J. Kannengieser²

GEOSTATISTICAL CHARACTERIZATION OF UNSATURATED
HYDRAULIC CONDUCTIVITY USING FIELD INFILTROMETER DATA

REFERENCE: Miller, S. M., and Kannengieser, A. J., "Geostatistical Characterization of Un-
saturated Hydraulic Conductivity Using Field Infiltrometer Data," Geostatistics for Environ-
mental and Geotechnical Applications, ASTM STP 1283, R. Mohan Srivastava, Shahrokh Rouhani,
Marc V. Cromer, A. Ivan Johnson, Alexander J. Desbarats, Eds., American Society for Testing and
Materials, 1996.

ABSTRACT: Estimation of water infiltration and retention in surficial


soils is a critical aspect of many geotechnical and environmental site
evaluations. The recent development of field-usable tension infiltro-
meters now allows insitu measurements of unsaturated hydraulic conduc-
tivity (Ku), thus avoiding some uncertainties associated with remolded
soil samples tested in the laboratory. Several different geostatistical
"mapping" methods can be used to spatially characterize Ku , including
ordinary and indicator kriging, as well as spatial simulations that pro-
vide realizations (stochastic images) of Ku that exhibit more natural
variability than do the smoothed spatial estimations of kriging. Multi-
variate procedures, such as cokriging and Markov-Bayes simulation, can
incorporate information from a secondary attribute (e.g., particle size
information) to enhance the spatial characterization of an undersampled
Ku field. These geostatistical procedures are demonstrated and compared
for a case study at a 700 sq. meter site comprised of coarse soil
material. Results indicate that percent-by-weight fractions can be used
effectively to enhance insitu spatial characterization of Ku.

KEY WORDS: unsaturated hydraulic conductivity, particle size, site


characterization, geostatistics, kriging, spatial simulation.

An important physical property to be measured when investigating


water infiltration through surficial soils is the insitu unsaturated
hydraulic conductivity (Ku). The recent development of field-usable
tension infiltrometers now provides the capability to measure insitu Ku ,
thus avoiding some of the uncertainties associated with remolded soil
specimens tested in the laboratory (e.g., loss of insitu soil structure).
Even though field measurements of unsaturated hydraulic conduc-
tivity exhibit spatial variability, enough spatial dependence typically

¹Dept. of Geology and Geol. Engrg., Univ. of Idaho, Moscow, ID 83844

²Dept. of Mathematics and Statistics, Univ. of Idaho, Moscow, ID 83844

MILLER AND KANNENGIESER ON CONDUCTIVITY 201

is observed to warrant a geostatistical investigation to characterize Ku


across the study site. Because the sampling time is fairly rapid, as many
as 20 to 30 Ku field measurements can be obtained in two days, a procedure
much faster and more time-efficient than laboratory testing of remolded
specimens. This provides an adequate
number of data for many geostatistical assessments, and the data base
can be supplemented by additional data on other physical properties
related to Ku (particle-size distribution attributes or insitu density).
Several different geostatistical "mapping" methods can be used to
spatially characterize Ku. Univariate procedures include: 1) ordinary
kriging, which provides a smoothed map of Ku estimates at unsampled
locations across the site, 2) indicator kriging, which provides local
estimates of conditional probability distributions of Ku at specified
grid locations across the site, and 3) Gaussian-based simulations, which
provide spatial realizations (stochastic images) of Ku that exhibit more
natural variability than do the smoothed spatial estimations of kriging.
Multivariate procedures, such as cokriging and Markov-Bayes simulation
can incorporate spatial information from a secondary attribute (e.g.,
the median particle size) to enhance the spatial characterization of an
undersampled Ku field.
To demonstrate and evaluate these various spatial characterization
methods, a case study at a 700 sq. meter site was conducted. The
coarse-grained soil material, as represented by 20 sampling locations
across the site, had a median particle size of approximately 7 mm and
averaged less than 6% by weight fines (i.e., finer than a No. 200
sieve). Insitu Ku values at a 5-cm tension were obtained at each of the
20 sites to provide a minimally sized sample for geostatistical studies.
Background information, analytical procedures, and results of the study
are presented below.

TENSION INFILTROMETER

For nearly a hundred years, soil scientists have been describing


the flow of water through unsaturated materials. Estimating the amount
and rate of such water flow requires knowledge of the Ku/moisture con-
tent relationship or the Ku/soil tension (water potential) relationship.
The most commonly used method to define these relationships relies on
laboratory measurements obtained by pressure desorption of a saturated
core of soil material, which leads to the construction of a moisture
characteristic curve of moisture content vs. soil tension (Klute 1986).
However, there are three problems associated with such testing: 1) the
time required to set up samples and then test over a wide range of soil
tensions; 2) the cost of field sampling, remolding specimens, and
monitoring the laboratory tests, which may take several weeks; and 3)
potentially unrealistic results due to the remolding of specimens, which
destroys any insitu soil structure or packing arrangements that may have
strong influence on flow characteristics. This latter concern is
especially applicable to coarse-grained soil materials, such as those
containing a significant amount of gravel or coarse sand.
Because of these concerns, there has been considerable interest in
recent years among soil scientists to develop methods for field meas-
urements of unsaturated flow properties (Ankeny et al. 1988; Perroux and

White 1988; Clothier and Smettem 1990; Ankeny et al. 1991; Reynolds and
Elrick 1991). Field-capable devices for such work are characterized as
"tension infiltrometers" or "disk permeameters." They allow direct
measurement of insitu infiltration (flow rate) as a function of tension,
which leads to estimation of the insitu Ku value.
The tension infiltrometer used in this study was manufactured by
Soil Measurement Systems of Tucson, Arizona. It has a 20-cm diameter
infiltration head that supplies water to the soil under tension from a
Mariotte tube arrangement with a 5-cm diameter water tower and a 3.8-cm
diameter bubbling tower (Fig. 1). Three air-entry tubes in the stopper
on top of the bubbling tower are used to set the operating tension. All
major parts are constructed of polycarbonate plastic, with a very fine
nylon mesh fabric covering the infiltration head. Pressure transducers
installed at the top and bottom of the water tower are used to measure
accurately the infiltration rate. Output from the transducers is fed
electronically to a field datalogger for real-time data acquisition and
storage. Procedures for field setup and use of the instrument are given
in the SMS User Manual (1992).
Using the measured flow rates, Q (cm³/hr), from the field tests,
values of Ku can be obtained using formulae given by Ankeny et al.
(1988), Ankeny et al. (1991), and Reynolds and Elrick (1991). The first
step is to calculate the pore-size distribution parameter, α, for a
pair of tension settings:

    α = ln(Q1/Q2) / (h2 - h1)                                  (1)

where: h1 = first soil tension value, h2 = second soil tension value
(higher than h1), Q1 = volumetric infiltration rate for the first ten-
sion setting, Q2 = volumetric infiltration rate for the second tension
setting.
Next, a parameter known as Ks (akin to saturated hydraulic
conductivity) is calculated as follows:

    Ks = Q1 exp(α h1) / [π r² (1 + 4/(π r α))]                 (2)

where: r = effective radius of wetted area beneath the infiltration
disk (cm), h1 = selected soil tension value in the testing range, Q1 =
volumetric infiltration rate corresponding to h1.
Then, an exponential relationship is used to calculate the desired
Ku(h) given the results from Eqns. (1) and (2):

    Ku(h) = Ks exp(-α h)                                       (3)

For our field study, measured infiltration rates were recorded at
soil tensions (suctions) of -3, -6, and -15 cm of water. The pair of
tensions at -3 and -15 cm was used in Eqns. (1) and (2) to
obtain estimates of α and Ks, respectively. Values of Ku then could be
calculated at any soil tension h desired.
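The three-step calculation can be sketched in a few lines. This is a minimal illustration assuming the exponential (Gardner-type) conductivity model described above; the flow rates and effective disk radius used here are hypothetical, with tensions entered as positive magnitudes.

```python
import math

def alpha_from_pair(q1, q2, h1, h2):
    # Pore-size distribution parameter from flow rates q1, q2 (cm^3/hr)
    # measured at tensions h1 < h2 (cm of water).
    return math.log(q1 / q2) / (h2 - h1)

def ks_from_flow(q1, h1, alpha, r):
    # Ks (akin to saturated conductivity) from the disk-infiltration
    # solution, with r = effective radius of the wetted area (cm).
    return (q1 * math.exp(alpha * h1)) / (
        math.pi * r**2 * (1.0 + 4.0 / (math.pi * r * alpha)))

def ku(h, ks, alpha):
    # Unsaturated hydraulic conductivity at tension h (cm).
    return ks * math.exp(-alpha * h)

# Hypothetical flow rates at the -3 and -15 cm tension settings:
a = alpha_from_pair(q1=250.0, q2=60.0, h1=3.0, h2=15.0)
k_s = ks_from_flow(q1=250.0, h1=3.0, alpha=a, r=10.0)
k_5 = ku(5.0, k_s, a)   # Ku at a 5-cm tension
```

With α > 0, the computed Ku decreases as tension increases, matching the behavior of the exponential relationship.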

[Figure: schematic of the tension infiltrometer, showing the water reservoir with pressure transducers, pinch clamp, three-hole stopper, bubble tower, air tube, nylon screen, 20-cm infiltration disc, and shut-off valve]

FIG. 1--Schematic diagram of tension infiltrometer.

SELECTED GEOSTATISTICAL MAPPING METHODS

Geostatistical spatial characterization of a specified attribute


generally involves the generation of maps by "filling in" values of the
attribute at numerous unsampled locations. Such filling-in processes
that honor the available data can be achieved by one of two methods,
interpolation or simulation. Spatial interpolation methods tend to
smooth the spatial pattern of the attribute (causing the set of
estimates to have a smaller variance than the actual data set), but
generally provide good local estimations. Spatial simulations, on the
other hand, provide more realistic fluctuations, with the set of
simulated values having a variance that approximates that of the actual
data set.
The theoretical basis and important practical considerations of
ordinary kriging, the common geostatistical estimation method, have been
described in published literature over recent years (for example, see
David 1977; Journel and Huijbregts 1978; Clark 1979; Isaaks and
Srivastava 1989). In essence, the procedure involves calculating a
weighted average of neighborhood data, where the weights represent
least-squares regression coefficients obtained by incorporating spatial
covariances between the data locations and the estimation location
(CBi's) and those between the pairs of the data values (Cij's).
Ordinary kriging provides unbiased and minimum-error estimates, and it
can be used to estimate values at point locations or to estimate the
average value of blocks (areas or volumes).
The estimated value of the block (or point, if point kriging is
used) is obtained by a weighted average of n data in the immediate
neighborhood (x's at locations ui):

    VB = Σ(i=1..n) ai x(ui)   where the ai's are the kriging weights.  (4)

In practice, the number of neighborhood data used in kriging


estimation is limited so that only those data locations within a range
of influence (or so) of the block or point location are used. Range of
influence is defined at that distance beyond which data values are not
dependent (i.e., covariance is zero). The block covariance CBB is a
constant value for all blocks of identical dimensions; it is estimated
by averaging the calculated covariance values between location pairs in
the block defined by 4, 9, 16, or 25 locations. For point kriging, CBB
= s², the sample variance. The CBi values are obtained by averaging
the covariances between 4, 9, 16, or 25 locations in the block with each
i-th data location in the neighborhood. Any given Cij value is the
covariance calculated for the lag and direction defined by the i-th and
j-th data locations in the neighborhood. In all cases, the desired
covariance value is obtained from the modeled variogram or complementary
covariance at the specified lag distance and direction of the pair of
locations being considered.
Ordinary kriging is a useful spatial interpolation and mapping
tool, because it honors the data locations, provides unbiased estimates
at unsampled locations, and provides for minimum estimation variance.
It also produces a measure of the goodness of estimates via the
calculated kriging variance or kriging standard deviation. Because
kriging is an interpolator, it produces a smoothed representation of the
spatial attribute being mapped. Consequently, the variance of kriged
estimates often is considerably less than the sample variance, and a
kriged map will appear smoother than a map of the raw data.
Kriging also accounts for redundancy in sample locations through
the incorporation of the Cij information. Thus, kriging weights
assigned to data locations clustered in the neighborhood will be less
than those assigned to solitary data locations. In fact, data
"overloading" to one side of the estimation point or block may result in
the calculation of small negative kriging weights.
The ordinary kriging system of equations to be solved for the
kriging weights is given by (a Lagrange term, λ, is used to preserve
unbiased conditions and to optimize estimates by minimizing the
estimation error):

    Σ(j=1..n) aj Cij + λ = CBi ,   i = 1, ..., n               (5a)

    Σ(j=1..n) aj = 1                                           (5b)

This system of equations is solved to obtain the ai weights and λ. In


addition, the estimation variance, or kriging variance, can be obtained
at each estimation location.
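Assembled as a linear system, the ordinary kriging equations (5a) and (5b) can be solved directly. The sketch below is illustrative only; the exponential covariance model, its parameters, the data coordinates, and the data values are all hypothetical.

```python
import numpy as np

def ordinary_kriging_weights(C, c0, var0):
    """Solve the OK system for the weights a_i and the Lagrange term.

    C    : n x n matrix of covariances Cij between data locations
    c0   : length-n vector of covariances CBi between data and the
           estimation point/block
    var0 : prior point variance (s^2) or block covariance CBB
    Returns (weights, lagrange term, kriging variance).
    """
    n = len(c0)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = C
    A[n, n] = 0.0                     # Lagrange row/column
    b = np.append(c0, 1.0)            # unbiasedness: weights sum to 1
    sol = np.linalg.solve(A, b)
    w, lam = sol[:n], sol[n]
    kvar = var0 - w @ c0 - lam        # kriging (estimation) variance
    return w, lam, kvar

def cov(h, sill=1.0, a=10.0):
    # Hypothetical exponential covariance model with range parameter a.
    return sill * np.exp(-3.0 * h / a)

# Toy neighborhood: three data locations and one estimation point.
pts = np.array([[0.0, 0.0], [5.0, 0.0], [0.0, 5.0]])
target = np.array([2.0, 2.0])
D = np.linalg.norm(pts[:, None] - pts[None, :], axis=-1)
C = cov(D)
c0 = cov(np.linalg.norm(pts - target, axis=1))
w, lam, kv = ordinary_kriging_weights(C, c0, var0=1.0)
est = w @ np.array([3.2, 4.1, 2.8])   # weighted average of data, Eqn. (4)
```

Note how the two equidistant data locations receive equal weights, while the incorporation of the Cij terms would down-weight clustered (redundant) locations, as described above.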
When describing the spatial dependence of an attribute of interest
(i.e., the covariance values needed for kriging), either the semivar-
iogram function or the spatial covariance function can be used (for
example, see Isaaks and Srivastava 1989). Sometimes, difficulties are
encountered when estimating these functions using skewed data sets that
contain outliers. The influence of such outliers can be mitigated in
many cases by using monotonic data transforms or by using an indi-
cator-transform framework that leads to computing indicator variograms
for use in indicator kriging (Journel 1983). The goal when kriging
indicator-transformed data is not to estimate the unsampled value at
location u0, X(u0), nor its indicator transform i(u0; xk), which equals 1 if
X(u0) ≤ xk and equals 0 if X(u0) > xk. Instead, indicator kriging yields a
least-squares estimate of the local, conditional cumulative distribution
function (cdf) for each cutoff xk, this estimate being valued between 0
and 1 and obtained by ordinary kriging of indicator values. Thus, at
each of k cutoffs, an estimated (designated by *) conditional cdf value
for xk is obtained, which is equivalent to the indicator kriged value (a
weighted average of neighboring 0's and 1's) at location u0 with cutoff
xk:

    F*[xk | (n)] = P*[{X(u0) ≤ xk} | (n)]
                 = E*[{I(u0; xk)} | (n)] = [i(u0; xk)]*        (6)

where (n) represents the local conditioning information in the neighbor-


hood surrounding the unsampled location u0. Once local conditional cdf's
are estimated and then post-smoothed (if needed), maps of probability
information or percentiles can be constructed to characterize the site.
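The indicator transform and the cdf estimate of Eqn. (6) can be sketched as follows. For illustration only, equal weights stand in for actual ordinary kriging weights of the indicator data, and the Ku values are hypothetical.

```python
import numpy as np

def indicator(values, cutoff):
    # i(u; xk) = 1 where x(u) <= xk, else 0.
    return (np.asarray(values) <= cutoff).astype(float)

def ik_cdf_estimate(ind, weights):
    # Indicator-kriged cdf value at one cutoff: a weighted average of the
    # neighborhood 0's and 1's, clipped to the valid [0, 1] range (a
    # simple form of the post-smoothing mentioned above).
    p = float(np.dot(weights, ind))
    return min(max(p, 0.0), 1.0)

# Hypothetical Ku data (cm/hr) and a set of k cutoffs:
ku_vals = [0.12, 0.45, 0.30, 0.95, 0.08]
w = np.full(len(ku_vals), 1.0 / len(ku_vals))
cdf = [ik_cdf_estimate(indicator(ku_vals, xk), w) for xk in (0.1, 0.3, 0.5, 1.0)]
```

Scanning the estimated cdf values across the cutoffs gives the local probability information from which percentile or probability maps are constructed.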
When bivariate data sets are available, cokriging can be used to
provide estimates of the main attribute of interest that incorporate
additional information from a secondary attribute. This requires
computing semivariograms or spatial covariances for each individual
attribute, as well as the cross-semivariogram between the two attributes
(for example, see Isaaks and Srivastava 1989). Linear coregionalization
of the semivariogram models allows for the cokriging covariance matrix
to be positive definite and thus avoid theoretical and computational
problems, such as estimating negative kriging variances (Isaaks 1984).
When adequate data are available and sufficient intervariable relation-
ships observed, cokriging may provide a more comprehensive estimation
than univariate kriging.
Several types of spatial simulations also are available for
mapping a spatial attribute. As discussed earlier, simulations provide
natural-looking fluctuations in spatial patterns, while still honoring
known data locations and preserving the desired variance and spatial
covariance. Thus, simulations do not provide the smoothed appearance on
maps typical to kriging estimations.
For the case study that follows, we wanted to compare a
straightforward simulation procedure to a more complicated approach.
Therefore, we investigated sequential Gaussian simulation and Markov-
Bayes simulation, respectively. In addition, we used simulated anneal-
ing to generate numerous "data" values to supplement the available 20
"hard" data, and thus, provide a reference or training image to be used
as a standard basis for comparisons. Discussions of these simulation
methods and related software are given by Deutsch and Journel (1992).
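As an illustration of the sequential idea (a minimal stand-in, not the GSLIB routines used in the study), a one-dimensional sequential Gaussian simulation can be sketched as below. The covariance model, its parameters, and the conditioning data are hypothetical, and the data are assumed to be already in normal-score units.

```python
import numpy as np

rng = np.random.default_rng(7)

def cov(h, sill=1.0, a=10.0):
    # Hypothetical exponential covariance model.
    return sill * np.exp(-3.0 * np.abs(h) / a)

def sgs_1d(x_data, z_data, x_grid):
    """Minimal 1-D sequential Gaussian simulation: visit grid nodes in
    random order, simple-krige each node from the data plus previously
    simulated nodes, and draw from the resulting conditional normal.
    Assumes z_data are standard normal scores with zero global mean."""
    xs, zs = list(x_data), list(z_data)
    sim = {}
    for i in rng.permutation(len(x_grid)):
        x0 = x_grid[i]
        X = np.array(xs)
        C = cov(X[:, None] - X[None, :])
        c0 = cov(X - x0)
        w = np.linalg.solve(C, c0)           # simple kriging weights
        mean = w @ np.array(zs)              # SK estimate
        var = max(cov(0.0) - w @ c0, 0.0)    # SK variance
        z0 = rng.normal(mean, np.sqrt(var))
        sim[i] = z0
        xs.append(x0)                        # condition later nodes
        zs.append(z0)
    return np.array([sim[i] for i in range(len(x_grid))])

# Conditioning data (normal scores) and a grid offset from the data:
x_data = np.array([2.0, 11.0, 25.0])
z_data = np.array([0.6, -0.4, 1.1])
real = sgs_1d(x_data, z_data, np.arange(0.5, 30.0, 1.0))
```

Each realization fluctuates naturally between the conditioning data rather than smoothing toward them, which is the contrast with kriged maps described above; rerunning with a different random seed yields another equally probable stochastic image.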

CASE STUDY

The site selected for the case study was a portion of a heap
leaching pad at a base-metal mine in the Western U.S. Material at the
site consisted of blasted ore, with particle sizes ranging from several
microns up to several tens of millimeters. Although not a typical soil
in agricultural terms (i.e., possessing necessary organic materials to
support plant life), this material would be classified by engineers as a
coarse gravel with some sand and fines. This type of coarse material
would provide a rigorous test for the SMS tension infiltrometer, which
was designed to be used primarily for agricultural-type soils.

Field and Laboratory Work

Due to time and budgetary limitations, only 20 locations were


sampled over the study site, which was approximately 30 m (E-W) by 20 m
(N-S). Prior to selecting the sampling locations in the field, various
sampling layouts were studied by investigating their lag (separation
distance between any two locations) distributions. The goal was to have
a sampling plan that would provide adequate numbers of data pairs at
short and intermediate lags to facilitate the computation and modeling
of semivariograms. At the same time, fairly uniform coverage across the
site was desired to establish a solid basis for kriging and for simula-
tion. Numerous sampling layouts were evaluated by a trial-and-error
method before the final layout was selected. Even this plan was not
final, because some changes would be needed in the field, such as when a
specified location lay directly over a large cobble.
At each of the 20 sampling locations, an infiltrometer test pad
was leveled by hand, large rocks were removed (those greater than about
8 cm across), and a 3-mm layer of fine sand was laid down to provide
proper contact between the infiltrometer head and the ground surface.
The 20-cm diameter head of the SMS tension infiltrometer device then was
placed on the prepared pad and the infiltration test conducted. Water
flow quantities were measured for three different tensions (suctions):
-3 cm, -6 cm, and -15 cm of water head. Pressure transducers and
electronic data-acquisition hardware were used to record the flow data
on a storage module for later use.
Once the infiltration test was completed at a given location, the
wetted soil material directly beneath the infiltration disc was sampled.
Several kilograms of the material were placed in sealed sample bags for
subsequent analysis at the University of Idaho. Insitu measurements of
density were not attempted at this particular site, due to the amount of
gravel and larger-sized rocks. However, such measurements with a
neutron moisture/density gage are recommended for similar studies of
unsaturated hydraulic conductivity.
At the University of Idaho Soils Laboratory, the 20 specimens were
air-dried and rolled to break up aggregated fines prior to sieve
analyses conducted according to procedure ASTM-D422, excluding the
hydrometer analyses. A stack of 13 sieves was used to sieve the granu-
lar materials, and particle-size distribution curves then were plotted
to display the sieve results. All specimens showed fairly well-graded
particle size distributions over size ranges from less than 0.075 mm
(fines) to 75 mm (coarse gravel). None of the specimens had more than
8% by weight passing the No. 200 sieve (0.075 mm). Consequently,
MILLER AND KANNENGIESER ON CONDUCTIVITY 207
hydrometer analyses were not deemed necessary. Based on the Unified
Soil Classification System, the materials were identified as sandy
gravel with nonplastic fines.

Data Analysis

Given the measured volumetric flow rates at -3 and -15 cm
tensions, values of a and Ks were calculated according to Eqns. (1) and
(2). Values of Ku at several selected tensions, h, then were computed
and compared. Desiring to stay within the field measurement range and
yet wanting to approximate field behavior at near-saturation conditions
(such as after a heavy precipitation event or during spring snow-melt),
we eventually selected a soil tension of -5 cm for all subsequent cal-
culations of the unsaturated hydraulic conductivity. Resulting values
of Ku(-5), expressed in cm/day, for the 20 sampling locations are shown
in the postplot of Fig. 2. Sample statistics for Ku(-5) are summarized
below (units are cm/day):

mean 183           minimum 124
s.d. (n-1) 36.9    median 189
s.d. (n) 35.9      maximum 238

Various particle-size attributes were studied to evaluate their
influence on Ku(-5), including the D10, D25, and D50 sizes, as well as
the percent-by-weight finer than 2.0 mm (No. 10 sieve) and 4.75 mm (No.
4 sieve). Scatterplots of Ku(-5) vs. each of these attributes were
generated and fitted with linear regression models. The three
characteristics showing the strongest linear relationships were the
percent-by-weight finer than 2.0 mm and finer than 4.75 mm, and the D25
size. Linear correlation coefficients were in the 0.55 - 0.60 range,
positive for the first two attributes and negative for the third.
Subsequent computations of experimental semivariograms for these three
characteristics indicated that only the percent-by-weight finer than
2.0 mm (PF2) showed any significant univariate spatial dependence and
cross spatial dependence with Ku(-5). Thus, this parameter from the
particle-size distributions was selected as a secondary attribute to
help estimate and map the primary attribute, Ku(-5). Sample statistics
for PF2 are summarized below (units are %):

mean 24.5          minimum 17.5
s.d. (n-1) 2.50    median 25.1
s.d. (n) 2.44      maximum 27.9

Computing usable semivariograms with small data sets can be a
challenging task. With only 20 data locations for this study, it was
difficult to select computational lag bins that would provide adequate
numbers of data pairs for the irregularly spaced data set. Therefore,
we decided to use a "sliding lag window" approach for computing the
experimental semivariograms. A sliding window 5-m wide was used for
both the Ku(-5) and the PF2 data sets. Thus, the plotted points shown
on the semivariogram graphs in Fig. 3 represent overlapping lag bins of
0-5 m, 1-6 m, 2-7 m, and so on. Because of the limited number of data,
only isotropic (omnidirectional) semivariograms were computed. The two


FIG. 2--Postplot of estimated insitu values of Ku(-5) in cm/day;
northing and easting coordinates are in meters.


FIG. 3--Estimated semivariograms and fitted spherical models; (a)
Ku(-5) model: γ(h) = 671 + 620 Sph12(h); (b) PF2 model:
γ(h) = 0.4 + 5.55 Sph12(h).

experimental semivariograms were fitted with spherical variogram models,
as described and annotated in Fig. 3. Sills on the models were set
equal to the sample variance in both cases.
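As an illustration of the sliding-lag-window approach described above, the following minimal Python sketch computes an omnidirectional experimental semivariogram with overlapping lag bins. The function name and argument layout are illustrative assumptions; this is not the software used in the study.

```python
import math

def sliding_window_semivariogram(coords, values, width=5.0, step=1.0, max_lag=22.0):
    """Omnidirectional experimental semivariogram using overlapping lag bins
    (0-5 m, 1-6 m, 2-7 m, ...), per the sliding-lag-window approach."""
    n = len(values)
    pairs = []  # (separation distance, squared difference) for each data pair
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(coords[i], coords[j])
            pairs.append((d, (values[i] - values[j]) ** 2))
    points = []  # (lag-bin midpoint, estimated semivariance)
    lo = 0.0
    while lo + width <= max_lag + 1e-9:
        in_bin = [sq for d, sq in pairs if lo <= d < lo + width]
        if in_bin:
            # classical estimator: half the mean squared difference in the bin
            points.append((lo + width / 2.0, sum(in_bin) / (2.0 * len(in_bin))))
        lo += step
    return points
```

With the 20 Ku(-5) locations, width=5 and step=1 would reproduce the overlapping 0-5 m, 1-6 m, 2-7 m bins plotted in Fig. 3.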
The Ku(-5) data set was fitted with a first-order trend surface
model to determine if there was any significant trend in mean across the
site. Calculated F-statistics for this regression fit were not large
enough to reject the null hypothesis of no significant trend in the
mean. Thus, one of the primary considerations (i.e.,
that the mean does not depend on spatial location) of the covariance
stationarity model for spatial random functions could be readily
accepted for subsequent spatial estimations and simulations.
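The trend-surface check described above can be sketched in Python; `trend_surface_f_test` is a hypothetical helper that fits z = b0 + b1*x + b2*y by least squares and returns the regression F-statistic (2 and n-3 degrees of freedom), which would then be compared against a critical value.

```python
def trend_surface_f_test(coords, z):
    """Fit a first-order trend surface z = b0 + b1*x + b2*y by least squares
    and return the regression F-statistic used to test for a spatial trend."""
    n = len(z)
    X = [[1.0, x, y] for x, y in coords]
    # normal equations: (X'X) b = X'z
    XtX = [[sum(X[k][i] * X[k][j] for k in range(n)) for j in range(3)] for i in range(3)]
    Xtz = [sum(X[k][i] * z[k] for k in range(n)) for i in range(3)]
    # solve the 3x3 system by Gauss-Jordan elimination with partial pivoting
    A = [row[:] + [b] for row, b in zip(XtX, Xtz)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        for r in range(3):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    b = [A[i][3] / A[i][i] for i in range(3)]
    zbar = sum(z) / n
    fitted = [b[0] + b[1] * x + b[2] * y for x, y in coords]
    ss_reg = sum((f - zbar) ** 2 for f in fitted)   # regression sum of squares
    ss_res = sum((zi - fi) ** 2 for zi, fi in zip(z, fitted))  # residual SS
    return (ss_reg / 2.0) / (ss_res / (n - 3))
```

A small F-statistic, as found for the Ku(-5) data, supports treating the mean as independent of location.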

Site Characterization and Mapping Using Geostatistics

The isotropic semivariogram model shown in Fig. 3a provided the
spatial covariance model to conduct ordinary point kriging on regular
grids to generate estimates of Ku(-5) across the study site. GeoEAS
(Englund and Sparks 1991) computer software was used. A regular 25 x 35
grid at 1-m spacings was used, because at field sampling locations an
area approximately 0.7 to 1 m in diameter was wetted during each infil-
trometer test. The shaded contour map of Fig. 4 clearly shows the
smoothing characteristics of kriging. Summary statistics for these
estimations are presented in Table 1. Comparisons of the sample
variances again reflect the significant amount of spatial smoothing
inherent to kriging estimations. Given similar estimates for stora-
tivity and soil-layer thickness, water-balance computations that
incorporate annual precipitation and evapotranspiration values can be
conducted to predict total annual recharge at each grid location at the
site (Miller et al. 1990).
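A minimal sketch of ordinary point kriging with a spherical semivariogram model follows. It is not the GeoEAS implementation used in the study; the function names are illustrative, and the nugget, partial sill, and 12-m range of the Fig. 3a model would be supplied as parameters.

```python
import math

def spherical_gamma(h, nugget, sill, a):
    """Spherical semivariogram model, e.g., 671 + 620*Sph12(h) for Ku(-5)."""
    if h == 0.0:
        return 0.0
    if h >= a:
        return nugget + sill
    r = h / a
    return nugget + sill * (1.5 * r - 0.5 * r ** 3)

def ordinary_krige(coords, values, target, gamma):
    """Ordinary point kriging estimate at `target`, using the semivariogram
    function `gamma(h)`; weights are constrained to sum to one."""
    n = len(values)
    # OK system in semivariogram form, with a Lagrange-multiplier row/column
    A = [[gamma(math.dist(coords[i], coords[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    A.append([1.0] * n + [0.0])
    b = [gamma(math.dist(coords[i], target)) for i in range(n)] + [1.0]
    # solve the (n+1)x(n+1) system by Gauss-Jordan elimination
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    m = n + 1
    for c in range(m):
        p = max(range(c, m), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(m):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [u - f * v for u, v in zip(M[r], M[c])]
    w = [M[i][m] / M[i][i] for i in range(n)]  # kriging weights
    return sum(wi * vi for wi, vi in zip(w, values))
```

Kriging in this form is an exact interpolator: estimating at a data location returns the datum itself, and the unit-sum weights reproduce a constant field exactly.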
Point cokriging of Ku(-5) also was conducted on the 1-m grid,
incorporating the spatial information and codependence of PF2 (percent-
by-weight finer than 2.0 mm). The two semivariogram models of Fig. 3,
as well as a cross-semivariogram between the two attributes (Fig. 5),
were used in the GSLIB cokriging software (Deutsch and Journel 1992).
Cokriged estimates were plotted and contoured to produce the shaded
contour map given in Fig. 6. Summary statistics of the estimates are
reported in Table 1. Cokriging did yield estimates with greater
variance than ordinary point kriging, but not with as great a relief
(i.e., difference between maximum and minimum). This estimation method
would be especially applicable in situations where numerous sampling
sites with particle-size analyses could be used to supplement a few
actual Ku sampling sites.

TABLE 1--Summarized statistics of kriging results for Ku(-5), cm/day.

Data Ordinary Point Kriging Cokriging

n 20 875 875
mean 183.0 183.2 182.3
s.d. 35.9 11.8 13.1
var. 1291.0 140.1 170.8
min. 124.0 127.2 139.6
max. 238.0 237.8 215.7


FIG. 4--Shaded contour map of Ku(-5) point kriging estimates (cm/day)
on a 1-m regular grid.


FIG. 5--Estimated cross semivariogram between Ku(-5) and PF2 with
fitted spherical model: γ(h) = 8.0 + 42.5 Sph12(h).

Indicator kriging yields a different kind of mapping information
that often is useful to characterize a spatial attribute. As discussed
previously, local conditional cdf's are estimated across the site by
indicator kriging at several different data threshold values. For this
study, five thresholds were assigned for the Ku(-5) data: 140, 161.5,
190, 211.5, and 235 cm/day. Computer software in GSLIB (Deutsch and
Journel 1992) was used to conduct the indicator kriging, smooth the


FIG. 6--Shaded contour map of Ku(-5) cokriging estimates on a 1-m
regular grid, using PF2 as the secondary attribute.

estimated cdf's, and produce E-type estimates (expectation, or mean
values) for mapping purposes (Fig. 7a). The probability of exceeding
200 cm/day also was calculated at each estimation location, and a shaded
contour map of this exceedance probability was produced (Fig. 7b). This
exceedance cutoff value was selected arbitrarily, but serves to
illustrate the types of probability maps that can be generated to help
characterize Ku at the site and provide input for cost-benefit studies
to assist in treatment or remediation designs. Another advantage of the
indicator kriging framework is that "soft" information (inequality
relations, professional judgments, etc.) can be coded probabilistically
and used to supplement available "hard" indicator data (Journel 1986).
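The indicator-kriging post-processing described above (building a local cdf at the thresholds, then deriving an E-type estimate and an exceedance probability) can be sketched as follows. The linear within-class interpolation and the helper names are illustrative assumptions, not the GSLIB routines used in the study.

```python
def indicator_transform(value, thresholds):
    """Indicator coding i(z; zk) = 1 if z <= zk, else 0, for each threshold."""
    return [1.0 if value <= zk else 0.0 for zk in thresholds]

def etype_and_exceedance(cdf, thresholds, zmin, zmax, cutoff):
    """Given estimated cdf values F(zk) at the thresholds (e.g., from
    indicator kriging), return the E-type (mean) estimate and Prob{Z > cutoff},
    interpolating linearly within classes (a common post-processing choice)."""
    zs = [zmin] + list(thresholds) + [zmax]
    Fs = [0.0] + list(cdf) + [1.0]
    # E-type estimate: class midpoints weighted by class probabilities
    etype = sum(0.5 * (zs[k] + zs[k + 1]) * (Fs[k + 1] - Fs[k])
                for k in range(len(zs) - 1))
    # linear interpolation of F at the cutoff (cutoff assumed within [zmin, zmax])
    for k in range(len(zs) - 1):
        if zs[k] <= cutoff <= zs[k + 1]:
            t = (cutoff - zs[k]) / (zs[k + 1] - zs[k])
            F_cut = Fs[k] + t * (Fs[k + 1] - Fs[k])
            break
    return etype, 1.0 - F_cut
```

For this study the thresholds would be 140, 161.5, 190, 211.5, and 235 cm/day, with the 200-cm/day cutoff producing the exceedance map of Fig. 7b.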
As shown in the Ku(-5) postplot of Fig. 2, spatial variability in
the unsaturated hydraulic conductivity is typical. For example, at a
northing coordinate of 3386 m, a value of 205 cm/day is adjacent to a
value of 148 cm/day, and a 209-cm/day value is adjacent to one of 127
cm/day. This spatial variability is to be expected, especially for
coarser grained soil materials. Therefore, smoothed kriging maps of the
type presented thus far may not always be the most appropriate way to
characterize Ku. Spatial simulations that honor available data and also
preserve the sample variance provide quite a different prediction of
spatial patterns.
To compare the performances of two types of spatial simulators for
small data sets of Ku, we first used simulated annealing (Deutsch and
Journel 1992) on a 1-m grid to generate a pseudo ground-truth image of
Ku(-5) that could serve as a reference base-map (Fig. 8). Sample sta-
tistics for the 875 simulated values are summarized below (in cm/day):

mean 184     minimum 115
s.d. 35.8    median 189
var. 1280    maximum 248


FIG. 7--Shaded contour maps of Ku(-5) indicator kriging results on
a 1-m regular grid; (a) E-type map showing cdf expectation values,
cm/day; (b) probability of exceeding 200 cm/day.


FIG. 8--Shaded contour map of Ku(-5) "reference data" based on
simulated annealing.

The isotropic semivariogram for the simulated values was similar to that
shown in Fig. 3a.
Sequential Gaussian and Markov-Bayes simulations of Ku(-5) were
conducted on a 1-m grid using software from GSLIB (Deutsch and Journel
1992), based on the 20 known data values and on the semivariogram model
of Fig. 3a. The Markov-Bayes procedure also uses secondary information
(PF2 in this case), but when both the primary and secondary attributes
have data values at the same locations, the primary information is given
precedence over the secondary (Zhu 1991; Miller and Luark 1993). Thus,
results of the two different simulation methods for four trials (simu-
lation passes, or iterations) were quite similar, showing average mean-
square-errors on the order of 2,300 cm/day squared. These errors were
calculated as squared differences between the 875 simulated values and
the 875 values of the ground-truth image. It was not surprising to
observe that the largest of these errors occurred in the most sparsely
sampled areas of the study site, where uncertainties are greatest for
the simulated annealing approach and the other simulation methods.
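The comparison criterion used above can be sketched directly; `mean_square_error` is an illustrative helper name.

```python
def mean_square_error(simulated, reference):
    """Average squared difference between corresponding grid values, the
    criterion used here to compare a simulation against the reference image."""
    if len(simulated) != len(reference):
        raise ValueError("grids must have the same number of nodes")
    return sum((s - r) ** 2 for s, r in zip(simulated, reference)) / len(simulated)
```

For this study, each trial's 875 simulated values would be compared against the 875 values of the simulated-annealing reference image.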
Advantages of a bivariate simulation method, such as the Markov-
Bayes (MB) procedure, become apparent when the primary attribute is
undersampled in regard to the secondary attribute. To illustrate this,
we selected several subsets (reduced data sets) containing 10 of the
original 20 Ku sampling sites. The goal became one of simulating 875
Ku(-5) values, given 20 PF2 sites and 10 Ku(-5) sites. These results
then could be compared to those based on sequential Gaussian (SG)
simulation using only the 10 Ku(-5) data. Basic statistical information
for the three different subsets, A, B, and C, is presented in Table 2.

TABLE 2--Summarized statistics for the three subsets of Ku(-5) and
for the corresponding simulation results (units are cm/day).

Original Data Subset A Subset B Subset C

n 20 10 10 10
mean 183.0 177.4 194.4 176.5
s.d. 35.9 35.6 30.3 40.1
var. 1291.0 1264.0 918.6 1610.0
min. 124.0 123.7 138.4 123.7
max. 238.0 237.8 232.8 232.8

SG simulation results (averaged from four trials):


n 875 875 875
mean 176.4 194.2 173.0
s.d. 34.5 31.1 40.3
mean sq. err. using ref. image 2391. 2176. 2763.

MB simulation results (averaged from four trials):


n 875 875 875
mean 175.6 182.1 178.8
s.d. 36.2 33.4 37.2
mean sq. err. using ref. image 2588. 2344. 2476.

Note that Set B has smaller variance than the original data set, and
that Set C has a larger variance than the original data set.
In terms of overall mean squared error, SG simulation outperformed
MB simulation for Subsets A and B where the sample variance was
relatively small. However, when the subset had a larger variance, MB
simulation was the better procedure according to this criterion. The
beneficial influence of secondary data as used in the MB simulation
method is shown clearly by the sample means and standard deviations of
simulated sets based on the three different subsets. Note especially
the more consistent results for Subsets B and C shown by the MB
procedure compared to more inconsistent results of the SG procedure.
Examples of MB simulation results for these two subsets are presented as
shaded contour maps in Fig. 9.

CONCLUSIONS

A variety of geostatistical tools are available for mapping and
characterizing unsaturated hydraulic conductivity. Using recently
developed tension infiltrometers for field use, measurements of
volumetric infiltration rates provide a basis for estimating Ku values
that reflect insitu conditions of soil density, packing, and structure.
Although a vast improvement over laboratory testing of disturbed
specimens, such insitu testing still requires enough time and effort
that numerous measurements (greater than 30) at a study site likely will


FIG. 9--Examples of Markov-Bayes simulations of Ku(-5), cm/day; (a)
based on a subset (Set B) of 10 Ku data with variance lower than that
of original data; (b) based on a subset (Set C) of 10 Ku data with
variance higher than that of original data.

not be affordable except for large-budget investigations. However,
secondary information more economical to obtain, especially particle-
size characteristics such as the percent-by-weight finer than 2.0 mm,

can be used in bivariate types of kriging and simulation to fill in Ku
values at unsampled locations and provide enhanced spatial mappings.
The case study presented here dealt only with surface measurements
and two-dimensional maps. However, trenching with benched sidewalls
could be used to provide insitu Ku assessments at various elevations and
add a third dimension of elevation into the characterization scheme.
The kriging and simulation methods described herein are readily adaptable
to three-dimensional situations.
If point estimates are desired for generating contour maps of
estimated Ku, then ordinary point kriging (or indicator kriging for
local cdf's) would be preferred. When local cdf's are estimated by
indicator kriging, a variety of probabilistic type maps can be generated
to characterize spatial patterns of Ku across the study site.
When secondary data are available, and a recognizable relationship
is present between secondary and primary data, Markov-Bayes simulation
often will provide better results than those produced by univariate
simulations, such as the sequential Gaussian method. The former method
particularly has advantages when primary sample data are sparse and
perhaps not representative of the entire population, and when a larger
sample is available of the secondary attribute.

ACKNOWLEDGEMENTS

Portions of this research work were supported by the Idaho Center
for Hazardous Waste Remediation Research under Grant No. 676-X405. The
authors also express appreciation to John Hammel, John Cooper, and Mit
Linne of the University of Idaho for their technical advice and
assistance in the operation of the tension infiltrometer, analysis of
its measurements, and in the laboratory testing program. The University
of Idaho does not endorse the use of any specific commercial material or
product mentioned in this paper.

REFERENCES

Ankeny, M.D., M. Ahmed, T.C. Kaspar, and R. Horton, 1991, "Simple Field
Method for Determining Unsaturated Hydraulic Conductivity," Soil Sci.
Soc. of America Jour., Vol. 55, No. 2, p. 467-470.

Ankeny, M.D., T.C. Kaspar, and R. Horton, 1988, "Design for Automated
Tension Infiltrometer," Soil Sci. Soc. of America Jour., Vol. 52,
p. 893-896.

Clark, I., 1979, Practical Geostatistics, Applied Sci. Publ., London,
129 p.

Clothier, B.E., and K.R.J. Smettem, 1990, "Combining Laboratory and
Field Measurements to Define the Hydraulic Properties of Soil," Soil
Sci. Soc. of America Jour., Vol. 54, No. 2, p. 299-304.

David, M., 1977, Geostatistical Ore Reserve Estimation, Elsevier,
Amsterdam, 364 p.

Deutsch, C.V., and A.G. Journel, 1992, GSLIB: Geostatistical Software
Library and User's Guide, Oxford Univ. Press, New York, 340 p.

Englund, E., and A. Sparks, 1991, Geostatistical Environmental
Assessment Software User's Guide (GeoEAS 1.2.1), USEPA Env. Monitoring
Systems Lab., Las Vegas, NV.

Isaaks, E.H., 1984, "Risk Qualified Mappings for Hazardous Waste Sites:
A Case Study in Distribution-Free Geostatistics," M.S. Thesis, Stanford
Univ., Stanford, CA, 111 p.

Isaaks, E.H., and R.M. Srivastava, 1989, An Introduction to Applied
Geostatistics, Oxford Univ. Press, New York, 561 p.

Journel, A.G., 1983, "Nonparametric Estimation of Spatial Distribu-
tions," Math. Geology, Vol. 15, No. 3, p. 445-468.

Journel, A.G., 1986, "Constrained Interpolation and Qualitative
Information -- the Soft Kriging Approach," Math. Geology, Vol. 18,
No. 3, p. 269-286.

Journel, A.G., and C.J. Huijbregts, 1978, Mining Geostatistics,
Academic Press, New York, 600 p.

Klute, A., 1986, "Methods of Soil Analysis, Part 1," Amer. Soc. of
Agronomy, Monograph 9.

Miller, S.M., J.E. Hammel, and L.F. Hall, 1990, "Characterization of
Soil Cover and Estimation of Water Infiltration at CFA Landfill II,
Idaho National Engineering Laboratory," Res. Report C85-110544, Idaho
Water Resources Research Inst., Univ. of Idaho, Moscow, ID, 216 p.

Miller, S.M., and R.D. Luark, 1993, "Spatial Simulation of Rock
Strength Properties Using a Markov-Bayes Method," Int. Jour. Rock Mech.
Min. Sci. & Geomech. Abstr., Vol. 30, No. 7, p. 1631-1637.

Perroux, K.M., and I. White, 1988, "Designs for Disk Permeameters,"
Soil Sci. Soc. of America Jour., Vol. 52, No. 5, p. 1205-1215.

Reynolds, W.D., and D.E. Elrick, 1991, "Determination of Hydraulic
Conductivity Using a Tension Infiltrometer," Soil Sci. Soc. of America
Jour., Vol. 55, No. 3, p. 633-639.

Soil Measurement Systems, 1992, "Tension Infiltrometer User Manual,"
Soil Measurement Systems, Tucson, AZ.

Zhu, H., 1991, "Modeling Mixture of Spatial Distributions with
Integration of Soft Data," Ph.D. dissertation, Dept. of Applied Earth
Sci., Stanford Univ., Stanford, CA.
Marc V. Cromer¹, Christopher A. Rautman², and William P. Zelinski³

Geostatistical Simulation of Rock Quality Designation (RQD) to
Support Facilities Design at Yucca Mountain, Nevada

REFERENCE: Cromer, M. V., Rautman, C. A., and Zelinski, W. P., "Geostatistical
Simulation of Rock Quality Designation (RQD) to Support Facilities Design at Yucca
Mountain, Nevada," Geostatistics for Environmental and Geotechnical Applications,
ASTM STP 1283, R. M. Srivastava, S. Rouhani, M. V. Cromer, A. I. Johnson, and A. J.
Desbarats, Eds., American Society for Testing and Materials, 1996.

ABSTRACT: The conceptual design of the proposed Yucca Mountain nuclear waste
repository facility includes shafts and ramps as access to the repository horizon, located
200 to 400 m below ground surface. Geostatistical simulation techniques are being
employed to produce numerical models of selected material properties (rock
characteristics) in their proper spatial positions. These numerical models will be used to
evaluate behavior of various engineered features, the effects of construction and operating
practices, and the waste-isolation performance of the overall repository system. The
work presented here represents the first attempt to evaluate the spatial character of the
rock strength index known as rock quality designation (RQD). Although it is likely that
RQD reflects an intrinsic component of the rock matrix, this component becomes difficult
to resolve given the frequency and orientation of data made available from vertical core
records. The constraints of the two-dimensional study along the axis of an exploratory
drift allow bounds to be placed upon the resulting interpretations, while the use of an
indicator transformation allows focus to be placed on specific details that may be of
interest to design engineers. The analytical process and subsequent development of
material property models is anticipated to become one of the principal means of
summarizing, integrating, and reconciling the diverse suite of earth-science data acquired
through site characterization and of recasting the data in formats specifically designed for
use in further modeling of various physical processes.

KEYWORDS: indicator simulation, rock quality designation, variogram, core data

1 Principal Investigator, Sandia National Laboratories/Spectra Research Institute, MS 1324, P.O. Box 5800,
Albuquerque, NM 87185-1342
2 Principal Investigator and Senior Member Technical Staff, Sandia National Laboratories, MS 1324, P.O.
Box 5800, Albuquerque, NM 87185-1342
3 Principal Investigator, Sandia National Laboratories/Spectra Research Institute, MS 1324, P.O. Box 5800,

Albuquerque, NM 87185-1342

CROMER ET AL. ON ROCK QUALITY DESIGNATION 219

INTRODUCTION

Yucca Mountain, Nevada, is currently being studied by the U. S. Department of Energy as
a potential site for the location of a high-level nuclear waste repository. Geologic,
a potential site for the location of a high-level nuclear waste repository. Geologic,
hydrologic, and geotechnical information about the site will be required for both
engineering design studies and activities directed toward assessing the waste-isolation
performance of the overall repository system. The focus of the overall Yucca Mountain
Site Characterization Project is the acquisition of basic geologic and other information
through a multidisciplinary effort being conducted on behalf of the U. S. Department of
Energy by several federal agencies and other organizations. The location of the proposed
underground facilities and the proposed subsurface access drift are shown on Figure 1.
Also shown are the locations for the bore holes used in this two-dimensional study.

The Yucca Mountain site consists of a gently-eastward dipping sequence of volcanic tuffs
(principally welded ash flows with intercalated nonwelded and reworked units). Various
types of alteration phenomena, including devitrification, zeolitization, and the formation
of clays, appear as superimposed upon the primary lithologies. The units are variably
fractured and faulted. This faulting has complicated characterization efforts by offsetting
the various units, locally juxtaposing markedly different lithologies. Most design interest
is focused on the Topopah Spring Member and immediately adjacent units. By
comparison, the waste-isolation performance of the repository system must be evaluated
within a larger geographic region termed the "controlled area" (Figure 1).

The region evaluated by this study is contained entirely within the controlled area. In
general, this study is further restricted to the location of the subsurface access drift known
as the North Ramp, in keeping with a general engineering orientation. This two-
dimensional study represents the first attempt to identify local uncertainty in the rock
structural index known as Rock Quality Designation (RQD).

CONCEPTUAL MODEL

The U.S. Geological Survey provided the original geological cross-section model along
the North Ramp (USGS, 1993). That model was subsequently modified by others and
new cross-sections have also been prepared manually. For this study, the cross-section
shown in Figure 2 was recreated interactively using the Lynx GMS Geosciences
Modeling System, to insure that all of the new bore hole data and corroborative surface
control (Scott and Bonk, 1984) was honored.

The cross-section shown in Figure 2 is consistent with the conventional assumption that
all faults in the repository area are generally down-thrown on the west side. This
interpretation requires a variable, but relatively steep, dip to the beds that can locally
exceed 6 degrees (10% grade). This cross-section also suggests the possible existence of
one or more faults with the east side down thrown. The eight bore holes noted in Figure
2 are of variable lengths and are shown in their proper orientation with respect to the

Figure 1 General site map of the proposed repository area.

Figure 2 Cross-section along the axis of the North Ramp.


cross-section. For ease in interpretation, only the variations in gross lithology between
welded and non-welded tuffs are differentiated in this figure.

RQD AS A REGIONALIZED VARIABLE

During construction, emplacement, retrieval (if required), and closure phases of the
project, consideration of excavation stability must be incorporated into the design to
ensure worker health and safety, and to prevent development of potential pathways for
radionuclide migration during the post-closure period. In addition to the loads imposed
by the in-situ stress field, the repository drifts will be impacted by thermal loads
developed after waste emplacement and, periodically, by seismic loads. Rock mass
mechanical properties, which reflect the intact rock properties and the fracture joint
characteristics, are used in detailed mechanical analyses to evaluate the host rock
response to loading. The RQD index is widely used as an indicator of rock
quality/integrity in rock mechanics practice. The concept of RQD is that of a modified
core-recovery percentage that incorporates only nonfractured pieces of core that are 0.33
ft (0.10 m) or greater in length:

Run RQD = [Σ (piece lengths ≥ 0.33 ft) / run length (ft)] × 100%
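This definition transcribes directly into code; the helper name and the use of feet as the working unit are illustrative assumptions.

```python
def run_rqd(piece_lengths_ft, run_length_ft, min_piece_ft=0.33):
    """RQD for one core run: the percentage of the run length made up of
    nonfractured pieces at least 0.33 ft (0.10 m) long."""
    kept = sum(p for p in piece_lengths_ft if p >= min_piece_ft)
    return 100.0 * kept / run_length_ft
```

For example, a 2-ft run recovered as pieces of 1.0, 0.2, 0.5, and 0.1 ft counts only the 1.0-ft and 0.5-ft pieces, giving an RQD of 75%.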

Although other parameters of rock quality are available and widely accepted (e.g., the
rock mass rating system (RMR) and rock mass quality (Q)), the RQD index is considered to
be a good indicator since it reflects a combined measure of joint frequency, degree of
alteration, and discontinuity filling, if these exist (Deere and Deere, 1989). In fact, both
RMR and Q measurements incorporate a factorial component of RQD in their derivation.
Common tunnelers' rock quality classifications (Deere & Deere, 1989) are correlated to
RQD values in Table 1, while the information provided in Table 2 summarizes expected
shotcrete and additional support requirements for a tunnel in rock which has been
excavated by a boring machine (Cecil, 1970).

This study found the recovery data for individual core runs to be highly variable.
Typically, core runs with poor or no recovery are often short and numerous, while
intervals with high recovery are usually as long as the drillers could make them. RQD
was measured on individual core runs, but the high local variability in core recovery and
disparate lengths of core runs made analysis of core run data difficult. A weighted
average composite of RQD values on 10-foot intervals provided useful information with
which to perform geostatistical analyses.
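The length-weighted compositing of run RQD values onto fixed 10-foot intervals can be sketched as follows; the tuple layout for core runs and the helper name are illustrative assumptions.

```python
def composite_rqd(runs, interval_ft=10.0, top=0.0, bottom=None):
    """Length-weighted composite of per-run RQD values onto fixed intervals.
    `runs` is a list of (run_top_ft, run_bottom_ft, rqd_percent), with depth
    increasing downward."""
    if bottom is None:
        bottom = max(b for _, b, _ in runs)
    composites = []  # (interval top, interval bottom, composited RQD or None)
    z = top
    while z < bottom:
        z2 = min(z + interval_ft, bottom)
        wsum, wrqd = 0.0, 0.0
        for rt, rb, rqd in runs:
            ov = min(rb, z2) - max(rt, z)  # overlap of this run with the interval
            if ov > 0:
                wsum += ov
                wrqd += ov * rqd
        composites.append((z, z2, wrqd / wsum if wsum > 0 else None))
        z = z2
    return composites
```

Weighting by the overlapping length compensates for the short, numerous low-recovery runs and the long high-recovery runs noted above.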

TABLE 1

ROCK QUALITY CLASSIFICATION
(from Deere & Deere, 1989)

Rock Quality   RQD (%)   Description
Excellent      90-100    Intact
Good           75-90     Massive, moderately jointed
Fair           50-75     Blocky and seamy
Poor           25-50     Shattered, very blocky and seamy
Very Poor      0-25      Crushed

TABLE 2

GUIDELINES FOR SELECTION OF PRIMARY SUPPORT
FOR 2-FOOT TO 40-FOOT TUNNELS IN ROCK
(from Cecil, 1970)

                         SHOTCRETE SUPPORT/THICKNESS
ROCK QUALITY             CROWN                 SIDES               SUPPORT
Excellent RQD > 90       None to occasional    None                None
Good RQD 75 to 90        Local 2 to 3 inches   None                None
Fair RQD 50 to 75        2 to 4 inches         None                Provide for rock bolts
Poor RQD 25 to 50        4 to 6 inches         4 to 6 inches       Rock bolts as required
                                                                   (approx. 4-6 ft cc)
Very Poor RQD < 25       6 inches or more      6 inches or more    Medium steel sets as
(excluding swelling)     on whole section      on whole section    required

At the heart of geostatistics is the concept of the regionalized variable (ReV). Without
expanding upon random function theory, the ReV can be considered to be a single-valued
function defined over a metric space that has properties intermediate between a truly
random variable and one that is deterministic. In practice, a ReV is preferentially used to
describe natural phenomena which are spread out in space (and/or time) and which
display a certain structure. This structure is typically characterized by fluctuations that
are smooth at a global scale but erratic enough at a local scale to preclude their analytical
modeling (Olea, 1991). Unlike true random variables, the ReV has continuity from point
to point, but the changes in the variable are complex.
CROMER ET AL. ON ROCK QUALITY DESIGNATION 223
Previous studies of RQD at Yucca Mountain (Lin et al., 1993) concluded that, in general,
fracture frequency increases with increasing degree of welding in the
volcanic tuffs. The use of an average RQD value to represent the rock quality of an
entire unit, however, was not deemed appropriate to account for its observed spatial
dispersion. The lateral variation of fracture frequencies and RQD observed by Lin led to
the recommendation that a range of values be considered in the drift design
methodology. While Lin's previous work recognized lateral variability of RQD within
units, this paper outlines the first attempt to model or further examine the nature of these
changes.

Although it would appear, at this point, that RQD could be considered a ReV in a manner
similar to other rock properties that can be expected to vary in space (e.g., porosity or
hydraulic conductivity), the dependence of RQD not only upon the structural fabric of
Yucca Mountain but also upon the relationship between the vertical borehole data and that
same structure produced some unanticipated problems.

Of the four drill cores available to Lin (1993) for evaluation, nearly 95% of the 4000
fractures measured occurred within the more densely welded units and possessed near-
vertical dip orientations. While this vertical nature of fracturing is consistent with most
of the faults and fractures in the Basin and Range geological province which characterizes
the Yucca Mountain area, it required Lin to make corrections when estimating the non-
directional volumetric fracture frequency for each unit. All the data available to this
study are also from drill cores and subject to similar considerations; i.e., there may be
some question as to the validity of RQD measurements derived from vertical drill holes
that align themselves sub-parallel with the structural fabric. For example, intervals of good
core recovery and relatively high RQD may simply reflect an isolated block of intact rock
in pervasively fractured ground. It can also be shown that the orientation of the drill
cores and the sample volume analyzed can influence the interpretation of RQD and
distort characterization and modeling of the ReV. Fortunately, this study is focused in
two dimensions along the axis of a subsurface drift known as the North Ramp. The
definition of the ReV, and any interpretations made from its modeling, will therefore be
constrained and specific to this locale. Although limited, the RQD data are consistent in
their sample size (boring diameter and 10 foot composite lengths) and general orientation
(all vertical, except for drillhole NRG-3).

ANALYSIS OF SPATIAL VARIATION

RQD data were developed along the North Ramp from eight boring cores. These borings
are shown in Figure 3 and are vertically exaggerated by five times their actual
displacement. The histogram in Figure 4 shows the frequency of RQD values, as grouped
into 10 classes. A review of this histogram shows a positively skewed distribution
having a mean value of 25.3 and a standard deviation of 24.5. It is interesting to note that
75% of these values are below an RQD value of 44.0. The limited and sporadic

[Figure 3 plot: RQD postings in borings (including NRG-3, NRG-5, and NRG-7A) versus distance from NRG-1, in feet.]
Figure 3 Posting of RQD data in the bore holes along the axis of the North Ramp.

[Figure 4 plot: frequency (%) versus RQD, in classes from 0 to 100.]
Figure 4 Histogram of the RQD data along the North Ramp.

occurrences of higher RQD values are problematic for extracting a single model
(variogram) that accurately represents the spatial correlation of the ReV.

There are many situations in which the pattern of spatial continuity of higher values is not
the same as that of lower values. For example, preferential groundwater flow resulting
from aquifer heterogeneity can often be attributed to isolated occurrence of sand/gravel
stringers/lenses that possess the very highest permeability. When such marked disparities
exist within a distribution, the higher values tend to increase within-lag variability and
make variogram interpretation difficult. The focus of this study was, therefore, directed
more specifically at trying to understand the nature of the spatial structure from the lower
RQD values and predict their occurrence as they relate to specific design issues.

Indicator methods (Isaaks and Srivastava, 1989) are non-parametric and provide the
flexibility needed to focus this study on particular classes of data. This focus is
accomplished by transforming the raw data distribution into K mutually exclusive classes
of binary, indicator variables. The indicator transform of the raw data is typically defined
as either zero or one, depending upon whether it falls above or below a particular data
value (cutoff threshold):

    i(x; z_k) = 1  if z(x) ≤ z_k
                0  if z(x) > z_k

where z_k is the cutoff threshold for class k; k = 1, 2, ..., K.

Spatial relationships between the indicator transforms of RQD were determined by
examining all data pairs oriented along the principal directions of anisotropy and
separated by pre-defined "lag" distances. The variogram is defined as half of the average
squared difference between the indicator pairs, and Cromer and Srivastava (1992) suggest
the indicator variogram can be viewed as the probability of switching from one indicator
class to another over a certain distance along a given direction.
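As a hedged illustration of this calculation, the following sketch computes an experimental indicator semivariogram for regularly spaced composites down a single boring (a simplified 1-D case; the function name and data layout are assumptions, not the authors' implementation):

```python
def indicator_variogram_1d(values, spacing, cutoff, max_lag_steps):
    """Experimental indicator semivariogram for regularly spaced data
    (e.g., 10-ft composites down one boring).

    Returns a list of (lag_distance, gamma) pairs, where gamma is half
    the mean squared indicator difference at that lag.
    """
    ind = [1 if v <= cutoff else 0 for v in values]
    result = []
    for step in range(1, max_lag_steps + 1):
        pairs = [(ind[i], ind[i + step]) for i in range(len(ind) - step)]
        if pairs:
            gamma = sum((a - b) ** 2 for a, b in pairs) / (2.0 * len(pairs))
            result.append((step * spacing, gamma))
    return result
```

Because indicator differences are 0 or 1, gamma at a given lag is half the fraction of pairs that switch indicator class, consistent with the interpretation cited above.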

With limited information, it is always a challenge to extract well-structured (low nugget
effect, well-defined range) variograms. Isaaks and Srivastava (1989) noted that this is
especially true with bore-hole data where, typically, vertical information is closely spaced
in comparison to the horizontal separation between borings. Because of the abundant
data-pairs at all lag intervals and the likelihood of within-unit similarities along the length
of core, vertical variograms often appear stable and well-structured. Horizontal
variograms, by comparison, often display poor short-scale correlation structure when data
are limited. Figure 5 displays the vertical indicator variogram developed for the 25 RQD
cutoff. A clearly defined structure was apparent in the vertical orientation and was
modeled using the parameters outlined in Figure 5. The computer software UNCERT
(Wingle et al., 1994) was used to generate and model the variograms used in this study.

Contrary to the vertical orientation, the horizontal spatial structure (Figure 6) at the 25
RQD cutoff is not as clearly definable. Although some structure is apparent in Figure 6,
there are many instances where one must rely on additional information, external to the
sample data, to formulate a professional judgment on the horizontal component of this
anisotropy. For example, geologic information collected from surface transects can often
provide details on principal directions of spatial correlation for the ReV at a scale not
easily observed through isolated borings. An omni-directional variogram developed for
this cutoff would not, in an explicit fashion, draw attention to limitations and anisotropy
in the bore hole data, making this an excellent case for emphasizing that exploratory data
analysis cannot be discounted as simply an exercise, nor should geometric anisotropy be
ignored.

The indicator formalism allows us to model mixtures of populations loosely defined as
classes of values of a single attribute (Deutsch and Journel, 1992). By isolating specific
classes of data from the global cumulative distribution, variograms built on the indicator
transform can often reveal a pattern in spatial continuity not available through other
techniques. When estimating or simulating an indicator variable, the variogram model
should be consistent with the particular cutoff under study. For example, to examine the
distribution of RQD values less than or equal to 25, a variogram that captures spatial
continuity of the 25 threshold should be used.

The lower RQD values display a relatively stable, continuous structure consistent with
the conceptual geologic framework. This stability, however, does not persist when
examining the 50 RQD indicator threshold. At the 50 RQD threshold, the vertical
variogram in Figure 7 continues to exhibit good structure and was modeled with the
parameters shown. The horizontal variogram (Figure 8), on the other hand, has
degenerated with the inclusion of 78 additional indicator data previously assigned a value
of 0 at the 25 cutoff. Not many conclusions can be drawn from such a relationship and,
therefore, the horizontal component was modeled with an effective nugget at the
theoretical sill of 0.147. The observed degeneration of the variogram structure (higher
nugget, poorly defined range) reflects the erratic spatial occurrence of higher RQD
values, which is again consistent with the conceptual framework.

SIMULATION OF RQD ALONG THE NORTH RAMP

Simulation of the complete cumulative distribution of RQD was attempted using
indicator cutoffs at 10, 25, 50, and 75. These cutoffs were selected to roughly correspond
to the tunnel support design guidelines shown in Table 2. Although selection of these
cutoffs can only provide a crude approximation of the cumulative distribution of
RQD, examining additional thresholds was not warranted given the objectives of this
evaluation.

Since the limited and sporadic high RQD data caused the observed rapid degeneration of
the variogram structure at higher thresholds, the potential for order relations problems

[Figure 5 plot: indicator semivariogram versus separation distance, 0 to 240 feet, lag 10 ft; fitted spherical model with nugget 0.08 and sill contribution 0.17.]
Figure 5 Variogram along vertical principal direction of anisotropy, for the 25 RQD
indicator cutoff.

[Figure 6 plot: indicator semivariogram versus separation distance, 0 to 2400 feet, lag 200 ft; fitted spherical model with nugget 0.08, sill contribution 0.17, and range 1000 feet.]

Figure 6 Variogram along horizontal principal direction of anisotropy, for the 25 RQD indicator cutoff.

[Figure 7 plot: indicator semivariogram versus separation distance, 0 to 240 feet, lag 10 ft; fitted spherical model.]

Figure 7 Variogram along vertical principal direction of anisotropy, for the 50 RQD indicator cutoff.

[Figure 8 plot: indicator semivariogram versus separation distance, 0 to 2400 feet, lag 200 ft; modeled with an effective nugget at the sill.]

Figure 8 Variogram along horizontal principal direction of anisotropy, for the 50 RQD indicator cutoff.
existed. These problems threaten the capability of the simulation algorithm to represent
the higher two thresholds (50 and 75) because covariance reproduction is not constrained
where high nugget effects prevail. For this reason, quantitative inferences will not be
made from these upper thresholds.

The indicator transforms were simulated using the sequential indicator simulation
algorithm SISIM (Deutsch and Journel, 1992). Figure 9 shows three separate simulated
fields of RQD along the axis of the North Ramp. Each field is conditioned to existing
bore-hole data and presents a plausible version of the "reality" defined by first- and
second-order statistical moments. These figures have been vertically exaggerated by five
times the horizontal dimension for detailed examination. The three images display some
similar textural characteristics, while the uncertainty in their representation is captured by
the differences between images. If each image was used, for example, as input data to
some process modeling code for design purposes (say, in a Monte Carlo fashion), the
variation in outcomes from the process model would explicitly account for uncertainty.

Geostatistical simulation was selected over estimation in this study because of its
robustness in addressing potential "downstream" application questions. Simulation
differs from estimation in two major aspects: (1) simulation techniques provide high-
resolution models that strive for overall textural and statistical representation rather than
local accuracy and (2) the differences among the alternative models provide a measure of
joint spatial uncertainty. For example, some uncertainty issues can be addressed simply
from the equiprobable images (models) prior to any downstream process modeling. In
Figure 10, a total of 100 conditional simulations have been processed using the crude
cumulative distribution function defined by the four indicator thresholds 10, 25, 50, and
75 to determine the distribution of expected (mean) RQD values. The gray-scale limits
the displayed variability in the image to range between a maximum of 50 and a minimum
of 0 to capture detail. Notice the limited occurrence of values that equal or exceed 50,
indicating that these are isolated occurrences that should not be expected to propagate
spatially.

Most areas of the expected value map shown in Figure 10 are dominated by values that
range between 20.0 and 30.0. Although this is consistent with the histogram of RQD
values shown in Figure 4, it may also be due, in part, to the selection of simple kriging as
the local estimator. Simple kriging was chosen over ordinary kriging because of the
scarcity of data and the risk of unwarranted data propagation (Deutsch and Journel,
1992). If ordinary kriging is used with sequential simulation, there may be a tendency to
propagate locally simulated values in a manner inconsistent with the conceptual model.
This characteristic becomes more problematic when there is a lack of constraining,
original data.

Since each pixel (model grid cell) is simulated 100 times, the statistical distribution of
each local outcome allows us to query characteristics of the outcome distribution of RQD
values on a pixel-by-pixel basis. Post-processing of several outcomes can provide
information such as the probability of exceeding a specified threshold or the average

"'.0

45.0

40.0

35.0

30.0

2>.0

20.0

15.0

10.0

5.0

.0

"'.0

45.0

40.0

35.0

30.0

25.0

20.0

15.0

10.0

5.0

.0

"'.0

45.0

40.0

35.0

30.0

2>.0

20.0

15.0

10.0

5.0

.0

Figure 9 Three alternative 2-D images (realizations) of RQD along the axis of the North Ramp.
The angular trace in the middle of each image represents the vertical orientation of the ramp
within the cross-section. Each image can be considered equally probable given the state of
existing knowledge, because each is conditioned to the same sample data and honors the same
spatial statistics. The differences between the images provide a measure of joint spatial
uncertainty.

value above, or below, a threshold. A map showing the value at which an individual
pixel reaches a specified cumulative probability, for example, would provide valuable
information for quantifying risk. Figure 11 shows the probability of exceeding an RQD
value of 25. Although this map looks very similar to the expected value map of Figure
10, it reveals very different information. The gray-scale in Figure 11 ranges between
zero (0% probability) and one (100% probability), unlike the expected value map.
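Post-processing of this kind reduces to pixel-wise statistics over the stack of realizations. The following is a minimal sketch, not the authors' code; it assumes the realizations are stored as an array of shape (n_sims, ny, nx):

```python
import numpy as np

def exceedance_probability(realizations, threshold):
    """Pixel-wise probability that the simulated value exceeds a threshold.

    realizations: array-like of shape (n_sims, ny, nx).
    Returns an (ny, nx) map of probabilities in [0, 1].
    """
    sims = np.asarray(realizations, dtype=float)
    # Fraction of realizations exceeding the threshold at each pixel.
    return (sims > threshold).mean(axis=0)

def expected_value_map(realizations):
    """Pixel-wise mean (expected value) over all realizations."""
    return np.asarray(realizations, dtype=float).mean(axis=0)
```

With 100 realizations, `exceedance_probability(stack, 25.0)` would produce a map analogous to Figure 11, and `expected_value_map(stack)` one analogous to Figure 10.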

CONCLUSIONS

Unfortunately, the 2-D simulated images along the North Ramp cross-section do not
explicitly focus information on the expected variability to be encountered along the drift
itself. To evaluate anticipated conditions specifically along the drift, the designed
inclination of the drift has been projected from the tunnel entrance and is shown as the
trace super-imposed on the images in Figure 9. The expected (mean) value of RQD along
the tunnel projection has been extracted on a pixel-by-pixel basis for comparison against
each of the three simulations presented in Figure 9.

The graphs shown in Figure 12 allow us to compare the variability in simulated RQD
along the three tunnel projections (taken from Figure 9) as a function of distance from the
right (east) edge of the cross-section. As a point of reference, the east edge of the cross-
section also corresponds to the location of boring NRG-1. The most immediate
observation in Figure 12 is the widespread, erratic fluctuations in simulated values about
their expected (mean) value. This was to be expected following our variography
exercises and discovery of the limited horizontal correlation range (approx. 800.0 ft
(243.8 m)) for lower RQD values and negligible spatial correlation in the higher RQD
values. What is not so apparent is the performance of the simulation in areas that are
conditioned by the available boring logs. At distances of less than 3200 ft (975.4 m) from
NRG-1 the simulations, in general, tend to deviate less from the expected value. Boring
log data in this region are available to constrain uncertainty and, therefore, reduce the
spread of likely outcomes for a local prediction.

FINAL THOUGHTS

Basic exploratory data analysis identified a great deal of local variability in RQD.
Although very low RQD (i.e., less than 25) can be anticipated periodically along the entire
length of the North Ramp, it would not be prudent to extrapolate this interpretation to the
entire mountain. Three factors were found to influence the interpretation of RQD: 1)
stratigraphic setting, 2) proximity to major fault/fracture zones, and 3) very local foot-
by-foot factors (likely due to individual high-angle fractures sub-parallel to the drill core).
The high degree of variability over very short distances may require design planning to
accommodate the worst rock conditions along the entire length of excavation.

Figure 10. Mean (expected) value map developed from 100 individual simulations of RQD.


Figure 11 Probability map reflecting the likelihood of exceeding an RQD value of 25.
Note the scale reflects a probability range from 0% to 100%.
[Figure 12 plots: three panels (RN-112063, RN-30157, RN-22475) of simulated RQD versus horizontal distance from NRG-1.]

Figure 12 Simulated RQD values along the proposed North Ramp taken
from the three fields shown in Figure 9. For comparison, also shown (in
bold) are their expected values derived from the 100 simulations.

Investigative work on rock properties in the exploratory studies facility is underway to
supplement drill hole data with an adequate number and distribution of data pairs
collected in a fashion that will support geostatistical analyses. In the meantime,
simulation analysis has provided a preliminary assessment of the conditions that could be
encountered during the excavation of the North Ramp. Indicator simulation along the
axis of this drift identifies the need for additional information if this study, or similar
studies, are to forecast engineering requirements for facilities design, especially with
respect to spatial continuity of higher RQD values.

This study has demonstrated how the measurement and analysis of data may lead to
interpretations that are not obvious or apparent using other means of research. Although
many statistical tools are useful in developing insights into a wide variety of natural
phenomena, many others can be used to develop quantitative answers to specific
questions. Unfortunately, most classical statistical methods make no use of the spatial
information in earth science data sets. However, like classical statistical tests,
geostatistical techniques are based on the premise that information about a phenomenon
can be deduced from an examination of a small sample collected from a vastly larger set
of potential observations on the phenomenon. Geostatistics offers a way of describing the
spatial continuity that is an essential feature of many natural phenomena and provides
adaptations of classical regression techniques to take advantage of this continuity. The
quantitative methodology found in applications of geostatistical modeling techniques can
reveal the insufficiency of data, the tenuousness of assumptions, or the paucity of
information contained in most geologic studies.
REFERENCES

Cecil III, O.S., 1970, "Correlations of Rock Bolt--Shotcrete Support and Rock Quality
Parameters in Scandinavian Tunnels," Ph.D. Thesis, University of Illinois, Urbana.

Cromer, M. V. and R. M. Srivastava, 1992, "Indicator Variography for Spatial
Characterization of Aquifer Heterogeneities," in Water Resources Planning and
Management, Proceedings of the Water Resources sessions at Water Forum '92, August
2-5, 1992, American Society of Civil Engineers, Baltimore, MD, pp. 420-425.

Deere, D.U., and D.W. Deere, 1989, "Rock Quality Designation (RQD) After Twenty
Years," U.S. Army Corps of Engineers, Contract Report GL-89-1.
Deutsch, C.V. and A.G. Journel, 1992, "GSLIB: Geostatistical Software Library and User's
Guide," Oxford University Press, New York, New York.

Isaaks, E. H., and R. M. Srivastava, 1989, "An Introduction to Applied Geostatistics,"
Oxford University Press, New York.

Lin, M., M. P. Hardy, and S. J. Bauer, 1993, "Fracture Analysis and Rock Quality
Designation Estimation for the Yucca Mountain Site Characterization Project: Sandia
Report SAND92-0449," Sandia National Laboratories, Albuquerque, NM.

Olea, R.A., 1991, "Geostatistical Glossary and Multilingual Dictionary," International
Association of Mathematical Geology Studies in Mathematical Geology No. 3, Oxford
University Press.

Scott, R.B. and J. Bonk, 1984, "Preliminary Geologic Map of Yucca Mountain, Nye
County, Nevada, with Geologic Sections," U.S. Geol. Survey Open-File Report 84-494.

US Geological Survey, 1993, "Methodology and Source Data Used to Construct the
Demonstration Lithostratigraphic Model: Second Progress Report."

Wingle, W. L., E. P. Poeter, and S. A. McKenna, 1994, "UNCERT User's Guide: A
Geostatistical Uncertainty Analysis Package Applied to Ground Water Flow and
Contaminant Transport Modeling," draft report to the United States Bureau of
Reclamation, Colorado School of Mines.
James R. Carr¹

REVISITING THE CHARACTERIZATION OF SEISMIC HAZARD USING


GEOSTATISTICS: A PERSPECTIVE AFTER THE 1994 NORTHRIDGE,
CALIFORNIA EARTHQUAKE

REFERENCE: Carr, J. R., "Revisiting the Characterization of Seismic Hazard Using
Geostatistics: A Perspective After the 1994 Northridge, California Earthquake,"
Geostatistics for Environmental and Geotechnical Applications, ASTM STP 1283, R. Mohan
Srivastava, Shahrokh Rouhani, Marc V. Cromer, A. Ivan Johnson, and Alexander J.
Desbarats, Eds., American Society for Testing and Materials, 1996.

ABSTRACT: An indicator kriging model of seismic hazard for southern California,
based on the time period 1930 - 1971, is developed. This hazard assessment is
evaluated in light of the occurrence of more recent, moderate earthquakes: the 1987
Whittier Narrows, the 1990 Upland, and the 1994 Northridge earthquakes. The
hazard map shows relatively poor spatial correlation between regions of high hazard
and known, active faults. A hypothesis is developed, however, suggesting that high
seismic hazard in southern California is a function of spatial proximity to all active
faults, not to anyone active fault.

KEYWORDS: seismic hazard, modified Mercalli intensity, southern California,
kriging, semivariogram, indicator functions

Geostatistical analysis of earthquake ground motion was first attempted by
Glass (1978). Therein, modified Mercalli intensity data for the 1872 Pacific
Northwest earthquake were analyzed using semivariogram analysis, then regularized
(gridded) using kriging and contoured. Glass (1978) demonstrates the usefulness of
geostatistics vis-a-vis semivariogram analysis and kriging for analyzing earthquake
ground motion.

Based on the success of Glass (1978), an experiment was attempted to
characterize seismic hazard for southern California (Carr 1983; Carr and Glass 1984).
Kriging was used to form digital rasters of modified Mercalli intensity data for all
earthquakes in the time period, 1930 - 1971, that occurred within a 125 km radius of
San Fernando, California (an arbitrary choice). These digital rasters

¹Professor, Department of Geological Sciences/172, University of Nevada, Reno, NV
89557

CARR ON NORTHRIDGE EARTHQUAKE 237

were geographically registered and, as such, served as input to a Gumbel (1958)
extreme events model for computing seismic hazard. Procedures for developing this
model consisted of the following steps: 1) kriging was used to form a digital raster
for each earthquake in the aforementioned time frame; all of these rasters were
geographically registered; 2) for each year, 1930 - 1971, if more than one earthquake
occurred, then the maximum kriged intensity for each cell of the raster was found and
a summary raster formed reflecting maximum intensity for the year; this process
resulted in 42 digital rasters, each a record of maximum intensity values for an entire
year; 3) Gumbel (1958) statistics of extreme values were used to compute the
probability that an intensity VI was exceeded for a raster cell over the 1930 - 1971
time period; an intensity VI was an arbitrary choice, but this is the intensity value at
which exterior damage to buildings begins. These exceedance probabilities constitute
the seismic hazard (Fig. 1).

Fig. 1. Seismic hazard model developed using Gumbel (1958); from Carr (1983) and
also published in Carr and Glass (1984). Contoured values are probabilities (%) of exceeding
an intensity VI over a 50 year period.

A Gumbel (1958) model requires that certain decisions be made when
computing the probability of exceeding a particular level of ground motion. For
example, a minimum, or threshold, ground motion value must be chosen for
calculations. In Carr and Glass (1984), for instance, a minimum intensity value of III
was chosen, yet in many years the minimum value was actually 0. The choice of an
intensity III was entirely arbitrary.

As an alternative to a Gumbel (1958) model, Carr and Bailey (1985) developed
an indicator kriging (cf. Journel 1983) seismic hazard model. This model does not
use Gumbel's statistics of extremes method for computing exceedance probabilities.
Instead, modified Mercalli intensity data are first converted to indicator values as is
described later. Once converted to indicator values, kriging is applied to the indicator
data to form digital rasters. As in the Carr and Glass (1984) model, these rasters

were geographically registered during the kriging process. Because the rasters are
registered, the final step in the indicator kriging model is simply a summing of all
rasters to form one combined raster. A contour map of the combined raster shows
the frequency of exceeding a threshold VI over a particular time period. This
frequency constitutes the seismic hazard for a particular geographic region. Carr and
Bailey (1985) applied the indicator kriging model to the New Madrid, Missouri
seismic zone in the time period, 1811 - 1980.
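The summing step of this indicator kriging model can be sketched as follows, assuming the indicator rasters are already geographically registered (the function name and data layout are our own, not from Carr and Bailey 1985):

```python
import numpy as np

def combined_hazard_raster(indicator_rasters):
    """Sum geographically registered indicator kriging rasters.

    indicator_rasters: array-like of shape (n_rasters, ny, nx), where each
    cell holds 1 if the intensity threshold was exceeded, else 0.
    Each cell of the result counts how often the threshold was exceeded,
    i.e., the frequency of exceedance that constitutes the hazard.
    """
    stack = np.asarray(indicator_rasters, dtype=float)
    return stack.sum(axis=0)
```

Contouring the resulting raster then yields a hazard map of the kind described above.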

Because the indicator kriging model is considerably easier to apply in
comparison to one using the Gumbel (1958) method, the seismic hazard in southern
California in the time frame 1930 - 1971 is revisited herein using indicator kriging.
One objective of this study is to compare the seismic hazard map from indicator
kriging to that obtained using a Gumbel calculation. Another aspect of this analysis is
to compare the occurrence of recent southern California earthquakes, in particular the
1987 Whittier Narrows, the 1990 Upland, and the 1994 Northridge earthquakes, to
the seismic activity that preceded them (1930 - 1971).

A BRIEF REVIEW OF GEOSTATISTICS

In general, geostatistical methods are useful for characterizing the spatial
variation of regionalized phenomena. Other than earthquake ground motion,
geotechnical applications include soil density and strength, ground water level, and
ground water salinity; of course, there are many more examples.

The term geostatistics is often considered synonymous with the spatial
estimation technique known as kriging (Matheron 1963). This estimator is a
relatively simple, weighted average of the form:
                N
    Z*(x_0) =   Σ  a_i Z(x_i)
               i=1

wherein Z(x_i) are the data values at the N nearest data locations to the estimation
location x_0; Z*(x_0) is the estimated value at the estimation location x_0; and the values
a_i are weights applied to the N data values to obtain the estimate. A restriction is placed
on the weights in ordinary kriging such that their sum is 1; this assures unbiased
estimation.

That kriging is a relatively simple estimator is seen in its equation form, a
simple weighted average. Obtaining the weights, a_i, for this equation is more
complicated. The weights are obtained by solving the system [COV_ij]{a_j} = {COV_i0}. Notice
that these matrices are functions of spatial covariance (COV). Covariance in this case
is the autocovariance of the spatial data, Z, between two locations in space.
Knowledge of spatial covariance is obtainable from what is known as the
semivariogram (often referred to simply as the variogram; see Matheron 1963; or
10urnel and Huijbregts 1978). The semivariogram is estimated from the spatial data
CARR ON NORTHRIDGE EARTHQUAKE 239

as follows:
    γ(h) = (1/(2N)) Σ_{i=1}^{N} [Z(x_i) − Z(x_i + h)]²

which expresses the average squared difference in Z as a function of spatial
separation distance (lag), h.

Once the semivariogram is calculated, it must be modeled for use in kriging.


Only a few functions, those that are negative semi-definite, qualify as valid models
(see Journel and Huijbregts 1978). The most useful semivariogram model is known
as the spherical model and is graphed (Fig. 2). To model a calculated semivariogram
(Fig. 2), values for the nugget, sill, and range (Fig. 2) are interpreted, allowing the
spherical model equation to fit the calculated semivariogram as closely as possible
(Fig. 2). Then, spatial covariance is obtainable from the semivariogram model as
follows:

    COV(h) = sill − γ(h)

In kriging, once a semivariogram model is selected and parameters defined (nugget,


sill, and range), covariance entries in the foregoing matrix system are computed using
the semivariogram model. How these calculations are performed is described in Carr
(1995) using hand calculation examples. Once the covariance matrix entries are
obtained, the matrix system is solved for the weights, a_i, using an equation solver,
such as Gauss elimination or LU decomposition. Software for semivariogram
calculation and kriging is given in Deutsch and Journel (1992), including diskettes
containing FORTRAN source code. Software is also given in Carr (1995) along with
graphics routines for displaying results.

Fig. 2. A calculated semivariogram modeled using a spherical model; note the
nugget (C0), sill (165), and range (2.3 km) (from Carr 1995).
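The solution of the kriging system can likewise be sketched in code. The example below is a hypothetical illustration, not the software used in the study: it builds the covariance entries from a spherical semivariogram model via COV(h) = sill − γ(h), imposes the unit-sum (unbiasedness) constraint with the standard Lagrange-multiplier augmentation, and uses illustrative parameters loosely based on Fig. 2 (nugget 75, sill 165, range 2.3 km).

```python
import numpy as np

def spherical_cov(h, nugget, sill, rng_km):
    """COV(h) = sill - gamma(h) for a spherical semivariogram model."""
    h = np.asarray(h, dtype=float)
    gamma = np.where(h >= rng_km, sill,
                     nugget + (sill - nugget)
                     * (1.5 * h / rng_km - 0.5 * (h / rng_km) ** 3))
    gamma = np.where(h == 0.0, 0.0, gamma)   # gamma(0) = 0 by definition
    return sill - gamma

def ordinary_kriging_weights(data_xy, est_xy, nugget, sill, rng_km):
    """Solve [COV_ij]{a_i} = {COV_0i}; the unit-sum constraint on the
    weights is enforced with a Lagrange-multiplier row and column."""
    n = len(data_xy)
    d = np.linalg.norm(data_xy[:, None, :] - data_xy[None, :, :], axis=-1)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = spherical_cov(d, nugget, sill, rng_km)
    A[n, n] = 0.0
    b = np.ones(n + 1)
    b[:n] = spherical_cov(np.linalg.norm(data_xy - est_xy, axis=-1),
                          nugget, sill, rng_km)
    w = np.linalg.solve(A, b)
    return w[:n]                             # drop the Lagrange multiplier

data = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])   # data locations, km
w = ordinary_kriging_weights(data, np.array([0.5, 0.5]), 75.0, 165.0, 2.3)
```

Gauss elimination or LU decomposition, as mentioned in the text, would serve equally well in place of the library solver; by symmetry of this configuration, the second and third weights come out equal, and the three weights sum to 1.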

A BRIEF NOTE REGARDING THE DATA

Herein is presented a seismic hazard model of southern California that is based


on modified Mercalli intensity data. Such data are subjectively assigned and are
integer values in the range 0 to XII (12). A value, 0, represents no ground motion; a
240 GEOSTATISTICAL APPLICATIONS

value, XII (12), represents total damage, landsliding, fissuring, liquefaction, and so
on. A value, VI (6), is that value at which exterior structural damage is noticed, such
as cracked chimneys. Interior damage is noted with a value, V (5). Subsequent to an
earthquake, the United States Geological Survey distributes questionnaires to citizens
living within the region experiencing the earthquake. They are asked to describe what
they experienced during the earthquake. Examples include: 1) Did you observe
damage and, if so, what was the damage?; 2) Did you feel the earthquake and, if so,
where were you when you felt it? Intensity values are then assigned [subjectively] to
each questionnaire.

That modified Mercalli intensity data are subjective is obvious. What is not
obvious is that geostatistics (kriging) can validly be applied to grid (estimate) such
data. Glass (1978) showed this empirically. Journel (1986) discusses the
application of geostatistics to "soft," or subjective, data in considerable detail.

INDICATOR KRIGING SEISMIC HAZARD MODEL

Indicator kriging is a form of kriging that does not entail a change in the
equation for the kriging estimator, but does entail a change in the data to which
kriging is applied. With indicator kriging, a transform is applied to the data, in this
case modified Mercalli intensity values. This transform is a simple one: i(x) = 0, if
Z(x) < c; i(x) = 1 otherwise; this simple transform yields the indicator function, i.
Notice that the indicator function is a binary one, taking on only two possible values,
0 and 1. Because of this, the indicator function is said to be a nonparametric
function, because the notion of a probability distribution for such a function is not
pertinent. The nonparametric nature of the indicator function has certain advantages
in geostatistics (Journel 1983), chiefly the minimization of the influence of extreme
data values on the calculation of the semivariogram and in kriging. The value, c,
used to define the indicator function is called a threshold value. In this study of
seismic hazard, c is that critical ground motion value chosen to define the hazard. In
this study, c is chosen to be an intensity value of VI (6) because this intensity value is
that at which exterior structural damage is first noticed.
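The transform itself is a single comparison; the sketch below (with invented sample values) applies it to hypothetical intensity data using the threshold c = VI (6):

```python
import numpy as np

def indicator(z, c):
    """i(x) = 0 if Z(x) < c, and i(x) = 1 otherwise."""
    return (np.asarray(z) >= c).astype(int)

# hypothetical intensities at six sites, threshold c = VI (6)
mmi = np.array([3, 6, 7, 5, 9, 2])
ind = indicator(mmi, c=6)    # -> [0, 1, 1, 0, 1, 0]
```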

When performing indicator kriging, the indicator function, i, is used rather


than the raw data, Z. Other than this substitution, the kriging estimator is applied
using the same equation as shown before. Weights, a, are calculated using the matrix
system shown previously; covariance entries in this matrix system are obtained using
the semivariogram for the function, i. When performing kriging on i, estimates are
obtained that range between 0 and 1, inclusive. As the function, i, is defined for
seismic hazard analysis, the estimate of i is interpreted as the probability at the
estimation location that ground motion exceeds the threshold value, c, used to define
the indicator function.

An indicator kriging model for assessing seismic hazard is a simple one.


Modified Mercalli intensity data for each earthquake in a particular time period are

transformed to indicator values as follows: if intensity is VI or greater, the intensity


value is converted to 1, otherwise the intensity value is converted to zero. Kriging is
used to form a regular grid (a digital raster) of the indicator values. For this study,
50 x 50 rasters were designed, registered to geographic coordinates as shown in
various figures herein (for example, Fig. 3). Once rasters are formed for each
earthquake in the time period, the digital rasters are simply added together to form a
final, composite-sum map. Higher hazard is realized in this map by noticing regions
that are associated with a higher sum.
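The raster-summing step can be sketched as follows. This is a hypothetical illustration: the rasters here are random placeholders standing in for the kriged indicator estimates, and the function name is invented for the example.

```python
import numpy as np

def composite_hazard(indicator_rasters):
    """Sum geographically registered 0/1 rasters (one per earthquake);
    each cell then counts episodes of intensity VI or greater."""
    return np.stack(indicator_rasters).sum(axis=0)

# three stand-in 50 x 50 indicator rasters (random placeholders here,
# kriged 0-1 estimates in the actual procedure)
rng = np.random.default_rng(1)
rasters = [(rng.random((50, 50)) > 0.5).astype(int) for _ in range(3)]
hazard = composite_hazard(rasters)
high = hazard == 3          # cells affected in all three events
```

The composite map's cell values range from 0 to the number of earthquakes, so higher sums mark the higher-hazard regions described above.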

APPLICATION TO SOUTHERN CALIFORNIA SEISMICITY, 1930-1971

Indicator kriging has been used to characterize southern California earthquake


hazard previously (van der Meer and Carr 1992). The present study uses all 46
earthquakes that occurred between 1930 and 1971 that were associated with intensity
values of VI or greater (see Carr 1983 for a list of these earthquakes). Van der Meer
and Carr (1992) used only the 11 largest magnitude earthquakes of these 46. Hence,
one objective of this current study is to revisit the earlier indicator kriging model and
to update it using more information. Another objective, one not considered by van
der Meer and Carr (1992), is to compare recent, large earthquakes with the seismic
patterns analyzed in the indicator kriging model that is based on the time period,
1930-1971.

An indicator kriging seismic hazard model based on the 46 earthquakes is


shown (Fig. 3). It shares some similarities with that obtained previously (Fig. 1). In
particular, a region of high hazard is found in each map near Oxnard/Santa Barbara.
However, the indicator kriging hazard map finds particularly high hazard north to
northeast of Long Beach. Both maps (Figs. 1 and 3) are also associated with
relatively low hazard near Mojave, California.

Van der Meer and Carr (1992) focused analytical attention on whether high
hazard correlated spatially with known, active faults. That study found that higher
hazard could not be directly related to any one active fault in southern California.
This study verifies this conclusion. Higher hazard does not directly correlate spatially
with known active faults (Fig. 3).

Because southern California is associated with so many active faults, it is


perhaps not surprising that higher hazard sometimes occurs spatially where it is not
expected. A hypothesis (Fig. 4) is forwarded as a possible explanation. This figure
shows three hypothetical earthquakes. A circle encloses each epicenter and ground
motion intensity of at least MMI VI (6) is assumed to have occurred everywhere
within each circle. The dark, gray patterned area is that affected by all three
earthquakes and therefore has a higher hazard because three episodes of damaging

Fig. 3. Indicator kriging hazard map with major active faults superimposed. Regions
associated with at least 6 episodes of intensity VI or higher ground motion in the time period, 1930-
1971, are highlighted in gray. The faults are coded as follows: A) White Wolf Fault; B) Garlock
Fault; C) Big Pine; D) Santa Ynez; E) Oak Ridge; F) San Andreas; G) San Gabriel; H) Newport-
Inglewood; I) San Jacinto.

ground motion were experienced. But, a higher hazard would not necessarily be
expected within this gray-patterned region because it is not near any one fault. Its
proximity to three active faults, however, makes it vulnerable to damage during
earthquakes occurring on all three faults.

This hypothetical model is thought to explain the regions of higher hazard in


Figure 3. With respect to Long Beach, it has experienced damaging ground motion
from earthquakes occurring on the Newport-Inglewood fault (the 1933 Long Beach
Earthquake), faults in the San Fernando Valley (e.g., the 2 February 1971
earthquake), the White Wolf fault (the 1952 Kern County earthquake), and also
earthquakes occurring on the San Gabriel, San Andreas, San Jacinto, Oak Ridge, and
Santa Ynez faults. With respect to Oxnard, it has been affected by earthquakes on the
Newport-Inglewood fault (1933 Long Beach earthquake), the Oak Ridge fault (1941
Santa Barbara and 1957 Ventura earthquakes), the White Wolf fault (1952 Kern
County earthquake), and to a lesser extent by earthquakes in the San Fernando valley.

As a test of the hypothesis (Fig. 4), the active faults shown in Figure 3 are
idealized as shown in Figure 5. A digital raster is developed for each of these faults
as follows: 1) an attenuation function was designed from a general formula given in
Cornell (1968): intensity = 5.4 + M − 3 ln R, where M is Richter magnitude and R is
the distance from the fault; 2) a typical Richter magnitude was chosen for each of the
nine (9) faults (Table 1); 3) a 34 x 34 digital raster (an arbitrary choice of size) was

Fig. 4. Three hypothetical earthquakes occurring on the faults shown. Notice that the
gray-shaded region is affected by all three earthquakes.

Fig. 5. Idealized active fault locations. Codes for faults are the same as described in
the caption to Fig. 3.

developed, geographically registered to the kriged seismic hazard rasters (note that
this grid size is smaller than that used for indicator kriging; both grid sizes, however,
are of arbitrary size and merely facilitate the construction of contour maps). An
intensity value was estimated for each cell of the raster using the foregoing
attenuation formula (not by indicator kriging in this case); 4) if the estimated intensity
was VI or greater, the raster cell was assigned a value of 1; otherwise the cell was
assigned the value 0. Once a digital
raster was developed by this procedure for each of the nine (9) active faults (Fig. 6),
a composite raster was formed as a sum of all nine rasters. Frequency of intensity VI
or greater ground motion was then contoured (Fig. 7). Gray shading highlights the
geographic regions associated with the highest frequency of damaging ground motion.
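Steps 1) through 4) of this procedure can be sketched in code. The example below is a simplified, hypothetical version: each fault is idealized as a handful of vertices and the distance R is taken to the nearest vertex rather than to the full fault trace, so the resulting raster is illustrative only.

```python
import numpy as np

def fault_indicator_raster(fault_xy, magnitude, grid_x, grid_y, threshold=6.0):
    """1/0 raster of cells whose estimated intensity meets the threshold,
    using intensity = 5.4 + M - 3 ln R (the Cornell 1968 form in the text).
    R is taken to the nearest fault vertex -- a simplification of the trace."""
    gx, gy = np.meshgrid(grid_x, grid_y)
    cells = np.column_stack([gx.ravel(), gy.ravel()])
    R = np.min(np.linalg.norm(cells[:, None, :] - fault_xy[None, :, :],
                              axis=-1), axis=1)
    R = np.maximum(R, 1.0)                   # avoid ln(0) on the fault itself
    intensity = 5.4 + magnitude - 3.0 * np.log(R)
    return (intensity >= threshold).astype(int).reshape(gx.shape)

# an idealized fault as three vertices (km) and magnitude 6.5 (cf. Table 1)
fault = np.array([[10.0, 10.0], [20.0, 20.0], [30.0, 30.0]])
x = np.linspace(0.0, 40.0, 34)               # 34 x 34 raster, as in the text
raster = fault_indicator_raster(fault, 6.5, x, x)
```

Summing such rasters over all nine faults would then give the composite frequency map described next.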

A comparison of Figure 7 to Figure 3 shows that regions of higher hazard


found in the hypothetical map (Fig. 7) do not exactly match those in the indicator
kriging hazard map (Fig. 3). But, the region of higher hazard near Long Beach (Fig.
3) is near one of the higher hazard regions of Figure 7; and, the higher hazard found
near Oxnard (Fig. 3) is near another region of higher hazard found in Figure 7. Both
Figures 3 and 7 identify lower hazard near Mojave. Similarities between these two
maps are interesting and lend credibility to the foregoing hypothesis (Figure 4).

TABLE 1--Richter magnitudes used for nine active faults.

Fault Magnitude

White Wolf 7.0


Garlock 7.0
Big Pine 6.5
Santa Ynez 6.5
Oak Ridge 6.5
San Andreas 8.3
San Gabriel 6.5
Newport-Inglewood 6.5
San Jacinto 6.5

RECENT SOUTHERN CALIFORNIA EARTHQUAKES

Epicenters of three recent southern California earthquakes are plotted (Fig. 8):
1) the 1987 Whittier Narrows earthquake, magnitude 5.5 to 6.0; 2) the 1990
Upland earthquake, magnitude 5.0 to 5.4; and 3) the 1994 Northridge earthquake,
magnitude 6.6 (estimated). It is interesting that these three earthquakes occur close
to the San Gabriel Fault. With respect to the indicator kriging result, none of these
earthquakes occurs within a region identified as having a high seismic hazard. Of
course, this is the point made with the foregoing hypothesis (Figures 5 and 8) that
higher hazard cannot be spatially correlated with any one active fault in southern
California. The 1987 Whittier Narrows and the 1994 Northridge earthquakes, for
example, caused damaging levels of ground motion within the region of higher hazard
found north of Long Beach; these earthquakes increased the hazard within this region.
Furthermore, the 1994 Northridge earthquake caused damaging levels of ground
motion in the Oxnard area, another region identified as having higher hazard. Only
the 1990 Upland earthquake occurred in a lower hazard area and did not have a large
enough magnitude to influence any of the higher hazard regions. Figure 8 also shows
these three epicenters plotted on the hypothetical hazard map (Figure 7). The
epicenters for the 1990 Upland and 1994 Northridge earthquakes occur just outside
regions of highest hazard, whereas the epicenter for the 1987 Whittier Narrows
earthquake occurs within the region of high hazard north of Long Beach.

CONCLUSION

An indicator kriging seismic hazard model is much more easily developed in


comparison to one based on Gumbel's statistics of extreme values (Gumbel 1958).
With the indicator kriging model, modified Mercalli intensity data are first
transformed to indicator values: 1 if the intensity is VI (6) or greater, 0 otherwise.
Kriging is used to estimate the 0/1 indicator data at
nodes of a regular grid, hence forming a raster. Once rasters are formed for all


Fig. 6. Region of intensity VI or greater ground motion for earthquakes occurring


anywhere along the Newport-Inglewood fault having Richter magnitudes of 6.5.

Fig. 7. Resultant hazard map produced by hypothesizing earthquakes along the entire
spatial domain of active faults. Gray shading shows regions where the theoretical model predicts
the highest frequency of intensity VI or greater ground motion.

Fig. 8. The seismic hazard maps of Figures 3 and 7 (the indicator kriging model and
the theoretical model, respectively) with epicenters plotted for the 1987 Whittier Narrows,
1990 Upland, and 1994 Northridge earthquakes.

earthquakes occurring within a time window of interest, the rasters are simply
summed to yield a final composite map (e.g., Figure 3). No normalization is
performed subsequent to this summing process. Final maps therefore do not represent
probabilities, but instead represent the total frequency of experiencing ground motion
severe enough to cause damage. Moreover, this summing process assumes all rasters
are geographically registered, a condition easy to achieve with kriging because, when
used for gridding, the geographic coordinates defining the grid must be entered into
the computer program performing the kriging.

Accepting the hypothesis that high hazard is a function of spatial proximity to


all active faults, not just to any one active fault, the hazard map produced herein
using indicator kriging is judged to be plausible. Extremely high hazard is identified
just to the north and west of Long Beach, California. Long Beach is within that
region of southern California affected by more active faults than other regions of
California. In fact, two recent earthquakes, the 1987 Whittier Narrows and the 1994
Northridge earthquakes, occurred close enough to this high hazard region to have
produced damage within it. Another region of relatively high hazard is identified
around Oxnard, California and reflects a relatively high level of seismic activity on
the Oak Ridge fault within the Santa Barbara Channel.

In summary, geostatistics offers spatial analysis tools that are quite useful for
producing maps of seismic activity. Ground motion for individual earthquakes is
readily gridded using kriging. Furthermore, what is presented herein is nothing more
than a raster-based geographic information system. Hence, GIS programs having a
raster-import capability, such as ARC/INFO, are capable of displaying results given
herein once digital rasters have been formed using software such as is given in Carr
(1995).
References

Carr, J. R., 1983, "Application of the Theory of Regionalized Variables to


Earthquake Parametric Estimation and Simulation," unpublished doctoral dissertation,
University of Arizona, 259p.

Carr, J. R., 1995, Numerical Analysis for the Geological Sciences, Prentice-Hall,
Englewood Cliffs, New Jersey.

Carr, J. R. and Glass, C. E., 1984, "A Regionalized Variables Model for Seismic
Hazard Assessment," Eighth World Conf. on Earthquake Engineering, Prentice Hall,
Englewood Cliffs, New Jersey, Vol. 1, pp. 207-213.

Carr, J. R. and Bailey, R. E., 1986, "An Indicator Kriging Model for the
Investigation of Seismic Hazard," Mathematical Geology, Vol. 18, No. 4, pp.
409-428.

Cornell, C. A., 1968, "Engineering Seismic Risk Analysis," Bulletin of the


Seismological Society of America, Vol. 58, No. 5, pp. 1583-1606.

Deutsch, C. V. and Journel, A. G., 1992, GSLIB: Geostatistical Software Library


and User's Guide, Oxford University Press, New York.

Glass, C. E., 1978, "Application of Regionalized Variables to Microzonation," Proc.


2nd International Conference on Microzonation for Safer Construction - Research and
Application, Vol. 1, pp. 509-521.

Gumbel, E. J., 1958, Statistics of Extremes, Columbia University Press, New York.

Journel, A. G., 1983, "Nonparametric Estimation of Spatial Distributions," Journal


of the International Association for Mathematical Geology, Vol. 15, No.3, pp. 445-
468.

Journel, A. G., 1986, "Constrained Interpolation and Qualitative Information - The


Soft Kriging Approach," Mathematical Geology, Vol. 18, No.3, pp. 269-286.

Journel, A. G., and Huijbregts, Ch. J., 1978, Mining Geostatistics, Academic Press,
London.

Matheron, G., 1963, "Principles of Geostatistics," Economic Geology, Vol. 58, pp.
1246-1266.

van der Meer, F. D. and Carr, J. R., 1992, "Geostatistical Investigation of


Earthquake Hazards in Southern California," ITC (Int. Inst. for Aerospace Survey
and Earth Sciences) Journal, 1992-2, pp. 164-171.
Farida S. Goderya1, M. F. Dahab2, W. E. Woldt3, and I. Bogardi2.

SPATIAL PATTERNS ANALYSIS OF FIELD MEASURED SOIL NITRATE

REFERENCE: Goderya, F. S., Dahab, M. F., Woldt, W. E., Bogardi, I., "Spatial Pat-
terns of Field Measured Residual Soil Nitrate," Geostatistics for Environmental and
Geotechnical Applications, ASTM STP 1283, R. Mohan Srivastava, Shahrokh Rouhani,
Marc V. Cromer, A. Ivan Johnson, Alexander J. Desbarats, Eds., American Society for
Testing and Materials, 1996.

ABSTRACT: The purpose of this study was to assess the spatial variability of
residual soil nitrate, measured in three contiguous 16 ha fields. Available data for
residual soil nitrate were examined using conventional statistics. Data tended to be
skewed with the mean greater than the median. Geostatistical methods were used to
characterize and model the spatial structure. Three dimensional spatial variability was
examined using two semivariograms: horizontal-spatial and vertical. Two-
dimensional horizontal-spatial semivariograms were also computed for each 0.3 m (1 ft)
layer. Semivariogram analysis showed that there were similarities in the patterns of
spatial variability for all fields. The results suggest that the spatial patterns in
residual soil nitrate may be correlated with irrigation practices. Furthermore, a trend
was found to be present along the vertical direction, which may be related to the time
of sampling.

KEYWORDS: spatial variability, 3-D semivariogram, 2-D semivariogram,


directional semivariogram, residual soil nitrate.

INTRODUCTION

Nitrate contamination in groundwater is often related to nitrogen fertilizer


applied in excess of crop needs. Residual soil nitrate is frequently the largest source
of inorganic N available to crops. The amount of nitrate in the soil profile is
important for determining a fertilizer nitrogen recommendation that ensures sufficient
nitrogen for crop production as well as preventing potential groundwater problems.

The origin and nature of soil resource variability includes natural and
management induced soil parameters, and factors exhibiting variability in space and

1Graduate Research Assistant and 2Prof., Dept. of Civil Eng., University of Nebraska, Lincoln, NE 68588.
3Assistant Prof., Dept. of Biological Systems Engineering, University of Nebraska, Lincoln, NE 68583.

GODERYA ET AL. ON SOIL NITRATE 249
time (Bouma and Finke 1992). It is an outcome of many processes acting and
interacting over a continuum of spatial and temporal scales. Nitrate is a mobile
nutrient; moreover, soil resource and meteorological variability obscure the assessment
of its spatial structure. For example, soil nitrate concentrations from individual
samples are usually quite variable; in addition, the non-uniform distribution of
irrigation water complicates the issue.

Classical statistical procedures have traditionally been used to assess the


variability of various properties in soils (Biggar et al., 1973; Biggar and Nielsen
1976; Bresler 1989). The use of these techniques assumes that observations in the
field are independent of one another, regardless of their location. However, there is a
significant volume of literature in various disciplines such as geology (Davis, 1986;
Journel 1989), mining (Guaracio et al., 1975; Isaaks and Srivastava 1989; Journel and
Huijbregts 1978) and soil science (Beckett and Webster 1971; Dahiya et al., 1984;
Bhatti et al., 1991), which shows that variations in geologic properties tend to be
correlated across space. Thus, the classical methods may be inadequate for
interpolation of spatially dependent variables, because they assume random variation
and do not consider spatial correlation and relative location of samples.

The geostatistical approach has received increasing attention in science and


engineering during the last decade (Kalinski et al., 1993; Woldt et al., 1992; Woldt
and Bogardi, 1992; Tabor et al., 1985; Berndtsson et al., 1993; Jury et al., 1987;
Mulla 1988; Ovalles 1988; Rolston et al., 1989; Sutherland et al., 1991). The
primary reasons for the adoption of geostatistics in various fields are that this
methodology (1) provides an estimate for the minimum distance for the spacing of
independent samples, (2) provides a basis for an efficient monitoring program from an
initial reconnaissance survey, (3) allows the quantification of unbiased measurements
of location and spread, (4) furnishes optimal, unbiased estimates of regionalized
variables at unsampled locations, based on neighboring data, and (5) can also be used
to characterize associated uncertainty using geostatistical simulations.

To date, we are not aware of any attempts to characterize the spatial variability
of the residual soil nitrate using three dimensional spatial statistics. This information
is necessary since the spatial variability in residual soil nitrates has been considered a
major factor associated with inherent leaching of nitrate in many production
agriculture situations.

The primary objective of this study is to measure quantitatively the spatial


variability of residual soil nitrates in three fields. The hypothesis is that the
variability of residual soil nitrate in a field contributes to the variability of leaching to
the groundwater from the available soil-N pools. The analysis conducted in this study
will be further utilized in the modeling of nitrate contamination to groundwater. The
eventual goal of the project is to explore variable rate application methods by relating
residual soil nitrates and other parameters to the amount of nitrate leaching to
groundwater using geostatistical simulation and transport models.

METHODOLOGY

Samples from three contiguous 16 ha fields with differing management


histories were used to determine the spatial variability of residual soil nitrates
(Peterson and Schepers, 1992). Two fields are 396 m x 426 m, and one field is 365 m
x 426 m. Field data consist of residual soil nitrate measurements at each location on a
30.5 m x 30.5 m (100 ft x 100 ft) grid with a spacing of 20-40 m from the boundaries.
At each grid location, a single 5 cm (2-inch) diameter, 1.5 m (5 ft) long soil core was
collected and divided into 0.3 m (1 ft) increments. Hence, each layer in three separate
fields contained 156, 156, and 143 points respectively. Each sample was analyzed
separately and the results are reported as nitrate-nitrogen in 0.3 m (1 ft) depth
increment. The data for each point were used to study the 3-dimensional and 2-
dimensional spatial continuity of the residual soil nitrate.

Classical statistical parameters such as the mean, the standard deviation and
the coefficient of variation were calculated for each layer. Statistical parameters for
the overall three dimensional data sets (vertically averaged over core), as well as for
profile (vertically integrated nitrate content for each hole), were also calculated.
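Those summary statistics are straightforward to compute. The sketch below uses an invented, right-skewed sample and the usual definition C.V. = standard deviation / mean × 100%; the function name and data are illustrative only.

```python
import numpy as np

def layer_statistics(values):
    """Mean, median, sample standard deviation, and coefficient of
    variation (C.V. = std / mean x 100%) for one 0.3 m layer."""
    v = np.asarray(values, dtype=float)
    mean, median, std = v.mean(), float(np.median(v)), v.std(ddof=1)
    return {"mean": mean, "median": median, "std": std,
            "cv_percent": 100.0 * std / mean}

# invented, right-skewed sample: the mean exceeds the median, as in Table 1
layer = [20.0, 25.0, 30.0, 35.0, 40.0, 120.0]
stats = layer_statistics(layer)
```

For this sample the mean (45.0) exceeds the median (32.5), the skewness pattern reported for the field data.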

Structural analysis of the field data was used to evaluate the semivariogram
function using programs from GSLIB (Deutsch and Journel, 1993). Semivariograms
(Journel and Huijbregts 1978) were used to examine the spatial dependence between
measurements at pairs of locations as a function of distance of separation. Three-
dimensional spatial variability was examined for each of the fields using two
semivariograms: a horizontal-spatial semivariogram and a vertical semivariogram.
The semivariogram for horizontal spatially related data identifies the variability due to
distance and is combined for all the depths. However, the vertical semivariogram
describes the variabilities due to depth irrespective of horizontal location. Hence, for
the available data set of each field, two semivariograms were constructed.

Two-dimensional horizontal-spatial semivariograms also were calculated for
each layer, that is, for each 0.3 m (1 ft) layer, resulting in 5 different semivariograms
for each field. Furthermore, a 2-dimensional horizontal semivariogram also was
prepared for the vertically integrated nitrate content at each sample location.

In order to explore anisotropies, directional semivariograms were calculated
for each field in the horizontal spatial direction, keeping the direction of the vertical
dimension constant. They were prepared using the concept of layers, in which
semivariograms were calculated in different spatial directions, by restricting the
search window in the vertical dimension.
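The layer-restricted directional semivariogram described above can be sketched as follows. This is a hypothetical illustration, not the GSLIB code actually used: the vertical search window is simplified to "pairs in the same layer only," and the transect data are invented.

```python
import numpy as np

def directional_semivariogram(xy, layer, z, lags, lag_tol, azimuth, az_tol):
    """Horizontal directional semivariogram computed within layers: keep
    only pairs in the same layer (the vertical search window) and whose
    separation azimuth (degrees) is within az_tol of the requested one."""
    xy, layer, z = np.asarray(xy, float), np.asarray(layer), np.asarray(z, float)
    i, j = np.triu_indices(len(z), k=1)
    d = xy[j] - xy[i]
    dist = np.linalg.norm(d, axis=1)
    ang = np.degrees(np.arctan2(d[:, 1], d[:, 0])) % 180.0
    dir_ok = np.minimum(np.abs(ang - azimuth),
                        180.0 - np.abs(ang - azimuth)) <= az_tol
    keep = (layer[i] == layer[j]) & dir_ok
    sq = (z[i] - z[j]) ** 2
    out = []
    for h in lags:
        m = keep & (np.abs(dist - h) <= lag_tol)
        out.append(0.5 * sq[m].mean() if m.any() else np.nan)
    return np.array(out)

# one layer of samples along a 30.5 m east-west transect
xy = np.array([[30.5 * k, 0.0] for k in range(6)])
z = np.arange(6, dtype=float)
lay = np.zeros(6, dtype=int)
g_ew = directional_semivariogram(xy, lay, z, [30.5], 1.0, azimuth=0.0, az_tol=20.0)
g_ns = directional_semivariogram(xy, lay, z, [30.5], 1.0, azimuth=90.0, az_tol=20.0)
```

For this east-west transect the north-south direction has no admissible pairs, so its semivariogram value is undefined (NaN), illustrating how the direction tolerance partitions the pairs.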

RESULTS AND DISCUSSION

Residual soil nitrate in the profile was highly variable, ranging from 64 to 650
kg/ha (57 to 580 lbs/acre) with a mean of 192 kg/ha (173 lbs/acre). Table 1 shows
the statistical parameters for the three fields. For each layer, data tended to be
skewed with the mean greater than the median. The general trend was toward an
increase in the values of coefficient of variation and a decrease in the values of
residual soil nitrogen with increasing depth. For overall 3-dimensional measurement
values, the distribution of data was skewed with large coefficient of variation.
TABLE 1--Residual soil nitrate from three fields.

Field    Layer    Minimum  Maximum  Mean     Median   Std. dev  C.V. (%)
                  (kg/ha)  (kg/ha)  (kg/ha)  (kg/ha)  (kg/ha)

Field 1  1        19.35    239.9    84.95    81.45    41.18     48.47
         2        9.27     144.35   39.87    33.67    25.02     62.67
         3        8.06     125.0    30.63    24.6     20.46     66.79
         4        8.47     123.78   32.76    23.79    22.98     70.15
         5        7.66     95.57    29.73    23.79    18.85     63.39
         profile  68.14    650.76   217.94   198.17   96.24     44.16
         overall  7.66     239.90   43.6     31.85    34.09     78.20

Field 2  1        20.16    271.76   73.81    68.95    33.32     45.15
         2        9.27     117.33   35.87    30.64    20.05     57.16
         3        7.26     151.6    31.12    26.81    21.83     70.14
         4        7.66     157.65   28.10    22.18    21.26     75.64
         5        6.45     75.8     25.97    25.4     14.44     55.61
         profile  64.11    574.16   194.87   177.61   91.73     47.07
         overall  6.45     271.76   39.00    30.24    29.08     74.61

Field 3  1        17.34    160.47   72.71    67.33    30.53     41.99
         2        12.5     124.99   32.87    30.24    15.13     46.04
         3        6.85     75.4     25.51    23.79    11.93     46.78
         4        6.85     51.61    19.30    17.74    8.01      41.48
         5        5.24     43.55    14.86    13.31    6.67      44.87
         profile  65.72    362.48   165.25   156.44   51.68     31.27
         overall  5.24     160.50   33.01    24.60    26.70     80.7

** C.V. = Coefficient of Variation

The horizontal-spatial semivariograms are shown in Figures 1a, 2a and 3a for


the three fields. The semivariograms for all three fields have similar shapes.
Theoretically, the semivariogram should pass through the origin when the distance is
zero. However, all sample semivariograms appeared to approach non-zero values as
distance decreased to zero, indicating the presence of a nugget effect.

The vertical experimental semivariograms and the models fitted are shown in
Figures 1b, 2b, and 3b. The maximum distance considered in the computation of the
semivariogram cannot exceed half the maximum dimension of the field (i.e., 0.75 m
for the vertical semivariogram) (Journel and Huijbregts, 1975). Thus, only the first
two values of the vertical semivariogram are reliable. All vertical semivariograms do
not reach a sill, indicating a trend in the property studied. If the information
contained in the semivariogram is to be used for kriging at unsampled locations, the
trend may need to be removed, or universal kriging may be used. A reason for this
trend is most probably related to the presence of high amounts of residual soil nitrate
in the surface layer. Figure 4 shows the average amount of nitrate-N in each layer
for the three fields. Significant differences between the top layer and subsequent
layers may be related to the time of sampling. The results probably exhibit the
influence of temporal dynamics due to the spring sampling of the fields. This may be
because high mineralization and almost no precipitation/irrigation occurred at the time
of sampling of these fields. For this reason, two different types of theoretical models
were fitted to the vertical semivariograms; power and spherical models. If the data
are to be used for simulation purposes, then the power model may not be used, and
hence, another model should be used.

Fitting a model to the experimental semivariogram is a significant step in the
geostatistical analysis. It is important to select an appropriate model for the
semivariogram because each model yields different values for the nugget effect and
range.
range. A satisfactory fit to the sample variogram was accomplished by the trial and
error approach described by Isaaks and Srivastava (1989). Due to resource
constraints, only omni-directional horizontal-spatial semivariograms and vertical
semivariograms were fit for each field. Table 2 provides the
values of semivariogram models for the above mentioned cases. Parameters for the
two types of theoretical semivariograms for the vertical direction also are provided in
Table 2. Good agreement was obtained between calculated semivariogram values and
the corresponding models, as shown in Figures 1, 2, and 3.
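The fitted models in Table 2 can be evaluated at any lag directly. The sketch below (Python; illustrative, not part of the original study) implements the standard spherical and power semivariogram forms and plugs in the Field 1 parameters from Table 2; treating the tabulated sill and slope as contributions added on top of the nugget is an assumption of this illustration.

```python
import numpy as np

def spherical(h, nugget, sill, a):
    # Spherical model: rises from the nugget and levels off at nugget + sill
    # once the lag h reaches the range a.
    h = np.asarray(h, dtype=float)
    g = nugget + sill * (1.5 * h / a - 0.5 * (h / a) ** 3)
    return np.where(h < a, g, nugget + sill)

def power(h, nugget, slope, p):
    # Power model: unbounded growth, consistent with the trend observed
    # in the vertical semivariograms.
    return nugget + slope * np.asarray(h, dtype=float) ** p

# Field 1 horizontal-spatial parameters from Table 2 (units (kg/ha)^2, m)
print(float(spherical(244.0, 420.0, 810.0, 244.0)))  # 1230.0 at the range
# Field 1 vertical power model at a 1 m lag
print(float(power(1.0, 420.0, 110.0, 1.0)))          # 530.0
```

Either form can be substituted into a kriging system; the power model, lacking a sill, is the one excluded from simulation in the text above.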

The range values for horizontal-spatial semivariograms showed considerable
variability among the fields: the scale of horizontal-spatial correlation varies from
about 150 m to 244 m (500 ft to 800 ft). The range of the semivariogram model for
Field 1 was significantly larger than for the other two fields. In the vertical direction
the range varied between 1.5 m and 3 m (5 ft and 10 ft). There was a two-order-of-
magnitude difference between the ranges of the horizontal-spatial and vertical dimensions.
This represents a system in which the vertical plane is much smaller in scale than the
horizontal plane. A typical approach employed for this system is to examine the
transport process locally as a vertical one-dimensional flow perpendicular to any
layering in the medium (Jury et al., 1987). The complete structural analysis for both
the horizontal-spatial dimension and the vertical dimension represents a combination
of geometric and zonal anisotropy. The complete structural analysis of hydraulic
properties for both dimensions may show the same pattern.

There were no data available for lag distances less than 30 m (100 ft) in the
GODERYA ET AL. ON SOIL NITRATE 253

FIGURE l--Experimental (symbols) and theoretical (lines) semivariograms for Field 1;


(a) horizontal-spatial and (b) vertical; (-) spherical model and (·) power model.


FIGURE 2--Experimental (symbols) and theoretical (lines) semivariograms for Field 2;


(a) horizontal-spatial and (b) vertical; (-) spherical model and (·) power model.


FIGURE 3--Experimental (symbols) and theoretical (lines) semivariograms for Field 3;


(a) horizontal-spatial and (b) vertical; (-) spherical model and (·) power model.

FIGURE 4--Amount of average residual soil nitrate in each layer.

TABLE 2--Semivariogram parameters of residual soil nitrates for three fields.


                                           Field 1     Field 2      Field 3

Horizontal-Spatial   Nugget (kg/ha)²       420         130          130
                     Sill (kg/ha)²         810         330 (1)      190 (1)
                                                       430 (2)      290 (2)
                     Range (m)             244         30.5 (1)     30.5 (1)
                                                       122 (2)      152.4 (2)

Vertical-Model 1     Nugget (kg/ha)²       420         130          130
(power)              Slope (kg/ha)²        110         200          230
                     Power                 1.0         1.15         1.2

Vertical-Model 2     Nugget (kg/ha)²       130         130          130
(spherical)          Sill (kg/ha)²         1550        900          1000
                     Range (m)             1.5         1.5          1.5

Note: numbers in parentheses refer to nested structures 1 and 2.
horizontal-spatial direction; hence, the nugget effect was estimated by visual
inspection. There appear to be two different values for the nugget effect in the
horizontal and the vertical directions for Field 1. The small nugget effect of the
vertical semivariogram may be detected because of the small spacing between data
points in the vertical direction (see Figures 1a and 1b).

Spatial variability can also be investigated using the semivariogram and the
relative nugget effect, that is, the ratio of nugget to total semivariance expressed as a
percentage. A ratio of less than 25% indicates strong spatial dependence, between 25%
and 75% indicates moderate spatial dependence, and greater than 75% indicates weak
spatial dependence (Cambardella et al. 1994). The horizontal-spatial semivariograms may
be described as having moderate spatial dependence for residual soil nitrate.
However, if one considers the spherical model for the vertical semivariograms, then the
vertical semivariograms may be characterized by strong spatial dependence, exhibiting
ratios of less than 25%. Strong to moderate spatially dependent structures may be
controlled by intrinsic and extrinsic variations as well as seasonal variations.
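The classification rule above is simple enough to express directly in code. A minimal sketch follows; taking nugget plus sill from Table 2 as the total semivariance is an assumption of this illustration.

```python
def spatial_dependence(nugget, total_semivariance):
    # Relative nugget effect (Cambardella et al. 1994): nugget expressed
    # as a percentage of the total semivariance.
    ratio = 100.0 * nugget / total_semivariance
    if ratio < 25.0:
        return "strong"
    if ratio <= 75.0:
        return "moderate"
    return "weak"

# Field 1 horizontal: nugget 420, total taken as 420 + 810 = 1230 (kg/ha)^2
print(spatial_dependence(420.0, 1230.0))  # moderate
# Field 1 vertical (spherical): nugget 130, total taken as 130 + 1550 = 1680
print(spatial_dependence(130.0, 1680.0))  # strong
```

Both example calls reproduce the qualitative conclusions stated in the text: moderate dependence horizontally, strong dependence vertically.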

Two-dimensional horizontal semivariograms are shown in Figures 5, 6, and 7.
These semivariograms are calculated individually for each layer and also for the
profile (i.e. the sum of amounts in all layers for each grid location), without any
regard to the vertical dimension. If one compares the form of spatial variability of
each individual layer with that of a profile, it is obvious that the form of the structure
is similar to the top layer, indicating that the top layer structure is representative of
the overall spatial structure. The large impact of the top layer semivariogram on the
profile semivariogram is due to the larger variance of residual nitrate concentrations
in the top layer relative to other layers (see Table 1). As a result, if one has to
measure the field again or measure other fields with a similar structure, it may be
appropriate to assess each location to a depth of 0.3 m (1 ft) and then sample every
fourth or fifth location at lower depths. However, classical statistics reveal high
coefficient of variation values for the deeper layers as compared to the first layer.
Further analysis is necessary to determine an ideal sampling approach.

There was less nitrogen in the soil profile in the third field, and there was less
variability in samples from different layers of this field, as compared to the other two
fields. However, overall (vertically averaged over core) sample variability was the
same or higher (see Table 1 and Figures 3 and 7). Further investigation indicated
that this field received more irrigation water in the previous two years than the other
two fields. It is probable that the excessive application of irrigation water leached
much of the nitrate from the profile and reduced the amount and spatial variability of
residual soil nitrate.

Six directional semivariograms were calculated for each field. All directions
corresponded to rotations in the horizontal plane only. The directions considered
were North, N30E, N60E, N90E, N120E, and N150E, with azimuth half tolerance of
45 degrees. Directional semivariograms are presented as a contour map of the sample
variogram surface (planimetric form) in Figures 8, 9, and 10 for Fields 1, 2, and 3,
respectively. The values contoured are the semivariance in every direction to a
distance of at least 200 meters, with contour intervals in (kg/ha)². Differences
between direction-dependent semivariograms for the fields studied could be the result
of the differences in geology, topography, and/or management of the area.
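A directional semivariogram with an azimuth half tolerance, as used here, can be sketched as follows (an illustrative Python implementation, not the authors' code; azimuths are measured clockwise from north, and opposite bearings are folded onto a single axis, so a 45-degree half tolerance matches the tolerance quoted in the text):

```python
import numpy as np

def directional_semivariogram(xy, z, azimuth_deg, half_tol_deg, lag, nlags):
    # Experimental semivariogram restricted to pairs whose separation
    # azimuth lies within +/- half_tol_deg of the requested direction.
    xy = np.asarray(xy, float)
    z = np.asarray(z, float)
    target = azimuth_deg % 180.0
    gamma = np.zeros(nlags)
    counts = np.zeros(nlags, dtype=int)
    n = len(z)
    for i in range(n):
        for j in range(i + 1, n):
            dx, dy = xy[j] - xy[i]
            h = np.hypot(dx, dy)
            if h == 0.0:
                continue
            az = np.degrees(np.arctan2(dx, dy)) % 180.0  # axis, not direction
            diff = abs(az - target)
            if min(diff, 180.0 - diff) > half_tol_deg:
                continue
            k = int(h // lag)
            if k < nlags:
                gamma[k] += 0.5 * (z[i] - z[j]) ** 2
                counts[k] += 1
    out = np.full(nlags, np.nan)
    ok = counts > 0
    out[ok] = gamma[ok] / counts[ok]
    return out
```

Calling this six times with azimuths of 0, 30, 60, 90, 120, and 150 degrees would reproduce the six directional semivariograms described; lag classes with no admissible pairs return NaN.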

In this case it is speculated that the significant effects of the north-south and
east-west directions across each field were largely due to the irrigation pattern in the
fields. These fields were surface irrigated with water being distributed on the west
side of the field. Hence, residual soil nitrate appears to follow trends in irrigation
water supply. The variogram surface in the east-west direction is more continuous

FIGURE 5--2-D semivariograms for Field 1; (a) five depths; and (b) total in profile.


FIGURE 6--2-D semivariograms for Field 2; (a) five depths, and (b) total in profile.


FIGURE 7--2-D semivariograms for Field 3; (a) five depths, and (b) total in profile.

FIGURE 8--A contour map of the semivariogram values for Field 1. Contour interval
is 50 (kg/ha)².


FIGURE 9--A contour map of the semivariogram values for Field 2. Contour interval
is 50 (kg/ha)².

than in the north-south direction. In other words, the irrigation pattern seems to
result in high variability (larger sill values), with the variogram surface rising rapidly
in the north-south direction. Hence, the directional semivariograms indicate the

FIGURE 10--A contour map of the semivariogram values for Field 3. Contour interval
is 25 (kg/ha)².

presence of anisotropy. However, this can be classified as a mild case of geometric
and zonal anisotropy, which is apparent in all three fields at larger distances.

SUMMARY AND CONCLUSIONS

Geostatistical analyses showed that residual soil nitrates in three fields were
spatially structured. This spatial structure is important to consider, both for fertilizer
application and for evaluation of potential pollutant transport to the groundwater. The
apparent spatial variability in the residual soil nitrate has the potential to seriously
limit the efficiency of fertilizer application according to traditional practices.
Conventional statistical analysis showed that the residual soil nitrate in the profile was
variable, ranging from 64 to 650 kg/ha (57 to 580 lbs/acre) with a mean of 192 kg/ha
(173 lbs/acre). Data tended to be skewed, with the mean greater than the median.

Geostatistical techniques offer alternative methods to conventional statistics for
the estimation of parameters and their associated variability. Three-dimensional
semivariograms were calculated for each field. Two different semivariograms were
also calculated for each field, horizontal-spatial semivariogram and vertical
semivariogram. In addition, two dimensional semivariograms were prepared for each
layer. Finally, six directional semivariograms also were calculated for each field.

Semivariogram analysis demonstrated that there were similarities in the
patterns of spatial variability for the three fields. This may suggest that spatial
relationships derived from one set of measurements for one field may have
applicability at other field sites. Since spatial structures are influenced by the scale of
the investigation, it remains to be seen whether or not this approach will be useful for
extrapolating spatial information obtained at the field-scale to the watershed or
regional scale.

The 3-dimensional and 2-dimensional semivariogram analyses resulted in
similar structure and form for all three fields. Three-dimensional horizontal-spatial
semivariograms showed that for all three fields, the range was about 120 to 245 m.
In the vertical direction the range varied between 1.5 m and 3 m (5 and 10 ft). The
complete structure for both the horizontal-spatial dimension and the vertical dimension
represents a combination of geometric and zonal anisotropy. The complete structural
analysis of hydraulic properties for both dimensions may show the same pattern.
Three-dimensional vertical semivariograms also displayed a significant trend which
may be related to conditions at the time of data collection.

The nugget values expressed as a percentage of the total semivariance define
different classes of spatial dependence. Horizontal-spatial semivariograms indicated
moderate spatial dependence, while the vertical semivariograms were characterized by
strong spatial dependence, exhibiting ratios less than 25 %. Strong to moderate
spatially dependent structures may be controlled by intrinsic and extrinsic variations
as well as seasonal variations.

The two-dimensional analysis showed a strong spatial pattern in the top layer,
which is displayed in the overall structure of the 2-dimensional semivariograms. The
analysis further revealed that the soil nitrates at 0.6 m to 1.5 m (2 to 5 ft) depths may
be sampled without great sensitivity to location, with a resulting similar variance.
Direction-dependent semivariograms showed that residual soil nitrates apparently
followed trends in irrigation water supply. This pattern resulted in high variability in
the direction perpendicular to irrigation water flow.

The structural information can be useful in the management of production
agriculture systems in which variable-rate application of nitrogen can be used to
increase production and reduce the risk of groundwater contamination. The balance
between crop uptake rates and residual soil nitrogen can also lead to more cost-
effective fertilizer application rates without increasing the risk of groundwater
pollution.

ACKNOWLEDGEMENT

This paper was supported, in part, by the Center for Infrastructure Research,
the Water Center, and the University of Nebraska-Lincoln and, in part, by the
Cooperative State Research Service (CSRS) of the U.S. Department of Agriculture
(Grant Number 92-34214-7457). Assistance provided by Dr. T. A. Peterson from the
Department of Agronomy of the University of Nebraska-Lincoln is acknowledged.

REFERENCES

Beckett, P. H. T., and Webster, R., 1971, "Soil variability: A review", Soils and
Fertilizers, Vol. 34, No. 1, pp. 1-15

Berndtsson, R., Bhari, A., and Jinno, K., 1993, "Spatial dependence of geochemical
elements in a semiarid agricultural field: I. Geostatistical properties", Soil Science
Society of America J., Vol. 57, pp. 1323-1329

Bhatti, A. U., Mulla, D. J., Koehler, F. E., and Gurmani, A. H., 1991, "Identifying
and removing spatial correlation from yield experiment", Soil Science Society of
America J., Vol. 55, pp. 1523-1528

Biggar, J. W., Nielsen, D. R., and Erh, K. T., 1973, "Spatial variability of field-
measured soil-water properties", Hilgardia, Vol. 42, No. 7, pp. 214-259

Biggar, J.W., and Nielsen, D. R., 1976, "Spatial variability of the leaching
characteristics of a field soil", Water Resources Research, Vol. 12, No. 1, pp. 78-84

Bouma, J., and Finke, P. A., 1992, "Origin and nature of soil resource variability",
Proceedings of the Soil Specific Crop Management Conference, Minneapolis,
Minnesota, April 14-16

Bresler, E., 1989, "Estimation of statistical moments of spatial field averages for soil
properties and crop yields", Soil Science Society of America J., Vol. 53, pp. 1645-
1653

Cambardella, C.A., Moorman, T. B., Novak, J. M., Parkin, T. B., Karlen, D. L.,
Turco, R. F., and Konopka, A. E., 1994, "Field-scale variability of soil properties in
central Iowa soils", Soil Science Society of America J., Vol. 58 (In press)

Dahiya, I. S., Richter, J., and Malik, R. S., 1984, "Soil spatial variability: A
review", International Journal of Tropical Agriculture, Vol. 11, No.1, pp. 1-102

Davis, J., 1986, "Statistics and Data Analysis in Geology", John Wiley & Sons, New
York, NY

Deutsch, C.V., and Journel, A. G., 1992, "GSLIB: Geostatistical Software Library
and User's Guide", Oxford University Press, New York, NY

Guarascio, M., David, M., and Huijbregts, C. J., 1975, "Advanced Geostatistics in
the Mining Industry", D. Reidel Publishing Company, Dordrecht, Holland

Isaaks, E. H., and Srivastava, R. M., 1989, "An Introduction to Applied
Geostatistics", Oxford University Press, New York, NY
Journel, A. G., and Huijbregts, C. J., 1978, "Mining Geostatistics", Academic press,
New York, NY

Jury, W. A., Russo, D., Sposito, G., and Elabd, H., 1987, "The spatial variability of
water and solute transport properties in unsaturated soil; I. Analysis of property
variation and spatial structure with statistical models", Hilgardia, Vol. 55, No.4, pp.
1-32

Kalinski, R.J., Kelly, W. E., Bogardi, I., and Pesti, G., 1993, "Electrical resistivity
measurements to estimate travel times through unsaturated ground water protective
layers", Journal of Applied Geophysics, Vol. 30, pp. 161-173

Mulla, D. J., 1988, "Estimating spatial patterns in water content, matric suction, and
hydraulic conductivity", Soil Science Society of America J., Vol. 52, pp.1547-1553

Ovalles F. A., and Collins, M. E., 1988, "Evaluation of soil variability in northwest
Florida using geostatistics", Soil Science Society of America J., Vol. 52, pp. 1702-
1708

Peterson T. A., and Schepers J. S., 1992, "Spatial distribution of soil nitrate at the
Nebraska MSEA site", Agriculture Research to Protect Water Quality, Poster Paper,
USDA Agricultural Research Service, University of Nebraska, Lincoln, NE

Rolston, D. E., and Liss, H. J., 1989, "Spatial and temporal variability of water
soluble organic carbon in a cropped field", Hilgardia, Vol. 57, No.3, pp. 1-19

Sutherland, R. A., Kessel, C. V., and Pennock, D. J., 1991, "Spatial variability of
Nitrogen-15 natural abundance", Soil Science Society of America J., Vol. 55, pp.
1339-1347

Tabor, J. A., Warrick, A. W., Myers, D. E., and Pennington, D. A., 1985, "Spatial
variability of nitrate in irrigated cotton: II. Soil nitrate and correlated variables", Soil
Science Society of America J., Vol. 49, pp. 390-394

Webster, R., and Burgess, T. M., 1983, "Spatial variation in soil and the role of
Kriging", Agricultural Water Management, Vol. 6, pp. 111-122

Woldt, W., and Bogardi, I., 1992, "Ground water monitoring network design using
multiple criteria decision making and geostatistics", Water Resource Bulletin, Vol.
28, No.1, pp. 45-62

Woldt, W., Bogardi, I., Kelly, W. E., and Bardossy, A., 1992, "Evaluation of
uncertainties in a three-dimensional groundwater contamination plume", Journal of
Contaminant Hydrology, Vol. 9, pp. 271-288
Dae S. Young1

GEOSTATISTICAL JOINT MODELING AND PROBABILISTIC STABILITY ANALYSIS FOR
EXCAVATIONS

REFERENCE: Young, D. S., "Geostatistical Joint Modeling and Probabilistic Stability Anal-
ysis for Excavations," Geostatistics for Environmental and Geotechnical Applications, ASTM STP
1283, R. M. Srivastava, S. Rouhani, M. V. Cromer, A. I. Johnson, A. J. Desbarats, Eds., American
Society for Testing and Materials, 1996.

ABSTRACT: Two geostatistical interpolation methods were studied for
rock joint modeling: ordinary kriging and indicator kriging.
Geostatistics was extended to improve the spatial interpretation of
joint parameters, especially for pole vectors, which were kriged on the
sphere. A matrix approach was introduced for probabilistic block
failure analysis, and applied to study the stability of pit slopes and
subway tunnels. The localized structural stability was achieved in
probabilistic terms based on the geostatistical joint model.

KEYWORDS: joint models, geostatistics, block failure, matrix approach,
probabilistic stability analysis.

Rock joints play an important role in rock mechanics (particularly


for structural stability analysis) and geohydrology (particularly for
fluid-flow in fractured rocks). The site characterization requires the
characterization of both the rock mass and the joint systems within the
rock mass. In this paper, recent advances in joint system modeling and
its applications to rock mechanics are presented to demonstrate the
superiority of the resultant engineering analysis from the advanced
models. It was achieved mainly from geostatistics, which incorporates
the spatial variability of joint characteristic parameters into the
modeling and localizes them to build a localized discrete cell-block
model of joint systems.
Considering many characteristics of rock joints that are best
described in probabilistic terms, there are intrinsic advantages in
geotechnical approaches that directly employ the relevant statistical
distributions (Baecher et al. 1977, Warburton 1980).
Consequently, an appropriate model of joint systems in a rock mass
is the localized probabilistic model, and a realistic geotechnical
analysis of rock structures is a probabilistic approach made on the

1Associate Professor, Mining Engineering Department, Michigan


Technological University, Houghton, MI 49931.

YOUNG ON ANALYSIS OF EXCAVATIONS 263
model, which will yield local structural stability in terms of the
probability of failure.
Since the block size distribution (i.e., blocks formed by the joints
in a rock mass) can describe or be related to these engineering
criteria, it is a pertinent characteristic parameter in numerous
engineering studies including tunneling and underground excavations,
rock bolting and other types of supporting systems, engineering
classifications of rock mass, key block analysis for structural
stability, drilling and blasting, and transmissibility of fluids through
fractured rock formations.
In this paper, a numerical method was developed to identify blocks
and calculate their sizes (or volumes), shapes, and locations, as well
as their stability. The connectivity matrix was introduced in this
numerical approach, which is equivalent to the stiffness matrix of the
finite element method of stress analysis. Then, the key block analysis
was extended for the probabilistic structural analysis based on the
connectivity matrix. Finally, the key objective, localized
probabilistic structural analysis, was achieved by applying the finite
element approach for the key block analysis to the discrete cell-
block model of the joint system.

JOINT MODELS

Rock joint systems surveyed in the field and characterized


statistically often are incorporated in the geotechnical analysis
through the joint model. Because of the complexity of joint geometries
and their characteristic nature, as well as limited accessibility in the
field for joint surveys, various degrees of simplification or
assumptions are made in joint modeling. Thus corresponding joint models
are developed depending on the field geology, modeling purposes and the
model's end usages.
Basic joint models used in the geotechnology are created either to
simulate the spatial distribution of joints; the joint network
simulation that will duplicate the statistics of joint parameters
sampled, or to replace the joint systems with the equivalent continuous
media that can represent effectively the statistics of joints. Most of
these models share common features and assumptions such as planar
joints, a uniform distribution in space for joint locations and their
independency with other joint parameters. Also, these models could be
generated as either deterministic or stochastic models by using the
statistical distribution of joint parameters.

GEOSTATISTICAL APPROACH

Recently geostatistics has been applied to joint system modeling,


joint network simulations (Chiles 1988) and discrete block models of the
equivalent continuum media (Young 1987a, 1987b). In the joint network
simulation, joint planes are considered as discs and the parent-daughter
model is applied for the location of joint disc centers, in which
daughters are nucleated around parents. Then the density of the parents
is regionalized for the geostatistical simulation of the joint networks.
When studying a large volume of rock, such an approach requires
generating a huge number of joints that cannot be handled easily on the
computer. A simple shortcut to avoid this problem is to estimate
average characteristic properties in discrete cell-blocks.

DISCRETE CELL-BLOCK MODEL

In most engineering analyses in rock mechanics and geohydrology, the


network geometry of joint systems can be replaced with the discrete
cell-block model, because the joint parameters can be used directly as

input data or they can be converted into the equivalent continuum media
that represents the joint systems effectively. In this discrete model,
the entire area (or rock mass) to be modeled is divided into uniform
cell-blocks and the characteristic parameters are inferred for each
cell-block from the sparse sample data measured in the field.
A few cases of geostatistics applications to geotechnology are
reported where rock mass characteristic parameters were found to be
spatially correlated random variables and geostatistical interpretations
are a must to incorporate this phenomenon into the modeling (Chiles 1988,
Young 1987a, 1987b, Miller 1979).
In most of these cases, the regionalized variables are in scalar
terms but the joint orientations or poles are considered unit vectors.
This means that the pole vectors should be kriged on the unit sphere
where they are projected and analyzed traditionally (a stereonet
projection for orientations) (Young 1987a, 1987b).

KRIGING ON THE SPHERE

Traditionally, joint orientations are projected on a stereonet as a


pole to define and analyze them. A stereonet is simply the projection
of a unit sphere on a plane, and so a pole is a unit vector projected on
a unit sphere or stereonet.
Therefore, a kriging system was developed on the sphere for the
spatial analysis of pole vectors (Young 1987a). In this kriging, the
pole vector Z(x) at a location x was regionalized directly. The spatial
variability of poles was introduced by the vectorial variogram, which
was defined as the expectation of the squared norm of the difference of
two pole vectors over a vectorial distance h:

2γ(x,h) = E[ |Z(x) - Z(x+h)|² ]

So, it is the magnitude of the vector difference; the distance AB in


Figure 1, rather than an angular difference.

FIG. 1--Two pole vectors regionalized and their difference vector


projected on the upper reference hemisphere.
Under the intrinsic hypothesis (Journel and Huijbregts 1978), the
variogram was estimated by a mean value of samples (or poles) grouped
over a distance h, as done for scalar variables;

2γ(h) = (1/N(h)) Σ |Z(x) - Z(x+h)|²
where N(h) is the number of sample pairs available at h.
As seen here, the vector variogram turns out to be a scalar quantity,
consistent with the definition of the classical scalar-variable
variogram (Journel and Huijbregts 1978). Therefore, the classical
operations of geostatistics can be applied to the pole vectors through
the vector variogram, including estimation variance analysis, dispersion
variance analysis, and variogram structural analysis (Journel and
Huijbregts 1978).
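The sample vector variogram defined above translates almost directly into code. The sketch below is illustrative only; the trend/plunge convention assumed in `pole` is this writer's choice, and pairs are grouped into lag classes of constant width.

```python
import numpy as np

def pole(trend_deg, plunge_deg):
    # Unit pole vector (east, north, down components) from trend/plunge.
    t, p = np.radians(trend_deg), np.radians(plunge_deg)
    return np.array([np.cos(p) * np.sin(t), np.cos(p) * np.cos(t), np.sin(p)])

def pole_variogram(coords, poles, lag, nlags):
    # gamma(h) from 2*gamma(h) = (1/N(h)) * sum |Z(x) - Z(x+h)|^2,
    # with pairs grouped into lag classes of width `lag`.
    coords = np.asarray(coords, float)
    poles = np.asarray(poles, float)
    num = np.zeros(nlags)
    cnt = np.zeros(nlags, dtype=int)
    n = len(poles)
    for i in range(n):
        for j in range(i + 1, n):
            k = int(np.linalg.norm(coords[i] - coords[j]) // lag)
            if k < nlags:
                num[k] += np.sum((poles[i] - poles[j]) ** 2)
                cnt[k] += 1
    gamma = np.full(nlags, np.nan)
    ok = cnt > 0
    gamma[ok] = 0.5 * num[ok] / cnt[ok]
    return gamma
```

Because the poles are unit vectors, gamma ranges from 0 (identical orientations) to 2 (antipodal poles), which is the scalar quantity referred to in the text.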
By defining the estimation variance as the expected squared norm of the
difference vector,

σE² = E[ |ZV - ZV*|² ]

where ZV = the actual vector to be estimated for the support V (equivalent
to a local cell-block) and ZV* = the estimate of ZV, the kriging system
for pole vectors was rederived as follows (Young 1987a):
Σβ λβ γ(vα, vβ) + μ = γ(vα, V),    for all α = 1, ..., n

[exclusively followed the notation in Journel and Huijbregts (1978, p.


306)].
Then, the kriging estimate and its variance are obtained respectively as
follows:

ZV* = Σα λα Z(xα),    α = 1, ..., n

and

σK² = Σα λα γ(vα, V) + μ - γ(V, V)

(μ = Lagrange multiplier)
As shown above, the kriging system of vector variables (or poles) is
the same as ordinary kriging (OK) of scalar variables, depending on the
definition of the estimation variance. In this vector kriging system,
the magnitude of the estimation error vector was optimized. The kriging
variance σK² is not a local conditional estimation variance and its
application is limited (Journel and Huijbregts 1978).
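Because the pole-vector system has the same form as scalar OK, the weights can be found with one standard OK solve and then applied componentwise to the sample poles. A minimal sketch follows; renormalizing the kriged vector back to unit length is an added assumption here, since a weighted average of unit vectors is generally shorter than unity.

```python
import numpy as np

def ok_weights(gamma_ab, gamma_aV):
    # Solve the ordinary-kriging system:
    #   sum_b lambda_b * gamma(v_a, v_b) + mu = gamma(v_a, V),  a = 1..n
    #   sum_b lambda_b = 1
    # and return (weights, Lagrange multiplier).
    n = len(gamma_aV)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = gamma_ab
    A[n, n] = 0.0
    b = np.append(np.asarray(gamma_aV, float), 1.0)
    sol = np.linalg.solve(A, b)
    return sol[:n], sol[n]

def kriged_pole(weights, sample_poles):
    # Weighted vector mean of the sample poles, pushed back onto the
    # unit sphere (renormalization is an assumption of this sketch).
    z = np.asarray(weights) @ np.asarray(sample_poles, float)
    return z / np.linalg.norm(z)
```

For two symmetric samples the solve returns equal weights of 0.5 with a zero multiplier, which is the expected OK behavior.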
The kriged mean vector represents the average orientation of joints
within the cell-block, V, and its accuracy can be measured by its
kriging variance. So, it creates a deterministic model and is good
enough for only deterministic analysis of geotechnology, but it is still
a localized model.
However, the full statistical distribution of pole vectors within a
cell-block V is needed to generate a stochastic model of poles for
probabilistic engineering analysis. This was achieved by using
Indicator Kriging (IK) (Young 1987b, Young and Hoerger 1988a).

LOCAL DETERMINISTIC MODEL

In this localized deterministic cell-block model, mean values of


joint parameters are estimated for each local block by OK and the joint
systems are characterized for every block in the entire area modeled.
This is the best type of model that can be developed from the sample
information typically available, since OK incorporates the spatial
variability observed in the sample into the kriging estimator via
variograms, and minimizes the estimation variance.

LOCAL STOCHASTIC MODEL

Considering the dispersion of joint parameters over their means and


the complexity of the characteristic nature of joints, the mean value
does not carry much meaning nor is it close to the reality, and neither
is the deterministic engineering analysis. This difficulty was
corrected in the stochastic model, which provides the full statistical
distribution of joint parameters for each local cell-block. Thus,
probabilistic engineering is applicable to the model at the early stages
of site exploration and engineering design.
The local probability distribution was estimated by IK, more
precisely Mononodal IK, which is a non-parametric approach (Lemmer
1984). The original IK approach was rederived for vectorial variables
(pole vectors in this case) and indicator variables were defined on the
two-dimensional area of class intervals, which was projected on the
Grossman's tangent plane for pole histograms (Lemmer 1984). The
accuracy of this stochastic model was then cross validated by comparing
the model with the actual field data in the following open pit case
(Young 1987b, Young and Hoerger 1988a).

FINITE ELEMENT APPROACH FOR KEY BLOCK FAILURE

The traditional key block theorem for the stability analysis on the
structures excavated in a jointed rock mass is a deterministic method
based on deterministic infinite joint planes. This means that the
location and frequency of joints and size of joint planes are excluded
from the block failure analysis, and it provides a worst-case analysis.
Consequently a numerical approach was developed, which is general
for both joint system models and any structure (their size and shape).
Also, it is an effective algorithm to computerize the entire block
failure analysis in probabilistic terms. Therefore, it can be combined
easily with the local stochastic model of joint systems to achieve the
localized probabilistic analysis of block failures.
The numerical algorithm was developed based on the connectivity
matrix, which is comparable to the stiffness matrix of the finite
element method of stress analysis in the continuum mechanics.
Connectivity Matrix Approach
The local area, where the joint systems were simulated and the key
block analysis desired, was replaced with the discrete finite element
model as used in the finite element method of engineering mechanics.
However, the elements were constructed by two-force bars as in truss
structures rather than by solid elements. Then, the local area can be
represented as a large truss structure with bars connected at nodal
points, whose continuity and immobility were secured.
When the rock mass in the local area is cut by joints, some elements
will be cut, as will the connection bars within those elements. Many
independent small truss structures will be formed when the rock mass is
cut into many rock blocks by joints; that is, the whole truss structure
is cut into parts corresponding to those rock blocks.

YOUNG ON ANALYSIS OF EXCAVATIONS 267

The connectivity matrix was introduced to define the connecting
condition of the bars (their continuity conditions), and the whole
truss structure was represented by the global connectivity matrix. Then, the
independent small structure representing a block formed by joints can
be searched for and identified as an independent block matrix within
the global connectivity matrix. Each of the independent matrix blocks
in the whole system matrix has its own size, shape, and location, so
the complete information on a block's geometry is known and available
from the nodal numbers of its matrix block.
Element Model
Each element constructing the whole system of the rock mass is
replaced with a truss formed of simple two-force bars, each connecting
two nodes. The element can have any shape or size and any number of
nodes. For simplicity, a rectangular element with 8 nodal points was
used in this paper for the rock block calculations. The 8-point
equal-parameter truss element then appears like the usual 8-point solid
element in the finite element method, but it consists of 28 two-force
bars, as shown in Figure 2. The volume of this element is the same as
that of a solid element and is distributed equally onto its nodal
points. Therefore, the nodal point system and the number of degrees of
freedom in the element model were not changed from the finite element
model.
The global continuous truss structure for the entire rock mass was
developed by constructing this type of truss element on every element
in the model. Each inner nodal point will have 26 bars connecting it to
the adjacent nodes around it.

FIG. 2--28-bar truss element model.


Elemental and Global Connectivity Matrices
The connectivity matrix of a truss element is an 8x8 matrix, since the
element has 8 nodal points and only one degree of freedom per node is
needed for the block calculations. Also, the exact distribution of the
connection bars at a node need not be known. Consequently, two
indicator numbers are enough to define the continuity condition of the
connection bar between any two nodes: 0 for no connection (or a
connection cut by a joint), and 1 for a positive connection (a bar
connects them). By applying this indicator system, the connectivity
matrix of a truss element can be written simply as follows:

7 1 1 1 1 1 1 1
1 7 1 1 1 1 1 1
1 1 7 1 1 1 1 1
K = 1 1 1 7 1 1 1 1
1 1 1 1 7 1 1 1
1 1 1 1 1 7 1 1
1 1 1 1 1 1 7 1
1 1 1 1 1 1 1 7

Compared with the exact stiffness matrix of the finite element
method, the number of degrees of freedom of [K] is reduced from 24 to
8, and the matrix elements of [K] do not express exactly the mechanical
behavior of the element. However, the connectivity matrix [K] remains
symmetric, following the principles of mechanics.
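As a sketch of this indicator bookkeeping (the function name and the cut-pair interface are illustrative assumptions, not the paper's code), the 8x8 element matrix [K] can be built, and the effect of cutting a bar shown, as follows:

```python
import numpy as np

def element_connectivity(cut_pairs=()):
    """Indicator connectivity matrix [K] of one 8-node truss element.

    Off-diagonal entries: 1 if a bar connects the node pair, 0 if the
    bar is absent or cut by a joint.  Each diagonal entry counts the
    bars still meeting at that node (7 when the element is intact,
    matching the matrix in the text).
    """
    K = np.ones((8, 8), dtype=int)
    np.fill_diagonal(K, 0)
    for i, j in cut_pairs:              # (i, j) node pairs whose bar is cut
        K[i, j] = K[j, i] = 0
    np.fill_diagonal(K, K.sum(axis=1))  # diagonal = remaining bars per node
    return K

K = element_connectivity()
assert (np.diag(K) == 7).all() and (K == K.T).all()
```

Cutting one bar, say between nodes 0 and 1, zeroes that pair and drops both diagonal counts to 6, mirroring the bookkeeping described for the global matrix below.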
Also, the global connectivity matrix can be assembled from the
element connectivity matrices [K] by following the same procedure used
to assemble the global stiffness matrix in the finite element method.
The global connectivity matrix of the entire truss structure in the
rock mass, [M], is an (n x n) matrix, where n is the number of nodal
points in the whole structure. The information on the continuity or
connection between two nodes can be stored in each matrix element of
[M], because each node has one degree of freedom. The [M] matrix is
symmetric and banded along the diagonal, as in the finite element
method.
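The assembly step can be sketched in the same spirit; the element-to-global node numbering and the two-element example are illustrative assumptions:

```python
import numpy as np

def assemble_global(n_nodes, element_nodes):
    """Assemble the global connectivity matrix [M] from element node
    lists, mirroring global stiffness assembly: every node pair within
    an element is joined by a bar (indicator 1), bars shared between
    adjacent elements are stored once, and each diagonal entry counts
    the bars meeting at that node."""
    M = np.zeros((n_nodes, n_nodes), dtype=int)
    for nodes in element_nodes:
        for a in range(len(nodes)):
            for b in range(a + 1, len(nodes)):
                M[nodes[a], nodes[b]] = M[nodes[b], nodes[a]] = 1
    np.fill_diagonal(M, M.sum(axis=1))
    return M

# Two 8-node elements sharing a face (global nodes 4-7):
M = assemble_global(12, [range(8), range(4, 12)])
assert M[4, 4] == 11  # a shared node connects to 11 distinct neighbors
```

The resulting [M] is symmetric, and for a full regular grid an inner node's diagonal entry reaches 26, as stated in the text.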
Block Calculation
Simulated rock joints were introduced into the whole structural
matrix one by one (Young and Hoerger 1988a). Whenever one joint plane
was introduced, the elements that might be cut by the joint were
searched for, and the bars within those elements were checked to
determine which were cut. Whenever a bar is cut by a joint, the
continuity of the global truss structure is weakened. This weakness was
reflected in the global connectivity matrix by modifying the matrix
elements corresponding to the bar that was cut: the diagonal entry was
reduced by one from its original bar count (26 for an inner node).
Consequently, the global connectivity matrix was modified continually
while all of the joints were introduced and all the connection bars
were tested for continuity. The final global connectivity matrix must
be singular, and it will consist of as many independent substructures
as there are blocks formed by the joint systems.
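A minimal sketch of this bar-cutting update, assuming an unbounded joint plane (the paper also tests the finite extent of each simulated joint):

```python
import numpy as np

def introduce_joint(M, coords, normal, point):
    """Cut every bar whose endpoint nodes lie on opposite sides of a
    joint plane, updating the global connectivity matrix in place as
    described in the text: the bar indicator drops to 0 and both
    diagonal entries (bar counts) are reduced by one."""
    side = np.sign((np.asarray(coords, float) - point) @ np.asarray(normal, float))
    n = len(coords)
    for i in range(n):
        for j in range(i + 1, n):
            if M[i, j] == 1 and side[i] * side[j] < 0:
                M[i, j] = M[j, i] = 0
                M[i, i] -= 1
                M[j, j] -= 1
    return M

# A single bar between two nodes, cut by a plane between them:
M = np.array([[1, 1], [1, 1]])  # diagonal 1: one bar meets each node
introduce_joint(M, [[0, 0, 0], [1, 0, 0]], normal=[1, 0, 0], point=[0.5, 0, 0])
assert M[0, 1] == 0 and M[0, 0] == 0
```

After all joints are processed, rows whose diagonal has dropped to zero have lost every bar, which is how the matrix becomes singular.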
It is obvious that when a rock block is isolated from the rock mass
by joints, its corresponding small substructure will be separated from
the global structure. These substructures are independent from each
other; that is, no connection exists among them. This means that the
global connectivity matrix, once it has been modified completely after
all of the joints were introduced, can be transformed into a block
matrix by elementary transformations (Golub and Van Loan 1983). Each
matrix block in the final block-diagonal form of the global
connectivity matrix then represents an independent substructure, which
is nothing but a rock block isolated by the joints: the block sought in
the block calculations.
In this way the problem of searching for blocks in the rock mass was
replaced with the problem of identifying the independent matrix blocks
in the global connectivity matrix. Knowledge of the nodal numbers
within an independent matrix block is enough to identify the shape,
size (or volume), and location of a rock block formed by the joint
systems. The volume of a rock block was calculated simply by counting
the nodes within its substructural matrix block and summing their nodal
volumes.
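The search for independent matrix blocks amounts to finding the connected components of the remaining bar network; this sketch uses a depth-first search in place of the elementary-transformation route, which yields the same block identification:

```python
import numpy as np

def find_blocks(M, node_volumes=None):
    """Identify the independent substructures (rock blocks) left in the
    modified global connectivity matrix and their volumes, traversing
    the remaining bars.  Volumes are sums of nodal volumes, as in the
    text; unit nodal volumes are an illustrative default."""
    n = M.shape[0]
    if node_volumes is None:
        node_volumes = np.ones(n)
    unseen = set(range(n))
    blocks = []
    while unseen:
        start = min(unseen)
        unseen.remove(start)
        comp, stack = {start}, [start]
        while stack:                        # depth-first search over bars
            i = stack.pop()
            for j in np.nonzero(M[i])[0]:
                j = int(j)
                if j != i and j in unseen:
                    unseen.remove(j)
                    comp.add(j)
                    stack.append(j)
        blocks.append((sorted(comp), sum(node_volumes[k] for k in comp)))
    return blocks

# Two bars already cut leave two independent 2-node blocks:
M = np.array([[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]])
assert [b[0] for b in find_blocks(M)] == [[0, 1], [2, 3]]
```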
For this connectivity matrix method, there are no limits on the
sizes and shapes of joint planes or on the geometry of the rock blocks
formed by the joint systems, including any random aggregation of
elements in three-dimensional space.
Key Block Failure
Once blocks were identified, key blocks were sorted out in three
steps:
1. Collect the joint and excavation plane geometry associated with a
particular block.
2. Evaluate the kinematic stability of the block with an algorithm
based on Shi's theorem (Goodman and Shi 1984).
3. If the block is kinematically unstable, evaluate its mechanical
stability with Warburton's algorithm (Warburton 1980).
Positional Probability of Failure
The probabilistic analysis of key block failures was achieved
through the positional probability of failure, defined by the number of
times a position (or node) was evaluated as being contained in a key
block. In this way the probabilistic key block analysis can be
combined effectively with the stochastic joint system simulation, and
realistic structural stability can be obtained in probabilistic terms.
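A sketch of this counting scheme; `key_block_nodes` is a hypothetical stand-in for the full simulate / search-blocks / sort-key-blocks pipeline described above:

```python
import numpy as np

def positional_pf(n_nodes, n_sims, key_block_nodes):
    """Positional probability of failure at each node: the fraction of
    stochastic joint-system realizations in which the node was
    contained in a key block."""
    counts = np.zeros(n_nodes)
    for sim in range(n_sims):
        for node in set(key_block_nodes(sim)):  # count each node once per run
            counts[node] += 1
    return counts / n_sims

# Toy check: node 0 falls in a key block in half of the realizations.
pf = positional_pf(2, 4, lambda sim: [0] if sim % 2 == 0 else [])
assert pf.tolist() == [0.5, 0.0]
```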
One of the interesting aspects of this type of analysis is that it
provides for the evaluation of progressive key block failures. If it is
assumed that a key block displaces into the excavation, the next level
of blocks exposed on the excavation surface modified by the initial key
block failures becomes the set of potential key blocks. The process may
continue over many levels of failure. With this type of analysis it is
simple to evaluate successive levels of key block failure around an
excavation surface. This has a specific application in mining
engineering: cavability analysis for the caving method of mining.

CASE STUDIES

A few cases were analyzed to demonstrate the capability of localized
discrete cell-block models and the corresponding improvements achieved
in the engineering analysis by the finite element method of the
probabilistic key block theorem.
Open Pit Slope Stability
An extensive statistical analysis was made of a total of 939 joint
survey data taken by the cell mapping technique from an open pit mine.
When pole vectors were projected on the upper hemisphere, three joint
sets were identified and separated by the FRACTAN computer code
(Shanley and Mahtab 1975), as shown in Figure 3. Characteristic
parameters and their statistics for these joint sets are summarized in
Table 1, computed on Grossman's tangent plane (Grossman 1985). Their
variogram parameters for the isotropic spherical model are presented in
Table 2 (Young and Hoerger 1988b).

TABLE 1--Joint parameters and their statistics.

              Mean Attitude            Spacing (m)     Roughness (deg.)  Avg. Length (m)
Joint Set     Dip Direction    Dip     Mean/Variance   Mean/Variance     Mean/Variance
              (deg.)           (deg.)
1             92.21            74.1    0.845/0.582     3.17/7.34         0.545/0.235
2             353.05           70.5
3             190.3            49.2

TABLE 2--Spherical variogram of joint parameters for the East set.

            Orientation   Spacing   Roughness   Avg. Length
Sill        0.145         0.53      6.87        0.23
Nugget      0.06          0.275     4.00        0.135
Range (m)   250           250       250         250
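The isotropic spherical model of Table 2 can be written out directly; reading the tabulated "Sill" as the total sill reached at the range is an assumption of this sketch:

```python
def spherical_variogram(h, nugget, sill, a):
    """Isotropic spherical variogram gamma(h) as used for the joint
    parameters in Table 2: nugget at the origin, rising to the total
    sill at the range a."""
    if h <= 0.0:
        return 0.0
    if h >= a:
        return sill
    r = h / a
    return nugget + (sill - nugget) * (1.5 * r - 0.5 * r ** 3)

# East-set spacing parameters from Table 2: nugget 0.275, sill 0.53, range 250 m
g = spherical_variogram(125.0, 0.275, 0.53, 250.0)
assert abs(g - 0.4503125) < 1e-6
```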

FIG. 3--Contours of poles projected on a stereonet.

FIG. 4--Discrete cell block model for an open pit mine.
The entire mine was subdivided into cell-blocks, as shown in Figure
4, to build a discrete cell-block model. Geostatistical operations were
then performed to define the spatial variability of every joint
parameter through its variogram and to characterize the joint systems
within each cell-block by estimating those parameters with kriging
techniques.
First, the local deterministic cell-block model of the three joint
sets was generated for the mine by using ordinary kriging (OK). The
average values of the joint parameters were obtained for every
cell-block, which is equivalent to characterizing the joint systems
within a small unit block. In other words, the average local joint
system properties were inferred and characterized from the global
sample data. Then the key block theorems (Goodman and Shi 1984) were
applied to study the local slope stability in terms of the maximum safe
slope angles within each local cell-block area. The local joint
parameters kriged previously were used as input for this local slope
analysis.
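The per-cell-block estimation step can be sketched as a small ordinary kriging system (the names and the toy two-sample configuration are illustrative assumptions, not the mine data):

```python
import numpy as np

def ordinary_kriging(obs_xy, obs_val, target_xy, gamma):
    """Ordinary kriging estimate of a joint parameter at one cell-block
    center from surrounding samples.  Solves the OK system built from
    the fitted variogram gamma(h), with a Lagrange multiplier forcing
    the weights to sum to one."""
    obs_xy = np.asarray(obs_xy, float)
    n = len(obs_xy)
    A = np.ones((n + 1, n + 1))      # variogram matrix + unbiasedness row
    A[n, n] = 0.0
    for i in range(n):
        for j in range(n):
            A[i, j] = gamma(np.linalg.norm(obs_xy[i] - obs_xy[j]))
    b = np.ones(n + 1)
    b[:n] = [gamma(np.linalg.norm(p - np.asarray(target_xy, float)))
             for p in obs_xy]
    w = np.linalg.solve(A, b)[:n]    # kriging weights
    return float(w @ np.asarray(obs_val, float))

# Two equidistant samples with equal values reproduce that value:
est = ordinary_kriging([(0, 0), (2, 0)], [5.0, 5.0], (1, 0), gamma=lambda h: h)
assert abs(est - 5.0) < 1e-6
```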
The localized slope stability was then compared with the slope
stability obtained from the global average values of the joint
parameters. The maximum safe slope angles based on local input showed
significant deviations from those based on global averages as input. To
illustrate these results graphically, the localized maximum safe slope
angles were plotted along the various pit slope dip directions, as
shown in Figure 5. These significant local deviations would have a
major effect on the overall behavior of the mine slope (Hoerger and
Young 1987).
For an open pit already designed using global averages as design
inputs, geostatistics can identify areas whose slopes could be
steepened, as well as local high-risk areas deserving increased
monitoring. For a new mine, using only the limited information
available during the development stage, kriging can create a block
model of joint orientations that can be used to design not only the
final pit slope but also the intermediate slopes.

FIG. 5--Safe slope angles using global averages and local estimates
as input.

FIG. 6--Histograms of PF for cell-blocks.
Second, the local stochastic joint model was developed for the mine
by using indicator kriging (IK), as described before (Young 1987b). In
this case the full probabilistic distributions of the joint parameters
are available for each local block, and the probabilistic stability
analysis can be performed on every local block to achieve the localized
probability of slope stability. The probability of failure (PF) based
on the localized probabilistic model of joint systems [PF (IK)] was
then compared with PF (sample), which was calculated similarly from the
global sample distributions, by constructing a histogram of failure
probability for local blocks (Figure 6). PF (sample) yielded a marginal
PF of 50% for every cell-block, which could be expected from the
symmetric distribution of the global sample data (Young and Hoerger
1988b). Therefore, it could be said that PF (sample) did not improve
the stability analysis over the deterministic method currently used. PF
(IK) draws a distinctly different histogram, which spreads over a wide
range of PF between 25% and 80%. Only 17% of a total of 36 cell-blocks
showed the marginal 50% PF by PF (IK); 50% of the 36 blocks had a
higher PF than the marginal PF, and the remaining 33% showed a lower
PF. This clearly indicates that the local variation, or spatial
variability, of joints plays a significant role in slope stability, and
the local probabilistic approach should be applied to achieve an
effective slope analysis. Also, local risk assessments of the regional
pit slopes can be achieved from the PF (IK) analysis at any period of
the mine life. This is an important improvement in slope design in
general and should be exercised routinely in field projects.

"¥ ; r---.,.---..,..---...,....--...,....---,-----,

!L-__'-__.......__.......__-'-__-'-__-'
..... ..... ..... "01 .
EAST ,METERSI "
IIUt. "".
'1f1
FIG . 7--Spatial distribution of PF (IK) in the pit.
The spatial distribution of PF (IK) plotted in Figure 7 indicates
cell-blocks with PF's higher and lower than the marginal 50%. Regional
zones of higher and lower PF's formed throughout the pit; the block
PF's were not scattered randomly, showing that the analysis can
identify local stability trends in the mine.
A Subway Tunnel
A metropolitan subway tunnel was studied to illustrate the
difference between the traditional key block theorem and the positional
probability of key block failures obtained by the finite element
approach to block failure. The joint systems and their statistical
details were published by Cording and Mahar (1974).
A unit length of the tunnel was isolated based on the discrete
cell-block model. The joint systems within this cell-block were modeled
and simulated for the stability analysis, as was done for the pit
slopes.
When the frequency of positional block failure was projected along
the unit length of tunnel, the cumulative probability of positional
failure could be plotted around the tunnel, as shown in Figure 8.
Compared with the worst-case type of analysis of the traditional key
block theorem, the positional probability analysis clearly shows the
size and frequency of key block occurrences in this projection.
",""""",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
1"" ......"fI'"''''''''''''''''''''''''''''''''''''
",""""",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
i"lliI""''''''II,,,,,,,,,''OEEctJ'-'II''''''''''''''''''
"""""",,,,,,,,,,-o71,,-,,,,,,,,,,,,,,""""
'1#'''''''''''''''''''''''.'
iN
'-.11""'"''''''''''''''
................ :------ IIII."""'U,,,,,'''' 10' E92''''''''''',,,''''''
'II'---------z
" , •••• " ••" ' - 1 9 C
lJKI
6-"-,."""""",
-,-.""""
,",,"'11 .... ',',2 11- - f l ' "

""UU'''.
,,,,---""'2211223
"""-'-"
' '-''' '2Jl
31'5'......."'"
,....."',,,
",,,,,,,,-11223 51-''''''''
6 1- ,. " " "

#"""'''''-'23'
n""",,'''-'37 82""''''''
n-,II"""
mm:m:m;~ ~;mmm:
Z~~::;;=m:m::""""",,,,,,,,,,,,,,,m:::m:::
a ) HaxiDnDI reJlO••bl. are. b) Positional probabiliti.s of failur.

FIG. 8--Comparison of key block failures (a) with the positional


probability of block failures (b) for a tunnel.
The positional probability of key block failure carries important
features that can be implemented simply in the geotechnical design of
excavations. For the design of a roof bolting system, the anchor should
be located where the positional probability is zero or low; the
positional probability is the probability of the bolt not being
anchored effectively to a stable portion of the rock mass. Also, the
parts of the excavation requiring the most support can be identified
easily from this analysis.
The other measures of key block statistics included here are:
1. The distribution (or histogram) of key block sizes and its mean and
standard deviation.
2. The total volume of key blocks, which summarizes the susceptibility
to key block failure.
3. The frequency of different sizes of key blocks.
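These three measures can be collected in one pass over the simulated key block volumes (a sketch; the bin count is an arbitrary choice):

```python
import numpy as np

def key_block_statistics(volumes):
    """Summary measures listed above for the key blocks found in the
    simulations: the size histogram with its mean and standard
    deviation, the total key block volume, and the frequency of each
    size class."""
    v = np.asarray(volumes, float)
    freq, edges = np.histogram(v, bins=5)
    return {"mean": float(v.mean()),
            "std": float(v.std(ddof=1)),
            "total_volume": float(v.sum()),
            "size_frequency": freq.tolist(),
            "bin_edges": edges.tolist()}

stats = key_block_statistics([1.0, 1.0, 2.0, 4.0])
assert stats["total_volume"] == 8.0
```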

Block Size Distributions


The matrix approach to block calculations is general in block size,
shape, and location. Therefore, block calculations can be completed
within a discrete element, and the results can be presented as a
histogram showing the frequency distribution of block sizes (Figure 9).
In this case the circular disk model was applied in the joint
simulation. Such a distribution should be part of site characterization
for risk assessments in geotechnical and geohydrological engineering.

" ~------------------------------------~

~ ~ ~ ~
block volume
~ ~ ~ - =
FIG. 9--The frequency distribution of block sizes formed by three joint
sets.

CONCLUSIONS
1. The most important general conclusion that can be drawn from
this work is that a localized probabilistic stability analysis for
geotechnical structures can be made at the early stages of engineering
design and construction, when only sparse sample data are available. It
leads to the optimum design of geotechnical structures: optimum in
their locations and orientations relative to other peripheral
structures, and in their shapes and sizes. This is achievable through
the geostatistical model of the characteristic parameters of rock
masses. It can therefore be said that this is an ideal model of joint
systems for many engineering analyses in both rock mechanics and
geohydrology.
2. Comparing PF (IK) and PF (sample), the local probabilistic
analysis of pit slopes is more powerful for drawing a detailed and
realistic picture of slope stability conditions. The local variation of
joint orientations played an important role in slope stability and
should be included in slope design and construction. Also, a localized
full-scale risk assessment can be made from PF (IK), and the pit design
and operation can be optimized progressively.
3. Geostatistics contributed significant improvements to the
modeling of rock joint systems (or site characterization) for
geotechnical structural analysis. Spatial variability was fully
incorporated in this modeling, and the characteristic model parameters
were localized. The non-parametric approach by IK simplified the
modeling of three-dimensional probability distributions of pole vectors
projected on the reference sphere. Otherwise, the local probability
distribution of poles would never be achievable from the sparse sample
data available at the early stages of engineering design. Little is
known about mathematical probability density functions (pdf's) for
directional data projected on a sphere, and a non-parametric approach
has been desirable for a long time [13]. Even when a mathematical pdf
is available, the sample data available at the early stages will never
be enough to describe the local statistical distribution.
4. Geostatistics is general and applicable to any characteristic
parameter of the physical properties of geotechnical materials, such as
strength values, elastic or plastic constants, and flow parameters.
Therefore, local probabilistic models of these parameters are readily
available from geostatistics, and the corresponding geotechnical
analyses can be made in probabilistic terms, as seen here. Considering
the dispersion of these geotechnical parameters around their mean
values, as well as their spatial variation, local probability analysis
is a natural choice for various geotechnical fields in the future. The
stochastic analysis based on the global sample data distribution did
not improve the overall picture of slope stability conditions over the
deterministic analysis using sample mean attitudes, although it yielded
a probability of slope failure over a full range of slope angles.
5. The deterministic approach based on block theorems treats joint
orientations as constant and requires "engineering judgement" to
qualitatively incorporate the quantitatively ignored factors of joint
sizes and spacings and their variabilities. Because of the fixed joint
orientations and the assumption of infinite joint size, the maximum
removable area approach of the deterministic key block analysis
provides an upper bound on the key block size identified. When the
probabilistic analysis is coupled with the localized stochastic model
of joint systems in geological formations, a significant amount of the
engineering judgement required to optimize the size, shape, and
orientation of an excavation can be replaced by quantitative solutions.
6. The positional probability of key block failure carries
important features: the parts of the excavation requiring the most
support, the probability of roof bolts not being anchored in stable
zones, the distribution of key block volumes, and key block sizes and
frequencies.
7. The in-situ block size distribution should be part of
geotechnical and geohydrological site characterization. It is a
pertinent parameter in key block failures, and it relates directly to
the transmissibility of fluid flow through fractured rock, as in
granular materials. This hydrological application of the block size
distribution deserves further study.

REFERENCES
Baecher, G. B., Lanney, N. A. and Einstein, H. H., 1977, "Statistical
Description of Rock Properties and Sampling," Proceedings of the
Eighteenth Symposium on Rock Mechanics, Golden, Colorado.
Chiles, J. P., 1988, "Fractal and Geostatistical Methods for Modeling of
a Fracture Network," Mathematical Geology, Vol. 20, pp. 631-654.
Cording, E. J. and Mahar, J. W., 1974, "The Effects of Natural Geologic
Discontinuities on Behavior of Rock in Tunnels," Proceedings of 1974
Rapid Excavation and Tunneling Conference, San Francisco, CA, pp. 107-
138.
Golub, G. H. and Van Loan, C. F., 1983, "Matrix Computations," Johns
Hopkins University Press, Baltimore, MD.
Goodman, R. E. and Shi, G. H., 1984, "Block Theory and Its Application
to Rock Mechanics," Prentice-Hall, Englewood Cliffs.
Grossman, N. F., 1985, "The Bivariate Normal Distribution on the Tangent
Plane at the Mean Attitude," Proceedings of International Symposium on
Fundamentals of Rock Joints, Bjorkliden, Sweden, pp. 3-11.
Hoerger, S. F. and Young, D. S. 1987, "Predicting Local Rock Mass
Behavior Using Geostatistics," Proceedings of Twenty-eighth U.S.
Symposium on Rock Mechanics, Tucson, AZ, pp. 99-106.
Journel, A. G. and Huijbregts, Ch. J., 1978, "Mining Geostatistics,"
Academic Press, London.
Lemmer, I. C., 1984, "Estimating Local Recoverable Reserves via
Indicator Kriging," Proceedings of Geostatistics for Natural Resources
Characterization (ed. by G. Verly), D. Reidel, Dordrecht, pp. 349-364.
Miller, S. M., 1979, "Geostatistical Analysis for Evaluating Spatial
Dependence in Fracture Set Characteristics," Proceedings of the
Sixteenth Symposium on Application of Computers and Operations Research
in the Mineral Industry, Tucson, AZ, pp. 537-545.
Shanley, R. J. and Mahtab, M. A., 1975, "FRACTAN: A Computer Code for
Analysis of Clusters Defined on Unit Hemisphere," U.S. Bureau of Mines,
IC 8671, Washington, DC.
Warburton, P. M., 1980, "Stereological Interpretation of Joint Trace
Data: Influence of Joint Shape and Implications for Geological
Surveys," International Journal of Rock Mechanics & Mining Science,
Vol. 17, pp. 305-316.
Young, D. S., 1987a, "Random Vectors and Spatial Analysis by
Geostatistics for Geotechnical Applications," Mathematical Geology,
Vol. 19, pp. 467-479.
Young, D. S., 1987b, "Indicator Kriging for Unit Vectors: Rock Joint
Orientations," Mathematical Geology, Vol. 19, pp. 481-502.
Young, D. S. and Hoerger, S. F., 1988a, "Non-Parametric Approach for
Localized Stochastic Model of Rock Joint Systems," Geostatistical,
Sensitivity, and Uncertainty Methods for Ground-Water Flow and
Radionuclide Transport Modeling (ed. by B. Buxton), Battelle Press,
Columbus, OH, pp. 361-385.
Young, D. S. and Hoerger, S. F., 1988b, "Geostatistics Applications to
Rock Mechanics," Proceedings of Twenty-ninth U.S. Symposium on Rock
Mechanics, Balkema, Brookfield, pp. 271-282.