Geostatistics and Analysis of Spatial Data: Allan A. Nielsen
Abstract: This note deals with geostatistical measures for spatial correlation, namely the auto-covariance function and the semi-variogram, as well as deterministic and geostatistical methods for spatial interpolation, namely inverse distance weighting, radial basis functions (RBF) and kriging. Some semi-variogram models are mentioned, specifically the spherical, the exponential and the Gaussian models. Equations for RBF interpolation as well as simple and ordinary kriging (OK) are deduced. Other types of kriging are mentioned, and references to international literature, Internet addresses and state-of-the-art software in the field are given. A very simple example to illustrate the computations for OK and a more realistic example with height data from an area near Slagelse, Denmark, are given. A series of attractive characteristics of kriging are mentioned, and a simple consideration of sampling strategy is given, based on the dependence of the kriging variance on distance and direction to the nearest observations.

I. INTRODUCTION

[...] packages can be found at http://www-sst.unil.ch/research/variowin/ (or via a search engine). Commercial geostatistical software also exists.

This note, which is inspired by [11] (see also [12]), deals in Section II with spatial correlation, specifically the auto-covariance function, the semi-variogram and some semi-variogram models. Section III deals with spatial interpolation, including the deterministic methods inverse distance weighting and radial basis function interpolation along with a family of statistically based methods termed kriging. Here simple and ordinary kriging are dealt with in some detail. Section IV gives final remarks.

II. SPATIAL CORRELATION
[...]

$$\ldots \quad C_0 + C_1 \left[ 1 - \exp\left( -\frac{3h^2}{R^2} \right) \right], \quad h > 0.$$

These latter two models never reach but approach the sill asymptotically [...]

$$\theta = [C_0 \;\; C_1 \;\; R]^T$$

$$\min_{\theta} \left\| \hat{\gamma}(h) - \gamma^*(\theta, h) \right\|^2.$$

Fig. 2. Sample sites; each circle is centered on a sample point and its radius is proportional to the quantity measured, in this case the height above the ground water in a 10 km × 10 km area near Slagelse, Denmark (vertical axis: Northing).
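As a minimal sketch of the semi-variogram models and the least squares criterion above, the Python functions below evaluate the spherical, exponential and Gaussian models for a parameter vector θ = [C0 C1 R] and compute the squared norm being minimised. Several points are assumptions rather than statements from the text: the value 0 at h = 0, the exponential form with the factor 3 (chosen by analogy with the Gaussian expression), the function names, and the synthetic "experimental" values, which are generated from a known model purely for illustration.

```python
import math


def spherical(h, c0, c1, r):
    """Spherical model: reaches the sill c0 + c1 exactly at h = r."""
    if h == 0:
        return 0.0  # assumed value at zero lag
    if h >= r:
        return c0 + c1
    return c0 + c1 * (1.5 * h / r - 0.5 * (h / r) ** 3)


def exponential(h, c0, c1, r):
    """Exponential model (assumed form, factor 3 as in the Gaussian model)."""
    return 0.0 if h == 0 else c0 + c1 * (1.0 - math.exp(-3.0 * h / r))


def gaussian(h, c0, c1, r):
    """Gaussian model: c0 + c1 * (1 - exp(-3 h^2 / r^2)) for h > 0."""
    return 0.0 if h == 0 else c0 + c1 * (1.0 - math.exp(-3.0 * h * h / (r * r)))


def objective(theta, lags, gamma_hat, model):
    """Sum of squared differences between experimental and model values."""
    c0, c1, r = theta
    return sum((g - model(h, c0, c1, r)) ** 2 for h, g in zip(lags, gamma_hat))


# Synthetic illustration only: values generated from a known spherical model.
lags = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
gamma_hat = [spherical(h, 0.0, 1.0, 6.0) for h in lags]

print(objective((0.0, 1.0, 6.0), lags, gamma_hat, spherical))  # zero for the true theta
print(gaussian(6.0, 0.0, 1.0, 6.0))  # 1 - exp(-3): close to, but below, the sill
```

In practice `objective` would be handed to a numerical optimiser to estimate θ; the choice of optimiser is left open here.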
ALLAN A. NIELSEN: GEOSTATISTICS AND ANALYSIS OF SPATIAL DATA 3
$$w_i = \frac{1/d_i}{\sum_{j=1}^{N} 1/d_j},$$

where $d_j$ is the distance from point $j$ to the point to which we interpolate. This is readily extended to weighting with different powers $p > 0$ of the inverse distance

$$w_i = \frac{1/d_i^p}{\sum_{j=1}^{N} 1/d_j^p}.$$

Fig. 4. All possible pairwise squared differences as a function of the magnitude of the displacement vector (horizontal axis: distance in m); exponential variogram model shown.

A.1 Examples

We now wish to interpolate to $Z_0$ at position $r = 0$ in Figure 1 by means of inverse distance weighting; $d_i$ is the distance from point $Z_i$ to $Z_0$. We readily calculate the following weights

  r_i    d_i    1/d_i    (1/d_i) / Σ_j (1/d_j)
  −2     2      1/2      3/11 (= 0.2727)
  −1     1      1        6/11 (= 0.5455)
   3     3      1/3      2/11 (= 0.1818)

  r_i    d_i    p = 0.1    p = 2.0    p = 10.0
  [...]
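The inverse distance weights of this example can be reproduced with a few lines of Python. This is a sketch only: the function name `idw_weights` is hypothetical, the sample positions −2, −1 and 3 are taken from the table above, and all distances are assumed to be strictly positive (a target that coincides with a sample point would need separate handling).

```python
def idw_weights(distances, p=1.0):
    """Normalised inverse distance weights (1/d_i^p) / sum_j (1/d_j^p)."""
    inverse = [1.0 / d ** p for d in distances]
    total = sum(inverse)
    return [v / total for v in inverse]


sites = [-2.0, -1.0, 3.0]   # sample positions r_i from the example
target = 0.0                # position to which we interpolate
distances = [abs(r - target) for r in sites]

# p = 1 reproduces 3/11, 6/11 and 2/11; larger p favours the nearest point.
for p in (0.1, 1.0, 2.0, 10.0):
    print(p, [round(w, 4) for w in idw_weights(distances, p)])
```

For small p the weights approach equal weighting, and for large p nearly all weight goes to the nearest observation.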
Fig. 5. Experimental semi-variogram as a function of the magnitude of the displacement vector (horizontal axis: distance in m); Gaussian semi-variogram model shown.

Consider a linear estimate $\hat{z}_0 = \hat{z}(r_0)$ at location $r_0$ based on $N$ measurements $z = [z(r_1), \ldots, z(r_N)]^T = [z_1, \ldots, z_N]^T$ at locations $r_1, \ldots, r_N$. Assume that each observation influences its surroundings in the same way in all directions and that the influence is expressed by some function $\phi$ (the radial basis function, RBF) which depends on the distance $h = \|r_0 - r_i\|$
between locations $r_0$ and $r_i$ only, $\phi = \phi(\|r_0 - r_i\|)$. We shall look into choices of $\phi$ shortly. Define the interpolated value

$$\hat{z}_0 = \sum_{i=1}^{N} w_i \, \phi(\|r_0 - r_i\|),$$

where $w_i$ is the weight associated with location $i$. Let us determine the $w_i$ so that the interpolation becomes exact at the known locations $r_j$, i.e.,

$$z_j = \sum_{i=1}^{N} w_i \, \phi(\|r_j - r_i\|), \quad j = 1, \ldots, N.$$

This makes up $N$ equations with $N$ unknowns, the $w_i$, which can be written in matrix form

$$\begin{bmatrix} \phi(\|r_1 - r_1\|) & \cdots & \phi(\|r_1 - r_N\|) \\ \vdots & \ddots & \vdots \\ \phi(\|r_N - r_1\|) & \cdots & \phi(\|r_N - r_N\|) \end{bmatrix} \begin{bmatrix} w_1 \\ \vdots \\ w_N \end{bmatrix} = \begin{bmatrix} z_1 \\ \vdots \\ z_N \end{bmatrix}$$

or

$$\Phi w = z.$$

Often a polynomial $P(r_j) = c_0 + c_1 x_j + c_2 y_j + \cdots = [1 \;\; x_j \;\; y_j \;\; \ldots] \, c$, where $c = [c_0 \;\; c_1 \;\; c_2 \;\; \ldots]^T$ is the vector of coefficients for the polynomial, is added to the interpolation at location (in two dimensions) $r_j = [x_j \;\; y_j]^T$

$$z_j = P(r_j) + \sum_{i=1}^{N} w_i \, \phi(\|r_j - r_i\|),$$

so that $P^T w = 0$. Here

$$P = \begin{bmatrix} 1 & x_1 & y_1 & x_1^2 & y_1^2 & x_1 y_1 & \cdots \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \\ 1 & x_N & y_N & x_N^2 & y_N^2 & x_N y_N & \cdots \end{bmatrix}$$

defines the polynomial applied. Often a constant, corresponding to the column of ones, or a linear polynomial is used.

Solving the system of equations $\Phi w = z$ under the constraint $P^T w = 0$ we get (see the section on ordinary kriging and the Lagrange multiplier technique)

$$\begin{bmatrix} \Phi & P \\ P^T & 0 \end{bmatrix} \begin{bmatrix} w \\ c \end{bmatrix} = \begin{bmatrix} z \\ 0 \end{bmatrix}.$$

The order of the polynomial may depend on the RBF chosen. References on RBF are [16], [17], [18].

B.2 Choice of RBF

Often one of the following choices is made for $\phi$
• multiquadric: $\phi(h) = (h^2 + h_0^2)^{1/2}$,
• inverse multiquadric: $\phi(h) = (h^2 + h_0^2)^{-1/2}$,
• thin-plate spline: $\phi(h) = h^2 \log(h/h_0)$ (which tends to 0 for $h$ tending to 0), or
• Gaussian: $\phi(h) = \exp(-\frac{1}{2} (h/h_0)^2)$,
where $h_0$ is a scale parameter to be chosen. Generally, $h_0$ should be chosen larger than a typical distance between samples and smaller than the size of the study area. The multiquadric is said to be less sensitive to the choice of $h_0$. The Gaussian is especially sensitive to this choice.

With a Gaussian RBF the polynomials mentioned above are not needed; with the thin-plate spline RBF a linear polynomial may be needed.

B.3 Shepard Interpolation

A special case for the normalized RBF interpolation consists of setting the matrix on the left hand side to a constant times the identity matrix. This corresponds to applying a $\phi$ that tends to infinity for $h$ tending to 0, and is finite for $h > 0$. This leads to setting the weights equal to the measurements, $w_i = z_i$, i.e., we need not solve the system of equations for $w$. In this case $\phi(h) = h^{-p}$, $1 < p \le 3$, with appropriate handling for $h = 0$, is often used.

C. Kriging

Kriging (after the South African mining engineer and professor Danie Krige) is a name for a family of methods for minimum error variance estimation. Consider a linear (or rather affine) estimate $\hat{z}_0 = \hat{z}(r_0)$ at location $r_0$ based on $N$ measurements $z = [z(r_1), \ldots, z(r_N)]^T = [z_1, \ldots, z_N]^T$

$$\hat{z}_0 = w_0 + \sum_{i=1}^{N} w_i z_i = w_0 + w^T z,$$

where $w_i$ are the weights applied to $z_i$ and $w_0$ is a constant. We consider $z_i$ as realisations of stochastic variables $Z_i$, $Z = [Z(r_1), \ldots, Z(r_N)]^T = [Z_1, \ldots, Z_N]^T$. We think of $Z(r)$ as consisting of a mean value and a residual, $Z(r) = \mu(r) + \epsilon(r)$, where the residual has mean value zero and constant variance $\sigma^2$, $E\{\epsilon\} = 0$ and $\mathrm{Var}\{\epsilon\} = \sigma^2$. For the linear estimator we get

$$\hat{Z}_0 = w_0 + w^T Z. \qquad (2)$$

The estimation error $z_0 - \hat{z}_0$ is unknown. But for the expectation value of the estimation error we get [...]
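The relations that this sentence leads up to, and that Sections C.1 and C.2 cite as Equations 3 and 4, can be sketched from Equation 2. The reconstruction below is an assumption based on how those equations are used later; it writes $\mu = E\{Z\}$ and $\mu_0 = E\{Z_0\}$, and takes the estimator to be unbiased.

```latex
\begin{align}
E\{Z_0 - \hat{Z}_0\} &= E\{Z_0\} - w_0 - w^T E\{Z\} = \mu_0 - w_0 - w^T \mu, \\
E\{Z_0 - \hat{Z}_0\} = 0 \;&\Longrightarrow\; w_0 = \mu_0 - w^T \mu .
\end{align}
```

As a consistency check, inserting the second relation into Equation 2 gives $\hat{Z}_0 - \mu_0 = w^T (Z - \mu)$, the starting point of simple kriging. Setting $\mu = \mu_0 1$ in the first relation gives $\mu_0 (1 - w^T 1) - w_0$, the expression used for ordinary kriging.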
The variance of the estimation error is

$$\begin{aligned} \sigma_E^2 &= \mathrm{Var}\{Z_0 - \hat{Z}_0\} \\ &= \mathrm{Var}\{Z_0\} + \mathrm{Var}\{w_0 + w^T Z\} - 2 \, \mathrm{Cov}\{Z_0, w_0 + w^T Z\} \\ &= \sigma^2 + w^T (C w - 2 \, \mathrm{Cov}\{Z_0, Z\}), \end{aligned}$$

where $C$ is the dispersion or variance-covariance matrix of the stochastic variables, $Z$, entering into the estimation.

What is said in Section III-C so far is valid for all linear estimators. The idea in kriging is now to find the linear estimator which minimises the estimation variance.

C.1 Simple Kriging

In simple kriging (SK) we assume that $\mu(r)$ is known. From Equations 2 and 4 we get

$$\hat{Z}_0 - \mu_0 = w^T (Z - \mu).$$

The weights $w_i$ are found by minimising the estimation variance $\sigma_E^2$. This is done by setting the partial derivatives to zero

$$\frac{\partial \sigma_E^2}{\partial w} = 2 C w - 2 \, \mathrm{Cov}\{Z_0, Z\} = 0,$$

which results in the SK system

$$C w = \mathrm{Cov}\{Z_0, Z\}$$

or

$$\begin{bmatrix} C_{11} & \cdots & C_{1N} \\ \vdots & \ddots & \vdots \\ C_{N1} & \cdots & C_{NN} \end{bmatrix} \begin{bmatrix} w_1 \\ \vdots \\ w_N \end{bmatrix} = \begin{bmatrix} C_{01} \\ \vdots \\ C_{0N} \end{bmatrix},$$

where $C_{ij}$, $i, j = 1, \ldots, N$, is the covariance between points $i$ and $j$ among the $N$ points which enter into the estimation of point 0. $C_{0j}$, $j = 1, \ldots, N$, is the covariance between point $j$ and point 0, the point to which we interpolate. We get these covariances from the semi-variogram model (remembering Equation 1, $\gamma(h) = C(0) - C(h)$) as the sill minus the value of the semi-variogram model for the relevant distance (and possibly direction) between observations. (Alternatively, the kriging system may be formulated by means of the semi-variogram; to avoid zeros on the diagonal of $C$ we prefer the covariance formulation for numerical reasons.) Here $C_{ij}$ must not be confused with the semi-variogram parameters $C_0$ and $C_1$.

The minimised squared estimation error, termed the simple kriging variance, is

$$\begin{aligned} \sigma_{SK}^2 &= \sigma^2 + w^T (C w - 2 \, \mathrm{Cov}\{Z_0, Z\}) \\ &= \sigma^2 - w^T \mathrm{Cov}\{Z_0, Z\}. \end{aligned}$$

In SK the mean value $\mu(r)$ is known. In practice it is often assumed constant for the entire domain (or study area), or we must estimate it before the interpolation (or we must construct an interpolation algorithm which does not require knowledge of the mean field, see the next section).

C.2 Ordinary Kriging

In ordinary kriging (OK) we assume that the mean $\mu(r)$ is constant and equal to $\mu_0$ for $Z_0$ and the $N$ points that enter into the estimation of $Z_0$. From Equations 3 and 4 we get

$$E\{Z_0 - \hat{Z}_0\} = \mu_0 (1 - w^T 1) - w_0 = 0$$

for any $\mu_0$. Here $1$ is a vector of ones. This is possible only if $w_0 = 0$ and $w^T 1 = 1$.

The weights $w_i$ are found by minimising $\sigma_E^2$ under the constraint $w^T 1 = 1$. A standard technique for minimisation under a constraint is introducing a function $F$ with a so-called Lagrange multiplier (here $-2\lambda$) which we multiply by the constraint set to zero, and then minimising

$$F = \sigma_E^2 + 2 \lambda (w^T 1 - 1)$$

without constraints. Again the partial derivatives are set to zero

$$\begin{aligned} \frac{\partial F}{\partial w} &= 2 C w - 2 \, \mathrm{Cov}\{Z_0, Z\} + 2 \lambda 1 = 0 \\ \frac{\partial F}{\partial \lambda} &= 2 (w^T 1 - 1) = 0, \end{aligned}$$

which results in the OK system

$$\begin{aligned} C w + \lambda 1 &= \mathrm{Cov}\{Z_0, Z\} \\ 1^T w &= 1 \end{aligned}$$

or

$$\begin{bmatrix} C_{11} & \cdots & C_{1N} & 1 \\ \vdots & \ddots & \vdots & \vdots \\ C_{N1} & \cdots & C_{NN} & 1 \\ 1 & \cdots & 1 & 0 \end{bmatrix} \begin{bmatrix} w_1 \\ \vdots \\ w_N \\ \lambda \end{bmatrix} = \begin{bmatrix} C_{01} \\ \vdots \\ C_{0N} \\ 1 \end{bmatrix}.$$

The values requested for $C_{ij}$ are found as described in the previous section on SK.

The minimised squared estimation error, termed the ordinary kriging variance, is

$$\begin{aligned} \sigma_{OK}^2 &= \sigma^2 + w^T (C w - 2 \, \mathrm{Cov}\{Z_0, Z\}) \\ &= \sigma^2 - w^T \mathrm{Cov}\{Z_0, Z\} - \lambda. \end{aligned}$$

OK implies an implicit re-estimation of $\mu_0$ for each new constellation of points. This is an attractive property making OK well suited for interpolation in situations where the mean is not constant (i.e., in the absence of first order stationarity).

C.3 Examples

Let us consider the data in Figure 1 again. We now wish to interpolate to the position $r = 0$ by means of ordinary kriging. To carry out the calculations we use a stipulated semi-variogram based on the spherical model with $C_0 = 0$, $C_1 = 1$ and $R = 6$. Remembering Equation 1, $C(h) = C(0) - \gamma(h)$, this gives the auto-covariance function (in this case, where $C_0 + C_1 = 1$, this is the same as the auto-correlation function)

  h    γ̂(h)     C(h)
  0    0.0000   1.0000
  1    0.2477   0.7523
  2    0.4815   0.5185
  3    0.6875   0.3125
  4    0.8519   0.1481
  5    0.9606   0.0394
  6    1.0000   0.0000

Therefore the OK system looks like this

$$\begin{bmatrix} 1.0000 & 0.7523 & 0.0394 & 1 \\ 0.7523 & 1.0000 & 0.1481 & 1 \\ 0.0394 & 0.1481 & 1.0000 & 1 \\ 1 & 1 & 1 & 0 \end{bmatrix} \begin{bmatrix} w_1 \\ w_2 \\ w_3 \\ \lambda \end{bmatrix} = \begin{bmatrix} 0.5185 \\ 0.7523 \\ 0.3125 \\ 1 \end{bmatrix}$$
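The small OK system of this example can be set up and solved with a short, dependency-free Python sketch. It assumes the three sample positions −2, −1 and 3 from the inverse distance weighting example and the stipulated spherical model with C0 = 0, C1 = 1 and R = 6. The function names and the plain Gaussian elimination solver are illustrative choices, not part of the note.

```python
def spherical_cov(h, c0=0.0, c1=1.0, r=6.0):
    """Auto-covariance C(h) = C(0) - gamma(h) for the spherical model."""
    if h == 0:
        return c0 + c1
    if h >= r:
        return 0.0
    gamma = c0 + c1 * (1.5 * h / r - 0.5 * (h / r) ** 3)
    return (c0 + c1) - gamma


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(m[i][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for i in range(col + 1, n):
            f = m[i][col] / m[col][col]
            for j in range(col, n + 1):
                m[i][j] -= f * m[col][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ordinary_kriging(sites, target, cov):
    """Return OK weights, Lagrange parameter lambda and kriging variance."""
    n = len(sites)
    # Left hand side: covariances between samples, bordered by ones.
    a = [[cov(abs(si - sj)) for sj in sites] + [1.0] for si in sites]
    a.append([1.0] * n + [0.0])
    # Right hand side: covariances between samples and target, then 1.
    b = [cov(abs(si - target)) for si in sites] + [1.0]
    solution = solve(a, b)
    w, lam = solution[:n], solution[n]
    variance = cov(0.0) - sum(wi * ci for wi, ci in zip(w, b[:n])) - lam
    return w, lam, variance


weights, lam, var_ok = ordinary_kriging([-2.0, -1.0, 3.0], 0.0, spherical_cov)
print([round(w, 4) for w in weights], round(lam, 4), round(var_ok, 4))
```

The weights sum to one by construction. Note that the data values do not enter the system at all: weights and kriging variance depend only on the covariance model and the geometry of the sample locations.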
[...] assumptions. It can hardly be considered a drawback of geostatistical methods that we are forced to consider whether such assumptions are appropriate.

IV. FINAL REMARKS

A sampling strategy may be based on the dependence of the kriging variance on the distance to the nearest samples. If the auto-covariance function (or the semi-variogram) and the sample locations are known, we can determine the kriging weights and the kriging variances before the actual sampling takes place. If the variances become too large in some regions of our study area we may modify the sample locations to obtain smaller variances. Also, to obtain a good estimate of the nugget effect, which is an important parameter for the outcome of the kriging process, it may be an advantage to position some samples close to each other.

In multivariate studies where the joint behaviour of several variables is investigated, rather than interpolating the original variables we may interpolate combinations of them. For instance we may interpolate principal components or factors resulting from a factor analysis or a spatial factor analysis, [19], [20], [11], [21], [22]. General references to multivariate statistics are for example [23], [24]. [25] is written especially for geographers, [15] for geologists.

Temporal aspects in connection with the application of data that vary in both space and time may also be important. Spatio-temporal semi-variograms and spatio-temporal kriging are dealt with in for example [26], [27]. A GIS for handling of temporal data is described in [28].

REFERENCES

[1] A. G. Journel and Ch. J. Huijbregts, Mining Geostatistics, Academic Press, London, 1978, 600 pp.
[2] Isobel Clark, Practical Geostatistics, Elsevier Applied Science, London, 1979, http://www.stokos.demon.co.uk/.
[3] Edward H. Isaaks and R. Mohan Srivastava, An Introduction to Applied Geostatistics, Oxford University Press, New York, 1989, 561 pp.
[4] K. Conradsen, A. A. Nielsen, and K. Windfeld, “Analysis of geochemical data sampled on a regional scale,” in Statistics in the Environmental and Earth Sciences, A. Walden and P. Guttorp, Eds., pp. 283–300. Griffin, 1992.
[5] Noel A. C. Cressie, Statistics for Spatial Data, revised edition, Wiley, New York, 1993.
[6] P. A. Burrough and R. A. MacDonnell, Principles of Geographical Information Systems, Oxford University Press, 1998.
[7] M. F. Goodchild, B. O. Parks, and L. T. Steyaert, Environmental Modeling with GIS, Oxford University Press, 1993.
[8] H. Wackernagel, Multivariate Geostatistics, third edition, Springer, 2003.
[9] C. V. Deutsch and A. G. Journel, GSLIB: Geostatistical Software Library and User’s Guide, second edition, Oxford University Press, 1998, Internet http://www.gslib.com/.
[10] Y. Pannatier, VARIOWIN: Software for Spatial Data Analysis in 2D, Springer, 1996, Internet http://www-sst.unil.ch/research/variowin/.
[11] Allan A. Nielsen, Analysis of Regularly and Irregularly Sampled Spatial, Multivariate, and Multi-temporal Data, Ph.D. thesis, Informatics and Mathematical Modelling, Technical University of Denmark, Lyngby, 1994, Internet http://www.imm.dtu.dk/pubdb/p.php?296.
[12] Karsten Hartelius, Analysis of Irregularly Distributed Points, Ph.D. thesis, Informatics and Mathematical Modelling, Technical University of Denmark, Lyngby, 1996, Internet http://www.imm.dtu.dk/pubdb/p.php?1204.
[13] Allan A. Nielsen, “2D semivariograms,” in Proceedings of the Fourth South African Workshop on Pattern Recognition, Paul Cilliers, Ed., Simon’s Town, South Africa, 25–26 November 1993, pp. 25–35.
[14] Allan A. Nielsen, “Kriging,” lecture note, Technical University of Denmark, 2004, Internet http://www.imm.dtu.dk/pubdb/p.php?3479.
[15] J. C. Davis, Statistics and Data Analysis in Geology, second edition, John Wiley & Sons, 1986.
[16] M. J. D. Powell, “The theory of radial basis function approximation,” in Advances in Numerical Analysis II: Wavelets, Subdivision Algorithms and Radial Functions, W. A. Light, Ed., Oxford University Press, 1992.
[17] W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery, Numerical Recipes: The Art of Scientific Computing, third edition, Cambridge University Press, 2007.
[18] F. Anton, J. A. Bærentzen, J. Gravesen, and H. Aanæs, Computational Geometry Processing, lecture note, Technical University of Denmark, 2008.
[19] E. C. Grunsky and F. P. Agterberg, “Spatial and multivariate analysis of geochemical data from metavolcanic rocks in the Ben Nevis area, Ontario,” Mathematical Geology, vol. 20, no. 7, pp. 825–861, 1988.
[20] E. C. Grunsky and F. P. Agterberg, “SPFAC: a Fortran program for spatial factor analysis of multivariate data,” Computers & Geosciences, vol. 17, no. 1, pp. 133–160, 1991.
[21] Allan A. Nielsen, Knut Conradsen, John L. Pedersen, and Agnete Steenfelt, “Spatial factor analysis of stream sediment geochemistry data from South Greenland,” in Proceedings of the Third Annual Conference of the International Association for Mathematical Geology, Vera Pawlowsky-Glahn, Ed., Barcelona, Spain, September 1997, pp. 955–960.
[22] Allan A. Nielsen, Knut Conradsen, John L. Pedersen, and Agnete Steenfelt, “Maximum autocorrelation factorial kriging,” in Proceedings of the 6th International Geostatistics Congress (Geostats 2000), W. J. Kleingeld and D. G. Krige, Eds., Cape Town, South Africa, April 2000, Internet http://www.imm.dtu.dk/pubdb/p.php?3639.
[23] Knut Conradsen, Introduktion til Statistik, Informatics and Mathematical Modelling, Technical University of Denmark, 1984.
[24] T. W. Anderson, An Introduction to Multivariate Statistical Analysis, second edition, John Wiley, New York, 1984.
[25] D. A. Griffith and C. G. Amrhein, Multivariate Statistical Analysis for Geographers, Prentice Hall, 1997.
[26] Annette Kjær Ersbøll, “A comparison of two spatio-temporal semivariograms with use in agriculture,” in Lecture Notes in Statistics, T. G. Gregoire et al., Ed., vol. 122, pp. 299–308. Springer-Verlag, 1997.
[27] Annette Kjær Ersbøll and Bjarne Kjær Ersbøll, “On spatio-temporal kriging,” in Proceedings of the Third Annual Conference of the International Association for Mathematical Geology (IAMG’97), Vera Pawlowsky-Glahn, Ed., Barcelona, Spain, September 1997, pp. 617–622.
[28] Thomas Knudsen, Busstop - A Spatio-Temporal Information System, Ph.D. thesis, Niels Bohr Institute, Department of Geophysics, University of Copenhagen, Denmark, 1997.

INDEX TERMS

auto-correlation function
auto-covariance function
BLUE
data analysis
distance weighting
estimation variance
geostatistics
GIS
GSLIB
intrinsic hypothesis
covariance function
kriging
kriging variance
Lagrange multiplier
least squares method
multivariate statistics
nested structures
normalized radial basis functions
nugget effect
ordinary kriging
sampling strategy
radial basis functions
range of influence
regionalised variable
screening effect
semi-variogram
semi-variogram model
sill
simple kriging
spatial interpolation
spatial correlation
stationarity
statistics
stochastic function
stochastic process
stochastic variable
variogram
Variowin