

Improvement in mapping vegetation cover factor for the universal soil loss
equation by geostatistical methods with Landsat Thematic Mapper images

Article  in  International Journal of Remote Sensing · September 2002


DOI: 10.1080/01431160110114538



Int. J. Remote Sensing, 2002, vol. 23, no. 18, 3649–3667

Improvement in mapping vegetation cover factor for the universal soil loss equation
by geostatistical methods with Landsat Thematic Mapper images

G. WANG†, S. WENTE†, G. Z. GERTNER*† and A. ANDERSON‡

†W503 Turner Hall, 1102 S. Goodwin Ave., University of Illinois, Urbana,
IL 61801, USA
‡US Army Corps of Engineers, Construction Engineering Research Lab,
P.O. Box 9005, Champaign, IL, USA

(Received 29 November 2000; in final form 26 July 2001)

Abstract. The universal soil loss equation (USLE) is a product of six factors: (1)
rainfall erosivity, (2) soil erodibility, (3) slope length, (4) slope steepness, (5) cover
and management, and (6) support practice, and is widely used to estimate average
annual soil loss. The cover and management variable, called the C factor, represents
the effect of cropping and management practices on erosion rates in agriculture,
and the effect of ground, tree and grass canopy covers on reduction of soil loss in
non-agricultural situations. This study compared three traditional and three
geostatistical methods for mapping the C factor. They included vegetation classification
with average, linear and log–linear regression for C factor assignment, sequential
Gaussian cosimulations with and without Thematic Mapper (TM) images, and
colocated cokriging with TM images. The coefficient of correlation between estimates
and observations varied from 0.4888 to 0.7317, and the root mean square error
(RMSE) from 0.0159 to 0.0203. The sequential Gaussian cosimulation with a TM
ratio image resulted in the highest correlation and the smallest RMSE, and
reproduced the best and most detailed spatial variability of the C factor. This method
may thus be recommended for mapping the C factor. It is also expected that this
method could be applied to image-based mapping in other disciplines.

1. Introduction
Since 1998 we have been working on a larger project ‘Error and Uncertainty
Analysis for Ecological Modeling and Simulation’. Spatial prediction or mapping
and uncertainty analysis of soil loss by the universal soil loss equation (USLE)
(Wischmeier and Smith 1978) or revised USLE (RUSLE) (Renard et al. 1997) are
being carried out as the case study of this project. Both models are widely
used to estimate soil loss in agriculture and environmental management. In the
models, soil loss A=R×K×L×S×C×P, where R, K, L, S, C and P, respectively,
are rainfall erosivity, soil erodibility, slope length, slope steepness, vegetation cover
and management, and support practice factor. The objectives of this project are to
develop a general methodology and framework to study uncertainty and error
sources, and make error budgets in spatial mapping of predicted soil loss. It is
expected that this methodology could be applied to other disciplines.

*Corresponding author: W503 Turner Hall, 1102 S. Goodwin Ave., University of Illinois,
Urbana, Illinois 61801, USA; e-mail gertner@uiuc.edu.

International Journal of Remote Sensing
ISSN 0143-1161 print/ISSN 1366-5901 online © 2002 Taylor & Francis Ltd
http://www.tandf.co.uk/journals
DOI: 10.1080/01431160110114538

The vegetation cover and management factor C represents the effect of cropping
and management practices in agricultural management, and the effect of ground,
tree and grass covers on reducing soil loss in non-agricultural situations. The higher
the ground and vegetation covers, the less the soil loss. According to Benkobi et al.
(1994) and Biesemans et al. (2000), soil loss is most sensitive to the vegetation cover
factor together with the slope steepness and length factors. In the USLE, the vegetation
cover C factor is derived from empirical equations with measurements of ground
cover, aerial cover and minimum drip height (Wischmeier and Smith 1978). Often
the measurements of these variables are obtained by sampling along transect lines. The
average ground cover, aerial cover and minimum drip height are calculated from
the samples. The values of the C factor at the non-sample locations are estimated
from the C factor values at the sampling locations by spatial interpolation. In order
to provide accurate estimates of soil loss, it is important to create a reliable map of
the vegetation cover C factor.
The traditional method widely used for the spatial interpolation of the C factor
is the so-called point-in-polygon or point-in-stratum (Warren and Bagley 1992).
Within each polygon or stratum the cells are assumed to be homogeneous and an
average is calculated and assigned to each cell. The polygons or strata are derived
by supervised or unsupervised classification of all pixels using remote sensing data
and the C factor values at the measured locations. Siegel (1996) and Wheeler (1990)
used the procedure to map the C factor for the USLE. This method is based on
correlation of the C factor and remote sensing data. The shortcomings, however, are
that the C factor is indirectly mapped through vegetation classification, and the
classification errors are thus introduced into the C factor map. Using an average C
factor value for each vegetation type leads to smoothing of estimates and
disappearance of spatial heterogeneity and variability. At sub-areas or pixels, the uncertainty
of the resulting map is also unknown.
Recently an improved method for assigning the C factor (Tweddale et al. 2000)
was proposed. Linear and non-linear regression models were constructed based on
the correlation between the C factor values and the vegetation indices from satellite
image data. The main improvement was to reduce the smoothing effect and to
increase the spatial variability of the C factor map. Using vegetation indices instead
of original image data strengthened the correlation on which the model was
developed. However, extreme image data at non-sampled locations might result
in negative or extreme C factor values.
There is still a strong need to develop new methodology to improve mapping
the vegetation cover C factor using remote sensing data. An accurate map should
be unbiased for population estimates, sub-areas, and any specific location. The new
method needs not only to provide unbiased estimates of the population, but also to
reproduce the inherent spatial variability of the vegetation cover C factor and to
provide its spatial statistics in terms of uncertainty. The alternative may be
geostatistical methods such as colocated cokriging (Xu et al. 1992) and sequential Gaussian
cosimulation with colocated cokriging (Gomez-Hernandez and Journel 1992), by
which remote sensing data can be used as secondary variables to improve mapping
the C factor, called the primary variable. These methods create unbiased estimates
with minimum error variances, and reproduce auto- and cross-spatial variability.

Remote sensing data acquired by a sensor recording spectral signals from the
ground can be considered to be a model of the ground surface. The spatial
variability of the ground characteristics is coded in the remote sensing data. A
cross-semivariogram or cross-correlogram used in these geostatistical methods can capture
the spatial correlation between the ground characteristics and image data. The spatial
correlation is related to the field plot size used for data collection and the spatial resolution
of images. Wang et al. (2000) found that when the plot size and spatial resolution
were 90 m by 90 m for the vegetation cover mapping in this study area, the
semivariograms from the vegetation cover measurements were highly correlated with Landsat
Thematic Mapper (TM) data in terms of spatial structure.
The above idea has been supported by other studies. For example, Barata et al.
(1996) used a cokriging method to map forest cover with TM images and a significant
improvement was found. Hunner et al. (2000) modelled forest stand structure by
cokriging, where the secondary variables included topographic data, TM images and their
vegetation indices. The authors suggested that cokriging was the best compared with
other methods. Wallerman (2000) compared cokriging and a simple regression model
for estimating forest stem volume using Landsat TM data, and concluded that
cokriging was much better than the regression model.
The results above show that cokriging is promising. However, the error variances
depend only on local data configuration and measure local uncertainty, not spatial
uncertainty. The interpolation may also smooth out local details of spatial
variability unevenly. That is, the smoothing is minimal close to the data locations and
increases as the locations to be estimated get farther away from the sample locations. The
mapping can be improved by sequential Gaussian cosimulation by colocated cokriging
with remote sensing data. In this method, a number of realizations for each
location are generated conditional to the sample data and previously simulated values
within a given neighbourhood, and to the image datum at the estimated location, and are
used to obtain an expectation value and variance.
This study presents a methodological improvement to map the vegetation cover
C factor for soil loss estimation using a sequential Gaussian cosimulation with
Landsat TM images. This method combines the measurements of the C factor and
image data in auto- and cross-spatial variability to derive conditional cumulative
density functions. From these functions, a number of realizations are randomly
drawn and used to calculate an expectation value for each pixel and its
variance, which measures spatial uncertainty. This approach is compared with two other
geostatistical and three traditional methods for mapping.

2. Datasets and C factor


This study was carried out in an area of 87 890 ha, located at Fort Hood, Texas,
where summer is long and hot, and winter is short and mild. The dominant vegetation
type at the east and north-east is oak–juniper woodlands. West and south parts are
savannah type, dominated by grasses with scattered motts of live oak. In the centre
there is a mixture of the savannah type and oak–juniper woodlands. Based on the
vegetation types and soil types, 215 field plots were selected in a stratified random
fashion and measured in the spring and summer of 1989 (Tazik et al. 1992). The
number of these plots at each vegetation and soil type was proportional to its percent
of land area. The plot width was 6 m with a 100 m line transect located in its centre.
Ground cover, canopy cover and botanical composition were recorded by the
point intercept method as described by Diersing et al. (1992), that is, at 1 m intervals
(0.5 m, 1.5 m, ..., 99.5 m) along the transect. The ground and vegetation cover
percentages of each plot were calculated by dividing the total number of the covered
points by the total points measured (×100%). Canopy vegetation heights were
recorded by species at 0.1 m height intervals up to 2 m, and at 0.5 m intervals up to
8 m in height. The main vegetation cover types obtained were tree, shrub, grass,
mixed, bare land and water. The vegetation cover C factor was calculated for each
field plot using the approach described by Wischmeier and Smith (1978).
A scene of Landsat TM images dated 16 October 1989 with a spatial
resolution of 30 m by 30 m was acquired. These images consisted of band 1:
0.45–0.53 µm, band 2: 0.52–0.60 µm, band 3: 0.63–0.69 µm, band 4: 0.76–0.90 µm,
band 5: 1.55–1.75 µm, and band 7: 2.08–2.35 µm. They were geo-referenced to the
Universal Transverse Mercator (UTM) projection and re-sampled to a resolution of
90 m×90 m by averaging each block of 3×3 pixels.
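The coarsening step described above can be sketched as a block average. This is a minimal illustration, not the processing chain actually used in the study; the function name `block_average` and the 6×6 test patch are hypothetical.

```python
def block_average(grid, factor=3):
    """Coarsen a 2D grid by averaging non-overlapping factor x factor blocks."""
    rows, cols = len(grid), len(grid[0])
    if rows % factor or cols % factor:
        raise ValueError("grid dimensions must be multiples of the factor")
    coarse = []
    for r in range(0, rows, factor):
        coarse_row = []
        for c in range(0, cols, factor):
            # Collect the factor x factor block whose top-left pixel is (r, c).
            block = [grid[r + i][c + j]
                     for i in range(factor) for j in range(factor)]
            coarse_row.append(sum(block) / len(block))
        coarse.append(coarse_row)
    return coarse


# Hypothetical 6 x 6 patch of 30 m pixel values (not data from the study).
fine = [[float(r * 6 + c) for c in range(6)] for r in range(6)]
coarse = block_average(fine, factor=3)  # 2 x 2 grid of 90 m pixels
```

Averaging, rather than picking the centre pixel, keeps each 90 m value representative of the whole block, which matches the plot size used for the spatial analysis.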

3. Methods
Three traditional and three geostatistical methods were used in this study for
spatial interpolation of the vegetation cover C factor, that is, mapping the C factor
using a sample dataset and a scene of six Landsat TM images. The three traditional
methods were typical point-in-polygon or point-in-stratum approaches, that is, vegetation
classification with pixel value assignment using (i) average cross-category; (ii) linear
regression model cross-category; and (iii) log–linear regression models cross-category.
The three geostatistical methods were (i) colocated cokriging with a TM ratio image;
and sequential Gaussian cosimulation (ii) with and (iii) without the TM ratio image.
From all 215 sample plots, 31 plots were randomly selected and used as the test
dataset. The remaining 184 sample plot data were used for developing spatial
interpolation models. For the traditional methods, the image data used were all six
TM bands. For the two geostatistical methods that use imagery, the image data employed
were the ratio image having the highest correlation with the C factor values. The
Geostatistical Software Library (GSLIB) (Deutsch and Journel 1998) and VARIOWIN
software for spatial data analysis in 2D (Pannatier 1996) were used for development of
the three geostatistical methods, the Geographic Resources Analysis Support System
(GRASS 1993) for the three traditional methods, and ArcView GIS (Hutchinson and
Daniel 1997) for displaying the raster data.
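The hold-out design and the two accuracy measures reported in this paper (coefficient of correlation and RMSE between estimates and observations) can be sketched as follows. The helper names, the random seed and the three example values are assumptions made for illustration; they are not the plots or results of the study.

```python
import math
import random


def split_plots(plot_ids, n_test, seed=0):
    """Randomly hold out n_test plots; the rest are used for model development."""
    rng = random.Random(seed)
    test = set(rng.sample(plot_ids, n_test))
    train = [p for p in plot_ids if p not in test]
    return train, sorted(test)


def rmse(estimates, observations):
    """Root mean square error between paired estimates and observations."""
    n = len(observations)
    return math.sqrt(sum((e - o) ** 2 for e, o in zip(estimates, observations)) / n)


def pearson_r(x, y):
    """Coefficient of correlation between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


plot_ids = list(range(215))                      # 215 field plots
train_ids, test_ids = split_plots(plot_ids, 31)  # 184 for modelling, 31 for testing

# Hypothetical C factor estimates and observations at three test plots.
estimates = [0.02, 0.05, 0.08]
observations = [0.03, 0.05, 0.06]
error = rmse(estimates, observations)
r = pearson_r(estimates, observations)
```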

3.1. Image data analysis


In addition to the six original band images, data transformations were made. The
correlation between the image data and vegetation cover C factor was then analysed.
The data transformations were:
Normal difference vegetation index (NDVI): (TM4+TM3)/(TM4−TM3) (1)
TM Ratio 1: TM3/TM4 (2)
TM Ratio 2: TM7/TM4 (3)
TM Ratio 3: (TM3×TM7)/TM4 (4)
TM Ratio 4: (TM3×TM5)/TM4 (5)
TM Ratio 5: (TM3+TM7)/TM4 (6)
TM Ratio 6: (TM2+TM3+TM7)/TM4 (7)

The purpose of the transformations was to improve the correlation of the image
data with vegetation cover by reducing the redundant information due to high
correlation between the images and by increasing variation of the image data (Barata
et al. 1996).
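As an illustration of equations (2) to (7), the six ratio transformations can be computed per pixel as below. The function name `tm_ratios`, the zero-denominator check and the example band values are assumptions for this sketch, not part of the original processing.

```python
def tm_ratios(tm2, tm3, tm4, tm5, tm7):
    """Ratio transformations of equations (2)-(7) for one pixel."""
    if tm4 == 0:
        # All six ratios divide by TM band 4, so a zero value is undefined.
        raise ValueError("TM4 must be non-zero")
    return {
        "ratio1": tm3 / tm4,
        "ratio2": tm7 / tm4,
        "ratio3": (tm3 * tm7) / tm4,
        "ratio4": (tm3 * tm5) / tm4,
        "ratio5": (tm3 + tm7) / tm4,
        "ratio6": (tm2 + tm3 + tm7) / tm4,
    }


# Hypothetical band values for a single pixel.
pixel = tm_ratios(tm2=20.0, tm3=30.0, tm4=60.0, tm5=40.0, tm7=30.0)
```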

3.2. Traditional methods

Traditionally, vegetation classification is first performed using image data by
supervised or unsupervised classification. Vegetation cover C factor values are then
assigned to pixels by different methods within vegetation types. In the sampling area,
the vegetation types consisted of grass, shrub, tree, mixed, and bare land. A method
similar to supervised classification with maximum likelihood was applied. A set of
classification rules for the vegetation types (not listed here) was defined. Based on
the actual percentage cover and the rules, the field plots were classified and used to
estimate the parameters of the maximum likelihood function based on data of the six TM
band images. Using the function, each pixel was finally classified and a classification
map was obtained.
Three methods were applied for the assignment of the C factor. The first was
average cross-category (AverageC), that is, an average of the C factor values from
field plots was calculated for each vegetation category, and the average value was
assigned to each pixel of the same vegetation. The second method was linear
regression cross-category (LregC). A linear regression model to represent the relationship
of the C factor values with the satellite image data was derived for each vegetation
category, then using the model the estimate of C factor was calculated for each pixel
of the same vegetation. The third method was the linear regression with a logarithmic
transformation of the C factor (LogLregC).
When linear or log–linear regression models were used to predict the C factor
values at the non-sampled pixels, negative or extremely large values were sometimes
obtained. For these cases, the minimum and maximum C factor values from the
sample data were assigned to the pixels. Additionally, the number of independent
variables varied depending on vegetation categories. In general, as the number of
sample plots increased within a vegetation type, a greater number of image
variables was included in the regression models.
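Two of the steps above lend themselves to a short sketch: the per-category average (AverageC) and the replacement of out-of-range regression predictions by the sample minimum or maximum. The function names and example values are hypothetical, and the sketch assumes the limits are taken over whatever sample list is passed in; the text does not say whether they were taken per vegetation category or over all plots.

```python
def average_c_by_class(plot_classes, plot_c):
    """AverageC: mean C factor of the field plots in each vegetation category."""
    sums, counts = {}, {}
    for cls, c in zip(plot_classes, plot_c):
        sums[cls] = sums.get(cls, 0.0) + c
        counts[cls] = counts.get(cls, 0) + 1
    return {cls: sums[cls] / counts[cls] for cls in sums}


def clamp_to_sample_range(predicted, sample_c):
    """Replace predictions outside the sample range by the sample min or max."""
    lo, hi = min(sample_c), max(sample_c)
    return [min(max(p, lo), hi) for p in predicted]


# Hypothetical plot data (not values from the study).
classes = ["grass", "grass", "tree", "tree"]
c_values = [0.06, 0.08, 0.01, 0.03]
class_means = average_c_by_class(classes, c_values)

# Hypothetical regression predictions, one negative and one extremely large.
clamped = clamp_to_sample_range([-0.02, 0.05, 0.30], c_values)
```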

3.3. Spatial variability


In geostatistics, the spatial variability can be modelled by semivariograms and
cross-semivariograms. Let Z be the vegetation cover C factor or a spectral variable
and $Z(u)$ $[= m + \varepsilon(u)]$ be a random function defined at location u in two-dimensional
space, where m is the mean of Z in a region, and $\varepsilon(u)$ is a random function with zero
mean. A semivariogram $\gamma_z(h)$ of the random function is related to the semivariance
with a separation vector, or lag of distance h given a direction (Krige 1966), and
can be estimated from sample data using the experimental semivariogram, that is, equation (8):

$$\gamma_z(h) = \frac{1}{2N(h)} \sum_{\alpha=1}^{N(h)} \left[ z(u_\alpha) - z(u_\alpha + h) \right]^2 \tag{8}$$

where $N(h)$ is the number of data pairs separated by the lag h, and $z(u_\alpha)$ and $z(u_\alpha + h)$ are
data values of variable Z at spatial locations $u_\alpha$ and $u_\alpha + h$, respectively. Similarly,
the spatial cross-variability between two random functions can be calculated by the
cross-semivariogram (Goovaerts 1997) in equation (9):

$$\gamma_{zy}(h) = \frac{1}{2N(h)} \sum_{\alpha=1}^{N(h)} \left[ z(u_\alpha) - z(u_\alpha + h) \right] \left[ y(u_\alpha) - y(u_\alpha + h) \right] \tag{9}$$

where $y(u_\alpha)$ and $y(u_\alpha + h)$ are data values of variable Y at spatial locations $u_\alpha$ and
$u_\alpha + h$, respectively. The semivariograms and cross-semivariograms can often be
modelled using spherical, exponential and Gaussian models as follows (Pannatier
1996):

$$\gamma_{Sph}(h) = \begin{cases} 1.5\,\dfrac{h}{a} - 0.5\left(\dfrac{h}{a}\right)^3 & \text{if } h \le a \\ 1 & \text{otherwise} \end{cases} \tag{10}$$

$$\gamma_{Exp}(h) = 1 - \exp\left(\frac{-3h}{a}\right) \tag{11}$$

$$\gamma_{Gau}(h) = 1 - \exp\left(\frac{-3h^2}{a^2}\right) \tag{12}$$

where a is the range parameter of the spherical model and the practical range parameter
of the exponential and Gaussian models, defined as the distance at which the model
value is at 95% of the sill. The range parameters provide the range of spatial
dependence of the variable. Within the range, observations can be considered spatially
dependent, and beyond the range, observations can be considered essentially
independent. As the distance h increases, the semivariogram and cross-semivariogram
vary and approach their limit values.
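Equation (8) and the three models in equations (10) to (12) can be sketched in a few lines. This is an omnidirectional illustration with a distance tolerance around the lag, in the spirit of the lag tolerance mentioned later in the results; the function names and the four-point transect are hypothetical, and the study itself used GSLIB and VARIOWIN rather than this code.

```python
import math


def experimental_semivariogram(coords, values, lag, tol):
    """Equation (8) for one lag: half the mean squared difference over all
    pairs whose separation distance lies within lag +/- tol."""
    total, n_pairs = 0.0, 0
    for i in range(len(values)):
        for j in range(i + 1, len(values)):
            d = math.dist(coords[i], coords[j])
            if abs(d - lag) <= tol:
                total += (values[i] - values[j]) ** 2
                n_pairs += 1
    gamma = total / (2 * n_pairs) if n_pairs else float("nan")
    return gamma, n_pairs


def spherical(h, a):
    """Equation (10), unit sill."""
    return 1.5 * (h / a) - 0.5 * (h / a) ** 3 if h <= a else 1.0


def exponential(h, a):
    """Equation (11), unit sill, practical range a."""
    return 1.0 - math.exp(-3.0 * h / a)


def gaussian(h, a):
    """Equation (12), unit sill, practical range a."""
    return 1.0 - math.exp(-3.0 * h * h / (a * a))


# Hypothetical transect of four points spaced 1 unit apart.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
values = [1.0, 2.0, 4.0, 7.0]
gamma_lag1, pairs_lag1 = experimental_semivariogram(coords, values, lag=1.0, tol=0.5)
gamma_lag2, pairs_lag2 = experimental_semivariogram(coords, values, lag=2.0, tol=0.5)
```

At h = a the exponential and Gaussian models evaluate to 1 − exp(−3), about 0.95, which is the 95% of the sill used to define the practical range.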
The spatial variability of a variable may be a linear combination of the models
above. The simplest example is $\gamma(h) = c_0 + c_1 \times \gamma_{Sph}(h)$, where $c_0$ and $c_1$ respectively
are called nugget and structure parameters, and $c = c_0 + c_1$ is the sill parameter
(figure 1). The nugget, structure and sill parameters account for nugget variance,
structure variance and total variance of the spatial variability. The nugget variance
when h=0 can be considered the noise term, implying short distance variability and
measurement error in remote sensing.

Figure 1. Examples of spherical (a) and Gaussian (b) models with their parameters.

3.4. Colocated cokriging (Co_Cok) and Markov model


Colocated cokriging estimators are an alternative to improve spatial prediction
of a primary variable, that is, vegetation cover C factor in this study, given secondary
information available at all nodes to be estimated, that is, Landsat TM images. This
method introduces the image information into the interpolation for improving
estimation. The colocated cokriging estimator is:

$$z_{sck}(u) = \sum_{\alpha=1}^{n(u)} \lambda_\alpha^{sck}(u) \left[ z(u_\alpha) - m_z \right] + \lambda_y^{sck}(u) \left[ y(u) - m_y \right] + m_z \tag{13}$$

where $z_{sck}(u)$ is a simple colocated cokriging estimate of the C factor. $z(u_\alpha)$ is one of
the sample data of the C factor, called the primary variable, $\alpha$=1, 2, ..., n and n is
the number of all ground plots. $y(u)$ is the datum of the image variable, called the
secondary variable, at an unknown location u to be estimated. The variables $m_z$ and
$m_y$ are the average values of the C factor and the image variable, and $\lambda_\alpha^{sck}(u)$ and $\lambda_y^{sck}(u)$ are
the weights to the C factor data and the image datum. $n(u)$ is the number of the
sample data used to predict the vegetation cover C factor at an unknown location
u to be estimated given a neighbourhood and one TM datum $y(u)$.
The number of the sample data used to predict the C factor at each unknown location
varies depending on the sample density given a neighbourhood. At least two samples
are required to calculate estimation variance. Too many samples may lead to
smoothed estimates. In this study, the minimum and maximum numbers of the
sample data used were 3 and 16 respectively. The colocated cokriging linear equation
for the weights is:

$$\begin{cases} \displaystyle\sum_{\beta=1}^{n(u)} \lambda_\beta^{sck}(u)\, C_{zz}(u_\alpha - u_\beta) + \lambda_y^{sck}(u)\, C_{zy}(u_\alpha - u) = C_{zz}(u_\alpha - u), & \alpha = 1, \ldots, n(u) \\ \displaystyle\sum_{\beta=1}^{n(u)} \lambda_\beta^{sck}(u)\, C_{yz}(u - u_\beta) + \lambda_y^{sck}(u)\, C_{yy}(0) = C_{yz}(0) \end{cases} \tag{14}$$

where $C_{zz}(h)$ and $C_{yy}(h)$ are the covariance functions of the primary and secondary
variables at a separation distance h. $C_{zy}(h)$ is the cross-covariance function between
the two variables and $h = u_\alpha - u_\beta$, or $u_\alpha - u$, $u - u_\beta$, and 0. The relationship between
covariance and semivariogram functions is $C(h) = C(0) - \gamma(h)$.
To derive the weights, the covariance and cross-covariance (or semivariogram)
functions should be modelled together. The conditions for obtaining solutions are
that permissible covariance models are used, and that all diagonal elements and all principal
minor determinants of the matrix consisting of nugget and sill parameters are
non-negative. This is a very complicated and tedious task. An alternative is that the
cross-covariance $C_{zy}(h)$ can be approximated by Markov models in terms of spatial
covariance in equation (15):

$$C_{zy}(h) = C_{zy}(0) \times C_{zz}(h) / C_{zz}(0) \tag{15}$$

In addition, the approximation can also be made by multiplying the cross-spatial
correlogram $\rho_{zy}(h)$ between the primary and secondary variable at the separation
distance h=0, with $\rho_{yy}(h)$, the auto-spatial correlogram of the secondary variable,
that is, equation (16).

$$\rho_{zy}(h) = \rho_{zy}(0)\, \rho_{yy}(h) \tag{16}$$

This equation is called Markov model MM2 (Journel 1999) and was applied in
colocated cokriging with TM images. When the distance h=0, the cross-spatial
correlogram equals a correlation coefficient in traditional statistics. For more details,
readers should refer to Journel (1999), and Shmaryan and Journel (1999).
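To make the structure of equations (13), (14) and (16) concrete, the sketch below solves the colocated cokriging system for the smallest possible case: one primary datum at distance h plus the colocated image datum, with both variables standardized to zero mean and unit variance so that covariances equal correlograms. It is only an illustration under stated assumptions: the exponential correlogram, the ranges `a_zz` and `a_yy`, and the data values are hypothetical; only the value 0.608 for the correlation at h=0 is taken from table 1.

```python
import math


def exp_correlogram(h, a):
    """Hypothetical correlogram: 1 minus the exponential model of equation (11)."""
    return math.exp(-3.0 * h / a)


def colocated_cokriging_one_datum(z1, h, y_u, rho0, a_zz, a_yy):
    """Equations (13)-(14) with n(u) = 1 and standardized variables.

    z1   : primary datum at distance h from the location to be estimated
    y_u  : colocated secondary datum at the location to be estimated
    rho0 : correlation between primary and secondary variables at h = 0
    """
    rho_zz_h = exp_correlogram(h, a_zz)
    rho_zy_h = rho0 * exp_correlogram(h, a_yy)  # Markov model MM2, equation (16)
    # System (14) reduces to two equations in two unknowns:
    #   lam_z * 1        + lam_y * rho_zy_h = rho_zz_h
    #   lam_z * rho_zy_h + lam_y * 1        = rho0
    det = 1.0 - rho_zy_h ** 2
    lam_z = (rho_zz_h - rho_zy_h * rho0) / det
    lam_y = (rho0 - rho_zy_h * rho_zz_h) / det
    estimate = lam_z * z1 + lam_y * y_u  # equation (13) with m_z = m_y = 0
    return estimate, lam_z, lam_y


# Hypothetical standardized data and ranges (km).
near = colocated_cokriging_one_datum(z1=1.2, h=0.0, y_u=0.5, rho0=0.608,
                                     a_zz=2.6, a_yy=2.0)
far = colocated_cokriging_one_datum(z1=1.2, h=1.0e6, y_u=0.5, rho0=0.608,
                                    a_zz=2.6, a_yy=2.0)
```

The two limiting cases show the behaviour of the estimator: at h=0 the primary datum receives all the weight, and far from any sample the estimate falls back on the image datum scaled by the correlation coefficient.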

3.5. Sequential Gaussian cosimulation with and without TM images (SGWTM, SGWoTM)
In addition to the colocated cokriging above, Landsat TM data were added into
a sequential Gaussian cosimulation process to improve the prediction of the C factor.
Instead of deriving a single estimate and variance at an unknown location by
colocated cokriging, the sequential Gaussian cosimulation generates a number of
realizations conditional to the sample data. From these realizations, the expected
average and its variance can be estimated. The variance estimate is thus a measure
of spatial uncertainty, not of local uncertainty as with colocated cokriging. In
addition to the sample data, the estimates derived in the previous cosimulations
become conditional data in the next cosimulation in the sequential Gaussian
cosimulation algorithm.
The Gaussian cosimulation requires the primary and secondary variables to be
normally distributed. For both C factor and image data, a normal score
transformation was used in order for both variables to have a normal distribution with a mean
of zero and unit variance. A multivariate multiple-point Gaussian random function
model was adopted. The colocated cokriging does not require normal distribution
of data. Finally, the cosimulation algorithm used a Markov model MM1, equation
(17), to approximate the cross-spatial correlogram $\rho_{zy}(h)$ (Almeida and Journel 1994):

$$\rho_{zy}(h) = \rho_{zy}(0)\, \rho_{zz}(h) \tag{17}$$

The difference between the two Markov models is that in MM1 the auto-spatial
correlogram was $\rho_{zz}(h)$ from the primary variable, while in MM2 the auto-spatial
correlogram was $\rho_{yy}(h)$ from the secondary variable. In practice, both Markov models
should be checked against the sample data.
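A normal score transformation of the kind mentioned above can be sketched with the Python standard library. This version assigns the k-th smallest of n values the standard normal quantile at probability (k − 0.5)/n, which is one common convention and an assumption here; the exact convention and tie handling of the software used in the study are not given in the text. The three C factor values are hypothetical.

```python
from statistics import NormalDist


def normal_scores(values):
    """Rank-based normal score transform.

    The k-th smallest of n values is mapped to the standard normal quantile
    at cumulative probability (k - 0.5) / n. Ties are ranked in input order.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    std_normal = NormalDist()  # mean 0, standard deviation 1
    scores = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        scores[idx] = std_normal.inv_cdf((rank - 0.5) / n)
    return scores


# Hypothetical C factor values at three plots.
scores = normal_scores([0.05, 0.01, 0.20])
```

The transform preserves the rank order of the data, so simulated normal scores can be back-transformed to C factor values through the same rank-to-quantile mapping.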
Suppose that $[z(u_\alpha), \alpha = 1, 2, 3, \ldots, n]$ is the set of C factor sample data, and
$[Z(u'_j), j = 1, 2, 3, \ldots, N]$ is a set of random variables defined at N locations $u'_j$, that
is, N nodes of the grid for the study area. The cosimulation is to generate L joint
realizations $[z^{(l)}(u'_j), j = 1, \ldots, N]$ $(l = 1, 2, \ldots, L)$ for these N random variables
conditional to the sample dataset. The key for the performance is that an N-point
conditional cumulative density function is expressed as the product of N one-point
conditional cumulative density functions given the set of $n(u)$ original data values
and $N-1$ previously simulated values (Goovaerts 1997).
In the sequential Gaussian cosimulation with TM image (SGWTM), a random
path visiting each node only once was set. At each node, the mean and variance of
the Gaussian conditional cumulative density function were determined by colocated
cokriging given the normal score values of n original data and an image datum, and
all previously simulated values with direct and cross-semivariogram models. From
the density function, a value was drawn, becoming a conditional datum. This step
was repeated until N nodes were visited to obtain a realization of the C factor for
the whole area. Running this process L times with different paths resulted in L
realizations providing the expectation and variance map.
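The last step, turning L realizations into an expectation map and a variance map, can be sketched as a per-node summary. The function name is hypothetical, the three short realizations are made-up values rather than simulation output, and dividing by L (rather than L − 1) is an assumption since the text does not specify the variance estimator.

```python
def expectation_and_variance(realizations):
    """Per-node mean and variance across L realizations of N node values."""
    n_real = len(realizations)
    n_nodes = len(realizations[0])
    means, variances = [], []
    for j in range(n_nodes):
        column = [real[j] for real in realizations]  # node j in every realization
        mean = sum(column) / n_real
        var = sum((v - mean) ** 2 for v in column) / n_real
        means.append(mean)
        variances.append(var)
    return means, variances


# Hypothetical L = 3 realizations over N = 2 nodes.
realizations = [
    [0.02, 0.10],
    [0.04, 0.10],
    [0.06, 0.10],
]
mean_map, variance_map = expectation_and_variance(realizations)
```

A node where the realizations agree, like the second node here, gets a variance near zero; a node where they disagree gets a larger variance, which is how the variance map expresses spatial uncertainty.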

For comparison with the cosimulation with TM images, the sequential Gaussian
simulation was also done without TM image data (SGWoTM) and the simulation
steps were similar to those above. However, the mean and variance of the Gaussian
conditional cumulative density function were determined by simple kriging given the
normal score values of n original data and previously simulated values with the
direct semivariogram of C factor.

4. Results
The coefficients of correlation between vegetation cover C factor and Landsat
TM images are listed in table 1. The correlation varied from 0.402 to 0.586. The C
factor had the highest correlation with TM band 7, then band 3, band 5, band 1,
band 2 and band 4. The TM bands with higher correlation had larger coefficients
of variation and the variation coefficient of TM band 4 was the smallest. The
correlation between these TM bands except for TM band 4 was very high, which
implied redundant information. TM band 7 or 3 may be the best image variable
for mapping the C factor; however, TM band 4 may also be used because of less
redundant information.
According to the analysis above, seven transformed image variables were derived
mainly from TM bands 7, 3 and 4. Compared with the original TM bands, the
normal difference vegetation index had much weaker correlation with C factor;
however, the other six TM ratios led to improvement in the correlation. The largest
correlation was obtained with ratio image 5, (TM3+TM7)/TM4. Ratio image 5 was
thus employed as the secondary variable in two geostatistical methods, that is,
colocated cokriging and sequential Gaussian cosimulation with TM images.
The semivariogram of C factor, $\gamma_C(h)$, was modelled using 184 original sample
data (figure 2(a)) and their normal scores with standardization (figure 2(b)). The lag
distance and lag tolerance used were 1.0 and 0.5 km. The two experimental
semivariograms were fit using Gaussian and spherical models, respectively, equations
(18) and (19).

Sample data:
$$\gamma_C(h) = 0.00016 + 0.00045 \left\{ 1 - \exp\left( \frac{-3h^2}{2.4^2} \right) \right\} \tag{18}$$

Sample data and standardization:
$$\gamma_C(h) = 0.23 + 0.77 \left\{ 1.5\,\frac{h}{2.6} - 0.5 \left( \frac{h}{2.6} \right)^3 \right\} \tag{19}$$
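The two fitted models can be evaluated directly from the parameters in equations (18) and (19). The sketch below is illustrative: the function names are hypothetical, h is in km, and for equation (19) it assumes, following the spherical model of equation (10), that the model stays at its sill of 0.23+0.77 beyond the range of 2.6 km.

```python
import math


def gamma_c_sample(h_km):
    """Equation (18): nugget 0.00016 plus a Gaussian structure (range 2.4 km)."""
    return 0.00016 + 0.00045 * (1.0 - math.exp(-3.0 * h_km ** 2 / 2.4 ** 2))


def gamma_c_normal_score(h_km):
    """Equation (19): nugget 0.23 plus a spherical structure (range 2.6 km)."""
    if h_km >= 2.6:
        return 0.23 + 0.77  # sill reached, as in equation (10)
    r = h_km / 2.6
    return 0.23 + 0.77 * (1.5 * r - 0.5 * r ** 3)


at_range_sample = gamma_c_sample(2.4)          # about 95% of the structure variance
at_half_range_ns = gamma_c_normal_score(1.3)   # half the spherical range
```

The nugget accounts for about 26% of the sill in equation (18) (0.00016 of 0.00061) and 23% in equation (19), so most of the variability of the C factor is spatially structured in both fits.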

Table 1. Correlation between vegetation cover C factor and Landsat TM and ratio images.
(Note: NDVI: (TM4+TM3)/(TM4−TM3); Ratio 1: TM3/TM4; Ratio 2: TM7/TM4;
Ratio 3: (TM3×TM7)/TM4; Ratio 4: (TM3×TM5)/TM4; Ratio 5: (TM3+TM7)/TM4;
and Ratio 6: (TM2+TM3+TM7)/TM4.)

            TM1      TM2      TM3      TM4      TM5      TM7      NDVI
C factor    0.551    0.549    0.570    0.402    0.560    0.586    0.250

            Ratio 1  Ratio 2  Ratio 3  Ratio 4  Ratio 5  Ratio 6
C factor    0.602    0.599    0.594    0.589    0.608    0.607
Figure 2. Experimental (dots) and modelled (line) semivariograms of the C factor using
the original sample data for modelling (a) and normal score transformation with
standardization (b). Note: distance unit is km.

The semivariogram using raster data of ratio image 5, $\gamma_R(h)$, is given in figure 3 and
equation (20).

$$\gamma_R(h) = 0.002 + 0.130 \left\{ 1 - \exp\left( \frac{-3h}{2.0} \right) \right\} + 0.018 \left\{ 1 - \exp\left( \frac{-3h^2}{14.4^2} \right) \right\} + 0.002 \left\{ 1.5\,\frac{h}{16.2} - 0.5 \left( \frac{h}{16.2} \right)^3 \right\} \tag{20}$$

The lag distance and lag tolerance were 90 m and 45 m, respectively. The
semivariogram was fit by integrating a common nugget of 0.002, an exponential, a Gaussian,
and a spherical model.
Figure 3. Experimental (dots) and modelled (line) semivariograms of the ratio image 5. Note:
distance unit is km.

In figure 4, the two Markov models MM1 (equation (17)) and MM2 (equation (16)) were
evaluated by comparing them to the sample cross-correlogram between the C factor
and ratio image 5. When the separation distance h was zero, the spatial
cross-correlograms from the Markov models were almost 0.608. This is the coefficient of
correlation between the C factor and ratio image 5. When the distance was less than
5000 m, the approximation from these two models might lead to underestimation of
the sample cross-correlogram. The underestimation was more significant for
MM1 than for MM2. The models MM1 and MM2 respectively were employed
in the sequential Gaussian cosimulation and colocated cokriging with the ratio
image 5.

Figure 4. Experimental cross-correlograms between C factor and ratio image 5 using sample
data, Markov models MM1 and MM2.
The sample data of C factor and results by colocated cokriging using the ratio
image 5 are shown in figure 5. Most of the C factor values were small at the
east and north-east, and large at the west, south-west and north-west. The spatial
distribution from the sample data was reproduced in the estimation map in
figure 5(b). However, smoothing of the estimates was significant. At the sample
locations, the interpolation honoured the sample data and the variances of the estimates were
zero. The variances increased rapidly as the estimated locations moved away from the
sample locations, which implied the effect of data configuration.
The results of the sequential Gaussian cosimulation with ratio image 5 are
illustrated in figure 6. As in the sample data, the estimates of the C factor in
figure 6(a) were small in the east and north-east, and large in the west, south-west
and north-west. Smoothing of the estimates was not significant compared with that of
the colocated cokriging with ratio image 5. More detailed spatial variability was
also found in the variance map of the estimates in figure 6(b). At locations with
smaller estimates and denser samples the variances were smaller; elsewhere the
variances were larger. The variances depended not only on data configuration
but also on the sample data themselves. Less smoothing occurred in the variance map
from the cosimulation than in that from the colocated cokriging.
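Estimate and variance maps from a simulation are commonly obtained by summarising many realisations pixel by pixel. The following is a minimal sketch of that summary, assuming each realisation is a flat list of C factor values in the same pixel order; the function name, the use of the population variance and the toy numbers are assumptions made for illustration.

```python
from statistics import fmean, pvariance


def summarise_realisations(realisations):
    """Return per-pixel mean and variance across equally likely realisations."""
    if not realisations:
        raise ValueError("at least one realisation is required")
    n_pixels = len(realisations[0])
    if any(len(r) != n_pixels for r in realisations):
        raise ValueError("all realisations must cover the same pixels")
    # Transpose so that each tuple holds all simulated values of one pixel.
    columns = list(zip(*realisations))
    estimates = [fmean(col) for col in columns]
    variances = [pvariance(col) for col in columns]  # population variance
    return estimates, variances


# Three invented realisations over two pixels.
toy = [[0.02, 0.05], [0.04, 0.05], [0.06, 0.05]]
estimates, variances = summarise_realisations(toy)
print(estimates, variances)
```

A pixel whose simulated values agree across realisations gets a variance of zero, while a pixel whose values differ gets a positive variance, so the variance map reflects the simulated values as well as the sample layout.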
For comparison with the cosimulation results above, the estimate and variance maps
of the C factor were also derived by sequential Gaussian simulation without
any TM images (figure 7). The spatial distribution of

Figure 5. (a) C factor samples, (b) C factor estimate and (c) C factor variance maps using
colocated cokriging and ratio image 5.

the C factor estimates and variances was not as distinct as in the corresponding maps
in figure 6. The simulation without TM images led to more smoothing and a more
significant effect of data configuration than the cosimulation with ratio image 5.
When the variance image from the simulation without TM images was subtracted from
the variance image from the cosimulation with ratio image 5, an image of the
differences in variance was obtained (figure 7(c)). Using ratio image 5
resulted in reducing estimation variances at 95% of all the pixels in the study area,

Figure 6. (a) C factor estimate and (b) C factor variance maps using sequential Gaussian
cosimulation and ratio image 5.

no change at 3%, and increasing estimation variance at only 2%. The average
estimation variances with and without ratio image 5 were 0.00021 and 0.00047
respectively, a reduction of 0.00026. The reduction rate in average variance
was about 56%. Using ratio image 5 therefore reduced the map uncertainty
very significantly.
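The comparison of the two variance images can be sketched as follows. This is an illustrative summary, not the procedure actually used: it assumes the two maps are equal-length lists of pixel variances, and the function names and toy values are hypothetical.

```python
def reduction_rate(mean_without, mean_with):
    """Relative reduction in average variance when the auxiliary image is used."""
    return (mean_without - mean_with) / mean_without


def variance_change_summary(var_with, var_without):
    """Share of pixels with reduced, unchanged and increased variance."""
    if not var_with or len(var_with) != len(var_without):
        raise ValueError("maps must be non-empty and of equal length")
    n = len(var_with)
    # Difference image: variance with the image minus variance without it.
    diffs = [w - wo for w, wo in zip(var_with, var_without)]
    return {
        "reduced": sum(d < 0 for d in diffs) / n,
        "unchanged": sum(d == 0 for d in diffs) / n,
        "increased": sum(d > 0 for d in diffs) / n,
        "reduction_rate": reduction_rate(sum(var_without) / n, sum(var_with) / n),
    }


# Invented four-pixel maps, only to show the call.
print(variance_change_summary([1.0, 2.0, 3.0, 4.0], [2.0, 2.0, 6.0, 3.0]))
# Rounded averages reported in the text.
print(round(reduction_rate(0.00047, 0.00021), 3))
```

Applied to the rounded averages reported above, the rate comes out at roughly 0.55; the reported figure of about 56% was presumably computed from unrounded averages.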
Four vegetation categories, grass, shrub, tree and mixed, were identified
by the classification. Bare land was missed and water was outside the sampling area.
The averages of the C factor for grass, shrub, tree and mixed were 0.0608, 0.0347,
0.0324 and 0.0475 respectively. Because of its higher cover and greater effect on
reducing rainfall drips, tree had the smallest C factor, followed by shrub and mixed
vegetation, and grass had the largest C factor. The mean values were assigned to the
pixels belonging to the corresponding vegetation categories. Linear and log–linear
regression models were constructed within each category and are listed in table 2.
Because only a few samples were available for shrub, the two most significant
variables, TM5 and TM7, were selected for the regression models of shrub and the
other TM bands were removed.
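To make the three cross-category predictors concrete, the sketch below applies the grass and shrub coefficients of table 2 to one pixel. Several points are assumptions rather than facts from the study: the band values are invented, the regressors are taken to be the band values as listed, and the log–linear model is assumed to predict the natural logarithm of the C factor.

```python
import math

# Category means from the text; coefficients for grass and shrub from table 2.
CATEGORY_MEAN = {"grass": 0.0608, "shrub": 0.0347, "tree": 0.0324, "mixed": 0.0475}
LINEAR = {
    "grass": (0.0369, {"TM1": 0.0008, "TM2": -0.0021, "TM3": 0.0015,
                       "TM4": 0.0002, "TM5": -0.0007, "TM7": 0.0007}),
    "shrub": (0.0711, {"TM5": -0.0034, "TM7": 0.0071}),
}
LOG_LINEAR = {
    "grass": (-2.8681, {"TM1": 0.0064, "TM2": -0.0370, "TM3": 0.0312,
                        "TM4": 0.0032, "TM5": -0.0130, "TM7": 0.0117}),
    "shrub": (-1.3371, {"TM5": -0.1035, "TM7": 0.1912}),
}


def linear_form(model, pixel):
    """Intercept plus the sum of coefficient times band value."""
    intercept, coefficients = model
    return intercept + sum(c * pixel[band] for band, c in coefficients.items())


def predict_c(category, pixel, method):
    """Predict C for one pixel by 'average', 'linear' or 'loglinear'."""
    if method == "average":
        return CATEGORY_MEAN[category]
    if method == "linear":
        return linear_form(LINEAR[category], pixel)
    if method == "loglinear":
        # Assumption: the fitted response is ln(C), so exponentiate.
        return math.exp(linear_form(LOG_LINEAR[category], pixel))
    raise ValueError(f"unknown method: {method}")


# Hypothetical band values for a single pixel.
PIXEL = {"TM1": 60, "TM2": 25, "TM3": 30, "TM4": 50, "TM5": 70, "TM7": 30}
for method in ("average", "linear", "loglinear"):
    print(method, round(predict_c("grass", PIXEL, method), 4))
```

The average predictor ignores the image within a category, whereas the two regression predictors vary pixel by pixel; the linear form is unbounded, so nothing in it prevents negative predictions for unusual band values.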
In figure 8, the estimation maps of the C factor from the three traditional methods
are shown. The estimates had a spatial distribution similar to those obtained by the
three geostatistical methods. That is, the smaller estimates were located in the east
and north-east, and the larger ones in the west, south-west and north-west. The
estimates varied over space in less detail than those by the colocated cokriging and
cosimulation

Figure 7. C factor estimate (a) and C factor variance (b) maps using sequential Gaussian
simulation without ratio image 5. (c) The difference image is the difference in variance
between the simulations with and without the ratio image 5.

with the ratio image 5, but in more detail than those by the simulation without
TM images.
Comparison of the six interpolation methods was made in figure 9 based on the
differences between the estimates and the test data. All the methods resulted in slight
overestimates at the locations with smaller C factor values and underestimates at
the locations with larger C factor values. The three traditional methods and the

Table 2. Regression models of the C factor cross-categories with Landsat TM images.

Vegetation   Intercept      TM1      TM2      TM3      TM4      TM5      TM7

Linear regression models
Grass           0.0369   0.0008  -0.0021   0.0015   0.0002  -0.0007   0.0007
Shrub           0.0711                                      -0.0034   0.0071
Tree            0.0766  -0.0014   0.0030  -0.0017  -0.0006   0.0003   0.0010
Mixed          -0.1249   0.0051  -0.0024  -0.0028  -0.0014  -0.0011   0.0036

Log–linear regression models
Grass          -2.8681   0.0064  -0.0370   0.0312   0.0032  -0.0130   0.0117
Shrub          -1.3371                                      -0.1035   0.1912
Tree           -1.2107  -0.0505   0.0922  -0.0425  -0.0377   0.0191   0.0169
Mixed          -8.7998   0.1324  -0.1310  -0.0595  -0.0127   0.0039   0.0543

simulation without TM images might lead to slightly larger maximum errors than
the colocated cokriging and cosimulation with the ratio image 5.
The six methods were also compared in table 3 using the test data. All six mean
estimates fell within the confidence interval. The colocated cokriging with ratio
image 5 and the simulation without TM images provided the minimum mean difference
between the estimates and observations of the C factor. However, the cosimulation
with ratio image 5 led to the smallest range of the differences, the largest coefficient
of correlation and the smallest root mean square error (RMSE) between the estimates
and observations, followed by the colocated cokriging with ratio image 5. The linear
regression cross-category resulted in the smallest correlation and largest RMSE.
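The statistics used in this comparison can be computed as sketched below. The sketch takes each difference as estimate minus observation, which matches the sign of the mean differences in table 3; the function name and the three-point toy data are illustrative assumptions.

```python
import math


def validation_stats(estimates, observations):
    """Mean, minimum and maximum difference, correlation and RMSE."""
    if not estimates or len(estimates) != len(observations):
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(estimates)
    diffs = [e - o for e, o in zip(estimates, observations)]
    mean_e = sum(estimates) / n
    mean_o = sum(observations) / n
    # Sums of squares and cross-products about the means.
    s_ee = sum((e - mean_e) ** 2 for e in estimates)
    s_oo = sum((o - mean_o) ** 2 for o in observations)
    s_eo = sum((e - mean_e) * (o - mean_o) for e, o in zip(estimates, observations))
    return {
        "mean_difference": sum(diffs) / n,
        "min_difference": min(diffs),
        "max_difference": max(diffs),
        "correlation": s_eo / math.sqrt(s_ee * s_oo),
        "rmse": math.sqrt(sum(d * d for d in diffs) / n),
    }


# Invented estimates and observations, only to show the call.
print(validation_stats([0.03, 0.05, 0.07], [0.02, 0.05, 0.08]))
```

The toy estimates are unbiased on average yet too high at the small observation and too low at the large one, the same pattern of smoothing described for all six methods.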

5. Discussion and conclusions


This study compared three traditional and three geostatistical methods for mapping
the vegetation cover C factor for the USLE used in soil loss prediction. A new
methodology for mapping was suggested. Based on the coefficients of correlation and
RMSE, and the reproduction of spatial variability, the sequential Gaussian
cosimulation with ratio image 5 was the best, followed by the colocated cokriging
with ratio image 5, the vegetation classification with average and with log–linear
regression, and the sequential Gaussian simulation without TM images. The vegetation
classification with linear regression was the worst.
Although it is now easy to obtain remotely sensed data, many investigators still
map natural resources using geostatistical methods without any auxiliary data. This
study showed that the simulation without TM images resulted in much worse
prediction than the colocated cokriging and sequential Gaussian cosimulation with
ratio image 5. The simulation without TM images created even worse estimates
than two traditional methods, vegetation classification with average and with
log–linear regression. As expected, the TM images and the cross-semivariogram between
the image data and the C factor values provided useful spatial information at the
non-sampled locations in terms of coding the spatial variability of the C factor. In
other words, geostatistical methods without any auxiliary data should be used with
caution for mapping natural resources. Furthermore, using Markov models might lead
to a reasonable approximation of the cross-correlogram. However, the approximation
depends very much on the correlation of the primary and secondary variables.
Compared with the three traditional methods, two geostatistical methods, i.e. the

Figure 8. C factor estimate maps (a) average cross-category (AverageC), (b) linear regression
cross-category (LregC) and (c) log–linear regression cross-category (LogLregC).

colocated cokriging and Gaussian cosimulation with ratio image 5, reproduced
better and more detailed spatial variability of the vegetation cover C factor. At the
same time, both gave uncertainty measures, that is, error variances at the non-sample
locations and areas. Theoretically, the colocated cokriging, as an interpolation
method, aims at providing the best estimate at every location and does not account
for spatial variability. On the other hand, the Gaussian cosimulation tries to
reproduce spatial variability and may not result in the best predictions. In
this study, the cosimulation led to slightly better estimates than the colocated
cokriging. The differences are probably due mainly to the normal score transformation
and the different Markov model used for the cosimulation. Although

Figure 9. Comparison between six methods based on differences between estimates and
observations for spatial interpolation of C factor using colocated cokriging (Co_Cok:
a), sequential Gaussian cosimulation with TM (SGWTM: b) and without TM images
(SGWoTM: c), average cross-category (AverageC: d), linear regression (LregC: e) and
log–linear regression (LogLregC: f).

the cosimulation was about 10 times more expensive than the colocated cokriging
in terms of computing time, the former was very worthwhile in this study because
spatial variability was very important in prediction of soil erosion and uncertainty
analysis.
Additionally, a simulated value at a non-sample location was drawn from the
conditional cumulative distribution function derived conditional to the sample data,
the previously simulated values and the image datum at this location. Thus, the Gaussian

Table 3. Comparison between six methods based on statistical parameters of estimates and
observations. Ranks of the absolute parameter values are given in parentheses. Confidence
interval: 0.0416–0.0574. Co_Cok, SGWTM, SGWoTM, AverageC, LregC and
LogLregC, respectively, are colocated cokriging with TM, sequential Gaussian
cosimulation with TM, sequential Gaussian simulation without TM, average
cross-category, linear regression cross-category, and log–linear regression
cross-category.

Methods    Mean    SD          Mean         Minimum      Maximum     Correlation  RMSE
                               difference   difference   difference

Test data  0.0495  0.0225
Co_Cok     0.0492  0.0137 (3)  -0.0003 (1)  -0.0360 (2)  0.0317 (2)  0.6742 (5)   0.0164 (2)
SGWTM      0.0458  0.0201 (5)  -0.0037 (6)  -0.0375 (4)  0.0293 (1)  0.7317 (6)   0.0159 (1)
SGWoTM     0.0499  0.0090 (1)   0.0003 (1)  -0.0361 (3)  0.0376 (4)  0.4991 (2)   0.0193 (4)
AverageC   0.0488  0.0118 (2)  -0.0007 (3)  -0.0321 (1)  0.0372 (3)  0.5497 (3)   0.0185 (3)
LregC      0.0505  0.0185 (4)   0.0010 (5)  -0.0408 (6)  0.0422 (5)  0.4888 (1)   0.0203 (6)
LogLregC   0.0487  0.0203 (6)  -0.0009 (4)  -0.0384 (5)  0.0457 (6)  0.5498 (4)   0.0200 (5)

cosimulation with ratio image 5 avoided illogical estimates such as negative and
extremely large values, which are deemed to be a shortcoming of the two traditional
methods with linear or log–linear regression modelling.
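One way to see why simulated values stay within a sensible range is to sketch the draw itself: a normal score is drawn from a local Gaussian conditional distribution and mapped back to data units through the empirical quantiles of the sample data. The sketch below is a simplified illustration under stated assumptions: the conditional mean and standard deviation are given rather than derived from a kriging system, no tail extrapolation is attempted, and the function names and sample values are hypothetical.

```python
import random
from statistics import NormalDist


def back_transform(score, sorted_data):
    """Map a normal score to data units by interpolating empirical quantiles.

    Quantile positions are (i + 0.5) / n for the i-th sorted value; scores
    beyond the first or last position return the data minimum or maximum.
    """
    n = len(sorted_data)
    position = NormalDist().cdf(score) * n - 0.5
    if position <= 0:
        return sorted_data[0]
    if position >= n - 1:
        return sorted_data[-1]
    lower = int(position)
    fraction = position - lower
    return sorted_data[lower] + fraction * (sorted_data[lower + 1] - sorted_data[lower])


def draw_simulated_value(cond_mean, cond_sd, sorted_data, rng):
    """Draw a normal score from N(cond_mean, cond_sd^2) and back-transform it."""
    return back_transform(rng.gauss(cond_mean, cond_sd), sorted_data)


# Invented C factor samples, sorted in ascending order.
samples = sorted([0.01, 0.03, 0.05, 0.08, 0.12])
rng = random.Random(42)
draws = [draw_simulated_value(0.3, 0.8, samples, rng) for _ in range(5)]
print(draws)
```

Because every draw is mapped back through the sample distribution, the simulated values in this sketch cannot fall below the smallest sample or exceed the largest one, unlike predictions from an unbounded regression equation.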
In this study, the values of the vegetation cover C factor from the sample data were
assumed to be the observations. In fact, the values were calculated as a function of
ground cover, aerial cover and minimum average height of vegetation. The spatial
uncertainty and error propagation from these three variables and the function
parameters to the C factor prediction were not analysed. This will be done in the
future and error budgets will be generated.

Acknowledgments
We are grateful to SERDP (Strategic Environmental Research and Development
Program) for providing support for this study, to the US Army Corps of Engineers,
Construction Engineering Research Laboratory (USA-CERL) for the datasets, to
Xiangyun Xiao, Department of NRES, University of Illinois for calculating vegetation
cover percentages of ground plots, and to Prof. Andre Journel, Stanford University
for the colocated cokriging program.

References
Almeida, A. S., and Journel, A. G., 1994, Joint simulation of multiple variables with a
Markov-type coregionalization model. Mathematical Geology, 26, 565–588.
Barata, M. T., Nunes, M. C., Sousa, A. J., Muge, F. H., and Albuquerque, M. T., 1996,
Geostatistical estimation of forest cover areas using remote sensing data. In
Geostatistics Wollongong ‘96, 2, edited by E. Y. Baafi and N. A. Schofield (Dordrecht:
Kluwer Academic), pp. 1244–1257.
Benkobi, L., Trlica, M. J., and Smith, J. L., 1994, Evaluation of a refined surface cover
subfactor for use in RUSLE. Journal of Range Management, 47, 74–78.
Biesemans, J., Meirvenne, M. V., and Gabriels, D., 2000, Extending the RUSLE with the
Monte Carlo error propagation technique to predict long-term average off-site
sediment accumulation. Journal of Soil and Water Conservation, 55, 35–42.
Deutsch, C. V., and Journel, A. G., 1998, Geostatistical Software Library and User’s Guide
(New York: Oxford University Press).
Diersing, V. E., Shaw, R. B., and Tazik, D. J., 1992, US Army Land Condition-Trend
Analysis (LCTA) Program. Environmental Management, 16, 405–414.
Gomez-Hernandez, J. J., and Journel, A. G., 1992, Joint sequential simulation of
multiGaussian fields. In Geostatistics Tróia 1992, 1, edited by A. Soars (Dordrecht:
Kluwer Academic), pp. 85–94.
Goovaerts, P., 1997, Geostatistics for Natural Resources Evaluation (New York: Oxford
University Press).
GRASS, 1993, Geographic Resources Analysis Support System (GRASS) version 4.1, User’s
Reference Manual. USA Corps of Engineers, Construction Engineering Research
Laboratories, Champaign, Illinois. Web: http://www.baylor.edu/grass/.
Hunner, G., Mowrer, H. T., and Reich, R. M., 2000, An accuracy comparison of six spatial
interpolation methods for modeling forest stand structure on the Fraser Experimental
Forest, Colorado. In Accuracy 2000, Proceedings of the 4th International Symposium
on Spatial Accuracy Assessment in Natural Resources and Environmental Sciences,
Amsterdam, July 2000, edited by G. B. M. Heuvelink and M. J. P. M. Lemmens (Delft,
The Netherlands: Delft University Press), pp. 305–312.
Hutchinson, S., and Daniel, L., 1997, Inside ArcView GIS (Santa Fe, New Mexico:
OnWord Press).
Journel, A. G., 1999, Markov models for cross-covariances. Mathematical Geology, 31,
955–964.
Krige, D. G., 1966, Two-dimensional weighted moving average trend surfaces for ore-
evaluation. Journal of the South African Institute of Mining and Metallurgy, 66, 13–38.
Pannatier, Y., 1996, VARIOWIN Software for Spatial Data Analysis in 2D (New York:
Springer).
Renard, K. G., Foster, C. R., Weesies, G. A., McCool, D. K., and Yoder, D. C., 1997,
Predicting soil erosion by water: a guide to conservation planning with the Revised
Universal Soil Loss Equation (RUSLE). US Department of Agriculture, Agriculture
Handbook Number 703, US Government Printing Office, SSOP Washington, DC.
Shmaryan, L. E., and Journel, A. G., 1999, Two Markov models and their application.
Mathematical Geology, 31, 965–988.
Siegel, S. B., 1996, Evaluation of land value study (Draft). Resource Analysis Division, US
Army Concepts Analysis Agency, Bethesda, Maryland.
Tazik, D. J., Warren, S. D., Diersing, V. E., Shaw, R. B., Brozka, R. J., Bagley, C. F.,
and Whitworth, W. R., 1992, U.S. Army Land Condition Trend Analysis (LCTA)
plot inventory field methods. USACERL, Technical Report N-92/03, Department of
the Army, Construction Engineering Research Laboratories, Champaign, Illinois.
Tweddale, S. C., Echlschlaeger, C. R., and Seybold, W. F., 2000, An improved method
for spatial extrapolation of vegetative cover estimates (USLE/RUSLE C factor) using
LCTA and remotely sensed imagery. USAEC Report No. SFIM-AEC-EQ-TR-200011,
ERDC/CERL TR-00-7, US Army Engineer Research and Development Center, CERL,
Champaign, Illinois.
Wallerman, J., 2000, Co-kriging of forest stem volume using Landsat TM data and detected
edges. In Accuracy 2000, Proceedings of the 4th International Symposium on Spatial
Accuracy Assessment in Natural Resources and Environmental Sciences, Amsterdam,
July 2000, edited by G. B. M. Heuvelink and M. J. P. M. Lemmens (Delft,
The Netherlands: Delft University Press), pp. 709–716.
Wang, G., Gertner, G., Xiao, X., Wente, S., and Anderson, A. B., 2000, Appropriate plot
size and spatial resolution for mapping multiple vegetation types. Photogrammetric
Engineering and Remote Sensing, 67, 575–584.
Warren, S. D., and Bagley, C. F., 1992, SPOT imagery and GIS in support of military land
management. Geocarto International, 7, 35–43.
Wheeler, P. H., 1990, An innovative county soil erosion control ordinance. Journal of Soil
and Water Conservation, 45, 374–378.
Wischmeier, W. H., and Smith, D. D., 1978, Predicting rainfall-erosion losses from cropland
east of the Rocky Mountains: guide for selection of practices for soil and water
conservation. USDA, Agriculture Handbook. No. 282, US Government Printing
Office, SSOP Washington, DC, pp. 1–58.
Xu, W., Tran, T. T., Srivastava, R. M., and Journel, A. G., 1992, Integrating seismic data
in reservoir modeling: the colocated cokriging alternative. The 67th Annual Technical
Conference and Exhibition of the Society of Petroleum Engineers, Washington, DC,
October 4–7, Society of Petroleum Engineers Inc., Richardson, Texas, pp. 833–842.
