Improvement in Mapping Factor C USLE - Landsat Images
net/publication/236771894
Improvement in mapping vegetation cover factor for the universal soil loss
equation by geostatistical methods with Landsat Thematic Mapper images
4 authors:
All content following this page was uploaded by George Z Gertner on 07 January 2016.
Abstract. The universal soil loss equation (USLE) is a product of six factors: (1)
rainfall erosivity, (2) soil erodibility, (3) slope length, (4) slope steepness, (5) cover
and management, and (6) support practice, and is widely used to estimate average
annual soil loss. The cover and management variable, called the C factor, represents
the effect of cropping and management practices on erosion rates in agriculture,
and the effect of ground, tree and grass canopy covers on reduction of soil loss in
non-agriculture situation. This study compared three traditional and three geostatistical
methods for mapping the C factor. They included vegetation classification
with average, linear and log–linear regression for C factor assignment, sequential
Gaussian cosimulations with and without Thematic Mapper (TM) images, and
colocated cokriging with TM images. The coefficient of correlation between estimates
and observations varied from 0.4888 to 0.7317, and the root mean square error
(RMSE) from 0.0159 to 0.0203. The sequential Gaussian cosimulation with a TM
ratio image resulted in the highest correlation and the smallest RMSE, and reproduced
the best and most detailed spatial variability of the C factor. This method
may thus be recommended for mapping the C factor. It is also expected that this
method could be applied to image-based mapping in other disciplines.
1. Introduction
Since 1998 we have been working on a larger project ‘Error and Uncertainty
Analysis for Ecological Modeling and Simulation’. Spatial prediction or mapping
and uncertainty analysis of soil loss by the universal soil loss equation (USLE)
(Wischmeier and Smith 1978) or revised USLE (RUSLE) (Renard et al. 1997) are
being carried out as the case study of this project. Both models are being widely
used to estimate soil loss in agriculture and environmental management. In the
models, soil loss A = R × K × L × S × C × P, where R, K, L, S, C and P, respectively,
are rainfall erosivity, soil erodibility, slope length, slope steepness, vegetation cover
and management, and support practice factor. The objectives of this project are to
*Corresponding author: W503 Turner Hall, 1102 S. Goodwin Ave., University of Illinois,
Urbana, Illinois 61801, USA; e-mail gertner@uiuc.edu.
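The USLE product stated in the introduction can be illustrated with a minimal Python sketch. The function name and the factor values are hypothetical and serve only to show the arithmetic; they are not taken from the study.

```python
def usle_soil_loss(r, k, l, s, c, p):
    """Average annual soil loss A = R * K * L * S * C * P."""
    return r * k * l * s * c * p

# Hypothetical factor values for a single pixel (illustration only).
example_a = usle_soil_loss(r=100.0, k=0.3, l=1.2, s=0.8, c=0.05, p=1.0)

# Because A is a plain product, halving the C factor halves the soil loss.
example_a_half_c = usle_soil_loss(r=100.0, k=0.3, l=1.2, s=0.8, c=0.025, p=1.0)
```

The multiplicative form is why errors in the mapped C factor carry over proportionally into the soil loss estimate.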
Remote sensing data acquired by a sensor recording spectral signals from the
ground can be considered to be a model of the ground surface. The spatial variability
of the ground characteristics is coded in the remote sensing data. A cross-semivariogram
or cross-correlogram used in these geostatistical methods can capture
the spatial correlation between the ground characteristics and image data. The spatial
correlation is related to the field plot size used for data collection and the spatial resolution
of the images. Wang et al. (2000) found that when the plot size and spatial resolution
were 90 m by 90 m for the vegetation cover mapping in this study area, the semivariograms
from the vegetation cover measurements were highly correlated with Landsat
Thematic Mapper (TM) data in terms of spatial structure.
The above idea has been supported by other studies. For example, Barata et al.
(1996) used a cokriging method to map forest cover with TM images and a significant
improvement was found. Hunner et al. (2000) modelled forest stand structure by
cokriging, where the secondary variables included topographic, TM images and their
vegetation indices. The authors suggested that cokriging was the best compared with
other methods. Wallerman (2000) compared cokriging and a simple regression model
for estimating forest stem volume using Landsat TM data, and concluded that
cokriging was much better than the regression model.
The results above show that cokriging is promising. However, the error variances
depend only on the local data configuration and measure local uncertainty, not spatial
uncertainty. The interpolation may also smooth out local details of spatial
variability unevenly. That is, the smoothing is minimal close to the data locations and increases
as the locations to be estimated get farther away from the sample locations. The
mapping can be improved by sequential Gaussian cosimulation using colocated cokriging
with remote sensing data. In this method, a number of realizations for each
location are generated conditional to the sample data and previously simulated values
within a given neighbourhood, and the image datum at the estimated location, and are used to
obtain an expectation value and variance.
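The last step described above, turning a stack of realizations into an expectation value and a variance for each location, can be sketched as follows. This is a minimal illustration assuming the realizations are already available as a NumPy array; the random array below is only a hypothetical stand-in for cosimulation output, and the function name is not from the paper.

```python
import numpy as np

def summarize_realizations(realizations):
    """Per-pixel expectation and variance across simulated maps.

    realizations: array of shape (n_realizations, n_rows, n_cols).
    """
    realizations = np.asarray(realizations, dtype=float)
    expectation = realizations.mean(axis=0)
    # Population variance across realizations at each pixel.
    variance = realizations.var(axis=0)
    return expectation, variance

rng = np.random.default_rng(0)
# Hypothetical stand-in: 100 realizations of a 4 x 5 pixel map.
sims = rng.normal(loc=0.05, scale=0.02, size=(100, 4, 5))
mean_map, var_map = summarize_realizations(sims)
```

Unlike a kriging variance, the variance computed this way depends on the simulated values themselves, not only on the data configuration.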
This study presents a methodological improvement to map the vegetation cover
C factor for soil loss estimation using a sequential Gaussian cosimulation with
Landsat TM images. This method combines the measurements of the C factor and
image data in auto- and cross-spatial variability to derive conditional cumulative
density functions. From these functions, a number of realizations are randomly
drawn and used to calculate an expectation value for each pixel and its variances
measuring spatial uncertainty. This approach is compared with two other
geostatistical and three traditional methods for mapping.
(0.5 m, 1.5 m, ..., 99.5 m) along the transect. The ground and vegetation cover
percentages of each plot were calculated by dividing the total number of the covered
points by the total points measured (×100%). Canopy vegetation heights were
recorded by species at 0.1 m height intervals up to 2 m, and at 0.5 m intervals up to
8 m in height. The main vegetation cover types obtained were tree, shrub, grass,
mixed, bare land and water. The vegetation cover C factor was calculated for each
field plot using the approach described by Wischmeier and Smith (1978).
A scene of Landsat TM images dated 16 October 1989, at a spatial
resolution of 30 m by 30 m, was acquired. These images consisted of band 1:
0.45–0.53 μm, band 2: 0.52–0.60 μm, band 3: 0.63–0.69 μm, band 4: 0.76–0.90 μm,
band 5: 1.55–1.75 μm, and band 7: 2.08–2.35 μm. They were geo-referenced to the
Universal Transverse Mercator (UTM) projection and re-sampled to a resolution of
90 m×90 m by averaging blocks of 3×3 pixels.
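The coarsening from 30 m to 90 m pixels by 3×3 averaging can be sketched with NumPy. This is an illustrative sketch, not the software used in the study; the function name is hypothetical, and trimming incomplete edge blocks is an assumption made here for simplicity.

```python
import numpy as np

def block_average(band, factor=3):
    """Coarsen a 2D band by averaging non-overlapping factor x factor blocks."""
    band = np.asarray(band, dtype=float)
    rows, cols = band.shape
    # Assumption: drop trailing rows/columns that do not fill a whole block.
    rows -= rows % factor
    cols -= cols % factor
    trimmed = band[:rows, :cols]
    blocks = trimmed.reshape(rows // factor, factor, cols // factor, factor)
    return blocks.mean(axis=(1, 3))

# Hypothetical 6 x 6 band at 30 m becomes a 2 x 2 band at 90 m.
band_30m = np.arange(36, dtype=float).reshape(6, 6)
band_90m = block_average(band_30m, factor=3)
```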
3. Methods
Three traditional and three geostatistical methods were used in this study for
spatial interpolation of the vegetation cover C factor, that is, mapping the C factor
using a sample dataset and a scene of six Landsat TM images. Three traditional
methods were typically point-in-polygon or point-in-stratum, that is, vegetation
classification with pixel value assignment using (i) average cross-category; (ii) linear
regression model cross-category; and (iii) log–linear regression models cross-category.
The three geostatistical methods were (i) colocated cokriging with a TM ratio image;
and sequential Gaussian cosimulation (ii) with and (iii) without the TM ratio image.
From all 215 sample plots, 31 plots were randomly selected and used as the test
dataset. The remaining 184 sample plot data were used for developing spatial
interpolation models. For the traditional methods, the image data used were all six
TM bands. For two geostatistical methods, the image data employed were a ratio
image having the highest correlation with the C factor values. Geostatistical Software
Library (GSLIB) and VARIOWIN software for spatial data analysis in 2D (Pannatier
1996) were used for development of three geostatistical methods (Deutsch and
Journel 1998), Geographic Resources Analysis Support System (GRASS 1993) for
three traditional methods, and ArcView GIS (Hutchinson and Daniel 1997) for
displaying the raster data.
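The random division of the 215 sample plots into 184 modelling plots and 31 test plots can be sketched as below. The seed and the function name are assumptions for illustration; the paper does not state how the random selection was implemented.

```python
import numpy as np

def split_plots(n_plots=215, n_test=31, seed=0):
    """Randomly split plot indices into modelling and test sets."""
    rng = np.random.default_rng(seed)  # seed is an arbitrary assumption
    shuffled = rng.permutation(n_plots)
    test_idx = np.sort(shuffled[:n_test])
    train_idx = np.sort(shuffled[n_test:])
    return train_idx, test_idx

train_idx, test_idx = split_plots()
```

Holding the test plots out of variogram modelling and regression fitting keeps the later accuracy comparison independent of the data used to build the maps.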
The purpose of the transformations was to improve the correlation of the image
data with vegetation cover by reducing the redundant information due to high
correlation between the images and by increasing variation of the image data (Barata
et al. 1996).
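$$\gamma_z(h) = \frac{1}{2N(h)} \sum_{\alpha=1}^{N(h)} \left( z(u_\alpha) - z(u_\alpha + h) \right)^2 \tag{8}$$
where $N(h)$ is the number of pairs of data locations separated by the Euclidean distance $h$, and $z(u_\alpha)$ and $z(u_\alpha + h)$ are
data values of variable Z at spatial locations $u_\alpha$ and $u_\alpha + h$, respectively. Similarly,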
3654 G. Wang et al.
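$$\gamma_{Sph}(h) = \begin{cases} 1.5\,\dfrac{h}{a} - 0.5\left(\dfrac{h}{a}\right)^3 & \text{if } h \le a \\ 1 & \text{otherwise} \end{cases} \tag{10}$$
$$\gamma_{Exp}(h) = 1 - \exp\left(-\frac{3h}{a}\right) \tag{11}$$
$$\gamma_{Gau}(h) = 1 - \exp\left(-\frac{3h^2}{a^2}\right) \tag{12}$$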
where a is the range parameter of the spherical model and the practical range parameter
of the exponential and Gaussian models, defined as the distance at which the model
value is at 95% of the sill. The range parameters provide the range of spatial
dependence of the variable. Within the range, observations can be considered spatially
dependent, and beyond the range, observations can be considered essentially independent.
As the distance h increases, the semivariogram and cross-semivariogram
vary and approach their limit values.
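The three unit-sill models of equations (10) to (12) can be written as short Python functions. This is a sketch for illustration; the function names are assumptions. It also shows why a is called a practical range for the exponential and Gaussian models: at h = a both reach 1 − exp(−3), about 95% of the sill.

```python
import numpy as np

def spherical(h, a):
    """Equation (10): reaches the unit sill exactly at h = a."""
    h = np.asarray(h, dtype=float)
    ratio = h / a
    return np.where(h <= a, 1.5 * ratio - 0.5 * ratio**3, 1.0)

def exponential(h, a):
    """Equation (11): a is the practical range."""
    h = np.asarray(h, dtype=float)
    return 1.0 - np.exp(-3.0 * h / a)

def gaussian(h, a):
    """Equation (12): a is the practical range."""
    h = np.asarray(h, dtype=float)
    return 1.0 - np.exp(-3.0 * h**2 / a**2)

# Model values at the range a = 2.0 (arbitrary illustrative value).
at_range = {
    "spherical": float(spherical(2.0, 2.0)),
    "exponential": float(exponential(2.0, 2.0)),
    "gaussian": float(gaussian(2.0, 2.0)),
}
```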
The spatial variability of a variable may be a linear combination of the models
above. The simplest example is $\gamma(h) = c_0 + c_1 \gamma_{Sph}(h)$, where $c_0$ and $c_1$ respectively
are called nugget and structure parameters, and $c = c_0 + c_1$ is the sill parameter
(figure 1). The nugget, structure and sill parameters account for nugget variance,
structure variance and total variance of the spatial variability. The nugget variance
when h=0 can be considered the noise term, implying short distance variability and
measurement error in remote sensing.
Figure 1. Examples of spherical (a) and Gaussian (b) models with their parameters.
Improvement in mapping by geostatistical methods 3655
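$$\begin{aligned}
\sum_{\beta=1}^{n(u)} \lambda_\beta^{sck}(u)\, C_{zz}(u_\alpha - u_\beta) + \lambda_y^{sck}(u)\, C_{zy}(u_\alpha - u) &= C_{zz}(u_\alpha - u), \quad \alpha = 1, \ldots, n(u) \\
\sum_{\beta=1}^{n(u)} \lambda_\beta^{sck}(u)\, C_{yz}(u - u_\beta) + \lambda_y^{sck}(u)\, C_{yy}(0) &= C_{yz}(0)
\end{aligned} \tag{14}$$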
where $C_{zz}(h)$ and $C_{yy}(h)$ are the covariance functions of the primary and secondary
variables at a separation distance h. $C_{zy}(h)$ is the cross-covariance function between
the two variables and $h = u_\alpha - u_\beta$, or $u_\alpha - u$, $u - u_\beta$, and 0. The relationship between
covariance and semivariogram functions is $C(h) = C(0) - \gamma(h)$.
To derive the weights, the covariance and cross-covariance (or semivariogram)
functions should be modelled together. The conditions for obtaining solutions are
that permissible covariance models are used, and that all diagonal elements and all principal
minor determinants of the matrix consisting of nugget and sill parameters are
non-negative. This is a very complicated and tedious task. An alternative is that
the cross-covariance $C_{zy}(h)$ can be approximated by Markov models in terms of spatial
covariance, as in equation (15):
$$C_{zy}(h) = C_{zy}(0) \times \frac{C_{zz}(h)}{C_{zz}(0)} \tag{15}$$
In addition, the approximation can also be made by multiplying the cross-spatial
correlogram between the primary and secondary variable at the separation
distance h=0, $\rho_{zy}(0)$, by $\rho_{yy}(h)$, the auto-spatial correlogram of the secondary variable,
that is, equation (16).
$$\rho_{zy}(h) = \rho_{zy}(0)\,\rho_{yy}(h) \tag{16}$$
This equation is called Markov model MM2 (Journel 1999) and was applied in
colocated cokriging with TM images. When the distance h=0, the cross-spatial
correlogram equals a correlation coefficient in traditional statistics. For more details,
readers should refer to Journel (1999), and Shmaryan and Journel (1999).
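The two Markov approximations of equations (15) and (16) reduce to simple rescalings, as the following sketch shows. The function names and the secondary-variable correlogram values are hypothetical; the value 0.608 is the colocated correlation between the C factor and ratio image 5 reported in the results.

```python
import numpy as np

def mm1_cross_covariance(c_zy0, c_zz_h, c_zz0):
    """Equation (15): C_zy(h) = C_zy(0) * C_zz(h) / C_zz(0)."""
    return c_zy0 * np.asarray(c_zz_h, dtype=float) / c_zz0

def mm2_cross_correlogram(rho_zy0, rho_yy_h):
    """Equation (16): rho_zy(h) = rho_zy(0) * rho_yy(h)."""
    return rho_zy0 * np.asarray(rho_yy_h, dtype=float)

# Hypothetical auto-correlogram of the secondary variable at three lags.
rho_yy = np.array([1.0, 0.6, 0.3])
rho_zy = mm2_cross_correlogram(0.608, rho_yy)
```

Under either model only the colocated correlation and one auto-covariance or auto-correlogram need to be modelled, which avoids fitting a full cross-covariance model.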
4. Results
The coefficients of correlation between vegetation cover C factor and Landsat
TM images are listed in table 1. The correlation varied from 0.402 to 0.586. The C
factor had the highest correlation with TM band 7, then band 3, band 5, band 1,
band 2 and band 4. The TM bands with higher correlation had larger coefficients
of variation and the variation coefficient of TM band 4 was the smallest. The
correlation between these TM bands except for TM band 4 was very high, which
implied redundant information. TM band 7 or 3 may be the best image variable
for mapping the C factor; however, TM band 4 may also be used because it carries less
redundant information.
According to the analysis above, seven transformed image variables were derived
mainly from TM bands 7, 3 and 4. Compared with the original TM bands, the
normalized difference vegetation index had a much weaker correlation with the C factor;
however, the other six TM ratios led to improvement in the correlation. The largest
correlation was obtained with ratio image 5, (TM3+TM7)/TM4. Ratio image 5 was
thus employed as the secondary variable in two geostatistical methods, that is,
colocated cokriging and sequential Gaussian cosimulation with TM images.
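Computing ratio image 5 and its correlation with the C factor can be sketched as follows. The band values and C factor values below are hypothetical and only demonstrate the calculation; the function names are assumptions, and the sketch assumes TM4 is nonzero at every pixel.

```python
import numpy as np

def ratio_image_5(tm3, tm4, tm7):
    """Ratio image 5: (TM3 + TM7) / TM4, assuming TM4 is nonzero."""
    tm3, tm4, tm7 = (np.asarray(b, dtype=float) for b in (tm3, tm4, tm7))
    return (tm3 + tm7) / tm4

def pearson_r(x, y):
    """Pearson correlation coefficient between two arrays."""
    return float(np.corrcoef(np.ravel(x), np.ravel(y))[0, 1])

# Hypothetical pixel values at four plot locations (illustration only).
tm3 = np.array([20.0, 30.0, 40.0, 50.0])
tm4 = np.array([80.0, 60.0, 50.0, 40.0])
tm7 = np.array([10.0, 20.0, 35.0, 50.0])
c_factor = np.array([0.01, 0.03, 0.05, 0.09])

ratio5 = ratio_image_5(tm3, tm4, tm7)
r = pearson_r(ratio5, c_factor)
```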
The semivariogram of the C factor, $\gamma_C(h)$, was modelled using the 184 original sample
data (figure 2(a)) and their normal scores with standardization (figure 2(b)). The lag
distance and lag tolerance used were 1.0 and 0.5 km. The two experimental semivariograms
were fitted using Gaussian and spherical models, respectively, equations
(18) and (19).
Sample data:
$$\gamma_C(h) = 0.00016 + 0.00045\left[1 - \exp\left(-\frac{3h^2}{2.4^2}\right)\right] \tag{18}$$
Sample data and standardization:
$$\gamma_C(h) = 0.23 + 0.77\left[1.5\,\frac{h}{2.6} - 0.5\left(\frac{h}{2.6}\right)^3\right] \tag{19}$$
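The two fitted models in equations (18) and (19) can be evaluated directly. The sketch below uses only the coefficients given in those equations; the function names are assumptions, and holding the spherical model at its sill beyond the 2.6 km range follows the general form in equation (10) rather than anything stated with equation (19).

```python
import math

def gamma_sample(h_km):
    """Equation (18): nugget 0.00016 plus a Gaussian structure, range 2.4 km."""
    return 0.00016 + 0.00045 * (1.0 - math.exp(-3.0 * h_km**2 / 2.4**2))

def gamma_normal_score(h_km):
    """Equation (19): nugget 0.23 plus a spherical structure, range 2.6 km."""
    if h_km >= 2.6:
        # Assumption: constant sill beyond the range, as in equation (10).
        return 0.23 + 0.77
    ratio = h_km / 2.6
    return 0.23 + 0.77 * (1.5 * ratio - 0.5 * ratio**3)

sill_sample = 0.00016 + 0.00045          # total sill of equation (18)
value_at_range = gamma_sample(2.4)       # about 95% of the structure reached
sill_normal_score = gamma_normal_score(10.0)
```

The sill of the normal-score model is 1.0, as expected for standardized data, while the nugget accounts for 23% of it.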
Table 1. Correlation between vegetation cover C factor and Landsat TM and ratio images.
(Note: NDVI: (TM4−TM3)/(TM4+TM3); Ratio 1: TM3/TM4; Ratio 2: TM7/TM4;
Ratio 3: (TM3×TM7)/TM4; Ratio 4: (TM3×TM5)/TM4; Ratio 5: (TM3+TM7)/TM4;
and Ratio 6: (TM2+TM3+TM7)/TM4.)
Figure 2. Experimental (dots) and modelled (line) semivariograms of the C factor using
the original sample data for modelling (a) and normal score transformation with
standardization (b). Note: distance unit is km.
The semivariogram using raster data of ratio image 5, $\gamma_R(h)$, is given in figure 3 and
equation (20).
$$\gamma_R(h) = 0.002 + 0.130\left[1 - \exp\left(-\frac{3h}{2.0}\right)\right] + 0.018\left[1 - \exp\left(-\frac{3h^2}{14.4^2}\right)\right] + 0.002\left[1.5\,\frac{h}{16.2} - 0.5\left(\frac{h}{16.2}\right)^3\right] \tag{20}$$
The lag distance and lag tolerance were 90 m and 45 m, respectively. The semivariogram
was fitted by combining a common nugget of 0.002 with an exponential, a Gaussian,
and a spherical model.
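The experimental semivariograms that these models are fitted to follow equation (8), with pairs grouped by a lag distance and a lag tolerance. The sketch below is a minimal omnidirectional implementation for illustration; it is not GSLIB or VARIOWIN code, and the function name, the binning rule at bin edges, and the example data are assumptions.

```python
import numpy as np

def experimental_semivariogram(coords, values, lag, tolerance, n_lags):
    """Omnidirectional experimental semivariogram following equation (8)."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    # All unordered pairs (i < j) of sample locations.
    i, j = np.triu_indices(len(values), k=1)
    dist = np.linalg.norm(coords[i] - coords[j], axis=1)
    sq_diff = (values[i] - values[j]) ** 2
    centers = lag * np.arange(1, n_lags + 1)
    gamma = np.full(n_lags, np.nan)
    counts = np.zeros(n_lags, dtype=int)
    for k, center in enumerate(centers):
        # Half-open bins so a pair on a bin edge is counted only once.
        in_bin = (dist > center - tolerance) & (dist <= center + tolerance)
        counts[k] = in_bin.sum()
        if counts[k] > 0:
            gamma[k] = sq_diff[in_bin].sum() / (2.0 * counts[k])
    return centers, gamma, counts

# Hypothetical transect: four samples 1 km apart with a linear trend.
xy = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0]])
z = np.array([0.0, 1.0, 2.0, 3.0])
lags, gamma_exp, n_pairs = experimental_semivariogram(xy, z, lag=1.0, tolerance=0.5, n_lags=3)
```

With a tolerance of half the lag distance, as used in this study, the bins tile the distance axis without gaps.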
In figure 4, the two Markov models, MM1 (equation (17)) and MM2 (equation (16)), were
evaluated by comparing them with the sample cross-correlogram between the C factor
and ratio image 5. When the separation distance h was zero, the spatial
cross-correlograms from the Markov models were almost 0.608. This is the coefficient of
correlation between the C factor and ratio image 5. When the distance was less than
5000 m, the approximation from these two models might lead to underestimation of
the sample cross-correlogram. The underestimation was more significant for
MM1 than for MM2. The models MM1 and MM2 respectively were employed
in the sequential Gaussian cosimulation and colocated cokriging with the ratio
image 5.
Figure 3. Experimental (dots) and modelled (line) semivariograms of the ratio image 5. Note:
distance unit is km.
Figure 4. Experimental cross-correlograms between C factor and ratio image 5 using sample
data, Markov models MM1 and MM2.
The sample data of the C factor and the results by colocated cokriging using the ratio
image 5 are shown in figure 5. Most of the C factor values were small in the
east and north-east, and large in the west, south-west and north-west. The spatial
distribution of the sample data was reproduced in the estimation map in
figure 5(b). However, smoothing of the estimates was significant. At the sample
locations, the interpolation honoured the sample data and the variances of the estimates were
zero. The variances increased rapidly as the estimated locations moved away from the
sample locations, which implied the effect of data configuration.
The results by the sequential Gaussian cosimulation with ratio image 5 are
illustrated in figure 6. As in the sample data, the estimates of the C factor in
figure 6(a) were small in the east and north-east, and large in the west, south-west
and north-west. Smoothing of the estimates was not significant compared with that by the
colocated cokriging with the ratio image 5. More detailed spatial variability was
also found in the variance map of the estimates in figure 6(b). At the locations with
smaller estimates and denser samples, the variances were smaller; elsewhere the
variances were larger. The variances depended not only on data configuration
but also on the sample data themselves. Less smoothing occurred in the variance map by
the cosimulation than in that by the colocated cokriging.
For comparison with the results above by the cosimulation with the ratio image 5,
the estimate and variance maps of the C factor were also derived by the sequential
Gaussian simulation without any TM images (figure 7). The spatial distribution of
the C factor estimates and variances was not as distinct as in the corresponding maps
in figure 6. The simulation without TM images led to more smoothing and a more significant
effect of data configuration compared with the cosimulation with ratio image 5.
Figure 5. (a) C factor samples, (b) C factor estimate and (c) C factor variance maps using
colocated cokriging and ratio image 5.
Figure 6. (a) C factor estimate and (b) C factor variance maps using sequential Gaussian
cosimulation and ratio image 5.
If the variance image by the simulation without TM images was subtracted from
the variance image by the cosimulation with the ratio image 5, an image accounting
for differences in variances was obtained (figure 7(c)). Using the ratio image 5
resulted in reduced estimation variances at 95% of all the pixels in this study area,
no change at 3%, and increased estimation variance at only 2%. The average
estimation variances with and without the ratio image 5 respectively were 0.00021
and 0.00047, a reduction of 0.00026. The reduction rate in average variance
was about 56%. Using the ratio image 5, therefore, the reduction in the map
uncertainty is very significant.
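The summary of the variance difference image can be reproduced in outline as follows. This is a sketch with hypothetical variance maps; the function name and the `tol` parameter used to decide "no change" are assumptions, since the paper does not say how unchanged pixels were identified.

```python
import numpy as np

def variance_change_summary(var_with, var_without, tol=0.0):
    """Share of pixels with reduced, unchanged and increased variance."""
    var_with = np.asarray(var_with, dtype=float)
    var_without = np.asarray(var_without, dtype=float)
    diff = var_with - var_without  # negative where the image reduced variance
    n = diff.size
    return {
        "reduced": float((diff < -tol).sum()) / n,
        "unchanged": float((np.abs(diff) <= tol).sum()) / n,
        "increased": float((diff > tol).sum()) / n,
        "relative_reduction": 1.0 - float(var_with.mean()) / float(var_without.mean()),
    }

# Hypothetical variance values at four pixels (illustration only).
summary = variance_change_summary([1.0, 2.0, 3.0, 5.0], [2.0, 2.0, 4.0, 4.0])

# Relative reduction from the rounded averages reported in the text.
reported_reduction = 1.0 - 0.00021 / 0.00047
```

With the rounded averages 0.00021 and 0.00047 the ratio gives roughly 55%, close to the reported figure of about 56%; the small gap presumably reflects rounding of the averages.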
Four vegetation categories including grass, shrub, tree, and mixed were identified
by the classification. Bare land was missed and water was outside the sampling area.
The averages of the C factor for grass, shrub, tree, and mixed were 0.0608, 0.0347,
0.0324 and 0.0475 respectively. Because of higher cover and a greater effect on reducing
the impact of raindrops, tree had the smallest C factor, followed by shrub and mixed vegetation, and grass
had the largest C factor. The mean values were assigned to the pixels belonging to
the corresponding vegetation categories. Linear and log–linear regression models were
constructed within each category and are listed in table 2. Because only a
few samples were available for shrub, the two most significant variables, i.e. TM5 and TM7,
were selected in the regression models of shrub and the other TM bands were removed.
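The average cross-category method amounts to a lookup from the classified map to the class means reported above. The sketch below uses those four means; the function name, the string class labels, and the choice to leave unlisted classes such as water as NaN are assumptions for illustration.

```python
import numpy as np

# Class means of the C factor as reported in the text.
CLASS_MEAN_C = {"grass": 0.0608, "shrub": 0.0347, "tree": 0.0324, "mixed": 0.0475}

def assign_class_means(class_map, class_means=CLASS_MEAN_C, fill=np.nan):
    """Assign each pixel the mean C factor of its vegetation category."""
    class_map = np.asarray(class_map)
    out = np.full(class_map.shape, fill, dtype=float)
    for name, mean_c in class_means.items():
        out[class_map == name] = mean_c
    return out

# Hypothetical 2 x 2 classified map (illustration only).
classified = np.array([["grass", "tree"], ["mixed", "water"]])
c_map = assign_class_means(classified)
```

Because every pixel in a category receives the same value, this method cannot reproduce spatial variability within a category, which is what the regression and geostatistical methods try to recover.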
In figure 8, the estimation maps of the C factor from the three traditional methods
are shown. The estimates had a spatial distribution similar to those obtained by the three
geostatistical methods. That is, the smaller estimates were located in the east and
north-east, and the larger ones in the west, south-west and north-west. The estimates
varied over space in less detail than those by the colocated cokriging and cosimulation
with the ratio image 5, but in more detail than those by the simulation without
TM images.
Figure 7. C factor estimate (a) and C factor variance (b) maps using sequential Gaussian
simulation without ratio image 5. (c) The difference image is the difference in variance
between the simulations with and without the ratio image 5.
Comparison of the six interpolation methods was made in figure 9 based on the
differences between the estimates and the test data. All the methods resulted in slight
overestimates at the locations with smaller C factor values and underestimates at
the locations with larger C factor values. The three traditional methods and the
simulation without TM images might lead to slightly larger maximum errors than
the colocated cokriging and cosimulation with the ratio image 5.
[Table 2 column headings: Vegetation; Intercept of models; TM1; TM2; TM3; TM4; TM5; TM7.]
The six methods were also compared in table 3 using the test data. All six mean
estimates fell into the confidence interval. The colocated cokriging with the ratio
image 5 and simulation without TM images provided the minimum mean difference
between the estimates and observations of the C factor. However, the cosimulation
with the ratio image 5 led to the smallest range of the differences, the largest coefficient
of correlation and the smallest root mean square error (RMSE) between the estimates
and observations, followed by colocated cokriging with the ratio image 5. The linear
regression cross-category resulted in the smallest correlation and largest RMSE.
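The comparison statistics named above (mean difference, range of differences, coefficient of correlation and RMSE) can be computed from paired estimates and test observations as in the following sketch. The function name and the example values are hypothetical and are not the study's data.

```python
import numpy as np

def validation_stats(estimates, observations):
    """Summary statistics of estimates against test observations."""
    est = np.asarray(estimates, dtype=float)
    obs = np.asarray(observations, dtype=float)
    diff = est - obs
    return {
        "mean_difference": float(diff.mean()),
        "range_of_differences": float(diff.max() - diff.min()),
        "correlation": float(np.corrcoef(est, obs)[0, 1]),
        "rmse": float(np.sqrt(np.mean(diff**2))),
    }

# Hypothetical estimates and observations at four test plots.
stats = validation_stats([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 2.5, 4.0])
```

A method can have a near-zero mean difference and still a large RMSE, which is why the text ranks the methods on several statistics rather than on bias alone.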
Figure 8. C factor estimate maps (a) average cross-category (AverageC), (b) linear regression
cross-category (LregC) and (c) log–linear regression cross-category (LogLregC).
colocated cokriging and Gaussian cosimulation with the ratio image 5 reproduced
better and more detailed spatial variability of the vegetation cover C factor. At the
same time, both gave uncertainty measures, that is, error variances at the non-sample
locations and areas. Theoretically, the colocated cokriging, as an interpolation
method, aims at providing the best estimates at every location, and does not aim to
reproduce spatial variability. On the other hand, the Gaussian cosimulation tries to
reproduce spatial variability and may not result in the best predictions. In
this study, the cosimulation led to slightly better estimates than the colocated
cokriging. The differences are probably due mainly to the normal score transformation
and the different Markov model used for the cosimulation. Although
the cosimulation was about 10 times more expensive than the colocated cokriging
in terms of computing time, the former was very worthwhile in this study because
spatial variability was very important in prediction of soil erosion and uncertainty
analysis.
Figure 9. Comparison between six methods based on differences between estimates and
observations for spatial interpolation of C factor using colocated cokriging (Co_Cok:
a), sequential Gaussian cosimulation with TM (SGWTM: b) and without TM images
(SGWoTM: c), average cross-category (AverageC: d), linear regression (LregC: e) and
log–linear regression (LogLregC: f).
Additionally, a simulated value at a non-sample location was drawn from a
conditional cumulative density function derived conditional to the sample data, the
previously simulated values and the image datum at this location. Thus, the Gaussian
cosimulation with the ratio image 5 avoided illogical estimates such as negative and
extremely large values, which are deemed to be a shortcoming of the two traditional
methods with linear or log–linear regression modelling.
Table 3. Comparison between six methods based on statistical parameters of estimates and
observations. The absolute parameter values were ranked in parentheses. Confidence
interval: 0.0416–0.0574. Co_Cok, SGWTM, SGWoTM, AverageC, LregC and
LogLregC, respectively, are colocated cokriging with TM, sequential Gaussian cosimulation
with TM, sequential Gaussian simulation without TM, average cross-category,
linear regression cross-category, and log–linear regression cross-category.
In this study, the values of vegetation cover C factor from the sample data were
assumed to be the observations. In fact, the values were calculated as a function of
ground cover, aerial cover and minimum average height of vegetation. The spatial
uncertainty and error propagation from these three variables and the function
parameters to the C factor prediction were not analysed. This will be done in the
future and error budgets will be generated.
Acknowledgments
We are grateful to SERDP (Strategic Environmental Research and Development
Program) for providing support for this study, to US Army Corps of Engineers,
Construction Engineering Research Laboratory (USA-CERL) for the datasets, to
Xiangyun Xiao, Department of NRES, University of Illinois for calculating vegetation
cover percentages of ground plots, and to Prof. Andre Journel, Stanford University
for the colocated cokriging program.
References
Almeida, A. S., and Journel, A. G., 1994, Joint simulation of multiple variables with a
Markov-type coregionalization model. Mathematical Geology, 26, 565–588.
Barata, M. T., Nunes, M. C., Sousa, A. J., Muge, F. H., and Albuquerque, M. T., 1996,
Geostatistical estimation of forest cover areas using remote sensing data. In
Geostatistics Wollongong ‘96, 2, edited by E. Y. Baafi and N. A. Schofield (Dordrecht:
Kluwer Academic), pp. 1244–1257.
Benkobi, L., Trlica, M. J., and Smith, J. L., 1994, Evaluation of a refined surface cover
subfactor for use in RUSLE. Journal of Range Management, 47, 74–78.
Biesemans, J., Meirvenne, M. V., and Gabriels, D., 2000, Extending the RUSLE with the
Monte Carlo error propagation technique to predict long-term average off-site
sediment accumulation. Journal of Soil and Water Conservation, 55, 35–42.
Deutsch, C. V., and Journel, A. G., 1998, Geostatistical Software Library and User’s Guide
(New York: Oxford University Press).
Diersing, V. E., Shaw, R. B., and Tazik, D. J., 1992, US Army Land Condition-Trend
Analysis (LCTA) Program. Environmental Management, 16, 405–414.
Gomez-Hernandez, J. J., and Journel, A. G., 1992, Joint sequential simulation of