Improved Long-Term Mean Annual Rainfall Fields For Colombia
$$\hat{\gamma}(h) = \frac{1}{2N(h)} \sum_{\alpha=1}^{N(h)} \left[ z(u_\alpha) - z(u_\alpha + h) \right]^2 \tag{3}$$
where N(h) is the number of pairs of data locations a
vector h apart. The semivariogram is a function of both
distance and direction, and therefore it is able to deal with
anisotropic spatial patterns. Similar to the inverse square distance method, geostatistical interpolation amounts to estimating the unknown rainfall depth Z at the unsampled location u as a linear combination of neighbouring observations. For ordinary kriging (OK) the linear estimator is given as
$$Z^*_{OK}(u) = \sum_{\alpha=1}^{n(u)} \lambda^{OK}_{\alpha}(u)\, Z(u_\alpha) \tag{4}$$
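The OK weights $\lambda^{OK}_{\alpha}(u)$ are chosen to minimize the estimation variance, $\mathrm{Var}\{Z^*_{OK}(u) - Z(u)\}$, while ensuring the unbiasedness of the estimator, $E\{Z^*_{OK}(u) - Z(u)\} = 0$. These weights are obtained by solving a system of linear equations known as the ordinary kriging system, such that (Matheron, 1965; Matheron, 1971)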
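$$\begin{cases}
\sum_{\beta=1}^{n(u)} \lambda^{OK}_{\beta}(u)\, \gamma(u_\alpha - u_\beta) + \mu_{OK}(u) = \gamma(u_\alpha - u), & \alpha = 1, 2, \ldots, n(u) \\
\sum_{\beta=1}^{n(u)} \lambda^{OK}_{\beta}(u) = 1
\end{cases} \tag{5}$$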
Diverse variants of kriging implemented herein are
discussed next.
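
As an illustration of Equations (4) and (5), the following minimal sketch assembles and solves the OK system for a handful of points. It is not the implementation used in this work: the helper names (`solve`, `spherical`, `ok_weights`, `ok_estimate`), the isotropic spherical semivariogram and the toy coordinates are assumptions made only for the example.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def spherical(h, nugget, sill, rng):
    """Isotropic spherical semivariogram (assumed model); gamma(0) = 0."""
    if h == 0.0:
        return 0.0
    if h >= rng:
        return nugget + sill
    r = h / rng
    return nugget + sill * (1.5 * r - 0.5 * r ** 3)


def ok_weights(points, target, gamma):
    """Build and solve the system of Equation (5); returns (weights, Lagrange parameter)."""
    n = len(points)
    dist = lambda p, q: math.hypot(p[0] - q[0], p[1] - q[1])
    a = [[gamma(dist(points[i], points[j])) for j in range(n)] + [1.0] for i in range(n)]
    a.append([1.0] * n + [0.0])          # unbiasedness constraint: weights sum to one
    b = [gamma(dist(p, target)) for p in points] + [1.0]
    sol = solve(a, b)
    return sol[:n], sol[n]


def ok_estimate(points, values, target, gamma):
    """Linear estimator of Equation (4)."""
    weights, _ = ok_weights(points, target, gamma)
    return sum(w * z for w, z in zip(weights, values))
```

With a semivariogram that vanishes at the origin, the estimator reproduces the observed value at a sampled location, which is a convenient sanity check for any implementation.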
2.1. Kriging with an external drift
Kriging with an external drift (KED) uses information
from an auxiliary secondary attribute or variable to derive
the local mean of the primary one, Z. For instance,
considering a modelled trend function, m(u), which in
turn can be expressed as a linear function of the secondary
variable, y(u), (Goovaerts, 1997),
$$m_{KED}(u) = c_0(u) + c_1(u)\, y(u) \tag{6}$$
KED uses densely measured external information to
calculate the local mean of the primary attribute Z by
performing OK on the residuals, such that (Goovaerts,
1999):
$$Z^*_{KED}(u) - m_{KED}(u) = \sum_{\alpha=1}^{n(u)} \lambda^{KED}_{\alpha}(u) \left[ Z(u_\alpha) - m_{KED}(u_\alpha) \right] \tag{7}$$
where
$$Z^*_{KED}(u) = \sum_{\alpha=1}^{n(u)} \lambda^{KED}_{\alpha}(u)\, z(u_\alpha), \tag{8}$$
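where the KED weights, $\lambda^{KED}_{\alpha}(u)$, are obtained by solving the system
$$\begin{cases}
\sum_{\beta=1}^{n(u)} \lambda^{KED}_{\beta}(u)\, C(u_\alpha - u_\beta) + \mu^{KED}_{0}(u) + \mu^{KED}_{1}(u)\, y(u_\alpha) = C(u_\alpha - u), & \alpha = 1, 2, \ldots, n(u) \\
\sum_{\beta=1}^{n(u)} \lambda^{KED}_{\beta}(u) = 1 \\
\sum_{\beta=1}^{n(u)} \lambda^{KED}_{\beta}(u)\, y(u_\beta) = y(u)
\end{cases} \tag{9}$$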
where $\mu^{KED}_{0}(u)$ and $\mu^{KED}_{1}(u)$ are the two Lagrangian parameters needed to perform the variance minimization procedure, and $C(u)$ is the spatial autocorrelation value for the distance vector, u. The autocorrelation is related to the variogram, $\gamma(u)$, via $C(u) = \sigma^2 - \gamma(u)$, where $\sigma^2$ is the variance of the regionalized variable.
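
The system of Equation (9) can be sketched in the same way as the OK system, with two extra rows for the constraints on the weights. The code below is only an illustrative sketch: the exponential covariance model, the function names and the toy data are assumptions, and the drift values `drift` stand for the secondary variable y sampled at the data locations.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def cov_exponential(h, sill, rng):
    """Assumed covariance model C(h) = sill * exp(-3 h / rng)."""
    return sill * math.exp(-3.0 * h / rng)


def ked_weights(points, drift, target, drift_target, cov):
    """Build and solve Equation (9); returns (weights, mu0, mu1)."""
    n = len(points)
    dist = lambda p, q: math.hypot(p[0] - q[0], p[1] - q[1])
    a = []
    for i in range(n):
        row = [cov(dist(points[i], points[j])) for j in range(n)]
        a.append(row + [1.0, drift[i]])
    a.append([1.0] * n + [0.0, 0.0])      # weights sum to one
    a.append(list(drift) + [0.0, 0.0])    # weights reproduce the drift at the target
    b = [cov(dist(p, target)) for p in points] + [1.0, drift_target]
    sol = solve(a, b)
    return sol[:n], sol[n], sol[n + 1]


def ked_estimate(points, values, drift, target, drift_target, cov):
    """Linear estimator of Equation (8)."""
    weights, _, _ = ked_weights(points, drift, target, drift_target, cov)
    return sum(w * z for w, z in zip(weights, values))
```

Because of the two constraints, any field that is an exact linear function of the secondary variable is reproduced exactly, whatever covariance model is used. The system is only solvable when the drift is not constant over the data in the neighbourhood.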
2.2. Standardized cokriging
In some cases the primary and secondary data belong to the same variable, but are obtained through different procedures, as in the case of laboratory and field data. In such cases, their uncertainties and experimental variograms usually exhibit different properties. Provided both measurement processes are unbiased and the means of both the primary and secondary variables are deemed equal, the standardized cokriging (SCK) estimator is given by
$$Z^*_{SCK}(u) = \sum_{\alpha_1=1}^{n_1(u)} \lambda^{SCK}_{\alpha_1}(u)\, Z_1(u_{\alpha_1}) + \sum_{\alpha_2=1}^{n_2(u)} \lambda^{SCK}_{\alpha_2}(u)\, Z_2(u_{\alpha_2}) \tag{10}$$
whose weights are the solution of the linear system
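$$\begin{cases}
\sum_{\beta_1=1}^{n_1(u)} \lambda^{SCK}_{\beta_1}(u)\, C_1(u_{\alpha_1} - u_{\beta_1}) + \sum_{\beta_2=1}^{n_2(u)} \lambda^{SCK}_{\beta_2}(u)\, C_{12}(u_{\alpha_1} - u_{\beta_2}) + \mu_{SCK}(u) = C_1(u_{\alpha_1} - u), & \alpha_1 = 1, 2, \ldots, n_1(u) \\
\sum_{\beta_1=1}^{n_1(u)} \lambda^{SCK}_{\beta_1}(u)\, C_{21}(u_{\alpha_2} - u_{\beta_1}) + \sum_{\beta_2=1}^{n_2(u)} \lambda^{SCK}_{\beta_2}(u)\, C_2(u_{\alpha_2} - u_{\beta_2}) + \mu_{SCK}(u) = C_{21}(u_{\alpha_2} - u), & \alpha_2 = 1, 2, \ldots, n_2(u) \\
\sum_{\beta_1=1}^{n_1(u)} \lambda^{SCK}_{\beta_1}(u) + \sum_{\beta_2=1}^{n_2(u)} \lambda^{SCK}_{\beta_2}(u) = 1
\end{cases} \tag{11}$$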
Equation (11) is similar to the ordinary cokriging (OCK) linear system (Goovaerts, 1997, p. 203), except that the original two non-bias conditions for OCK are replaced by a single constraint, represented by the last expression of Equation (11).
2.3. Colocated cokriging
In those cases for which the secondary variable is more densely sampled than the primary one, the OCK system might be numerically unstable because correlations among the secondary data are higher than those among the primary data. Moreover, secondary data near an estimation site tend to obscure the influence of secondary data from farther sites. Such a problem can be overcome by retaining only the secondary datum located at each estimation site (colocated). In such a case, the linear colocated cokriging estimator is given by (Almeida and Journel, 1994)
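$$Z^*_{CCK}(u) - m_1(u) = \sum_{\alpha_1=1}^{n_1(u)} \lambda^{CCK}_{\alpha_1}(u) \left[ Z_1(u_{\alpha_1}) - m_1(u_{\alpha_1}) \right] + \lambda^{CCK}_{2}(u) \left[ Z_2(u) - m_2(u) \right] \tag{12}$$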
where $m_1(u)$ and $m_2(u)$ are the global averages of the primary and secondary data, respectively. In turn, the CCK weights are obtained from the following system of n(u) + 2 linear equations (Xu et al., 1992; Almeida and Journel, 1994):
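$$\begin{cases}
\sum_{\beta_1=1}^{n_1(u)} \lambda^{CCK}_{\beta_1}(u)\, C_1(u_{\alpha_1} - u_{\beta_1}) + \lambda^{CCK}_{2}(u)\, C_{12}(u_{\alpha_1} - u) = C_1(u_{\alpha_1} - u), & \alpha_1 = 1, 2, \ldots, n_1(u) \\
\sum_{\beta_1=1}^{n_1(u)} \lambda^{CCK}_{\beta_1}(u)\, C_{12}(u - u_{\beta_1}) + \lambda^{CCK}_{2}(u)\, C_2(0) = C_{12}(0)
\end{cases} \tag{13}$$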
Both the primary and secondary data need to be standardized if their variances differ significantly (Goovaerts, 1997).
2.4. Colocated cokriging using a Markov type
coregionalization
The colocated cokriging using a Markov type coregionalization (CCKM) entails that primary data $z_1(u)$ colocated with secondary data, $z_2(u)$, screens the influence of any other data, $z_1(u + h)$, located at some distance h, and then (Goovaerts, 1997; Cassiraga, 1999)
$$E\{Z_2(u) \mid Z_1(u) = z,\ Z_1(u + h) = z'\} = E\{Z_2(u) \mid Z_1(u) = z\} \quad \forall\, h, z' \tag{14}$$
Equation (14) means that the dependence of the sec-
ondary variable is limited to the primary colocated
data. From Equation (14), a relation between the spa-
tial autocorrelation and cross-correlations arises, whereby
(Almeida and Journel, 1994; Cassiraga, 1999):
$$C_{12}(h) = \frac{C_{12}(0)}{C_1(0)}\, C_1(h) \tag{15}$$
Copyright 2010 Royal Meteorological Society Int. J. Climatol. (2010)
MEAN ANNUAL RAINFALL FIELDS FOR COLOMBIA
where $C_1$ is the spatial autocorrelation of the primary data, and $C_{12}$ is the spatial cross-correlation between the primary and secondary data.
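
Equation (15) means that, under the Markov assumption, the cross structure does not need to be modelled separately: it is a rescaled copy of the primary one. A minimal sketch follows; the function name and the exponential primary model used in the usage note are hypothetical and serve only to illustrate the rescaling.

```python
import math


def make_markov_cross_covariance(c1, c12_at_zero):
    """Return C12(h) built from the primary covariance C1(h) as in Equation (15).

    c1          : callable giving the primary covariance C1(h)
    c12_at_zero : colocated cross-covariance C12(0)
    """
    c1_at_zero = c1(0.0)

    def c12(h):
        return c12_at_zero / c1_at_zero * c1(h)

    return c12


# Hypothetical primary covariance model, used only for illustration.
def example_c1(h):
    return 1375000.0 * math.exp(-3.0 * h / 280000.0)
```

For instance, `make_markov_cross_covariance(example_c1, 1100000.0)` returns a function whose ratio to `example_c1` is the same constant at every lag.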
2.5. Indicator modelling of local uncertainty
The indicator approach to estimate local (pixel) uncertainty is based on the interpretation of the conditional probability as the conditional expectation of an indicator random variable, defined by
$$I(u; z_k) = \begin{cases} 1 & \text{if } Z(u) \le z_k \\ 0 & \text{if } Z(u) > z_k \end{cases} \tag{16}$$
where $z_k$ is a previously identified threshold for the primary variable Z(u), in our case annual precipitation. According to the projection theorem (Luenberger, 1969, p. 69), the least squares (kriging) estimate of the indicator $I(u; z_k)$ is also the least squares estimate of its conditional expectation. Thus, the conditional cumulative distribution function (CCDF) value, $F(u; z_k|(n)) = P\{Z(u) \le z_k \mid (n)\}$, can be obtained by (co)kriging the unknown primary indicator $i(u; z_k)$ using the indicator transformation of the neighbouring information. Thus, the solution of a (co)kriging system using the indicator transformation as the primary variable provides the cumulative probability of the untransformed variable Z(u) at the unsampled location u for a selected threshold $z_k$ (Journel, 1983; Goovaerts, 1997; Deutsch and Journel, 1998; Cassiraga, 1999).
Then, indicator (co)kriging consists of performing any type of (co)kriging estimation, i.e. simple (co)kriging or ordinary (co)kriging, on the indicator transformation. For the case of a random field estimation, the indicator transformation makes possible (1) modelling the local uncertainty of Z(u) (in our case long-term annual precipitation) at each pixel, and (2) performing risk analysis in decision-making processes (Journel, 1983). Under the indicator approach, it is not possible to estimate the complete CCDF at any given pixel of a random field. Instead, cumulative probabilities $F(u; z_k|(n))$ can be estimated for preselected thresholds, $z_k$, thus producing a discrete CCDF. In general, such thresholds correspond to deciles of the cumulative distribution function (CDF) for the primary variable Z(u). Such a CDF is estimated with the available data sample over the region of study.
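
The indicator coding of Equation (16) with decile thresholds can be sketched as follows. This is a minimal example under stated assumptions: the function names are hypothetical, and the quantile rule (linear interpolation between order statistics) is one of several common choices, not necessarily the one used here.

```python
def decile_thresholds(sample):
    """Nine thresholds z_1..z_9 at the 10%, ..., 90% quantiles of the sample.

    Assumption: quantiles are obtained by linear interpolation between
    order statistics.
    """
    s = sorted(sample)
    n = len(s)
    thresholds = []
    for k in range(1, 10):
        pos = k * (n - 1) / 10
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        thresholds.append(s[lo] + (pos - lo) * (s[hi] - s[lo]))
    return thresholds


def indicator(z, zk):
    """Equation (16): 1 if z <= zk, 0 otherwise."""
    return 1 if z <= zk else 0


def indicator_coding(sample, thresholds):
    """One row of K indicator values per datum, one column per threshold."""
    return [[indicator(z, zk) for zk in thresholds] for z in sample]
```

Each row of the coding is a non-decreasing sequence of zeros and ones, that is, a step CDF located at the observed value.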
Goovaerts (1997) and Cassiraga (1999) provided a suite of indicator (co)kriging algorithms to estimate local (pixel) cumulative probabilities and to define uncertainty models for environmental variables. Once the estimation of local uncertainty models has been performed, the obtained local (pixel) CCDFs frequently show a decreasing behaviour: $F(u; z_k|(n)) > F(u; z_{k+1}|(n))$, with $z_k < z_{k+1}$. The non-decreasing nature of the CCDFs is violated because the (co)kriging estimate is a non-convex linear combination of the data (Goovaerts, 1997). In order to correct such problems (usually referred to as order relation deviations) we use a Gaussian smoothing procedure (Carr, 1994). Finally, to obtain the statistics of the corrected CCDFs ($F(u; z_k|(n))$
$$[F(u; z_k|(n))]^*_{oICK} = \sum_{\alpha_1=1}^{n_1(u)} \lambda^{OCK}_{\alpha_1}(u; z_k)\, I(u_{\alpha_1}; z_k) + \lambda^{OCK}_{2}(u; z_k) \left[ Y(u; z_k) - m_Y(z_k) + F(z_k) \right] \tag{17}$$
where $[F(u; z_k|(n))]$ denotes the estimated cumulative probability distribution function of the primary variable, $P(Z(u) \le z_k)$, for each threshold $z_k$, which in our case corresponds to the kth decile of the global conditional cumulative probability distribution, and $F(z_k)$ represents the mean for the indicator transformation. Implementation of indicator cokriging requires that secondary data be transformed into values of cumulative probability, referred to as conditional previous cumulative probability, $Y(u; z_k)$. This transformation depends on the preselected thresholds of the primary variable, $z_k$. For details refer to Goovaerts (1997, p. 284) and Cassiraga (1999, p. 175). Likewise, in Equation (17), $m_Y(z_k)$ is the mean for $Y(u; z_k)$, the $\lambda^{OCK}(u; z_k)$ are the weights for the ICCKM linear estimator and u is the estimation site. The weights are estimated by solving the CCK linear system for indicator covariances, defined as
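$$\begin{cases}
\sum_{\beta_1=1}^{n_1(u)} \lambda^{OCK}_{\beta_1}(u; z_k)\, C_I(u_{\alpha_1} - u_{\beta_1}; z_k) + \lambda^{OCK}_{2}(u; z_k)\, C_{IY}(u_{\alpha_1} - u; z_k) + \mu_{OCK}(u; z_k) = C_I(u_{\alpha_1} - u; z_k), & \alpha_1 = 1, 2, \ldots, n_1(u) \\
\sum_{\beta_1=1}^{n_1(u)} \lambda^{OCK}_{\beta_1}(u; z_k)\, C_{YI}(u - u_{\beta_1}; z_k) + \lambda^{OCK}_{2}(u; z_k)\, C_Y(0; z_k) + \mu_{OCK}(u; z_k) = C_{YI}(0; z_k) \\
\sum_{\beta_1=1}^{n_1(u)} \lambda^{OCK}_{\beta_1}(u; z_k) + \lambda^{OCK}_{2}(u; z_k) = 1
\end{cases} \tag{18}$$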
O. D.
ALVAREZ-VILLA et al.
The Bayes–Markov coregionalization hypothesis applied to the indicator transform entails that a primary indicator datum located at u, $i(u; z_k)$, screens out the influence of any colocated secondary indicator datum, $y(u; z_k)$, in estimating the primary variable at a different location, $u'$, such that
$$\mathrm{Prob}\{Z(u') \le z_k \mid i(u; z_k),\, y(u; z_k)\} = \mathrm{Prob}\{Z(u') \le z_k \mid i(u; z_k)\} \quad \forall\, u, u', z_k \tag{19}$$
which in turn implies the existence of the following
relations between the covariances and cross-covariances
of the primary and secondary variables (Zhu and Journel,
1993)
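$$\begin{aligned}
C_{IY}(h; z) &= B(z)\, C_I(h; z), \quad \forall\, h \\
C_Y(h; z) &= B^2(z)\, C_I(h; z), \quad \forall\, h > 0 \\
C_Y(0; z) &= V_c^2(z) + V_f^2(z)
\end{aligned}$$
with
$$\begin{aligned}
C_I(h; z) &= \mathrm{Cov}\{I(u; z),\, I(u + h; z)\} \\
m^{(1)}(z) &= E\{Y(u; z) \mid I(u; z) = 1\} \in [0, 1] \\
m^{(0)}(z) &= E\{Y(u; z) \mid I(u; z) = 0\} \in [0, 1] \\
B(z) &= m^{(1)}(z) - m^{(0)}(z) \in [-1, 1] \\
\sigma^2_{(1)}(z) &= \mathrm{Var}\{Y(u; z) \mid I(u; z) = 1\} \\
\sigma^2_{(0)}(z) &= \mathrm{Var}\{Y(u; z) \mid I(u; z) = 0\} \\
F(z) &= \mathrm{Prob}\{Z(u) \le z\} = E\{I(u; z)\} \\
V_c^2(z) &= F(z)\,[1 - F(z)]\, B^2(z) \\
V_f^2(z) &= F(z)\, \sigma^2_{(1)}(z) + [1 - F(z)]\, \sigma^2_{(0)}(z)
\end{aligned} \tag{20}$$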
where $C_I$, $C_Y$ and $C_{IY}$ are the covariances and cross-covariances, respectively, h is the vector of spatial separation, z is a threshold value for the primary variable Z(u) before its transformation into indicator scores, and the remaining terms are parameters that need to be estimated using the calibration procedure.
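
The calibration parameters of Equation (20) can be computed from pairs of colocated primary indicators and secondary probabilities for a given threshold. The sketch below is illustrative only: the function name, the input layout (a list of `(i, y)` pairs) and the use of population variances are assumptions.

```python
def calibrate_markov_bayes(pairs):
    """Estimate the parameters of Equation (20) for one threshold z.

    pairs : list of (i, y) with i in {0, 1} the primary indicator and
            y in [0, 1] the colocated secondary probability.
    Assumes both indicator classes are present in the calibration set.
    """
    y1 = [y for i, y in pairs if i == 1]
    y0 = [y for i, y in pairs if i == 0]

    def mean(v):
        return sum(v) / len(v)

    def var(v):
        m = mean(v)
        return sum((x - m) ** 2 for x in v) / len(v)

    f = len(y1) / len(pairs)              # F(z) = E{I(u; z)}
    m1, m0 = mean(y1), mean(y0)           # m^(1)(z), m^(0)(z)
    b = m1 - m0                           # B(z)
    vc2 = f * (1.0 - f) * b * b           # V_c^2(z)
    vf2 = f * var(y1) + (1.0 - f) * var(y0)   # V_f^2(z)
    return {"F": f, "m1": m1, "m0": m0, "B": b,
            "Vc2": vc2, "Vf2": vf2, "CY0": vc2 + vf2}
```

With population variances, `CY0` equals the total variance of the secondary probabilities (law of total variance), which gives a quick consistency check. Values of B close to 1 indicate that the secondary data discriminate well between the two indicator classes.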
2.7. Data
For interpolation purposes, five major hydroclimatological regions were defined in and around Colombia (Section 1 and Figure 1), within the following coordinates: Andean (0°–8.5°N; 76.5°–72°W), Pacific (0°–8.5°N; 80°–76°W), Caribbean (…–13°N; 80°–70°W), Orinoco and Amazon (5°S–8°N; 75.5°–66.5°W). In addition, two other regions were defined for Ecuador (5°S–2.5°N; 80.0°–73°W) and Venezuela (…–13°N; 73°–66.5°W).
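$$\gamma(h_x, h_y) = C_0 + C_1\, \mathrm{Sph}\left( \sqrt{ \left( \frac{h_x}{a_{1x}} \right)^2 + \left( \frac{h_y}{a_{1y}} \right)^2 } \right) + C_2\, \mathrm{Sph}\left( \sqrt{ \left( \frac{h_x}{a_{2x}} \right)^2 + \left( \frac{h_y}{a_{2y}} \right)^2 } \right), \tag{21}$$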
where Sph is a spherical permissible variogram, $C_0$ is the sill of the isotropic nugget effect (first imbricated structure), $C_1$ is the sill of the second (first spherical)
Table I. Fitted parameters for primary variable coregionalization, secondary variable coregionalization and their cross variogram.(a)

| Type | Parameter | Andean | Pacific | Caribbean | Ama-Ori | Ecuador | Venezuela |
|---|---|---|---|---|---|---|---|
| Precipitation | Azimuth [°] | 30 | 30 | 40 | Omnidir. | 30 | Omnidir. |
| | C_0 [mm²] | 85 000 | 80 000 | 50 000 | 210 000 | 100 000 | 80 000 |
| | C_1 [mm²] | 1 375 000 | 2 350 000 | 1 300 000 | 2 000 000 | 750 000 | 480 000 |
| | a_1x [m] | 280 000 | 120 000 | 350 000 | 700 000 | 200 000 | 350 000 |
| | a_1y [m] | 80 500 | 120 000 | 150 000 | 700 000 | 200 000 | 350 000 |
| | C_2 [mm²] | N.A. | 13 100 000 | N.A. | N.A. | 1 400 000 | N.A. |
| | a_2x [m] | N.A. | N.A. | N.A. | N.A. | N.A. | N.A. |
| | a_2y [m] | N.A. | N.A. | N.A. | N.A. | N.A. | N.A. |
| | Anisotropy | Geometric | Zonal | Geometric | N.A. | Zonal | N.A. |
| TRMM | Azimuth [°] | 30 | 30 | 40 | Omnidir. | 30 | Omnidir. |
| | C_0 [mm²] | 400 000 | 250 000 | 300 000 | 200 000 | 20 000 | 170 000 |
| | C_1 [mm²] | 1 200 000 | 1 700 000 | 1 000 000 | 250 000 | 900 000 | 370 000 |
| | a_1x [m] | 280 000 | 120 000 | 350 000 | 280 000 | 200 000 | 380 000 |
| | a_1y [m] | 80 500 | 120 000 | 150 000 | 280 000 | 200 000 | 380 000 |
| | C_2 [mm²] | N.A. | 3 300 000 | N.A. | 4500 | 1 400 000 | N.A. |
| | a_2x [m] | N.A. | N.A. | N.A. | 700 000 | N.A. | N.A. |
| | a_2y [m] | N.A. | N.A. | N.A. | 700 000 | N.A. | N.A. |
| | Anisotropy | Geometric | Zonal | Geometric | N.A. | Zonal | N.A. |
| Cross | Azimuth [°] | 30 | 30 | 40 | Omnidir. | 30 | Omnidir. |
| | C_0 [mm²] | 70 000 | 50 000 | 30 000 | 30 000 | 20 000 | 60 000 |
| | C_1 [mm²] | 1 100 000 | 1 100 000 | 950 000 | 550 000 | 550 000 | 250 000 |
| | a_1x [m] | 280 000 | 120 000 | 350 000 | 700 000 | 200 000 | 380 000 |
| | a_1y [m] | 80 500 | 120 000 | 150 000 | 700 000 | 200 000 | 380 000 |
| | C_2 [mm²] | N.A. | 6 500 000 | N.A. | N.A. | 1 000 000 | N.A. |
| | a_2x [m] | N.A. | N.A. | N.A. | N.A. | N.A. | N.A. |
| | a_2y [m] | N.A. | N.A. | N.A. | N.A. | N.A. | N.A. |
| | Anisotropy | Geometric | Zonal | Geometric | N.A. | Zonal | N.A. |

N.A., does not apply.
(a) Primary variable coregionalization is the long-term average precipitation from raingauges. Secondary variable coregionalization is the long-term average precipitation intensity from the TRMM database.
imbricated structure, $a_{1x}$ and $a_{1y}$ are the ranges of the second imbricated structure in the direction of largest and smallest spatial variability, x and y, respectively; $C_2$ is the sill of the third (second spherical) imbricated structure, $a_{2x}$ and $a_{2y}$ are the ranges of the third imbricated structure in x and y, respectively. On the other hand, for the Pacific and Ecuador regions an anisotropic coregionalization model was fitted as
$$\gamma(h_x, h_y) = C_0 + C_1\, \mathrm{Gau}\left( \frac{h_x}{a_{1x}} \right) + C_2\, \mathrm{Gau}\left( \frac{h_y}{a_{1y}} \right), \tag{22}$$
where Gau is a Gaussian permissible variogram model. Estimated parameters of permissible variograms fitted for coregionalizations are shown in Table I.
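
The two anisotropic models can be evaluated with a few lines of code. The sketch below is illustrative and rests on several assumptions: the function names are hypothetical, the lag components `hx` and `hy` are taken to be already expressed along the anisotropy axes (that is, after rotation by the azimuth), the Gaussian structure uses the practical-range convention $1 - \exp(-3h^2)$, and only the one-structure case of Equation (21) is coded.

```python
import math


def sph(h):
    """Unit spherical structure for a reduced (dimensionless) distance h >= 0."""
    return 1.0 if h >= 1.0 else 1.5 * h - 0.5 * h ** 3


def gau(h):
    """Unit Gaussian structure, practical-range convention (assumed)."""
    return 1.0 - math.exp(-3.0 * h * h)


def gamma_geometric(hx, hy, c0, c1, a1x, a1y):
    """Nugget plus one spherical structure with geometric anisotropy (Equation (21), C2 term omitted)."""
    if hx == 0.0 and hy == 0.0:
        return 0.0
    return c0 + c1 * sph(math.hypot(hx / a1x, hy / a1y))


def gamma_zonal(hx, hy, c0, c1, c2, a1x, a1y):
    """Nugget plus two directional Gaussian structures (Equation (22))."""
    if hx == 0.0 and hy == 0.0:
        return 0.0
    return c0 + c1 * gau(abs(hx) / a1x) + c2 * gau(abs(hy) / a1y)
```

Using the precipitation parameters listed for the Andean region in Table I, `gamma_geometric(280000.0, 0.0, 85000.0, 1375000.0, 280000.0, 80500.0)` reaches the total sill $C_0 + C_1$ at the range along x, and the same sill is reached along y at the shorter range of 80 500 m.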
2.9. Structural analysis for precipitation's indicator transformation
Estimation of the conditional cumulative probability fields for the long-term average annual precipitation (primary variable) using ICCKM required performing the structural analysis of the indicator transformation. Permissible regionalization models of precipitation's indicator transformation were fitted for each region. Then, nine different regionalization models were constructed for each region, corresponding to precipitation (10%) percentiles of the regional cumulative probability distribution functions. The anisotropic regionalization model for each threshold was estimated as
$$\gamma_I(h_x, h_y) = C_0 + C_1\, \mathrm{Sph}\left( \sqrt{ \left( \frac{h_x}{a_{1x}} \right)^2 + \left( \frac{h_y}{a_{1y}} \right)^2 } \right), \tag{23}$$
with all variables previously defined in Section 2.8. The work by Alvarez-Villa (2007) reports the whole set of parameters for all thresholds and regions.
3. Results and discussion
3.1. Experimental semivariograms
Figures 3–5 show the experimental direct (for average precipitation calculated using raingauge records and TRMM measurements) and cross semivariograms for each region, estimated using the anisotropy directions shown in Table I. Using them as reference, coregionalization models were fitted according to the procedure described in Section 2.8, which produced the parameters shown in Table I. Equally, graphs of fitted direct and cross variogram models are shown in Figures 3–5. As
Figure 3. Experimental and fitted variograms in anisotropy directions for average precipitation calculated from raingauge records.
expected, the experimental semivariograms increase with distance, and so does the uncertainty of estimation. To minimize sampling problems and with the aim of ensuring a reliable fitting of the coregionalization model, experimental variograms were estimated only for the distances containing more than 80 data pairs.
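
The estimator of Equation (3), together with the minimum number of pairs per lag, can be sketched as follows. This is a simplified omnidirectional version written for illustration, whereas the semivariograms discussed here are directional; the function name, the lag binning rule and the argument names are assumptions.

```python
import math


def experimental_semivariogram(points, values, lag, n_lags, min_pairs=80):
    """Omnidirectional experimental semivariogram.

    Returns a list of (mean pair distance, semivariance, number of pairs),
    one entry per lag class holding at least min_pairs pairs.
    """
    sq = [0.0] * n_lags       # sum of squared differences per lag class
    dsum = [0.0] * n_lags     # sum of pair distances per lag class
    count = [0] * n_lags      # number of pairs N(h) per lag class
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            d = math.hypot(points[i][0] - points[j][0],
                           points[i][1] - points[j][1])
            k = int(d // lag)
            if k < n_lags:
                sq[k] += (values[i] - values[j]) ** 2
                dsum[k] += d
                count[k] += 1
    return [(dsum[k] / count[k], sq[k] / (2.0 * count[k]), count[k])
            for k in range(n_lags)
            if count[k] > 0 and count[k] >= min_pairs]
```

Lag classes with too few pairs are simply dropped from the output, so that they cannot influence the fitting of the model.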
3.2. Refinement of the estimated long-term average precipitation fields
The resulting long-term mean annual precipitation fields for Colombia using the four variants of kriging were obtained by assembling the six regional fields. Assemblage was carried out by averaging estimated precipitation values at overlapping pixels. Despite the simplicity of this procedure, it increases estimation uncertainty (kriging inaccuracy) due to aggregation of values in colocated pixels. Such uncertainty was quantified using the standard error of the mean, which requires estimating the standard deviation fields, using the appropriate kriging mathematical expressions (Goovaerts, 1997).
In order to avoid sharp discontinuities in the estimated rainfall fields in the Amazon–Orinoco, Ecuador and oceanic regions, a post-processing procedure was implemented at those regions, consisting of applying a moving circular averaging window (kernel filter) of 20 km diameter (Borrough and McDonell, 1998). The resulting long-term annual mean and standard deviation fields obtained through KED, SCK, CCK and CCKM are shown in the four panels of Figures 6 and 7, respectively. At first sight, the obtained rainfall fields look very similar, but a detailed examination shows otherwise both in qualitative and quantitative terms, as discussed next.
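
A moving circular averaging window of the kind described above can be sketched on a regular grid as follows. The function name, the grid layout (a list of rows with `None` for missing pixels), the cell size and the shrinking of the window at the grid edges are assumptions made for the example.

```python
def circular_mean_filter(grid, cell_size, diameter):
    """Average each pixel over a circular window of the given diameter.

    grid      : list of rows; None marks pixels without data
    cell_size : pixel size, in the same units as diameter
    Near the edges the window shrinks to the pixels that exist (assumption).
    """
    radius = diameter / 2.0
    reach = int(radius // cell_size)
    offsets = [(di, dj)
               for di in range(-reach, reach + 1)
               for dj in range(-reach, reach + 1)
               if (di * cell_size) ** 2 + (dj * cell_size) ** 2 <= radius ** 2]
    rows, cols = len(grid), len(grid[0])
    out = [[None] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            vals = [grid[i + di][j + dj] for di, dj in offsets
                    if 0 <= i + di < rows and 0 <= j + dj < cols
                    and grid[i + di][j + dj] is not None]
            out[i][j] = sum(vals) / len(vals) if vals else None
    return out
```

For a 5 km grid, `circular_mean_filter(field, 5.0, 20.0)` averages each pixel with its neighbours within 10 km, which smooths isolated jumps while leaving uniform areas unchanged.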
We base the analysis of the four post-processed precipitation fields on estimation errors (cross-validation procedure), and on visual inspection of the spatial consistency of the resulting precipitation fields. Furthermore, an interesting quality control test of the produced maps is their ability to capture the aforementioned PO within the intra-Andean valleys. Figure 8 shows four transects of topography and mean annual rainfall along 2°, 4°, 6° and 8°N
$$ME = \frac{1}{n} \sum_{i=1}^{n} R_i, \tag{24}$$
$$RMSE = \sqrt{ \frac{1}{n} \sum_{i=1}^{n} R_i^2 }, \tag{25}$$
where n is the number of available precipitation data for
each region and R is the residual. Table II shows that the
KED and CCKM interpolation algorithms produce the
smallest RMSE. Error analysis can be complemented by
quantifying the plain ME of residuals besides taking their
absolute values, which allowed us to obtain the following
conclusions:
(a) Based on their definition, negative residuals indicate that the interpolation algorithm tends to underestimate precipitation, as is the case for the SCK algorithm (Figure 6(b)), particularly over the Andean, Caribbean and Venezuela regions (Table II). Similar characteristics were found for the Amazon and Orinoco regions using the KED algorithm (Figure 6(a)). On the other hand, a positive mean value of the residuals points to precipitation overestimation, as is the case for all algorithms but SCK, particularly so for all regions using the CCK (Figure 6(c)) and CCKM algorithms (Figure 6(d)). In summary, KED is the algorithm exhibiting the lowest ME values, as indicated in Table II.
(b) Errors also depend on sample size, and thus more reliable statements regarding under- or overestimation can be made for the Andean and Pacific regions, whereas those for the Amazon and Orinoco are less reliable.
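
The error measures of Equations (24) and (25) can be computed from cross-validation residuals as sketched below. The sketch assumes that the residual is the estimate minus the observation, which is consistent with negative residuals indicating underestimation, and that cross-validation is done by leaving one raingauge out at a time; the function names and the `interpolate` callable are hypothetical.

```python
import math


def mean_error(estimated, observed):
    """Equation (24), with residual R_i = estimate - observation (assumed sign)."""
    r = [e - o for e, o in zip(estimated, observed)]
    return sum(r) / len(r)


def rmse(estimated, observed):
    """Equation (25): root mean squared residual."""
    r = [e - o for e, o in zip(estimated, observed)]
    return math.sqrt(sum(x * x for x in r) / len(r))


def leave_one_out(points, values, interpolate):
    """Re-estimate each datum from all the others.

    interpolate(points, values, target) is any interpolator, for example one
    of the kriging variants; it is a placeholder here.
    """
    estimates = []
    for k in range(len(points)):
        pts = points[:k] + points[k + 1:]
        vals = values[:k] + values[k + 1:]
        estimates.append(interpolate(pts, vals, points[k]))
    return estimates
```

Reporting both measures is useful because residuals of opposite sign cancel in the ME but not in the RMSE.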
3.4. Quality control of the estimated rainfall fields
Comparing the newly developed long-term average precipitation fields with some of the previously estimated rainfall fields over Colombia (Snow, 1976; Oster, 1979; Mesa et al., 1997; Mejía et al., 1999; Poveda and Mesa, 2000; Vélez et al., 2000; Poveda et al., 2007), the following results are worth noting:
Figure 5. Experimental and fitted cross variograms in anisotropy directions.
(a) All interpolation algorithms capture the most recognizable features of rainfall over the Caribbean region, with average rainfall estimates below 1500 mm year⁻¹, including three well-known extreme precipitation spots: (1) the extremely dry Guajira region in northeastern Colombia (labelled as 1 in precipitation fields of Figure 6), with average rainfall estimates around 300–400 mm year⁻¹, and (2) two rainy regions, one near Sierra Nevada de Santa Marta (labelled as 2 in precipitation fields of Figure 6), with estimates around 1900 mm year⁻¹, and the southern region of Bolivar's department (labelled as 3 in precipitation fields of Figure 6), with estimates around 3500 mm year⁻¹.
(b) With the only exception of SCK (Figure 6(b)), all interpolation algorithms were able to capture the presence of the PO within the intra-Andean (Cauca and Magdalena River) valleys. Figure 8 contains four longitudinal profiles of both topography and mean annual precipitation. Some observations regarding the PO are worth mentioning: (1) in general, the PO is located at altitudes between 1400 and 1700 m above sea level (m a.s.l.) over both valleys; (2) for the Cauca River valley (located between the western and Central ranges of the Andes), the PO is observed solely over the western slope of the Central range, but not over the eastern slope of the western Andes. This observation can be explained by the dynamics of the katabatic winds over the latter region (López and Howell, 1967); (3) mean annual precipitation at the Cauca River valley's PO is around 2800 mm; (4) the Magdalena River valley, located between the Central and eastern Andes, exhibits POs over both slopes, with annual rainfall at the PO around 2600 mm; and (5) over the eastern slope of the Central Andes, estimated precipitation at the PO increases with latitude, reaching up to the highly rainy region of the southern Bolivar department (labelled 3 in Figure 6). It is worth noticing that the PO is more easily identified as the intra-Andean valleys broaden, as evidenced by comparing Figures 6(a) and 6(b).
(c) According to Hastenrath (1991), below the PO the lower part of tropical valleys experiences less rainfall because it benefits less from the orographic ascent of air, and because it is affected by evaporation of rain falling from the cloud base. Above the PO, air humidity, precipitable water and thus precipitation diminish with height. Although the forcing mechanism is quite similar within the two valleys, the main differences in the spatial distribution of rainfall are explained by the atmospheric circulation dynamics strongly controlled by the interaction
Figure 6. Long-term annual mean precipitation fields for Colombia estimated using (a) KED (top left), (b) SCK (top right), (c) CCK (bottom left) and (d) CCKM (bottom right). This figure is available in colour online at wileyonlinelibrary.com/journal/joc
between the topography with the northeasterly trade winds and the westerly winds of the low-level Chocó jet (Poveda and Mesa, 1997), and by the geographical and morphological features that control local wind circulations inside each valley. Such a strong influence of local topography is reflected in the different characteristics of the diurnal cycle of rainfall over the tropical Andes (Poveda et al., 2005).
(d) The newly estimated precipitation fields adequately represent the orographic influence on rainfall estimates, as larger (smaller) precipitation values are located on the windward (leeward) slopes of the Andes. Also, precipitation values decrease at high altitudes, as 800 mm year⁻¹ is estimated around 3000 m a.s.l. Such features were previously reported by Oster (1979).
(e) None of the newly estimated precipitation fields captured the pluviometric maximum of 12,700 mm year⁻¹, located on the lowlands of the Pacific coast by the western piedmont of the western Andes (Poveda and Mesa, 2000). This shortcoming can be mainly attributed to the secondary TRMM data. In general, TRMM information is not able to capture extreme storm events due to its sampling technique and trajectory, and therefore the long-term precipitation intensity in the wetter zones of Colombia can be underestimated.
In addition, the estimates of kriging accuracy given by the standard deviation maps (Figure 7) reveal the following:
(a) Standard deviation estimates exhibit discontinuities at the borders of adjacent regions. They arise from the combined effect of the following issues: (1) independent coregionalization models have been fitted for each region, and (2) despite the fact that estimated values of the primary variable (raingauge data) are quite similar for overlapping pixels among adjacent regions, that is not the case for standard deviations due to their dependence on the semivariogram sill of the corresponding coregionalization method.
(b) All kriging interpolation algorithms exhibit the
largest estimation errors in poorly gauged regions.
In particular, errors in the Amazon and Orinoco
regions reach in excess of 40%. A similar situation
is found at oceanic regions where no primary rainfall
Figure 7. Kriging uncertainty fields corresponding to the panels of Figure 6 as estimated using (a) KED (top left), (b) SCK (top right), (c) CCK (bottom left) and (d) CCKM (bottom right). This figure is available in colour online at wileyonlinelibrary.com/journal/joc
records were available. Some of the regions exhibit a
signal-to-noise ratio less than unity. It is noteworthy
that the algorithms that incorporate secondary infor-
mation within the spatial variation analysis exhibit
lower estimation errors in the standard deviation esti-
mates. This is particularly evident for the CCK and
CCKM algorithms (Figure 7(c) and (d)), whose stan-
dard deviation estimates at ungauged zones halve
with respect to KED (Figure 7(a)).
(c) Estimated standard deviations are small across the Andean and Caribbean regions, although the driest regions exhibit significant estimation inaccuracy. On the other hand, the Pacific region exhibits high standard deviations, albeit restricted to a small spatial domain. In spite of an acceptable coverage of raingauges, high standard deviation values appear because of the high anisotropy of the fitted coregionalization model. As in the poorly covered regions, the lowest standard deviation values were obtained with the CCK and CCKM algorithms (Figure 7(c) and 7(d)).
(d) The kriging variance is a measure of estimation accuracy which relies on the adequacy of the coregionalization model. Unfortunately, only if the primary variable obeys a Gaussian distribution does the kriging variance provide an actual measure of uncertainty, i.e. the variance of the local (pixel) probability distribution function (Olea, 1991; Goovaerts, 1997). Then, taking the kriging variance as a measure of uncertainty is not rigorously accurate because (1) in general, the probability distribution function of tropical Andean rainfall is not Gaussian and exhibits heavy tails (Poveda, 2010), and (2) very different values of the kriging variance are estimated using different raingauge sets as primary data, which is reflected in discontinuities in the four standard deviation fields (Figure 7). Similar values of the variance associated with the probability distribution function of precipitation at any pixel are to be expected using different raingauge sets for uncertainty estimations. In our case, this could not be achieved using the kriging variance as a measure of precipitation uncertainty, because different coregionalization models along with different data sets were used for the six defined hydroclimatological regions.
Figure 8. Cross-sections of topography and kriging-estimated rainfall at different latitudinal bands. Note the presence of the pluviographic optimum in the different Andean slopes and latitudes. Precipitation and topographic profiles for latitudes 2°N (top left), 4°N (top right), 6°N (bottom left), and 8°N (bottom right). Rainfall estimates are obtained using the KED, SCK, CCK and CCKM kriging algorithms, while topography is depicted by the Digital Elevation Model (DEM).
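Table II. Results for the estimation error analyses.

| Region | KED ME (mm) | KED RMSE (mm) | SCK ME (mm) | SCK RMSE (mm) | CCK ME (mm) | CCK RMSE (mm) | CCKM ME (mm) | CCKM RMSE (mm) |
|---|---|---|---|---|---|---|---|---|
| Andean | 7 | 556 | 12 | 750 | 59 | 607 | 25 | 601 |
| Pacific | 10 | 866 | -32 | 931 | 132 | 1097 | 94 | 1037 |
| Caribbean | 5 | 486 | 21 | 1228 | 140 | 575 | 124 | 576 |
| Ama-Ori | 10 | 577 | 1 | 609 | 45 | 668 | 101 | 603 |
| Ecuador | 19 | 519 | 16 | 578 | 59 | 919 | 162 | 824 |
| Venezuela | 19 | 693 | 23 | 703 | 31 | 668 | 25 | 650 |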
To avoid this effect, the nonparametric (indicator)
approach was used to quantify the local (pixel) uncer-
tainty of long-term average precipitation estimation.
3.5. Local uncertainty modelling using the
nonparametric (indicator) approach
In order to obtain a more physically based spatial representation of the uncertainty fields, we applied the so-called indicator approach or nonparametric modelling of local uncertainty. It was done by estimating the CCDF, using both the primary and secondary rainfall data sets. With the aim of modelling the local (pixel) uncertainty of the long-term mean annual precipitation fields for the six regions, probability fields were interpolated for each selected threshold using the ICCKM algorithm, Equations (17) and (18). Probabilities were estimated on a pixel-by-pixel basis, based on the indicator structural analysis (Section 2.9). The CCDF at each pixel was obtained as follows:
(a) A regional CDF was estimated for each region based on the mean annual precipitation at each raingauge. Then, from the sample CDF, nine decile values were used as thresholds, $z_k$, to calculate the indicator transformations.
(b) Indicator transformations, $i(u; z_k)$, for the primary variable were calculated for each region and for each selected threshold.
(c) Previous estimations of the values for the cumulative probability at each pixel, $Y(u; z_k)$ (which corresponds to the indicator coding of the secondary variable), were performed for all regions and thresholds of the primary variable.
(d) The structural analysis for the indicator scores of the primary variable was performed for all regions and thresholds (Section 2.9). Given that we used the ICCKM algorithm, it was not necessary to perform the structural analysis for the indicator scores of the secondary TRMM data.
(e) For all regions and thresholds, probability fields were estimated using the ICCKM algorithm with Equations (17) and (18).
(f) Order relationships for the estimated values of the CCDF were corrected using the Gaussian smoothing procedure of Carr (1994).
(g) Based on the corrected values of the CCDF at each pixel, interpolated fields of precipitation's long-term mean, standard deviation, and coefficient of variation were estimated for each region. Then they were assembled and post-processed to obtain the final country-wide precipitation and uncertainty fields. Estimations were executed using E-type estimators for the expected value, and the conditional variance for the standard deviation (Goovaerts, 1997).
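
The last two steps of the procedure above can be sketched for a single pixel as follows. Two simplifications are assumed and should not be read as the procedure actually applied: order relations are corrected by averaging an upward and a downward monotone correction (a simple, commonly used alternative, not the Gaussian smoothing of Carr (1994)), and the E-type mean and conditional variance are computed with class midpoints and user-supplied tail bounds `zmin` and `zmax`. All function and parameter names are hypothetical.

```python
def correct_order_relations(cdf):
    """Force a discrete CCDF to lie in [0, 1] and to be non-decreasing.

    Swapped-in technique: average of the upward (running maximum) and
    downward (running minimum) corrections.
    """
    p = [min(1.0, max(0.0, v)) for v in cdf]
    up = p[:]
    for k in range(1, len(up)):
        up[k] = max(up[k], up[k - 1])
    down = p[:]
    for k in range(len(down) - 2, -1, -1):
        down[k] = min(down[k], down[k + 1])
    return [(u + d) / 2.0 for u, d in zip(up, down)]


def etype_statistics(thresholds, cdf, zmin, zmax):
    """E-type mean, standard deviation and coefficient of variation.

    Each class between consecutive thresholds is represented by its midpoint
    (assumption); zmin and zmax bound the lower and upper tails.
    """
    bounds = [zmin] + list(thresholds) + [zmax]
    probs = [0.0] + list(cdf) + [1.0]
    mean = 0.0
    second = 0.0
    for k in range(len(bounds) - 1):
        pk = probs[k + 1] - probs[k]              # class probability
        mid = 0.5 * (bounds[k] + bounds[k + 1])   # class representative value
        mean += pk * mid
        second += pk * mid * mid
    std = max(0.0, second - mean * mean) ** 0.5
    cv = std / mean if mean > 0.0 else float("nan")
    return mean, std, cv
```

In practice the result is sensitive to how the tails and the within-class distribution are modelled, so the midpoint rule should be regarded as the simplest possible choice.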
As we applied the diverse variants of kriging to six
different regions, the subsets of annual precipitation
observations from raingauges used as the primary variable
differ accordingly. As a consequence, the regional
precipitation CDFs are different, as are their associated
decile values (and, in fact, all percentiles). The
probability fields estimated using the ICCKM algorithm
therefore represent non-exceedance probabilities of
different precipitation depths in the different regions,
because the associated thresholds, $z_k$, used for the
indicator coding are different. This implies that the
probability fields cannot be assembled to obtain a
complete non-exceedance probability field for Colombia.
A fully assembled probability field could be constructed
from the regional probability fields only if the
threshold values were similar for all regions.
Although indicator kriging has proven not to
be as efficient as other variants of kriging (Lloyd and
Atkinson, 2001), some procedures can be used to obtain
much more adequate interpolated fields. They include:
(1) integration of physically based secondary
information, (2) the transformation of secondary data
into conditional cumulative probabilities that serve as a
prior estimate of the non-exceedance probability fields
(the actual non-exceedance probability value is estimated
afterwards using the ICCKM algorithm, by integrating both
primary and secondary data), and (3) a careful modelling
of indicator coregionalization. The estimated ICCKM
annual average precipitation field (Figure 9(a)) shows
features quite similar to the kriging estimated fields
(Figure 6). In our case, integration of the secondary
variable (TRMM data) into the uncertainty estimation
improved the estimation of the CCDF at each pixel. The
precipitation field obtained using indicator cokriging
(Figure 9(a)) exhibits high quality and is comparable to
those obtained with the KED, CCK and CCKM algorithms
(Figure 7), with the advantage of exhibiting a more
reliable modelled uncertainty.
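The reason the regional probability fields cannot be merged can be illustrated with a short Python sketch. It computes decile thresholds and indicator codes for two synthetic regional samples. The sample values, the gamma distributions and the function names are assumptions made for illustration only and are not data or code from the study.

```python
import numpy as np

def regional_deciles(values):
    """Return the nine decile thresholds z_k (k = 1..9) of a regional sample."""
    return np.quantile(values, np.arange(1, 10) / 10.0)

def indicator_code(values, thresholds):
    """Indicator coding: I[i, k] = 1 if values[i] <= z_k, else 0."""
    values = np.asarray(values, dtype=float)
    thresholds = np.asarray(thresholds, dtype=float)
    return (values[:, None] <= thresholds[None, :]).astype(int)

# Synthetic annual precipitation samples (mm/year) for two hypothetical
# regions; illustrative numbers only.
rng = np.random.default_rng(0)
wet_region = rng.gamma(shape=4.0, scale=1500.0, size=200)
dry_region = rng.gamma(shape=4.0, scale=300.0, size=200)

z_wet = regional_deciles(wet_region)
z_dry = regional_deciles(dry_region)

# The same decile index k corresponds to very different rainfall depths,
# so a probability field for "threshold k" means something different in
# each region.
print("5th decile, wet region (mm/year):", round(float(z_wet[4])))
print("5th decile, dry region (mm/year):", round(float(z_dry[4])))

I_wet = indicator_code(wet_region, z_wet)
print("fraction of wet-region gauges below each z_k:", I_wet.mean(axis=0))
```

Merging would only be meaningful if both regions were coded against a common set of thresholds, which is the condition stated in the text.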
In addition to average precipitation fields, standard
deviation and coefficient of variation fields were
estimated from the estimated CCDF at each pixel
(Figure 9). Fields of the standard deviation are fairly
smooth and continuous throughout Colombia
(Figure 9(b)), with the highest values in those regions
showing the largest precipitation rates: Pacific and
Amazon. On the other hand, the smallest standard
deviation values are located in the driest regions of
Colombia, including the highest Andes and the Caribbean
region. Some important differences between the standard
deviation fields obtained with the kriging and the
indicator cokriging algorithms are worth mentioning:
For all kriging algorithms, the assembly procedure of
the standard deviation fields generated large
discontinuities between adjacent regions, whereas the
indicator cokriging produced a highly continuous map
(Figure 9(b)). This means that the indicator
transformation is able to capture the spatial variability
of precipitation uncertainty, even though all regions
have independent coregionalization models.
Kriging's standard deviation fields represent the
uncertainty of the annual average precipitation
estimation (in terms of accuracy), which reflects the
reliability associated with the kriging estimation. Such
an uncertainty value (standard deviation or variance) is
tightly associated with the coregionalization model for
each region. On the other hand, standard deviation fields
obtained using the indicator approach represent a
statistic of the CCDF at each pixel. The complete
local (pixel) model of long-term annual average
precipitation uncertainty is given by the estimated CCDF.
The standard deviation fields derived from kriging
(Figure 9(b)) show spatial variability patterns basically
driven by the location of raingauges and their average
precipitation values. On the other hand, standard
deviation fields obtained with indicator kriging are
driven by the location of raingauges and by the spatial
patterns of the secondary TRMM information. The inclusion
of TRMM information in the uncertainty estimation
analysis is noteworthy, because TRMM information can be
transformed into prior conditional probability values to
be used within the indicator kriging estimation
procedure. As a consequence, the CCDF-related uncertainty
fields (Figure 9) closely resemble the TRMM secondary
data, despite the fact that our analysis is performed
regionally.
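The dependence of the kriging variance on gauge geometry and on the regional model can be made concrete with a small ordinary kriging sketch. It is a simplified, one-dimensional illustration that assumes a hypothetical exponential covariance model with fixed sill and range; it is not the coregionalization model fitted in the study, and the gauge positions are invented. Note that the function takes no precipitation values: for a fixed covariance model, the OK variance depends only on where the gauges are.

```python
import numpy as np

def exp_cov(h, sill=1.0, range_=10.0):
    """Exponential covariance with practical range `range_` (assumed model)."""
    return sill * np.exp(-3.0 * np.abs(h) / range_)

def ok_variance(x_data, x0, sill=1.0, range_=10.0):
    """Ordinary kriging weights and variance at x0 for gauges at x_data.

    Solves  sum_b lam_b C(x_a - x_b) + mu = C(x_a - x0),  sum_b lam_b = 1,
    then    var = C(0) - sum_a lam_a C(x_a - x0) - mu.
    """
    x_data = np.asarray(x_data, dtype=float)
    n = len(x_data)
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = exp_cov(x_data[:, None] - x_data[None, :], sill, range_)
    A[n, n] = 0.0
    b = np.ones(n + 1)
    b[:n] = exp_cov(x_data - x0, sill, range_)
    sol = np.linalg.solve(A, b)
    lam, mu = sol[:n], sol[n]
    var = sill - lam @ b[:n] - mu
    return var, lam

# Hypothetical gauge positions along a transect (arbitrary distance units)
gauges = np.array([0.0, 3.0, 7.0, 20.0])

var_near, w_near = ok_variance(gauges, 3.5)   # close to a gauge
var_far, w_far = ok_variance(gauges, 13.5)    # in a gap between gauges
print("OK variance near a gauge:", var_near)
print("OK variance in a data gap:", var_far)
```

Under these assumptions, changing the sill or range from one region to the next changes the variance level, which is one plausible source of the discontinuities seen when regional kriging variance maps are assembled. The CCDF-based conditional variance, by contrast, also responds to the data values and to the prior probabilities derived from the secondary variable.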
For most of the Colombian territory, the coefficient
of variation remains almost constant (less than 30%),
which points to some degree of spatial stationarity
that could be explained by the use of a prior
probability conditioning the estimates of mean and
standard deviation. Thus, TRMM data exert a strong
conditioning on the spatial modelling of precipitation,
in support
Figure 9. (a) Long-term annual average precipitation (top),
(b) conditional variance (middle) and (c) coefficient of
variation (bottom) fields for Colombia estimated using
non-parametric local uncertainty modelling and ICCKM. This
figure is available in colour online at
wileyonlinelibrary.com/journal/joc
of the robustness of our estimation procedure, as it
is based on measured data. However, TRMM information
does not capture all extreme events, and for some
very rainy regions, the prior distribution does not
adequately represent the heavy tails of actual
precipitation data (Poveda, 2010).
4. Conclusions
Four new long-term annual average precipitation fields,
along with their associated standard deviation fields,
have been estimated for Colombia using multivariate
geostatistical algorithms which link primary information
from raingauges with spatially distributed secondary
information from the TRMM satellite data set. We used
the indicator approach to model the local (at-a-pixel)
uncertainty and to estimate mean annual precipitation, as
well as conditional variance and coefficient of variation
fields, using the CCDFs modelled at each pixel with ICCKM.
The estimated precipitation fields exhibit some of the
most relevant previously identified features, widely
discussed by Oster (1979), DNP (1984), Vélez et al. (2000)
and Poveda et al. (2007), along with some new features
that improve on those previous estimations. In particular,
the new fields adequately preserve (1) the spatial
behaviour of rainfall in the Caribbean region, (2) the PO
over the intra-Andean valleys, (3) the spatial
distribution of rainfall and the location of the
pluviometric maximum in the Pacific region, the rainy
belts at the eastern piedmont of the eastern Andes, and
the middle Magdalena River valley, and (4) the spatial
variability of precipitation in the southeastern (Orinoco
and Amazonia) and oceanic regions, which have very
limited or nonexistent raingauge coverage.
Some limitations of our estimated rainfall fields are
that (1) they do not reproduce the pluviometric maximum
of more than 12,000 mm year$^{-1}$ along the western
piedmont of the western Andes over the Pacific coast,
(2) precipitation estimates in data-scarce regions
(Orinoco, oceanic and Amazonia) are less reliable, and
(3) uncertainty fields show discontinuities between
adjacent regions. In spite of these shortcomings, most
estimated fields reflect physically important features
derived from the secondary information.
Using regional cross validation procedures we found
that the KED algorithm produces the best annual average
precipitation estimates, owing to the high linear
correlation coefficient found between the colocated
primary and secondary information. Nevertheless, a
qualitative analysis of the estimated fields allows us to
conclude that the newly developed annual average
precipitation fields, with the only exception of that
from the SCK algorithm, are able to capture the
well-known main physical characteristics of rainfall in
Colombia (discussed in Section 1). The newly produced
rainfall maps can be reliably used by multiple users
in Colombia owing to their improved spatial resolution
and physical consistency.
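A regional cross validation of the kind referred to above can be sketched as a leave-one-out loop. For brevity, and as a clearly labelled assumption, the interpolator used here is inverse square distance weighting rather than any of the kriging variants compared in the paper. The gauge coordinates, the precipitation values and the function names are synthetic or hypothetical.

```python
import numpy as np

def idw_predict(xy_obs, z_obs, xy_target, power=2.0):
    """Inverse-distance-weighted estimate at xy_target (stand-in interpolator)."""
    d = np.linalg.norm(xy_obs - xy_target, axis=1)
    if np.any(d == 0.0):
        # Target coincides with an observation: return that observation
        return float(z_obs[np.argmin(d)])
    w = 1.0 / d ** power
    return float(np.sum(w * z_obs) / np.sum(w))

def leave_one_out(xy, z, predict=idw_predict):
    """Remove each gauge in turn, re-estimate it, and summarise the errors."""
    n = len(z)
    pred = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        pred[i] = predict(xy[keep], z[keep], xy[i])
    err = pred - z
    return {
        "ME": float(err.mean()),                    # mean error (bias)
        "RMSE": float(np.sqrt(np.mean(err ** 2))),  # root mean square error
        "r": float(np.corrcoef(pred, z)[0, 1]),     # Pearson correlation
    }

# Synthetic gauges in a 100 x 100 region with a west-east rainfall gradient
rng = np.random.default_rng(1)
xy = rng.uniform(0.0, 100.0, size=(50, 2))
z = 2000.0 + 10.0 * xy[:, 0] + rng.normal(0.0, 50.0, size=50)

stats = leave_one_out(xy, z)
print(stats)
```

Running the same loop once per region and per algorithm, with the interpolator passed in through the `predict` argument, would give comparable error statistics for ranking the methods.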
We performed quality control of colocated TRMM-
raingauge precipitation data using Pearson's linear
correlation coefficient. From such analyses, we found that
TRMM data very accurately reflect annual precipitation
in Colombia and emerge as highly reliable information
for future studies, mainly in data-scarce regions of
Colombia. These newly improved precipitation fields and
their uncertainty maps are freely available to the
scientific community and can be downloaded from the website
http://cancerbero.unalmed.edu.co/hidrosig/index.php.
Acknowledgements
The work of J.I.V. and G.P. was part of the GRECIA
research programme funded by COLCIENCIAS.
References
Almeida AS, Journel AG. 1994. Joint simulation of multiple variables
with a Markov type coregionalization model. Mathematical Geology
26(5): 565–588, DOI:10.1007/BF02089242.