Non-Stationary Frequency Analysis of Extreme Daily Rainfall in Sao Paulo, Brazil
ABSTRACT: This work is an assessment of frequency of extreme values (EVs) of daily rainfall in the city of Sao Paulo,
Brazil, over the period 1933–2005, based on the peaks-over-threshold (POT) and Generalized Pareto Distribution (GPD)
approach. Usually, a GPD model is fitted to a sample of POT values selected with a constant threshold. However, in
this work we use time-dependent thresholds, composed of relatively large p quantiles (for example p of 0.97) of daily
rainfall amounts computed from all available data. Samples of POT values were extracted with several values of p. Four
different GPD models (GPD-1, GPD-2, GPD-3, and GPD-4) were fitted to each one of these samples by the maximum
likelihood (ML) method. The shape parameter was assumed constant for the four models, but time-varying covariates were
incorporated into the scale parameter of GPD-2, GPD-3, and GPD-4, describing the annual cycle in GPD-2, a linear trend in GPD-3,
and both annual cycle and linear trend in GPD-4. The GPD-1 with constant scale and shape parameters is the simplest
model. For identification of the best model among the four models we used the rescaled Akaike Information Criterion (AIC)
with second-order bias correction. This criterion isolates GPD-3 as the best model, i.e. the one with positive linear trend
in the scale parameter. The slope of this trend is significant compared to the null hypothesis of no trend, at about the 98%
confidence level. The non-parametric Mann–Kendall test also showed the presence of a positive trend in the annual frequency
of excess over high thresholds, with p-value being virtually zero. Therefore, there is strong evidence that high quantiles
of daily rainfall in the city of Sao Paulo have been increasing in magnitude and frequency over time. For example, 0.99
quantiles of daily rainfall amount have increased by about 40 mm between 1933 and 2005. Copyright 2008 Royal
Meteorological Society
KEY WORDS Brazil; extreme daily rainfall; frequency analysis; non-stationarity; peaks-over-threshold; Generalized Pareto
Distribution; Sao Paulo
Received 19 March 2008; Revised 30 July 2008; Accepted 3 August 2008
approach. This method is designed to utilize the available data in a more efficient manner, by selecting from
each block any excesses over a given large threshold, instead of simply selecting the maximum. The asymptotic
distribution of a POT data set, under the iid assumption, is the Generalized Pareto Distribution (GPD).

The main objective of this work is to perform FA of EV of daily rainfall in Sao Paulo, by using the POT method
and GPD modelling. Whether the intensity of extreme daily rainfall has changed over the past 73 years is of
particular interest. Thus, FA of EV in this work is carried out in the non-stationary context. The statistical
method used in this study is different from the ones used in previous studies, which focused on rainfall
extremes over South America through non-parametric methods (e.g. Marengo, 2004; Alexander et al., 2006;
Haylock et al., 2006; Dufek and Ambrizzi, 2008).

There are at least three reasons for non-stationary modelling of EV of daily rainfall in Sao Paulo. The first
is that over the past several years the city has been growing in area and in population, mainly after the
1930s, following ever-increasing industrialization. The city has become one of the most populous and largest
metropolitan areas of the world. This has contributed to the development of a large urban heat island
(Lombardo, 1985), the phenomenon in which temperatures of urban areas are higher than those of the surrounding
rural areas (e.g. Oke, 1982). Gonçalves et al. (2002) identified urban expansion as one of the causes of the
upward trend in the monthly averages of daily minimum air temperature in Sao Paulo. Aside from the effect on
temperature, urbanization and the associated land-use change can also produce effects on local wind patterns,
humidity, and precipitation (e.g. Childs and Raman, 2005, and references therein). For discussion on the impact
of urbanization and land-use change on climate, readers are referred, for example, to Huff and Changnon (1972);
Oke (1982); Carlson and Arthur (2000); Kalnay and Cai (2003), and He et al. (2007). The second reason for
non-stationary modelling is related to global climate change, which can also be a potential driver of extreme
hydro-meteorological events (e.g. Trenberth, 1999). A warming trend in the globally averaged surface air
temperature during the last 50 years is now widely accepted (Houghton et al., 2001), and numerous studies have
examined whether extreme events such as floods, droughts, very high temperatures, and destructive tropical
cyclones are likely to increase in frequency and intensity in a changing global climate (Wigley, 1988; Katz and
Brown, 1992; Kharin and Zwiers, 2000; Houghton et al., 2001; Jain and Lall, 2001; Strupczewski et al., 2001a,b;
Cunderlik and Burn, 2003; Beniston and Stephenson, 2004; Cunderlik and Ouarda, 2006; Garcia et al., 2007;
Felici et al., 2007; Laurent and Parey, 2007; Parey et al., 2007). The third reason is related to the results
of some previous studies showing statistically significant positive trends in the extreme daily rainfall time
series across different regions of South America (e.g. Marengo, 2004; Alexander et al., 2006; Haylock et al.,
2006) including Sao Paulo (Alexander et al., 2006; Haylock et al., 2006; Liebmann et al., 2004). Non-stationary
features have also been detected in the various climate extreme indices derived from daily rainfall for a
number of sites across the State of Sao Paulo (Dufek and Ambrizzi, 2008).

The anthropogenic influences as well as large-scale natural low-frequency climate modes, which include, for
example, El Niño-Southern Oscillation (e.g. Jain and Lall, 2001) and Antarctic Oscillation (e.g. Li et al.,
2005), imply potential non-stationarity in the hydro-meteorological extremes. In such a case, the assumption
of stationarity for observed data becomes questionable, and often statistical modelling of EV needs to be
properly modified to accommodate, for instance, a time-trend component, either linear or non-linear (e.g.
Strupczewski et al., 2001a,b; Katz et al., 2002, 2005; Khaliq et al., 2006; El Adlouni et al., 2007; Nogaj
et al., 2007). This has been an obvious motivation for efforts by the research community to develop various
mathematical frameworks capable of bringing non-stationary features into EV probability distributions, as may
be seen in the recent literature (Jain and Lall, 2001; Strupczewski et al., 2001a,b; Katz et al., 2002, 2005;
Cunderlik and Burn, 2003; Li et al., 2005; Cunderlik and Ouarda, 2006; El Adlouni et al., 2007; Laurent and
Parey, 2007; Parey et al., 2007). Readers are referred to Khaliq et al. (2006) for a comprehensive review of
the various existing approaches to deal with non-stationary EV.

2. Data and methods for the analysis

2.1. Data

Figure 1. Plots for time series of daily rainfall amounts (mm) over the period from 1933 to 2005 observed at
the IAG/USP meteorological station in Sao Paulo.

The daily rainfall data set used in this work is from the meteorological station of the Institute of Astronomy,
Geophysics and Atmospheric Sciences of the University of Sao Paulo (IAG/USP) for the period from 1933 to 2005.
The time series plot of this data is shown in Figure 1. The mean values of total monthly rainfall are shown in
Figure 2. The station is located at latitude 23°39′S and longitude 46°37′W and has never
Copyright 2008 Royal Meteorological Society Int. J. Climatol. 29: 1339–1349 (2009)
DOI: 10.1002/joc
been changed from its original location. This data set is one of the longest for studying the climate of Sao
Paulo and is fortunately free of missing values, which is an extremely rare feature for historical data sets
in Brazil. The quality control tests applied to this data set revealed no erroneous outliers. Further, a
quality control test based on the two-phase linear regression technique (Wang, 2003, 2008) was applied,
particularly for detecting step-like changes that could have been caused by changes in, for instance,
instruments and/or exposure throughout the whole period of record. Essentially, in this approach, the classic
regression F-statistic is calculated under the assumption that each point of the time series,
$c \in \{1, 2, \ldots, n\}$, is potentially a change point. The existence of a change point is concluded when
$F_{\max} = \max_{1 \le c \le n} F_c$ is too large to be attributed to chance variation (Lund and Reeves, 2002;
Wang, 2003, 2008). In this work, two regression equations were designed for detecting the mean shift without
trend change. Furthermore, a recent improvement of the maximum F-test due to Wang (2008), called penalized
maximal F (PMF), was used, which takes the relative position of each candidate change point into account to
reduce the undesirable effect of unequal sample sizes on the power of change point detection. The result of
the PMF test for daily and monthly rainfall time series of Sao Paulo reveals no discontinuity in the mean value
at the 95% confidence level. The software package to perform this test is available online
(http://cccma.seos.uvic.ca/ETCCDMI/software.shtml), in R (R Development Core Team, 2006) and FORTRAN
languages.

2.2. POT approach and GPD model

The POT approach is described as follows (Smith, 2001; Khaliq et al., 2006): Consider a sequence of M iid
random variables $X_1, X_2, \ldots, X_M$ conditioned on $X > u$, where u is a given high threshold. It can be
demonstrated that for sufficiently high u, under a wide range of conditions, the distribution function of the
excess values $Y = X - u$ converges to the GPD, given by

$$
F(y, \sigma, \gamma) = \Pr(X \le u + y \mid X \ge u) =
\begin{cases}
1 - \left(1 + \gamma \dfrac{y}{\sigma}\right)^{-1/\gamma}, & \sigma > 0,\ \gamma \ne 0,\ 1 + \gamma (y/\sigma) > 0 \\
1 - \exp\left(-\dfrac{y}{\sigma}\right), & \sigma > 0,\ \gamma = 0
\end{cases}
\qquad (1)
$$

Here σ and γ are scale and shape parameters, respectively. The parameter σ characterizes the spread of the
distribution and γ the tail features. γ is often termed the tail index. If γ < 0 the distribution is thin
tailed, and if γ > 0 it is heavy tailed or Pareto. The distribution with γ = 0 is the limiting case obtained
as γ → 0, and it is light tailed or exponential. The distribution with γ > 0 is particularly important for
studying EV. The inverse of F is the quantile function, $F^{-1}$, such that the (1 − p) quantile, $y_{1-p}$,
is given by

$$
y_{1-p} = F^{-1}(1 - p, \sigma, \gamma) =
\begin{cases}
\dfrac{\sigma}{\gamma}\left(p^{-\gamma} - 1\right), & \gamma \ne 0 \\
\sigma \log\dfrac{1}{p}, & \gamma = 0
\end{cases}
\qquad (2)
$$

where log is the natural logarithm.

In the POT framework, it is assumed that the sequence of times, say $t_1, t_2, \ldots, t_M$, corresponding to
occurrences of extreme events, is a Poisson process with the rate parameter λ. It is common to express the
quantile function in terms of this parameter and the return period T as
$y_T = \sigma[(T\lambda)^{\gamma} - 1]/\gamma$ (e.g. Khaliq et al., 2006), where $y_T$ is termed the T-year
return level, interpreted as the level exceeded in any one year with probability 1/T, or alternatively the
level which is exceeded once in T years. In the latter definition, the meaning of ‘once in T years’ becomes
questionable if there is some trend in the data, as pointed out, for instance, by Smith (2001) and Khaliq
et al. (2006). Furthermore, the expression for $y_T$ is restricted to stationary cases. For non-stationary
cases, Parey et al. (2007) (see also Laurent and Parey, 2007) recently proposed a new definition for $y_T$ and
a procedure for its calculation. They defined $y_T$ as being the unique level such that the expected number of
exceedances over $y_T$ in the next T years will be 1. Therefore, the time variation of the considered
distribution function and the Poisson rate parameter should be taken into account, properly, in the estimation
of $y_T$.

2.3. Parameter estimation

The parameters σ and γ were estimated in this work using the maximum likelihood (ML) method, which is
summarized as follows. Let $x_1, x_2, x_3, \ldots, x_n$ be n observations of a random sequence with probability
density function $f(x|\theta)$, where θ is the vector of parameters of this function. If one considers that the
observations $x_i$ are statistically independent, the joint probability density function for this sample is the
product of the individual densities, called the likelihood function:

$$
L(\theta|x) = \prod_{i=1}^{n} f(x_i|\theta) \qquad (3)
$$

In general, this function is written in logarithmic form as

$$
\log L(\theta|x) = \sum_{i=1}^{n} \log f(x_i|\theta) \qquad (4)
$$

The ML estimation is in accordance with the likelihood principle, which states that, in the process of the
inference of θ, all the relevant information in the observed data is contained in the likelihood function
(e.g. Pindyck and
1342 S. SUGAHARA ET AL.
Rubinfeld, 1976). Given the observed data, the goal of the ML estimation method is to find, among all the
probability density functions, the one that is most likely to have produced the given sample. Thus, one should
look for the best estimate of θ, $\hat{\theta}$, such that $L(\hat{\theta}|x) > L(\theta|x)$, where θ
represents any other vector of estimates. In practice, the ML estimates can be found by solving the likelihood
equations $\partial \log L(\theta|x) / \partial \theta_i = 0$, given that
$\partial^2 \log L(\theta|x) / \partial \theta_i^2 < 0$, ensuring that $L(\theta|x)$ is in fact the maximum in
the vicinity of $\hat{\theta}_i$. A great advantage of this method is that the estimates are consistent and
asymptotically normally distributed. The standard error, $SE(\hat{\theta}_i)$, of each parameter estimate is
obtained from the inverse of the Hessian matrix of second-order partial derivatives of the log-likelihood
function with respect to the parameters of the distribution. Usually, the equations are solved numerically by
using a non-linear optimization algorithm because in the majority of cases analytical solutions are not
possible. In this work, the Nelder–Mead algorithm was used (Nelder and Mead, 1965).

In the FA of EV, Smith (2001) and Katz et al. (2002, 2005) suggest the application of the ML method not only
because of its property of consistency for estimation of the distribution parameters but also because of its
flexibility for incorporating non-stationary features into the distribution parameters (say γ or σ, for
instance) as covariates, such as annual cycle and long-term trend. Consider a POT series
$y_1, y_2, \ldots, y_M$, and the dates on which these peaks have been observed, $t_1, t_2, \ldots, t_M$. The
annual cycle could be incorporated into the scale parameter σ assuming that the intensity of exceedances
depends on the particular day of the year $t_d$, according to

$$
\log \sigma_i = \sigma_0 + \sigma_1 \sin(2\pi t_d/365) + \sigma_2 \cos(2\pi t_d/365), \qquad (5)
$$

where $i = 1, 2, \ldots, M$ and $t_d = 1, 2, \ldots, 365$.

An approximation often used to represent long-term trend is the linear trend (e.g. Katz et al., 2002),
expressed as:

$$
\log \sigma_i = \sigma_0 + \sigma_1 t_i, \quad i = 1, 2, \ldots, M \qquad (6)
$$

This covariate can be attributed to human influences on the natural systems or it can also be due to natural
variability of very low frequency, as discussed previously. Both the annual cycle and long-term trend may be
incorporated simultaneously as

$$
\log \sigma_i = \sigma_0 + \sigma_1 \sin(2\pi t_d/365) + \sigma_2 \cos(2\pi t_d/365) + \sigma_1 t_i,
\quad i = 1, 2, \ldots, M \qquad (7)
$$

where the time indexes $t_d$ and $t_i$ are as defined before. In Equations (5)–(7), the logarithm transform is
used to ensure only positive values for $\sigma_i$. The non-stationary characteristics can also be
incorporated into the γ parameter, but this is not a common practice as it is difficult to estimate. The use of
covariates might be helpful for non-stationary series, since it provides a means for relaxing the condition of
identically distributed data (Davison and Smith, 1990; Smith, 2001; Katz et al., 2002), thus making it possible
to analyse the entire series without splitting it by month as done in Silva and Zocchi (2006), or removing
non-stationary features before modelling.
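
The covariate models of Equations (5)–(7) and the ML fit described in this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used in this work: the function names, the starting values, and the choice of SciPy's Nelder–Mead implementation are all assumptions. The trend coefficient is treated as a separate parameter, although Equation (7) as printed reuses the symbol σ1 for it.

```python
import math


def covariate_row(model, t, td):
    """Covariates multiplying the coefficients of log(sigma_i).

    model: "GPD-1", "GPD-2", "GPD-3" or "GPD-4" (labels as in Table I)
    t:     time of the exceedance (e.g. years since the start of the record)
    td:    day of the year, 1..365
    """
    row = [1.0]                                   # intercept sigma_0
    if model in ("GPD-2", "GPD-4"):               # annual cycle, Eq. (5)
        w = 2.0 * math.pi * td / 365.0
        row += [math.sin(w), math.cos(w)]
    if model in ("GPD-3", "GPD-4"):               # linear trend, Eq. (6)
        row.append(float(t))
    return row


def gpd_nll(params, y, rows):
    """Negative log-likelihood; params = scale coefficients followed by gamma."""
    *beta, gamma = params
    total = 0.0
    for yi, row in zip(y, rows):
        log_sigma = sum(b * x for b, x in zip(beta, row))
        if not -700.0 < log_sigma < 700.0:        # avoid overflow in exp()
            return math.inf
        z = yi / math.exp(log_sigma)              # log link keeps sigma_i > 0
        if abs(gamma) < 1e-8:                     # exponential limit, gamma -> 0
            total += log_sigma + z
        else:
            arg = 1.0 + gamma * z
            if arg <= 0.0:                        # outside the GPD support
                return math.inf
            total += log_sigma + (1.0 / gamma + 1.0) * math.log(arg)
    return total


def gpd_quantile(p, sigma, gamma):
    """Excess exceeded with probability p, as in Eq. (2)."""
    if abs(gamma) < 1e-8:
        return sigma * math.log(1.0 / p)
    return sigma / gamma * (p ** (-gamma) - 1.0)


def fit_gpd(model, y, t, td):
    """ML fit by Nelder-Mead; returns (estimates, maximized log-likelihood)."""
    # Third-party dependency, imported lazily so the helpers above work without it.
    from scipy.optimize import minimize
    rows = [covariate_row(model, ti, di) for ti, di in zip(t, td)]
    # Assumed starting point: log of the mean excess, zero covariate effects, gamma = 0.1.
    start = [math.log(sum(y) / len(y))] + [0.0] * (len(rows[0]) - 1) + [0.1]
    res = minimize(gpd_nll, start, args=(y, rows), method="Nelder-Mead")
    return list(res.x), -res.fun
```

A call such as `fit_gpd("GPD-3", excesses, years, days)` would return the coefficient estimates followed by the shape estimate. Expressing `t` in years (or rescaling it to the unit interval) tends to keep the simplex search well conditioned. The stationary T-year return level of Section 2.2 corresponds to `gpd_quantile(1.0 / (T * lam), sigma, gamma)`, where `lam` is the Poisson rate.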
2.4. Time-dependent thresholds and problem of their
choice
The choice of an adequate threshold is a crucial step
for FA of EV in the POT method. A number of
procedures for selecting thresholds have been proposed
in the literature, but it does not seem that a general
and objective method to define threshold has emerged
(see Lang et al., 1999; Coles, 2001; Smith, 2001; Katz
et al., 2005 for a discussion on this issue). The key
point for threshold choice is that it has to be sufficiently
large to avoid violation of the asymptotic property of
the GPD, but not so high that too few exceedances remain
for reliable ML estimates. This work adopts the strategy of
testing several relatively high thresholds for selecting the
EV of daily rainfall and ascertaining which one may
give simultaneously a good fit and satisfactory ML
estimates. In addition, the possibility of using a variable
threshold within each year was explored, taking into
account the notion that the extreme is season dependent,
especially if the local rain regime has a significant annual
cycle, as in Sao Paulo. This idea of using a variable
threshold is given in Katz et al. (2002), though they did
not use it effectively. Hence, in order to obtain a sample
of POT values, time-dependent thresholds composed of
p quantiles, qp (td ), of daily rainfall amounts are used instead of a fixed threshold within each year. Here qp (td )

Figure 2. Plots for time series of the average monthly rainfall (mm) for Sao Paulo for the period from 1933 to 2005.
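
The idea of a day-of-year threshold built from p quantiles can be illustrated with a short sketch. The exact construction of qp(td) used in this work is not reproduced in the text above, so the following makes its own assumptions: the quantile for each calendar day is computed empirically from all values observed on that day (optionally pooled with neighbouring days through the hypothetical `half_window` parameter), and day 366 of leap years is folded into day 365. All function names are illustrative.

```python
def empirical_quantile(values, p):
    """Linear-interpolation p-quantile of a non-empty list (0 <= p <= 1)."""
    s = sorted(values)
    if not s:
        raise ValueError("no data available for this day of the year")
    pos = p * (len(s) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    frac = pos - lo
    return s[lo] * (1.0 - frac) + s[hi] * frac


def day_of_year_thresholds(rain, doy, p=0.98, half_window=0):
    """Threshold for each day 1..365 from all years of record.

    rain: daily rainfall amounts; doy: matching day-of-year values.
    half_window: assumed pooling of +/- this many neighbouring days.
    """
    by_day = {d: [] for d in range(1, 366)}
    for r, d in zip(rain, doy):
        by_day[min(d, 365)].append(r)            # assumption: fold day 366 into 365
    thresholds = {}
    for d in range(1, 366):
        pooled = []
        for k in range(-half_window, half_window + 1):
            pooled.extend(by_day[(d - 1 + k) % 365 + 1])   # wrap around the year
        thresholds[d] = empirical_quantile(pooled, p)
    return thresholds


def peaks_over_threshold(rain, doy, thresholds):
    """Return (index, excess) pairs for days whose rainfall exceeds the threshold."""
    pot = []
    for i, (r, d) in enumerate(zip(rain, doy)):
        u = thresholds[min(d, 365)]
        if r > u:
            pot.append((i, r - u))
    return pot
```

Keeping the index of each exceedance preserves its position in the record, which is what the covariates (day of the year and time) of the non-stationary models need.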
model, and a linear approximation seems to be reasonable for this task.

2.5. Condition of independence and declustering

The practical application of EV theory has the drawback of relying on the hypothesis that the observations are
statistically independent. However, as several data points are taken from each year, there may be several
clustered values above threshold that are unlikely to be statistically independent. In the case of daily
rainfall in Sao Paulo, a cluster of EV may have the same origin, such as a cold front, which can cause time
dependence (or autocorrelation) among EV, and this phenomenon is not uncommon in Sao Paulo over the year. In
this work, in order to overcome the dependence we adopt the simplest technique (e.g. Khaliq et al., [...]
cluster are set to zero. This filling with zeros aims to preserve the original positions of exceedances. The
condition of independence among exceedances within declustered series was examined using the non-parametric
Kendall’s τ statistic following Claps and Laio (2003). It was found

Table I. The four GPD models and the specification of their scale parameter σ. The shape parameter γ is
constant for the four models.

Model   Scale parameter of the model
GPD-1   σ = constant
GPD-2   log σ_i = σ_0 + σ_1 sin(2π t_d/365) + σ_2 cos(2π t_d/365), as in Equation (5)
GPD-3   log σ_i = σ_0 + σ_1 t_i, as in Equation (6)
GPD-4   log σ_i = σ_0 + σ_1 sin(2π t_d/365) + σ_2 cos(2π t_d/365) + σ_1 t_i, as in Equation (7)

[...] identically distributed, they have to be transformed to residuals, $\varepsilon_i$ (Katz et al., 2002):
[...]

$$
\left( \frac{i}{M + 1},\ 1 - \exp(-\tilde{\varepsilon}_i) \right) \qquad (9)
$$

Like PP, QQ plot is given by points:
$$
AIC_c = -2 \log L(\hat{\theta}|Y) + 2K + \frac{2K(K + 1)}{M - K - 1} \qquad (11)
$$
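
Equation (11) is simple to evaluate once the maximized log-likelihood, the number of parameters K, and the sample size M are known. The sketch below is an illustration only. It assumes that the "rescaled" form means differences from the smallest AICc among the candidate models (as in Burnham and Anderson, 2004), and the log-likelihood values in the usage lines are made up for demonstration, not taken from this work.

```python
def aicc(loglik, k, m):
    """Equation (11): AIC with the second-order (small-sample) correction."""
    if m - k - 1 <= 0:
        raise ValueError("the correction requires M > K + 1")
    return -2.0 * loglik + 2.0 * k + 2.0 * k * (k + 1) / (m - k - 1)


def rescaled(aicc_by_model):
    """Differences from the smallest AICc; the preferred model gets 0."""
    best = min(aicc_by_model.values())
    return {name: value - best for name, value in aicc_by_model.items()}


# Hypothetical (log-likelihood, K) pairs, for illustration only.
example = {"GPD-1": (-1000.0, 2), "GPD-3": (-996.0, 3)}
scores = {name: aicc(ll, k, 276) for name, (ll, k) in example.items()}
deltas = rescaled(scores)   # the model with the smallest AICc has delta 0
```

With M in the hundreds, the correction term is small, so the ranking is driven mainly by the log-likelihood and the 2K penalty.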
asymptotic theory, loosely speaking, the $q_{0.98}(t_d)$, which generated 276 POT values, is a reasonable choice
for the upper limit of the thresholds. These 276 values can be thought of as a realization of the Poisson
process with a rate parameter of about 3.8 occurrences per year. Changing p in $q_p(t_d)$ from 0.98 to 0.95 to
enlarge the POT sample is not a recommended procedure, though it could be very attractive. One can note that if
$q_{0.95}(t_d)$ is used instead of $q_{0.98}(t_d)$ the $SE(\hat{\gamma})$ does not diminish significantly,
despite a substantial increase in sample size, i.e. from 276 to 655. The results of GPD models fitted to 1000
bootstrap samples also show that the strategy of lowering the threshold does not lead to an increase in
accuracy. The estimates of the mean, $\gamma^*$, and standard errors, $SE(\gamma^*)$, of the distribution of
bootstrap estimates, $\gamma^*$, of the shape parameter are shown in Table II, for a set of five thresholds
$q_p(t_d)$. Considering again the thresholds $q_{0.95}(t_d)$ and $q_{0.98}(t_d)$, one can note from this table
that the difference in $SE(\gamma^*)$ is only about 50%. Table II also shows the proportion of positive
estimates. A proportion larger than 90% for positive estimates, especially for the more appropriate thresholds
discussed above, is an indication of a heavy-tailed distribution. We will return to this matter later.

[...] the GPD-3 and GPD-4 models contain a trend component in the scale parameter. As one may see, the QQ plots
in Figure 7 show a satisfactory fit for both models. For the sake of brevity, PP plots are not shown. It was
observed that the PP plot is less helpful than the QQ plot for discerning differences in the quality of fit. A
slight superiority of GPD-3 can be noted in Figure 7, which is consistent with the model selection analysis.

With all the results in favour of GPD-3, it is reasonable to assume that the GPD model which incorporates both
linear trend and annual cycle through time-varying thresholds is the best model to represent the extreme daily
rainfall events for Sao Paulo.

We now draw inferences about the sign of the shape parameter, i.e. the kind of tail of this distribution, using
the t-statistic. The use of this statistic is reasonable since the distribution of γ estimated from 1000
bootstrap samples (not shown) is close to the normal distribution. The computed value of the t-statistic is
about 1.9, such that the null hypothesis $H_0: \gamma = 0$ is rejected at the 90% confidence level. In addition
to this, it is also noted that 95% of 1000 bootstrap estimates of this parameter are positive, though its
magnitude is quite small, about 0.09. The estimated
Table II. Average, γ*, and standard error, SE(γ*) (in parentheses), obtained from 1000 bootstrap estimates of γ* for each of the
four GPD models and five qp(td) thresholds used for extracting POT values to which GPD models were fitted. The percentage
of samples with positive γ*, np, is also shown.

Model   Statistic      q0.95(td)       q0.96(td)       q0.97(td)       q0.98(td)       q0.99(td)
GPD-1   γ* (SE(γ*))    0.031 (0.030)   0.033 (0.033)   0.092 (0.037)   0.100 (0.047)   0.188 (0.071)
        np (%)         86              85              99              98              99
GPD-2   γ* (SE(γ*))    0.009 (0.029)   0.010 (0.032)   0.071 (0.038)   0.076 (0.047)   0.076 (0.077)
        np (%)         65              66              96              94              94
GPD-3   γ* (SE(γ*))    0.022 (0.031)   0.023 (0.033)   0.082 (0.039)   0.088 (0.048)   0.185 (0.070)
        np (%)         78              79              97              95              99
GPD-4   γ* (SE(γ*))    0.000 (0.030)   0.000 (0.032)   0.056 (0.039)   0.057 (0.048)   0.125 (0.075)
        np (%)         48              48              91              88              95
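
The three summaries reported in Table II (mean, standard error, and percentage of positive estimates) can be obtained from any set of bootstrap replicates. The sketch below is a generic illustration, not the procedure used in this work: it assumes simple resampling with replacement of the POT sample, and `estimator` is a hypothetical user-supplied function that refits the model to a resample and returns the shape estimate. For the models with covariates, each sample item would be an (excess, date) pair so that the two are resampled together.

```python
import random
import statistics


def bootstrap_summary(sample, estimator, n_boot=1000, seed=0):
    """Mean, standard error and percentage of positive bootstrap estimates.

    sample:    list of observations (or tuples such as (excess, date))
    estimator: function mapping a resample to a single number
    n_boot:    number of bootstrap resamples (at least 2)
    """
    rng = random.Random(seed)                    # fixed seed for reproducibility
    n = len(sample)
    estimates = []
    for _ in range(n_boot):
        # Draw n items with replacement from the original sample.
        resample = [sample[rng.randrange(n)] for _ in range(n)]
        estimates.append(estimator(resample))
    mean = statistics.fmean(estimates)
    se = statistics.stdev(estimates)             # spread of the bootstrap estimates
    pct_positive = 100.0 * sum(e > 0 for e in estimates) / n_boot
    return mean, se, pct_positive
```

Dividing an estimate by its bootstrap standard error gives a t-type statistic of the kind used in the text to test the null hypothesis γ = 0.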
Figure 7. QQ plots for the residual quantiles for (a) GPD-3 and (b) GPD-4 models fitted to the values selected
by the time-varying threshold. The line of equality in both panels indicates a perfect fit.
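
The points of residual PP and QQ plots such as those in Figure 7 can be computed as sketched below. The residual transformation and the QQ formula are not reproduced in the text above, so this sketch assumes the standard construction for non-stationary GPD models (Coles, 2001; Katz et al., 2002): the residual ε_i = log(1 + γ y_i/σ_i)/γ follows a unit exponential distribution if the fitted model is adequate. The PP points follow Equation (9); the function names are illustrative.

```python
import math


def residuals(y, sigma, gamma):
    """Transform excesses y_i with fitted scales sigma_i to unit-exponential residuals."""
    out = []
    for yi, si in zip(y, sigma):
        if abs(gamma) < 1e-8:                    # exponential limit, gamma -> 0
            out.append(yi / si)
        else:
            out.append(math.log(1.0 + gamma * yi / si) / gamma)
    return out


def pp_points(res):
    """Probability plot points as in Eq. (9): (i/(M+1), 1 - exp(-ordered residual))."""
    m = len(res)
    s = sorted(res)
    return [(i / (m + 1), 1.0 - math.exp(-s[i - 1])) for i in range(1, m + 1)]


def qq_points(res):
    """Quantile plot points: (unit-exponential quantile, ordered residual)."""
    m = len(res)
    s = sorted(res)
    return [(-math.log(1.0 - i / (m + 1)), s[i - 1]) for i in range(1, m + 1)]
```

Points lying close to the line of equality indicate that the fitted scale and shape parameters describe the excesses well.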
time, depicting changes in the frequency or probability of the three considered quantiles over the years
1933–2005. Numerically, by using the coefficients of the three linear fittings, it is found that the
probability of rainfall of about 100 mm increased from 1% to 2% between 1933 and 1954 and to 5% in 2001. It is
important to mention that a procedure, such as the one adopted here, of bringing seasonality into FA of EV can
have some undesirable side effects if the extreme rainfall regime is not changing for some month(s) or
season(s), or if it is changing in a different manner, for instance, one season showing increases while the
other shows decreases. In such cases, this procedure is not recommended at all. In the case of the extreme
rainfall regime in Sao Paulo focused on here, pooling data from wet and dry seasons in FA and extracting some
quantiles seems to be a reasonable procedure since the two periods for which high quantiles were extracted
show positive trend, though in the latter it is less significant than in the first.

Figure 9. Linear regression fit of the 0.99, 0.98 and 0.95 quantiles of daily rainfall estimated by GPD-3
(inclined straight lines), for October–March, and rainfall amount of 102 mm (horizontal straight line) with
probability of 1% in the beginning of record and increasing to 2% in 1954, and 5% in 2001.

4. Summary and concluding remarks

The FA of EV of daily rainfall in Sao Paulo, in the non-stationary context, was performed using the POT
approach and GPD modelling. The period of analysis is from 1933 to 2005. The FA was carried out with the entire
series without splitting into months or seasons. For selecting exceedances, we used a threshold dependent on
the day of the year, denoted as $q_p(t_d)$, $t_d = (1, 2, \ldots, 365)$, composed of p-quantiles of daily
rainfall amount obtained from all available data. Non-stationary features present in the data, such as
seasonality and long-term trend, were incorporated as covariates. The seasonality is a prominent characteristic
of the rainfall regime in Sao Paulo, and a statistically significant long-term trend was captured in the annual
and seasonal (October–March and April–September) maximum of daily rainfall series. Four GPD models, denoted
GPD-1, GPD-2, GPD-3 and GPD-4, were designed and tested for the distribution of EV of the considered series.
The shape parameter was set constant for the four models, but in GPD-2, GPD-3, and GPD-4 time-varying
covariate(s) were incorporated into the scale parameter, as follows: in GPD-2, σ describes the annual cycle, in
GPD-3, σ accommodates a linear trend, and in GPD-4, σ incorporates both annual cycle and linear trend. These
models were therefore fitted to several samples of exceedances by means of the ML method, and the results of
the ML estimations of the model parameters were examined to find an appropriate $q_p(t_d)$ and the best model
among the four competing models, regarding both the performance of the ML estimation and the asymptotic theory
of the GPD. It was found that $q_{0.98}(t_d)$ is an appropriate threshold for GPD for the considered data. The
selection of the models was performed by using the small-sample version of AIC, in its rescaled form, which
isolated GPD-3 as being the best model, i.e. the model with a positive linear trend in EV of daily rainfall in
Sao Paulo, statistically significant at a confidence level of about 98%. Another characteristic of this model
is that it is heavy tailed (γ > 0), but the magnitude of γ is quite small. The detected trend, as found in our
FA of EV of daily rainfall in Sao Paulo, is a matter of concern, whatever the associated cause, as floods have
been a serious problem for the city over the last decades, largely as an effect of urban growth, inadvertent
land use, and lack of public policy for preventing higher population density in flood risk-prone areas such as
shanty towns. Although this work was performed with single-station data, the collected information could be
applied to other points of Sao Paulo, relying on a reasonable assumption of homogeneity in terms of the
atmospheric system and its manifestation over the city.

Acknowledgements

This work was partially supported by the Brazilian Financial Support Agency (FINEP) under contracts
#01.06.1120.00 and #01.06.1126.00. The authors acknowledge the staff of the IAG/USP Weather Station for
providing the data, and especially its former director Professor Paulo Marques dos Santos, for his efforts to
keep the station operating continuously and his constant concern for the quality of the measurements. The two
anonymous reviewers are also acknowledged for their helpful comments and corrections.

References

Akaike H. 1974. A new look at the statistical model identification. IEEE Transactions on Automatic Control AC-19: 716–723.
Alexander LV, Zhang X, Peterson TC, Caesar J, Gleason B, Klein Tank AMG, Haylock M, Collins D, Trewin B, Rahimzadeh F, Tagipour A, Rupa Kumar K, Revadekar J, Griffiths G, Vincent L, Stephenson DB, Burn J, Aguilar E, Brunet M, Taylor M, New M, Zhai P, Rusticucci M, Vazquez-Aguirre JL. 2006. Global observed changes in daily climate extremes of temperature and precipitation. Journal of Geophysical Research 111: D05109, DOI: 10.1029/2005JD006290.
Beniston M, Stephenson DB. 2004. Extreme climatic events and their evolution under changing climatic conditions. Global and Planetary Change 44: 1–9.
Burnham KP, Anderson DR. 2004. Multimodel inference: understanding AIC and BIC in model selection. Sociological Methods & Research 33: 261–304.
Carlson TN, Arthur ST. 2000. The impact of land use/land cover changes due to urbanization on surface microclimate and hydrology: A satellite perspective. Global and Planetary Change 25: 49–65.
Carvalho LMV, Jones C, Liebmann B. 2002. Extreme precipitation events in Southern South America and large-scale convective patterns in South Atlantic Convergence Zone. Journal of Climate 15: 2377–2394.
Childs PP, Raman S. 2005. Observations and numerical simulations of urban heat island and sea breeze circulations over New York City. Pure and Applied Geophysics 162: 1955–1980.
Claps P, Laio F. 2003. Can continuous streamflow data support flood frequency analysis? An alternative to the partial duration series approach. Water Resources Research 39: 6-1:6-11.
Cleveland WS, Grosse E, Shyu WM. 1992. Local regression models. In Statistical Models in S, Chambers JM, Hastie TJ (eds). Wadsworth and Brooks/Cole: Pacific Grove, CA; 309–376.
Coles SG. 2001. An Introduction to Statistical Modeling of Extreme Values. Springer: London; 208.
Cunderlik JM, Burn DH. 2003. Non-stationary pooled flood frequency analysis. Journal of Hydrology 276: 210–223.
Cunderlik JM, Ouarda TBMJ. 2006. Regional flood-duration-frequency modeling in the changing environment. Journal of Hydrology 318: 276–291.
Davison AC, Smith RL. 1990. Models for exceedances over high thresholds. Journal of the Royal Statistical Society B 52: 393–442.
Dufek AS, Ambrizzi T. 2008. Precipitation variability in Sao Paulo State, Brazil. Theoretical and Applied Climatology 93: 167–178, DOI: 10.1007/s00704-007-0348-7.
Efron B. 1979. Bootstrap methods: another look at the Jackknife. Annals of Statistics 7: 1–26.
El Adlouni S, Ouarda TBMJ, Zhang X, Roy R, Bobée B. 2007. Generalized maximum likelihood estimators for the nonstationary GEV model. Water Resources Research 43: W03410, DOI: 10.1029/2005WR004545.
Felici M, Lucarini V, Speranza A, Vitolo R. 2007. Extreme value statistics of the total energy in an intermediate-complexity model of the midlatitude atmospheric jet. Part I: Stationary case. Journal of the Atmospheric Sciences 64: 2137–2158.
Garcia JA, Gallego MC, Serrano A, Vaquero JM. 2007. Trends in block-seasonal extreme rainfall over the Iberian Peninsula in the second half of the twentieth century. Journal of Climate 20:
Kalnay E, Cai M. 2003. Impact of urbanization and land-use change on climate. Nature 423: 528–531.
Katz RW, Brown BG. 1992. Extreme events in a changing climate: variability is more important than averages. Climatic Change 21: 289–302.
Katz RW, Brush GS, Parlange MB. 2005. Statistics of extremes: modeling ecological disturbances. Ecology 86: 1124–1134.
Katz RW, Parlange MB, Naveau P. 2002. Statistics of extremes in hydrology. Advances in Water Resources 25: 1287–1304.
Khaliq MN, Ouarda TBMJ, Ondo J-C, Gachon P, Bobée B. 2006. Frequency analysis of a sequence of dependent and/or non-stationary hydro-meteorological observations: A review. Journal of Hydrology 329: 534–552.
Kharin VV, Zwiers FW. 2000. Changes in the extremes in an ensemble of transient climate simulations with a coupled atmosphere-ocean GCM. Journal of Climate 13: 3760–3788.
Lang M, Ouarda TBMJ, Bobée B. 1999. Towards operational guidelines for over-threshold modeling. Journal of Hydrology 225: 103–117.
Laurent C, Parey S. 2007. Estimation of 100-year-return-period temperatures in France in a non-stationary climate: Results from observations and IPCC scenarios. Global and Planetary Change 57: 177–188.
Li Y, Cai W, Campbell EP. 2005. Statistical Modeling of Extreme Rainfall in Southwest Western Australia. Journal of Climate 18: 852–863.
Liebmann B, Kiladis GN, Marengo JA, Ambrizzi T, Glick JD. 1999. Submonthly convective variability over South America and the South Atlantic convergence zone. Journal of Climate 12: 1877–1891.
Liebmann B, Vera CS, Carvalho LMV, Camilloni I, Hoerling MP, Barros VR, Baez J, Bidegain M. 2004. An observed trend in central South American Precipitation. Journal of Climate 17: 4357–4367.
Lombardo MA. 1985. A ilha de calor nas metrópoles: o exemplo de São Paulo. Editora Hucitec, in Portuguese, 244.
Lund R, Reeves J. 2002. Detection of undocumented changepoints: A revision of the two-phase regression model. Journal of Climate 15: 2547–2554.
Marengo JA. 2004. Interdecadal variability and trends of rainfall across the Amazon basin. Theoretical and Applied Climatology 78: 79–96.
Nelder JA, Mead R. 1965. A simplex algorithm for function minimization. Computer Journal 7: 308–313.
Nogaj M, Parey S, Dacunha-Castelle D. 2007. Non-stationary extreme models and a climatic application. Nonlinear Processes in Geophysics 14: 305–316.
113–130. Oke TR. 1982. The energetic basis of the urban heat island. Quarterly
Gonçalves FLT, Dias PLS, Araújo GP. 2002. Climatological analysis Journal of the Royal Meteorological Society 108: 1–24.
of extreme low temperatures in Sao Paulo City, Brazil: impact of Parey S, Malek F, Laurent C. 2007. Trends and climate evolution:
the oceanic SST anomalies. International Journal of Climatology Statistical Approach for very high temperatures in France. Climatic
22: 1511–1526. Change 81: 331–352.
Haylock MR, Peterson T, Abreu de Sousa JR, Alves LM, Ambrizzi T, Pindyck R, Rubinfeld DL. 1976. Econometric Models and Economic
Baez J, Barbosa JI, Barros VR, Berlato MA, Bidegain M, Coro- Forecasts, 1st edn. The MacGraw-Hill Companies, Inc.: New York;
nel G, Corradi V, Grimm AM, Jaildo dos Anjos R, Karoly D, 634.
Marengo Ja, Marino MB, Meira PR, Miranda GC, Molion L, R Development Core Team. 2006. R A. Language and Environment
Moncunill DF, nechet D, Ontaneda G, Quintana J, Ramirez E, for Statistical Computing. R Foundation for Statistical Computing:
Rebello E, rusticucci M, Santos Jl, Varillas IT, Villanueva JG, Vin- Vienna, ISBN 3-900051-07-0, http://www.R-project.org.
cent L, Yumiko M. 2006. Trends in total and extreme South Amer- Silva RR, Zocchi SS. 2006. A distribuição generalizada de Pareto-
ican rainfall in 1960–2000 and links with sea surface temperature. Poisson no estudo da precipitação pluvial total diária máxima em
Journal of Climate 19: 1490–1512. Piracicaba, SP. Revista Brasileira de Matemática e Estatı́stica 24:
He JF, Liu JY, Zhuang DF, Zhang W, Liu ML. 2007. Assessing the 77–94, in Portuguese.
effect of land use/land cover change on the change of urban heat Smith RL. 2001. Environmental Statistics, Available from http://www.
island intensity. Theoretical and Applied Climatology 90: 217–226. stat.unc.edu/postscript/rs/envnotes.ps, version 5.0, 9 july 2001.
Hipel KW, McLeod AI. 2005. Time Series Modelling of Water Strupczewski WG, Singh VP, Feluch W. 2001a. Non-stationary
Resources and Environmental Systems, Available from www.stats. approach to at-site flood frequency modeling I. Maximum likelihood
uwo.ca/faculty/aim/1994Book/. estimation. Journal of Hydrology 248: 123–142.
Horel J, Geisler J. 1997. Global Environmental Change: An Atmo- Strupczewski WG, Singh VP, Mitosek HT. 2001b. Non-stationary
spheric Perspective. John Wiley and Sons: New York; 152. approach to at-site flood frequency modeling. III. Flood analysis
Houghton JT, Ding Y, Griggs DJ, Noguer M, van der Linden PJ, of Polish rivers. Journal of Hydrology 248: 152–167.
Dai X, Maskell K, Johnson CA. 2001. Climate change 2001: the Trenberth KE. 1999. Conceptual framework for changes of extremes
scientific basis. Contribution of Working Group 1 to the Third of the hydrological cycle with climate change. Climatic Change 42:
Assessment Report of IPCC. International Panel on Climate Change. 327–339.
Available from www.ipcc.ch. Wang XL. 2003. Comments on “detection of undocumented
Huff FA, Changnon SA. 1972. Climatological assessment of urban changepoints: A revision of the two-phase regression model”.
effects on precipitation at St. Louis. Journal of Applied Meteorology Journal of Climate 16: 3383–3385.
11: 823–842. Wang XL. 2008. Penalized maximal F test for detecting undocumented
Hurvich CM, Tsai CL. 1989. Model selection for extended Quasi- mean shift without trend change. Journal of Atmospheric and
Likelihood Models in small Samples. Biometrics 51: 1077–1084. Oceanic Technology 25: 368–384.
Jain S, Lall U. 2001. Floods in a changing climate: Does the past Wigley TML. 1988. The effect of changing climate on the frequency
represent the future? Water Resources Research 37: 3193–3205. of absolute extreme events. Climate Monitor 17: 44–55.