Stratified Sampled Synthetic Events For Flood Risk Calculations
By Gert Leyssen (1) and Joris Blanckaert (1)
(1) IMDC, International marine and dredging consultants, Coveliersstraat 15, 2600 Berchem, Belgium
(gert.leyssen@imdc.be ; joris.blanckaert@imdc.be )
ABSTRACT
A multivariate probabilistic methodology for the creation of extreme boundary conditions for
flood modeling is presented and applied to the river Yser (B). An extreme value distribution is
determined for the primary variables at each boundary: rainfall-runoff from the French part of
the Yser catchment and all tributaries in the Belgian part of the catchment, and downstream
water levels at the Yser mouth in the North Sea at Nieuwpoort (B). The univariate extreme
value distributions are coupled by a Gumbel copula to model dependencies. Next, the
multivariate domain of variables is stratified into classes, each of which represents a
combination of variable quantiles and corresponds to a joint probability. For each class, a
synthetic event is generated by multiplying the class mean with a normalized unit profile. The
normalized profiles are computed by scaling extreme historical events between 0 and 1. The
profile standard deviation is calculated in order to stratify the profile variation domain. These
synthetic events are the boundary conditions of the hydrodynamic model.
Keywords: extreme value, multivariate, flood risk, copula, stratified sampling, unit hydrograph
3rd STAHY International Workshop on STATISTICAL METHODS FOR HYDROLOGY AND WATER RESOURCES MANAGEMENT
October 1-2, 2012 Tunis, Tunisia
1 INTRODUCTION
In order to prepare flood risk management plans, a full probabilistic methodology was tested for the Belgian
part of the Yser basin (1101 km²). Flooding in the catchment is primarily determined by the combination of
large rainfall runoff volumes and downstream North Sea water levels, since raised levels impede flow by
gravity through the drainage sluices in Nieuwpoort (Figure 1). Rainfall runoff volumes are concentrated at
the upstream French-Belgian border and the Belgian subcatchments. Hence, a multivariate approach is
appropriate to account for joint probabilities of upstream runoff.
Flood risk is calculated from the probability distribution of flood consequences. Therefore the probability
distribution of inundation depths must be determined. Since severe consequences can compensate for
extremely low frequencies, extreme values also have to be considered. Hence statistical extrapolation of
recorded time series is necessary. An extreme value distribution has been determined for each primary
random variable. By means of a technique based on the mean excess function, an appropriate member of the
generalized Pareto distribution (GPD) family is selected. Likewise an optimal threshold for peak over threshold selection is determined and the
extreme value distribution is fitted by maximum likelihood optimization. The joint dependency structure for
the primary random variables is modeled by an extreme value copula.
Next, the multivariate domain of variables is stratified into classes, each of which represents a
combination of variable quantiles and corresponds to a joint probability. For each class, a synthetic event
is generated by multiplying the class mean with the time series of a normalized unit profile (unit
hydrograph). The normalized profiles are computed by scaling extreme historical events between 0 and 1.
The profile standard deviation is calculated in order to stratify the profile variation domain.
These synthetic events can be used as boundary conditions for a hydrodynamic model, allowing an empirical
frequency distribution of inundation depths in the floodplains to be drawn, which in turn yields a damage
distribution and a risk assessment.
Chapter 2 gives a description of the Yser basin and the processing of the raw data used to derive the extreme
value distributions in chapter 3. The combination of univariate extreme value distributions into a joint
multivariate distribution is explained in chapter 4. Finally, chapter 5 describes the creation of synthetic
events by combining the unit profile of each variable with the stratified classes. The conclusions are
summarized in chapter 6.
Figure 4 - Selection of optimized threshold for a GPD with shape parameter equal to 0 (Exponential Distribution, Yser Roesbrugge).
In this example the mean excess plot suggests an exponential distribution. Hence an exponential
distribution is repeatedly fitted (maximum likelihood fit) to the POT values, taking into account a
decreasing number of POT values, corresponding to an increasing threshold. An optimized threshold is
considered appropriate if the RMSE of the fit reaches a local minimum and, at the same time,
the calculated scale parameter is located in a stable zone. With the scale and shape parameters set, the
threshold model can be drawn, as shown in Figure 5.
Figure 5 - Univariate extreme value distribution for flow discharges in the Yser, at French-Belgian border in
Roesbrugge (Exponential Distribution)
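The threshold scan described for Figure 4 can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `exponential_threshold_scan`, the use of the sorted peaks themselves as candidate thresholds, and the plotting positions i / (n + 1) used for the RMSE are all assumptions. For an exponential distribution the maximum likelihood estimate of the scale is simply the mean excess.

```python
import math

def exponential_threshold_scan(peaks, min_excesses=5):
    """Scan candidate thresholds and fit an exponential distribution to the excesses.

    Returns a list of (threshold, scale, rmse) tuples, one per candidate.
    """
    ordered = sorted(peaks)
    results = []
    # Each sorted peak in turn serves as candidate threshold, so the number
    # of POT values above it decreases as the threshold rises.
    for start in range(len(ordered) - min_excesses):
        threshold = ordered[start]
        excesses = [q - threshold for q in ordered[start + 1:]]
        n = len(excesses)
        scale = sum(excesses) / n  # maximum likelihood estimate (exponential)
        # Fitted quantiles at assumed plotting positions i / (n + 1).
        fitted = [-scale * math.log(1.0 - i / (n + 1)) for i in range(1, n + 1)]
        rmse = math.sqrt(sum((e - f) ** 2 for e, f in zip(excesses, fitted)) / n)
        results.append((threshold, scale, rmse))
    return results
```

A threshold would then be picked where the `rmse` column shows a local minimum while the `scale` column is stable, as described above.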
4 MULTIVARIATE DEPENDENCIES
4.1 COPULA
Since Li (2000) first introduced copulas into default modelling, there has been increasing interest in this
approach. Until then, the copula concept had been used frequently in survival analysis and actuarial
science.
According to Li (2000) and Nelsen (2006), a copula is a function that joins or couples a multivariate
distribution function to its one-dimensional marginal distribution functions; equivalently, it is a
distribution function whose one-dimensional margins are uniform. For m uniform random variables
U1, U2, ..., Um the joint distribution function C is defined as:
$$C(u_1, u_2, \ldots, u_m; \theta) = \Pr[U_1 \le u_1, U_2 \le u_2, \ldots, U_m \le u_m]$$
where $\theta$ is a dependence parameter.
The copula function C can link univariate marginal or conditional distribution functions
$F_1(x_1), F_2(x_2), \ldots, F_m(x_m)$ into a multivariate distribution function F with univariate marginal distributions
specified by $F_1(x_1), F_2(x_2), \ldots, F_m(x_m)$:
$$C(F_1(x_1), F_2(x_2), \ldots, F_m(x_m)) = F(x_1, x_2, \ldots, x_m)$$
Sklar (1959) established the converse. He showed that any joint distribution function F can be written in
terms of a copula function: if $F(x_1, x_2, \ldots, x_m)$ is a joint multivariate distribution function with
univariate marginal distribution functions $F_1(x_1), F_2(x_2), \ldots, F_m(x_m)$, then there exists a copula
$C(u_1, u_2, \ldots, u_m)$ such that
$$F(x_1, x_2, \ldots, x_m) = C(F_1(x_1), F_2(x_2), \ldots, F_m(x_m))$$
If each $F_i$ is continuous then C is unique (Sklar's theorem). Thus, copula functions provide a unifying and
flexible way to study joint distributions. A bivariate copula function C(u,v) for the uniform
variables U and V is defined over the area $\{(u, v) \mid 0 < u \le 1,\ 0 < v \le 1\}$. The joint survival function
(exceedance probability) $\bar{C}$ of a two-dimensional copula is given by (Nelsen 2006):
$$\bar{F}(x, y) = P[X > x, Y > y] = 1 - F_X(x) - F_Y(y) + F(x, y)$$
$$\bar{F}(x, y) = \bar{C}(u, v) = 1 - u - v + C(u, v)$$
with $u = F_X(x)$ and $v = F_Y(y)$.
The copula used in this project was first discussed by Gumbel (1960b) and is referred to as the Gumbel
copula:
$$C(u, v) = \exp\left\{-\left[(-\ln u)^{\theta} + (-\ln v)^{\theta}\right]^{1/\theta}\right\}, \qquad \theta \in [1, \infty)$$
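As a minimal sketch, the Gumbel copula and the corresponding joint exceedance probability 1 - u - v + C(u, v) can be evaluated as below. The dependence parameter is written `theta` here, and the function names are illustrative assumptions, not part of the original study.

```python
import math

def gumbel_copula(u, v, theta):
    """Gumbel copula C(u, v) for 0 < u, v <= 1 and theta >= 1."""
    if theta < 1.0:
        raise ValueError("theta must be >= 1")
    s = (-math.log(u)) ** theta + (-math.log(v)) ** theta
    return math.exp(-(s ** (1.0 / theta)))

def joint_exceedance(u, v, theta):
    """Joint survival function P[U > u, V > v] = 1 - u - v + C(u, v)."""
    return 1.0 - u - v + gumbel_copula(u, v, theta)
```

For `theta = 1` the copula reduces to independence, C(u, v) = u v, which gives a quick sanity check; larger values of `theta` increase the joint exceedance probability of two high quantiles.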
Coefficient   Rho/tau   p-value
Pearson       0.25073   0.27295
Kendall       0.26253   0.10281
Spearman      0.34881   0.12121

Coefficient   Rho/tau   p-value
Pearson       0.42289   0.00017
Kendall       0.30826   0.00011
Spearman      0.44372   7.5e-05
The Gumbel copula is fitted to the coupled Peak Over Threshold (POT) values of the Yser and the
Poperingevaart, yielding a dependence parameter of 1.4633. The POT couples are visualized together with 1000
random samples of the copula in Figure 7a. The survival copula provides the joint exceedance probability of
the discharge events. This exceedance probability is transformed to a yearly exceedance frequency by
multiplying it by the number of coupled POT values and dividing by the length of the original time series in years.
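Two small helpers illustrate this step. The first uses the standard relation between Kendall's tau and the Gumbel dependence parameter, tau = 1 - 1/theta, as a rough cross-check on a fitted value; which of the tabulated tau values belongs to the coupled POT values is an assumption here, and the paper does not state which fitting method was used. The second converts a joint exceedance probability into a yearly frequency. All names are illustrative.

```python
def gumbel_theta_from_tau(tau):
    """Moment-type estimate of the Gumbel parameter from Kendall's tau."""
    if not 0.0 <= tau < 1.0:
        raise ValueError("the Gumbel copula requires 0 <= tau < 1")
    return 1.0 / (1.0 - tau)

def yearly_exceedance_frequency(joint_exceedance_probability, n_couples, n_years):
    """Scale a joint exceedance probability by the mean number of couples per year."""
    return joint_exceedance_probability * n_couples / n_years

# Assuming tau = 0.30826 refers to the coupled POT values:
theta_check = gumbel_theta_from_tau(0.30826)
```

Under that assumption `theta_check` is about 1.45, of the same order as the reported value of 1.4633.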
Figure 7 - a) Copula, random samples (in blue), coupled events (in red), b) Survival Copula
5 SYNTHETIC EVENTS
5.1 Normalized unit profile: Hydrograph
The total discharge can be divided into runoff and baseflow. The baseflow can be calculated by the
procedure described in Eckhardt (2005):
$$Q_b(t) = \frac{(1 - BFI)\,\alpha\,Q_b(t-1) + (1 - \alpha)\,BFI\,Q(t)}{1 - \alpha\,BFI}$$
where $Q_b(t)$ is the baseflow and $Q(t)$ the total discharge at time step t, with $\alpha$ the recession constant
and BFI the baseflow index, with default values of 0.99 and 0.35 respectively.
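The recursive filter can be sketched in a few lines. The recession constant is written `alpha` here; the start value `bfi * discharge[0]` and the cap of the baseflow at the total discharge are assumptions of this sketch, as is the function name.

```python
def eckhardt_baseflow(discharge, alpha=0.99, bfi=0.35):
    """Separate baseflow from a total discharge series with the recursive filter."""
    if not discharge:
        return []
    baseflow = [bfi * discharge[0]]  # assumed start value
    for q in discharge[1:]:
        b = ((1.0 - bfi) * alpha * baseflow[-1]
             + (1.0 - alpha) * bfi * q) / (1.0 - alpha * bfi)
        baseflow.append(min(b, q))  # baseflow cannot exceed the total flow
    return baseflow

def runoff(discharge, baseflow):
    """Runoff is the total flow minus the baseflow."""
    return [q - b for q, b in zip(discharge, baseflow)]
```

For a constant discharge the filter has the fixed point `bfi * q`, so the baseflow stays at that fraction of the flow; during a flood peak the baseflow rises much more slowly than the total discharge.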
The runoff can be calculated by subtracting the baseflow from the total flow. The normalized profiles are
computed by scaling extreme historical events, both runoff and baseflow, between 0 and 1. A
normalized unit profile ($\mu_Y$) is then computed as the average of a number of normalized profiles
corresponding to the most extreme events in the time series. The number of extreme events is a
compromise between a focus on extreme behavior, which calls for fewer events, and
reducing the influence of the random behavior of a single event, which calls for more events.
To account for the random variation of individual event profiles, a standard deviation ($\sigma_Y$) of the normalized
unit hydrographs is computed, based on the same number of extreme events:
$$\mu_Y = \frac{\sum_{i=1}^{n} Y_i}{n}$$
$$\sigma_Y = \sqrt{\frac{\sum_{i=1}^{n} (Y_i - \mu_Y)^2}{n - 1}}$$
where $Y_i$ is the normalized profile of the i-th extreme event and n the number of events.
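A minimal sketch of the mean and standard deviation of the normalized profiles is given below. It assumes that all events have already been cut to the same length and aligned on their peak, that each event is not constant, and that at least two events are supplied; the function names are illustrative.

```python
import math

def normalize(event):
    """Scale one event between 0 and 1 (min-max scaling)."""
    lo, hi = min(event), max(event)
    return [(x - lo) / (hi - lo) for x in event]

def unit_profile(events):
    """Return the mean and sample standard deviation per time step."""
    profiles = [normalize(e) for e in events]
    n = len(profiles)
    steps = len(profiles[0])
    mean = [sum(p[t] for p in profiles) / n for t in range(steps)]
    std = [
        math.sqrt(sum((p[t] - mean[t]) ** 2 for p in profiles) / (n - 1))
        for t in range(steps)
    ]
    return mean, std
```

Both outputs are time series of the same length as the input events, so the standard deviation can later be used to vary the profile shape per time step.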
Figure 8 displays the normalized unit hydrographs, computed for the Yser at Roesbrugge, considering the
10 to 50 most extreme events. All the normalized unit hydrographs are very similar, but the limbs
of the hydrograph are steeper if only the highest extremes are used. In this study it was decided to include 25 events. In
case of flood risk assessment, the time profile of the event has a major influence on the location and the
extent of the flood area. Short high peak flows affect upstream areas, while large volumes mostly affect
downstream areas. The runoff and baseflow are recombined into a total discharge by a regression curve
where the maximal baseflow and runoff are a function of the total discharge (Figure 9).
The same approach can be repeated for every single boundary condition.
Figure 8 - Mean of normalized hydrograph profile, calculated over the 10, 20, 30, 40 and 50 most extreme events.
Figure 9 - Normalized hydrograph profile with standard deviation for the runoff and the baseflow, based on the 25 most
extreme events.
$$\mathrm{Risk} = \int S \, dP$$
Therefore synthetic events must be sampled over the complete range of frequencies P that might lead to
potential flood consequences S. Hence the discharge domain is divided into a number of discharge classes, a
process denoted stratification. Each discharge class or stratum is represented by a synthetic event, which is
generated by multiplying the time profile of the normalized unit hydrograph with the mean discharge of that
particular class.
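In discrete form, each stratum contributes its consequence times its probability of occurrence, and a synthetic event is the class discharge times the unit profile. The sketch below illustrates both steps under simplifying assumptions: a single profile is scaled directly (the separate treatment of runoff and baseflow is left out), and all names are hypothetical.

```python
def synthetic_event(class_mean, unit_profile):
    """Scale a normalized unit profile (values in [0, 1]) by the class mean discharge."""
    return [class_mean * y for y in unit_profile]

def discrete_risk(consequences, probabilities):
    """Approximate the risk integral by a sum over the strata."""
    return sum(s * p for s, p in zip(consequences, probabilities))

# Example: a stratum with a mean peak discharge of 40 (units as in the data).
event = synthetic_event(40.0, [0.0, 0.5, 1.0, 0.5])
```

The consequence of each stratum would come from the hydrodynamic and damage models, which are outside the scope of this sketch.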
5.2.1 Stratification of univariate Extreme Value Distribution
In order to take into account all potential events that could contribute to the risk value, the discharge
domain was subdivided into 10 equidistant discharge classes between the threshold of the extreme value
distribution and the upper limit of the 95% confidence interval at the frequency of 10^-3 per year. Lower
frequencies do not contribute to the total risk value and can be neglected.
Figure 10 displays the stratification process of the threshold model for peak flows in a subcatchment, in
this particular case a GPD with shape parameter equal to 0. Each synthetic event i, representing a discharge class Qi, has a probability of
occurrence, expressed as the expected number of occurrences per year of a peak flow in the considered class
or stratum. The probability of occurrence per year of a random event within stratum i is:
$$P_{Q_i} = f_{i1} - f_{i2} = \frac{1}{T_{i1}} - \frac{1}{T_{i2}}$$
with $f_{i1}$ and $f_{i2}$ the exceedance frequencies, and $T_{i1}$ and $T_{i2}$ the return periods, of the lower limit $Q_{i1}$ and upper
limit $Q_{i2}$ of stratum i or discharge class $Q_i$.
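For an exponential threshold model the stratification can be sketched as follows. The exceedance frequency is assumed to be `rate * exp(-(q - threshold) / scale)`, with `rate` the mean number of POT values per year; the upper limit is passed in directly, and the class midpoint is used as representative discharge. These choices and the function name are assumptions of this sketch.

```python
import math

def stratify_exponential(threshold, scale, rate, upper, n_classes=10):
    """Split [threshold, upper] into equidistant classes with yearly probabilities."""
    width = (upper - threshold) / n_classes

    def frequency(q):
        # Yearly exceedance frequency of discharge q (assumed exponential POT model).
        return rate * math.exp(-(q - threshold) / scale)

    strata = []
    for i in range(n_classes):
        q_low = threshold + i * width
        q_high = q_low + width
        strata.append({
            "q_low": q_low,
            "q_high": q_high,
            "q_mid": 0.5 * (q_low + q_high),
            "p": frequency(q_low) - frequency(q_high),  # f_i1 - f_i2
        })
    return strata
```

The class probabilities sum to the exceedance frequency of the threshold minus that of the upper limit, which is a convenient check on the stratification.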
[Figure 10: discharge against return period [years] (0.1 to 1000), showing the extreme value distribution, the 95% CI, the POT values, and the range ΔQi of stratum i between Qi1 (return period Ti1) and Qi2 (return period Ti2).]
Figure 11 - a) Area of occurrence of couples and uncoupled POT. b) Stratified Area of occurrence of couples and
uncoupled POT
The exceedance probability of the joint occurrences at a certain point of area 1 can be calculated by the copula.
If both variable 1 and variable 2 are subdivided into 10 classes, 100 cells and 121 (11²) exceedance
probabilities will be obtained. The probability of occurrence per year of each stratum is:
$$P(i, j) = \frac{k}{A}\left[C(i+1, j+1) - C(i, j+1) - C(i+1, j) + C(i, j)\right]$$
where C is the exceedance probability given by the copula, k is the number of couples and A is the number
of years of data. The value of variable 1 and 2 in the middle of each class is assigned to that class (red dots
in Figure 11b). This yields a stratified two-dimensional (QYser and QPoperingevaart) space with known
frequencies (Figure 12).
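The cell probabilities can be sketched as below, working in probability space: the class limits of each variable are assumed to be given as marginal non-exceedance probabilities (`u_edges`, `v_edges`), and a Gumbel copula with parameter `theta` supplies the joint exceedance probability. The names and the treatment of a zero edge are assumptions of this sketch.

```python
import math

def gumbel_copula(u, v, theta):
    """Gumbel copula; returns 0 on the lower boundary u = 0 or v = 0."""
    if u <= 0.0 or v <= 0.0:
        return 0.0
    s = (-math.log(u)) ** theta + (-math.log(v)) ** theta
    return math.exp(-(s ** (1.0 / theta)))

def exceedance(u, v, theta):
    """Joint exceedance probability 1 - u - v + C(u, v)."""
    return 1.0 - u - v + gumbel_copula(u, v, theta)

def cell_frequencies(u_edges, v_edges, theta, n_couples, n_years):
    """Yearly frequency of each cell (i, j) of the stratified domain."""
    factor = n_couples / n_years  # k / A
    cells = []
    for i in range(len(u_edges) - 1):
        row = []
        for j in range(len(v_edges) - 1):
            p = (exceedance(u_edges[i + 1], v_edges[j + 1], theta)
                 - exceedance(u_edges[i], v_edges[j + 1], theta)
                 - exceedance(u_edges[i + 1], v_edges[j], theta)
                 + exceedance(u_edges[i], v_edges[j], theta))
            row.append(factor * p)
        cells.append(row)
    return cells
```

With 11 edges per variable this evaluates the exceedance probability on an 11 by 11 grid and returns 100 cell frequencies; when the edges span the full range from 0 to 1 the frequencies sum to `n_couples / n_years`.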
As a check, the return level versus return period plot can be made by sorting the discharges in descending order and
accumulating the corresponding probabilities Psynth,i,k. This plot is compared with the original univariate
distribution (Figure 13).
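This check can be sketched in a few lines: the strata are sorted by discharge in descending order and their yearly probabilities are accumulated, giving an empirical exceedance frequency and hence a return period for each discharge. The function name and the input layout (two parallel lists) are assumptions.

```python
def empirical_return_periods(discharges, probabilities):
    """Return (discharge, return period) pairs, largest discharge first."""
    pairs = sorted(zip(discharges, probabilities), reverse=True)
    result = []
    cumulative = 0.0
    for q, p in pairs:
        cumulative += p  # yearly frequency of events at least this large
        result.append((q, 1.0 / cumulative))
    return result
```

The resulting points can be plotted against the fitted univariate distribution; large deviations would indicate a problem in the stratification or in the assigned probabilities.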
6 CONCLUSION
In this paper a methodology has been presented for probabilistic assessment of flood risk. This includes the
generation of synthetic events for hydrodynamic simulation, representing a class of (a combination of)
random variables. Stratified sampling from (extreme value) distributions appears to be an efficient technique
to cover the complete frequency domain of interest, reducing the number of samples by taking into account
the probabilities of the strata. Furthermore, statistical uncertainties can be taken into account without
additional hydrodynamic computations.
The methodology with synthetic hydrographs, generated by stratified sampling, has the advantage that
it allows a statistical distribution of the consequences to be determined. Hence, this makes the methodology
particularly appropriate for flood risk assessment and related applications.
In a second stage of the study the synthetic boundary conditions will be applied to a hydrodynamic model
and the result validated by comparison with a long-term simulation of measurement series. The statistical
distribution of the inundation depths will be used as validation parameter. Furthermore the assumption of the
normal variation of the unit profiles will be further investigated and evaluated.
7 REFERENCES
Li, D. 2000. On default correlation: a copula function approach. Working paper, RiskMetrics Group, New York
Beirlant J., Teugels J.L. & Vynckier P. 1996. Practical Analysis of Extreme Values, Leuven: University Press
Beirlant J., Goegebeur, Y. & Teugels, J. 2004. Statistics of Extremes, Theory and Applications, John Wiley & Sons Ltd.
Blanckaert, J., Bulckaen, D. & Schueremans, L. 2007. Average annual flood risk analysis of the Scheldt basin, International
Forum on Engineering Decision Making (IFEDM), Australian Journal of Civil Engineering 2007 (AJCE Vol4 No1)
Coles, S. 2001. An introduction to statistical modeling of extreme values. London: Springer-Verlag
Eckhardt, K., 2005, How to construct recursive digital filters for baseflow separation, Hydrological Processes 19, 507-515
Gumbel, E. J. 1960. Distributions des valeurs extrêmes en plusieurs dimensions. Publ. Inst. Statist. Univ. Paris 9: 171-173
Kotz, S. & Nadarajah, S. 2000. Extreme value distributions, theory and applications. London: Imperial College Press.
Nelsen, R. B. 2006. An Introduction to Copulas. New York: Springer Series in Statistics.
Singh, V.P., et al. 1985. On fitting Gamma Distributions to Synthetic Runoff Hydrographs, Nordic Hydrology, 16, 177-192
Trouw, K, Blanckaert, J, Verwaest, T, Hurdle, D, Van Alboom, W, Monbaliu, J, De Rouck, J, De Wolf, P, Willems, M, Sas, M,
Decroo, D, van Banning, G. Extreme hydrodynamic boundary conditions near Ostend (Belgium). International Conference
on Coastal Engineering 2004, Lisbon.
Willems, P., et al. 2001. Algemene methodologie voor het modelleren van de waterafvoer in bevaarbare waterlopen in
Vlaanderen, Antwerp: Flanders Hydraulics (in Dutch)