Stratified Sampled Synthetic Events For Flood Risk Calculations

STRATIFIED SAMPLED SYNTHETIC EVENTS FOR FLOOD RISK
CALCULATIONS
By
Gert Leyssen (1) and Joris Blanckaert (1)
(1) IMDC, International Marine and Dredging Consultants, Coveliersstraat 15, 2600 Berchem, Belgium
(gert.leyssen@imdc.be; joris.blanckaert@imdc.be)
ABSTRACT
A multivariate probabilistic methodology for the creation of extreme boundary conditions for
flood modeling is presented and applied to the river Yser (Belgium). An extreme value distribution
is determined for the primary variables at each boundary: rainfall-runoff from the French part of
the Yser catchment and from all tributaries in the Belgian part of the catchment, and downstream
water levels at the Yser mouth in the North Sea at Nieuwpoort (Belgium). The univariate extreme
value distributions are coupled by a Gumbel copula to model dependencies. Next, the
multivariate domain of variables is stratified into classes, each of which represents a
combination of variable quantiles and corresponds to a joint probability. For each class, a
synthetic event is generated by multiplying the class mean with a normalized unit profile. The
normalized profiles are computed by scaling extreme historical events between 0 and 1. The
profile standard deviation is calculated in order to stratify the profile variation domain. These
synthetic events are the boundary conditions of the hydrodynamic model.
Keywords: extreme value, multivariate, flood risk, copula, stratified sampling, unit hydrograph
3rd STAHY International Workshop on STATISTICAL METHODS FOR HYDROLOGY AND WATER RESOURCES MANAGEMENT
October 1-2, 2012 Tunis, Tunisia
1 INTRODUCTION
In order to prepare flood risk management plans, a full probabilistic methodology was tested for the Belgian
part of the Yser basin (1101 km²). Flooding in the catchment is primarily determined by the combination of
large rainfall runoff volumes and downstream North Sea water levels, since raised levels impede flow by
gravity through the drainage sluices in Nieuwpoort (Figure 1). Rainfall runoff volumes are concentrated at
the upstream French-Belgian border and the Belgian subcatchments. Hence, a multivariate approach is
appropriate to account for joint probabilities of upstream runoff.
Flood risk is calculated from the probability distribution of flood consequences, so the probability
distribution of inundation depths must be determined. Since severe consequences can compensate for
extremely low frequencies, extreme values also have to be considered, and statistical extrapolation of
the recorded time series is necessary. An extreme value distribution has been determined for each primary
random variable. By means of a technique based on the mean excess function, an appropriate member of the
Generalized Pareto Distribution (GPD) family is selected. Likewise an optimal threshold for Peak Over
Threshold (POT) selection is determined and the extreme value distribution is fitted by maximum likelihood
optimization. The joint dependency structure of the primary random variables is modeled by an extreme
value copula.
Next, the multivariate domain of variables is stratified into classes, each of which represents a
combination of variable quantiles and corresponds to a joint probability. For each class, a synthetic event
is generated by multiplying the class mean with the time series of a normalized unit profile (unit
hydrograph). The normalized profiles are computed by scaling extreme historical events between 0 and 1.
The profile standard deviation is calculated in order to stratify the profile variation domain.
These synthetic events can be used as boundary conditions for a hydrodynamic model, which allows an
empirical frequency distribution of inundation depths in the floodplains to be derived. This in turn yields
a damage distribution and a risk assessment.
Chapter 2 describes the Yser basin and the processing of the raw data used to derive the extreme
value distributions in chapter 3. The combination of univariate extreme value distributions into a joint
multivariate distribution is explained in chapter 4. Finally, chapter 5 describes the creation of synthetic
events by combining the unit profile of each variable with the stratified classes. The conclusions are
summarized in chapter 6.
2 YSER BASIN: TIME SERIES

The Yser is a river that originates in Northern France and has a total length of 78 km, of which 45 km lie in
Belgium. The total Yser basin has an area of 1101 km², of which 730 km² lie in Belgium. The Belgian part
consists of the Yser and six tributaries (Table I, Figure 1). During periods of high discharge, part of the
water is drained through a canal (Figure 1). The probabilistic methodology is tested on the Belgian part of
the Yser. For each boundary condition of the hydrodynamic model, a long-term time series is obtained:
seven runoff time series (the Yser and six tributaries) and a water level series at Nieuwpoort (Table I).
To reduce the uncertainty of an extreme value analysis, the length of the time series should be maximized
while preserving the homogeneity of the series in order to limit bias.
Table I - Time series for boundary conditions.
Boundary           Location       Start        End
North Sea          Nieuwpoort     1971/01/01   2009/03/25
Yser               Roesbrugge     1987/01/01   2012/07/10
Poperingevaart     Oostvleteren   1972/02/16   2012/06/15
Kemmelbeek         Reninge        1986/05/14   2012/06/15
Ieperlee           Boezinge       1978/06/01   2012/06/15
Martjesvaart       Merkem         1986/05/05   2012/06/13
Stenensluisvaart   Merkem         1990/09/05   2012/06/16
Handzamevaart      Kortemark      1994/04/13   2012/06/15

Figure 1 - Yser basin

The runoff values are derived from gauged water levels in combination with a rating curve. A high
low-water level is the limiting factor for the gravity outflow at the river mouth in Nieuwpoort. A long-term
low-water level series from Ostend is translated to Nieuwpoort and combined with recent low-water level
measurements from Nieuwpoort (Trouw, 2004). The trend caused by sea level rise is calculated by a
combined linear and sinusoidal fit, which takes the nodal tide constituent into account (Figure 2). The
low-water time series is detrended with respect to the reference year 2010.

Figure 2 - Yearly average low water and trend (x=year)
3 EXTREME VALUE ANALYSIS

3.1 Peak Over Threshold
In this study Peak Over Threshold (POT) values are selected from the long-term time series. POT values are
independent extremes exceeding a set threshold. The independence of the events is ensured by imposing a
maximum inter-event level and a minimum time lag between two successive events. The initially chosen
threshold has to be sufficiently low to include all extreme events. In a later step a higher,
optimized threshold is determined.
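The selection rule above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `select_pot` and the parameters `min_lag` (in time steps) and `max_inter_level` are hypothetical, and the rule that only the larger of two dependent peaks is kept is an assumption.

```python
def select_pot(series, threshold, min_lag, max_inter_level):
    """Return (index, value) pairs of independent peaks above `threshold`.

    Two successive peaks are treated as independent when they are at least
    `min_lag` time steps apart AND the series drops to or below
    `max_inter_level` somewhere between them (assumed criteria).
    Otherwise only the larger of the two peaks is kept.
    """
    peaks = []
    for i in range(1, len(series) - 1):
        x = series[i]
        # Keep only local maxima that exceed the threshold.
        if x <= threshold or x < series[i - 1] or x < series[i + 1]:
            continue
        if peaks:
            j, y = peaks[-1]
            trough = min(series[j:i + 1])
            independent = (i - j >= min_lag) and (trough <= max_inter_level)
            if not independent:
                if x > y:
                    peaks[-1] = (i, x)  # replace the previous, smaller peak
                continue
        peaks.append((i, x))
    return peaks
```

For example, with `series = [0, 5, 1, 6, 0, 0, 0, 7, 0]`, a threshold of 4, `min_lag=3` and `max_inter_level=0.5`, the peaks at index 1 and 3 are merged into the larger one and the peak at index 7 is kept as a separate event.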
3.2 Threshold excess models

The tail behavior of extreme POT values can be modeled by the Generalized Pareto Distribution (GPD), a
generalized distribution representing the tail behavior of different conditional threshold excess models. A
model calibration method suggested by Beirlant (2004) and Coles (2001) is adopted in this study. By means
of the mean excess function it is possible to estimate the GPD shape parameter $\gamma$, which determines
the appropriate conditional distribution.
The mean excess is the mean of the excesses over a threshold, taken over all POT values above that
threshold. Each conditional distribution has a theoretical mean excess function, which returns the mean
excess as a function of the threshold. The shape of the empirical mean excess function can be compared
with the theoretical mean excess functions of the different threshold models. If the empirical mean excess
function has an increasing trend, the corresponding distribution belongs to the GPD ($\gamma > 0$), the
Pareto distribution or the Conditional Weibull distribution ($\tau < 1$). In case of a horizontal mean
excess function the data follow the GPD ($\gamma = 0$), i.e. the Exponential Distribution. If the mean
excess function is decreasing, the model is the GPD ($\gamma < 0$) or the Conditional Weibull
Distribution ($\tau > 1$) (Figure 3). After a conditional threshold excess model has been selected, the
parameter estimation is based on the choice of an optimal threshold, as shown in Figure 4.
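A minimal sketch of the empirical mean excess function is given here for illustration. The function names are hypothetical; each distinct observation except the largest is used in turn as a threshold, which is one common convention and an assumption with respect to this study.

```python
def mean_excess(values, threshold):
    """Mean of (x - threshold) over all values x strictly above the threshold."""
    excesses = [x - threshold for x in values if x > threshold]
    if not excesses:
        raise ValueError("no values above threshold")
    return sum(excesses) / len(excesses)


def mean_excess_function(values):
    """Empirical mean excess for every distinct value used as threshold.

    The largest value is skipped because nothing exceeds it.
    Returns a list of (threshold, mean excess) pairs.
    """
    thresholds = sorted(set(values))[:-1]
    return [(u, mean_excess(values, u)) for u in thresholds]
```

Plotting the returned pairs and judging whether the curve rises, stays roughly level or falls with increasing threshold corresponds to the comparison with the theoretical mean excess functions described above.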

Figure 3 - Mean excess functions
Figure 4 - Selection of optimized threshold for a GPD with $\gamma = 0$ (Exponential Distribution, Yser Roesbrugge).
In this example the shape of the mean excess function suggests an exponential distribution. Hence an
exponential distribution is repeatedly fitted (maximum likelihood fit) to the POT values, taking into
account a decreasing number of POT values, corresponding to an increasing threshold. The optimized
threshold is selected where the RMSE of the fit reaches a local minimum and the calculated scale parameter
is at the same time located in a stable zone. With the scale and shape parameters set, the
threshold model can be drawn, as shown in Figure 5.
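The repeated fit can be sketched as below for the exponential case. The maximum likelihood estimate of the exponential scale is the mean of the excesses. How the RMSE is defined in the study is not stated, so the comparison of sorted excesses with model quantiles at plotting positions rank/(n+1) is an assumption, as are the function names and the `min_points` cut-off.

```python
import math


def fit_exponential(pot_values, threshold):
    """ML fit of an exponential distribution to the excesses above `threshold`.

    Returns (scale, rmse, n). The RMSE compares sorted excesses with
    exponential quantiles at plotting positions rank / (n + 1) (assumption).
    """
    excesses = sorted(x - threshold for x in pot_values if x > threshold)
    n = len(excesses)
    if n < 2:
        raise ValueError("need at least two values above the threshold")
    scale = sum(excesses) / n  # ML estimator of the exponential scale
    sq_err = 0.0
    for rank, observed in enumerate(excesses, start=1):
        p = rank / (n + 1)
        modelled = -scale * math.log(1.0 - p)  # exponential quantile
        sq_err += (observed - modelled) ** 2
    return scale, math.sqrt(sq_err / n), n


def scan_thresholds(pot_values, min_points=5):
    """Refit for increasing thresholds, i.e. a decreasing number of POT values."""
    results = []
    for u in sorted(set(pot_values)):
        if sum(1 for x in pot_values if x > u) < min_points:
            break
        scale, rmse, n = fit_exponential(pot_values, u)
        results.append((u, n, scale, rmse))
    return results
```

The rows returned by `scan_thresholds` can be inspected for a local RMSE minimum that coincides with a stable scale parameter; that final choice is left to the analyst in this sketch.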

Figure 5 - Univariate extreme value distribution for flow discharges in the Yser, at French-Belgian border in
Roesbrugge (Exponential Distribution)
4 MULTIVARIATE DEPENDENCIES
4.1 COPULA
Since Li (2000) first introduced copulas into default modelling, there has been increasing interest in this
approach. Until then, the copula concept was used frequently in survival analysis and actuarial
science.
According to Li (2000) and Nelsen (2006), a copula is a function that joins or couples a multivariate
distribution function to its one-dimensional marginal distribution functions, or equivalently a
distribution function whose one-dimensional margins are uniform. For m uniform random variables
$U_1, U_2, \dots, U_m$ the joint distribution function C is defined as:
$$C(u_1, u_2, \dots, u_m; \theta) = \Pr[U_1 \le u_1, U_2 \le u_2, \dots, U_m \le u_m]$$
where $\theta$ is a dependence parameter.
The copula function C can link univariate marginal or conditional distribution functions
$F_1(x_1), F_2(x_2), \dots, F_m(x_m)$ into a multivariate distribution function F with univariate marginal
distributions specified by $F_1(x_1), F_2(x_2), \dots, F_m(x_m)$:
$$C(F_1(x_1), F_2(x_2), \dots, F_m(x_m)) = F(x_1, x_2, \dots, x_m)$$

Sklar (1959) established the converse. He showed that any joint distribution function F can be expressed
through a copula function: if $F(x_1, x_2, \dots, x_m)$ is a joint multivariate distribution function with
univariate marginal distribution functions $F_1(x_1), F_2(x_2), \dots, F_m(x_m)$, then there exists a copula
$C(u_1, u_2, \dots, u_m)$ such that
$$F(x_1, x_2, \dots, x_m) = C(F_1(x_1), F_2(x_2), \dots, F_m(x_m))$$
If each $F_i$ is continuous then C is unique (Sklar's theorem). Thus, copula functions provide a unifying
and flexible way to study joint distributions. A bivariate copula function C(u, v) for the uniform
variables U and V is defined over the area $\{(u, v) \mid 0 < u \le 1, 0 < v \le 1\}$. The joint survival
function (exceedance probability) $\bar{C}$ of a two-dimensional copula is given by (Nelsen 2006):
$$\bar{F}(x, y) = \Pr[X > x, Y > y] = 1 - F_X(x) - F_Y(y) + F(x, y)$$
$$\bar{F}(x, y) = \bar{C}(u, v) = 1 - u - v + C(u, v), \qquad u = F_X(x),\ v = F_Y(y)$$
The copula used in this project was first discussed by Gumbel (1960b) and is referred to as the Gumbel
copula:
$$C(u, v) = \exp\left\{-\left[(-\ln u)^{\theta} + (-\ln v)^{\theta}\right]^{1/\theta}\right\}, \qquad \theta \in [1, \infty)$$
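As an illustration, the Gumbel copula and the joint exceedance probability derived from it can be evaluated as in the sketch below. The function names are hypothetical and `theta` stands for the dependence parameter of the copula.

```python
import math


def gumbel_copula(u, v, theta):
    """Gumbel copula C(u, v) for a dependence parameter theta >= 1."""
    if theta < 1.0:
        raise ValueError("theta must be >= 1")
    if u <= 0.0 or v <= 0.0:
        return 0.0  # limit of C as either argument tends to zero
    a = math.log(1.0 / u) ** theta  # (-ln u)^theta
    b = math.log(1.0 / v) ** theta  # (-ln v)^theta
    return math.exp(-((a + b) ** (1.0 / theta)))


def joint_exceedance(u, v, theta):
    """Joint survival function P[U > u, V > v] = 1 - u - v + C(u, v)."""
    return 1.0 - u - v + gumbel_copula(u, v, theta)
```

With `theta = 1` the copula reduces to independence, C(u, v) = u v. For `theta > 1` the joint exceedance probability of two high quantiles is larger than under independence, which is the behavior needed to describe linked runoff events.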
4.2 Runoff Roesbrugge - LW Nieuwpoort

The dependency between extreme rainfall-runoff events at the upstream boundary of the Belgian part of
the Yser (Roesbrugge) and high low-water levels at Nieuwpoort is investigated. The p-values of the Pearson,
Kendall and Spearman coefficients all indicate that the null hypothesis H0 of independence cannot be
rejected at $\alpha = 0.05$. This can be explained by the substantially different meteorological events
that generate extreme discharges in the Yser and extreme low waters at Nieuwpoort. Hence, joint
occurrences of extreme values are considered unlikely, and the extreme synthetic upstream runoff events
are coupled with a neap, an average and a spring tide.
Table II - Correlation coefficients and p-values
            Rho/tau   p-value
Pearson     0.25073   0.27295
Kendall     0.26253   0.10281
Spearman    0.34881   0.12121
4.3 Runoff Roesbrugge - runoff Poperingevaart

The Poperingevaart (Figure 1) is selected as a representative subcatchment for all tributaries in order to
calculate the joint probabilities of the upstream runoff at Roesbrugge and the runoff from the
subcatchments. Here the null hypothesis of independence between the runoff events of the Yser
(Roesbrugge) and the Poperingevaart can be rejected at $\alpha = 0.05$ (Table III). The underlying
meteorological events are clearly linked. For the majority of the events, the maximum runoff occurs first
in the Poperingevaart and only 11 h later in the Yser at Roesbrugge. This time lag is used for the
synthetic events.
Table III - Correlation coefficients and p-values
            Rho/tau   p-value
Pearson     0.42289   0.00017
Kendall     0.30826   0.00011
Spearman    0.44372   7.5e-05

Figure 6 - Time shift Yser and Poperingevaart
The Gumbel copula is fitted to the coupled Peak Over Threshold (POT) values of the Yser and the
Poperingevaart, yielding a dependence parameter $\theta$ of 1.4633. The POT couples are visualized
together with 1000 random samples of the copula in Figure 7a. The survival copula provides the joint
exceedance probability of the discharge events. This exceedance probability is transformed into a yearly
exceedance frequency by means of the number of coupled POT values and the length of the original time
series.
Figure 7 - a) Copula, random samples (in blue), coupled events (in red), b) Survival Copula
5 SYNTHETIC EVENTS
5.1 Normalized unit profile: Hydrograph
The total discharge can be divided into runoff and baseflow. The baseflow can be calculated by the
procedure described in Eckhardt (2005):
$$q_b(t) = \frac{(1 - BFI)\, a\, q_b(t-1) + (1 - a)\, BFI\, q(t)}{1 - a\, BFI}$$
with $q(t)$ the total discharge, $q_b(t)$ the baseflow, $a$ the recession constant and BFI the baseflow
index, with default values of 0.99 and 0.35 for $a$ and BFI respectively.
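The recursive filter can be sketched as follows. The function names are hypothetical, the starting value `bfi * flow[0]` is an assumption (the text does not state how the recursion is initialised), and the cap that keeps the baseflow at or below the total flow follows the constraint in Eckhardt's formulation.

```python
def eckhardt_baseflow(flow, a=0.99, bfi=0.35):
    """Baseflow series from a total discharge series with the recursive filter.

    `a` is the recession constant and `bfi` the baseflow index.
    The first value is initialised as bfi * flow[0] (assumption).
    """
    if not flow:
        return []
    baseflow = [bfi * flow[0]]
    for q in flow[1:]:
        b = ((1.0 - bfi) * a * baseflow[-1] + (1.0 - a) * bfi * q) / (1.0 - a * bfi)
        baseflow.append(min(b, q))  # baseflow cannot exceed the total flow
    return baseflow


def split_runoff(flow, a=0.99, bfi=0.35):
    """Return (baseflow, runoff), with runoff = total flow - baseflow."""
    base = eckhardt_baseflow(flow, a, bfi)
    runoff = [q - b for q, b in zip(flow, base)]
    return base, runoff
```

For a constant discharge the filter settles at `bfi` times that discharge, which gives a quick sanity check on the parameter values.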
The runoff is calculated by subtracting the baseflow from the total flow. The normalized profiles are
computed by scaling extreme historical events, for both runoff and baseflow, between 0 and 1. A
normalized unit profile ($\mu_Y$) is then computed as the average of a number of normalized profiles $Y_i$
corresponding to the most extreme events in the time series. The number of extreme events n is a
compromise: fewer events keep the focus on extreme behavior, whereas more events reduce the influence of
the random behavior of a single event. To account for the random variation of individual event profiles, a
standard deviation ($\sigma_Y$) of the normalized unit hydrographs is computed from the same n events:
$$\mu_Y = \frac{\sum_{i=1}^{n} Y_i}{n}$$
$$\sigma_Y = \sqrt{\frac{\sum_{i=1}^{n} (Y_i - \mu_Y)^2}{n - 1}}$$
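A minimal sketch of the unit profile computation is given below. It assumes that the event series are equally long and already aligned on their peaks, and it interprets "scaling between 0 and 1" as min-max scaling; both are assumptions, and the function names are hypothetical.

```python
import math


def normalize(profile):
    """Scale one event time series linearly to the range [0, 1]."""
    lo, hi = min(profile), max(profile)
    if hi == lo:
        raise ValueError("a constant profile cannot be normalized")
    return [(x - lo) / (hi - lo) for x in profile]


def unit_profile(events):
    """Mean and sample standard deviation per time step of normalized events.

    `events` is a list of at least two equally long, peak-aligned series.
    """
    normalized = [normalize(e) for e in events]
    n = len(normalized)
    steps = range(len(normalized[0]))
    mean = [sum(p[t] for p in normalized) / n for t in steps]
    std = [
        math.sqrt(sum((p[t] - mean[t]) ** 2 for p in normalized) / (n - 1))
        for t in steps
    ]
    return mean, std
```

For instance, `unit_profile([[0, 2, 4, 2, 0], [0, 1, 2, 2, 0]])` returns a mean profile that peaks at 1 at the third time step and a standard deviation that is non-zero only where the two normalized events differ.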
Figure 8 displays the normalized unit hydrographs computed for the Yser at Roesbrugge, considering the
10 to 50 most extreme events. All the normalized unit hydrographs are very similar, but the limbs of the
hydrograph are steeper if only the highest extremes are used. In this study it was decided to include 25
events. In flood risk assessment, the time profile of the event has a major influence on the location and
the extent of the flooded area: short, high peak flows affect upstream areas, while large volumes mostly
affect downstream areas. The runoff and baseflow are recombined into a total discharge by a regression
curve in which the maximal baseflow and runoff are a function of the total discharge (Figure 9).
The same approach can be repeated for every boundary condition.
Figure 8 - Mean of normalized hydrograph profile, calculated over the 10, 20, 30, 40 and 50 most extreme events.
Figure 9 - Normalized hydrograph profile with standard deviation for the runoff and the baseflow, based on the 25 most
extreme events.

5.2 Stratified Sampling

Risk assessment must consider both severe events with extremely low frequencies and severe consequences,
and less extreme events with higher frequencies and less severe consequences:
$$\text{Risk} = \int S \, dP$$
Therefore synthetic events must be sampled over the complete range of frequencies P that might lead to
potential flood consequences S. Hence the discharge domain is divided into a number of discharge classes, a
process denoted stratification. Each discharge class or stratum is represented by a synthetic event, which is
generated by multiplying the time profile of the normalized unit hydrograph with the mean discharge of that
particular class.
5.2.1 Stratification of univariate Extreme Value Distribution
To ensure that all potential events that could contribute to the risk value are taken into account, the
discharge domain is subdivided into 10 equidistant discharge classes between the threshold of the extreme
value distribution and the upper limit of the 95% confidence interval at a frequency of $10^{-3}$ per year.
Lower frequencies are considered not to contribute to the total risk value and are neglected.
Figure 10 displays the stratification of the threshold model for peak flows in a subcatchment, in
this particular case a GPD with $\gamma = 0$. Each synthetic event i, representing a discharge class $Q_i$,
has a probability of occurrence, expressed as the expected number of occurrences per year of a peak flow in
the considered class or stratum. The probability of occurrence per year of a random event within stratum i
is:
$$P_{Q_i} = f_{i1} - f_{i2} = \frac{1}{T_{i1}} - \frac{1}{T_{i2}}$$
with $f_{i1}$ and $f_{i2}$ the exceedance frequencies, and $T_{i1}$ and $T_{i2}$ the return periods, of the
lower limit $Q_{i1}$ and the upper limit $Q_{i2}$ of stratum i (discharge class $Q_i$).
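The stratum frequencies can be sketched for the exponential case as shown below. The exceedance frequency model (average number of POT values per year times the exponential survival function) is a standard POT formulation but an assumption with respect to this study, and all names and numerical values are hypothetical.

```python
import math


def exceedance_frequency(q, threshold, scale, rate):
    """Expected number of exceedances of q per year (exponential POT model).

    `rate` is the average number of POT values per year above `threshold`.
    """
    return rate * math.exp(-(q - threshold) / scale)


def stratify(threshold, q_max, scale, rate, n_classes=10):
    """Split [threshold, q_max] into equidistant classes with yearly frequencies."""
    width = (q_max - threshold) / n_classes
    strata = []
    for i in range(n_classes):
        q1 = threshold + i * width
        q2 = q1 + width
        f1 = exceedance_frequency(q1, threshold, scale, rate)
        f2 = exceedance_frequency(q2, threshold, scale, rate)
        strata.append({
            "q_mid": 0.5 * (q1 + q2),  # representative discharge of the class
            "frequency": f1 - f2,      # P_Qi = f_i1 - f_i2
            "T1": 1.0 / f1,            # return period of the lower limit
            "T2": 1.0 / f2,            # return period of the upper limit
        })
    return strata
```

The class frequencies sum to the difference between the exceedance frequencies of the lowest and the highest class limit, so nothing is lost or double counted between the threshold and the upper bound.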
Figure 10 - Stratified sampling from the extreme value distribution of discharges (discharge versus return
period [years]; shown are the extreme value distribution, its 95% CI, the POT values and stratum i with
limits Qi1 and Qi2)

5.2.2 Stratification of a copula
A bivariate copula combines two univariate distributions into a multivariate distribution. The copula is
valid where both univariate distributions are valid (area 1 in Figure 11a), i.e. for the conditional
distributions above the optimal thresholds. Areas 2 and 3 in Figure 11a contain the events of variable 1
and variable 2 respectively for which the value of the other variable is below its optimal threshold. The
stratification of these areas is explained in 5.2.1.

Figure 11 - a) Area of occurrence of couples and uncoupled POT. b) Stratified Area of occurrence of couples and
uncoupled POT
The exceedance probability of joint occurrences at a given point of area 1 can be calculated from the
copula. If both variable 1 and variable 2 are subdivided into 10 classes, 100 cells and 121 ($11^2$)
exceedance probabilities are obtained. The probability of occurrence per year of each stratum is:
$$P(i, j) = \frac{k}{A} \left[ C(i+1, j+1) - C(i, j+1) - C(i+1, j) + C(i, j) \right]$$
where C is the exceedance probability given by the copula, k is the number of couples and A is the number
of years of data. The values of variables 1 and 2 in the middle of each class are assigned to that class
(red dots in Figure 11b). This yields a stratified two-dimensional ($Q_{Yser}$ and $Q_{Poperingevaart}$)
space with known frequencies (Figure 12).
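The cell frequencies can be sketched as follows. For brevity the class limits are given directly as non-exceedance probabilities of the two margins rather than as discharges, and the values used for k and A in the usage note are hypothetical placeholders, not figures from this study.

```python
import math


def gumbel_copula(u, v, theta):
    """Gumbel copula C(u, v) for a dependence parameter theta >= 1."""
    if u <= 0.0 or v <= 0.0:
        return 0.0
    a = math.log(1.0 / u) ** theta
    b = math.log(1.0 / v) ** theta
    return math.exp(-((a + b) ** (1.0 / theta)))


def joint_exceedance(u, v, theta):
    """Exceedance probability P[U > u, V > v] given by the copula."""
    return 1.0 - u - v + gumbel_copula(u, v, theta)


def stratify_copula(u_edges, v_edges, theta, k, years):
    """Yearly frequency of every cell spanned by the class limits.

    `k` is the number of coupled POT values, `years` the record length.
    """
    s = [[joint_exceedance(u, v, theta) for v in v_edges] for u in u_edges]
    cells = []
    for i in range(len(u_edges) - 1):
        row = []
        for j in range(len(v_edges) - 1):
            p = s[i + 1][j + 1] - s[i][j + 1] - s[i + 1][j] + s[i][j]
            row.append(k / years * p)
        cells.append(row)
    return cells
```

With 11 class limits per variable, e.g. `edges = [i / 10 for i in range(11)]`, the function evaluates 121 exceedance probabilities and returns 100 cell frequencies. Calling `stratify_copula(edges, edges, 1.4633, 60, 25)` gives frequencies that sum to 60/25 events per year, because the grid then covers the whole unit square.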
Figure 12 - Stratified copula

As a check, the return level - return period plot can be reconstructed by sorting the discharges in
descending order and accumulating the corresponding probabilities $P_{synth,i,k}$. This plot is compared
with the original univariate distribution (Figure 13).
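The sort-and-accumulate check can be sketched as below. The function name and the input layout (pairs of a value and a yearly frequency, all frequencies positive) are assumptions for illustration.

```python
def return_period_curve(events):
    """Rebuild a return level - return period relation from stratified events.

    `events` is an iterable of (value, frequency per year) pairs with
    positive frequencies. Values are sorted in descending order and the
    frequencies accumulated, so each row holds the value, its exceedance
    frequency and the corresponding return period.
    """
    ordered = sorted(events, key=lambda e: e[0], reverse=True)
    curve = []
    cumulative = 0.0
    for value, frequency in ordered:
        cumulative += frequency
        curve.append((value, cumulative, 1.0 / cumulative))
    return curve
```

The same routine applies to any quantity attached to the synthetic events, for example the maximum flood depth at a location, since only the ordering and the event frequencies are used.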
Figure 13 - Recomposition of the return level - return period plots after stratification.

5.2.3 Stratification of normalized unit hydrographs
The normalized unit hydrograph was computed as a mean time profile m(t) and a standard deviation s(t). To
account for random variations of the synthetic events, 5 hydrographs are constructed in each discharge
stratum:
m(t) - 2s(t) : the mean unit event minus twice the standard deviation
m(t) - s(t) : the mean unit event minus the standard deviation
m(t) : the mean unit event
m(t) + s(t) : the mean unit event plus the standard deviation
m(t) + 2s(t) : the mean unit event plus twice the standard deviation
This yields another stratification, with 5 strata or profile classes. The corresponding probabilities are
computed under the assumption of a normal distribution with mean $\mu = m(t)$ and standard deviation
$\sigma = s(t)$ for each t, as explained in Figure 14. The assumption of normality is reported to be
conservative (Singh 1985).
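One possible way to attach probabilities to the five profile classes is sketched below. The class limits are placed halfway between the offsets, i.e. at -1.5, -0.5, 0.5 and 1.5 standard deviations; this choice is an assumption, since the construction shown in Figure 14 is not reproduced in the text. The function names are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def profile_class_probabilities(offsets=(-2, -1, 0, 1, 2)):
    """Probability of each profile class under a normal distribution.

    Class limits lie halfway between successive offsets (assumption);
    the outer classes extend to minus and plus infinity.
    """
    edges = [-math.inf]
    edges += [0.5 * (a + b) for a, b in zip(offsets, offsets[1:])]
    edges.append(math.inf)
    return [normal_cdf(hi) - normal_cdf(lo) for lo, hi in zip(edges, edges[1:])]


def profile_family(mean, std, offsets=(-2, -1, 0, 1, 2)):
    """The shifted unit profiles m(t) + k s(t), keyed by the offset k."""
    return {k: [m + k * s for m, s in zip(mean, std)] for k in offsets}
```

Under this assumption the central class carries a probability of about 0.38, the two adjacent classes about 0.24 each and the two outer classes about 0.07 each, and the five probabilities sum to 1.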

Figure 14 - Stratification of hydrograph profile variation

5.2.4 Synthetic event frequencies
For each of the strata i, obtained from the stratification of the extreme value distributions or the
copula, 5 different synthetic events are generated by stratifying the normal variation into 5 strata k.
Hence the probability of a single synthetic hydrograph is:
$$P_{synth,i,k} = P_{Q_i} \cdot P_{prof,k}$$
with $P_{Q_i}$ computed as shown in Figure 10 or Figure 11, and $P_{prof,k}$ computed as shown in Figure 14.
5.2.5 Statistical uncertainty
As part of the extreme value analysis, confidence bounds are computed by means of the bootstrap resampling
technique. This interval represents the statistical uncertainty of the GPD parameter estimation. The
statistical uncertainty can be implemented in the stratification by recalculating the discharge class
probabilities $P_{Q_i}$ for the upper and lower limits of the 95% confidence interval of the GPD. This
yields a 95% upper and lower confidence bound. Since the statistical uncertainty only affects the class
probabilities and leaves the synthetic time profiles unchanged, no additional hydrodynamic runs are
required to calculate statistical uncertainties for the empirical distribution function of the flood
consequences.
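A percentile bootstrap for the scale parameter of an exponential threshold model could look like the sketch below. The number of resamples, the percentile method and the function name are assumptions; the text does not specify the bootstrap variant that was used.

```python
import random


def bootstrap_scale_interval(excesses, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for the exponential scale parameter.

    Each resample draws len(excesses) values with replacement; the ML
    estimate of the scale is the mean of the resampled excesses.
    """
    rng = random.Random(seed)  # fixed seed for reproducible bounds
    n = len(excesses)
    estimates = []
    for _ in range(n_boot):
        sample = [rng.choice(excesses) for _ in range(n)]
        estimates.append(sum(sample) / n)
    estimates.sort()
    lo_idx = int((1.0 - level) / 2.0 * n_boot)
    hi_idx = min(n_boot - 1, int((1.0 + level) / 2.0 * n_boot) - 1)
    return estimates[lo_idx], estimates[hi_idx]
```

The lower and upper scale values can then be used to recompute the class frequencies of the strata, while the synthetic hydrographs and the hydrodynamic results attached to them stay the same.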
5.3 Resulting events

Eventually a set of boundary conditions is obtained with a known frequency of occurrence. Each event is a
set of time series, one for every model boundary (Figure 15). The number of events is determined by the
stratification resolution (10 x 10 for the copula + 20 for areas 2 and 3), the combinations made with the
downstream boundary (3: neap, average and spring tide) and the variation of the unit time profile (5),
which yields (100 + 20) x 3 x 5 = 1800 synthetic events. The event frequencies can be used to obtain a
statistical distribution of consequences, e.g. flood depths, by sorting the maximum flood depths in
descending order and accumulating the corresponding probabilities $P_{synth,i,k}$.
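The bookkeeping behind the event count can be sketched as a simple enumeration. The labels and the function name are hypothetical; only the class counts (100 copula cells, 20 uncoupled strata, 3 tides, 5 profile classes) come from the text.

```python
from itertools import product


def enumerate_events(n_copula_cells=100, n_uncoupled=20,
                     tides=("neap", "average", "spring"),
                     profile_offsets=(-2, -1, 0, 1, 2)):
    """List every combination of discharge stratum, tide and profile class."""
    discharge_strata = [("copula", i) for i in range(n_copula_cells)]
    discharge_strata += [("uncoupled", i) for i in range(n_uncoupled)]
    return list(product(discharge_strata, tides, profile_offsets))
```

Calling `len(enumerate_events())` reproduces the 1800 synthetic events, and each tuple identifies the stratum frequency, the tide and the profile class that together define one set of boundary conditions.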

Figure 15 - Resulting synthetic events Yser Roesbrugge

6 CONCLUSION
In this paper a methodology has been presented for probabilistic assessment of flood risk. This includes the
generation of synthetic events for hydrodynamic simulation, representing a class of (a combination of)
random variables. Stratified sampling from (extreme value) distributions appears to be an efficient technique
to cover the complete frequency domain of interest, reducing the number of samples by taking into account
the probabilities of the strata. Furthermore, statistical uncertainties can be taken into account without
additional hydrodynamic computations.
The methodology with synthetic hydrographs, generated by stratified sampling, has the main advantage that
it allows a statistical distribution of the consequences to be determined. This makes the methodology
particularly appropriate for flood risk assessment and related applications.
In a second stage of the study the synthetic boundary conditions will be applied to a hydrodynamic model
and the results validated by comparison with a long-term simulation of the measured series. The statistical
distribution of the inundation depths will be used as validation parameter. Furthermore, the assumption of
normal variation of the unit profiles will be further investigated and evaluated.
7 REFERENCES
Li, D. 2000. On default correlation: a copula function approach. Working paper, RiskMetrics Group, New York.
Beirlant, J., Teugels, J.L. & Vynckier, P. 1996. Practical Analysis of Extreme Values. Leuven: University Press.
Beirlant, J., Goegebeur, Y. & Teugels, J. 2004. Statistics of Extremes: Theory and Applications. John Wiley & Sons Ltd.
Blanckaert, J., Bulckaen, D. & Schueremans, L. 2007. Average annual flood risk analysis of the Scheldt basin. International
Forum on Engineering Decision Making (IFEDM), Australian Journal of Civil Engineering 2007 (AJCE Vol. 4 No. 1).
Coles, S. 2001. An Introduction to Statistical Modeling of Extreme Values. London: Springer-Verlag.
Eckhardt, K. 2005. How to construct recursive digital filters for baseflow separation. Hydrological Processes 19: 507-515.
Gumbel, E.J. 1960. Distributions des valeurs extrêmes en plusieurs dimensions. Publ. Inst. Statist. Univ. Paris 9: 171-173.
Kotz, S. & Nadarajah, S. 2000. Extreme Value Distributions: Theory and Applications. London: Imperial College Press.
Nelsen, R.B. 2006. An Introduction to Copulas. New York: Springer Series in Statistics.
Singh, V.P., et al. 1985. On fitting gamma distributions to synthetic runoff hydrographs. Nordic Hydrology 16: 177-192.
Trouw, K., Blanckaert, J., Verwaest, T., Hurdle, D., Van Alboom, W., Monbaliu, J., De Rouck, J., De Wolf, P., Willems, M., Sas, M.,
Decroo, D. & van Banning, G. 2004. Extreme hydrodynamic boundary conditions near Ostend (Belgium). International Conference
on Coastal Engineering 2004, Lisbon.
Willems, P., et al. 2001. Algemene methodologie voor het modelleren van de waterafvoer in bevaarbare waterlopen in
Vlaanderen. Antwerp: Flanders Hydraulics (in Dutch).
