Research Article

Selecting the best probability distribution for at-site flood frequency analysis; a study of Torne River

Mahmood Ul Hassan1 · Omar Hayat2 · Zahra Noreen3

Received: 26 May 2019 / Accepted: 29 October 2019 / Published online: 18 November 2019
© The Author(s) 2019  OPEN

Abstract
At-site flood frequency analysis is a direct method for estimating flood frequency at a particular site. The appropriate selection of a probability distribution and a parameter estimation method is important for at-site flood frequency analysis. The generalized extreme value, three-parameter log-normal, generalized logistic, Pearson type-III and Gumbel distributions are considered to describe the annual maximum stream flow at five gauging sites of the Torne River in Sweden. The parameters of the distributions are estimated by maximum likelihood estimation (MLE) and the L-moments (LM) method. The performance of these distributions is assessed using goodness-of-fit tests and accuracy measures. At most sites, the best-fitted distribution is obtained with the LM estimation method. Finally, the most suitable distribution at each site is used to predict the maximum flood magnitude for different return periods.

Keywords  Flood frequency analysis · L-moments · Maximum likelihood estimation

1 Introduction

Floods are natural hazards that cause extreme damage throughout the world. The main causes of floods are extreme rainfall, ice and snow melt, dam breakage and insufficient capacity of the river watercourse to convey the excess water. Floods cause disasters such as the destruction of infrastructure, damage to the environment and agricultural land, mortality and economic losses. Many frequency distribution models have been developed for the determination of hydraulic frequency, but none of them is accepted as a universal distribution for describing the flood frequency at any gauging site. The selection of a suitable distribution usually depends on the characteristics of the available data at a particular site. The flood magnitude at a particular site must be estimated for various purposes, including the construction of hydraulic structures (barrages, canals, bridges, dams, embankments, reservoirs and spillways), insurance studies, and the planning of flood management and rescue operations. A number of methods for flood magnitude estimation are available in the literature, but at-site flood frequency analysis remains the most direct method of estimating flood frequency at a particular site.

To describe the flood frequency at a particular site, the choice of an appropriate probability distribution and parameter estimation method is of immense importance. The probability distributions used in this study are the generalized extreme value (GEV), Pearson type-III (P3), generalized logistic (GLO), Gumbel (GUM) and three-parameter log-normal (LN3) distributions. These distributions are recommended for at-site flood frequency analysis in various countries [2, 4, 24]. Furthermore, they are the distributions most commonly encountered in the hydrological literature for at-site and regional flood frequency analysis.

*  Mahmood Ul Hassan, scenic555@gmail.com | 1Department of Statistics, Stockholm University, Stockholm, Sweden. 2Department
for Education, London, UK. 3Division of Science and Technology, University of Education, Lahore, Pakistan.

SN Applied Sciences (2019) 1:1629 | https://doi.org/10.1007/s42452-019-1584-z


Cicioni et al. [3] conducted at-site flood frequency analysis in Italy using 107 stations. They identified LN3 and GEV as the best-fitting distributions based on the Kolmogorov–Smirnov (KS), Anderson–Darling (AD) and Cramér–von Mises (CVM) goodness-of-fit tests. Saf [23] found the GLO to be the most suitable distribution for the Upper-West Mediterranean subregion in Turkey. Mkhandi et al. [17] used annual maximum flood data from 407 stations in 11 countries of Southern Africa to conduct a regional frequency analysis. They identified the Pearson type-III (P3) distribution with the probability weighted moments (PWM) method and the log-Pearson type-III (LP3) distribution with the method of moments (MOM) as the most suitable distributions for the regions. Młyński et al. [18] identified the log-normal distribution as the most suitable for the upper Vistula Basin region (Poland). Many studies in the literature compare various probability distributions with different parameter estimation methods for at-site flood frequency analysis, see e.g. [1, 5, 7, 22].

The most commonly used methods for parameter estimation in flood frequency analysis are the maximum likelihood estimation (MLE) method, the method of moments (MOM), the L-moments (LM) method and the probability weighted moments (PWM) method. The MLE method is efficient and is the most widely used method for parameter estimation. Recently, the LM method has gained more attention in the hydrological literature for estimating the parameters of probability distributions. In this study, we use the LM and MLE methods to estimate the parameters of the candidate probability distributions.

The methods usually used to select the best distribution are goodness-of-fit (GOF) tests (e.g. Anderson–Darling and Cramér–von Mises), accuracy measures (e.g. root mean square error and root mean squared percentage error), goodness-of-fit indices (e.g. AIC and BIC) and graphical methods (e.g. Q–Q plot and L-moment ratio diagram). In the hydrological literature, researchers have used different methods to find the best probability distribution. To identify the best-suited distribution at each site of the Torne River, we use a GOF test and accuracy measures. The GOF tests are used to test whether the data come from a specific distribution. The accuracy measures provide a term-by-term comparison of the deviation between the hypothetical distribution and the empirical distribution of the data. The accuracy measures and the goodness-of-fit test used in this study are described in Sect. 3.

The estimation of flood frequency for high return periods is of great interest in flood frequency analysis, and it is always associated with uncertainties. Uncertainty in flood frequency analysis arises from many sources. Uncertainties in water resources management can be divided into data uncertainties, structural uncertainties and model/parameter uncertainties, see e.g. [13, 14]. Furthermore, there is uncertainty in the estimation of flood frequency for return periods much longer than the actual records, particularly in the type of probability density function (PDF) and its parameters. This is particularly true in the right tail of the PDF, the region of interest for flooding. In addition, there is uncertainty in the measurements. See, for example, [15] for an in-depth discussion of epistemic (reducible) uncertainty and natural (irreducible) uncertainty. Flood estimates for high return periods are always associated with high uncertainties. In this study, we quantify the uncertainty of a given quantile estimate for a specific fitted distribution by using the parametric bootstrap method.

In this paper, the calculation of flood frequency using statistical distributions is addressed for gauged catchments for which a relatively long hydrological time series is available. The choice of an appropriate probability distribution and associated parameter estimation method is vital for at-site flood frequency analysis. The core objective of this study is to find the best-fit distribution among the candidate probability distributions, with a particular method of estimation (MLE or LM), for the annual maximum peak flow data at each site of the Torne River by using goodness-of-fit (GOF) tests and accuracy measures. We also examine whether there is an overall best distribution and fitting method for these five sites of the Torne River. Finally, we estimate the quantiles of flood magnitude for return periods of 5, 10, 25, 50, 100, 200 and 500 years, with the corresponding non-exceedance probabilities, at each site of the river using the best-fit probability distribution. To address the uncertainty of the flood estimates, we estimate the standard error of the estimated quantiles and construct a 95% confidence interval of the flood quantile for each return period using the parametric bootstrap method. This is the first study of at-site flood frequency analysis of the Torne River.

This paper is organized as follows: Sect. 2 describes the study area and the data available for the analysis. Section 3 deals with the model description, the parameter estimation methods and the model comparison methods. Section 4 provides the results and discussion of the L-moments and maximum likelihood based flood frequency analysis of five gauging sites on the Torne River. Finally, Sect. 5 concludes the article.

2 Study area and data

The Torne River forms the border between northern Sweden and Finland (Fig. 1), with a total catchment area of 40,157 km², of which 60% is within the Swedish border and the remaining
area is in Finland. The Muonio River, the largest tributary of the Torne River, joins it shortly after Pajala Pumphus. Another tributary, the Lainio River (259.74 km long), joins the Torne River shortly after Junosuando. In springtime, the water flow is above its average level, which leads to flooding that damages waterfront constructions and buildings [6]. The Torne River is therefore frequently affected by flooding problems [Swedish Meteorological and Hydrological Institute (SMHI)]. The annual maximum flow data of five gauging sites of the Torne River (Swedish: Torneälven) are considered in this study. The data were collected from SMHI (www.smhi.se). The length of the data series varies from 34 to 108 years. The characteristics of the Torne River gauging sites are summarized in Table 1.

Table 1  Summary of Torne River gauging sites

Station name         Station no.  Latitude  Longitude  Catchment area (km²)  Period for time series
Kukkolankoski Övre   16722        65.98     24.06      33,929.60             1911–2018
Pajala Pumphus       2012         67.21     23.40      11,038.10             1969–2018
Abisko               2357         68.19     19.99      3345.50               1984–2018
Junosuando           04           67.43     22.55      4348.00               1967–2018
Övre Abiskojokk      957          68.36     18.78      566.30                1985–2018

Fig. 1  Locations of gauging stations used in this study

3 Methodology

3.1 Candidate probability distributions

To describe the flood frequency at a particular site, the selection of an appropriate probability distribution is always important. We consider the generalized extreme value (GEV), Pearson type-III (P3), generalized logistic (GLO), Gumbel (GUM) and three-parameter log-normal (LN3) distributions for the analysis of flood frequency at five gauging sites of the Torne River. The probability density function (pdf) and quantile function y(F) of these distributions are summarized in Table 2. These distributions are common in the literature and are recommended for flood frequency analysis in many countries (see e.g. [2, 4, 21, 22]). The parameter estimation methods (MLE and LM) are explained in the following subsections.

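Table 2  Probability density and quantile functions of the probability distributions

GEV
$$f(y) = \frac{1}{\alpha}\left[1 - \kappa\,\frac{y-\mu}{\alpha}\right]^{\frac{1}{\kappa}-1} \exp\left\{-\left[1 - \kappa\,\frac{y-\mu}{\alpha}\right]^{\frac{1}{\kappa}}\right\}, \qquad y(F) = \mu + \frac{\alpha}{\kappa}\left[1 - (-\log F)^{\kappa}\right]$$

P3
$$f(y) = \frac{1}{\beta^{\alpha}\,\Gamma(\alpha)}\,(y-\mu)^{\alpha-1} \exp\left\{-\frac{y-\mu}{\beta}\right\}, \qquad y(F):\ \text{explicit analytical form is not available}$$

GLO
$$f(y) = \frac{1}{\alpha}\left[1 - \kappa\,\frac{y-\mu}{\alpha}\right]^{\frac{1}{\kappa}-1} \left[1 + \left\{1 - \kappa\,\frac{y-\mu}{\alpha}\right\}^{1/\kappa}\right]^{-2}, \qquad y(F) = \mu + \frac{\alpha}{\kappa}\left[1 - \left\{\frac{1-F}{F}\right\}^{\kappa}\right]$$

LN3
$$f(y) = \frac{1}{\alpha\sqrt{2\pi}} \exp\left\{-\log\left[1 - \kappa\,\frac{y-\mu}{\alpha}\right] - \frac{1}{2}\left(-\frac{1}{\kappa}\log\left[1 - \kappa\,\frac{y-\mu}{\alpha}\right]\right)^{2}\right\}, \qquad y(F):\ \text{explicit analytical form is not available}$$

GUM
$$f(y) = \frac{1}{\alpha}\exp\left[-\frac{y-\mu}{\alpha} - \exp\left(-\frac{y-\mu}{\alpha}\right)\right], \qquad y(F) = \mu - \alpha\log(-\log F)$$

where $\mu$, $\alpha$ and $\kappa$ are the location, scale and shape parameters of the distribution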
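The closed-form quantile functions in Table 2 translate directly into code. The following is a minimal sketch, not the authors' implementation: the function names are hypothetical, only the three distributions with an explicit y(F) are covered, and the GEV and GLO forms assume a non-zero shape parameter. The example evaluates the GEV for station 16722 with the rounded LM estimates from Table 4, so the results should be close to, but not identical with, the published quantiles.

```python
import math


def gev_quantile(F, mu, alpha, kappa):
    """GEV quantile y(F) = mu + (alpha/kappa) * [1 - (-log F)^kappa], kappa != 0."""
    return mu + alpha / kappa * (1.0 - (-math.log(F)) ** kappa)


def glo_quantile(F, mu, alpha, kappa):
    """GLO quantile y(F) = mu + (alpha/kappa) * [1 - ((1 - F)/F)^kappa], kappa != 0."""
    return mu + alpha / kappa * (1.0 - ((1.0 - F) / F) ** kappa)


def gum_quantile(F, mu, alpha):
    """Gumbel quantile y(F) = mu - alpha * log(-log F)."""
    return mu - alpha * math.log(-math.log(F))


def return_level(quantile, T, *params):
    """Flood magnitude for a T-year return period, using F = 1 - 1/T."""
    F = 1.0 - 1.0 / T
    return quantile(F, *params)


# Station 16722, GEV fitted by L-moments (rounded values taken from Table 4).
for T in (5, 10, 25, 50, 100, 200, 500):
    print(T, round(return_level(gev_quantile, T, 1990.07, 456.59, 0.15), 1))
```

For P3 and LN3, where no explicit quantile function exists, the same `return_level` helper could be reused with a quantile obtained by numerically inverting the CDF.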


3.2 Maximum likelihood estimation (MLE) method

The MLE method estimates the parameters by maximizing the log-likelihood function of a probability distribution. Suppose we have n independent and identically distributed observations $y_1, y_2, \ldots, y_n$. Each $y_i$ has a pdf given by $f(y_i; \boldsymbol{\mu})$. Here, $\boldsymbol{\mu} = (\mu_1, \mu_2, \ldots, \mu_k)$ is a vector of unknown parameters to be estimated. Then, the log-likelihood function is defined as

$$l(\boldsymbol{\mu}) = \sum_{i=1}^{n} \log f(y_i; \boldsymbol{\mu})$$

The maximum likelihood estimate of $\boldsymbol{\mu}$ is the value of the parameter vector that maximizes $l(\boldsymbol{\mu})$ for the given data Y. We use numerical optimization to search for the $\boldsymbol{\mu}$ that gives the maximum value of $l(\boldsymbol{\mu})$. Many numerical optimization methods, e.g. the Newton–Raphson method, the Nelder–Mead method and differential evolution, are found in the literature. We have used the numerical optimization method proposed by Nelder and Mead [19].

3.3 Theory of L-moments (LM)

L-moments were introduced by Hosking [9, 10] and are linear functions of probability weighted moments (PWMs). L-moments are an alternative to the conventional moments, but are computed from linear combinations of order statistics. L-moments can be defined for any random variable Y whose mean exists [10]. The rth-order PWM ($\beta_r$) is defined as

$$\beta_r = \int_0^1 y(F)\,\{F(y)\}^{r}\,dF, \qquad r = 0, 1, 2, \ldots$$

where F(y) is a cumulative probability distribution and y(F) is a quantile function of the distribution. The first four L-moments, in terms of linear combinations of PWMs, are defined as

$$\lambda_1 = \beta_0$$
$$\lambda_2 = 2\beta_1 - \beta_0$$
$$\lambda_3 = 6\beta_2 - 6\beta_1 + \beta_0$$
$$\lambda_4 = 20\beta_3 - 30\beta_2 + 12\beta_1 - \beta_0$$

The first L-moment ($\lambda_1$) is a measure of location (mean), while the second L-moment represents the dispersion. Finally, the L-moment ratios defined by Hosking [10] are given below

$$\text{L-coefficient of variation: } \tau_2 = \frac{\lambda_2}{\lambda_1}, \qquad \text{L-skewness: } \tau_3 = \frac{\lambda_3}{\lambda_2}, \qquad \text{L-kurtosis: } \tau_4 = \frac{\lambda_4}{\lambda_2}$$

The unbiased sample estimators of $\beta_i$ for the first four PWMs of any distribution can be computed as follows

$$b_0 = n^{-1}\sum_{j=1}^{n} y_{j:n}$$
$$b_1 = n^{-1}\sum_{j=2}^{n} \frac{(j-1)}{(n-1)}\, y_{j:n}$$
$$b_2 = n^{-1}\sum_{j=3}^{n} \frac{(j-1)(j-2)}{(n-1)(n-2)}\, y_{j:n}$$
$$b_3 = n^{-1}\sum_{j=4}^{n} \frac{(j-1)(j-2)(j-3)}{(n-1)(n-2)(n-3)}\, y_{j:n}$$

where the data ($y_{1:n}, \ldots, y_{n:n}$) are a sample ordered in ascending order from 1 to n. The parameters under the L-moments estimation method are obtained by equating the sample L-moments with the distribution L-moments.

3.4 Standard error of estimated parameters

The standard errors (SE) of the estimated parameters indicate the reliability of the estimates and the performance of the estimation technique. In this study, we obtain the SE of the estimated parameters by Monte Carlo simulation. The procedure is as follows:

• We use the parameters estimated with the MLE and LM methods at each gauging site and draw 1000 samples, each of size equal to the length of the data, from each probability distribution.
• For each simulated sample, we obtain the MLE and LM estimates of the parameters of the distribution.
• For each gauging site, the standard errors are obtained by taking the standard deviation of these 1000 MLE and LM estimates of the parameters of each distribution.

3.5 Goodness-of-fit (GOF) tests

Goodness-of-fit tests are used to test whether the observed data follow a particular distribution. We consider the Anderson–Darling (AD) test in this study. This test is often used in flood frequency analysis and has shown good performance for small sample sizes and heavy-tailed distributions [12, 20]. The test statistic of the AD test is defined as

$$AD = -n - S$$

where

$$S = \sum_{i=1}^{n} \frac{2i-1}{n}\left[\log\left(1 - F(y_{n-i+1})\right) + \log\left(F(y_i)\right)\right]$$

and $F(y_i)$ represents the cumulative distribution function (CDF) of the specified distribution.
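The AD statistic defined above can be evaluated in a few lines. This is an illustrative sketch only: the function names and the flow values are hypothetical, the fitted CDF is passed in as a callable, and the code returns the statistic alone. The P value used later for ranking would require a separate step that the text does not spell out.

```python
import math


def anderson_darling(sample, cdf):
    """AD = -n - S with S = sum_i (2i - 1)/n * [log F(y_i) + log(1 - F(y_{n-i+1}))]."""
    y = sorted(sample)  # ordered sample y_1 <= ... <= y_n
    n = len(y)
    s = 0.0
    for i in range(1, n + 1):
        # y[i - 1] is y_i and y[n - i] is y_{n-i+1} in the paper's 1-based notation
        s += (2 * i - 1) / n * (math.log(cdf(y[i - 1])) + math.log(1.0 - cdf(y[n - i])))
    return -n - s


def gumbel_cdf(mu, alpha):
    """Return the Gumbel CDF F(y) = exp(-exp(-(y - mu)/alpha)) as a callable."""
    return lambda y: math.exp(-math.exp(-(y - mu) / alpha))


# Hypothetical annual maxima (m^3/s) and hypothetical Gumbel parameters.
flows = [210.0, 305.5, 280.1, 412.3, 266.7, 350.9, 298.4, 240.2]
print(anderson_darling(flows, gumbel_cdf(mu=270.0, alpha=60.0)))
```

A smaller statistic indicates closer agreement between the fitted and empirical distributions, with extra weight on the tails.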


3.6 Accuracy measure method

As accuracy measures (AM), we use the mean absolute error (MAE), mean absolute percentage error (MAPE), root mean square error (RMSE), root mean squared percentage error (RMSPE) and correlation coefficient ($R^2$) to evaluate how adequately a given distribution fits the observed data. These measures are defined as

$$MAE = \frac{1}{n}\sum_{i=1}^{n}\left|F(y_i) - F(\hat{y}_i)\right|$$

$$MAPE = \frac{100}{n}\sum_{i=1}^{n}\left|\frac{F(y_i) - F(\hat{y}_i)}{F(y_i)}\right|$$

$$RMSE = \sqrt{\frac{\sum_{i=1}^{n}\left(F(y_i) - F(\hat{y}_i)\right)^2}{n}}$$

$$RMSPE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\frac{F(y_i) - F(\hat{y}_i)}{F(y_i)}\right)^2} \times 100$$

$$R^2 = \frac{\sum_{i=1}^{n}\left(F(\hat{y}_i) - \bar{F}(y_i)\right)^2}{\sum_{i=1}^{n}\left(F(\hat{y}_i) - \bar{F}(y_i)\right)^2 + \sum_{i=1}^{n}\left(F(y_i) - F(\hat{y}_i)\right)^2}$$

where $\bar{F}(y_i) = \frac{1}{n}\sum_{i=1}^{n} F(\hat{y}_i)$ and n represents the size of the data series. In all the above accuracy measures, $F(y_i)$ is the empirical cumulative distribution function (CDF) of the data (observed ordered values) and $F(\hat{y}_i)$ indicates the theoretical CDF of the distribution (ordered estimated values from the distribution).

3.7 Quantile estimation

After selection of the best probability distribution, the main goal of flood frequency analysis is to estimate the quantile $y_T$ for a return period (T) of scientific relevance. $P(Y \geq y_T) = \frac{1}{T}$ is the probability that the flood level $y_T$ is exceeded once in T years. The cumulative probability of non-exceedance is defined as

$$F = F(y_T) = P(Y \leq y_T) = 1 - P(Y \geq y_T) = 1 - \frac{1}{T}$$

The distribution function $F(y_T)$ can be expressed in inverse form as $y_T = y(F)$, so the estimated quantile $y_T$ can be evaluated directly by substituting F. When the inverse of $F(y_T)$ does not exist analytically, a numerical method is used to evaluate $y_T$ for the given value of F. The expressions of the quantile functions of the candidate distributions are summarized in Table 2. The quantile estimate for T years is calculated by substituting $F = 1 - \frac{1}{T}$ in the quantile expressions in Table 2. The standard error of the estimated quantiles represents the uncertainty in the estimation of flood frequency for the return periods. The confidence interval of a flood quantile gives an estimated range of values that is likely to include the flood magnitude for the return period. We use a parametric bootstrap method to estimate the standard error of the estimated quantiles and the confidence intervals of the flood quantiles for the return periods. This method is more precise than an asymptotic computation when n is small [16, p. 133]. The details of the parametric bootstrap procedure are given in [16, p. 133].

4 Results and discussion

The basic statistics of the five gauging sites are summarized in Table 5. All data in the table are in cubic metres per second. The data at all sites are skewed, which is sufficient evidence to model the data with a non-normal distribution. In flood frequency analysis, the basic statistical assumptions are independence, randomness and stationarity of the data series (see e.g. [8, 11]). The independence and randomness of the data series at a given site are tested by using the correlation coefficient (r) at lag-1 and the Wald–Wolfowitz (WW) test, respectively. To check the stationarity of the data series, the Mann–Kendall (MK) test has been applied. The results of the assumption checks are summarized in Table 3. They indicate that the data series at each gauging site of the Torne River are suitable for flood frequency analysis and probability density estimation.

The estimated parameters of each distribution at each gauging site, obtained with the MLE and LM methods, are reported in Table 4 along with their standard errors (SE). To identify the best distribution at each site, we use the GOF test and the accuracy measures. Each combination of distribution and parameter estimation method is ranked on the GOF test and on each accuracy measure in Table 6. A rank score between 1 and 10 is assigned, with 10 for the best-fitted and 1 for the worst-fitted distribution. The rank score scheme is based on the relative magnitude of the accuracy measures and the AD test P value. The distribution with the lowest RMSE, lowest MAE, lowest RMSPE, lowest MAPE or the highest $R^2$ receives the highest rank score of 10. In the AD test, the distribution with the highest P value receives the highest rank score of 10. The best distribution and estimation method at each site is identified based on the total rank score over the GOF test and the accuracy measures. The total rank scores in Table 6 indicate that GLO with the MLE method is best for Junosuando. For the sites Pajala Pumphus and Abisko, the LN3 distribution with the LM method performs better than the other distributions. The GEV and P3 distributions with


Table 3  Assumptions results of five gauging sites of Torne River

Station name         n    r      P value  Mann–Kendall test           Wald–Wolfowitz test
                                          Test statistic  P value     P value
Kukkolankoski Övre   108  −0.08  0.23     1.85            0.06        0.86
Pajala Pumphus       50   −0.11  0.25     0.32            0.75        0.94
Abisko               35   0.03   0.80     0.21            0.83        0.17
Junosuando           52   −0.09  0.34     1.41            0.16        0.95
Övre Abiskojokk      34   −0.06  0.66     −0.62           0.53        0.33

where n represents the sample size of the time series and r indicates the Pearson correlation coefficient

Table 4  Estimated parameters with MLE and LM methods


MLE estimates (the standard errors follow in parentheses on the next row)
Station | GEV: 𝜇̂ 𝛼̂ 𝜅̂ | P3: 𝜇̂ 𝛼̂ 𝜅̂ | GUM: 𝜇̂ 𝛼̂ | GLO: 𝜇̂ 𝛼̂ 𝜅̂ | LN3: 𝜇̂ 𝛼̂ 𝜅̂

16722 1993.73 450.20 0.16 297.25 128.76 14.72 1955.47 433.50 2149.33 280.04 − 0.11 2152.08 483.32 − 0.17
(48.38) (34.34) (0.07) (763.60) (58.50) (12.50) (43.91) (33.52) (49.03) (22.95) (0.07) (51.38) (33.78) (0.09)
2012 768.35 223.53 0.43 1663.77 − 52.30 15.99 718.05 240.19 839.62 117.07 0.08 844.26 204.85 0.16
(34.29) (24.67) (0.08) (516.20) (36.45) (20.76) (36.57) (25.24) (29.98) (13.92) (0.10) (31.44) (21.03) (0.12)
2357 207.26 52.41 0.24 − 285.00 5.69 90.04 200.73 50.87 225.42 30.92 − 0.04 225.60 53.77 − 0.06
(9.80) (6.86) (0.11) (869.20) (9.86) (308.12) (9.01) (6.84) (9.49) (4.35) (0.12) (9.93) (6.45) (0.15)
4 281.69 74.13 0.02 113.75 40.43 5.18 280.81 73.61 307.58 48.63 − 0.20 308.92 85.91 − 0.32
(11.84) (8.10) (0.11) (51.00) (15.39) (3.10) (10.49) (8.00) (11.97) (6.16) (0.08) (12.80) (9.24) (0.12)
957 108.80 26.47 0.08 61.85 17.66 3.42 107.71 25.83 119.34 17.35 − 0.13 118.11 29.65 − 0.28
(5.18) (3.77) (0.14) (27.44) (13.39) (4.08) (4.56) (3.47) (5.56) (2.54) (0.14) (5.76) (3.86) (0.19)
LM estimates (the standard errors follow in parentheses on the next row)
Station | GEV: 𝜇̂ 𝛼̂ 𝜅̂ | P3: 𝜇̂ 𝛼̂ 𝜅̂ | GUM: 𝜇̂ 𝛼̂ | GLO: 𝜇̂ 𝛼̂ 𝜅̂ | LN3: 𝜇̂ 𝛼̂ 𝜅̂

16722 1990.07 456.59 0.15 11.28 114.04 19.12 1959.44 403.29 2157.98 276.98 − 0.07 2154.46 490.66 − 0.15
(46.62) (36.53) (0.07) (994.14) (57.03) (19.50) (40.39) (36.53) (46.90) (22.79) (0.05) (48.38) (35.96) (0.09)
2012 764.68 220.31 0.40 1964.77 − 39.13 29.07 728.59 170.98 839.06 117.80 0.06 840.27 208.73 0.12
(33.65) (23.58) (0.11) (781.80) (42.25) (22.81) (26.28) (22.34) (28.85) (13.84) (0.08) (32.31) (22.61) (0.13)
2357 206.33 53.49 0.22 − 305.49 5.80 91.88 201.21 45.18 225.54 31.26 − 0.03 225.36 55.39 − 0.07
(10.49) (6.96) (0.13) (270.70) (15.19) (23.56) (8.01) (7.06) (9.51) (4.46) (0.09) (10.37) (6.62) (0.16)
4 279.83 73.16 − 0.01 148.25 50.90 3.43 280.25 74.02 308.19 48.68 − 0.18 306.66 85.97 − 0.37
(10.94) (9.12) (0.10) (89.07) (21.98) (8.62) (10.75) (9.07) (12.16) (6.18) (0.08) (13.10) (9.70) (0.14)
957 109.40 28.50 0.14 − 3.65 7.84 16.06 107.62 25.38 119.92 17.40 − 0.08 119.68 30.82 − 0.17
(5.50) (4.18) (0.13) (103.66) (7.83) (20.99) (4.55) (4.10) (5.26) (2.54) (0.10) (5.90) (3.93) (0.17)

Here, 𝜇̂ , 𝛼̂ and 𝜅̂ represent the estimated location, scale and shape parameters, respectively. The value in parenthesis is a standard error of
estimated parameter
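The standard errors in parentheses come from the Monte Carlo scheme of Sect. 3.4. The sketch below illustrates that scheme for the simplest case, the Gumbel distribution fitted by L-moments. It is an assumption-laden illustration rather than the authors' code: the function names are hypothetical, and the Gumbel L-moment relations used (λ1 = μ + γα and λ2 = α log 2, with γ Euler's constant) are the standard ones and are not stated in the paper.

```python
import math
import random
import statistics

EULER_GAMMA = 0.5772156649015329


def sample_pwms(sample):
    """Unbiased sample PWMs b0 and b1 computed from the ordered sample."""
    y = sorted(sample)
    n = len(y)
    b0 = sum(y) / n
    b1 = sum((j - 1) / (n - 1) * y[j - 1] for j in range(2, n + 1)) / n
    return b0, b1


def fit_gumbel_lm(sample):
    """Equate sample and Gumbel L-moments (assumed relations, see lead-in)."""
    b0, b1 = sample_pwms(sample)
    lam1, lam2 = b0, 2.0 * b1 - b0
    alpha = lam2 / math.log(2.0)
    mu = lam1 - EULER_GAMMA * alpha
    return mu, alpha


def draw_gumbel(n, mu, alpha, rng):
    """Draw n Gumbel variates by inverting the quantile function."""
    out = []
    while len(out) < n:
        u = rng.random()
        if 0.0 < u < 1.0:  # skip the endpoint so both logarithms are defined
            out.append(mu - alpha * math.log(-math.log(u)))
    return out


def monte_carlo_se(mu, alpha, n, reps=1000, seed=1):
    """SE = standard deviation of the estimates over `reps` simulated samples."""
    rng = random.Random(seed)
    fits = [fit_gumbel_lm(draw_gumbel(n, mu, alpha, rng)) for _ in range(reps)]
    return (statistics.stdev([f[0] for f in fits]),
            statistics.stdev([f[1] for f in fits]))


# Station 16722, Gumbel LM estimates from Table 4, n = 108 (Table 3).
se_mu, se_alpha = monte_carlo_se(1959.44, 403.29, n=108)
print(round(se_mu, 2), round(se_alpha, 2))
```

The output can be compared with the Gumbel LM standard errors reported for station 16722. Exact agreement is not expected, because the result depends on the random seed and on implementation details.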

the LM method are the best-fit distributions for the gauging sites Kukkolankoski Övre and Övre Abiskojokk, respectively.

In this study, no single distribution emerged as the best distribution for all gauging sites. This was also the case in [1, 5]. Overall, the LM estimation method performed better for identifying the suitable distribution (see also [1]). The only site at which a distribution fitted by MLE is the most suitable is the gauging site with the highest CV and skewness, see Tables 5 and 6. It seems that the sites with extreme values of the average annual maximum flood and of the catchment area (either very large or very small) favour the LM method of estimation, see Tables 1, 5 and 6. In terms of landscape setting, the gauging sites at the extreme positions (nearest and farthest) relative to the Gulf


Table 5  Descriptive statistics (cubic metre per second)

Station  n    Mean     Median   S       CV    Skewness  Kurtosis
16722    108  2192.22  2140.00  493.03  0.22  0.37      −0.19
2012     50   827.28   837.53   211.16  0.26  −0.54     0.90
2357     35   227.29   223.53   54.71   0.24  0.17      −0.07
04       52   322.98   296.50   93.32   0.29  0.93      0.95
957      34   122.27   120.67   31.39   0.26  0.73      1.24

Here, S represents the standard deviation and CV indicates the coefficient of variation

of Bothnia are in favour of the LM estimation method. The sample size of the time series does not seem to be an important factor in favour of a particular distribution or estimation method in this study.

One major objective of flood frequency analysis is to estimate the quantiles in the extreme upper tail of the best-fitted distribution at each gauging site. The quantile estimates for return periods of 5, 10, 25, 50, 100, 200 and 500 years are calculated by using the quantile function and the parameter values of the best-fitted distributions. The quantile estimates $y_T$ with non-exceedance probability F for the best-fitted distributions are given in Table 7. The estimates of uncertainty ($\sigma_s$) in the quantile estimates and the 95% confidence intervals of the flood quantiles for the different return periods are also presented in Table 7. The SE indicates that longer return periods have more uncertainty around the flood quantile estimates.

5 Conclusion

In this study, the annual maximum stream flow series of five gauging sites of the Torne River are examined. Flood frequency analysis is performed by using the GEV, P3, GUM, GLO and LN3 distributions. The MLE and LM parameter estimation techniques are used to estimate the parameters of the distributions. The study investigates the selection of the best-fit probability distribution and estimation method for at-site flood frequency analysis of the Torne River. The best-fit frequency distribution is identified at each gauging site based on the highest total rank score in the goodness-of-fit test and the accuracy measures.

The results indicate that the GLO distribution with MLE for the gauging site Junosuando, and the LN3 distribution with the LM method for Pajala Pumphus and Abisko, perform better than the other distributions in this study. The GEV and P3 distributions with the LM method are the most suitable distributions at Kukkolankoski Övre and Övre Abiskojokk, respectively. At most gauging sites, the best-suited distribution is one fitted with the LM estimation method.

The results of this study for the Torne River can be used in flood studies, water resource planning and the design of hydraulic structures within the same basin and in similar catchments. The best-fitted distributions in this study could also be considered as candidate distributions for regional flood frequency analysis of the Torne River basin or for at-site flood frequency analysis on other rivers in Sweden.


Table 6  Rank score of distribution in both GOF tests and accuracy measures

Station Method Distribution AD RMSE MAE RMSPE MAPE R2 Total rank
and accuracy measures Kukkolankoski Övre MLE GEV 7 5 7 7 7 5 38
P3 6 7 6 6 5 7 37
GLO 4 3 4 4 4 3 22
LN3 5 6 5 5 6 6 33
GUM 2 1 2 2 2 1 10
LM GEV 10 10 10 10 10 10 60
P3 9 9 9 9 9 9 54
GLO 3 4 3 3 3 4 20
LN3 8 8 8 8 8 8 48
GUM 1 2 1 1 1 2 08
Pajala Pumphus (2012) MLE GEV 3 4 7 4 4 4 26
P3 6 5 5 5 5 5 31
GLO 4 8 3 8 8 8 39
LN3 7 6 6 6 6 6 37
GUM 1 1 1 2 1 1 7
LM GEV 5 3 8 3 3 3 25
P3 9 9 9 9 9 9 54
GLO 8 7 4 7 7 7 40
LN3 10 10 10 10 10 10 60
GUM 2 2 2 1 2 2 11
Abisko (2357) MLE GEV 3 3 3 3 4 3 19
P3 8 5 6 4 6 5 34
GLO 4 6 4 9 3 6 32
LN3 7 4 5 5 5 4 30
GUM 2 1 2 2 2 1 10
LM GEV 6 7 8 6 8 7 42
P3 9 9 9 7 9 9 52
GLO 5 8 7 10 7 8 45
LN3 10 10 10 8 10 10 58
GUM 1 2 1 1 1 2 8
Junosuando (4) MLE GEV 5 1 1 2 2 1 12
P3 2 1 1 2 2 1 09
GLO 10 10 6 9 7 10 52
LN3 4 3 4 5 5 3 24
GUM 8 6 9 6 8 6 43
LM GEV 6 9 8 7 6 9 45
P3 1 4 5 1 1 4 16
GLO 9 5 3 10 9 5 41
LN3 3 7 7 4 4 7 32
GUM 7 8 10 8 10 8 51
Övre Abiskojokk (957) MLE GEV 6 5 7 7 7 5 37
P3 1 7 1 4 1 7 21
GLO 5 10 4 3 4 10 36
LN3 4 6 5 6 6 6 33
GUM 3 9 3 5 3 9 32
LM GEV 8 1 8 9 8 1 35
P3 9 2 10 10 10 2 43
GLO 7 4 6 1 5 4 27
LN3 10 3 9 8 9 3 42
GUM 2 8 2 2 2 8 24

The bold values are rank score for best-fitted distribution
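The rank scores in this table are built on the accuracy measures of Sect. 3.6. As a rough illustration of how those measures could be computed for one candidate fit, the sketch below takes the empirical and theoretical CDF values as plain lists. The function names are hypothetical, and the Weibull plotting position i/(n + 1) used for the empirical CDF is an assumption, since the paper does not state which plotting position was used.

```python
import math


def plotting_positions(n):
    """Empirical CDF at the ordered observations (Weibull i/(n + 1), an assumption)."""
    return [i / (n + 1) for i in range(1, n + 1)]


def accuracy_measures(emp, theo):
    """emp: empirical CDF values F(y_i); theo: fitted CDF values F(y_hat_i)."""
    n = len(emp)
    diffs = [e - t for e, t in zip(emp, theo)]
    rel = [d / e for d, e in zip(diffs, emp)]
    sq_err = sum(d * d for d in diffs)
    mean_theo = sum(theo) / n
    explained = sum((t - mean_theo) ** 2 for t in theo)
    return {
        "MAE": sum(abs(d) for d in diffs) / n,
        "MAPE": 100.0 * sum(abs(r) for r in rel) / n,
        "RMSE": math.sqrt(sq_err / n),
        "RMSPE": 100.0 * math.sqrt(sum(r * r for r in rel) / n),
        "R2": explained / (explained + sq_err),
    }


# Hypothetical ordered flows (m^3/s) and a hypothetical fitted Gumbel CDF.
flows = sorted([210.0, 305.5, 280.1, 412.3, 266.7, 350.9, 298.4, 240.2])
fitted = [math.exp(-math.exp(-(y - 270.0) / 60.0)) for y in flows]
print(accuracy_measures(plotting_positions(len(flows)), fitted))
```

Repeating this for every distribution and estimation method at a site, then ranking each measure from 1 to 10, would give rank scores of the kind tabulated here.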


Table 7  Quantile estimates of flood with 95% confidence interval at five gauging sites of Torne River

Non-exceedance probability F:             0.80     0.90     0.96     0.98     0.99     0.995    0.998
Return period T (years):                  5        10       25       50       100      200      500

Station  Distribution  Method  Statistic
16722    GEV           LM      Lower      2475.06  2703.84  2929.24  3056.12  3157.22  3236.94  3324.16
                               Fit        2601.25  2857.74  3142.00  3327.52  3492.74  3640.52  3812.68
                               Upper      2728.22  3011.45  3358.67  3612.61  3867.51  4120.59  4461.08
                               σs         64.33    78.28    109.83   142.87   181.92   225.49   288.13
2012     LN3           LM      Lower      943.17   1016.74  1080.04  1114.79  1141.18  1164.31  1188.27
                               Fit        1007.08  1087.59  1168.75  1218.72  1262.09  1300.52  1345.53
                               Upper      1067.68  1155.55  1258.97  1332.30  1402.33  1470.83  1558.80
                               σs         31.69    35.55    45.80    55.86    66.96    78.66    94.65
2357     LN3           LM      Lower      250.78   272.75   293.06   304.31   313.38   321.13   330.25
                               Fit        273.37   299.61   328.49   347.65   365.24   381.63   401.88
                               Upper      296.48   327.91   368.54   400.46   433.38   466.35   511.51
                               σs         11.66    14.15    19.52    24.73    30.66    37.15    46.49
04       GLO           MLE     Lower      350.20   392.09   446.76   486.07   529.94   577.23   641.49
                               Fit        385.52   442.47   525.23   596.78   678.28   771.63   916.74
                               Upper      424.56   499.39   629.33   754.07   920.40   1132.33  1491.74
                               σs         18.70    27.11    46.15    68.45    99.73    142.66   223.51
957      P3            LM      Lower      133.74   146.98   159.11   166.19   171.67   176.38   181.46
                               Fit        147.67   163.85   182.29   194.88   206.65   217.81   231.83
                               Upper      162.08   183.06   210.76   231.51   251.71   272.35   299.50
                               σs         7.19     9.28     13.09    16.44    20.02    23.77    28.89

The lower limit is the fifth percentile and the upper limit is the 95th percentile of the quantile distribution. The standard deviation of the quantile distribution is σs, which represents the standard error of the quantile estimates. The flood quantile estimates for the different return periods are given in the Fit rows.
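The Lower, Upper and σs rows are summaries of a bootstrap distribution of the quantile estimate. The following sketch shows one way such a parametric bootstrap could look for a GEV distribution fitted by L-moments. It is not the authors' procedure (their details are in [16]): the function names are hypothetical, the shape parameter is obtained from Hosking's widely used approximation rather than an exact solution, the percentiles are picked by simple indexing, and the Gumbel limit κ = 0 is not handled.

```python
import math
import random
import statistics


def sample_lmoments(sample):
    """First three sample L-moments from the unbiased PWMs b0, b1, b2."""
    y = sorted(sample)
    n = len(y)
    b0 = sum(y) / n
    b1 = sum((j - 1) / (n - 1) * y[j - 1] for j in range(2, n + 1)) / n
    b2 = sum((j - 1) * (j - 2) / ((n - 1) * (n - 2)) * y[j - 1]
             for j in range(3, n + 1)) / n
    return b0, 2 * b1 - b0, 6 * b2 - 6 * b1 + b0


def fit_gev_lm(sample):
    """GEV L-moment estimates via Hosking's approximation for kappa (assumed)."""
    lam1, lam2, lam3 = sample_lmoments(sample)
    c = 2.0 / (3.0 + lam3 / lam2) - math.log(2.0) / math.log(3.0)
    kappa = 7.8590 * c + 2.9554 * c * c
    g = math.gamma(1.0 + kappa)
    alpha = lam2 * kappa / ((1.0 - 2.0 ** (-kappa)) * g)
    mu = lam1 - alpha * (1.0 - g) / kappa
    return mu, alpha, kappa


def gev_quantile(F, mu, alpha, kappa):
    """GEV quantile function from Table 2 (kappa != 0)."""
    return mu + alpha / kappa * (1.0 - (-math.log(F)) ** kappa)


def draw_gev(n, mu, alpha, kappa, rng):
    """Draw n GEV variates by inverting the quantile function."""
    out = []
    while len(out) < n:
        u = rng.random()
        if u > 0.0:  # random() lies in [0, 1); skip 0 so log(u) is defined
            out.append(gev_quantile(u, mu, alpha, kappa))
    return out


def bootstrap_quantile(mu, alpha, kappa, n, T, reps=1000, seed=1):
    """Return (5th percentile, 95th percentile, standard error) of the T-year quantile."""
    rng = random.Random(seed)
    F = 1.0 - 1.0 / T
    q = sorted(gev_quantile(F, *fit_gev_lm(draw_gev(n, mu, alpha, kappa, rng)))
               for _ in range(reps))
    return q[int(0.05 * reps)], q[int(0.95 * reps) - 1], statistics.stdev(q)


# Station 16722, GEV LM estimates from Table 4 (rounded), n = 108, T = 100 years.
lower, upper, se = bootstrap_quantile(1990.07, 456.59, 0.15, n=108, T=100)
print(round(lower, 1), round(upper, 1), round(se, 1))
```

The printed values can be compared with the T = 100 column for the first station in the table above. Differences are expected because the published parameters are rounded and the resampling is random.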

Acknowledgements  Open access funding provided by Stockholm University.

Data availability  The data that support the findings of this study are openly available at SMHI [25].

Compliance with ethical standards

Conflict of interest  The authors declare that they have no conflict of interest.

Open Access  This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

References

1. Ahmad I, Fawad M, Mahmood I (2015) At-site flood frequency analysis of annual maximum stream flows in Pakistan using robust estimation methods. Pol J Environ Stud 24(6):2345–2353
2. Castellarin A, Kohnová S, Gaál L, Fleig A, Salinas JL, Toumazis A, Kjeldsen TR, Macdonald N (2012) Review of applied-statistical methods for flood-frequency analysis in Europe. Technical report, (NERC) Centre for Ecology & Hydrology
3. Cicioni G, Giuliano G, Spaziani FM (1973) Best fitting of probability functions to a set of data for flood studies. In: Floods and droughts
4. Cunnane C (1989) Statistical distributions for flood frequency analysis. Operational Hydrology Report (WMO)
5. Drissia TK, Jothiprakash V, Anitha AB (2019) Flood frequency analysis using L moments: a comparison between at-site and regional approach. Water Resour Manag 33(3):1013–1037
6. Elfvendahl S, Liljaniemi P, Salonen N (2006) The River Torne international watershed: common Finnish and Swedish typology, reference conditions and a suggested harmonised monitoring program: results from the TRIWA project. County Administrative Board of Norrbotten [Länsstyrelsen i Norrbottens län]
7. Haddad K, Rahman A (2011) Selection of the best fit flood frequency distribution and parameter estimation procedure: a case study for Tasmania in Australia. Stoch Environ Res Risk Assess 25(3):415–428
8. Hamed K, Rao AR (1999) Flood frequency analysis. CRC Press, Boca Raton
9. Hosking JRM (1986) The theory of probability weighted moments. IBM Research Rep RC12210, IBM, Yorktown Heights, NY
10. Hosking JRM (1990) L-moments: analysis and estimation of distributions using linear combinations of order statistics. J R Stat Soc Ser B (Methodol) 52(1):105–124

11. Kite GW (2019) Frequency and risk analyses in hydrology. Water Resour Publications, LLC. https://books.google.se/books?id=b9OKxAEACAAJ
12. Laio F (2004) Cramer–von Mises and Anderson–Darling goodness of fit tests for extreme value distributions with unknown parameters. Water Resour Res 40(9):W09308
13. Leandro J, Leitão JP, de Lima JLMP (2013) Quantifying the uncertainty in the soil conservation service flood hydrographs: a case study in the Azores Islands. J Flood Risk Manag 6(3):279–288
14. Leandro J, Gander A, Beg MNA, Bhola P, Konnerth I, Willems W, Carvalho R, Disse M (2019) Forecasting upper and lower uncertainty bands of river flood discharges with high predictive skill. J Hydrol 576:749–763
15. Merz B, Thieken AH (2005) Separating natural and epistemic uncertainty in flood frequency analysis. J Hydrol 309(1–4):114–132
16. Meylan P, Favre AC, Musy A (2012) Predictive hydrology: a frequency analysis approach. CRC Press, Boca Raton
17. Mkhandi SH, Kachroo RK, Gunasekara TAG (2000) Flood frequency analysis of Southern Africa: II. Identification of regional distributions. Hydrol Sci J 45(3):449–464
18. Młyński D, Wałęga A, Stachura T, Kaczor G (2019) A new empirical approach to calculating flood frequency in ungauged catchments: a case study of the upper Vistula basin, Poland. Water 11(3):601
19. Nelder JA, Mead R (1965) A simplex method for function minimization. Comput J 7(4):308–313
20. Önöz B, Bayazit M (1995) Best-fit distributions of largest available flood samples. J Hydrol 167(1–4):195–208
21. Opere AO, Mkhandi S, Willems P (2006) At site flood frequency analysis for the Nile Equatorial basins. Phys Chem Earth Parts A/B/C 31(15–16):919–927
22. Rahman AS, Rahman A, Zaman MA, Haddad K, Ahsan A, Imteaz M (2013) A study on selection of probability distributions for at-site flood frequency analysis in Australia. Nat Hazards 69(3):1803–1813
23. Saf B (2009) Regional flood frequency analysis using L-moments for the West Mediterranean region of Turkey. Water Resour Manag 23(3):531–551
24. Sevruk B, Geiger H (1981) Selection of distribution types for extremes of precipitation (No. 551.577). Secretariat of the World Meteorological Organization
25. The Swedish Meteorological and Hydrological Institute (2019) Hydrologiska observationer. Data files retrieved from SMHI hydrological observations. https://vattenwebb.smhi.se/station/. Accessed 20 Mar 2019

Publisher’s Note  Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
