Young 2017


Applied Soft Computing 53 (2017) 205–216

Contents lists available at ScienceDirect

Applied Soft Computing


journal homepage: www.elsevier.com/locate/asoc

A physically based and machine learning hybrid approach for accurate rainfall-runoff modeling during extreme typhoon events
Chih-Chieh Young a, Wen-Cheng Liu b,c,∗, Ming-Chang Wu c
a Hydrotech Research Institute, National Taiwan University, Taipei 10617, Taiwan
b Department of Civil and Disaster Prevention Engineering, National United University, Miaoli 36063, Taiwan
c Taiwan Typhoon and Flood Research Institute, National Applied Research Laboratories, Taipei 10093, Taiwan

Article info

Article history:
Received 16 May 2016
Received in revised form 30 November 2016
Accepted 29 December 2016
Available online 3 January 2017

Keywords:
Rainfall-runoff
Typhoon events
Hydrologic modeling system (HEC-HMS)
Support vector regression (SVR)
Artificial neural network (ANN)
Hybrid approach

Abstract

Accurate rainfall-runoff modeling during typhoon events is an essential task for natural disaster reduction. In this study, a novel hybrid model which integrates the outputs of physically based hydrologic modeling system into support vector machine is developed to predict hourly runoff discharges in Chishan Creek basin in southern Taiwan. Seven storms (with a total of 1200 data sets) are used for model calibration (training) and validation. Six statistical indices (mean absolute error, root mean square error, correlation coefficient, error of time to peak discharge, error of peak discharge, and coefficient of efficiency) are employed to assess prediction performance. Overall, superiority of the present approach especially for a longer (6-h) lead time prediction is revealed through a systematic comparison among three individual methods (i.e., the physically based hydrologic model, artificial neural network, and support vector machine) as well as their two hybrid combinations. Besides, our analysis and in-depth discussions further clarify the roles of physically based and data-driven components in the proposed framework.

© 2017 Elsevier B.V. All rights reserved.

∗ Corresponding author at: Department of Civil and Disaster Prevention Engineering, National United University, Miaoli 36063, Taiwan.
E-mail address: wcliu@nuu.edu.tw (W.-C. Liu).
http://dx.doi.org/10.1016/j.asoc.2016.12.052
1568-4946/© 2017 Elsevier B.V. All rights reserved.

1. Introduction

Accurate prediction of hourly runoff discharges during storm or heavy rainfall events is an important task for natural disaster reduction [1,2]. For example, induced by extreme stream flow, floods are the most devastating hazards causing numerous losses of life and significant damages to the economy [3,4]. In order to meet practical demands (e.g., alert or warning), a lot of efforts have been made to develop effective hydrological models which facilitate better understanding of characteristics, processes, and responses in a watershed [5–8].

To date, various methods have been used to study the complicated and nonlinear rainfall-runoff process which usually presents a great spatial and temporal variability owing to the mixed influences from weather conditions, land covers, and soil types [9]. Generally, these modeling frameworks can be divided into two major groups: the physically based and/or data-driven approaches. The physically based models mainly construct a simplified watershed system and express the interior behavior through solving some mathematical equations that represent the hydrological processes to the best of current knowledge [10,11]. The data-driven models establish a direct mapping between hydrologic variables and extract their relationship from historical measured data by the algorithms developed in the fields of statistics, computational intelligence, machine learning, and data mining [12–15]. Between the two fundamentally different types of methods, the physically based modeling confronts a much higher challenge to reach the required accuracy with the limited watershed information [11].

Artificial neural networks (ANNs), a kind of data-driven models, have prevailed in rainfall-runoff modeling with extensive applications during the past several decades [16–19]. Although without a priori knowledge for the catchment characteristics, the ANNs imitate human brain cells to effectively learn the rules in nature from sufficient training data [20]. They have proven their advantage on handling a highly nonlinear system over most physically and statistically based approaches [11,21,22]. Excellent review for the ANN theories as well as their applications in hydrology can be found in Govindaraju and Rao [23]. However, the prediction performance of ANN relies on an appropriate network structure and its interior parameters. Typically, this problem is tackled by a simple trial-and-error procedure or better resolved with the evolutionary optimization algorithms [17,24,25]. Support vector machines (SVMs) or regressions (SVRs), another type of data-driven models, have received increasing attentions in the recent years and emerged as an alternative to the ANNs for rainfall-runoff modeling [2,26–31]. They employ structural risk minimization (SRM) principle from statistical learning theory [32]. Hence, SVMs and SVRs with a unique architecture and globally optimal weights show great efficiency in training process and superior generalization capability free of the unwanted over-fitting issue [2,30].
Several different thoughts for obtaining better solutions in
hydrological modeling have also been evolved [22,25,31,33–36].
Their basic idea is to develop an effective hybrid combination rather
than an individual type of approach since each kind of model cap-
tures a certain aspect of the phenomena. For example, Xiong and
O’Connor [34] enhanced flow prediction performance using the
lumped soil moisture accounting and routing (SMAR) conceptual
model in conjunction with error updating techniques. Shamseldin
et al. [35] presented accurate predictions through combining the
outputs of five rainfall-runoff models in three different ways. In
addition to the error updating and multi-model approaches, Zhang
and Govindaraju [37] and Hosseini and Mahjouri [31] precisely
estimated runoff in a watershed by implementing the geomor-
phologic instantaneous unit hydrograph (GIUH) theory into the
architecture of ANN and SVM, respectively. Similarly, Young and
Liu [22] proposed a physically based and artificial neural network
hybrid model to improve rainfall-runoff modeling. Particularly,
the two components in the hybrid model based upon different
philosophies complement each other with respect to their inherent
strengths and limitations. While the important hydrological pro-
cesses involved in a physically based model make up the black-box
feature of a data-driven model, the difficulty in accurate physical
modeling can be alleviated by the powerful data-driven methodology [38,39]. Success of these hybrid approaches motivates a further research in the present study.

The objective of this paper is to develop a new hybrid model which integrates the physically based HEC-HMS model into the support vector machine (regression) for accurate rainfall-runoff modeling with an application in the Chishan Creek basin in southern Taiwan. The hourly runoff discharges during typhoon events are predicted in different lead time using the physically based model (HEC-HMS), data-driven (ANN and SVR) models, and hybrid models (HEC-HMS-ANN and HEC-HMS-SVR). The prediction performance is examined by six statistical indices, i.e., mean absolute error, root mean square error, coefficient of correlation, error of time to peak discharge, error of peak discharge, and coefficient of efficiency. Through our rigorous analysis and in-depth discussions, the roles of physically based and data-driven components in the proposed hybrid framework are clarified.

2. Description of study site and data collection

The Chishan Creek basin, located in Kaoping River watershed, southern Taiwan, is an important study area. In typhoon seasons, this region frequently suffers from natural hazards (e.g., floods, debris flows, and landslides). One of the most serious tragedies is the destruction of an entire village (500 people) in 2009 [40]. The Chishan Creek basin has a total area of 842 km² with a mean elevation of 824 m and an average slope of 0.4374 m/m. The main stream is about 118 km long while the stream order of the Chishan Creek is up to 5. The rainfall-runoff data were collected by the Water Resources Agency, Taiwan. While the Sin-Gao-Kou and Min-Zu stations record the rainfall, the Nan-Fong Bridge station measures the discharge at the outlet of Chishan Creek basin (see Fig. 1). Fig. 2 presents the rainfall and discharge variations during Typhoon Bilis in 2006 and Typhoon Sepat in 2007. The different patterns at two rain-gauge stations indicate the significant regional and topographic effects on precipitation in this basin [22].

Fig. 1. The Chishan Creek basin study site.

3. Methodology

3.1. HEC-HMS rainfall–runoff model

The Hydrologic Engineering Center-Hydrologic Modeling System (HEC-HMS) model is widely applied for the simulations of rainfall-runoff process [10,41]. In the HEC-HMS model, effective (or excess) precipitation in a watershed is determined by pervious characteristics of the connected surfaces. The overland flow and near-surface flow are then combined to form direct runoff in the stream.

The effective rainfall on pervious land surfaces is computed using an empirical equation, i.e.,

$$P_e = \frac{(P - I_a)^2}{P - I_a + S}, \tag{1}$$

where $P_e$ and $P$ are the precipitation excess and accumulated rainfall depth at time $t$, respectively, $I_a$ is the initial abstraction (e.g., infiltration loss), and $S$ is the potential maximum retention. The Soil Conservation Service (SCS) suggested good estimations for $I_a$ and $S$ (in SI unit, cm) based on the curve number (CN) derived from empirical data, i.e., $I_a = 0.2S$ and

$$S = \frac{2540}{CN} - 25.4. \tag{2}$$

Note that the curve number (CN) with a range from 1 to 100 is a function of land use, land cover types, and hydrological soil group [42].

The storm hydrograph ($Q$) is obtained by convolving the precipitation increments ($P$) with unit hydrograph ordinates ($U$), i.e.,

$$Q_n = \sum_{m=1}^{n \le M} P_m U_{n-m+1},$$

where $m$ increases from 1 to $n$. In practice, the translation of excess precipitation to direct runoff can be modeled using modified Clark unit hydrograph (ModClark) method [43].
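As a minimal sketch of Eqs. (1) and (2) and the discrete convolution, the following plain-Python functions compute the accumulated excess rainfall and a direct-runoff hydrograph. The function names, the storm depths, and the unit hydrograph ordinates are hypothetical and chosen only for illustration; the rule that no excess occurs while $P \le I_a$ is the usual SCS convention and is assumed here rather than stated in the text. HEC-HMS itself is not called.

```python
def scs_excess(p_cm, cn):
    """Accumulated excess rainfall Pe (cm) from accumulated rainfall P (cm), Eqs. (1)-(2)."""
    s = 2540.0 / cn - 25.4      # potential maximum retention S (cm), Eq. (2)
    ia = 0.2 * s                # initial abstraction Ia = 0.2 S
    if p_cm <= ia:              # assumption: no excess until Ia is satisfied
        return 0.0
    return (p_cm - ia) ** 2 / (p_cm - ia + s)   # Eq. (1)


def convolve_runoff(excess_increments, uh_ordinates):
    """Q_n = sum over m of P_m * U_(n-m+1), using the 1-based indices of the text."""
    m_total = len(excess_increments)
    n_total = m_total + len(uh_ordinates) - 1
    q = []
    for n in range(1, n_total + 1):
        total = 0.0
        for m in range(1, min(n, m_total) + 1):
            k = n - m + 1                        # unit hydrograph index
            if k <= len(uh_ordinates):
                total += excess_increments[m - 1] * uh_ordinates[k - 1]
        q.append(total)
    return q


# Hypothetical storm: accumulated rainfall depth (cm) at hourly steps.
accumulated = [1.0, 3.0, 6.0, 8.0]
pe = [scs_excess(p, cn=83.5) for p in accumulated]
# Convert accumulated excess to hourly increments before convolving.
increments = [pe[0]] + [b - a for a, b in zip(pe, pe[1:])]
hydrograph = convolve_runoff(increments, [0.2, 0.5, 0.3])  # made-up ordinates
```

Because the made-up ordinates sum to one, the hydrograph volume equals the total excess rainfall, which is a quick sanity check on the convolution.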

Fig. 2. Measured rainfall and discharge data during the Bilis and Sepat typhoons: (a) and (d) are the rainfall at the Sin-Gao-Kou station, (b) and (e) are the rainfall at the
Min-Zu station, and (c) and (f) are the runoff (discharge) at the Nan-Fong Bridge station.

Based upon the time of concentration ($t_c$) and maximum traveling distance ($d_{max}$), travel time (or translation lag) to each watershed cell ($t_{cell}$) can be calculated, i.e., $t_{cell} = t_c (d_{cell}/d_{max})$, where $d_{cell}$ is the distance from the cell to basin outlet. The cell outflow hydrograph is then routed using a linear reservoir concept [41], i.e.,

$$Q(t) = \frac{\Delta t}{R + 0.5\Delta t} I(t) + \left[1 - \frac{\Delta t}{R + 0.5\Delta t}\right] Q(t-1), \tag{3}$$

where $Q(t)$ and $Q(t-1)$ are the outflows at current and previous time levels $t$ and $t-1$, respectively; $I(t)$ is the average inflow at time $t$; $R$ is the storage coefficient used to represent discharge attenuation; and $\Delta t$ is the time increment. Besides, the baseflow is taken into account using an exponential decrease function [44], i.e., $Q = Q_0 e^{-kt}$, where $Q_0$ is the averaged initial baseflow before a storm; $k$ is an exponential decay constant. The Muskingum-Cunge Standard Section method was employed for channel routing.

The HEC-HMS model can be applied to simulate or forecast the river runoff for the historical or current typhoon events as the precipitation data can be obtained from the field measurement records or numerical weather predictions. A whole streamflow hydrograph is then obtained after the recursive time integration, where the outflow at time level $t$ is advanced from the information at the previous time step $t-1$ (see Eq. (3)). To obtain reasonable prediction performance, parameters (constants) in the equations above should be properly estimated or tuned based upon the watershed characteristics and the observed data during model calibration. For example, a larger CN indicates smaller retention (e.g., CN = 98 for an almost impervious condition), yielding higher excess rainfall to form surface runoff. Among various watersheds in Taiwan, the time of concentration typically ranges from 1.5 to 5.5 h [22,25]. The storage coefficient is tuned to reflect the rainfall excess partly stored in a watershed. The baseflow has an exponential decay constant on the order of O(0.01)–O(0.5). After model calibration, all the hydrological simulations were carried out with the fixed parameters CN = 83.5, $t_c$ = 3.2 h, $R$ = 4.15, and $k$ = 0.35. The time step was set as 1 h. Besides, sensitivity analysis indicated that CN is the major factor affecting the simulated results.

3.2. The HEC-HMS-ANN hybrid model

In this study, a HEC-HMS-ANN hybrid model was developed by combining the popular back-propagation artificial neural network

(BPNN) with the physically based rainfall-runoff model (i.e., HEC-HMS).

The BPNN [45] consists of a number of neurons in three (i.e., the input, hidden, and output) layers (see Fig. 3(a) for the network structure). Between two successive layers, each neuron receives a weighted sum of inputs from neurons in the previous layer and converts it into a temporary or final output signal through an activation function, i.e.,

$$H_n = f\Big(\sum_m w^I_{m,n} I_m\Big) \quad \text{or} \quad O_l = f\Big(\sum_n w^H_{n,l} H_n\Big), \tag{4}$$

where $H_n$ is the temporary signal of neuron $n$; $O_l$ is the final output of neuron $l$; $I_m$ is the normalized input to neuron $m$; $w^I_{m,n}$ and $w^H_{n,l}$ represent the synaptic weights; the hyperbolic tangential sigmoid and linear transfer functions are adopted in the hidden and output layers, i.e., $f(x) = 2/(1 + e^{-2x}) - 1$ and $f(x) = x$, respectively.

A training process which minimizes the cost function $C_{NN}$ is required to obtain a set of appropriate weights and biases for the neural network, i.e.,

$$C_{NN} = \frac{1}{P} \sum_{p=1}^{P} \sum_{l=1}^{L} e_l^2(p), \tag{5}$$

where $P$ is the length of training data points; $L$ is a total of neurons in the output layer; $e_l(p)$ represents the difference between target and output at neuron $l$ for the $p$th training pattern. In this study, the robust Levenberg–Marquardt algorithm [46] was utilized to accomplish the back-propagation training.

The HEC-HMS-ANN hybrid model (see Fig. 3(a)) was constructed to predict the hourly runoff discharges at Nan-Fong Bridge station $Q_{NFB}(t+1)$ in a 1-h lead time. The input layer included $P_{syn}(t)$, $P_{syn}(t-1)$, $Q_{NFB}(t)$, $Q_{NFB}(t-1)$, and $Q_{HEC}(t+1)$, where $P_{syn}$ is the synthetic rainfall intensity determined by the Thiessen Polygon method; $Q_{NFB}$ and $Q_{HEC}$ represent measured and predicted runoff discharges at Nan-Fong Bridge station, respectively. Consideration of HEC-HMS predicted value at $t+1$ and measured data at $t$ and $t-1$ incorporated physical mechanism and lag-time influences into the proposed framework. Note that inputs with more lagged times gave insignificant differences in the output results. For the hidden layer, only three neurons were used to avoid the common over-fitting issue. Once the training process is completed, the hybrid model can provide accurate 1-h-ahead prediction. Further, a recursive procedure which updates the input node (the measured discharge) with the latest value of output node (the predicted discharge) is applied for a longer lead-time forecasting.

Fig. 3. Architecture of (a) the neural network (NN) and (b) support vector machine (SVM).

3.3. The HEC-HMS-SVR hybrid model

Another hybrid model HEC-HMS-SVR was also developed using the support vector machine [32] rather than the neural network. The support vector regression (SVR) has attracted many attentions owing to its non-linear generalization capability [2,30]. The SVR model maps the input vector $x$ into a high-dimensional feature space nonlinearly and then finds its regression relationship to the desired output $y$ with a tolerance error $\varepsilon$, i.e.,

$$\hat{y} = f(x, w, b) = w^T \varphi(x) + b, \tag{6}$$

where $w$ and $b$ are the weights and bias, respectively; and $\hat{y}$ is the estimated output. The regression can be further expressed as a convex optimization problem based upon the structural risk minimization (SRM) learning principle [32], i.e.,

$$\text{minimize} \quad R(w, \xi, \xi^*) = \frac{1}{2} w^T w + C \sum_{i=1}^{N_d} (\xi_i + \xi_i^*), \tag{7a}$$

subject to

$$\begin{aligned} y_i - \hat{y}_i &= y_i - (w^T \varphi(x_i) + b) \le \varepsilon + \xi_i, \\ \hat{y}_i - y_i &= (w^T \varphi(x_i) + b) - y_i \le \varepsilon + \xi_i^*, \\ \xi_i &\ge 0, \quad \xi_i^* \ge 0, \quad i = 1, 2, \ldots, N_d, \end{aligned} \tag{7b}$$

where the risk function $R$ consists of the regularization and training error (i.e., the first and second terms in R.H.S., respectively); $C$ is a positive parameter determining the trade-off between model complexity and empirical risk [28]; $N_d$ is the number of training data; $\xi$ and $\xi^*$ are slack variables specifying the upper and lower training errors, respectively. Typically, the optimization problem above is expressed in its dual form using Lagrange multipliers, i.e.,

$$\text{minimize} \quad \frac{1}{2} \sum_{i=1}^{N_d} \sum_{j=1}^{N_d} (\alpha_i - \alpha_i^*)(\alpha_j - \alpha_j^*) \varphi(x_i)^T \varphi(x_j) - \sum_{i=1}^{N_d} \left[ y_i (\alpha_i - \alpha_i^*) - \varepsilon (\alpha_i + \alpha_i^*) \right], \tag{8a}$$

subject to

$$\sum_{i=1}^{N_d} (\alpha_i - \alpha_i^*) = 0, \quad 0 \le \alpha_i, \alpha_i^* \le C, \quad i = 1, 2, \ldots, N_d, \tag{8b}$$

where $\alpha$ and $\alpha^*$ are the Lagrange multipliers that can be solved using the quadratic programming algorithm. Finally, the estimation can be rewritten as

$$f(x) = \sum_{i=1}^{N_d} (-\alpha_i^* + \alpha_i) \varphi(x_i)^T \varphi(x) + b = \sum_{i=1}^{N_d} \bar{\alpha}_i K(x_i, x) + b, \tag{9}$$

where $\bar{\alpha}_i = (-\alpha_i^* + \alpha_i)$ is a shorthand notation; $K(x_i, x)$ is the kernel function.
The HEC-HMS-SVR hybrid model was constructed in a 3-layer
structure (see Fig. 3(b)) for hourly streamflow forecasting at the
Nan-Fong Bridge station. Similarly, measured rainfall/runoff at t
and t − 1 and HEC-HMS predicted discharge at t + 1 were adopted
in the input layer. From training data sets, support vectors for the
hidden layer were automatically selected based on the Lagrange
multipliers by specifying penalty (or regularization) parameter
C = 100. The kernel inner products using a radial basis function, i.e., $K(x_i, x) = \exp(-|x_i - x|^2/2)$, were generated for the output estimation. To obtain a longer lead-time prediction, the HEC-HMS-SVR
hybrid model followed the same recursive approach described in
Section 3.2.
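The kernel form of the SVR estimate in Eq. (9) can be sketched in a few lines of plain Python. This is only an illustration of the prediction step: the support vectors, coefficients, and bias below are hypothetical placeholders (in practice they come from solving the dual problem, with each coefficient bounded in magnitude by C), the helper names are invented for this sketch, and the inputs are assumed to be normalized beforehand.

```python
import math


def rbf_kernel(xi, x):
    """K(xi, x) = exp(-|xi - x|^2 / 2), the radial basis function quoted in the text."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(xi, x))
    return math.exp(-sq_dist / 2.0)


def svr_predict(x, support_vectors, coeffs, bias):
    """Eq. (9): f(x) = sum_i coeff_i * K(x_i, x) + b."""
    return sum(c * rbf_kernel(sv, x) for sv, c in zip(support_vectors, coeffs)) + bias


def hybrid_input(p_syn, q_obs, q_hec, t):
    """Input vector [Psyn(t), Psyn(t-1), QNFB(t), QNFB(t-1), QHEC(t+1)]."""
    return [p_syn[t], p_syn[t - 1], q_obs[t], q_obs[t - 1], q_hec[t + 1]]


# Hypothetical normalized series (not data from the study).
p_syn = [0.1, 0.3, 0.6]
q_obs = [0.2, 0.4, 0.5]
q_hec = [0.2, 0.3, 0.5, 0.7]
x = hybrid_input(p_syn, q_obs, q_hec, t=2)

# Two made-up support vectors with made-up coefficients and bias.
support_vectors = [[0.6, 0.3, 0.5, 0.4, 0.7], [0.1, 0.0, 0.2, 0.1, 0.2]]
q_next = svr_predict(x, support_vectors, coeffs=[0.8, -0.3], bias=0.1)
```

Dropping the last element of the input vector gives the input of the single SVR model, which is how the contribution of the HEC-HMS output can be isolated.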

3.4. Indices for prediction performance

To evaluate the prediction performance, we utilized six statistics including mean absolute error (MAE), root mean square error (RMSE), coefficient of correlation (R), error of time to peak discharge ($ET_p$), error of peak discharge ($EQ_p$), and coefficient of efficiency (CE), i.e.,

$$MAE = \frac{1}{N} \sum_{i=1}^{N} |(Q_m)_i - (Q_o)_i|, \tag{10}$$

$$RMSE = \sqrt{\frac{1}{N} \sum_{i=1}^{N} [(Q_m)_i - (Q_o)_i]^2}, \tag{11}$$

$$R = \frac{\frac{1}{N} \sum_{i=1}^{N} [(Q_m)_i - \bar{Q}_m][(Q_o)_i - \bar{Q}_o]}{\sqrt{\frac{1}{N} \sum_{i=1}^{N} [(Q_m)_i - \bar{Q}_m]^2} \sqrt{\frac{1}{N} \sum_{i=1}^{N} [(Q_o)_i - \bar{Q}_o]^2}}, \tag{12}$$

$$ET_p = |T_{m,p} - T_{o,p}|, \tag{13}$$

$$EQ_p = \frac{Q_{m,p} - Q_{o,p}}{Q_{o,p}} \times 100\%, \tag{14}$$

$$CE = 1 - \sum_{i=1}^{N} [(Q_o)_i - (Q_m)_i]^2 \Big/ \sum_{i=1}^{N} [(Q_o)_i - \bar{Q}_o]^2, \tag{15}$$

where $N$ is the total number of data points; $Q_m$ and $Q_o$ are the modeled and observed runoff discharges, respectively; $T_{m,p}$ and $T_{o,p}$ are the peak times for the modeled and observed peak runoff discharges, respectively; $Q_{m,p}$ and $Q_{o,p}$ are the modeled and observed peak runoff discharges, respectively; the overbar indicates the mean value.

Fig. 4. Comparison of the observed and simulated runoff discharge for the Sandimen gauge station for model calibration using (a) the HEC-HMS model, (b) the ANN model, (c) the SVR model, and (d) the HEC-HMS-ANN and HEC-HMS-SVR hybrid models.

4. Results and discussion

In this study, we applied several rainfall-runoff prediction approaches in the Chishan Creek basin in southern Taiwan during the historical storm events. The prediction lead time ranges from t + 1 to t + 6 h (based on a recursive method). Note that the rainfalls were adopted from the field measured data (and can be obtained from the numerical weather predictions in the practical operation). In the following, the results predicted from each individual model (i.e., HEC-HMS, ANN, and SVR) and the hybrid models (i.e., HEC-HMS-ANN and HEC-HMS-SVR) are presented and discussed.

4.1. Event-based rainfall-runoff modeling using HEC-HMS

We considered a total of seven storm events, where the June 9 Flood (2006), Typhoon Sepat (2007), Typhoon Kamegi (2008), and Typhoon Jangmi (2008) were chosen for model calibration; and Typhoon Bilis (2006), Typhoon Krosa (2007), and Typhoon Sinlaku (2008) were adopted for model validation.

The model calibration and validation results for the HEC-HMS model are shown in Figs. 4(a) and 5, respectively. During the calibration phase, a reasonable runoff prediction can be obtained. For the validation events, the HEC-HMS model did not fully capture the runoff patterns (rising/falling limbs), although the time and discharge of predicted peak flow are still acceptable. Further, Table 1 evaluates the prediction performance for model validation events (since excellent results should be achieved through the calibration process). The MAE, RMSE, R, $ET_p$, $EQ_p$, and CE are 50.8 m³/s, 70.3 m³/s, 0.937, 0 h, −0.5%, and 0.841 for Typhoon Bilis, respectively. Notice that the worst CE occurs in Typhoon Sinlaku, i.e., CE = 0.826.

Table 1
Prediction performance of HEC-HMS, ANN, and hybrid models.

Typhoon event  Model         Lead time (h)  MAE (m³/s)  RMSE (m³/s)  R      ETp (h)  EQp (%)  CE
BILIS          HEC-HMS       –              50.8        70.3         0.937  0        −0.5     0.841
               ANN           1              16.4        31.6         0.984  1        1.6      0.968
                             2              20.2        37.5         0.977  1        3.2      0.955
                             4              30.1        52.5         0.955  1        1.6      0.911
                             6              39.9        68.2         0.923  27       9.1      0.850
               HEC-HMS-ANN   1              16.3        31.4         0.984  1        2.5      0.968
                             2              19.2        36.5         0.978  1        5.4      0.957
                             4              26.5        46.2         0.966  1        1.3      0.931
                             6              32.3        53.9         0.954  27       0.5      0.906
KROSA          HEC-HMS       –              46.2        69.2         0.947  2        3.9      0.874
               ANN           1              7.6         15.8         0.997  1        5.9      0.993
                             2              11.8        23.3         0.993  1        11.8     0.986
                             4              25.0        49.4         0.970  1        13.3     0.936
                             6              36.1        71.3         0.939  0        14.5     0.866
               HEC-HMS-ANN   1              7.8         15.3         0.997  1        7.9      0.994
                             2              14.9        29.1         0.989  1        13.8     0.978
                             4              28.5        54.3         0.962  1        15.3     0.923
                             6              37.6        67.1         0.942  0        17.8     0.882
SINLAKU        HEC-HMS       –              47.6        65.4         0.920  1        8.8      0.826
               ANN           1              10.2        18.0         0.994  1        3.0      0.987
                             2              14.5        24.5         0.989  1        6.0      0.975
                             4              24.5        40.2         0.973  1        9.7      0.934
                             6              34.0        54.7         0.952  0        12.9     0.878
               HEC-HMS-ANN   1              10.7        18.7         0.993  1        2.8      0.986
                             2              14.1        23.3         0.990  1        6.0      0.978
                             4              20.9        32.1         0.981  1        8.5      0.958
                             6              26.1        39.1         0.972  0        9.3      0.938
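The six indices of Section 3.4 (Eqs. (10)–(15)) can be computed with a short helper such as the one below. This is a sketch under stated assumptions: the function name and the toy discharge series are hypothetical, the series are taken to be equally spaced (hourly by default), and the peak time is taken as the first occurrence of the maximum.

```python
import math


def evaluate(q_model, q_obs, dt_hours=1.0):
    """Return MAE, RMSE, R, ETp, EQp, and CE for two equally long discharge series."""
    n = len(q_obs)
    mean_m = sum(q_model) / n
    mean_o = sum(q_obs) / n
    pairs = list(zip(q_model, q_obs))

    mae = sum(abs(m - o) for m, o in pairs) / n                      # Eq. (10)
    rmse = math.sqrt(sum((m - o) ** 2 for m, o in pairs) / n)        # Eq. (11)

    cov = sum((m - mean_m) * (o - mean_o) for m, o in pairs) / n
    std_m = math.sqrt(sum((m - mean_m) ** 2 for m in q_model) / n)
    std_o = math.sqrt(sum((o - mean_o) ** 2 for o in q_obs) / n)
    r = cov / (std_m * std_o)                                        # Eq. (12)

    peak_m, peak_o = max(q_model), max(q_obs)
    # Peak time = index of the first maximum, converted to hours.
    et_p = abs(q_model.index(peak_m) - q_obs.index(peak_o)) * dt_hours   # Eq. (13)
    eq_p = (peak_m - peak_o) / peak_o * 100.0                        # Eq. (14)

    ce = 1.0 - sum((o - m) ** 2 for m, o in pairs) / sum(
        (o - mean_o) ** 2 for o in q_obs)                            # Eq. (15)

    return {"MAE": mae, "RMSE": rmse, "R": r, "ETp": et_p, "EQp": eq_p, "CE": ce}


# Toy series (hypothetical values, m^3/s): the modeled peak arrives one hour early.
observed = [0.0, 10.0, 20.0, 10.0]
modeled = [0.0, 20.0, 10.0, 10.0]
scores = evaluate(modeled, observed)
```

A CE of 1 indicates a perfect fit, while a CE of 0 means the model is no better than predicting the observed mean, which is the case for the toy series above.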

4.2. Runoff prediction using the single data-driven models

4.2.1. The ANN model prediction


The single ANN model (with a similar network architecture but
without the input from the HEC-HMS model) was also applied for
rainfall-runoff prediction. Figs. 4(b) and 6 show excellent agree-
ment between the 1-h-ahead predictions and the measured data
during training and validation phases, respectively. The MAE, RMSE,
R, ETp , EQp , and CE for the 1-h-ahead prediction performance are
in the ranges of 7.6–16.4 m3 /s, 15.8–31.6 m3 /s, 0.984–0.997, 1 h,
1.6–5.9%, and 0.968–0.993, respectively (see Table 1). To reach
an operational forecasting, we examined the n-h-ahead predic-
tions using a recursive procedure, in which the discharge input
for the subsequent step is continuously updated from the newly
predicted output (instead of the measured data). The predictions
in Fig. 7 show significant accumulated errors with the increased
lead time up to t + 6, e.g., an overestimated discharge of peak flow
(and even an unrealistic peak flow in the rising limb). The perfor-
mance of 6-h-ahead runoff prediction during Typhoon Bilis gives
MAE = 39.9 m3 /s, RMSE = 68.2 m3 /s, R = 0.923, ETp = 27 h, EQp = 9.1%,
and CE = 0.850 (see Table 1).
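The recursive procedure for n-h-ahead prediction can be sketched as a loop that feeds each prediction back as the discharge input of the next step. The function and argument names are hypothetical, `one_step_model` stands for any trained 1-h-ahead predictor (ANN or SVR), and the rainfall over the lead window is assumed to be available from measurements or numerical weather predictions, as stated in Section 4. Passing `q_hec` appends the HEC-HMS prediction to each input, which corresponds to the hybrid models.

```python
def recursive_forecast(one_step_model, p_syn, q_obs, t, lead, q_hec=None):
    """Predict Q(t+1) ... Q(t+lead), feeding predictions back as discharge inputs."""
    q_now, q_prev = q_obs[t], q_obs[t - 1]      # last two measured discharges
    forecasts = []
    for k in range(1, lead + 1):
        step = t + k - 1                         # time level of the current inputs
        x = [p_syn[step], p_syn[step - 1], q_now, q_prev]
        if q_hec is not None:
            x.append(q_hec[step + 1])            # physically based prediction at step + 1
        q_next = one_step_model(x)
        forecasts.append(q_next)
        q_prev, q_now = q_now, q_next            # predicted values replace measured data
    return forecasts


# Toy check with a stand-in "model" that adds 1 to the latest discharge input.
rain = [0.0, 0.0, 0.0, 0.0]
flow = [4.0, 5.0]
toy = recursive_forecast(lambda x: x[2] + 1.0, rain, flow, t=1, lead=3)
```

Any error in an early step is carried into the later inputs, which is why the accumulated error grows with lead time.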

Fig. 5. Comparison of the observed and simulated runoff discharge for the Nan-Fong Bridge gauge station for model validation using the HEC-HMS model: (a) Typhoon Bilis, (b) Typhoon Krosa, and (c) Typhoon Sinlaku.

4.2.2. The SVR model prediction

Similarly, the SVR model was used to predict rainfall-runoff relationship in the Chishan Creek basin. A very good comparison between the 1-h-ahead prediction and the measured data can be found in both training and validation phases (see Figs. 4(c) and 8). The MAE, RMSE, R, $ET_p$, $EQ_p$, and CE for the 1-h-ahead prediction are in the ranges of 7.9–15.7 m³/s, 16.6–31.1 m³/s, 0.984–0.996, 1 h, 0.7–9.0%, and 0.969–0.993, almost identical to the performance assessment of the ANN model (see Table 2). For the n-h lead time,

Table 2
Prediction performance of HEC-HMS, SVR, and hybrid models.

Typhoon event  Model         Lead time (h)  MAE (m³/s)  RMSE (m³/s)  R      ETp (h)  EQp (%)  CE
BILIS          HEC-HMS       –              50.8        70.3         0.937  0        −0.5     0.841
               SVR           1              15.7        31.1         0.984  1        0.7      0.969
                             2              16.1        33.0         0.982  1        1.5      0.965
                             4              20.9        39.8         0.974  1        −2.6     0.949
                             6              26.6        49.1         0.960  1        −5.1     0.922
               HEC-HMS-SVR   1              15.0        29.9         0.986  1        1.1      0.971
                             2              16.1        32.8         0.982  1        2.3      0.965
                             4              20.5        37.7         0.977  1        −2.5     0.954
                             6              24.6        41.4         0.973  1        −5.6     0.945
KROSA          HEC-HMS       –              46.2        69.2         0.947  2        3.9      0.874
               SVR           1              7.9         16.6         0.996  1        9.0      0.993
                             2              8.6         19.9         0.995  1        21.2     0.990
                             4              16.5        35.2         0.986  1        20.2     0.967
                             6              23.4        50.6         0.970  1        13.5     0.933
               HEC-HMS-SVR   1              7.3         15.1         0.997  1        10.3     0.994
                             2              12.6        27.6         0.990  1        24.0     0.980
                             4              21.5        41.5         0.977  0        13.5     0.955
                             6              25.9        47.6         0.970  3        2.4      0.940
SINLAKU        HEC-HMS       –              47.6        65.4         0.920  1        8.8      0.826
               SVR           1              9.5         17.2         0.994  1        3.3      0.988
                             2              11.1        20.2         0.993  1        6.9      0.983
                             4              16.7        29.1         0.988  1        10.9     0.965
                             6              22.4        38.5         0.980  1        13.1     0.940
               HEC-HMS-SVR   1              10.0        17.9         0.994  1        3.3      0.987
                             2              11.4        20.1         0.992  1        6.7      0.983
                             4              15.3        25.4         0.988  1        8.4      0.974
                             6              18.7        30.5         0.983  1        8.0      0.962

Fig. 6. Comparison of the observed and simulated runoff discharge for the Nan-Fong Bridge gauge station for the validation phase using the ANN model with 1 h-ahead prediction: (a) Typhoon Bilis, (b) Typhoon Krosa, and (c) Typhoon Sinlaku.

Fig. 7. Comparison of the observed and simulated runoff discharge for the Nan-Fong Bridge gauge station during Typhoon Bilis using the ANN model with (a) 1 h-, (b) 2 h-, (c) 4 h-, and (d) 6 h-ahead prediction.

Fig. 8. Comparison of the observed and simulated runoff discharge for the Nan-Fong Bridge gauge station for the validation phase using the SVR model with 1 h-ahead prediction: (a) Typhoon Bilis, (b) Typhoon Krosa, and (c) Typhoon Sinlaku.

Fig. 9. Comparison of the observed and simulated runoff discharge for the Nan-Fong Bridge gauge station during Typhoon Bilis using the SVR model with (a) 1 h-, (b) 2 h-, (c) 4 h-, and (d) 6 h-ahead prediction.
the SVR model demonstrates a superior prediction capability, i.e., less accumulated errors with the increased lead time (as shown in Fig. 9). The MAE, RMSE, R, $ET_p$, $EQ_p$, and CE for the 6-h-ahead runoff prediction during Typhoon Bilis are 26.6 m³/s, 49.1 m³/s, 0.960, 1 h, −5.1%, and 0.922, respectively (see Table 2).

4.3. Runoff prediction using the hybrid models

Last, Figs. 4(d) and 10 compare the 1-h-ahead forecasts from HEC-HMS-ANN and HEC-HMS-SVR hybrid models in training and validation phases. The prediction and the observation are in excellent agreement. The ranges of MAE, RMSE, R, $ET_p$, $EQ_p$, and CE among the validation events (see Tables 1 and 2) are 7.3–15.0 m³/s, 15.1–29.9 m³/s, 0.986–0.997, 1 h, 1.1–10.3%, and 0.971–0.994 for the HEC-HMS-SVR model, respectively (7.8–16.3 m³/s, 15.3–31.4 m³/s, 0.984–0.997, 1 h, 2.5–7.9%, and 0.968–0.994 for the HEC-HMS-ANN model).

Fig. 11 also shows the 1, 2, 4, and 6-h-ahead runoff predictions from both hybrid models during Typhoon Bilis. In general, the simulated patterns remain quite close to the measured data, reasonably capturing the discharge and rise time of the peak flow. The HEC-HMS-SVR model presents great accuracy even in the 6-h-ahead forecasting with MAE = 24.6 m³/s, RMSE = 41.4 m³/s, R = 0.973, $ET_p$ = 1 h, $EQ_p$ = −5.6%, and CE = 0.945 (see Table 2 for the performance assessment). In addition, the results from the HEC-HMS-SVR model are much better than those by the HEC-HMS-ANN model (i.e., MAE = 32.3 m³/s, RMSE = 53.9 m³/s, R = 0.954, $ET_p$ = 27 h, $EQ_p$ = 0.5%, and CE = 0.906).

Further, Table 3 compares the performances of 6-h-ahead predictions among the hybrid and data-driven only models. The additional input from HEC-HMS predictions effectively reduces the accumulated errors in the data-driven models associated with the increased lead time. When integrating a physically based component into ANN (or SVR), the maximum improvements in the root mean square error (RMSE), correlation coefficient (R) and model efficiency coefficient (CE) are 28.5%, 3.4%, and 6.8% (or 20.8%, 1.4%, and 2.5%), respectively.

Table 3
Assessment of prediction improvement for 6-h ahead runoff discharge.

Typhoon event  Model         RMSE (m³/s)  R      CE     Improvement in RMSE (%)  R (%)  CE (%)
BILIS          ANN           68.2         0.923  0.850  –                        –      –
               HEC-HMS-ANN   53.9         0.954  0.906  −21.0                    +3.4   +6.6
               SVR           49.1         0.960  0.922  –                        –      –
               HEC-HMS-SVR   41.4         0.973  0.945  −15.7                    +1.4   +2.5
KROSA          ANN           71.3         0.939  0.866  –                        –      –
               HEC-HMS-ANN   67.1         0.942  0.882  −5.9                     +0.3   +1.8
               SVR           50.6         0.970  0.933  –                        –      –
               HEC-HMS-SVR   47.6         0.970  0.940  −5.9                     +0.0   +0.8
SINLAKU        ANN           54.7         0.952  0.878  –                        –      –
               HEC-HMS-ANN   39.1         0.972  0.938  −28.5                    +2.1   +6.8
               SVR           38.5         0.980  0.940  –                        –      –
               HEC-HMS-SVR   30.5         0.983  0.962  −20.8                    +0.3   +2.3

Fig. 10. Comparison of the observed and simulated runoff discharge for the Nan-Fong Bridge gauge station for the validation phase using the HEC-HMS-ANN and HEC-HMS-SVR hybrid models with 1 h-ahead prediction: (a) Typhoon Bilis, (b) Typhoon Krosa, and (c) Typhoon Sinlaku.

Fig. 11. Comparison of the observed and simulated runoff discharge for the Nan-Fong Bridge gauge station during Typhoon Bilis using the HEC-HMS-ANN and HEC-HMS-SVR hybrid models with (a) 1 h-, (b) 2 h-, (c) 4 h-, and (d) 6 h-ahead prediction.

4.4. Comparison and discussion of model predictions

To show the forecasting performance more clearly, Figs. 12 and 13 further compare the observation-prediction pairs of hourly runoff discharge for all the calibration (or training) and validation events, respectively. According to the spread of the scattering, the 1-h-ahead runoff discharge predictions by the data-driven and hybrid models (i.e., ANN, SVR, HEC-HMS-ANN, and HEC-HMS-SVR) are superior to that obtained by the HEC-HMS model. While the HEC-HMS model yields the correlation coefficient R = 0.946 (0.931) in calibration (validation), the other models give an almost perfect R over 0.990. In terms of their linear regressions, the predictions of the HEC-HMS model underestimate the observations by 5–9% on average, i.e., y = 0.948 x + 0.64 for calibration and y = 0.912 x − 3.9 for validation. The slopes of regression $m_R$ for the data-driven and hybrid models are 0.986–0.998 (i.e., an error of around 1% or less).

Figs. 14–16 show the scatter plots for the n-h-ahead forecasting results from the ANN, SVR, and hybrid models, respectively. The ANN and SVR algorithms present different generalization capabilities in the recursive predictions although their training results are almost the same. The ANN model tends to over-predict the rising limb (and under-predict the falling limb) of the runoff hydrograph (see Fig. 7(d)), explaining the wider spread of the scattering with the correlation coefficient R = 0.939 and the slope of regression $m_R$ = 0.976 (see Fig. 14(c)). The SVR model gives overall better results, i.e., R = 0.970 and $m_R$ = 1.003 (see Fig. 15(c)). The hybrid model HEC-HMS-SVR (or HEC-HMS-ANN) provides further accuracy (see Fig. 16(c)) from R = 0.970 to 0.975 (or R = 0.939 to 0.956).

The systematic comparisons reveal the important roles of physically based model and data-driven models in the proposed combination frameworks. For the 1-h-ahead forecasting results, both SVR and ANN used in the hybrid models greatly improve the predictions from HEC-HMS, consistent with those utilizing different neural networks [22,25] or various error updating approaches [34]. Besides, the outputs of physically based HEC-HMS model can be interpreted as a predictor for the data-driven ANN and SVR models, like the concept of predictor-corrector numerical schemes employed in hydrodynamic models (e.g., see the cited references in [47]). The positive influences become evident particularly in the long lead-time predictions. Advantages of coupling more physics also can be found in some similar works [31,37], where the watershed characteristics derived from the GIUH theory
were implemented into the structure of ANN and SVR (as connection weights between hidden and
output layers rather than inputs in this study). Overall, our thorough analysis indicated
that the combination of HEC-HMS and SVR models yields the most accurate runoff discharge
predictions for the Chishan Creek basin.

Fig. 12. Scatter plots of predicted and measured runoff discharges for (a) calibration of
the HEC-HMS model, (b) training of the ANN model, (c) training of the SVR model, and (d)
training of the HEC-HMS-ANN and HEC-HMS-SVR hybrid models, where the lines indicate linear
regression of the prediction-measurement pairs (symbols). (HEC-HMS-ANN: x-marks & dashed
lines and HEC-HMS-SVR: circles and solid lines).

Fig. 13. Scatter plots of predicted and measured runoff discharges for (a) validation of the
HEC-HMS model, (b) validation of the ANN model, (c) validation of the SVR model, and (d)
validation of the HEC-HMS-ANN and HEC-HMS-SVR models, where the lines indicate linear
regression of the prediction-measurement pairs (symbols). (HEC-HMS-ANN: x-marks & dashed
lines and HEC-HMS-SVR: circles and solid lines).

Fig. 14. Scatter plots of predicted and measured runoff discharges using the ANN model with
(a) 2 h-, (b) 4 h-, and (c) 6 h-ahead prediction, where the solid lines represent linear
regression of the prediction-measurement pairs (circles).

Fig. 15. Scatter plots of predicted and measured runoff discharges using the SVR model with
(a) 2 h-, (b) 4 h-, and (c) 6 h-ahead prediction, where the solid lines represent linear
regression of the prediction-measurement pairs (circles).
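The improvement columns of Table 3 appear to be the relative change of each hybrid statistic
with respect to its data-driven baseline (e.g., HEC-HMS-ANN versus ANN). The following
minimal Python sketch assumes that definition, which is inferred from the tabulated values
rather than stated in this section; the names TABLE3, relative_change and improvements are
illustrative only. Under that assumption it reproduces the tabulated percentages.

```python
# Table 3 values for the 6-h-ahead predictions: (RMSE in m^3/s, R, CE).
TABLE3 = {
    "BILIS": {
        "ANN": (68.2, 0.923, 0.850),
        "HEC-HMS-ANN": (53.9, 0.954, 0.906),
        "SVR": (49.1, 0.960, 0.922),
        "HEC-HMS-SVR": (41.4, 0.973, 0.945),
    },
    "KROSA": {
        "ANN": (71.3, 0.939, 0.866),
        "HEC-HMS-ANN": (67.1, 0.942, 0.882),
        "SVR": (50.6, 0.970, 0.933),
        "HEC-HMS-SVR": (47.6, 0.970, 0.940),
    },
    "SINLAKU": {
        "ANN": (54.7, 0.952, 0.878),
        "HEC-HMS-ANN": (39.1, 0.972, 0.938),
        "SVR": (38.5, 0.980, 0.940),
        "HEC-HMS-SVR": (30.5, 0.983, 0.962),
    },
}


def relative_change(baseline, hybrid):
    """Percentage change of the hybrid value relative to the baseline value.

    Assumed definition (inferred from Table 3): 100 * (hybrid - baseline) / baseline.
    """
    return 100.0 * (hybrid - baseline) / baseline


def improvements(event, baseline_name):
    """Return the (RMSE, R, CE) changes in percent, rounded to one decimal."""
    baseline = TABLE3[event][baseline_name]
    hybrid = TABLE3[event]["HEC-HMS-" + baseline_name]
    return tuple(round(relative_change(b, h), 1) for b, h in zip(baseline, hybrid))


if __name__ == "__main__":
    for event in TABLE3:
        for name in ("ANN", "SVR"):
            d_rmse, d_r, d_ce = improvements(event, name)
            print(f"{event:8s} HEC-HMS-{name}: "
                  f"RMSE {d_rmse:+.1f}%  R {d_r:+.1f}%  CE {d_ce:+.1f}%")
```

A negative RMSE change together with positive R and CE changes indicates that the hybrid
model outperforms its data-driven baseline for that typhoon event.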
Fig. 16. Scatter plots of predicted and measured runoff discharges using the HEC-HMS-ANN
(circles) and HEC-HMS-SVR (x-marks) hybrid models with (a) 2 h-, (b) 4 h-, and (c)
6 h-ahead prediction, where lines represent linear regression of the prediction-measurement
pairs (symbols). (HEC-HMS-ANN: x-marks & dashed lines and HEC-HMS-SVR: circles and solid
lines).

5. Conclusions

In this research, we proposed a new hybrid approach integrating the physically based
(HEC-HMS) and data-driven (SVR) models to predict the hourly runoff discharges in the
Chishan Creek basin, southern Taiwan. Seven storm events (with 1200 data sets) were
considered for model calibration (or training) and validation. Rainfall-runoff modeling
under different lead times was carried out using the HEC-HMS, ANN, SVR, HEC-HMS-ANN and
HEC-HMS-SVR models. Their performances were assessed by six statistical indices (i.e., MAE,
RMSE, R, ETp, EQp, and CE).

Based upon the detailed and thorough comparison, the HEC-HMS model cannot fully describe
runoff patterns during the extreme typhoon events. The ANN and SVR models have exactly the
same 1-h-ahead predictions, in excellent agreement with the measured data. However, they
present different generalization capabilities in recursive predictions. The results obtained
by the SVR model are much better than those from the ANN model. For the hybrid approaches,
the HEC-HMS-SVR model provides the most accurate runoff discharge predictions for the
Chishan Creek basin.

The roles of the physically based and the data-driven components in the present hybrid
framework are further revealed. The predictions of HEC-HMS can be improved by the ANN and
SVR, consistent with the error updating approaches [34]. Meanwhile, similar to the
implementation of a physically based component into the data-driven structure [37], the
additional input from HEC-HMS can serve as a predictor [47] for the ANN and SVR so that the
unwanted accumulated errors in the n-h-ahead prediction can be eliminated effectively.

In the future, the newly developed hybrid model with its operational nature can be used by
the government to reduce the flooding impacts during typhoon events. For further research, a
dual form of this hybrid approach (i.e., the HEC-HMS model with the embedded data-driven
tools) will be investigated, providing a potential alternative to the data assimilation
approaches for accurate hydrologic modeling [48,49].

Acknowledgments

This research was conducted with the support of the National Science Council grant no.
103-2625-M-239-001. This financial support is greatly appreciated. The authors would like to
express their appreciation to the Water Resources Agency for providing access to their
recorded data.

References

[1] E. Todini, Rainfall-runoff modeling-past present and future, J. Hydrol. 100 (1988) 341–352.
[2] P.S. Yu, S.T. Chen, I.F. Chang, Support vector regression for real-time flood stage forecasting, J. Hydrol. 328 (3–4) (2006) 704–716.
[3] R.J. Moore, V.A. Bell, D.A. Jones, Forecasting for flood warning, C. R. Geosci. 337 (1–2) (2005) 203–217.
[4] F. Pappenberger, K.J. Beven, N.M. Hunter, P.D. Bates, B.T. Gouweleeuw, J. Thielen, A.P.J. de Roo, Cascading model uncertainty from medium range weather forecasts (10 days) through a rainfall-runoff model to flood inundation predictions within the European Flood Forecasting System (EFFS), Hydrol. Earth Syst. Sci. 9 (4) (2005) 381–393.
[5] M.P. Tripathi, P.K. Panda, N.S. Raghuwanshi, Identification and prioritization of critical sub-watersheds for soil conservation management using SWAT model, Biosyst. Eng. 85 (3) (2003) 365–379.
[6] M. Arabi, J.R. Frankenberger, B.A. Engel, J.G. Arnold, Representation of agricultural practices with SWAT, Hydrol. Process. 22 (16) (2008) 3042–3055.
[7] A.O. Pektas, H.K. Cigizoglu, ANN hybrid model versus ARIMA and ARIMAX models of runoff coefficient, J. Hydrol. 500 (2013) 21–36.
[8] J. Joo, T. Kjeldsen, H.J. Kim, A comparison of two event-based flood models (ReFH-rainfall runoff model and HEC-HMS) at two Korean catchments, Bukil and Jeungpyeong, KSCE J. Civ. Eng. 18 (1) (2014) 330–343.
[9] W. Wang, J. Ding, Wavelet network model and its application to the prediction of hydrology, Nat. Sci. 1 (1) (2003) 67–71.
[10] A.D. Feldman, Hydrologic Modeling System HEC-HMS. Technical Reference Manual, U.S. Army Corps of Engineers, Hydrologic Engineering Center, HEC, Davis, CA, USA, 2000.
[11] M. Rezaeianzadeh, A. Stein, H. Tabari, H. Abghari, N. Jalalkamali, E.Z. Hosseinipour, V.P. Singh, Assessment of a conceptual hydrological model and artificial neural networks for daily outflows forecasting, Int. J. Environ. Sci. Technol. 10 (6) (2013) 1181–1192.
[12] C.W. Dawson, R.L. Wilby, Hydrological modelling using artificial neural networks, Prog. Phys. Geogr. 25 (1) (2001) 80–108.
[13] R.J. Abrahart, P.E. Kneale, L. See, Neural Networks for Hydrological Modelling, Taylor & Francis, 2004, pp. 304.
[14] H. Tabari, S. Marifi, H.Z. Abyaneh, M.R. Sharifi, Comparison of artificial neural network and combined models in estimating spatial distribution of snow depth and snow water equivalent in Samsami basin of Iran, Neural Comput. Appl. 19 (4) (2010) 625–635.
[15] A. Talei, L.H.C. Chua, C. Quek, P.E. Jansson, Runoff forecasting using a Takagi-Sugeno neuro-fuzzy model with online learning, J. Hydrol. 488 (2013) 17–32.
[16] A.Y. Shamseldin, Application of a neural network technique to rainfall-runoff modelling, J. Hydrol. 199 (3–4) (1997) 272–294.
[17] A. Mukerji, C. Chatterjee, N.S. Raghuwanshi, Flood forecasting using ANN, neuro-fuzzy, and neuro-GA models, J. Hydrol. Eng. ASCE 14 (6) (2009) 647–652.
[18] V. Nourani, O. Kalantari, Integrated artificial neural network for spatiotemporal modeling of rainfall-runoff-sediment processes, Environ. Eng. Sci. 27 (5) (2010) 411–422.
[19] M. Rezaeianzadeh, H. Tabari, A.A. Yazdi, S. Isik, L. Kalin, Flood flow forecasting using ANN, ANFIS and regression models, Neural Comput. Appl. 25 (1) (2014) 25–37.
[20] J.D. Garbrecht, Comparison of three alternative ANN designs for monthly rainfall-runoff simulation, J. Hydrol. Eng. ASCE 11 (5) (2006) 502–505.
[21] G.P. Zhang, Time series forecasting using a hybrid ARIMA and neural network model, Neurocomputing 50 (1) (2003) 159–175.
[22] C.C. Young, W.C. Liu, Prediction and modelling of rainfall-runoff during typhoon events using a physically-based and artificial neural network hybrid model, Hydrol. Sci. J. 60 (12) (2015) 2102–2116.
[23] R.S. Govindaraju, A.R. Rao, Artificial Neural Networks in Hydrology, Kluwer, The Netherlands, 2000.
[24] D.A. Savic, G.A. Walters, J. Davidson, A genetic programming approach to rainfall-runoff modeling, Water Resour. Manag. 13 (3) (1999) 219–231.
[25] C.C. Young, W.C. Liu, C.E. Chung, Genetic algorithm and fuzzy neural networks combined with the hydrologic modeling system for forecasting watershed runoff discharge, Neural Comput. Appl. 26 (7) (2015) 1631–1643.
[26] C. Sivapragasam, S.Y. Liong, M.F.K. Pasha, Rainfall and runoff forecasting with SSA-SVM approach, J. Hydroinform. 3 (3) (2001) 141–152.
[27] M. Bray, D. Han, Identification of support vector machines for runoff modeling, J. Hydroinform. 6 (4) (2004) 265–280.
[28] C.L. Wu, K.W. Chau, Y.S. Li, River stage prediction based on a distributed support vector regression, J. Hydrol. 358 (1–2) (2008) 96–111.
[29] R. Maity, P.P. Bhagwat, A. Bhatnagar, Potential of support vector regression for prediction of monthly streamflow using endogenous property, Hydrol. Process. 24 (7) (2010) 917–923.
[30] M.C. Wu, G.F. Lin, H.Y. Lin, Improving the forecasts of extreme streamflow by support vector regression with the data extracted by self organizing map, Hydrol. Process. 28 (2) (2014) 386–397.
[31] S.M. Hosseini, N. Mahjouri, Integrating support vector regression and a geomorphologic artificial neural network for daily rainfall-runoff modeling, Appl. Soft Comput. 38 (2016) 329–345.
[32] V.N. Vapnik, An overview of statistical learning theory, IEEE Trans. Neural Netw. 10 (5) (1999) 988–999.
[33] A.Y. Shamseldin, K.M. O’Connor, A non-linear neural network technique for updating of river flow forecasts, Hydrol. Earth Syst. Sci. 5 (4) (2001) 577–597.
[34] L. Xiong, K.M. O’Connor, Comparison of four updating models for real-time river flow forecasting, Hydrol. Sci. J. 47 (4) (2002) 621–639.
[35] A.Y. Shamseldin, K.M. O’Connor, A.E. Nasr, A comparative study of three neural network forecast combination methods for simulated river flows of different rainfall-runoff models, Hydrol. Sci. J. 52 (5) (2007) 896–916.
[36] A.K. Fernando, A.Y. Shamseldin, R.J. Abrahart, Use of gene expression programming for multimodel combination of rainfall-runoff models, J. Hydrol. Eng. ASCE 17 (9) (2012) 975–985.
[37] B. Zhang, R.S. Govindaraju, Geomorphology-based artificial neural networks (GANNs) for estimation of direct runoff over watersheds, J. Hydrol. 273 (2003) 18–34.
[38] R.K. Panda, N. Pramanik, B. Bala, Simulation of river stage using artificial neural network and MIKE 11 hydrodynamic model, Comput. Geosci. 36 (6) (2010) 735–745.
[39] G. Napolitano, L. See, B. Calvo, F. Savi, A. Heppenstall, A conceptual and neural network model for real-time flood forecasting of the Tiber River in Rome, Phys. Chem. Earth 35 (3–5) (2010) 187–194.
[40] C.Y. Tsou, Z.Y. Feng, M. Chigira, Catastrophic landslide induced by Typhoon Morakot, Shiaolin, Taiwan, Geomorphology 127 (2011) 166–178.
[41] US Army Corps of Engineers, Hydrologic Modeling System (HEC-HMS) Application Guide: Version 3.1.0, Institute for Water Resources, Hydrologic Engineering Center, Davis, CA, 2008.
[42] US Soil Conservation Service, Urban Hydrology for Small Watersheds (Technical Release 55), US Department of Agriculture, 1986.
[43] C.O. Clark, Storage and the unit hydrograph, Trans. Am. Soc. Civ. Eng. 110 (1945) 1419–1488.
[44] V.T. Chow, D.R. Maidment, L.W. Mays, Applied Hydrology, McGraw-Hill, New York, 1988.
[45] D.E. Rumelhart, G.E. Hinton, R.J. Williams, Learning representations by back-propagating errors, Nature 323 (1986) 533–536.
[46] M.T. Hagan, M. Menhaj, Training feedforward networks with the Marquardt algorithm, IEEE Trans. Neural Netw. 5 (6) (1994) 989–993.
[47] C.C. Young, C.H. Wu, An efficient and accurate non-hydrostatic model with embedded Boussinesq-type like equations for surface wave modeling, Int. J. Numer. Methods Fluids 60 (1) (2009) 27–53.
[48] M.P. Clark, D.E. Rupp, R.A. Woods, X. Zheng, R.P. Ibbitt, A.G. Slater, J. Schmidt, M.J. Uddstrom, Hydrological data assimilation with the ensemble Kalman filter: use of streamflow observations to update states in a distributed hydrological model, Adv. Water Resour. 31 (10) (2008) 1309–1324.
[49] H.K. McMillan, E.O. Hreinsson, M.P. Clark, S.K. Singh, C. Zammit, M.J. Uddstrom, Operational hydrological data assimilation with the recursive ensemble Kalman filter, Hydrol. Earth Syst. Sci. 17 (1) (2013) 21–38.
