
Journal of Hydrology 486 (2013) 334–342

Contents lists available at SciVerse ScienceDirect

Journal of Hydrology
journal homepage: www.elsevier.com/locate/jhydrol

Typhoon flood forecasting using integrated two-stage Support Vector Machine approach

Gwo-Fong Lin a,*, Yang-Ching Chou a, Ming-Chang Wu b
a Department of Civil Engineering, National Taiwan University, Taipei 10617, Taiwan
b Taiwan Typhoon and Flood Research Institute, National Applied Research Laboratories, Taipei 10093, Taiwan

Article info

Article history:
Received 29 October 2012
Received in revised form 17 January 2013
Accepted 2 February 2013
Available online 18 February 2013
This manuscript was handled by Konstantine P. Georgakakos, Editor-in-Chief, with the assistance of Eylon Shamir, Associate Editor

Keywords:
Flood forecasting
Forecasted rainfall
Support Vector Machines
Typhoon characteristics

Summary

Accurate runoff forecasts are essential for flood mitigation and warning. In this paper, a two-stage flood forecasting model that is based on Support Vector Machine (SVM) is presented. In the first stage, the observed typhoon characteristics and observed rainfall are used to produce a rainfall forecast; in the second stage, the forecasted rainfall and observed runoff are used to produce a runoff forecast. A dataset of 16 typhoon storms from Taiwan was used to evaluate the two-stage SVM model. The SVM model generated accurate rainfall and runoff forecasts with a 1–6 h lead time, especially for the peak runoff values. A substantial performance improvement of the flood forecast is shown for the 4- to 6-h lead time. In conclusion, the SVM model provides an operational advantage by increasing the forecast lead time during typhoon events.
© 2013 Elsevier B.V. All rights reserved.

1. Introduction

To mitigate disasters due to typhoons, accurate and reliable runoff forecasts are required to provide early warning of impending floods, and their improvement has been recognized as an important task. However, the highly non-linear and complex processes of typhoon rainfall and runoff make it difficult to construct a reliable physically-based model. Data-driven Neural Network (NN) approaches have been recommended as an attractive alternative to the physically-based models. The American Society of Civil Engineers (ASCE) Task Committee (2000a,b) and Maier and Dandy (2000) provide a general introduction and a comprehensive review of the application of NNs in hydrology. In recent years, NNs have been successfully used in various hydrologic modeling applications (e.g., de Vos and Rientjes, 2005; Hu et al., 2007; Lin and Chen, 2004) and specifically for rainfall and flood forecasting (e.g., Chang et al., 2004; Chiang et al., 2007; Lin and Chen, 2005; Lin et al., 2010; Luk et al., 2001; Pramanik et al., 2011; Rathinasamy and Khosa, 2012; Toth and Brath, 2007).

The major advantage of NNs is their capability to simulate complex relationships between desired output and available input, given the existence of sufficient training datasets. Because of their flexibility in modeling nonlinear systems and their computational efficiency, NNs have received considerable attention.

However, the flood forecasting performance of most NNs decreases rapidly with increasing forecast lead time. Operational agencies that are responsible for flood mitigation and warnings can benefit from improved forecast accuracy at longer lead times. Multi-stage NN-based models were developed in an attempt to improve forecasts at longer lead times (Chang et al., 2007; Lin and Wu, 2011). The concept of multi-stage NN-based models is that two or more NN-based models are connected. Using the two-stage model as an example, the connection between the two stages is that forecasted values from the first-stage module are used as input to the second-stage module. It is widely known that rainfall is one of the most important inputs to a flood forecasting model and that the accuracy of long lead time flood forecasting can be improved with more accurate rainfall forecasts. Lin et al. (2009c) improved longer lead time streamflow forecasts by using a Multilayer Perceptron with the back-propagation training algorithm (hereinafter BPN) to predict rainfall as input to a Radial Basis Function (RBF)-based reservoir inflow forecasting model. Chiang and Chang (2009) used Quantitative Precipitation Forecasting (QPF) information as input to a Recurrent Neural Network (RNN)-based flood forecasting model and reported a similar finding, that is, the forecasted rainfall was capable of providing useful information for flood forecasting, especially for long lead times.

* Corresponding author. Tel.: +886 2 3366 4368; fax: +886 2 2363 1558.
E-mail address: gflin@ntu.edu.tw (G.-F. Lin).

0022-1694/$ - see front matter © 2013 Elsevier B.V. All rights reserved.
http://dx.doi.org/10.1016/j.jhydrol.2013.02.012

In previous studies, multi-stage NN-based flood forecasting models have been based on conventional NNs. The architecture and the weights of these conventional NNs are determined by a trial-and-error procedure, which is an iterative and time-consuming process. Although the selection of NN-based models commonly disregards the efficiency of the model training, it is essential to develop a well-performing model that can be quickly trained.

In this paper, a two-stage Support Vector Machine (SVM)-based model is developed to yield 1- to 6-h lead time runoff forecasts. Support Vector Machines (SVMs) have been used for hydrologic time series forecasting (Liong and Sivapragasam, 2002; Sivapragasam and Liong, 2005; Wu et al., 2009; Yu and Liong, 2007; Yu et al., 2004). More recently, Rasouli et al. (2012) used SVM and other machine learning methods with weather and climate inputs to forecast daily streamflow. Based on statistical learning theory, SVM has better generalization ability and requires less training time than conventional NNs (e.g., BPN). For both rainfall and inflow forecasting, Lin et al. (2009a,b) demonstrated that SVM-based models outperform BPN-based models. Moreover, the development of SVM-based models is efficient and thus expected to be suitable for development of the two-stage model presented herein.

The objective of this study is to demonstrate a two-stage SVM-based model for typhoon flood forecasting. A rainfall forecasting module is developed in the first stage to pre-process the typhoon information (namely, typhoon characteristics and rainfall) and to produce rainfall forecasts. Then, the rainfall forecasts along with the observed runoff are used as input to the flood forecasting module in the second stage. This procedure is expected to reduce the input dimensionality and improve the performance at longer forecast lead times, especially for the prediction of peak runoff.

This paper is organized in the following manner. Section 2 describes the development of the two-stage model. The application of the proposed two-stage model and the forecast results are presented in Section 3. Section 4 summarizes the main conclusions of this study.

2. Model development

2.1. SVM theory

SVMs were developed for classification applications in the early 1990s, and later extended for regression analysis by Vapnik (1995). In this section, the methodology of the support vector regression (SVR) used in this paper is briefly described; more details can be found in several text books (Cristianini and Shawe-Taylor, 2000; Vapnik, 1995, 1998).

Fig. 1. Architectural graph of SVM.

The architecture of an SVM is presented in Fig. 1. Based on $N_d$ training data $[(x_1, y_1), (x_2, y_2), \ldots, (x_{N_d}, y_{N_d})]$, the objective of the SVR is to find a non-linear regression function to yield the output $\hat{y}$, which is the best approximation of the desired output $y$ with an error tolerance of $\varepsilon$. First, the input vector $x$ is mapped into a higher dimensional feature space by a non-linear function $\phi(x)$. Then the regression function that relates the input vector $x$ to the output $\hat{y}$ can be written as:

$$\hat{y} = f(x) = w^{T}\phi(x) + b \tag{1}$$

where $w$ and $b$ are the weights and bias of the regression function, respectively. According to the Structural Risk Minimization (SRM) induction principle, the learning objective of an SVM is to minimize both the empirical risk and the model complexity. Based on the SRM induction principle, $w$ and $b$ are estimated by minimizing the following structural risk function:

$$R = \frac{1}{2} w^{T} w + C \sum_{i=1}^{N_d} L_{\varepsilon}(\hat{y}_i) \tag{2}$$

where Vapnik's $\varepsilon$-insensitive loss function $L_{\varepsilon}$ is defined as:

$$L_{\varepsilon}(\hat{y}) = |y - f(x)|_{\varepsilon} = \begin{cases} 0 & \text{for } |y - f(x)| < \varepsilon \\ |y - f(x)| - \varepsilon & \text{for } |y - f(x)| \ge \varepsilon \end{cases} \tag{3}$$

The first and second terms in Eq. (2) represent the model complexity and the empirical error, respectively. The trade-off between the model complexity and the empirical error is specified by a user-defined parameter $C$; $C = 1$ represents the case in which the model complexity is as important as the empirical error. The use of the SRM induction principle results in the better generalization ability of SVMs and avoids over-training of the model.

Vapnik (1995) expressed the SVR problem in terms of the following optimization problem:

$$\text{Minimize } R(w, b, \xi, \xi') = \frac{1}{2}\|w\|^{2} + C \sum_{i=1}^{N_d} (\xi_i + \xi'_i) \tag{4}$$

subject to

$$\begin{aligned} y_i - \hat{y}_i &= y_i - (w^{T}\phi(x_i) + b) \le \varepsilon + \xi_i \\ \hat{y}_i - y_i &= (w^{T}\phi(x_i) + b) - y_i \le \varepsilon + \xi'_i \\ \xi_i &\ge 0, \quad i = 1, 2, \ldots, N_d \\ \xi'_i &\ge 0, \quad i = 1, 2, \ldots, N_d \end{aligned} \tag{5}$$

where $\xi$ and $\xi'$, which are slack variables used to convert an inequality constraint into an equality constraint, represent the upper and the lower training errors, respectively. The above optimization problem is usually solved in its dual form using Lagrange multipliers. Rewriting Eq. (4) in its dual form and differentiating with respect to the primal variables $(w, b, \xi, \xi')$ gives:

$$\text{Maximize } \sum_{i=1}^{N_d} y_i(\alpha_i - \alpha'_i) - \varepsilon \sum_{i=1}^{N_d} (\alpha_i + \alpha'_i) - \frac{1}{2} \sum_{i=1}^{N_d} \sum_{j=1}^{N_d} (\alpha_i - \alpha'_i)(\alpha_j - \alpha'_j)\,\phi(x_i)^{T}\phi(x_j) \tag{6}$$

subject to:

$$\begin{aligned} \sum_{i=1}^{N_d} (\alpha_i - \alpha'_i) &= 0 \\ 0 \le \alpha_i &\le C, \quad i = 1, 2, \ldots, N_d \\ 0 \le \alpha'_i &\le C, \quad i = 1, 2, \ldots, N_d \end{aligned} \tag{7}$$

where $\alpha$ and $\alpha'$ are the dual Lagrange multipliers. Note that the solution to the optimization problem (Eq. (6)) is guaranteed to be unique and to converge to the global optimum because the objective function is a convex function.

The optimal Lagrange multipliers $\alpha^{*}$ are solved by the standard quadratic programming algorithm and then the regression function can be rewritten as:

$$f(x) = \sum_{i=1}^{N_d} \alpha_i K(x_i, x) + b \tag{8}$$

where $K(x_i, x)$ is the kernel function. The kernel function used in this paper is the radial basis function:

$$K(x_i, x) = \exp\left(-\frac{1}{n_x}\,|x_i - x|^{2}\right) \tag{9}$$

where $n_x$ is the number of components in the input vector $x$.

Some of the solved Lagrange multipliers $(\alpha - \alpha')$ are zero and should be eliminated from the regression function. Finally, the regression function involves the nonzero Lagrange multipliers and the corresponding input vectors of the training data, which are called the support vectors. The final regression function can be rewritten as:

$$f(x) = \sum_{k=1}^{N_{sv}} \alpha_k K(x_k, x) + b \tag{10}$$

where $x_k$ denotes the kth support vector and $N_{sv}$ is the number of support vectors.

For a conventional NN, the architecture and the weights are determined by a trial-and-error procedure, which is a time-consuming iterative process. In contrast, the optimal architecture and weights of an SVM are quickly "solved", not "searched", which provides an efficient and consistent platform.

2.2. Model construction

The architecture of the proposed two-stage SVM-based model (hereinafter SVM-QRf) is illustrated in Fig. 2a. In the first stage, the rainfall forecasting module is used to pre-process the typhoon information (that is, typhoon characteristics and rainfall) to produce rainfall forecasts. Then, in the second stage, the forecasted rainfall (Rf) in conjunction with observed runoff (Q) is used as input to the flood forecasting module. For comparison with SVM-QRf, a one-stage SVM-based flood forecasting model (named SVM-QRT) with observed runoff (Q), rainfall (R) and typhoon characteristics (T) is constructed. It should be noted that rainfall and typhoon characteristics are directly used as input to SVM-QRT without preprocessing (Lin et al., 2009a). The architecture of SVM-QRT is illustrated in Fig. 2b.

The construction of the proposed model, SVM-QRf, is summarized below. First, the rainfall and typhoon characteristics are used as input to the rainfall forecasting module. The general form of the rainfall forecasting module is:

$$R_{t+\Delta t} = f\left(R_t, R_{t-1}, \ldots, R_{t-(L_R-1)}, TY_t, TY_{t-1}, \ldots, TY_{t-(L_{TY}-1)}\right) \tag{11}$$

where $t$ is the current time, $\Delta t$ is the lead-time period (from 1 to 6 h), $R_t$ is the observed rainfall at time $t$, $L_R$ denotes the lag length of rainfall, $TY_t$ is the typhoon characteristics at time $t$, $L_{TY}$ denotes the lag length of typhoon characteristics, and $R_{t+\Delta t}$ is the forecasted rainfall at time $t + \Delta t$.

Then, the forecasted rainfall ($R_{t+\Delta t}$) and observed runoff data are used as input to the flood forecasting module in the second stage. The general form of the proposed model is:

$$Q_{t+\Delta t} = f\left(Q_t, Q_{t-1}, \ldots, Q_{t-(L_Q-1)}, R_{t+\Delta t}\right) \tag{12}$$

where $Q_t$ is the observed runoff at time $t$, $L_Q$ denotes the lag length of runoff, and $Q_{t+\Delta t}$ is the forecasted runoff at time $t + \Delta t$.

The general form of the SVM-QRT model is:

$$Q_{t+\Delta t} = f\left(Q_t, Q_{t-1}, \ldots, Q_{t-(L_Q-1)}, R_t, R_{t-1}, \ldots, R_{t-(L_R-1)}, TY_t, TY_{t-1}, \ldots, TY_{t-(L_{TY}-1)}\right) \tag{13}$$

The flowchart of SVM-QRf and SVM-QRT is shown in Fig. 3. In model construction, determination of the appropriate lag lengths of input is an important step. A trial-and-error procedure is applied to determine the lag lengths of input. The lag lengths, $L_{TY}$, $L_R$ and $L_Q$, are determined by the same process. The criterion for selecting the lag lengths is the relative percentage error (RPE):

$$\mathrm{RPE} = \frac{E(L) - E(L+1)}{E(L)} \times 100 \tag{14}$$

where $E(L)$ and $E(L+1)$ are the root mean square errors (RMSEs) for models with $L$ and $L+1$ lag lengths, respectively. In general, the RMSE decreases with increasing lag length. When the RPE is less than 5%, the increase of lag lengths is stopped and the best inputs of the forecasting models are selected.
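The lag-selection rule built on Eq. (14) can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: the function names, the example RMSE values, and the choice to return lag L (rather than L + 1) once the RPE falls below 5% are all assumptions made here for clarity.

```python
def relative_percentage_error(e_l, e_l_plus_1):
    """RPE of Eq. (14): relative RMSE reduction, in percent, from lag L to L + 1."""
    return (e_l - e_l_plus_1) / e_l * 100.0


def select_lag(rmse_by_lag, threshold=5.0):
    """Return the smallest lag L whose RPE to L + 1 is below the threshold.

    rmse_by_lag[k] is the RMSE of the model built with lag length k + 1.
    If no step falls below the threshold, the largest lag tried is returned.
    (Stopping at L instead of L + 1 is an assumption of this sketch.)
    """
    for k in range(len(rmse_by_lag) - 1):
        rpe = relative_percentage_error(rmse_by_lag[k], rmse_by_lag[k + 1])
        if rpe < threshold:
            return k + 1
    return len(rmse_by_lag)


# Hypothetical RMSE values for lag lengths 1, 2, 3, 4 (illustrative numbers only).
example_rmse = [10.0, 8.0, 7.8, 7.7]
chosen_lag = select_lag(example_rmse)
```

With these illustrative numbers, going from lag 1 to lag 2 reduces the RMSE by 20%, while going from lag 2 to lag 3 reduces it by only 2.5%, so the search stops at a lag length of 2.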

Fig. 2. Architectural graphs of: (a) the proposed model and (b) the existing model.
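To make the kernel of Eq. (9) and the support-vector expansion of Eq. (10) in Section 2.1 concrete, the following sketch evaluates both directly. It assumes the multipliers, support vectors and bias have already been obtained from a quadratic programming solver; the numerical values and function names below are hypothetical and serve only to show the arithmetic.

```python
import math


def rbf_kernel(xi, x):
    """Eq. (9): K(xi, x) = exp(-|xi - x|^2 / n_x), n_x = number of input components."""
    n_x = len(x)
    squared_distance = sum((a - b) ** 2 for a, b in zip(xi, x))
    return math.exp(-squared_distance / n_x)


def svr_predict(x, support_vectors, coefficients, bias):
    """Eq. (10): f(x) = sum_k alpha_k * K(x_k, x) + b over the support vectors."""
    return bias + sum(
        alpha * rbf_kernel(xk, x)
        for xk, alpha in zip(support_vectors, coefficients)
    )


# Hypothetical support vectors, multipliers and bias (not fitted to any data).
support_vectors = [[0.0, 0.0], [1.0, 1.0]]
coefficients = [0.5, -0.25]
bias = 0.1
prediction = svr_predict([0.0, 0.0], support_vectors, coefficients, bias)
```

Because the kernel width is tied to the input dimension through 1/n_x, no separate width parameter has to be tuned in this formulation; only C and the error tolerance remain user-defined.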

Fig. 3. Flowchart of the model development.
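The input vectors of the two modules (Eqs. (11) and (12)) can be assembled from hourly records as sketched below. The data layout (plain lists indexed by hour, with each typhoon record stored as a list of characteristics), the helper names and the sample values are assumptions of this sketch, not details given in the paper.

```python
def lagged(series, t, lag_length):
    """Return [s_t, s_{t-1}, ..., s_{t-(L-1)}] for a series indexed by hour."""
    return [series[t - k] for k in range(lag_length)]


def rainfall_module_input(rain, typhoon, t, l_r, l_ty):
    """Input vector of Eq. (11): lagged rainfall followed by lagged typhoon features."""
    features = lagged(rain, t, l_r)
    for ty in lagged(typhoon, t, l_ty):
        features.extend(ty)  # each entry is a list of typhoon characteristics
    return features


def flood_module_input(runoff, forecast_rain, t, l_q):
    """Input vector of Eq. (12): lagged runoff plus the forecasted rainfall for t + dt."""
    return lagged(runoff, t, l_q) + [forecast_rain]


# Hypothetical hourly records (illustrative numbers only).
rain = [0.0, 2.0, 5.0, 9.0]
runoff = [10.0, 12.0, 20.0, 35.0]
typhoon = [[23.1, 121.9], [23.3, 121.6], [23.5, 121.2], [23.8, 120.9]]

x_rain = rainfall_module_input(rain, typhoon, t=3, l_r=2, l_ty=1)
x_flood = flood_module_input(runoff, forecast_rain=11.5, t=3, l_q=2)
```

The second-stage vector is much shorter than the one a single-stage model would need under Eq. (13), which is the reduction in input dimensionality that the two-stage design aims for.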

For event-based data, the collected events are separated into two datasets: training and testing. Some of the collected events are chosen as training data and used to construct NN-based models. The performance of the NN-based models is tested by the remaining events. Different selections of training data and testing data yield different results and sometimes lead to different conclusions. In this study, we used cross validation, and each single typhoon event (except the event with the maximum runoff) was used in turn to test the NN-based models. Hence, for N typhoon events, a total of N − 1 test results were obtained. The conclusions are drawn on the basis of the overall performance for these testing results.

2.3. Performance measures

To evaluate the forecast performance of the models, the following performance criteria were selected:

1. Mean coefficient of efficiency (MCE)

For a single testing event, the coefficient of efficiency (CE) is written as:

$$\mathrm{CE} = 1 - \frac{\sum_{t=1}^{n} (Q_t - \hat{Q}_t)^{2}}{\sum_{t=1}^{n} (Q_t - \bar{Q})^{2}} \tag{15}$$

where $Q_t$ and $\hat{Q}_t$ denote the observed and forecasted runoff at time $t$, respectively, $\bar{Q}$ is the average of the observed runoff, and $n$ is the number of time steps. If the CE value is equal to one, the forecasts are perfect. Because cross validation is used herein, the mean CE of N testing events is written as:

$$\mathrm{MCE} = \frac{1}{N} \sum_{j=1}^{N} \mathrm{CE}_j \tag{16}$$

where $\mathrm{CE}_j$ is the CE for the jth testing event.

2. Root mean square error (RMSE)

The RMSE is a measure which represents the errors between two sets of data. The smaller the RMSE value, the better the forecasts. The RMSE is written as:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{t=1}^{n} (Q_t - \hat{Q}_t)^{2}} \tag{17}$$

3. Mean error of peak runoff (MEPR)

For a single testing event, the error of peak runoff (EPR) is defined as:

$$\mathrm{EPR} = \left| \frac{\hat{Q}_p - Q_p}{Q_p} \right| \tag{18}$$

where $\hat{Q}_p$ and $Q_p$ are the forecasted peak runoff and the observed peak runoff, respectively. The mean error of peak runoff of N testing events is written as:

$$\mathrm{MEPR} = \frac{1}{N} \sum_{j=1}^{N} \mathrm{EPR}_j \tag{19}$$

where $\mathrm{EPR}_j$ is the error of peak runoff for the jth testing event.

3. Application, results and discussion

3.1. Application

The island of Taiwan is located in one of the main paths of the north-western Pacific typhoons. During the past 100 years, on

Fig. 4. The study area and locations of rainfall and water-level stations.

Table 1
Description of typhoon events used in the modeling.

Name       Date               Duration (h)  Scale             Maximum hourly rainfall (mm)  Peak runoff (m³/s)
Tim        10 July 1994       50            Intense typhoon   11.48                         87.8
Doug       8 August 1994      30            Intense typhoon   38.03                         749
Toraji     29 July 2001       62            Moderate typhoon  85.07                         1530
Nari       16 September 2001  111           Moderate typhoon  19.00                         84.5
Nakri      9 July 2002        96            Minor typhoon     21.09                         244
Nanmadol   3 December 2004    37            Moderate typhoon  13.21                         61.6
Haitang    17 July 2005       81            Intense typhoon   38.94                         916.6
Talim      31 August 2005     55            Intense typhoon   14.36                         262.4
Longwang   1 October 2005     40            Intense typhoon   16.73                         208.4
Sepat      17 August 2007     48            Intense typhoon   21.14                         228.2
Wipha      17 September 2007  52            Moderate typhoon  20.02                         183
Krosa      4 October 2007     79            Intense typhoon   27.01                         267.5
Kalmaegi   16 July 2008       58            Moderate typhoon  67.56                         370.2
Fung-wong  26 July 2008       73            Moderate typhoon  30.25                         388.2
Sinlaku    11 September 2008  127           Intense typhoon   47.86                         662
Jangmi     27 September 2008  52            Intense typhoon   29.44                         696

Note: According to the classification system of the Taiwan Central Weather Bureau, the intensities of minor, moderate and intense typhoons are 34–63, 64–99, and ≥100 knots, respectively.

Table 2
Input variables to the NN models.

Lead time (h)  SVM-QRT                    SVM-QRf                 SVM-QRi
1              Q(t), Q(t-1), R(t), Ty(t)  Q(t), Q(t-1), R̂(t+1)   Q(t), Q(t-1), R(t+1)
2              Q(t), Q(t-1), R(t), Ty(t)  Q(t), Q(t-1), R̂(t+2)   Q(t), Q(t-1), R(t+2)
3              Q(t), Q(t-1), R(t), Ty(t)  Q(t), Q(t-1), R̂(t+3)   Q(t), Q(t-1), R(t+3)
4              Q(t), Q(t-1), R(t), Ty(t)  Q(t), Q(t-1), R̂(t+4)   Q(t), Q(t-1), R(t+4)
5              Q(t), Q(t-1), R(t), Ty(t)  Q(t), Q(t-1), R̂(t+5)   Q(t), Q(t-1), R(t+5)
6              Q(t), Q(t-1), R(t), Ty(t)  Q(t), Q(t-1), R̂(t+6)   Q(t), Q(t-1), R(t+6)

Note: Q: observed runoff; R: observed rainfall; R̂: forecasted rainfall; Ty: observed typhoon characteristics.

average, about four typhoons have hit Taiwan each year. The study area is the Wu River basin in central Taiwan. The basin, with an area of 2026 km², is the fourth largest in Taiwan. The length of the main river is 119 km and the average slope is 1/92. Heavy rainfall brought by typhoons frequently causes flood disasters in the Wu River basin. The city of Taichung, with a population of about 3 million people, is located downstream of the Nan-Pei Bridge on the Wu River. In 2008, two typhoons (Kalmaegi and Fung-Wong) successively hit central Taiwan and caused an economic loss of about 3 billion USD.

Fig. 4 shows the study area and the location of four hourly rainfall stations (Pei-Shan, Chin-Liu, Hui-Suen and Tsui-Luan) and one hourly water-level station (Nan-Pei Bridge). The rainfall and runoff data were obtained from the Water Resources Agency and the data of typhoon characteristics were provided by the Central Weather Bureau. The mean areal rainfall of the watershed was calculated using the Thiessen method. The typhoon characteristics dataset includes the latitude and longitude (degree) of the typhoon center, the distance (km) between the typhoon center and the water-level station, the near-center maximum wind speed (m/s), the central pressure (hPa), the storm radius (km) and the speed (km/hr) of the typhoon movement. The typhoon events that were used in this study are listed in Table 1.

3.2. Rainfall forecasts

In order to show the influence of rainfall forecasts on flood forecasting, a hypothetical SVM-based model (named SVM-QRi) was first tested. It should be noted that the rainfall input to SVM-QRi is considered optimal, that is, the raingauge measurements were used to represent the perfect rainfall forecast. Table 2 presents the list of inputs that were used to construct the SVM-based models. The MCE values of SVM-QRi and the conventional model (SVM-QRT) are presented in Fig. 5. Additionally, the result of a BPN-based model (named BPN-QRT), which uses the same inputs as SVM-QRT, is also presented in Fig. 5. As shown in Fig. 5, SVM-QRi performed best among all models. Furthermore, SVM-QRT yielded higher MCE than BPN-QRT, which is consistent with the conclusion of Lin et al. (2009a, 2009b). It is seen that neither SVM-QRT nor BPN-QRT can yield effective forecasts for a forecast lead time that is greater than 3 h, whereas SVM-QRi produced accurate flood forecasts up to 6 h. This supports the argument that the SVM-based model can effectively mitigate the negative impact of increasing forecast lead time if reliable and accurate rainfall forecasts are made available.

Fig. 5. MCE values of SVM-QRT, BPN-QRT and SVM-QRi.

Table 3
MCE and MEPR for various models.

Lead time (h)  SVM-QRT  SVM-QRf  SVM-QRi
MCE
1              0.73     0.93     0.93
2              0.56     0.85     0.84
3              0.47     0.76     0.74
4              0.12     0.65     0.63
5              -0.28    0.58     0.55
6              -0.80    0.46     0.39
MEPR (%)
1              7.28     4.42     3.74
2              13.23    8.16     4.06
3              16.41    11.49    5.21
4              22.28    13.58    6.73
5              28.12    14.28    7.92
6              32.51    15.10    11.78

Fig. 6. RMSE values of the rainfall forecasts.

Fig. 7. (a) MCE values of SVM-QRT and SVM-QRf and (b) the improvement in MCE due to the use of SVM-QRf instead of SVM-QRT.

Lin et al. (2009b) confirmed that adding typhoon characteristics significantly improves the rainfall forecasting performance, especially for forecast lead times longer than 3 h. Following their recommendation, data of rainfall and typhoon characteristics are used herein to develop an SVM-based rainfall forecasting module. The RMSE values resulting from the rainfall

forecasting module are presented in Fig. 6. As shown in Fig. 6, RMSE values increase from 4.3 mm to 6.3 mm for 1- to 3-h lead time forecasts, while they increase only slightly, from 6.6 mm to 7 mm, for 4- to 6-h lead time forecasts. We note that in this region, the maximum yearly rainfall is higher than 2000 mm and the maximum hourly rainfall is higher than 80 mm. The RMSE values of 1- to 6-h lead time forecasts are all lower than 8 mm, which indicates high accuracy of the rainfall forecasts.

3.3. Influence of forecasted rainfall on flood forecasting

The MCE and MEPR of three SVM-based models (SVM-QRT, SVM-QRf and SVM-QRi) for 1- to 6-h lead time forecasts are summarized in Table 3. The input data of SVM-QRi are the observed record that represents the optimal rainfall forecast and antecedent runoff. As for the input data of both SVM-QRT and SVM-QRf, the antecedent runoff, rainfall and typhoon characteristics were used. However, in the SVM-QRf model, the rainfall forecasting module was used to pre-process typhoon information (that is, typhoon characteristics and rainfall) and to provide the forecasted rainfall. For SVM-QRT, the rainfall and typhoon characteristics were directly used as inputs without further processing. In the following, we focus on the comparison between SVM-QRf and SVM-QRT.

Fig. 8. (a) MEPR values of SVM-QRT and SVM-QRf and (b) the improvement in MEPR due to the use of SVM-QRf instead of SVM-QRT.

Table 4
Paired comparison t-tests of two performance measures (CE and EPR) resulting from SVM-QRT and SVM-QRf.

Alternative hypothesis     t Statistic  Critical t value  Statistically significant at the 1% level
CE(SVM-QRf) > CE(SVM-QRT)    2.95         2.37              Yes
EPR(SVM-QRf) < EPR(SVM-QRT)  4.19         2.37              Yes

The MCE values for runoff forecasts of both SVM-QRT and SVM-QRf decrease with increasing forecast lead time (Fig. 7a). However, the MCE values of SVM-QRT decrease more rapidly than those of SVM-QRf. For 1- to 3-h lead time forecasts, both models provided reasonable runoff forecasts. For 4- to 6-h lead time forecasts, the performance of SVM-QRT gets worse and the MCE values are close to or even lower than zero. Clearly, SVM-QRT cannot yield effective forecasts when the forecast lead time is greater than 3 h. As for SVM-QRf, the performance is still acceptable for longer lead times up to 6 h. Regardless of the forecast lead time, the proposed model improved the accuracy of the runoff forecast when compared with the model without forecasted rainfall. Furthermore, the improvement in MCE due to the use of SVM-QRf instead of SVM-QRT is presented in Fig. 7b. It is also concluded that SVM-QRf outperformed SVM-QRT. As for the other performance measure (MEPR), SVM-QRf yields significantly lower MEPR values than

SVM-QRT (Fig. 8a), and the improvement increases with increasing forecast lead time (Fig. 8b). This indicates that SVM-QRf is more appropriate for forecasting peak runoff than SVM-QRT, especially for lead times longer than 3 h.

Fig. 9. Number of events for which (a) CE values of SVM-QRf are higher than those of SVM-QRT and (b) EPR values of SVM-QRf are lower than those of SVM-QRT.

Fig. 10. Comparison of the observed runoff with the 1- to 6-h lead time forecasts resulting from SVM-QRf.

Fig. 11. Comparison of the observed runoff with the 1-h lead time forecasts resulting from: (a) SVM-QRf and (b) SVM-QRT for Typhoon Haitang.

We found that runoff forecasts cannot be improved by using typhoon characteristics as direct input to an SVM-based model. Because of the short time of concentration in the study basin, the direct use of observed data (runoff, rainfall and typhoon characteristics) in model development cannot provide useful information for forecasting at lead times longer than 3 h. When the forecast lead time increases, the data used for longer lead time forecasting include more complex noise and the correlation between desired output and available input decreases. Because the rainfall forecasting module successfully reduces the complication caused by the

typhoon characteristics, the proposed model effectively improves the long lead-time forecasting.

In addition to the overall performance, evaluation of individual events is described herein. The number of events for which SVM-QRf yields a higher CE than SVM-QRT is counted and presented in Fig. 9a. In a like manner, Fig. 9b presents the results for another performance measure, EPR. Fig. 9 shows that SVM-QRf performs better than SVM-QRT for most of the events. To further assess whether SVM-QRf performs better than SVM-QRT for the same testing events, paired comparison t-tests are conducted at the 1% significance level. Table 4 shows that SVM-QRf yields significantly higher CE and lower EPR than SVM-QRT. For 1- to 6-h lead times, the comparison of the observed runoff with the forecasts resulting from SVM-QRf is presented in Fig. 10. It is shown that the proposed two-stage model (SVM-QRf) produced reliable forecasts and the forecasted hydrograph accurately matches the observed hydrograph. To highlight the comparison, Fig. 11 shows the hydrographs of 1-h lead time forecasts resulting from SVM-QRf and SVM-QRT for the most extreme runoff event (resulting from Typhoon Haitang). As shown in Fig. 11, both SVM-QRf and SVM-QRT slightly underestimate the peak runoff, but reproduce low runoff appropriately because low runoff is more frequent in the data set than high runoff. However, SVM-QRf captures the peak runoff better than SVM-QRT. Although the result confirms that the proposed model improves the forecasts of peak runoff, more validation of the models in extrapolation is still required in future research.

4. Summary and conclusions

In this paper, a two-stage SVM-based model (i.e. SVM-QRf) is proposed for improving runoff forecasts during typhoon events. In the first stage, the rainfall forecasting module is used to pre-process the typhoon information (namely, typhoon characteristics and rainfall) and to produce rainfall forecasts. Then, in the second stage, the forecasted rainfall and observed runoff are used as input to the flood forecasting module to yield runoff forecasts. A case study for the Wu River basin in central Taiwan is performed to assess the model performance. In addition, a single-stage SVM-based model (i.e. SVM-QRT), which directly uses the observed runoff, rainfall and typhoon characteristics as input without any processing, is constructed for comparison.

Regarding the performance of rainfall forecasting, it is found that the first stage of the proposed model yields quite accurate 1- to 6-h lead time rainfall forecasts. The use of typhoon characteristics can effectively reduce the negative impacts of increasing forecast lead time. As to the performance of flood forecasting, a comparison between the proposed two-stage model and the single-stage model shows that the proposed model significantly improved the runoff forecasts. In addition to the overall performance, the proposed model significantly improved the forecasts of peak runoff, especially for long lead time forecasting. The better performance of the proposed model confirms that the processed typhoon information is more useful than the raw typhoon information. The use of forecasted rainfall and the proposed two-stage structure are justified and are expected to improve hourly typhoon flood forecasting.

Acknowledgements

This paper is based on research partially supported by the National Science Council, Taiwan, under grants NSC 101-2625-M-002-007 and NSC 99-2221-E-002-092-MY3. We would like to especially thank the Associate Editor and reviewers for their constructive suggestions that greatly improved the manuscript.

References

ASCE Task Committee on Application of Artificial Neural Networks in Hydrology, 2000a. Artificial neural networks in hydrology. I: Preliminary concepts. J. Hydrol. Eng. 5 (2), 115–123.
ASCE Task Committee on Application of Artificial Neural Networks in Hydrology, 2000b. Artificial neural networks in hydrology. II: Hydrologic applications. J. Hydrol. Eng. 5 (2), 124–137.
Chang, L.C., Chang, F.J., Chiang, Y.M., 2004. A two-step-ahead recurrent neural network for stream-flow forecasting. Hydrol. Process. 18 (1), 81–92.
Chang, F.C., Chiang, Y.M., Chang, L.C., 2007. Multi-step-ahead neural networks for flood forecasting. Hydrol. Sci. J.-J. Sci. Hydrol. 52 (1), 114–130.
Chiang, Y.M., Chang, F.C., 2009. Integrating hydrometeorological information for rainfall-runoff modelling by artificial neural networks. Hydrol. Process. 23 (11), 1650–1659.
Chiang, Y.M., Chang, F.C., Jou, J.D.B., Lin, P.F., 2007. Dynamic ANN for precipitation estimation and forecasting from radar observations. J. Hydrol. 334 (1–2), 250–261.
Cristianini, N., Shawe-Taylor, J., 2000. An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods. Cambridge University Press, New York.
de Vos, N.J., Rientjes, T.H.M., 2005. Constraints of artificial neural networks for rainfall–runoff modeling: trade-offs in hydrological state representation and model evaluation. Hydrol. Earth Syst. Sci. 9 (1–2), 111–126.
Hu, T.S., Wu, F.Y., Zhang, X., 2007. Rainfall–runoff modeling using principal component analysis and neural network. Hydrol. Res. 38 (3), 235–248.
Lin, G.F., Chen, L.H., 2004. A non-linear rainfall-runoff model using radial basis function network. J. Hydrol. 289 (1–4), 1–8.
Lin, G.F., Chen, L.H., 2005. Application of artificial neural network to typhoon rainfall forecasting. Hydrol. Process. 19 (9), 1825–1837.
Lin, G.F., Wu, M.C., 2011. An RBF network with a two-step learning algorithm for developing a reservoir inflow forecasting model. J. Hydrol. 405 (3–4), 439–450.
Lin, G.F., Chen, G.R., Huang, P.Y., Chou, Y.C., 2009a. Support Vector Machine-based models for hourly reservoir inflow forecasting during typhoon-warning periods. J. Hydrol. 372 (1–4), 17–29.
Lin, G.F., Chen, G.R., Wu, M.C., Chou, Y.C., 2009b. Effective forecasting of hourly typhoon rainfall using Support Vector Machines. Water Resour. Res. 45, W08440. http://dx.doi.org/10.1029/2009WR007911.
Lin, G.F., Wu, M.C., Chen, G.R., Tsai, F.Y., 2009c. An RBF-based model with an information processor for forecasting hourly reservoir inflow during typhoons. Hydrol. Process. 23 (25), 3598–3609.
Lin, G.F., Huang, P.Y., Chen, G.R., 2010. Using typhoon characteristics to improve the long lead-time flood forecasting of a small watershed. J. Hydrol. 380 (3–4), 450–459.
Liong, S.Y., Sivapragasam, C., 2002. Flood stage forecasting with Support Vector Machines. J. Am. Water Resour. Assoc. 38 (1), 173–186.
Luk, K.C., Ball, J.E., Sharma, A., 2001. An application of artificial neural networks for rainfall forecasting. Math. Comput. Modell. 33, 683–693.
Maier, H.R., Dandy, G.C., 2000. Neural networks for the prediction and forecasting of water resources variables: a review of modeling issues and applications. Environ. Modell. Softw. 15, 101–124.
Pramanik, N., Panda, R.K., Singh, A., 2011. Daily river flow forecasting using wavelet ANN hybrid models. J. Hydroinform. 13 (1), 49–63.
Rasouli, K., Hsieh, W.W., Cannon, A.J., 2012. Daily streamflow forecasting by machine learning methods with weather and climate inputs. J. Hydrol. 414–415, 284–293.
Rathinasamy, M., Khosa, R., 2012. Multiscale nonlinear model for monthly streamflow forecasting: a wavelet-based approach. J. Hydroinform. 14 (2), 424–442.
Sivapragasam, C., Liong, S.Y., 2005. Flow categorization model for improving forecasting. Nord. Hydrol. 36 (1), 37–48.
Toth, E., Brath, A., 2007. Multistep ahead streamflow forecasting: role of calibration data in conceptual and neural network modeling. Water Resour. Res. 43, W11405. http://dx.doi.org/10.1029/2006WR005383.
Vapnik, V., 1995. The Nature of Statistical Learning Theory. Springer, New York.
Vapnik, V., 1998. Statistical Learning Theory. John Wiley, New York.
Wu, C.L., Chau, K.W., Li, Y.S., 2009. Predicting monthly streamflow using data-driven models coupled with data-preprocessing techniques. Water Resour. Res. 45, W08432. http://dx.doi.org/10.1029/2007WR006737.
Yu, X.Y., Liong, S.Y., 2007. Forecasting of hydrologic time series with ridge regression in feature space. J. Hydrol. 332 (3–4), 290–302.
Yu, X.Y., Liong, S.Y., Babovic, V., 2004. EC-SVM approach for real-time hydrologic forecasting. J. Hydroinform. 6 (3), 209–223.
