
Journal of Hydrology 528 (2015) 329–340

Contents lists available at ScienceDirect

Journal of Hydrology
journal homepage: www.elsevier.com/locate/jhydrol

Quantifying predictive uncertainty of streamflow forecasts based on a Bayesian joint probability model
Tongtiegang Zhao a,b,⇑, Q.J. Wang b, James C. Bennett b, David E. Robertson b, Quanxi Shao a, Jianshi Zhao c
a CSIRO Digital Productivity Flagship, Private Bag 5, Wembley, WA 6913, Australia
b CSIRO Land and Water Flagship, 37 Graham Road, Highett, VIC 3190, Australia
c State Key Laboratory of Hydro-Science and Engineering, Department of Hydraulic Engineering, Tsinghua University, Beijing, China

Article info

Article history:
Received 4 April 2015
Received in revised form 17 June 2015
Accepted 18 June 2015
Available online 25 June 2015
This manuscript was handled by Geoff Syme, Editor-in-Chief

Keywords:
Forecast uncertainty
Heteroscedasticity
Non-Gaussianity
Ensemble spread
Reliability
Three Gorges Reservoir

Summary

Uncertainty is inherent in streamflow forecasts and is an important determinant of the utility of forecasts for water resources management. However, predictions by deterministic models provide only single values without uncertainty attached. This study presents a method for using a Bayesian joint probability (BJP) model to post-process deterministic streamflow forecasts by quantifying predictive uncertainty. The BJP model comprises a log–sinh transformation that normalises hydrological data, and a bi-variate Gaussian distribution that characterises the dependence relationship. The parameters of the transformation and the distribution are estimated through Bayesian inference with a Markov chain Monte Carlo (MCMC) algorithm. The BJP model produces, from a raw deterministic forecast, an ensemble of values to represent forecast uncertainty. The model is applied to raw deterministic forecasts of inflows to the Three Gorges Reservoir in China as a case study. The heteroscedasticity and non-Gaussianity of forecast uncertainty are effectively addressed. The ensemble spread accounts for the forecast uncertainty and leads to considerable improvement in terms of the continuous ranked probability score. The forecasts become less accurate as lead time increases, and the ensemble spread provides reliable information on the forecast uncertainty. We conclude that the BJP model is a useful tool to quantify predictive uncertainty in post-processing deterministic streamflow forecasts.
© 2015 Elsevier B.V. All rights reserved.

⇑ Corresponding author at: CSIRO Digital Productivity Flagship, Private Bag 5, Wembley, WA 6913, Australia.
E-mail address: tony.zhao@csiro.au (T. Zhao).
http://dx.doi.org/10.1016/j.jhydrol.2015.06.043
0022-1694/© 2015 Elsevier B.V. All rights reserved.

1. Introduction

Recent advances in hydrological modelling, weather forecasting, and hydro-climatic teleconnections have facilitated considerable improvements in streamflow forecasts (Cloke and Pappenberger, 2009; Wood et al., 2011; Schepen et al., 2012). The forecasts provide insightful information not only on hourly and daily streamflows, but also on monthly and seasonal streamflows (Maurer and Lettenmaier, 2003; Georgakakos et al., 2012a; Chen et al., 2014). As a result, streamflow forecasts have been incorporated into decision-making processes (Maurer and Lettenmaier, 2004; Georgakakos et al., 2012b; Zhao and Zhao, 2014). Operational systems for streamflow forecasting and water resources management have been established throughout the world, for example, in the Columbia River (Hamlet et al., 2002; Alemu et al., 2011), the Nile River (Block, 2011), and the Yangtze River (Kwon et al., 2009; Li et al., 2010). Adaptive management of water resources based on streamflow forecasts helps efficiently cope with operational risks due to streamflow variability and climate change (Vicuna et al., 2010; Georgakakos et al., 2012a,b). Despite the great potential and wide use of streamflow forecasts, the uncertainty inherent in the forecasts has been a major obstacle to their applications (Maurer and Lettenmaier, 2004; You and Cai, 2008; Sankarasubramanian et al., 2009). Under-estimation of forecast uncertainty leads to operational risks, while over-estimation induces overly conservative decisions (Zhao et al., 2014). The utility of forecast information reduces as the magnitude of forecast uncertainty increases (Zhao et al., 2011, 2013; Hejazi et al., 2014). Forecast information at long lead-times may be of little value to decision-makers due to the considerable forecast uncertainty (Zhao et al., 2012; Wang et al., 2014; Xu et al., 2014).

There are in general three approaches to estimating forecast uncertainty in hydrology (Montanari and Brath, 2004; Coccia and Todini, 2011). The first option is to formulate a probabilistic forecasting model that outputs a streamflow forecast along with its confidence intervals (Krzysztofowicz, 2001). For example, ensemble forecasts are generated from multiple initial conditions and hydrological forcing, and they contain multiple streamflow
scenarios to represent future uncertainty (Cloke and Pappenberger, 2009). The second option is to estimate forecast uncertainty by analysing statistical properties of forecast errors (Wood and Schaake, 2008). A number of post-processing models have been proposed to infer forecast uncertainty based on archived samples of past streamflow forecasts and observations (e.g., Weerts et al., 2011; Pokhrel et al., 2013a,b). The third option is based on Monte Carlo methods that use simulation and re-sampling techniques (Montanari and Brath, 2004). In addition, expert information can also be incorporated into hydrological forecasting for empirical bias correction and uncertainty estimation (Wood and Schaake, 2008; Liersch and Volk, 2007; Pappenberger et al., 2013).

Fig. 1. Location map of the Three Gorges Reservoir and its drainage basin under the GCS_Xian_1980 geographic coordinate system. (Map labels: Three Gorges Region; Three Gorges Dam.)

Fig. 2. Relationships between forecast and forecast error for different lead-times ranging from 1 to 4 days.

Fig. 3. Normal quantile–quantile plot examining the Gaussianity of the distribution of forecast errors.
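
A check of the kind summarised in Fig. 3 can be sketched in a few lines. This is an illustrative sketch only: the function name `normal_qq_points`, the plotting positions (i + 0.5)/n and the synthetic error series are assumptions and are not taken from the study's data or code. The sketch standardises a set of forecast errors, pairs the sorted values with standard normal quantiles, and reports the largest departure from the 1:1 line.

```python
from statistics import NormalDist, mean, stdev

def normal_qq_points(errors):
    """Pair standardised sample quantiles with standard normal quantiles."""
    n = len(errors)
    mu, sigma = mean(errors), stdev(errors)
    sample = sorted((e - mu) / sigma for e in errors)
    std_normal = NormalDist()
    # Plotting positions (i + 0.5) / n are one common convention (assumed here).
    theoretical = [std_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, sample))

# Hypothetical error series (m^3/s): exact normal quantiles, and a cubed
# version of them whose tails are much heavier than Gaussian tails.
z = [NormalDist().inv_cdf((i + 0.5) / 99) for i in range(99)]
gaussian_errors = [1000.0 * q + 50.0 for q in z]
heavy_tailed_errors = [1000.0 * q ** 3 for q in z]

for name, errs in [("gaussian", gaussian_errors),
                   ("heavy-tailed", heavy_tailed_errors)]:
    points = normal_qq_points(errs)
    largest_gap = max(abs(s - t) for t, s in points)
    print(f"{name}: largest departure from the 1:1 line = {largest_gap:.2f}")
```

In this sketch the Gaussian series stays close to the 1:1 line, while the heavy-tailed series departs from it at the extremes, which is the kind of pattern a quantile–quantile plot is meant to expose.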

In practice, deterministic forecasts, which are periodically updated (Zhao et al., 2013), are often used for decision-making. Unfortunately, the uncertainty inherent in deterministic forecasts is not provided, limiting the power of their application. How can the corresponding forecast uncertainty be quantified? This study attempts to address this question by quantifying the predictive uncertainty based on a Bayesian joint probability (BJP) model, through a case study of forecast data collected from the Three Gorges Reservoir. In comparison with traditional post-processing models, which deal with forecast error between the original forecast and observation data, the BJP model employs data transformation and handles transformed data in its post-processing. In statistics, when forecast and observation data are from non-Gaussian distributions, which is almost always the case with hydrological data, the distribution of forecast error is more complicated and difficult to handle in model formulation. This problem is circumvented in BJP by transforming the distributions of original forecast and observation data to Gaussian before forming their association (correlation). Moreover, the Bayesian approach used in BJP produces samples of the model parameters and therefore makes it easy to conduct predictive uncertainty analysis.

The BJP model was developed by Wang et al. (2009) and Wang and Robertson (2011) for seasonal streamflow forecasting. This model has since been used to calibrate precipitation and streamflow predictions and bridge the relationships between climatic indices and local precipitation and streamflow (e.g., Schepen et al., 2012; Pokhrel et al., 2013a,b; Bennett et al., 2014a,b). These earlier studies applied the model to generate seasonal forecasts of precipitation and streamflow. Instead of making forecasts, this study explores the use of the BJP model in post-processing raw daily streamflow forecasts and estimating the predictive uncertainty. Focus is given to short-term forecast uncertainty modelling and the examination of the reliability of ensemble spread in capturing forecast uncertainty. As will be demonstrated later in this paper, the heteroscedasticity and non-Gaussianity of forecast uncertainty are effectively addressed.

The remainder of the paper is structured as follows. Section 2 illustrates the case under investigation with a focus on the observed characteristics of forecast uncertainty. Section 3 presents the BJP model, along with the prerequisite variance stabilizing transformation and the predictive uncertainty estimation procedures. Section 4 examines the performance of the BJP model through a process of leave-one-year-out cross validation. Section 5 discusses results and concludes the study.

2. Case study and data description

The predictive uncertainty of real-time streamflow forecasts of inflow to the Three Gorges Reservoir is analysed in this study. The Three Gorges Reservoir is one of the largest reservoirs in the world. The reservoir controls floods from 56% of the drainage area of the Yangtze River. A deterministic forecasting system exists to aid reservoir operations (Li et al., 2010; Zhao et al., 2013). Forecasts are based on two upstream streamflow gauge stations (Cuntan Station on the mainstream of the Yangtze River, drainage area 866,000 km²; Wulong Station on the Wujiang River, drainage area 83,000 km²) and forty rainfall gauge stations in the mountainous Three Gorges region (drainage area 55,000 km²) (see Fig. 1). Flash
floods and complex terrain in the Three Gorges region make it challenging to forecast inflows to the Three Gorges Reservoir (Li et al., 2014; Liu et al., 2015). The forecasting system consists of coupled hydrological and hydro-dynamical models. Model outputs are subject to empirical correction by experienced water resources engineers.

Fig. 4. Cross validation of forecasts in terms of root mean squared error (black circle, raw forecast; blue plus sign, mean of post-processed forecast). (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

The forecasts are produced at a daily time step for lead times ranging from 1 day to 4 days. A data set of historical forecasts, covering the period from 2004 to 2009, is used in this study. The dataset was used by Zhao et al. (2013) to examine uncertainty evolution in dynamically updated streamflow forecasts. Zhao et al. (2013) found that forecast uncertainty varies with season – the distribution of forecast uncertainty is different for the pre-flood season (May and June) and the flood season (July, August, and September), but it is generally consistent for the three months in the flood season. This study investigates the flood season and focuses on the forecast uncertainty of flood forecasts at different lead times. Fig. 2 plots forecast error against streamflow forecast for each lead time. The figure clearly illustrates the heteroscedastic nature of forecast errors. As can be seen, the spread of forecast errors is narrow when the value of the forecast is small. It gradually grows and then tapers off as the forecasted streamflow increases.

The distribution of forecast errors is further analysed using a normal quantile–quantile plot, as shown in Fig. 3. At different lead-times, the quantile of forecast errors is plotted against the theoretical Gaussian quantile. If forecast errors follow a Gaussian distribution, then there is a linear relationship, which is indicated by the red dashed line, between the quantiles. However, Fig. 3 illustrates that forecast errors are actually not Gaussian. Moreover, quantiles of forecast errors have a wider range than the theoretical Gaussian quantiles. The results indicate that the relationship between forecast and observation cannot be captured by traditional linear regression, in which Gaussian distribution is a basic assumption. Figs. 2 and 3 illustrate the need for transforming the data to account for the heteroscedasticity and non-Gaussianity.

3. Methods

The BJP model characterizes the relationship between predictors and predictand using a joint multivariate Gaussian distribution after transformation. In this study, the predictor is the deterministic streamflow forecast generated by the Three Gorges operational streamflow forecasting system and the predictand is the observed streamflow. Therefore, the relationship between raw forecast and observed streamflow, which indicates forecast uncertainty, is formulated. In the BJP model, the log–sinh transformation (Wang et al., 2012) is employed to stabilize variance and make the variables Gaussian. Parameters of the transformation and the joint distribution function are estimated through Bayesian inference.

3.1. Log–sinh transformation

Streamflow data usually exhibit heteroscedasticity – that is, variance of streamflow tends to be much higher at high flow than at low flow – and therefore forecast uncertainty depends on the value of the forecast. A variance stabilizing transformation is
applied to deal with heteroscedasticity in the forecast data. For a given variable $y$, suppose we have the following

$$\mathrm{var}(y) = \Omega(E(y)) \tag{1}$$

In the above equation, var(·) and E(·) indicate the operators of variance and expectation, respectively; $\Omega(\cdot)$ represents the function of the relationship between $E(y)$ and $\mathrm{var}(y)$.

In statistics, a transformation of $y$ is derived from Eq. (1), as follows

$$T(y) = \int_{-\infty}^{y} \frac{1}{[\Omega(E(D))]^{1/2}} \, dD \tag{2}$$

In Eq. (2), $T(y)$ represents a generalized variance stabilizing transformation of $y$ (Huber et al., 2002). The variance of $T(y)$, the variable after the transformation, is derived as $\mathrm{var}(T(y)) \approx \mathrm{var}\big(T(E(y)) + T'(y)(y - E(y))\big) = [T'(y)]^2 \, \mathrm{var}(y)$. $T'(y)$ is derived from Eq. (2), i.e., $T'(y) = 1/[\Omega(E(y))]^{1/2}$. Therefore, $\mathrm{var}(T(y)) \approx \frac{1}{\Omega(E(y))} \, \Omega(E(y)) = 1$. As can be seen, the variance of $T(y)$ becomes constant after the transformation.

The log–sinh transformation proposed by Wang et al. (2012) assumes that

$$\mathrm{var}(y) = [s_0 \tanh(a + by)]^2 \tag{3}$$

In Eq. (3), tanh(·) is the hyperbolic tangent function and $\tanh(a + by) = \dfrac{e^{a+by} - e^{-(a+by)}}{e^{a+by} + e^{-(a+by)}}$. Eq. (3) indicates that the transformation is for the case in which the variance of the variable $y$ grows with the value of $y$ and eventually tapers off (Wang et al., 2012; Bennett et al., 2014a,b). As shown in Fig. 2, forecast uncertainty exhibits this kind of heteroscedasticity. Wang et al. (2012) derived the variance stabilizing transformation corresponding to Eq. (3) from Eq. (2):

$$T(y) = \frac{1}{s_0} \, \frac{1}{b} \log(\sinh(a + by)) + c \tag{4}$$

Removing the constants from Eq. (4) gives:

$$T(y) = \frac{1}{b} \log(\sinh(a + by)) \tag{5}$$

The above equation is called the log–sinh transformation. The transformation stabilizes the variance and normalizes the variable, facilitating the use of simple models in uncertainty analysis (Wang et al., 2012; Robertson et al., 2013a,b,c). This transformation has been tested on a wide range of hydrological data, including hydrological extremes (Bennett et al., 2014a,b), and been shown to outperform other commonly used transformations (Wang et al., 2012; Del Giudice et al., 2013).

Fig. 5. Cross validation of forecasts in terms of root mean squared error in probability (black circle, raw forecast; blue plus sign, median of post-processed forecast). (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

3.2. Bayesian joint probability

The log–sinh transformation (Wang et al., 2012) is incorporated into the BJP model (Wang et al., 2009; Wang and Robertson, 2011). The raw forecast $x$ and the observation $y$ are transformed to $\hat{x}$ and $\hat{y}$, respectively, by Eq. (5). A bi-variate Gaussian distribution is applied to formulate the relationship between $\hat{x}$ and $\hat{y}$. The mean vector is

$$\mu = \begin{pmatrix} \mu_{\hat{x}} \\ \mu_{\hat{y}} \end{pmatrix} \tag{6}$$
and the variance–covariance matrix is

$$\Sigma = \begin{pmatrix} \sigma_{\hat{x}}^2 & r \, \sigma_{\hat{x}} \sigma_{\hat{y}} \\ r \, \sigma_{\hat{y}} \sigma_{\hat{x}} & \sigma_{\hat{y}}^2 \end{pmatrix} \tag{7}$$

To summarize, there are in total 9 parameters in the BJP model, including $\mu_{\hat{x}}$ and $\mu_{\hat{y}}$ (the expectation of $\hat{x}$ and $\hat{y}$, respectively), $\sigma_{\hat{x}}$ and $\sigma_{\hat{y}}$ (the standard deviation of $\hat{x}$ and $\hat{y}$, respectively), $r$ (the linear correlation between $\hat{x}$ and $\hat{y}$), and $a_x$, $a_y$, $b_x$, and $b_y$ (the parameters in transforming $x$ and $y$ into $\hat{x}$ and $\hat{y}$). All the model parameters are reparameterised to reduce non-linear dependencies between parameters and increase the efficiency of parameter inference (Robertson et al., 2013a,b,c).

Fig. 6. Cross validation of forecasts in terms of continuous ranked probability score (black circle, raw forecast; blue plus sign, median of post-processed forecast). (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

The BJP model is fitted using training data $[x_{D,m}, y_{D,m}]$ ($m = 1, 2, \ldots, M$) and then validated by testing data $[x_{D,n}, y_{D,n}]$ ($n = 1, 2, \ldots, N$). Based on the dataset of training samples, the uncertainties of the 9 parameters are estimated with Bayesian inference. Noninformative prior distributions are set for the parameters. Posterior distributions of the parameters are then determined by the Markov chain Monte Carlo (MCMC) algorithm (Wang et al., 2009; Wang and Robertson, 2011). In this study, the MCMC computation sets the number of burn-in iterations to 10,000 and the length of the Markov chain to 150,000. The first sample of every 150,000/1000 = 150 samples in the Markov chain is kept. As a result, one thousand representative samples of the parameters are drawn for the purpose of predictive uncertainty estimation. For the sake of simplicity, the index of the parameters [$\mu_{\hat{x}}$, $\mu_{\hat{y}}$, $\sigma_{\hat{x}}$, $\sigma_{\hat{y}}$, $r$, $a_x$, $a_y$, $b_x$, $b_y$] is eliminated in the following formulation. For more information about the implementation of the algorithm, please refer to Wang et al. (2009), Wang and Robertson (2011), and Robertson et al. (2013a,b,c).

The predictive uncertainty is derived as follows for the testing data period. Given a raw forecast $x_{D,n}$ ($n = 1, 2, \ldots, N$), the parameters determined by the training data are applied to infer the observation $y_n$. Firstly, the log–sinh transformation (Eq. (5)) converts $x_{D,n}$ into $\hat{x}_{D,n}$:

$$\hat{x}_{D,n} = \frac{1}{b_x} \log(\sinh(a_x + b_x x_{D,n})) \tag{8}$$

Secondly, the mean value and variance of $\hat{y}_n$ conditional on $\hat{x}_{D,n}$ are $E(\hat{y}_n) = \mu_{\hat{y}} + r \dfrac{\sigma_{\hat{y}}}{\sigma_{\hat{x}}} (\hat{x}_{D,n} - \mu_{\hat{x}})$ and $\mathrm{var}(\hat{y}_n) = (1 - r^2) \, \sigma_{\hat{y}}^2$, respectively. As can be seen, the BJP model deals with transformed data and associates the distribution of $\hat{y}_n$ with $\hat{x}_{D,n}$. The model corrects the bias in transformed data (as illustrated by the formulation of the expected mean value) and estimates the uncertainty (as shown by the derivation of variance). A random sample $\hat{y}_n$ is drawn from the conditional distribution

$$\hat{y}_n \sim N(E(\hat{y}_n), \mathrm{var}(\hat{y}_n)) \tag{9}$$

Thirdly, a post-processed forecast is determined from $\hat{y}_n$ by the inverse log–sinh transformation

$$y_n = \frac{\operatorname{arcsinh}(e^{b_y \hat{y}_n}) - a_y}{b_y} \tag{10}$$

In Eq. (10), the post-processed forecast generated by the BJP model is distinct from observed streamflow. As there are 1000 representative sets of parameters, for a given raw streamflow forecast $x_{D,n}$, the BJP model outputs an ensemble of 1000
post-processed forecast values to represent the probabilistic forecast for $y_n$. The ensemble spread provides a numerical representation of the forecast uncertainty of $y_n$.

Fig. 7. PIT uniform probability plot (1:1 solid red line, theoretical uniform distribution; dashed blue lines, Kolmogorov 5% significance band; black circles, PIT values of observed streamflow against their rank divided by the sample size). (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Parameters of the log–sinh transformation and the joint probability are obtained from the training dataset. In different case studies, the errors of raw forecasts depend on the streamflow forecasting models themselves. As a result, the parameters, which describe the forecast uncertainty in relation to raw forecasts, will differ for different rivers or forecasting systems. However, the MCMC algorithm is efficient in estimating the parameters – representative samples are drawn from the Markov chain to capture the parameter uncertainty, and are applied to quantify the predictive uncertainty for a given raw forecast (Eqs. (8), (9), (10)).

We note that the BJP model and the log–sinh transformation have been successfully used for a variety of hydrological applications on a wide range of catchments in Australia and China, including post-processing rainfall forecasts at seasonal time scales (Hawthorne et al., 2013; Peng et al., 2014a,b; Schepen and Wang, 2014) and sub-daily time scales (Bennett et al., 2014a,b; Robertson et al., 2013a,b,c; Shrestha et al., 2015), generating statistical seasonal streamflow forecasts (Peng et al., 2014a,b; Robertson et al., 2013a,b,c; Robertson and Wang, 2013), and post-processing seasonal streamflow simulations (Pokhrel et al., 2013a,b). Therefore, the BJP model has the flexibility to estimate predictive uncertainty of raw forecasts from different streamflow forecasting models.

3.3. Forecast evaluation

Three error scores of root mean squared error (RMSE), root mean squared error in probability (RMSEP), and continuous ranked probability score (CRPS) are used in this study to evaluate forecasts (Wang et al., 2009; Wang and Robertson, 2011; Bennett et al., 2014a,b). RMSE measures the standard deviation of forecast error and is applied to the ensemble mean. RMSEP, which assesses forecast error on a probability scale (the observed climatology distribution), is calculated for the ensemble median. CRPS is widely used to assess probabilistic forecasts, and indicates the distance between the forecast probability distribution and the observation. As a comparison, the three error scores are also applied to the raw forecasts. Note that CRPS of the (deterministic) raw forecasts is equivalent to the mean absolute error.

The three error scores of RMSE, RMSEP, and CRPS provide overall evaluations of the BJP model. Furthermore, the ensemble spread is assessed by probability integral transform (PIT) and reliability diagram, to check the reliability of the estimated uncertainty of the post-processed forecasts. The reliability is evaluated by PIT as follows

$$\mathrm{PIT}_n = F_n(y_{D,n}) \tag{11}$$

For forecasts to be reliable, the cumulative distribution functions $F_n(\cdot)$, which are estimated using the post-processed forecasts from the BJP model, should be consistent with the observation data $y_{D,n}$. More specifically, $\mathrm{PIT}_n$ should follow a uniform distribution U[0, 1]. Therefore, the reliability of the forecasts can be checked by displaying ranked $\mathrm{PIT}_n$ ($n = 1, 2, \ldots, N$) values in a uniform probability plot (Laio and Tamea, 2007). Points lie along the 1:1 line when the forecasts are reliable.

In addition, the reliability (or attribute) diagram is employed to examine the forecast probability of binary events. As flooding risk
is a major concern during the flood season, focus is given to flood events $y_n$ larger than a threshold $y_{thres}$. The exceedance probability is estimated by $F_n(\cdot)$:

$$\mathrm{Prob}(y_n > y_{thres}) = 1 - F_n(y_{thres}) \tag{12}$$

The reliability diagram plots the observed frequency against the forecast probability and examines how well the predicted probability of an event corresponds to its observed frequency (Wang and Robertson, 2011). For ideal forecasts, the observed frequency should be equal to the forecast probability and the scatter plot lies along the 1:1 line.

Fig. 8. Forecast reliability diagrams of an event greater than 22,500 m³/s at different lead times. 1:1 dashed line, perfectly reliable forecast; circles, observed relative frequency; insets, proportion of forecasts falling in the [0.00, 0.25), [0.25, 0.50), [0.50, 0.75), and [0.75, 1.00] bins.

4. Results

The BJP-based predictive uncertainty estimation is evaluated through leave-one-year-out cross validation. There are in total 6 × 92 = 552 data points. The forecast-observation pairs in one year are selected as testing data $[x_{D,n}, y_{D,n}]$ ($n = 1, 2, \ldots, 92$), and the pairs in the other five years as training data $[x_{D,m}, y_{D,m}]$ ($m = 1, 2, \ldots, 460$). The validation is conducted from 2004 to 2009 and the results are pooled to perform the evaluation.

4.1. Examination of RMSE, RMSEP, and CRPS

The RMSE of the ensemble mean at different lead times is evaluated. The RMSE of the raw forecast is also assessed and provides a reference. Fig. 4 shows that the RMSEs of the ensemble mean and the raw forecast are generally similar. That is, the application of the BJP model does not yield notable improvements in RMSE for our case study. We note that other BJP studies observed considerable systematic bias in raw outputs from climate and weather prediction models and hydrological models, and they showed that the BJP model is able to correct the bias through forecast calibration (Robertson et al., 2013a,b,c; Pokhrel et al., 2013a,b). Since the raw forecast of the Three Gorges Reservoir has undergone empirical bias correction, results in Fig. 4 suggest that bias is not of concern in the raw forecast data.

Fig. 2 illustrates the close relationship between lead time and forecast uncertainty. Across the six years of data, the average RMSE of the raw forecast is 1200 m³/s, 2000 m³/s, 2800 m³/s, and 4200 m³/s at lead-times of 1, 2, 3, and 4 days, respectively. The increase of forecast uncertainty with the lead time is generally attributable to the set-up of the forecasting model. Floods entering the Three Gorges Reservoir can be forecasted from upstream floods gauged by the Cuntan Station and Wulong Station, and locally generated floods originating from the Three Gorges Region downstream of the stream gauges, which are estimated using observations from rainfall gauge stations (Li et al., 2010). The routing time of locally generated floods to the Three Gorges Dam is 1–2 days and that of upstream floods is 2–3 days. Therefore, the 1-day ahead forecast is the most accurate as it is based on routing of observed streamflow. Obtaining the 2-day ahead forecast requires forecasts of locally generated floods from the mountainous Three Gorges Region. The 3- and 4-day forecasts require predictions of upstream floods, which are generated by large-scale hydrological models. Streamflow forecasting becomes increasingly difficult as lead time increases.

In addition to RMSE, the RMSEP is employed to evaluate the ensemble median and the raw forecast. Fig. 5 illustrates that the
RMSEP of the ensemble median is not substantially different from that of the raw forecast. As will be illustrated later in this paper, there is a linear relationship, which approximates to the 1:1 line, between the ensemble median and the raw streamflow forecast. Therefore, Figs. 4 and 5 indicate that in this case post-processing does not lead to forecasts where the ensemble mean or median improves on the raw deterministic forecast. In the following, focus is given to the ensemble spread.

Fig. 9. Uncertainty estimation by the BJP model trained by samples from 2004 to 2008 (blue line, ensemble median; grey band, 50% credible interval; light grey band, 90% credible interval; red dot, streamflow observation corresponding to raw forecast in the Year 2009). (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

The raw forecast is deterministic (point forecast), and the ensemble forecast provides a numerical representation of the forecast uncertainty (Eqs. (8), (9), and (10)). The CRPS provides an integrated evaluation of the forecast accuracy and reliability of the forecast ensemble spread. Fig. 6 illustrates the CRPS values at different lead times. The CRPS values of the raw forecast (i.e., mean absolute errors) increase with lead time. On the other hand, the BJP model reduces the CRPS of the raw forecast by 9%, 16%, 20%, and 19%, on average, at lead-times of 1, 2, 3 and 4 days, respectively. CRPS penalises forecasts on the basis of mean errors and unreliable ensemble spread. We have seen that mean errors in the raw forecasts are often not substantially larger than those in the post-processed forecasts, thus much of the gain in CRPS in the post-processed forecasts is due to accounting for forecast uncertainty. Deterministic forecasts are inherently over-confident even if they are reasonably accurate in mean values, and the CRPS values of raw forecasts reflect this shortcoming.

4.2. Reliability of the ensemble spread

The PIT values of the observed streamflow are derived from the cumulative distributions of the ensemble spread generated by the BJP model (Eq. (11)). In the case study, the PIT value exhibits autocorrelation that makes the Kolmogorov 5% significance band not applicable. To circumvent the problem, the PIT plots are generated from every 10th post-processed forecast. As shown in Fig. 7, the PIT plots at different lead times are generally along the 1:1 line and lie within the significance bands. The results indicate that although the errors of raw forecasts are heteroscedastic and non-Gaussian, the ensemble spread is reliable and captures the predictive uncertainty.

The ensemble spread at a lead time of 4 days is the most reliable as the PIT values fall on the 1:1 line (Fig. 7). On the other hand, the PIT plots of 1-day ahead forecasts exhibit an inverse S shape (though still largely within the confidence intervals), which indicates that the ensemble spread is somewhat wider than the ideal spread (Wang et al., 2009; Wang and Robertson, 2011). This finding is counterintuitive – the raw 1-day ahead forecasts are the most accurate, but the estimation of their predictive uncertainty is the least reliable. This result is attributed to the fact that forecasts at short lead times are quite accurate and therefore forecast uncertainty is small. A small defect in the forecast distribution will lead to a considerable distortion in the PIT plots because the uncertainty spread is narrow. We note that the uncertainty estimation is often less important to operational forecasters when forecasts are highly accurate.

The reliability of estimated ensemble spread in capturing the probability of flood events greater than $y_{thres}$ = 22,500 m³/s (the median of flow during the flood season) is also evaluated (Fig. 8). First of all, the forecast probability is divided into four bins [0.00, 0.25), [0.25, 0.50), [0.50, 0.75), and [0.75, 1.00]. The proportion of events falling in each bin is presented by inset bar plots.
338 T. Zhao et al. / Journal of Hydrology 528 (2015) 329–340

Fig. 10. Uncertainty estimation of forecasts in the flood season of 2009 based on the BJP model (blue line, raw forecast, which is approximately equal to the ensemble median;
grey band, 50% credible interval; light grey band, 90% credible interval; red dot, streamflow observation). (For interpretation of the references to colour in this figure legend,
the reader is referred to the web version of this article.)

Then, for each bin, the observed frequency of flood events greater than the threshold is plotted against the mean of the forecast probability, as shown by the circles. The 90% confidence interval of observed frequency is derived by bootstrapping the lumped forecasts from 2004 to 2009, as illustrated by vertical lines across the circles in Fig. 8.

The reliability plots at different lead times are generally close to the 1:1 line. This result indicates that the forecast probabilities derived from the ensemble (Eq. (12)) tend to be reliable: when the forecast probability is p, the observed streamflow exceeds y_thres with a frequency of approximately p. Comparing the plots at different lead times, there are two interesting observations: (1) the reliability plot at the lead time of 4 days provides the best match to the 1:1 line; and (2) the relative frequency of forecast probability exhibits the most prominent U shape at the lead time of 1 day. As with the PIT plots, the reliability diagrams suggest that the estimated ensemble spread at the lead time of 1 day is the least reliable one, but it is the sharpest one: forecast probabilities tend to be near 0 or 1 for the targeted flood events.

Sharpness is a desirable property in forecasts because it allows users to be more decisive. However, if forecasts are very sharp but not reliable, they may be overconfident, leading to poor decisions. Fortunately, even the 1-day ahead post-processed forecasts generated by the BJP model are reasonably reliable (shown in both reliability diagrams and PIT plots), even if they are not as reliable as 4-day ahead forecasts. Further, the PIT analyses indicate that the 1-day ahead forecasts are actually under-confident, meaning that these forecasts could be even sharper. Overall, forecasts at all lead times tend to be sharp as well as reliable. We conclude that the BJP post-processing produces ensemble forecasts that are highly likely to be useful to users of the forecasts.

4.3. Uncertainty estimation for the Year 2009

The above sections illustrate the results of cross validation for all years, while this section focuses on the predictive uncertainty for the Year 2009. The analysis is similar to real-time predictive uncertainty analysis that uses past forecast and observation data to infer forecast uncertainty of a new forecast. The samples in the years 2004–2008 are used to train the BJP model and determine the parameters. The samples in 2009 are used for testing. Fig. 9 illustrates the ensemble median and spread against the raw forecast. As can be seen, there is a linear relationship between the raw forecast and the ensemble median, which generally lies along the 1:1
line. In Fig. 9, the ensemble spread is represented by 50% and 90% credible intervals, which expand as the forecast value increases. This pattern corresponds to Figs. 2 and 3 and demonstrates that the BJP model is able to capture the heteroscedasticity and non-Gaussianity in the raw forecasts. Therefore, this model provides an effective approach to modelling the short-term forecast uncertainty.

The relationship between raw forecasts and observations becomes more scattered as lead time increases. The variations of observed streamflow at different lead times are captured by the 50% and 90% credible intervals. Therefore, the BJP model trained by the samples from 2004 to 2008 provides satisfactory predictive uncertainty estimations of the samples in 2009. As we noted earlier, forecasts for the Three Gorges dam at longer lead times become more challenging because of the difficulty of forecasting locally generated floods and upstream basin-scale floods, and these difficulties are reflected in the widening uncertainty bands at longer lead times.

Fig. 10 illustrates the predictive uncertainty chronologically for the 2009 flood season. Similar to Fig. 9, Fig. 10 demonstrates that there is a larger forecast uncertainty at a longer lead time. The 1-day ahead raw deterministic forecast is quite accurate and captures the low flows and peak flows. The corresponding 50% and 90% credible intervals are very narrow. The 2-day and 3-day ahead deterministic forecasts are less accurate, but the 50% and 90% credible intervals from the post-processed forecasts effectively capture the flood process. The 4-day ahead deterministic forecast is the least accurate and does not capture the largest flood peak in 2009. The corresponding post-processed forecasts provide credible intervals and yield insightful information about the flood peak. Therefore, the BJP model provides a reliable predictive uncertainty estimation and represents a useful tool in streamflow forecasting.

5. Summary and conclusions

Uncertainty is an inherent part of any streamflow forecast. It is an important determinant of the utility of forecasts for water resources management. This study addresses the problem of estimating the uncertainty of streamflow forecasts for the Three Gorges Reservoir. A BJP model is set up to post-process deterministic forecasts to quantify predictive uncertainty. The parametric variance-stabilizing log–sinh transformation is employed to deal with heteroscedasticity and to normalize hydrological variables. A bi-variate Gaussian distribution is applied to formulate the joint probability and to capture the dependence between normalized variables. The BJP model produces an ensemble of post-processed forecasts to capture the predictive uncertainty of real-time streamflow forecasts. The ensemble mean, median, and spread are examined to evaluate the performance of the BJP model.

The results show that the BJP model is a useful tool for predictive uncertainty estimation in short-term streamflow forecasting. The heteroscedastic characteristic of forecast uncertainty, i.e., uncertainty grows as the value of forecast increases, is characterized by the model. Forecast uncertainty increases with lead time, and the forecast becomes less accurate as lead time increases. The ensemble spread explicitly accounts for the uncertainty of raw deterministic forecasts at different lead times. As a result, the CRPS is considerably reduced, in particular at a longer lead time. The PIT plot illustrates that the distribution of observed streamflow is reliably captured by the ensemble spread. The reliability diagram shows that the forecast probability of streamflow greater than a threshold can be reliably derived from the ensemble spread. We conclude that the BJP model is able to estimate uncertainty very accurately in real-time applications. The BJP model trained on an archived forecast-observation dataset can efficiently infer predictive uncertainty of a new forecast.

This study provides a BJP-based analysis of the real-time forecast uncertainty at different lead times for the Three Gorges Reservoir. The BJP model is a post-processing model that estimates forecast uncertainty of raw forecasts. The BJP model works in concert with deterministic hydrological and hydro-dynamical models, which play the fundamental role in producing the raw streamflow forecasts. In the case study of the Three Gorges Reservoir, the forecast at a lead time of 2–3 days depends on forecasts of locally generated floods from the Three Gorges Region, and the forecast at a lead time of 3–4 days relies on forecasts of upstream floods from the main stream of the Yangtze River and the Wujiang River. In this case study, the BJP model was not required to correct large biases, as the forecasts were already largely unbiased. We note that in other applications the BJP model has been very effective at correcting biases, and that this model is likely to work well as a post-processor in cases where forecasts are less accurate or more biased. Working in concert with deterministic hydrological forecasting models, the statistical BJP model provides reliable predictive uncertainty estimation.

Acknowledgement

This study is partially supported by NSFC (51409145) and MSTC (2013BAB05B03).
340 T. Zhao et al. / Journal of Hydrology 528 (2015) 329–340

Gorges Dam in the Yangtze River basin, China. Hydrol. Sci. J.-J. Sci. Hydrol. 54 Schepen, A., Wang, Q.J., 2014. Ensemble forecasts of monthly catchment rainfall out
(3), 582–595. to long lead times by post-processing coupled general circulation model output.
Laio, F., Tamea, S., 2007. Verification tools for probabilistic forecast of continuous J. Hydrol. 519, 2920–2931. http://dx.doi.org/10.1016/j.jhydrol.2014.03.017.
hydrological variables. Hydrol. Earth Syst. Sci. 11 (4), 1267–1277. Schepen, A., Wang, Q.J., Robertson, D., 2012. Evidence for using lagged climate
Li, X.A., Guo, S.L., Liu, P., Chen, G.Y., 2010. Dynamic control of flood limited water indices to forecast Australian seasonal rainfall. J. Clim. 25 (4), 1230–1246.
level for reservoir operation by considering inflow uncertainty. J. Hydrol. 391 Shrestha, D.L., Robertson, D.E., Bennett, J.C., Wang, Q.J., 2015. Improving
(1–2), 126–134. precipitation forecasts by generating ensembles through post-processing,
Li, Z., Yang, D.W., Hong, Y., Zhang, J., Qi, Y.C., 2014. Characterizing spatiotemporal Monthly Weather Review (in press).
variations of hourly rainfall by gauge and radar in the mountainous three Vicuna, S., Dracup, J.A., Lund, J.R., Dale, L.L., Maurer, E.P., 2010. Basin-scale water
Gorges region. J. Appl. Meteorol. Climatol. 53 (4), 873–889. system operations with uncertain future climate conditions: methodology and
Liu, P., Li, L.P., Guo, S.L., Xiong, L.H., Zhang, W., Zhang, J.W., Xu, C.Y., 2015. Optimal case studies. Water Resour. Res. 46, W04505. http://dx.doi.org/10.1029/
design of seasonal flood limited water levels and its application for the Three 2009WR007838.
Gorges Reservoir [J]. J. Hydrol. 527, 1045–1053. Wang, Q.J., Robertson, D.E., 2011. Multisite probabilistic forecasting of seasonal
Liersch, S., Volk, M., 2007. Towards empirical knowledge as additional information flows for streams with zero value occurrences. Water Resour. Res. 47, W02546.
in data-based flood forecasting techniques. In: Modsim 2007: International http://dx.doi.org/10.1029/2010WR009333.
Congress on Modelling and Simulation: Land, Water and Environmental Wang, Q.J., Robertson, D.E., Chiew, F.H.S., 2009. A Bayesian joint probability
Management: Integrated Systems for Sustainability, pp. 1596–1602. modeling approach for seasonal forecasting of streamflows at multiple sites.
Maurer, E.P., Lettenmaier, D.P., 2003. Predictability of seasonal runoff in the Water Resour. Res. 45, W05407. http://dx.doi.org/10.1029/2008WR007355.
Mississippi River basin. J. Geophys. Res.-Atmos. 108 (D16). Wang, Q.J., Shrestha, D.L., Robertson, D.E., Pokhrel, P., 2012. A log–sin h
Maurer, E.P., Lettenmaier, D.P., 2004. Potential effects of long-lead hydrologic transformation for data normalization and variance stabilization. Water
predictability on Missouri River main-stem reservoirs. J. Clim. 17 (1), 174–186. Resour. Res. 48, W05514. http://dx.doi.org/10.1029/2011WR010973.
Montanari, A., Brath, A., 2004. A stochastic approach for assessing the uncertainty of Wang, L., Koike, T., Ikeda, M., Tinh, D.N., Nyunt, C.T., Saavedra, O., Nguyen, L.C., Sap,
rainfall–runoff simulations. Water Resour. Res. 40 (1). T.V., Tamagawa, K., Ohta, T., 2014. Optimizing multidam releases in large river
Pappenberger, F., Stephens, E., Thielen, J., Salamon, P., Demeritt, D., vanAndel, S.J., basins by combining distributed hydrological inflow predictions with rolling-
Wetterhall, F., Alfieri, L., 2013. Visualizing probabilistic flood forecast horizon decision making. J. Water Resour. Plan. Manage. 140 (10).
information: expert preferences and perceptions of best practice in Weerts, A.H., Winsemius, H.C., Verkade, J.S., 2011. Estimation of predictive
uncertainty communication. Hydrol. Process. 27 (1), 132–146. hydrological uncertainty using quantile regression: examples from the
Peng, Z., Wang, Q.J., Bennett, J.C., Pokhrel, P., Wang, Z., 2014a. Seasonal precipitation National Flood Forecasting System (England and Wales). Hydrol. Earth Syst.
forecasts over China using monthly large-scale oceanic-atmospheric indices. J. Sci. 15 (1), 255–265.
Hydrol. 519, 792–802. http://dx.doi.org/10.1016/j.jhydrol.2014.08.012. Wood, A.W., Schaake, J.C., 2008. Correcting errors in streamflow forecast ensemble
Peng, Z., Wang, Q.J., Bennett, J.C., Schepen, A., Pappenberger, F., Pokhrel, P., Wang, Z., mean and spread. J. Hydrometeorol. 9 (1), 132–148.
2014b. Statistical calibration and bridging of ECMWF system 4 outputs for Wood, E.F., Roundy, J.K., Troy, T.J., van Beek, L.P.H., Bierkens, M.F.P., Blyth, E., de Roo,
forecasting seasonal precipitation over China. J. Geophys. Res. (Atmospheres) A., Doll, P., Ek, M., Famiglietti, J., Gochis, D., van de Giesen, N., Houser, P., Jaffe,
119, 7116–7135. http://dx.doi.org/10.1002/2013JD021162. P.R., Kollet, S., Lehner, B., Lettenmaier, D.P., Peters-Lidard, C., Sivapalan, M.,
Pokhrel, P., Robertson, D.E., Wang, Q.J., 2013a. A Bayesian joint probability post- Sheffield, J., Wade, A., Whitehead, P., 2011. Hyperresolution global land surface
processor for reducing errors and quantifying uncertainty in monthly modeling: meeting a grand challenge for monitoring Earth’s terrestrial water.
streamflow predictions. Hydrol. Earth Syst. Sci. 17, 795–804. http://dx.doi.org/ Water Resour. Res. 47, W05301. http://dx.doi.org/10.1029/2010WR010090.
10.5194/hess-17-795-2013. Xu, W., Zhang, C., Peng, Y., Fu, G., Zhou, H., 2014. A two stage Bayesian stochastic
Pokhrel, P., Robertson, D.E., Wang, Q.J., 2013b. A Bayesian joint probability post- optimization model for cascaded hydropower systems considering varying
processor for reducing errors and quantifying uncertainty in monthly uncertainty of flow forecasts. Water Resour. Res. 50 (12), 9267–9286. http://
streamflow predictions. Hydrol. Earth Syst. Sci. 17 (2), 795–804. dx.doi.org/10.1002/2013wr015181.
Robertson, D.E., Wang, Q.J., 2013. Seasonal forecasts of unregulated inflows into the You, J.Y., Cai, X.M., 2008. Hedging rule for reservoir operations: 1. A theoretical
Murray River, Australia. Water Resour. Manage 27, 2747–2769. http:// analysis. Water Resour. Res. 44 (1), W01415. http://dx.doi.org/10.1029/
dx.doi.org/10.1007/s11269-013-0313-4. 2006WR005481.
Robertson, D.E., Shrestha, D.L., Wang, Q.J., 2013a. Post-processing rainfall forecasts Zhao, T.T.G., Zhao, J.S., 2014. Joint and respective effects of long- and short-term
from numerical weather prediction models for short-term streamflow forecast uncertainties on reservoir operations. J. Hydrol. 517, 83–94.
forecasting. Hydrol. Earth Syst. Sci. 17 (9), 3587–3603. Zhao, T.T.G., Cai, X.M., Yang, D.W., 2011. Effect of streamflow forecast uncertainty
Robertson, D.E., Shrestha, D.L., Wang, Q.J., 2013b. Post-processing rainfall forecasts on real-time reservoir operation. Adv. Water Resour. 34 (4), 495–504.
from numerical weather prediction models for short-term streamflow Zhao, T.T.G., Yang, D.W., Cai, X.M., Zhao, J.S., Wang, H., 2012. Identifying effective
forecasting. Hydrol. Earth Syst. Sci. 17, 3587–3603. http://dx.doi.org/10.5194/ forecast horizon for real-time reservoir operation under a limited inflow forecast.
hess-17-3587-2013. Water Resour. Res. 48, W01540. http://dx.doi.org/10.1029/2011WR010623.
Robertson, D.E., Pokhrel, P., Wang, Q.J., 2013c. Improving statistical forecasts of Zhao, T.T.G., Zhao, J.S., Yang, D.W., Wang, H., 2013. Generalized martingale model of
seasonal streamflows using hydrological model output. Hydrol. Earth Syst. Sci. the uncertainty evolution of streamflow forecasts. Adv. Water Resour. 57, 41–51.
17, 579–593. http://dx.doi.org/10.5194/hess-17-579-2013. Zhao, T.T.G., Zhao, J.S., Lund, J.R., Yang, D.W., 2014. Optimal hedging rules for
Sankarasubramanian, A., Lall, U., Devineni, N., Espinueva, S., 2009. The role of reservoir flood operation from forecast uncertainties. J. Water Resour. Plan.
monthly updated climate forecasts in improving intraseasonal water allocation. Manage. 140 (12).
J. Appl. Meteorol. Climatol. 48 (7), 1464–1482.
