
Stochastic Analysis,

Modeling, and Simulation (SAMS)


Version 2000
USER's MANUAL

J. D. Salas, N. Saada, C. H. Chung, W. L. Lane, and D. K. Frevert

October, 2000

Computing Hydrology Laboratory


Water Resources, Hydrologic and Environmental Sciences
Engineering Research Center
Fort Collins, Colorado

TECHNICAL REPORT No.10


Stochastic Analysis, Modeling, and
Simulation (SAMS)
Version 2000 - User's Manual

by

Jose D. Salas1, Nidhal Saada2, and Chen-hua Chung2


Water Resources, Hydrologic and Environmental Sciences
Department of Civil Engineering, Colorado State University
Fort Collins, Colorado, U.S.A

William L. Lane3
Consultant, Hydrology and Water Resources Engineering,
1091 Xenophon St., Golden, CO 80401-4218.

and

Donald K. Frevert4
U.S Department of Interior
Bureau of Reclamation
Denver, Colorado
U.S.A

1 Professor, Water Resources, Hydrologic and Environmental Sciences, Civil Engineering
Department, Colorado State University.
2 Former graduate students, Water Resources, Hydrologic and Environmental Sciences, Civil
Engineering Department, Colorado State University.
3 Consultant, Hydrology and Water Resources Engineering, 1091 Xenophon St., Golden, CO
80401-4218.
4 Hydraulic Engineer, Water Resources Services, Technical Service Center, U.S. Bureau of
Reclamation, Denver, CO 80225.
TABLE OF CONTENTS
PREFACE
ACKNOWLEDGEMENTS
1. INTRODUCTION
2. DESCRIPTION OF SAMS
    2.1 General Overview
    2.2 Statistical Analysis of Data
    2.3 Fitting a Stochastic Model
    2.4 Generating Synthetic Series
3. DEFINITION OF STATISTICAL CHARACTERISTICS
    3.1 Basic Statistics
        3.1.1 Annual Data
        3.1.2 Seasonal Data
    3.2 Flood, Storage, and Drought Related Statistics
        3.2.1 Storage Related Statistics
        3.2.2 Drought Related Statistics
        3.2.3 Surplus Related Statistics
4. MATHEMATICAL MODELS
    4.1 Data Transformations and Standardization
    4.2 Univariate ARMA (p,q) Model
    4.3 Univariate GAR (1) Model
    4.4 Univariate PARMA (p,q) Model
    4.5 Multivariate MAR (p) Model
    4.6 Multivariate CARMA (p,q) Model
    4.7 Multivariate MPAR (p) Model
    4.8 Disaggregation Models
        4.8.1 General
        4.8.2 Model Formulations
    4.9 Model Testing
5. EXAMPLES
    5.1 Statistical Analysis of Data
    5.2 Stochastic Modeling and Generation of Data
        5.2.1 Univariate ARMA(p,q) Model
        5.2.2 Univariate GAR(1) Model
        5.2.3 Univariate PARMA(p,q) Model
        5.2.4 Multivariate MAR(p) Model
        5.2.5 Multivariate CARMA(p,q) Model
        5.2.6 Disaggregation Models
REFERENCES
APPENDIX A
APPENDIX B
APPENDIX C

PREFACE
Several computer packages have been developed since the 1970s for analyzing the stochastic
characteristics of time series in general and hydrologic and water resources time series in particular.
For instance, the LAST package was developed in 1977-1979 by the US Bureau of Reclamation
(USBR) in Denver, Colorado. Originally the package was designed to run on a mainframe
computer, but later it was modified for use on personal computers. While various additions and
modifications have been made to LAST over the past twenty years, the package has not kept pace
with either advances in time series modeling or advances in computer technology. These facts
prompted USBR to promote the initial development of SAMS, a computer software package that
deals with the Stochastic Analysis, Modeling, and Simulation of hydrologic time series, particularly
annual and seasonal streamflow series. It is written in C and Fortran and runs under modern
windows operating systems such as WINDOWS NT and WINDOWS 98. This manual describes
the current version of SAMS denoted as SAMS 2000.

ACKNOWLEDGEMENTS
SAMS has been developed as a cooperative effort between USBR and Colorado State
University (CSU) under USBR Advanced Hydrologic Techniques Research Project through an
Interagency Personal Agreement with Professor Jose D. Salas as Principal Investigator. Drs. W.L.
Lane and D.K. Frevert provided additional expert guidance and supervision on behalf of USBR.
Several former CSU graduate students collaborated in various parts of this project, including M.W.
AbdelMohsen, who developed many of the Fortran codes, and M. Ghosh, who initiated the programming
in C language, followed by Mr. Bradley Jones, Nidhal M. Saada, and Chen-Hua Chung.
Acknowledgements are due to the funding agency and to the several students who collaborated in
this project.

STOCHASTIC ANALYSIS, MODELING, AND SIMULATION
(SAMS 2000)

1. INTRODUCTION
Stochastic simulation of water resources time series in general and hydrologic time series
in particular has been widely used for several decades for various problems related to planning and
management of water resources systems. Typical examples are determining the capacity of a
reservoir, evaluating the reliability of a reservoir of a given capacity, evaluating the adequacy of
a water resources management strategy under various potential hydrologic scenarios, and evaluating
the performance of an irrigation system under uncertain irrigation water deliveries (Salas et al., 1980;
Loucks et al., 1981).
Stochastic simulation of hydrologic time series such as streamflow is typically based on
mathematical models. For this purpose a number of stochastic models have been suggested in
the literature (Salas, 1993; Hipel and McLeod, 1994). Using one type of model or another for a
particular case at hand depends on several factors, such as the physical and statistical characteristics of
the process under consideration, data availability, the complexity of the system, and the overall
purpose of the simulation study. Given the historical record, one would like the model to reproduce
the historical statistics. This is why a standard step in streamflow simulation studies is to determine
the historical statistics. Once a model has been selected, the next step is to estimate the model
parameters, then to test whether the model represents reasonably well the process under
consideration, and finally to carry out the needed simulation study.
The advent of digital computers several decades ago led to the development of computer
software for mathematical and statistical computations of varying degrees of sophistication. For
instance, well known packages are IMSL, STATGRAPHICS, ITSM, MINITAB, SAS/ETS, SPSS,
and MATLAB. These packages can be very useful for standard time series analysis of hydrological
processes. However, despite the availability of such general purpose programs, specialized
software for simulation of hydrological time series such as streamflow has been attractive for
several reasons. One is the particular nature of hydrological processes in which periodic
properties are important in the mean, variance, covariance, and skewness. Another one is that some
hydrologic time series include complex characteristics such as long term dependence and memory.
Still another one is that many of the stochastic models useful in hydrology and water resources have
been developed specifically to fit the needs of water resources, for instance temporal and
spatial disaggregation models. Examples of software specifically oriented to hydrologic time series
simulation are HEC-4 (U.S. Army Corps of Engineers, 1971), LAST (Lane and Frevert, 1990), and
SPIGOT (Grygier and Stedinger, 1990).
The LAST package was developed during 1977-1979 by the U. S. Bureau of Reclamation
(USBR). Originally, the package was designed to run on a mainframe computer (Lane, 1979) but
later it was modified for use on personal computers (Lane and Frevert, 1990). While various
additions and modifications have been made to LAST over the past 20 years, the package has not
kept pace with either advances in time series modeling or advances in computer technology. This
is especially true of the computer graphics. These facts prompted USBR to promote the initial
development of the SAMS package. The first version of SAMS (SAMS-96.1) was released in 1996.
Since then, corrections and modifications were made based on feedback received from the users.
In addition, new functions and capabilities have been implemented.
SAMS 2000 has the following capabilities and limitations:
1. It analyzes annual and seasonal data. For seasonal data the maximum number of seasons (time
intervals within a year) is 12.
2. It includes several types of transformation options to transform the original data to normal.
3. It includes a number of single site, multisite, and disaggregation stochastic models that have been
widely used in the literature.
4. It includes two major modeling schemes for modeling and generation of complex river network
systems.
5. Maximum number of stations is 40.
6. Maximum number of stations for a group (for purposes of multivariate disaggregation) is 10.
7. Maximum number of years for the input data file is 600.
8. The number of samples that can be generated is unlimited.
9. The number of years that can be generated is unlimited.
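As an illustration only, the size limits listed above can be collected into a small pre-check that a user might run on a data set before loading it into SAMS. This is a sketch in Python, not part of SAMS: the names `DatasetShape` and `check_limits` are hypothetical, and only the numeric limits come from the list above.

```python
from dataclasses import dataclass
from typing import List

# Limits quoted in the list above for SAMS 2000.
MAX_SEASONS = 12
MAX_STATIONS = 40
MAX_GROUP_STATIONS = 10
MAX_INPUT_YEARS = 600


@dataclass
class DatasetShape:
    """Hypothetical summary of an input data set (not a SAMS structure)."""
    n_stations: int
    n_years: int
    n_seasons: int = 1       # 1 means annual data
    largest_group: int = 1   # stations in the largest disaggregation group


def check_limits(shape: DatasetShape) -> List[str]:
    """Return a list of limit violations; the list is empty if none are found."""
    problems = []
    if not 1 <= shape.n_seasons <= MAX_SEASONS:
        problems.append(f"number of seasons must be between 1 and {MAX_SEASONS}")
    if not 1 <= shape.n_stations <= MAX_STATIONS:
        problems.append(f"number of stations must be between 1 and {MAX_STATIONS}")
    if shape.largest_group > MAX_GROUP_STATIONS:
        problems.append(f"a disaggregation group may hold at most {MAX_GROUP_STATIONS} stations")
    if shape.n_years > MAX_INPUT_YEARS:
        problems.append(f"the input file may hold at most {MAX_INPUT_YEARS} years")
    return problems


if __name__ == "__main__":
    monthly = DatasetShape(n_stations=4, n_years=50, n_seasons=12, largest_group=4)
    print(check_limits(monthly) or "within the SAMS 2000 limits")
```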
The purpose of this manual is to provide a detailed description of the current version of
SAMS developed for the stochastic simulation of hydrologic time series such as annual and monthly
streamflows.

2. DESCRIPTION OF SAMS
In section 2.1, a general description of
SAMS is presented in which different operations
undertaken by SAMS are briefly explained.
Then, each operation is explained and illustrated
in subsequent sections more thoroughly.
2.1 General Overview
SAMS is a computer software package that deals with the stochastic analysis, modeling,
and simulation of hydrologic time series. It is written in C and Fortran and runs under modern
windows operating systems such as WINDOWS NT and WINDOWS 98. The package consists of
many menu option windows which enable the user to choose between different options that are
currently available. SAMS 2000 is a modified and expanded version of SAMS-96.1. It consists of
three primary application modules: 1) Statistical Analysis of Data, 2) Fitting a Stochastic Model
(includes parameter estimation and testing), and 3) Generating Synthetic Series. Figure 1 shows the
SAMS main menu. The user can select any of the main modules by clicking on the desired option
shown in this menu. Before running the applications, the user must select (open) a file that contains
the (historical) input data. This can be done by clicking on the "File Menu" option shown on the top
part of the main menu. This will take the user to another menu, as shown in Fig.2. Then the user
may “Open A File” (select a data file) and “Display Current Data File” where the content of the
opened file can be seen. Examples of seasonal and annual input files are shown in Appendices A
and B, respectively.
Fig. 1 SAMS main menu
Fig. 2 File menu
SAMS has the capability of analyzing single site and multisite annual and seasonal data
and the results of the analysis are presented in graphical or tabular forms or are written on output
files. The current version of SAMS can be applied to annual and seasonal data, such as quarterly
and monthly data.
The “Statistical Analysis of Data” module consists of data plotting, checking the normality
of the data, data transformation, and data statistical characteristics. Plotting the data may help
detect trends, shifts, outliers, or errors in the data. Probability plots are included for verifying
the normality of the data. The data can be transformed to normal by using different transformation
techniques. Currently, logarithmic, power, and Box-Cox transformations are available. SAMS
determines a number of statistical characteristics of the data. These include basic statistics such as
mean, standard deviation, skewness, serial correlations (for annual data), season-to-season
correlations (for seasonal data), annual and seasonal cross-correlations for multisite data, and
drought, surplus, and storage related statistics. These statistics are important in investigating the
stochastic characteristics of the data.
The second main application of SAMS, “Fitting a Stochastic Model”, includes parameter
estimation and model testing for alternative univariate and multivariate stochastic models. The
following models are included: (1) univariate ARMA(p,q) model, where p and q can vary from 1
to 10, (2) univariate GAR(1) model, (3) univariate periodic PARMA(p,q) model, (4) univariate
seasonal disaggregation, (5) multivariate autoregressive MAR(p) model, (6) contemporaneous
multivariate CARMA(p,q) model, where p and q can vary from 1 to 10, (7) multivariate periodic
MPAR(p) model, (8) multivariate annual (spatial) disaggregation model, and (9) multivariate
temporal disaggregation model. Two estimation methods are available, namely the method of
moments (MOM) and the least squares method (LS). MOM is available for most of the models
while LS is available only for univariate ARMA, PARMA, and CARMA models. For CARMA
models, both the method of moments (MOM) and the method of maximum likelihood (MLE) are
available for estimation of the variance-covariance (G) matrix. Regarding multivariate annual
(spatial) disaggregation models, parameter estimation is based on Valencia-Schaake or Mejia-
Rousselle methods, while for annual to seasonal (temporal) disaggregation Lane's condensed method
is applied.
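To make the idea of moment estimation concrete, the following Python sketch fits the simplest autoregressive case, an AR(1) model, by the method of moments. It is an illustration under stated assumptions and not the SAMS code: the function names are hypothetical, and the estimators used are the textbook ones (lag-1 serial correlation for the autoregressive coefficient, and the sample variance times one minus its square for the noise variance).

```python
from statistics import fmean


def lag_correlation(x, k):
    """Sample lag-k serial correlation coefficient r_k of the series x."""
    n = len(x)
    m = fmean(x)
    c0 = sum((v - m) ** 2 for v in x) / n
    ck = sum((x[t] - m) * (x[t + k] - m) for t in range(n - k)) / n
    return ck / c0


def fit_ar1_mom(x):
    """Method-of-moments estimates for y[t] - mean = phi * (y[t-1] - mean) + e[t]."""
    n = len(x)
    mean = fmean(x)
    variance = sum((v - mean) ** 2 for v in x) / (n - 1)
    phi = lag_correlation(x, 1)               # AR coefficient from r_1
    noise_variance = variance * (1.0 - phi ** 2)
    return {"mean": mean, "variance": variance, "phi": phi,
            "noise_variance": noise_variance}


if __name__ == "__main__":
    # Made-up annual flows, used only to show the call.
    flows = [102.0, 95.0, 110.0, 121.0, 99.0, 87.0, 93.0, 105.0, 118.0, 108.0]
    print(fit_ar1_mom(flows))
```

Higher-order ARMA(p,q) models need more moment equations than this, and the least squares method instead minimizes the sum of squared residuals numerically; neither is shown here.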
For stochastic simulation at several sites in a stream network system, a direct modeling
approach based on multivariate autoregressive and CARMA processes is available for annual data
and a multivariate periodic autoregressive process is available for seasonal data. In addition, two
schemes based on disaggregation principles are available. For this purpose, it is convenient to
divide the stations into key stations, substations, and subsequent stations. Generally the key stations
are the farthest downstream stations, substations are the next upstream stations, and subsequent
stations are the next further upstream stations. In the first scheme, the annual flows at the key
stations are added, creating an annual flow series at an “artificial or index station”. Subsequently, a
univariate ARMA(p,q) model is fitted to the annual flows of the index station. Then, a spatial
disaggregation model relating the annual flows of the index station to the annual flows of the key
stations is fitted. Further, a statistical disaggregation model relating the annual flows of the key
stations to those of the substations and another disaggregation model relating the annual flows of the
substations and the subsequent stations are fitted. In fact, this is a three-level (spatial)
disaggregation procedure. In the second scheme a multivariate AR(p) model is fitted to the annual
data of the key stations, then the rest of the modeling, relating the annual flows at the key stations,
substations, and subsequent stations, is conducted in a similar manner as in the first scheme.
Furthermore, if the objective of the modeling exercise is to generate seasonal data by using
disaggregation approaches, then an additional temporal disaggregation model is fitted that relates
the annual flows of a group of stations with the corresponding seasonal flows.
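The first step of the first scheme, building the index station, amounts to a year-by-year sum of the key station flows. A minimal Python sketch is given below; the function name `index_station`, the station names, and the flow values are hypothetical and serve only to show the aggregation.

```python
def index_station(key_flows):
    """Add the annual flows of the key stations year by year.

    key_flows maps a station name to its list of annual flows; all
    stations must cover the same years in the same order.
    """
    series = list(key_flows.values())
    n_years = len(series[0])
    if any(len(s) != n_years for s in series):
        raise ValueError("all key stations must have the same number of years")
    return [sum(s[t] for s in series) for t in range(n_years)]


if __name__ == "__main__":
    # Two made-up key stations with three years of annual flows each.
    key = {"Key A": [120.0, 95.0, 140.0], "Key B": [60.0, 55.0, 72.0]}
    print(index_station(key))   # annual flows of the artificial index station
```

A univariate model would then be fitted to the returned series, and the spatial disaggregation models would distribute its generated values back to the key stations.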
The third main application of SAMS is “Generating Synthetic Series”, i.e. simulating
synthetic data. Data generation is based on the models, approaches, and schemes as mentioned
above. The model parameters for data generation can be those which are estimated by SAMS or
they can be provided by the user. If provided by the user, the program prompts the user to insert the
model type and then the model parameters. The statistical characteristics of the generated data are
presented in graphical or tabular forms along with the historical statistics of the data that was used
in fitting the generating model. The generated data including the "generated" statistics can be
displayed graphically or in table form, and be printed and/or written on specified output files. As
a matter of clarification, we will summarize here the overall data generation procedure for
generating seasonal data based on scheme 2:
(a) a multivariate AR(p) model is used to generate annual flows at the key stations;
(b) a spatial disaggregation model is used to disaggregate the generated annual flows at the key
stations into annual flows at the substations;
(c) a spatial disaggregation model is used to disaggregate the generated annual flows at the substations
into annual flows at subsequent stations;
(d) a temporal disaggregation model is used to disaggregate the annual flows at a group of stations
into the corresponding seasonal flows at those stations.
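The comparison of "generated" and historical statistics described above can be illustrated with a much simpler stand-in than scheme 2. The Python sketch below generates several synthetic annual series from a univariate AR(1) model with normal noise and averages their means and standard deviations for comparison with the historical values. It is an assumption-laden illustration: the function names are hypothetical, the AR(1) model replaces the multivariate and disaggregation models of SAMS, and the flows are made up.

```python
import random
from statistics import fmean, stdev


def generate_ar1(mean, sd, phi, n_years, rng):
    """Generate one synthetic series from a stationary AR(1) model with normal noise."""
    noise_sd = sd * (1.0 - phi ** 2) ** 0.5
    z = rng.gauss(0.0, sd)        # start from the stationary distribution
    out = []
    for _ in range(n_years):
        z = phi * z + rng.gauss(0.0, noise_sd)
        out.append(mean + z)
    return out


def compare_statistics(historical, phi, n_samples=100, seed=1):
    """Average the mean and standard deviation over n_samples generated series."""
    rng = random.Random(seed)
    h_mean, h_sd = fmean(historical), stdev(historical)
    n = len(historical)
    samples = [generate_ar1(h_mean, h_sd, phi, n, rng) for _ in range(n_samples)]
    return {
        "historical": {"mean": h_mean, "sd": h_sd},
        "generated": {"mean": fmean([fmean(s) for s in samples]),
                      "sd": fmean([stdev(s) for s in samples])},
    }


if __name__ == "__main__":
    flows = [10.0, 12.0, 9.0, 11.0, 13.0, 8.0, 10.5, 11.5]   # made-up record
    print(compare_statistics(flows, phi=0.3, n_samples=200))
```

Each generated sample has the same length as the historical record, so that sampling variability in the generated statistics is comparable to that of the historical ones.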
In modeling and data generation of complex water resources systems involving many
stations, despite the versatility of SAMS 2000, keeping track of different options, components,
parameters, etc. involved can be a time consuming and confusing task. To help alleviate this
problem, a “Status” button (see Fig.3) can be activated. The user can review the current
transformation, modeling, and generation status and related information by clicking on the “Status”
button in any menu or window.

2.2 Statistical Analysis of Data


Figure 3 shows the statistical analysis data menu. By selecting the annual or seasonal button
the user can specify the type of data to be analyzed. Then, the following operations can be selected:
1. Plot time series data.
2. Check normality and transform time series.
3. Statistical characteristics of time series.
In the following sections, we will examine and illustrate each of these options.
Fig.3 Statistical analysis menu
Plot Time Series Data
Plotting of the data can help in detecting trends, shifts, outliers, and errors in the data.
SAMS can plot the data as curve, stick, and bar graphs. Figure 4 illustrates a time series plot for
annual data. The scale of the plot is determined based on the sample maximum and minimum as
shown in the control bar at the bottom, but the user can change it by keying in the desired graph scale
range. This enables the user to zoom in and out of the plot to examine the data and to check the
variability of the data graphically on screen. Note that if the station names or IDs are available
in the input data file, they will be shown on the plots or tables.

Fig.4 Plotting of annual time series

Check Normality and Transform Time series


SAMS tests the normality of the data by plotting the data on normal probability paper and
by using the skewness test of normality. To examine the adequacy of the transformation, the
theoretical generated distribution based on the transformation and the corresponding historical
sample distribution are plotted together, as shown in Fig. 5 for annual data. For seasonal data, the
results of the seasonal skewness tests are presented in graphical and tabular formats. The critical
values of the test are also shown on the screen as guides for checking whether the data are within the
normal range. For example, if the sample skewness coefficient for a given season is less than or equal to
the critical value, the hypothesis of normality of the data cannot be rejected. On the other hand, if
the sample skewness coefficient is greater than the critical value, the hypothesis of normality is
rejected. In addition, for the specified season, the normal probability plot for the transformed
seasonal data and the comparison of the theoretical generated distribution and the sample
distribution for that season are also displayed.
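The decision rule of the skewness test can be sketched in a few lines of Python. The sketch below is an illustration, not the SAMS implementation. It assumes the common large-sample approximation in which the sample skewness of normal data has variance 6/N, so that the critical value is the standard normal quantile times the square root of 6/N; SAMS may use tabulated critical values instead, especially for short records. The comparison is made on the absolute value of the skewness, and the function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, fmean


def skewness(x):
    """Sample skewness coefficient (third central moment over variance**1.5)."""
    n = len(x)
    m = fmean(x)
    m2 = sum((v - m) ** 2 for v in x) / n
    m3 = sum((v - m) ** 3 for v in x) / n
    return m3 / m2 ** 1.5


def skewness_test(x, alpha=0.05):
    """Return (skewness, critical value, True if normality is not rejected)."""
    g = skewness(x)
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    critical = z * sqrt(6.0 / len(x))     # large-sample approximation (assumed)
    return g, critical, abs(g) <= critical


if __name__ == "__main__":
    flows = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 100.0]   # made-up, strongly skewed
    g, critical, accepted = skewness_test(flows)
    print(f"skewness = {g:.3f}, critical value = {critical:.3f}, normal: {accepted}")
```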

Fig.5 Annual data transformation result

If the data at hand is not normal, one can check whether it can be normalized by a certain
transformation function. This can be done by clicking on the "Transformations" button, after which a
menu with different types of transformations will appear. Fig. 6 shows the transformation menu for seasonal
data. The user can choose any type of transformation by simply clicking on the corresponding
button. Three types of transformations are available: logarithmic, power, and Box-Cox
transformations. The transformation can be done all at once for all seasons or on a season by season
basis. The user can choose any of the above transformations and accordingly key in the
transformation coefficients, then click the "Display" button to preview the transformation result.
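For orientation, the three transformation types can be written as short Python functions. The forms below, ln(x + a), (x + a)^b, and ((x + a)^b - 1)/b, are the usual textbook definitions and are assumed here; the exact definitions used by SAMS are those given in the section on data transformations of this manual. The function names and the coefficient values in the example are hypothetical.

```python
from math import log


def log_transform(x, a=0.0):
    """Logarithmic transformation y = ln(x + a)."""
    return [log(v + a) for v in x]


def power_transform(x, a=0.0, b=0.5):
    """Power transformation y = (x + a) ** b."""
    return [(v + a) ** b for v in x]


def box_cox_transform(x, a=0.0, b=0.5):
    """Box-Cox transformation y = ((x + a) ** b - 1) / b, with the log as the limit b = 0."""
    if b == 0:
        return log_transform(x, a)
    return [((v + a) ** b - 1.0) / b for v in x]


def transform_by_season(data, coefficients):
    """Apply a logarithmic transformation with a separate coefficient a per season.

    data[year][season] holds the flows; coefficients[season] holds a.
    """
    return [[log(v + coefficients[s]) for s, v in enumerate(year)] for year in data]


if __name__ == "__main__":
    print(box_cox_transform([4.0, 9.0, 16.0], a=0.0, b=0.5))
    print(transform_by_season([[1.0, 2.0], [3.0, 4.0]], coefficients=[0.0, 1.0]))
```

The last function mirrors the season-by-season option: each season receives its own coefficient, while a single coefficient repeated for all seasons corresponds to transforming all seasons at once.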
Clicking on the "Accept Transformation" button will actually conduct the transformation for
the data of the current station and store the transformation type and coefficients in memory. From this
point, SAMS will recognize the transformed data as the default data and will process this data
instead of the original data. For clarification, suppose that the user has chosen to transform the
annual data for site 1 by a logarithmic transformation and accepted the transformation by clicking
on the "Accept Transformation" button. Suppose further that the user wants to model site 1 data
with an ARMA (p,q) model. Then, the ARMA model will be fitted to the transformed data and not
the original data. The question that can be raised here is: can I get the model to fit the original data
(without having to start the whole process over again)? The answer is yes. You can get your original
data back by clicking again on the "Transformation" button, then choosing the "No Transformation"
button (shown at the bottom in Fig.6), and then in the next window (refer to Figs. 5 and 7) using
"Accept Transformation" to retrieve the original data.
Fig.6 Transformation menu for seasonal data
The save option (refer to Figs. 5 and 7) allows the user to save the transformation
parameters in a special file. Before clicking on "Save", remember to actually transform the data
by clicking on "Accept Transformation". Clicking on the "Save" button will prompt a file menu
and allow the user to select the file name (with an extension ".atr" and ".str" automatically attached
for annual and seasonal data, respectively) for storing the transformation parameters. This will
enable the user to access the transformation parameters at any other time. To understand this
convenient feature of SAMS, suppose that a user transformed the data and fitted the PARMA (1,1)
model to the data. Subsequently, the user wants to fit a different model to the transformed data.
Instead of doing the transformation process over again, the user can simply open the transformation
file which was saved previously. The user can access this file by clicking on the
"Transformation" button and then on the "Open File that Contains Transformation Parameters"
button. After the file has been opened, one must click on "Accept Transformation" to actually
transform the data. For multisite data, instead of clicking on "Accept Transformation" for each site,
the user can simply click (once) on "Transform all sites" to conduct the data transformation for all
sites. Figure 7 shows an example of seasonal transformation results. In the example the
logarithmic transformation has been used with varying values of the coefficient a.

Fig.7 Seasonal data transformation results

The steps that are usually involved in using the transformation window option presented in
Figs. 5 and 7 are summarized below:
1. To check normality of data and use transformation options:
• Key in the proper site number.
• Key in the season number (available for seasonal data only).
• Click on "Transformation" button.
• From the transformation menu (for instance see Fig.6 for seasonal data), select a
transformation type.
• Click "Display" on the next window (for instance see Figs. 5 and 7).
• Key in the transformation coefficients (if necessary) and click "Display". See the
results and try other coefficients as needed.
2. To actually transform the data by using the selected transformation type and coefficients:
• Click on "Accept Transformation" button.
3. To save the selected transformation type and coefficients in a file:
• Click on "Save" button (previously you must have clicked on "Accept Transformation").
4. To transform data by loading the previously saved transformation parameter file:
• Click on "Transformation" button and choose "Open File that Contains
Transformation Parameters" to open the transformation coefficients file.
• Click on "Transform all sites".
It is suggested that if transformations are needed for both annual and seasonal data, the user
should conduct annual data transformation before conducting seasonal data transformation.
Statistical Characteristics of Time Series
A number of statistical characteristics can be calculated for the original and transformed data.
They are available in graphical and tabular formats and can be saved in an output file. These are
summarized below.
- For Annual Data:
• Basic statistics such as mean, standard deviation, skewness coefficient, coefficient of
variation, maximum, and minimum values.
• Serial correlation coefficients.
• Cross-correlation coefficients for multisite data.
• Drought, surplus (flood), and storage related statistics.
Figure 8 shows the annual statistical characteristics menu.
Fig.8 Annual statistical characteristics menu
- For Seasonal Data:
• Basic statistics such as seasonal means, standard deviations, skewness coefficients,
coefficients of variation, maximum, and minimum values.
• Season-to-season correlation coefficients.
• Season-to-season cross-correlation coefficients for multisite data.
• Drought, surplus (flood), and storage related statistics.
Figure 9 shows the seasonal statistical characteristics menu.
Fig.9 Seasonal statistical characteristics menu
The menus shown in Figs. 8 and 9 are used to select the desired statistics for annual and
seasonal data, respectively. By clicking on the "Save Statistics" button the calculated statistics can
be saved in files with extensions ".ast" and ".sst" for annual and seasonal time series, respectively.
Depending on the user’s selection of the type of statistic, a window similar to the one shown in
Fig.10 will appear. The "Graph" button permits the user to view the statistics of the data in
graphical format. For example, Fig.10 shows the plot of the season-to-season correlation
coefficients for monthly data. The two dashed gray lines represent the 95% limits. If a correlation
coefficient lies between these two lines, it means that the correlation is not statistically significant.
The "List" button provides the same information in tabular format. The user must key in the needed
information such as the site(s) number and other pertinent data depending on the window at hand
and then click on the "Graph" or "List" buttons to view the results. For instance, the stations
indicated in Fig. 10 are stations 1, 2, 3, and 4 and the time lags for calculating the season-to-season
correlations are 1, 2, and 3. The season-to-season correlation results are shown for up to 4 stations.
If more than 4 stations (sites) are specified, say 7, then after viewing the results for the first 4
stations, clicking on the "Next" button will enable one to view the results of the remaining 3 stations.
Fig.10 Window showing the season to season correlations of seasonal data
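A season-to-season correlation of the kind plotted in Fig.10 pairs each value of a given season with the value observed a given number of seasons earlier, reaching back into the previous year when needed. The Python sketch below shows one way to compute it. It is an illustration rather than the SAMS code: the function names are hypothetical, and the 95% limits use the simple large-sample approximation of plus or minus 1.96 divided by the square root of the number of pairs, which is an assumption and may differ from the limits drawn by SAMS.

```python
from math import sqrt
from statistics import fmean


def season_to_season_correlation(data, season, lag):
    """Correlation between season `season` and the value `lag` seasons earlier.

    data[year][season] holds the flows; seasons are numbered from 0.
    Returns the correlation coefficient and the number of pairs used.
    """
    n_seasons = len(data[0])
    pairs = []
    for year in range(len(data)):
        t = year * n_seasons + season - lag    # position of the earlier value
        if t < 0:
            continue                           # earlier value precedes the record
        pairs.append((data[year][season], data[t // n_seasons][t % n_seasons]))
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]
    mx, my = fmean(xs), fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in pairs)
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy), len(pairs)


def approximate_95_limits(n_pairs):
    """Large-sample 95% limits for a zero correlation (assumed approximation)."""
    half_width = 1.96 / sqrt(n_pairs)
    return -half_width, half_width


if __name__ == "__main__":
    # Made-up record: 4 years with 2 seasons per year.
    record = [[1.0, 2.5], [2.0, 3.5], [3.0, 6.5], [4.0, 7.0]]
    r, n = season_to_season_correlation(record, season=1, lag=1)
    low, high = approximate_95_limits(n)
    print(f"r = {r:.3f}, pairs = {n}, significant: {not low <= r <= high}")
```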
2.3 Fitting a Stochastic Model
The LAST package included several programs addressing different objectives in the
stochastic modeling of time series. The basic procedure involved modeling and generating the
annual time series using a multivariate AR(1) or AR(2) model, then using a disaggregation model
to disaggregate the generated annual flows to their corresponding seasonal flows. In contrast,
SAMS has two major modeling strategies which are direct and indirect modeling. Direct modeling
means fitting a stationary model (univariate ARMA or multivariate AR or CARMA) directly to the
annual data or fitting a periodic (seasonal) model (univariate PARMA or multivariate PAR) directly
to the seasonal data of the system at hand. Annual to seasonal disaggregation modeling, on the other
hand, is an indirect procedure since the modeling of seasonal data also involves modeling of the
corresponding annual data. Figure 11 displays the referred direct or indirect (using
disaggregation) modeling procedures under annual or seasonal categories. Regardless of whether the
input data available is annual or seasonal (for example monthly data), the user must select
the “annual” button if the final objective of the modeling exercise is to generate annual flows only.
Otherwise, if the objective is to generate monthly quantities then the seasonal button must be
selected.
The following specific models are currently available in SAMS under each category:
1. For Annual Modeling:
   - Univariate ARMA(p,q) model.
   - Univariate GAR(1) model.
   - Multivariate AR(p) model (MAR).
   - Contemporaneous ARMA(p,q) model (CARMA).
   - Multivariate annual (spatial) disaggregation.
2. For Seasonal Modeling:
   - Univariate PARMA(p,q) model.
   - Univariate seasonal disaggregation.
   - Multivariate PAR(p) model (MPAR).
   - Multivariate seasonal disaggregation.

Fig.11 Stochastic modeling menu

Figures 12 and 13 display the menus that can be used for selecting annual and seasonal
models, respectively. The user will need to click on the button corresponding to the desired model
and in turn a modeling menu will appear where the site number, the model order, etc. can be
specified. For example, Fig.14 shows a menu that can be used to fit a PARMA(p,q) model. Similar
menus are available for ARMA, GAR(1), MAR, CARMA, and MPAR models. The user needs to specify the
station or site number(s). If standardization of the data is desired, one must click on the
"Standardize Data" button. Generally, the modeling is performed with data from which the mean has
been subtracted. Standardization implies that, in addition to subtracting the mean, the data are
further transformed to have a standard deviation equal to one. For example, for the data of season
5, the mean for season 5 is subtracted from each observed data point of that season, and the result
is divided by the standard deviation of the 5th season. As a result, the mean and the standard
deviation of the standardized data of the 5th season become equal to zero and one, respectively.
Then, the order of the model to be fitted can be selected by clicking on the "Enter model order"
button. For instance, one must enter p and q for ARMA models. In the case of MAR or MPAR models,
the user needs to key in the order p only. Subsequently, the method of estimation of the model
parameters must be selected.

Fig.12 Annual stochastic modeling menu

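
The seasonal standardization described above can be sketched in a few lines of Python. This is only an illustration and not SAMS code: the function name `standardize_by_season` and the layout of `data` (a list of years, each holding one value per season) are assumptions made for the example, and the standard deviation is computed with a 1/N divisor.

```python
import math

def standardize_by_season(data):
    """Standardize data[v][tau] (year v, season tau) season by season.

    Hypothetical helper for illustration; returns the standardized values
    together with the seasonal means and standard deviations.
    """
    n_years = len(data)
    n_seasons = len(data[0])
    means, stds = [], []
    for tau in range(n_seasons):
        column = [data[v][tau] for v in range(n_years)]
        mean = sum(column) / n_years
        # Standard deviation with a 1/N divisor (assumption of this sketch).
        std = math.sqrt(sum((y - mean) ** 2 for y in column) / n_years)
        means.append(mean)
        stds.append(std)
    z = [[(data[v][tau] - means[tau]) / stds[tau] for tau in range(n_seasons)]
         for v in range(n_years)]
    return z, means, stds
```

After this step each season of `z` has mean zero and standard deviation one, which is the property stated in the text for the 5th season.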
Currently SAMS provides two methods of estimation namely the method of moments
(MOM) and the least squares (LS) method. MOM is available for the ARMA(p,q), GAR(1),
MAR(p), PARMA(p,1), and MPAR(p) models while LS is available for ARMA(p,q), CARMA(p,q),
and PARMA(p,q) models. The LS method requires initial parameter estimates (starting points).
These starting points can be selected by the user, or the MOM parameter estimates can be used as
the starting points. For cases where the MOM estimates are not available, such as for the
PARMA(p,q) model where q>1, the MOM parameter estimates of the closest model will be used instead.
For example, for the PARMA(3,3) model, the MOM estimates of the PARMA(3,1) model (including
zeros for the two remaining parameters) will be used as the starting points. For fitting CARMA(p,q)
models, the residual variance-covariance G matrix can be estimated using either the method of
moments (MOM) or the maximum likelihood estimation (MLE) method (Stedinger et al., 1985).

Fig.13 Seasonal stochastic modeling menu

The estimated model parameters can be saved in a file selected by the user. This can be done by
clicking on the "Save" button in the estimation of parameters window, and a menu will appear in
which the user can assign the file name, as shown in Fig.15. The file is written in a specific
format, and it is recommended that the user does not change or edit this file unless it is
necessary. Saving the parameters in a file is important since this file will be used by SAMS in the
generation of data, as described in the next sections.

Fig.14 SAMS modeling menu

Fig.15 SAMS model parameter window

After the model has been fitted and the estimated parameters have been saved, it is
recommended that the fitted model be tested to ensure that it is appropriate for the data at hand. In
general, this can be done by testing the residuals and comparing the model and historical properties
of the data. SAMS has the ability to perform such testing. Testing of the residuals is an important
part of the modeling process by which the modeler can test whether the fitted model is adequate.
In all the models available in the current version of SAMS except the GAR(1) model, the basic
assumptions about the residuals are that they are normal and independent. SAMS performs certain
statistical tests to check the validity of these assumptions. The hypothesis that the residuals are
normally distributed is tested based on the skewness test of normality. The results are presented in
terms of rejecting or not rejecting the hypothesis. In addition, the residuals are plotted on normal
probability paper in order to check graphically whether the residuals are normally distributed. For
testing the independence of the residuals, the Portmanteau test of independence (Salas et al., 1980)
is utilized. The correlogram of the residuals is also plotted to help the user in checking the
independence of the residuals. Figure 16 shows an example of results of both normality and
independence tests of the residuals.

Fig.16 Testing the normality and the independence of the residuals

Once the model has been fitted to the data, the model moments, e.g. the theoretical covariance
structure, can be calculated based on the estimated parameters. Comparing the model and historical
covariance (correlation) structure is another method of testing. SAMS provides the user with the
ability to perform such comparisons. The user must click on the "Comparing Model and Historical
Correlations" button and then a window will appear in which the theoretical and historical
correlograms are presented in graphical or tabular format. Figure 17 is an example of a graphical
comparison of model and historical month-to-month correlations. Additional examination of the
model can be made regarding model parsimony. The Akaike Information Criterion (AIC)
may be used for this purpose. SAMS uses the AIC for testing model parsimony when stationary ARMA
models are utilized.

Fig.17 Comparing the model and the historical correlograms

Figure 18 illustrates the seasonal disaggregation menu when scheme 1 is chosen under
multivariate seasonal disaggregation (refer to Fig.13). In disaggregation modeling, the user should
conduct the process step by step following the order of the menu. The steps that have been completed
are marked successively with relevant text or double arrows to update the user. At the end of
disaggregation modeling, the user may click on "Definition of Spatial and Temporal Adjustment"
to define the "adjustment methods" (refer to Fig.19) and the corresponding system structure (refer
to Fig.20) for the stations (sites) that are subject to modeling. This is necessary if adjustments
are needed for the generated series. The "system structure for adjustment" usually depends upon the
order and position of the stations relative to each other. This is important when adjustments need
to be made to the generated series based on spatial disaggregation. Defining the system structure
means defining, for each main river system, the sequence of stations (sites) that form the river
network.

Fig.18 Seasonal disaggregation modeling menu

Fig.19 Spatial and temporal adjustment method menu

SAMS uses the concept of key stations and subkey stations (substations and subsequent stations). A
key station is the farthest downstream station along a main stream. For instance, station 1 is a
key station in the river system shown in Fig.21. Likewise, stations 2 and 3 are also key stations.
On the other hand, if station 1 did not exist (or were not used in the analysis), then stations 4
and 5 would become key stations. Let us continue the explanation assuming that stations 1, 2, and 3
in Fig.21 are key stations. Substations are the next upstream stations draining to a key station.
For instance, stations 4 and 5 are substations draining to key station 1. Likewise, stations 6 and
7 are substations for key station 2, and stations 8 and 9 are substations for key station 3.
Subsequent stations are the next upstream stations draining into a substation. For instance,
stations 11 and 12 are subsequent stations relative to substation 5, and station 10 is a subsequent
station relative to substation 4.
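
The key station, substation, and subsequent station terminology can be illustrated with a small Python sketch. It is not part of SAMS: the dictionary `DOWNSTREAM` and the functions `classify` and `drop_station` are hypothetical names, and the network contains only the stations of Fig.21 that are named in the text above.

```python
# Downstream neighbor of each station, following the description of Fig.21.
# None means the station is the farthest downstream station on its stream.
DOWNSTREAM = {1: None, 2: None, 3: None,
              4: 1, 5: 1, 6: 2, 7: 2, 8: 3, 9: 3,
              10: 4, 11: 5, 12: 5}

def classify(downstream):
    """Label each station as 'key', 'substation', or 'subsequent'."""
    roles = {}
    for station, below in downstream.items():
        if below is None:
            roles[station] = "key"           # nothing downstream of it
        elif downstream[below] is None:
            roles[station] = "substation"    # drains directly to a key station
        else:
            roles[station] = "subsequent"    # drains to a substation
    return roles

def drop_station(downstream, removed):
    """Remove a station; stations that drained to it lose their downstream link."""
    return {s: (None if b == removed else b)
            for s, b in downstream.items() if s != removed}
```

With this layout, `classify(DOWNSTREAM)` labels stations 1, 2, and 3 as key stations, and `classify(drop_station(DOWNSTREAM, 1))` turns stations 4 and 5 into key stations, as the text describes.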
On the other hand, for defining a "disaggregation configuration" SAMS uses the concept of groups.
As shown in Fig.22, a group consists of one or more key stations and their corresponding
substations. Groups must be defined in each disaggregation step. Each group contains a certain
number of stations to be modeled in a multivariate fashion or "jointly" in order to preserve their
cross-correlations. For instance, if a certain group has two key stations and three substations,
then the disaggregation process will preserve the cross-correlations between all the key stations
and the substations. On the other hand, if two separate groups are selected, then the
cross-correlations between the stations that belong to the same group will be preserved, but the
cross-correlations between stations belonging to different groups will not be preserved.

Fig.20 System structure input menu for key station and substations

The definition of a group is very important in the disaggregation process. For instance,
referring to Fig.22, key stations 1 and 2 and substations 4, 5, 6, and 7 form one group in which the
flows of all these stations are modeled jointly in a multivariate framework, while key station 3 and
its substations 8 and 9 form another group. In this case, the cross-correlations between the stations
within each group will be preserved but the cross-correlations among stations in different groups
will not be preserved. For example, in the above configuration, the cross-correlations between
stations 1 and 3 will not be preserved but the cross-correlations between stations 1 and 2 will be
preserved. On the other hand, if all the stations are defined in a single group, then the
cross-correlations between all the stations will be preserved. In the final step of disaggregation,
a group may contain stations 4, 5, 10, 11, and 12. In the current version of SAMS, the total
combined number of stations in any defined group must not exceed 10.

Fig.21 Schematic representation of a streamflow network

Fig.22 Disaggregation configuration input menu for key station and substations

After modeling the annual flows using the above configuration, the annual flows can be
disaggregated into seasonal flows. This is handled again by using the concept of groups as
explained above. The user, for example, can choose stations 3, 8, 9, 17, 18, and 19 as one group.
In this case, the annual flows for these stations will be disaggregated into seasonal flows by a
multivariate disaggregation model so as to preserve the seasonal cross-correlations between all the
stations.
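
As a rough illustration of how the choice of groups determines which cross-correlations are preserved, the following Python sketch encodes the two-group example given above. The names `validate_groups` and `cross_correlation_preserved` are hypothetical and do not correspond to any SAMS routine; the limit of 10 stations per group is the one stated in the text.

```python
MAX_STATIONS_PER_GROUP = 10  # limit stated for the current version of SAMS

def validate_groups(groups):
    """Raise ValueError if any group holds more stations than allowed."""
    for group in groups:
        if len(group) > MAX_STATIONS_PER_GROUP:
            raise ValueError("group exceeds %d stations" % MAX_STATIONS_PER_GROUP)

def cross_correlation_preserved(groups, station_a, station_b):
    """True if both stations are modeled jointly in at least one group."""
    return any(station_a in g and station_b in g for g in groups)

# Two groups as in the example: key stations 1 and 2 with substations 4 to 7,
# and key station 3 with substations 8 and 9.
groups = [{1, 2, 4, 5, 6, 7}, {3, 8, 9}]
validate_groups(groups)
```

Here `cross_correlation_preserved(groups, 1, 2)` is true while `cross_correlation_preserved(groups, 1, 3)` is false, matching the statement that correlations between stations of different groups are not preserved.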
Currently, SAMS has two schemes for modeling the key stations. The first scheme, denoted
as scheme 1 (see the modeling menus of Figs.12 and 13), aggregates the annual flows of the key
stations that belong to a certain group and uses a univariate ARMA(p,q) model for the aggregated
flows. The aggregated annual flows are then disaggregated (spatially) back to each key station by
using the Valencia and Schaake or the Mejia and Rousselle disaggregation method. The second
scheme, denoted as scheme 2, models the annual flows of the key stations belonging to a given
group by a multivariate MAR(p) model. Once the flows at key stations are modeled, the rest of the
procedure for generating annual flows at all substations and subsequent stations and then for
generating the seasonal flows at all stations is the same as in scheme 1 (as mentioned above).
Additional details about disaggregation modeling are shown in chapter 3, where a mathematical
description of the disaggregation methods is presented, and in chapter 4, where an example of
disaggregation modeling applied to real data is given.
2.4 Generating Synthetic Series
Data generation is an important subject in stochastic hydrology and has received a lot of
attention in hydrologic literature. Data generation is used by hydrologists for many purposes. These
include, for example, reservoir sizing, planning and management of an existing reservoir, and
reliability of a water resources system such as a water supply or irrigation system (Salas et al., 1980).
Stochastic data generation can aid in making key management decisions, especially in critical
situations such as extended drought periods (Frevert et al., 1989). The main philosophy behind
synthetic data generation is that synthetic samples are generated which preserve certain statistical
properties that exist in the natural hydrologic process (Lane and Frevert, 1990). As a result, each
generated sample and the historic sample are equally likely to occur in the future. The historic
sample is not more likely to occur than any of the generated samples (Lane and Frevert, 1990).
Generation of synthetic time series is based on the models, approaches and schemes
presented in section 2.3 of this manual. Once the model has been defined and the parameters have
been estimated, one can generate synthetic samples based on this model. SAMS allows the user to
generate synthetic data and eventually compare important statistical characteristics of the historical
and the generated data. Such comparison is important for checking whether the model used in
generation is adequate or not. If important historical and generated statistics are comparable, then
one can argue that the model is adequate. The generated data is stored in a file. This allows the user
to further analyze the generated data as needed. Furthermore, when data generation is based on
spatial or temporal disaggregation, one may wish to adjust the generated data. This
may be necessary in many cases to ensure that the disaggregated quantities add up
to the original total quantity. For example, spatial adjustments may be necessary if the annual flow
at a key station is exactly the sum of the annual flows at the corresponding substations. Likewise,
in the case of temporal disaggregation, one may wish to ensure that the monthly values
add up to the annual value. Various adjustment options are included in SAMS. Spatial and
temporal adjustments are described further in Section 4.8.2.
Figure 23 shows the data generation menu. In this menu the user must specify the necessary
information for the generation process. The type of data to generate (either annual or seasonal)
and the type of modeling, which is either univariate (single site) or multivariate (multisite),
must be selected. For example, if the user wants to generate annual data at a single station by
using an ARMA model, then the options "Annual" and "Single site" must be selected. On the other
hand, to generate seasonal data at several stations from a disaggregation model, one must select
"Seasonal" and "Multisite". In addition, the data length (in years), the number of samples to be
generated, and a seed number to initiate the generation process need to be specified. In this
version of SAMS, both the number of samples and the length of data to be generated are unlimited.
The user should consider, however, the computer time it will take to generate many samples or very
long samples, especially if the generation is to be done for multisite seasonal data.

Fig.23 SAMS generation menu

Furthermore, one of four options regarding the generation model, as shown in the dialog box
in Fig.23, must be chosen. One must select "Yes" if SAMS was used to fit the model from which
data are to be generated. On the other hand, if one would like to generate data using one of the
models available in SAMS, but the model was not fitted by SAMS, then the "No" option must be
selected. To illustrate this point further, assume that the user fitted an ARMA(1,1) model by
using an estimation method which is not available in the current version of SAMS, or by using a
different package, but wants to generate data using SAMS. Then, the user should select either the
first or the second "No" option to generate the required data. Another difference between the "Yes"
and the "No" options is that after generating the data SAMS will compare the generated and
historical statistics only if the "Yes" option is selected. In the second "No" option the user will open
a (parameter) file which must have the model parameters. This parameter file has to be in a certain
format to be recognized by SAMS. The format of this file must be exactly the same as the format
of the parameter file that SAMS generates after fitting a stochastic model as mentioned in section
2.3. To make sure of this, the user may like to run SAMS to generate a parameter file using the
model desired, then edit the parameter file to insert the new parameter set. Again for clarification,
consider the ARMA(1,1) model where a method different from those available in SAMS was
used to estimate the parameters. SAMS can be used to fit an ARMA(1,1) model to the same data
using, say, MOM estimation. Then the MOM parameters can be saved to a file and the file
can be edited to replace the MOM parameters with the desired set of parameters. In this case, the user
needs to change the parameters $\phi$, $\theta$, and $\sigma_\varepsilon^2$ (refer to Section 4.2 for details). One must be aware
that this file must also contain the transformation parameters if a transformation was used. Finally,
SAMS will generate data from the referred model based on the parameters contained in the edited
file.
After providing all the information needed for data generation, the user can click on the "Ok"
button shown in Fig.23. A generation menu will appear on the screen which will allow the user to
open the file which contains the model parameters. For example, Fig.24 will appear if the options
to generate single site and seasonal data were chosen. By clicking on the "Open Model Parameters
File" button, a window will appear which will allow the user to select the file that contains the
model parameters as shown in Fig.24. After clicking on the "Generate and Save Data" button (also
shown in Fig.24), another menu will appear so that a file name (with the extension ".gen"
automatically attached) can be assigned to store the generated data. If the generation is based on
a disaggregation model, a menu as shown in Fig.19 will appear to remind the user about the
adjustment methods (which should have been read from the previously referred parameter file). One
can also make changes to the adjustment methods at this point. Next, if statistical analysis of the
generated data is desired, the "Statistical Analysis of Generated Data" button must be clicked and
another menu box as in Fig.25 will appear which will enable one to view the results. For example,
the time series of the generated data will be shown by clicking on the "Plot Time Series" button.
In the case of analysis pertaining to drought, surplus, and storage related statistics, SAMS will
ask the user to input the desired threshold demand level, as shown in Fig.26. The default demand
level is the sample mean, but one can change it by keying in a fraction of the sample mean or the
actual desired demand level. The results of the statistical analysis of the generated data can be
saved into a file by clicking on the "Save Statistical Analysis" button. This will create a file
with the extension ".gst" automatically attached to store the results. Note that the statistical
comparison of the historical and generated data can also be used for further testing and verifying
whether the fitted model performs as desired.

Fig.24 Univariate seasonal generation menu

Fig.25 Seasonal statistical characteristics of generated data menu

Fig.26 Window regarding the demand level

Fig.27 Time series plots of the historical and generated annual flows

In estimating the generated statistics, the statistics of each generated sample are first
estimated; then the means and standard deviations of those statistics are computed and compared
with their historical counterparts. The results are presented in graphical or tabular formats.
Figure 27 shows a comparison of the (observed) historical annual series and the generated series
for one sample. The user can change the station number, sample number, and the graph scale as
needed. For annual series, the comparisons of the historical and generated mean, standard
deviation, skewness coefficient, coefficient of variation, and sample maximum and minimum are
presented in tabular form. For seasonal series, the comparisons are presented in both graphical and
tabular formats as shown in Fig.28. The comparisons of correlations for annual and seasonal data
may be presented in graphical or tabular formats as shown in Fig.29 (for seasonal data). The
comparisons of drought, surplus, and storage related statistics include the longest drought,
maximum deficit, longest surplus, maximum surplus, storage capacity, rescaled range, and Hurst
coefficient. Before showing these results, a window as in Fig.26 will pop up again to allow the
user to change the demand level if needed. The results are presented in tabular format and box
plots as shown in Fig.30. The box plots reflect the ratios of the means, quartiles, maximums, and
minimums of those statistics calculated from the generated series to the observed historical values.
The scale of the box plot can be adjusted by the user based on the ratio ranges provided in the
dialog box.
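
The two-stage summary described above (statistics per generated sample first, then the mean and standard deviation of those statistics across samples) can be sketched as follows. This is an illustrative Python fragment rather than the SAMS implementation; `mean_and_std` and `summarize_generated` are hypothetical names, only the mean and standard deviation are tracked, and a 1/N divisor is assumed.

```python
import math

def mean_and_std(values):
    """Mean and standard deviation (1/N divisor) of a list of numbers."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return mean, std

def summarize_generated(samples):
    """Summarize per-sample statistics across all generated samples.

    samples is a list of generated series. The result maps each statistic
    name to (mean across samples, standard deviation across samples).
    """
    per_sample = [mean_and_std(s) for s in samples]
    sample_means = [m for m, _ in per_sample]
    sample_stds = [s for _, s in per_sample]
    return {"mean": mean_and_std(sample_means),
            "std": mean_and_std(sample_stds)}
```

The entries of the returned dictionary are the quantities that would be placed next to the corresponding historical statistics in a comparison table.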

Fig.28 Comparison between the historical and the "generated" monthly mean and standard deviations

Fig.29 Comparisons of the historical and generated seasonal cross-correlations

Fig.30 Comparison of drought, surplus, and storage related statistics

Finally, the "Status" button has been added in all window menus in order to keep track of
all major results and options selected throughout the analysis, modeling, and generation exercise.
For example, by clicking on the "Status" button under any menu or window, the user can review the
transformation methods and coefficients utilized for each site, the fitted model including parameters
and adjustment options, etc., and information related to the data generation, as shown in Fig.31.

Fig.31 Example of update information regarding the transformation, modeling, and generation steps.
This view is shown by clicking on "Status"

3 DEFINITION OF STATISTICAL CHARACTERISTICS
A time series process can be characterized by a number of statistical properties such as the
mean, standard deviation, coefficient of variation, skewness coefficient, season-to-season
correlations, autocorrelations, cross-correlations, and storage and drought related statistics. These
statistics are defined for both annual and seasonal data as shown below.
3.1 Basic Statistics
3.1.1 Annual Data
The mean and the standard deviation of a time series $y_t$ are estimated by

$$\bar{y} = \frac{1}{N} \sum_{t=1}^{N} y_t \tag{3.1}$$

and

$$s = \sqrt{\frac{1}{N} \sum_{t=1}^{N} (y_t - \bar{y})^2} \tag{3.2}$$

respectively, where $N$ is the sample size. The coefficient of variation is defined as $c_v = s / \bar{y}$.
Likewise, the skewness coefficient is estimated by

$$g = \frac{\frac{1}{N} \sum_{t=1}^{N} (y_t - \bar{y})^3}{s^3} \tag{3.3}$$
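
A minimal Python sketch of the basic annual statistics defined above (mean, standard deviation, coefficient of variation, and skewness coefficient, all with a 1/N divisor) is given below. The function name `basic_stats` is a hypothetical choice for this illustration.

```python
import math

def basic_stats(y):
    """Return (mean, standard deviation, coefficient of variation, skewness)."""
    n = len(y)
    mean = sum(y) / n
    s = math.sqrt(sum((v - mean) ** 2 for v in y) / n)   # 1/N divisor
    cv = s / mean
    g = (sum((v - mean) ** 3 for v in y) / n) / s ** 3
    return mean, s, cv, g
```

For the series 2, 4, 4, 4, 5, 5, 7, 9 the function gives a mean of 5, a standard deviation of 2, a coefficient of variation of 0.4, and a skewness coefficient of 0.65625.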

The sample autocorrelation coefficients $r_k$ of a time series may be estimated by

$$r_k = \frac{m_k}{m_0} \tag{3.4}$$

where

$$m_k = \frac{1}{N} \sum_{t=1}^{N-k} (y_{t+k} - \bar{y})(y_t - \bar{y}) \tag{3.5}$$

and k = time lag. Likewise, for multisite series, the lag-k sample cross-correlations between site i
and site j, denoted by $r_k^{ij}$, may be estimated by

$$r_k^{ij} = \frac{m_k^{ij}}{\left(m_0^{ii}\, m_0^{jj}\right)^{1/2}} \tag{3.6}$$

where

$$m_k^{ij} = \frac{1}{N} \sum_{t=1}^{N-k} \left(y_{t+k}^{(i)} - \bar{y}^{(i)}\right)\left(y_t^{(j)} - \bar{y}^{(j)}\right) \tag{3.7}$$

in which $m_0^{ii}$ is the sample variance for site i.
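
The lag-k autocorrelation and cross-correlation estimators of this section can be sketched in Python as follows. The function names are hypothetical; both series passed to `cross_covariance` are assumed to have the same length N, and the 1/N divisor is kept for every lag, as in the definitions above.

```python
import math

def cross_covariance(yi, yj, k):
    """Lag-k cross-covariance between series yi (leading by k) and yj."""
    n = len(yi)
    mean_i = sum(yi) / n
    mean_j = sum(yj) / n
    # Sum over t = 1, ..., N - k (0-based indices 0, ..., N - k - 1).
    return sum((yi[t + k] - mean_i) * (yj[t] - mean_j)
               for t in range(n - k)) / n

def autocorrelation(y, k):
    """Lag-k autocorrelation r_k = m_k / m_0."""
    return cross_covariance(y, y, k) / cross_covariance(y, y, 0)

def cross_correlation(yi, yj, k):
    """Lag-k cross-correlation between site i and site j."""
    denom = math.sqrt(cross_covariance(yi, yi, 0) * cross_covariance(yj, yj, 0))
    return cross_covariance(yi, yj, k) / denom
```

As a quick check, the series 1, 2, 3, 4, 5 has a lag-1 autocorrelation of 0.4 under this estimator, and the lag-0 cross-correlation of a series with itself is 1.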

3.1.2 Seasonal Data
Seasonal hydrologic time series, such as monthly flows, are better characterized by seasonal
statistics. Let $y_{\nu,\tau}$ be the seasonal time series, where $\nu$ represents years and $\tau$ seasons; $\nu = 1, \ldots, N$
with $N$ = number of years, and $\tau = 1, \ldots, \omega$ with $\omega$ = number of seasons. The mean and standard
deviation for season $\tau$ can be estimated by

$$\bar{y}_\tau = \frac{1}{N} \sum_{\nu=1}^{N} y_{\nu,\tau} \tag{3.8}$$

and

$$s_\tau = \sqrt{\frac{1}{N} \sum_{\nu=1}^{N} (y_{\nu,\tau} - \bar{y}_\tau)^2} \tag{3.9}$$

respectively. The seasonal coefficient of variation is $c_{v\tau} = s_\tau / \bar{y}_\tau$. Similarly, the seasonal
skewness coefficient is estimated by

$$g_\tau = \frac{\frac{1}{N} \sum_{\nu=1}^{N} (y_{\nu,\tau} - \bar{y}_\tau)^3}{s_\tau^3} \tag{3.10}$$
The sample lag-k season-to-season correlation coefficient may be estimated by

$$r_{k,\tau} = \frac{m_{k,\tau}}{\left(m_{0,\tau}\, m_{0,\tau-k}\right)^{1/2}} \tag{3.11}$$

where

$$m_{k,\tau} = \frac{1}{N} \sum_{\nu=1}^{N} (y_{\nu,\tau} - \bar{y}_\tau)(y_{\nu,\tau-k} - \bar{y}_{\tau-k}) \tag{3.12}$$

in which $m_{0,\tau}$ represents the sample variance for season $\tau$. Likewise, for multisite series, the
lag-k sample cross-correlations between site i and site j, for season $\tau$, $r_{k,\tau}^{ij}$, may be estimated by

$$r_{k,\tau}^{ij} = \frac{m_{k,\tau}^{ij}}{\left(m_{0,\tau}^{ii}\, m_{0,\tau-k}^{jj}\right)^{1/2}} \tag{3.13}$$

and

$$m_{k,\tau}^{ij} = \frac{1}{N} \sum_{\nu=1}^{N} \left[y_{\nu,\tau}^{(i)} - \bar{y}_\tau^{(i)}\right]\left[y_{\nu,\tau-k}^{(j)} - \bar{y}_{\tau-k}^{(j)}\right] \tag{3.14}$$

in which $m_{0,\tau}^{ii}$ represents the sample variance for season $\tau$ and site i. Note that in Eqs. (3.11)
through (3.14), when $\tau - k < 1$, the terms $\nu = 1$, $y_{\nu,\tau-k}$, $\bar{y}_{\tau-k}$, $m_{0,\tau-k}$, $y_{\nu,\tau-k}^{(j)}$, $\bar{y}_{\tau-k}^{(j)}$, and
$m_{0,\tau-k}^{jj}$ are replaced by $\nu = 2$, $y_{\nu-1,\omega+\tau-k}$, $\bar{y}_{\omega+\tau-k}$, $m_{0,\omega+\tau-k}$, $y_{\nu-1,\omega+\tau-k}^{(j)}$, $\bar{y}_{\omega+\tau-k}^{(j)}$, and
$m_{0,\omega+\tau-k}^{jj}$, respectively.
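
The wrap-around rule for the case where τ − k < 1 is easier to follow in code. The sketch below computes the single-site lag-k season-to-season correlation in Python. The function name `season_to_season_corr` and the data layout (`data[v][s]` for year v and season s) are assumptions of this example, the lag is assumed to satisfy 1 ≤ k < ω, and the 1/N divisor is kept even when the sum starts at the second year.

```python
import math

def season_to_season_corr(data, tau, k):
    """Lag-k season-to-season correlation for season tau (1-based)."""
    n = len(data)            # number of years N
    omega = len(data[0])     # number of seasons

    def mean(s):
        return sum(row[s] for row in data) / n

    def var(s):
        m = mean(s)
        return sum((row[s] - m) ** 2 for row in data) / n

    cur = tau - 1            # 0-based index of season tau
    if tau - k >= 1:
        prev, shift, start = cur - k, 0, 0
    else:
        # tau - k < 1: pair with season omega + tau - k of the previous
        # year and start the sum at the second year.
        prev, shift, start = omega + cur - k, 1, 1
    m_cur, m_prev = mean(cur), mean(prev)
    m_k = sum((data[v][cur] - m_cur) * (data[v - shift][prev] - m_prev)
              for v in range(start, n)) / n
    return m_k / math.sqrt(var(cur) * var(prev))
```

For example, with twelve seasons and τ = 1, k = 1, each January value is paired with the December value of the preceding year.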

3.2 Storage, Drought, and Surplus Related Statistics


3.2.1 Storage Related Statistics
The storage-related statistics are particularly important in modeling time series for
simulation studies of reservoir systems. Such characteristics are generally functions of the
variance and autocovariance structure of a time series. Consider the time series $y_i$, i = 1, ..., N
and a subsample $y_1, \ldots, y_n$ with $n \le N$. Form the sequence of partial sums $S_i$ as

$$S_i = S_{i-1} + (y_i - \bar{y}_n), \quad i = 1, \ldots, n \tag{3.15}$$

where $S_0 = 0$ and $\bar{y}_n$ is the sample mean of $y_1, \ldots, y_n$, which is determined by Eq. (3.1). Then,
the adjusted range $R_n^*$ and the rescaled adjusted range $R_n^{**}$ can be calculated by

$$R_n^* = \max(S_0, S_1, \ldots, S_n) - \min(S_0, S_1, \ldots, S_n) \tag{3.16}$$

and

$$R_n^{**} = \frac{R_n^*}{s_n} \tag{3.17}$$

respectively, in which $s_n$ is the standard deviation of $y_1, \ldots, y_n$, which is determined by Eq.
(3.2). Likewise, the Hurst coefficient for a series is estimated by

$$K = \frac{\ln(R_n^{**})}{\ln(n/2)}, \quad n > 2 \tag{3.18}$$
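
The adjusted range, rescaled adjusted range, and Hurst coefficient defined above can be computed with a short Python sketch. The function name `rescaled_range_and_hurst` is hypothetical, the whole series passed in is treated as the subsample of length n, and the standard deviation uses the 1/N divisor of Eq. (3.2).

```python
import math

def rescaled_range_and_hurst(y):
    """Return (adjusted range, rescaled adjusted range, Hurst K) for n > 2."""
    n = len(y)
    mean = sum(y) / n
    s_n = math.sqrt(sum((v - mean) ** 2 for v in y) / n)
    partial = [0.0]                      # S_0 = 0
    for v in y:
        partial.append(partial[-1] + (v - mean))
    r_adj = max(partial) - min(partial)  # adjusted range
    r_rescaled = r_adj / s_n             # rescaled adjusted range
    k = math.log(r_rescaled) / math.log(n / 2)
    return r_adj, r_rescaled, k
```

For the short series 1, 2, 3, 4, 5 the partial sums are 0, -2, -3, -3, -2, 0, so the adjusted range equals 3.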

The calculation of the storage capacity is based on the sequent peak algorithm (Loucks
et al., 1981), which is equivalent to the Rippl mass curve method. The algorithm, applied to the
time series $y_i$, i = 1, ..., N, may be described as follows. Based on $y_i$ and the demand level d, a
new sequence $S'_i$ can be determined as

$$S'_i = \begin{cases} S'_{i-1} + d - y_i & \text{if positive} \\ 0 & \text{otherwise} \end{cases} \tag{3.19}$$

where $S'_0 = 0$. Then the storage capacity is obtained as

$$S_c = \max[S'_1, \ldots, S'_N] \tag{3.20}$$

Note that the algorithms described in Eqs. (3.15) to (3.20) apply also to seasonal series. In
this case, the underlying seasonal series $y_{\nu,\tau}$ is simply denoted as $y_t$.
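
The sequent peak recursion translates almost directly into code. The following Python sketch, with the hypothetical function name `storage_capacity`, accumulates the deficit relative to the demand level d and resets it to zero whenever it would become negative.

```python
def storage_capacity(y, d):
    """Storage capacity from the sequent peak algorithm for demand level d."""
    deficit = 0.0      # S'_0 = 0
    capacity = 0.0
    for flow in y:
        # S'_i = S'_{i-1} + d - y_i if positive, otherwise 0.
        deficit = max(0.0, deficit + d - flow)
        capacity = max(capacity, deficit)
    return capacity
```

For the flows 5, 3, 2, 6, 4 and a demand of 4, the sequence of accumulated deficits is 0, 1, 3, 1, 1, so the required storage capacity is 3.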

3.2.2 Drought Related Statistics


The drought-related statistics are also important in modeling hydrologic time series. For
the series $y_i$, i = 1, ..., N, the demand level d may be defined as $d = \alpha \bar{y}$, $0 < \alpha \le 1$ (for example, for
$\alpha = 1$, $d = \bar{y}$). A deficit occurs when $y_i < d$ consecutively during one or more years until $y_i > d$
again. Such a deficit can be defined by its duration L, by its magnitude M, and by its intensity
I = M/L. Assume that m deficits occur in a given hydrologic sample; then the maximum deficit
duration (longest drought or maximum run-length) is given by

$$L^* = \max(L_1, \ldots, L_m) \tag{3.21}$$

and the maximum deficit magnitude (maximum run-sum) is defined by

$$M^* = \max(M_1, \ldots, M_m) \tag{3.22}$$

In SAMS, the longest drought duration and the maximum deficit magnitude are estimated for
both annual and seasonal series.
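
A run-based calculation of the longest drought and the maximum deficit can be sketched in Python as shown below. The function names are hypothetical, and the magnitude of a deficit is taken here as the accumulated shortfall d − y_i over the run, which is an assumption of this sketch since the text does not spell out the formula for M.

```python
def deficit_runs(y, d):
    """List of (duration, magnitude) for each run of consecutive y_i < d."""
    runs = []
    length, magnitude = 0, 0.0
    for value in y:
        if value < d:
            length += 1
            magnitude += d - value     # accumulated shortfall (assumption)
        elif length:
            runs.append((length, magnitude))
            length, magnitude = 0, 0.0
    if length:                         # a run still open at the end of the record
        runs.append((length, magnitude))
    return runs

def longest_drought_and_max_deficit(y, d):
    """Return (longest run length, largest run magnitude); zeros if no deficit."""
    runs = deficit_runs(y, d)
    if not runs:
        return 0, 0.0
    return max(l for l, _ in runs), max(m for _, m in runs)
```

Surplus statistics follow from the same routine by reversing the comparison, that is, by collecting runs with values above d and accumulating y_i − d.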

3.2.3 Surplus Related Statistics


For our purpose here, surplus related statistics are simply the opposite of drought related
statistics. Considering the same threshold level d, a surplus occurs when yi > d consecutively
until yi < d again. Then, assuming that m surpluses occur during a given time period N, the
maximum surplus period L* and maximum surplus magnitude M* may be determined also from
Eqs. (3.21) and (3.22).

4 MATHEMATICAL MODELS
4.1 Data Transformations and Standardization
In cases where the normality tests indicate that the observed series are not normally
distributed, the data have to be transformed to normal before applying the models. To normalize
the data, the following transformations are available in SAMS:
- Logarithmic transformation
$$Y = \ln(X + a) \tag{4.1}$$
- Power transformation
$$Y = (X + a)^b \tag{4.2}$$
- Box-Cox transformation
$$Y = \frac{(X + a)^b - 1}{b}, \quad b \neq 0 \tag{4.3}$$
where Y is the normalized series, X is the original observed series, and a and b are transformation
coefficients. Note that the logarithmic transformation is simply the limiting form of the Box-Cox
transform as the coefficient b approaches zero. Also, the power transformation is a shifted and
scaled form of the Box-Cox transform. The variables Y and X can represent either annual or
seasonal data. For seasonal data a and b can be chosen to vary with the season. The normalized
data can then be standardized by subtracting the mean and dividing by the standard deviation
(standardization is an option in SAMS). For example, for seasonal series, the
standardization may be expressed as:

$$Z_{\nu,\tau} = \frac{Y_{\nu,\tau} - \bar{Y}_\tau}{S_\tau(Y)} \tag{4.4}$$

where $Z_{\nu,\tau}$ is the standardized series, and $\bar{Y}_\tau$ and $S_\tau(Y)$ are the mean and the standard deviation
of the transformed series for season $\tau$. Then, the stochastic models can be fitted to the
standardized series $Z_{\nu,\tau}$. For generating flows, the reverse procedure is followed. After
generating $Z_{\nu,\tau}$, $Y_{\nu,\tau}$ can be obtained by

$$Y_{\nu,\tau} = \bar{Y}_\tau + S_\tau(Y)\, Z_{\nu,\tau} \tag{4.5}$$

and $X_{\nu,\tau}$ can be generated by applying the appropriate inverse transformation to the $Y_{\nu,\tau}$
process. For example, if $X_{\nu,\tau}$ was transformed by a natural log transformation, the process
$X_{\nu,\tau}$ can be obtained from $Y_{\nu,\tau}$ by applying the following inverse transformation:

$$X_{\nu,\tau} = \exp(Y_{\nu,\tau}) - a_\tau \tag{4.6}$$
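
The forward and inverse transformations can be sketched together in Python. The functions `box_cox` and `inverse_box_cox` are hypothetical names; treating b = 0 as the logarithmic transformation is a convenience of this sketch that follows the limiting relationship noted earlier in this section.

```python
import math

def box_cox(x, a, b):
    """Box-Cox transformation of x with shift a and exponent b.

    For b == 0 the logarithmic transformation is used, which is the
    limiting form of the Box-Cox transformation.
    """
    if b == 0:
        return math.log(x + a)
    return ((x + a) ** b - 1.0) / b

def inverse_box_cox(y, a, b):
    """Recover x from the transformed value y."""
    if b == 0:
        return math.exp(y) - a
    return (b * y + 1.0) ** (1.0 / b) - a
```

In a generation run the inverse would be applied after undoing the standardization, so that the generated standardized values are first mapped back to the transformed domain and then to the original units.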

4.2 Univariate ARMA(p,q) Model


The ARMA(p,q) model may be expressed as:

$$\phi(B)\, Y_t = \theta(B)\, e_t \tag{4.7}$$

where $Y_t$ represents the streamflow process for year t, which is normally distributed with mean zero
and variance $\sigma^2(Y)$; $e_t$ is the uncorrelated noise term, which is also normally distributed, with
mean zero and variance $\sigma^2(e)$; and $\phi(B)$ and $\theta(B)$ are polynomials in B defined as

$$\phi(B) = 1 - \phi_1 B - \phi_2 B^2 - \cdots - \phi_p B^p \tag{4.8a}$$

$$\theta(B) = 1 - \theta_1 B - \theta_2 B^2 - \cdots - \theta_q B^q \tag{4.8b}$$

where $\phi_1, \phi_2, \ldots, \phi_p$ are the autoregressive parameters; $\theta_1, \theta_2, \ldots, \theta_q$ are the moving average
parameters; B is the backward shift operator, i.e., $B^c Y_t = Y_{t-c}$; and p and q define the order of
the ARMA model.
Method of moments (MOM) may be used in parameter estimation of ARMA(p, q)
models. For example, the moment estimators for the ARMA (1,0) , ARMA (1,1) and ARMA
(2,1) models are shown below:
- ARMA(1,0) model:

$$Y_t = \phi_1 Y_{t-1} + e_t \tag{4.9}$$

$$\hat{\phi}_1 = \frac{m_1}{s^2} \tag{4.10}$$

$$\hat{\sigma}^2(e) = (1 - \hat{\phi}_1^2)\, s^2 \tag{4.11}$$

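- ARMA(1,1) model:

$$Y_t = \phi_1 Y_{t-1} + e_t - \theta_1 e_{t-1} \tag{4.12}$$

$$\hat{\phi}_1 = \frac{m_2}{m_1} \tag{4.13}$$

$$\hat{\theta}_1 = \hat{\phi}_1 + \frac{s^2 - \hat{\phi}_1 m_1}{\hat{\phi}_1 s^2 - m_1} - \frac{1}{\hat{\theta}_1} \tag{4.14}$$

$$\hat{\sigma}^2(e) = \frac{\hat{\phi}_1 s^2 - m_1}{\hat{\theta}_1} \tag{4.15}$$

in which $\hat{\theta}_1$ can be obtained by solving Eq. (4.14).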
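
One way to solve Eq. (4.14) is to multiply it by the moving average parameter, which gives a quadratic equation whose two roots are reciprocals of each other. The Python sketch below follows that route. It is an illustration under stated assumptions rather than the procedure used in SAMS: the function name `arma11_moments` is hypothetical, and the choice of the root with absolute value not exceeding one (the invertible solution) is an assumption of this sketch.

```python
import math

def arma11_moments(s2, m1, m2):
    """Moment estimates (phi1, theta1, noise variance) of an ARMA(1,1) model.

    s2 is the variance and m1, m2 are the lag-1 and lag-2 autocovariances.
    The case phi1 * s2 == m1 (no moving average term) is not handled.
    """
    phi = m2 / m1
    c = (s2 - phi * m1) / (phi * s2 - m1)
    b = phi + c
    # Eq. (4.14) times theta gives theta**2 - b * theta + 1 = 0.
    disc = b * b - 4.0
    if disc < 0:
        raise ValueError("no real solution for theta")
    root1 = (b - math.sqrt(disc)) / 2.0
    root2 = (b + math.sqrt(disc)) / 2.0
    # The roots multiply to one; keep the one with the smaller magnitude.
    theta = root1 if abs(root1) <= abs(root2) else root2
    sigma2 = (phi * s2 - m1) / theta
    return phi, theta, sigma2
```

Feeding the function the theoretical moments of a model with known parameters returns those parameters, which is a convenient way to check an implementation.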


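- ARMA(2,1) model:

$$Y_t = \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + e_t - \theta_1 e_{t-1} \tag{4.16}$$

$$\hat{\phi}_1 = \frac{m_1 m_2 - s^2 m_3}{m_1^2 - s^2 m_2} \tag{4.17}$$

$$\hat{\phi}_2 = \frac{m_1 m_3 - m_2^2}{m_1^2 - s^2 m_2} \tag{4.18}$$

$$\hat{\theta}_1 = \hat{\phi}_1 + \frac{s^2 - \hat{\phi}_1 m_1 - \hat{\phi}_2 m_2}{\hat{\phi}_1 s^2 - m_1 + \hat{\phi}_2 m_1} - \frac{\hat{\phi}_1 s^2 - m_1 + \hat{\phi}_2 m_1}{(\hat{\phi}_1 s^2 - m_1 + \hat{\phi}_2 m_1)\, \hat{\theta}_1} \tag{4.19}$$

$$\hat{\sigma}^2(e) = \frac{\hat{\phi}_1 s^2 + \hat{\phi}_2 m_1 - m_1}{\hat{\theta}_1} \tag{4.20}$$

where $s^2$ is the variance of $Y_t$ and $m_k$ is the estimate of the lag-k autocovariance of $Y_t$, which is defined as $M_k = E[Y_t Y_{t-k}]$. In the foregoing models it is assumed that the mean has been removed, i.e., $E(Y_t) = 0$. Note also that $s^2 = m_0$.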
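Because Eq. (4.14) contains the unknown moving average estimate on both sides, it may help to see one way of solving it. Multiplying through by the estimate gives a quadratic whose two roots are reciprocals of each other. The sketch below assumes, as an illustration rather than as the SAMS implementation, that the root with absolute value below one (the invertible solution) is the one wanted; the function name is hypothetical.

```python
import math


def arma11_moment_estimates(s2, m1, m2):
    """Moment estimates of an ARMA(1,1) model from s2 = m0, m1 and m2."""
    phi = m2 / m1                                   # Eq. (4.13)
    d = phi * s2 - m1                               # equals theta * sigma^2(e)
    b = phi + (s2 - phi * m1) / d
    # Eq. (4.14) rearranged: theta^2 - b * theta + 1 = 0
    disc = b * b - 4.0
    if disc < 0.0:
        raise ValueError("Eq. (4.14) has no real solution for these moments")
    r1 = (b - math.sqrt(disc)) / 2.0
    r2 = (b + math.sqrt(disc)) / 2.0
    theta = r1 if abs(r1) < 1.0 else r2             # assumed: invertible root
    sigma2_e = d / theta                            # Eq. (4.15)
    return phi, theta, sigma2_e
```

The sketch does not handle the degenerate case in which the denominator of Eq. (4.14) is zero, which corresponds to a moving average parameter of zero.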
However, the Least Squares (LS) method is generally a more efficient parameter estimation method. In this method, the parameters φ's and θ's are estimated by minimizing the sum of squares of the residuals defined by

$$F = \sum_{t=1}^{N} e_t^2 \tag{4.21}$$

where N is the number of years of data. For the ARMA(p,q) model, the residuals are defined as

$$e_t = Y_t - \sum_{i=1}^{p} \phi_i Y_{t-i} + \sum_{i=1}^{q} \theta_i e_{t-i} \tag{4.22}$$

Once the φ's and θ's are determined, the noise variance $\sigma^2(e)$ is determined by $(1/N) \sum e_t^2$. The minimization of the sum of squares of Eq. (4.21) may be carried out by a numerical scheme. Powell's algorithm has been commonly employed for least squares estimation of the parameters of ARMA models. The Powell algorithm (Gill et al., 1981; Himmelblau, 1972) is an expanded version of the univariate gradient search and is a useful optimization technique that does not require derivatives. The moment estimates of ARMA(p,q) models may be taken as the initial values in the search algorithm. Non-derivative optimization techniques depend very much on the starting point when the objective function is not convex. In these cases there is no guarantee that the solution found corresponds to the global minimum. The solution may be improved by choosing a different starting point.
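The least squares idea can be sketched for an ARMA(1,1) model as follows. Note that this is not Powell's algorithm: a much cruder derivative-free coordinate search with step halving is swapped in so that the example stays short and needs no external library. The function names, the starting step, and the tolerance are all illustrative assumptions.

```python
def residual_sum_of_squares(y, phi, theta):
    """Sum of squared ARMA(1,1) residuals, Eqs. (4.21) and (4.22)."""
    if abs(theta) >= 1.0:           # keep the search in the invertible region
        return float("inf")
    y_prev = 0.0                    # values before t = 1 set to the mean (zero)
    e_prev = 0.0
    total = 0.0
    for y_t in y:
        e_t = y_t - phi * y_prev + theta * e_prev
        total += e_t * e_t
        y_prev, e_prev = y_t, e_t
    return total


def fit_arma11_ls(y, phi0, theta0, step=0.1, tol=1e-3):
    """Coordinate search from (phi0, theta0); a stand-in for Powell's method."""
    best = [phi0, theta0]
    f_best = residual_sum_of_squares(y, *best)
    while step > tol:
        improved = False
        for k in range(2):                      # try each parameter in turn
            for delta in (step, -step):
                trial = list(best)
                trial[k] += delta
                f_trial = residual_sum_of_squares(y, *trial)
                if f_trial < f_best:
                    best, f_best, improved = trial, f_trial, True
        if not improved:
            step /= 2.0                         # refine the search grid
    sigma2_e = f_best / len(y)                  # (1/N) * sum of e_t^2
    return best[0], best[1], sigma2_e
```

In practice the moment estimates would be passed as `phi0` and `theta0`, and the search could be repeated from other starting points to guard against local minima.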
To generate synthetic series from an ARMA model, Eq. (4.7) can be used. First, a standard uncorrelated normal random variable $\varepsilon_t$ is generated, then $e_t$ is calculated as

$$e_t = \sigma(e)\, \varepsilon_t \tag{4.23}$$

To generate the correlated series $Y_t$, the warm-up procedure is followed. In this procedure, values of $Y_t$ prior to t = 1 are assumed to be equal to the mean of the process (which is zero in this case). Thus, $Y_1, Y_2, \ldots, Y_{N+L}$ can be generated using Eq. (4.7) by generating $e_{1-q}, e_{2-q}, e_{3-q}, \ldots$ from Eq. (4.23), where N is the required length to be generated and L is the warm-up length required to remove the effect of the initial assumptions of $Y_t$. L is arbitrarily chosen as 50. The advantage of the warm-up procedure is that it can be used for low order and high order stationary and periodic models, while exact generation procedures available in the literature apply only to stationary ARMA models or low order periodic models.
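The warm-up procedure can be sketched for an ARMA(1,1) model as follows. This is a minimal illustration, not the SAMS code; the function and argument names are hypothetical, and the default warm-up length of 50 follows the text above.

```python
import random


def generate_arma11(phi, theta, sigma_e, n, warmup=50, rng=None):
    """Generate n values of a zero-mean ARMA(1,1) process using a warm-up."""
    rng = rng or random.Random()
    y_prev = 0.0                                  # Y before t = 1 set to the mean
    e_prev = sigma_e * rng.gauss(0.0, 1.0)        # e_0 from Eq. (4.23)
    series = []
    for _ in range(n + warmup):
        e_t = sigma_e * rng.gauss(0.0, 1.0)       # Eq. (4.23)
        y_t = phi * y_prev + e_t - theta * e_prev
        series.append(y_t)
        y_prev, e_prev = y_t, e_t
    return series[warmup:]                        # drop the first L values
```

Passing a seeded `random.Random` instance makes a generated trace reproducible.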
4.3 Univariate GAR(1) model
Gamma-autoregressive (GAR) models assume that the underlying series is dependent
with a gamma marginal distribution and the models do not require variable transformation.
SAMS provides modeling and data generation based on the GAR(1) model. The model
parameters are estimated based on a procedure suggested by Fernandez and Salas (1990).
The GAR(1) model can be expressed as (Lawrence and Lewis, 1981)

$$X_t = \phi X_{t-1} + \varepsilon_t \tag{4.24}$$

where $X_t$ is a gamma variable defined at time t, φ is the autoregression coefficient, and $\varepsilon_t$ is the independent noise term. $X_t$ is a three-parameter gamma distributed variable with marginal density function given by:

$$f_X(x) = \frac{\alpha^{\beta} (x - \lambda)^{\beta - 1} \exp[-\alpha (x - \lambda)]}{\Gamma(\beta)} \tag{4.25}$$

where λ, α, and β are the location, scale, and shape parameters, respectively. Lawrence (1982) found that $\varepsilon_t$ can be obtained by the following scheme:

$$\varepsilon = \lambda (1 - \phi) + \eta \tag{4.26}$$

where

$$\eta = \begin{cases} 0 & \text{if } M = 0 \\ \sum_{j=1}^{M} Y_j\, \phi^{U_j} & \text{if } M > 0 \end{cases} \tag{4.27}$$

where M is an integer random variable, Poisson distributed with mean $-\beta \ln(\phi)$, and $U_j$, j = 1, 2, ..., are independent identically distributed (iid) random variables with uniform (0,1) distribution. Additionally, $Y_j$, j = 1, 2, ..., are iid random variables, exponentially distributed with mean $1/\alpha$.
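The noise scheme of Eqs. (4.26) and (4.27) can be sketched as follows. This is an illustrative sketch and not the SAMS code: the Poisson sampler is a simple multiplication method suitable only for small means, the process is started at its mean with a warm-up of 50 values (an assumption borrowed from the ARMA generation procedure), and all function names are hypothetical. It requires 0 < φ < 1.

```python
import math
import random


def poisson(mean, rng):
    """Draw a Poisson variate by the multiplication method (small means only)."""
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def gar1_noise(lam, alpha, beta, phi, rng):
    """One noise term of the GAR(1) model, Eqs. (4.26) and (4.27)."""
    m = poisson(-beta * math.log(phi), rng)
    # Y_j exponential with mean 1/alpha, U_j uniform on (0, 1)
    eta = sum(rng.expovariate(alpha) * phi ** rng.random() for _ in range(m))
    return lam * (1.0 - phi) + eta


def generate_gar1(lam, alpha, beta, phi, n, warmup=50, seed=None):
    """Generate n values of X_t = phi * X_(t-1) + eps_t, Eq. (4.24)."""
    rng = random.Random(seed)
    x = lam + beta / alpha            # start at the mean of the gamma marginal
    series = []
    for _ in range(n + warmup):
        x = phi * x + gar1_noise(lam, alpha, beta, phi, rng)
        series.append(x)
    return series[warmup:]
```

Since the noise is never smaller than λ(1 − φ), the generated values never fall below the location parameter λ, which is consistent with the three-parameter gamma marginal.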
The stationary GAR(1) process of Eq. (4.24) has four parameters, namely λ, α, β, and φ. It may be shown that the relationships between the model parameters and the population moments of the underlying variable $X_t$ are:

$$\mu = \lambda + \frac{\beta}{\alpha} \tag{4.28}$$

$$\sigma^2 = \frac{\beta}{\alpha^2} \tag{4.29}$$

$$\gamma = \frac{2}{\sqrt{\beta}} \tag{4.30}$$

$$\rho_1 = \phi \tag{4.31}$$

where $\mu$, $\sigma^2$, $\gamma$, and $\rho_1$ are the mean, variance, skewness coefficient, and the lag-one autocorrelation coefficient, respectively.
Based on results given by Kendall (1968), Wallis and O'Connell (1972), and Matalas (1966), and on extensive simulation experiments, Fernandez and Salas (1990) suggested the following estimation procedure:

$$\hat{\rho}_1 = \frac{r_1 N + 1}{N - 4} \tag{4.32}$$

$$\hat{\sigma}^2 = \frac{N - 1}{N - K}\, s^2 \tag{4.33}$$

$$K = \frac{N (1 - \hat{\rho}_1^2) - 2 \hat{\rho}_1 (1 - \hat{\rho}_1^N)}{N (1 - \hat{\rho}_1)^2} \tag{4.34}$$

in which $r_1$ is the lag-1 sample autocorrelation coefficient and $s^2$ is the sample variance. In addition,

$$\hat{\gamma} = \frac{\hat{\gamma}_0}{1 - 3.12\, \hat{\rho}_1^{3.7} N^{-0.49}} \tag{4.35}$$

where $\hat{\gamma}_0$ is the skewness coefficient suggested by Bobee and Robitaille (1975) as

$$\hat{\gamma}_0 = \frac{L\, g_1 \left[ A + B \left( \dfrac{L^2}{N} \right) g_1^2 \right]}{\sqrt{N}} \tag{4.36}$$

in which $g_1$ is the sample skewness coefficient and the constants A, B, and L are given by

$$A = 1 + 6.51 N^{-1} + 20.2 N^{-2} \tag{4.37}$$

$$B = 1.48 N^{-1} + 6.77 N^{-2} \tag{4.38}$$

and

$$L = \frac{N - 2}{\sqrt{N - 1}} \tag{4.39}$$

respectively. Furthermore, the mean is estimated by the usual sample mean $\bar{x}$. Therefore, substituting the population statistics $\mu$, $\sigma$, $\gamma$, and $\rho_1$ in Eqs. (4.28) through (4.31) by the corresponding estimates $\bar{x}$, $\hat{\sigma}$, $\hat{\gamma}$, and $\hat{\rho}_1$ as suggested above and solving the equations simultaneously gives the MOM estimates of the GAR(1) model parameters. For more details, the interested reader is referred to Fernandez and Salas (1990).
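The estimation steps above can be chained as in the following sketch. It takes the sample statistics as inputs and assumes the square-root form of the skewness correction (division by √N and L = (N − 2)/√(N − 1)), which is the form that keeps the corrected skewness on the same scale as the sample value. The function name and the returned dictionary keys are hypothetical, and the sketch only handles a positive corrected autocorrelation and positive skewness.

```python
import math


def gar1_moment_estimates(n, mean, s2, r1, g1):
    """GAR(1) parameters from sample size, mean, variance, r1 and skewness."""
    rho1 = (r1 * n + 1.0) / (n - 4.0)                           # Eq. (4.32)
    if not 0.0 < rho1 < 1.0:
        raise ValueError("sketch assumes 0 < rho1 < 1")
    k = (n * (1.0 - rho1 ** 2) - 2.0 * rho1 * (1.0 - rho1 ** n)) / (
        n * (1.0 - rho1) ** 2)                                  # Eq. (4.34)
    variance = (n - 1.0) / (n - k) * s2                         # Eq. (4.33)
    a = 1.0 + 6.51 / n + 20.2 / n ** 2                          # Eq. (4.37)
    b = 1.48 / n + 6.77 / n ** 2                                # Eq. (4.38)
    big_l = (n - 2.0) / math.sqrt(n - 1.0)                      # Eq. (4.39), assumed form
    gamma0 = big_l * g1 * (a + b * (big_l ** 2 / n) * g1 ** 2) / math.sqrt(n)
    gamma = gamma0 / (1.0 - 3.12 * rho1 ** 3.7 * n ** -0.49)    # Eq. (4.35)
    if gamma <= 0.0:
        raise ValueError("sketch assumes positive skewness")
    beta = 4.0 / gamma ** 2                                     # from Eq. (4.30)
    alpha = math.sqrt(beta / variance)                          # from Eq. (4.29)
    lam = mean - beta / alpha                                   # from Eq. (4.28)
    return {"lam": lam, "alpha": alpha, "beta": beta, "phi": rho1,
            "variance": variance, "skewness": gamma}
```

The last three assignments simply invert Eqs. (4.28) through (4.30), and φ is set equal to the corrected lag-one autocorrelation as in Eq. (4.31).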
4.4 Univariate PARMA(p,q) Model
Stationary ARMA models have been widely applied in stochastic hydrology to annual
time series where the mean, variance, and the correlation structure do not depend on time.
Seasonal statistics such as the mean and standard deviation may be reproduced by a stationary
ARMA model by means of standardizing the underlying seasonal series. However, this
procedure does not account for the season-to-season correlations that are generally exhibited by
hydrologic time series such as monthly streamflows. Thus, periodic ARMA (PARMA) models
have been suggested in the literature for this purpose.
A PARMA(p,q) model may be expressed as (Salas, 1993):

$$\phi_\tau(B)\, Y_{\nu,\tau} = \theta_\tau(B)\, e_{\nu,\tau} \tag{4.40}$$

where $Y_{\nu,\tau}$ represents the streamflow process for year ν and season τ, which is normally distributed with mean zero and variance $\sigma_\tau^2(Y)$; $e_{\nu,\tau}$ is the uncorrelated noise term, which is normally distributed with mean zero and variance $\sigma_\tau^2(e)$; and $\phi_\tau(B)$ and $\theta_\tau(B)$ are periodic polynomials in B defined as

$$\phi_\tau(B) = 1 - \phi_{1,\tau} B^1 - \phi_{2,\tau} B^2 - \cdots - \phi_{p,\tau} B^p \tag{4.41a}$$

$$\theta_\tau(B) = 1 - \theta_{1,\tau} B^1 - \theta_{2,\tau} B^2 - \cdots - \theta_{q,\tau} B^q \tag{4.41b}$$

where $\phi_{1,\tau}, \ldots, \phi_{p,\tau}$ are the seasonal autoregressive parameters; $\theta_{1,\tau}, \ldots, \theta_{q,\tau}$ are the seasonal moving average parameters; B is the backward shift operator, i.e., $B^c Y_{\nu,\tau} = Y_{\nu,\tau-c}$; and p and q define the order of the PARMA model.
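Method of moments (MOM) may be used in parameter estimation of low order PARMA(p,q) models. In SAMS the MOM estimates are available for the PARMA(p,1) model. For example, the moment estimators for the PARMA(1,1) and PARMA(2,1) models are shown below (Salas et al., 1982):

- PARMA(1,1) model:

$$Y_{\nu,\tau} = \phi_{1,\tau} Y_{\nu,\tau-1} + e_{\nu,\tau} - \theta_{1,\tau} e_{\nu,\tau-1} \tag{4.42}$$

$$\hat{\phi}_{1,\tau} = \frac{m_{2,\tau}}{m_{1,\tau-1}} \tag{4.43}$$

$$\hat{\theta}_{1,\tau} = \hat{\phi}_{1,\tau} + \frac{s_\tau^2 - \hat{\phi}_{1,\tau} m_{1,\tau}}{\hat{\phi}_{1,\tau} s_{\tau-1}^2 - m_{1,\tau}} - \frac{\hat{\phi}_{1,\tau+1} s_\tau^2 - m_{1,\tau+1}}{(\hat{\phi}_{1,\tau} s_{\tau-1}^2 - m_{1,\tau})\, \hat{\theta}_{1,\tau+1}} \tag{4.44}$$

$$\hat{\sigma}_\tau^2(e) = \frac{\hat{\phi}_{1,\tau+1} s_\tau^2 - m_{1,\tau+1}}{\hat{\theta}_{1,\tau+1}} \tag{4.45}$$

- PARMA(2,1) model:

$$Y_{\nu,\tau} = \phi_{1,\tau} Y_{\nu,\tau-1} + \phi_{2,\tau} Y_{\nu,\tau-2} + e_{\nu,\tau} - \theta_{1,\tau} e_{\nu,\tau-1} \tag{4.46}$$

$$\hat{\phi}_{1,\tau} = \frac{m_{2,\tau}\, m_{1,\tau-2} - s_{\tau-2}^2\, m_{3,\tau}}{m_{1,\tau-1}\, m_{1,\tau-2} - s_{\tau-2}^2\, m_{2,\tau-1}} \tag{4.47}$$

$$\hat{\phi}_{2,\tau} = \frac{m_{3,\tau}\, m_{1,\tau-1} - m_{2,\tau}\, m_{2,\tau-1}}{m_{1,\tau-1}\, m_{1,\tau-2} - s_{\tau-2}^2\, m_{2,\tau-1}} \tag{4.48}$$

$$\hat{\theta}_{1,\tau} = \hat{\phi}_{1,\tau} + \frac{s_\tau^2 - \hat{\phi}_{1,\tau} m_{1,\tau} - \hat{\phi}_{2,\tau} m_{2,\tau}}{\hat{\phi}_{1,\tau} s_{\tau-1}^2 - m_{1,\tau} + \hat{\phi}_{2,\tau} m_{1,\tau-1}} - \frac{\hat{\phi}_{1,\tau+1} s_\tau^2 - m_{1,\tau+1} + \hat{\phi}_{2,\tau+1} m_{1,\tau}}{(\hat{\phi}_{1,\tau} s_{\tau-1}^2 - m_{1,\tau} + \hat{\phi}_{2,\tau} m_{1,\tau-1})\, \hat{\theta}_{1,\tau+1}} \tag{4.49}$$

$$\hat{\sigma}_\tau^2(e) = \frac{\hat{\phi}_{1,\tau+1} s_\tau^2 + \hat{\phi}_{2,\tau+1} m_{1,\tau} - m_{1,\tau+1}}{\hat{\theta}_{1,\tau+1}} \tag{4.50}$$

where $s_\tau^2$ is the seasonal variance and $m_{k,\tau}$ is the estimate of the lag-k season-to-season covariance of $Y_{\nu,\tau}$, which is equal to

$$M_{k,\tau} = E[Y_{\nu,\tau} Y_{\nu,\tau-k}] \tag{4.51}$$

because $E(Y_{\nu,\tau}) = 0$. Note also that $s_\tau^2 = m_{0,\tau}$.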
In a similar manner as for the ARMA(p,q) model, the Least Squares (LS) method can be used to estimate the model parameters of PARMA(p,q) models. In this case, the parameters φ's and θ's are estimated by minimizing the sum of squares of the residuals defined by

$$F = \sum_{\nu=1}^{N} \sum_{\tau=1}^{\omega} e_{\nu,\tau}^2 \tag{4.52}$$

where ω is the number of seasons and N is the number of years of data. For the PARMA(p,q) model, the residuals are defined as

$$e_{\nu,\tau} = Y_{\nu,\tau} - \sum_{i=1}^{p} \phi_{i,\tau} Y_{\nu,\tau-i} + \sum_{i=1}^{q} \theta_{i,\tau} e_{\nu,\tau-i} \tag{4.53}$$

Once the φ's and θ's are determined, the seasonal noise variance $\sigma_\tau^2(e)$ can be estimated by $(1/N\omega) \sum\sum e_{\nu,\tau}^2$. Alternatively, the method of moments can be applied, but this latter option is still not available in the current version of SAMS. In using Powell's algorithm for obtaining the least squares estimates of the φ's and θ's, the moment estimates of low order PARMA(p,q) models such as PARMA(p,1) may be taken as the initial values in the search algorithm.
Generation of data from PARMA (p,q) models is carried out in a similar manner as for
ARMA(p,q) models. The warm up length procedure can be used again to generate seasonal
sequences of the Yν ,τ process by assuming that values of Yν ,τ prior to season 1 of year 1 are
equal to zero and generating uncorrelated random sequences of eν ,τ as needed in a similar
manner as for the ARMA (p,q) model. The warm-up period is taken as 50 years.
4.5 Multivariate MAR(p) Model
The MAR(p) model can be expressed as

$$\Phi(B)\, Y_t = e_t \tag{4.54}$$

where $\Phi(B)$ is a square matrix of polynomials in B which is defined as

$$\Phi(B) = I - \Phi_1 B^1 - \Phi_2 B^2 - \cdots - \Phi_p B^p \tag{4.55}$$

in which I is an (n×n) identity matrix; $\Phi_j$, j = 1, ..., p, are (n×n) parameter matrices; $B^j$ is a scalar difference operator such that $B^j Z_t = Z_{t-j}$; $Y_t$ is an (n×1) column vector with elements $Y_t^i$, i = 1, ..., n; and $e_t$ is an (n×1) vector of normally distributed noise terms with mean 0 and variance-covariance matrix G. The noises $e_t$ are independent in time but are dependent in space, and n is the number of sites. Such spatially correlated noise can be modeled by

$$e_t = B\, \varepsilon_t \tag{4.56}$$

where $\varepsilon_t$ is an (n×1) vector of standardized normal variables independent in both time and space and B is an (n×n) parameter matrix.
It can be shown that the moment equations of the MAR(p) model are given by

$$M_0 = \sum_{i=1}^{p} \Phi_i M_i^T + G \tag{4.57}$$

$$M_k = \sum_{i=1}^{p} \Phi_i M_{k-i}, \quad k \ge 1 \tag{4.58}$$

where $M_k$ is the lag-k cross covariance matrix of $Y_t$ defined as:

$$M_k = E[Y_t Y_{t-k}^T] \tag{4.59}$$

in which the superscript T indicates a matrix transpose and $E(Y_t) = 0$. In finding the MOM estimates, Eq. (4.58) for k = 1, ..., p is solved simultaneously for the parameter matrices $\Phi_j$, j = 1, ..., p, by substituting in Eq. (4.58) the population covariance matrices $M_k$, k = 1, 2, ..., p, by the sample covariance matrices $\hat{M}_k$, k = 1, 2, ..., p. Then Eq. (4.57) is used to estimate the variance-covariance matrix of the residuals G. For example, the moment estimators of the MAR(1) model are:

$$\hat{\Phi}_1 = \hat{M}_1 \hat{M}_0^{-1} \tag{4.60}$$

$$\hat{G} = \hat{M}_0 - \hat{M}_1 \hat{M}_0^{-1} \hat{M}_1^T \tag{4.61}$$

in which the superscript -1 indicates a matrix inverse.
After estimating $\Phi_j$, j = 1, ..., p, and G as indicated above, B of Eq. (4.56) can be determined from

$$\hat{G} = B B^T \tag{4.62}$$

The above matrix equation can have more than one solution. However, a unique solution can be obtained by assuming that B is a lower triangular matrix. This solution, however, requires that G be a positive definite matrix.
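For a two-site MAR(1) model, the moment estimators of Eqs. (4.60) and (4.61) and the lower triangular solution of Eq. (4.62) can be sketched as follows. The lower triangular solution is computed here by a Cholesky factorization, which is one standard way of obtaining it and is an assumption about the method rather than a description of SAMS. All function names are hypothetical, and plain nested lists are used so that no external library is needed.

```python
import math


def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def inverse_2x2(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]


def cholesky_lower(g):
    """Lower triangular B with B B^T = G; fails if G is not positive definite."""
    n = len(g)
    b = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = g[i][j] - sum(b[i][k] * b[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("G is not positive definite")
                b[i][j] = math.sqrt(s)
            else:
                b[i][j] = s / b[j][j]
    return b


def mar1_moment_estimates(m0, m1):
    """Two-site MAR(1): returns (Phi_1, G, B) from the lag-0 and lag-1 matrices."""
    phi = matmul(m1, inverse_2x2(m0))                       # Eq. (4.60)
    pm = matmul(phi, transpose(m1))
    g = [[m0[i][j] - pm[i][j] for j in range(2)] for i in range(2)]  # Eq. (4.61)
    return phi, g, cholesky_lower(g)                        # Eq. (4.62)
```

The explicit check on the diagonal term is where a non-positive-definite estimate of G would be detected.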

4.6 Multivariate CARMA(p,q) Model
When modeling multivariate hydrologic processes based on the full multivariate ARMA model, problems often arise in parameter estimation. The CARMA (Contemporaneous Autoregressive Moving Average) model was suggested as a simpler alternative to the full multivariate ARMA model (Salas et al., 1980). In the CARMA model, both autoregressive and moving average parameter matrices are assumed to be diagonal such that a multivariate model can be decoupled into component univariate models. Thus, the model parameters Φ and Θ do not need to be estimated jointly, but, instead, they can be estimated independently for each single site by regular univariate ARMA model estimation procedures. This allows the best univariate ARMA model to be identified for each single station.

The CARMA(p,q) model can be expressed as

$$Z_t = \sum_{j=1}^{p} \Phi_j Z_{t-j} + \varepsilon_t - \sum_{j=1}^{q} \Theta_j \varepsilon_{t-j} \tag{4.63}$$

where $Z_t$ is a multi-dimensional vector of the normalized and mean corrected observations at time t, $\varepsilon_t$ is the multi-dimensional vector of noises (residuals) of the processes at time t, $\Phi_j$ are the diagonal autoregressive parameter matrices, and $\Theta_j$ are the diagonal moving average parameter matrices. Equation (4.63) can be decoupled into the model components as

$$Z_t^i = \sum_{j=1}^{p} \phi_j^i Z_{t-j}^i + \varepsilon_t^i - \sum_{j=1}^{q} \theta_j^i \varepsilon_{t-j}^i \tag{4.64}$$

Thus, Eq. (4.64) is the expression of a univariate ARMA(p,q) model for site i such that the parameters $\phi_j^i$ and $\theta_j^i$ can be estimated by the regular ARMA model estimation methods.

The vector of residual (noise) terms $\varepsilon_t = [\varepsilon_t^1, \varepsilon_t^2, \ldots, \varepsilon_t^n]^T$ can be expressed as

$$\varepsilon_t = B\, \xi_t \tag{4.65}$$

where the random vector $\xi_t$ is uncorrelated in time and space, i.e., $E(\xi_t \xi_t^T) = I$. It may be shown that the variance-covariance matrix G of the correlated series $\varepsilon_t$ is equal to

$$G = E(\varepsilon_t \varepsilon_t^T) = B B^T \tag{4.66}$$

Thus, a CARMA model implies that the cross-correlations between sites are carried through the residuals.
Two methods are used for estimating the G matrix:

1. The MLE estimate of G is obtained by

$$\hat{G} = \frac{1}{n} \sum_{t} \hat{\varepsilon}_t \hat{\varepsilon}_t^T \tag{4.67}$$

where $\hat{\varepsilon}_t$ are the residuals calculated from each single site model by using the estimated parameters $\Phi_j$ and $\Theta_j$.

2. The moment (MOM) estimate of G is computed from the moment estimator as a function of the given parameters and the cross-covariances of the data, i.e.,

$$\hat{G} = f(\Phi_m, \Theta_r, M_k) \tag{4.68}$$

where $M_k$ are the lag-k variance-covariance matrices of the processes Z, m = 1, ..., p; r = 1, ..., q; and k = 0, ..., max(p, q) - 1.
A moment estimator of the G matrix for a general CARMA model is obtained as follows. By multiplying both sides of Eq. (4.63) by $Z_t^T$ (the transpose of $Z_t$) one may obtain

$$Z_t Z_t^T = \Phi_1 Z_{t-1} Z_t^T + \cdots + \Phi_p Z_{t-p} Z_t^T + \varepsilon_t Z_t^T - \Theta_1 \varepsilon_{t-1} Z_t^T - \cdots - \Theta_q \varepsilon_{t-q} Z_t^T \tag{4.69}$$

Because $E(Z_t Z_{t-k}^T) = M_k$ and $E(\varepsilon_t \varepsilon_t^T) = G$, the lag-0, lag-1, ..., lag-p moment equations $M_0, M_1, \ldots, M_p$ can be obtained by taking expectations on both sides of Eq. (4.69). Then, the (i, j) elements of the moment matrices, $M_0^{ij}, M_1^{ij}, M_2^{ij}, \ldots, M_p^{ij}$, can be expressed as functions of $(\phi_1^i, \phi_1^j), (\phi_2^i, \phi_2^j), \ldots, (\phi_p^i, \phi_p^j)$; $(\theta_1^i, \theta_1^j), (\theta_2^i, \theta_2^j), \ldots, (\theta_q^i, \theta_q^j)$; and $G^{ij}$, which are the elements of the matrices $\Phi_1, \Phi_2, \ldots, \Phi_p$; $\Theta_1, \Theta_2, \ldots, \Theta_q$; and G, respectively. Analogously, another p sets of equations for the (j, i) elements $M_0^{ji}, M_1^{ji}, M_2^{ji}, \ldots, M_p^{ji}$ can be obtained by switching the site indices because of the symmetric structure of the CARMA model moment matrices. Since $G^{ij} = G^{ji}$, and $M_0^{ij} = M_0^{ji}$ are estimated from the observed processes, a system of 2p+1 linear equations with 2p+1 unknowns, namely $G^{ij}, M_1^{ij}, M_2^{ij}, \ldots, M_p^{ij}$, etc., is formed. Solving each system of linear equations indexed (i, j), the G matrix estimate can be obtained.
To obtain $G^{ij}$ let

$$K_0^{ij} = 1 - \sum_{k=1}^{q} \theta_k^i \sum_{l=1}^{k} (\phi_l^j - \theta_l^j)\, \phi_{k-l}^j \tag{4.70}$$

and

$$K_m^{ij} = \theta_m^i + \sum_{k=m+1}^{q} \theta_k^i \sum_{l=1}^{k-m} (\phi_l^j - \theta_l^j)\, \phi_{k-m-l}^j \tag{4.71}$$

where m = 1, ..., p and $\phi_0^j = 1$. For instance, for a CARMA(3,q) model, $M_0^{ij}, M_1^{ij}, M_2^{ij}, M_3^{ij}$ can be expressed as

$$M_0^{ij} = \phi_1^i M_1^{ji} + \phi_2^i M_2^{ji} + \phi_3^i M_3^{ji} + K_0^{ij} G^{ij} \tag{4.72}$$

$$M_1^{ij} = \phi_1^i M_0^{ij} + \phi_2^i M_1^{ji} + \phi_3^i M_2^{ji} - K_1^{ij} G^{ij} \tag{4.73}$$

$$M_2^{ij} = \phi_1^i M_1^{ij} + \phi_2^i M_0^{ij} + \phi_3^i M_1^{ji} - K_2^{ij} G^{ij} \tag{4.74}$$

$$M_3^{ij} = \phi_1^i M_2^{ij} + \phi_2^i M_1^{ij} + \phi_3^i M_0^{ij} - K_3^{ij} G^{ij} \tag{4.75}$$

Thus, based on Eqs. (4.70) and (4.71) a system of 2p+1 linear moment equations analogous to Eqs. (4.72) to (4.75) can be written as
$$A X = B \tag{4.76}$$

where X is a (2p+1 × 1) vector of unknown variables, $[G^{ij}, M_1^{ij}, M_1^{ji}, M_2^{ij}, M_2^{ji}, \ldots, M_p^{ij}, M_p^{ji}]^T$, A is a (2p+1 × 2p+1) square matrix of coefficients, and B is a (2p+1 × 1) vector of known constants, which can be written as

$$
A = \begin{bmatrix}
K_0^{ij} & 0 & \phi_1^i & 0 & \phi_2^i & \cdots & 0 & \phi_p^i \\
K_1^{ij} & -\phi_0^i & -\phi_2^i & -\phi_{-1}^i & -\phi_3^i & \cdots & -\phi_{1-(p-1)}^i & -\phi_{1+(p-1)}^i \\
K_2^{ij} & -\phi_1^i & -\phi_3^i & -\phi_0^i & -\phi_4^i & \cdots & -\phi_{2-(p-1)}^i & -\phi_{2+(p-1)}^i \\
\vdots & \vdots & \vdots & \vdots & \vdots & \cdots & \vdots & \vdots \\
K_p^{ij} & -\phi_{p-1}^i & -\phi_{p+1}^i & -\phi_{p-2}^i & -\phi_{p+2}^i & \cdots & -\phi_{p-(p-1)}^i & -\phi_{p+(p-1)}^i \\
K_1^{ji} & -\phi_0^j & -\phi_2^j & -\phi_{-1}^j & -\phi_3^j & \cdots & -\phi_{1-(p-1)}^j & -\phi_{1+(p-1)}^j \\
K_2^{ji} & -\phi_1^j & -\phi_3^j & -\phi_0^j & -\phi_4^j & \cdots & -\phi_{2-(p-1)}^j & -\phi_{2+(p-1)}^j \\
\vdots & \vdots & \vdots & \vdots & \vdots & \cdots & \vdots & \vdots \\
K_p^{ji} & -\phi_{p-1}^j & -\phi_{p+1}^j & -\phi_{p-2}^j & -\phi_{p+2}^j & \cdots & -\phi_{p-(p-1)}^j & -\phi_{p+(p-1)}^j
\end{bmatrix},
\quad
B = \begin{bmatrix}
M_0^{ij} \\ \phi_1^i M_0^{ij} \\ \phi_2^i M_0^{ij} \\ \vdots \\ \phi_p^i M_0^{ij} \\ \phi_1^j M_0^{ij} \\ \phi_2^j M_0^{ij} \\ \vdots \\ \phi_p^j M_0^{ij}
\end{bmatrix}
$$

where $\phi_0^i = -1$ and $\phi_k^i = 0$ if k < 0 or k > p. Thus, X can be solved directly by $X = A^{-1} B$.

4.7 Multivariate MPAR(p) Model


The MPAR(p) model can be expressed as

$$\Phi_\tau(B)\, Y_{\nu,\tau} = e_{\nu,\tau} \tag{4.77}$$

where $\Phi_\tau(B)$ is a square diagonal matrix of periodic polynomials in B which is defined as

$$\Phi_\tau(B) = I - \Phi_{1,\tau} B^1 - \Phi_{2,\tau} B^2 - \cdots - \Phi_{p,\tau} B^p \tag{4.78}$$

in which I is an (n×n) identity matrix; $\Phi_{j,\tau}$, j = 1, ..., p, are (n×n) diagonal parameter matrices for season τ; $B^j$ is a scalar difference operator such that $B^j Z_{\nu,\tau} = Z_{\nu,\tau-j}$; $Y_{\nu,\tau}$ is an (n×1) column vector with elements $Y_{\nu,\tau}^i$, i = 1, ..., n; and $e_{\nu,\tau}$ is an (n×1) vector of normally distributed noise terms with mean 0 and variance-covariance matrix $G_\tau$. The noises $e_{\nu,\tau}$ are independent in time but are dependent in space, and n is the number of sites. Such spatially correlated noise can be modeled by

$$e_{\nu,\tau} = B_\tau\, \varepsilon_{\nu,\tau} \tag{4.79}$$

where $\varepsilon_{\nu,\tau}$ is an (n×1) vector of standardized normal variables independent in both time and space and $B_\tau$ is an (n×n) parameter matrix.
The parameters of the MPAR(p) model are estimated by the MOM by substituting the sample moments into the moment equations in a similar manner as for the MAR(p) model. The moment equations of the MPAR(p) model may be shown to be:

$$M_{0,\tau} = \sum_{i=1}^{p} \Phi_{i,\tau} M_{i,\tau}^T + G_\tau \tag{4.80}$$

$$M_{k,\tau} = \sum_{i=1}^{p} \Phi_{i,\tau} M_{k-i,\tau-i} \quad \text{for } \tau - i \ge 0 \text{ and } k \ge 1 \tag{4.81a}$$

$$M_{k,\tau} = \sum_{i=1}^{p} \Phi_{i,\tau} M_{i-k,\tau-k}^T \quad \text{for } \tau - i < 0 \text{ and } k \ge 1 \tag{4.81b}$$

where $M_{k,\tau}$ is the seasonal lag-k cross covariance matrix of $Y_{\nu,\tau}$ defined as:

$$M_{k,\tau} = E[Y_{\nu,\tau} Y_{\nu,\tau-k}^T] \tag{4.82}$$

in which $E(Y_{\nu,\tau}) = 0$. In a similar manner as for the MAR(p) model, the MOM estimates can be found by solving Eq. (4.81) for k = 1, 2, ..., p simultaneously for the Φ's by substituting the population covariance matrices $M_{k,\tau}$, k = 1, ..., p, by the corresponding sample covariance matrices $\hat{M}_{k,\tau}$, k = 1, ..., p. Then Eq. (4.80) is used to estimate the variance-covariance matrix of the residuals $G_\tau$.

After estimating $\Phi_{j,\tau}$, j = 1, ..., p, and $G_\tau$ as indicated above, $B_\tau$ can be estimated from

$$G_\tau = B_\tau B_\tau^T \tag{4.83}$$

As for the MAR(p) model, a solution for the above equation can be obtained by assuming that $B_\tau$ is a lower triangular matrix. Note that $G_\tau$ must be positive definite.
4.8 Disaggregation Models
4.8.1 General
Disaggregation models are efficient stochastic modeling techniques for hydrologic time series in cases where the preservation of statistical characteristics at both annual and seasonal scales is essential for the project under study. Valencia and Schaake (1973), and the later extension by Mejia and Rousselle (1976), introduced the basic disaggregation model for temporal disaggregation of annual flows into seasonal flows. However, the same model can also be used for spatial disaggregation. For example, the sum of flows of several stations can be disaggregated into flows at each of these stations, or the total flows at key stations can be disaggregated into flows at substations which usually, but not necessarily, sum to form the flows of the key stations. The Valencia and Schaake and the Mejia and Rousselle models require many parameters to be estimated, especially for temporal disaggregation. For example, the Valencia and Schaake model requires 156 parameters for disaggregating annual flows into 12 seasons for one station, and the Mejia and Rousselle model requires 168 parameters. If the same disaggregation is carried out for 3 sites, the two models require 1,404 and 1,512 parameters, respectively. Lane (1979) introduced the condensed model for temporal disaggregation, which drastically reduces the number of parameters required. For example, for the cases mentioned above, Lane's model requires 36 parameters for the one site case and 324 parameters for the 3 site case.
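The parameter counts quoted above can be reproduced with simple arithmetic. The matrix shapes used in the sketch below are an assumption chosen because they give exactly the quoted numbers: for n sites and ω seasons, A is taken as (nω × n) and B as (nω × nω) in the Valencia and Schaake model, the Mejia and Rousselle model adds a C matrix of (nω × n), and Lane's condensed model uses three (n × n) matrices per season. The function names are hypothetical.

```python
def valencia_schaake_count(n_sites, n_seasons):
    rows = n_sites * n_seasons
    return rows * n_sites + rows * rows          # A plus B (assumed shapes)


def mejia_rousselle_count(n_sites, n_seasons):
    rows = n_sites * n_seasons
    # same as Valencia and Schaake plus an additional C matrix (assumed shape)
    return valencia_schaake_count(n_sites, n_seasons) + rows * n_sites


def lane_count(n_sites, n_seasons):
    return n_seasons * 3 * n_sites * n_sites     # A, B and C for each season


for sites in (1, 3):
    print(sites,
          valencia_schaake_count(sites, 12),
          mejia_rousselle_count(sites, 12),
          lane_count(sites, 12))
```

The counts grow with the square of the number of seasons in the first two models but only linearly in Lane's model, which is the source of the reduction.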
In SAMS, Lane’s model will be used for temporal (seasonal) disaggregation. The
Valencia and Schaake and Mejia and Rousselle models will be used for spatial disaggregation
and univariate seasonal disaggregation where the annual flows for only one site will be
disaggregated into seasonal flows for the same site.
In using disaggregation models for data generation, adjustments may be needed to ensure
additivity constraints. For instance, in spatial disaggregation, to ensure that the generated flows
at substations (or at subsequent stations) add to the total or a fraction (depending on the
particular case at hand) of the corresponding generated flow at a key station (or subkey station)
or, in temporal disaggregation, to ensure that the generated seasonal values add exactly to the
generated annual value, three methods of adjustment based on Lane and Frevert (1990) are
provided in SAMS. These methods will be described in detail in the following sections.
4.8.2 Model Formulations
Valencia and Schaake Model
The model can be expressed as (Valencia and Schaake, 1973)

$$Y_t = A X_t + B\, \varepsilon_t \tag{4.84}$$

in which $Y_t$ is an (f×1) column vector with elements $Y_t^i$, i = 1, ..., f; $X_t$ is an (h×1) column vector with elements $X_t^i$, i = 1, ..., h, where h and f are appropriate matrix dimensions. For example, in the key station to substation disaggregation, h and f represent the number of key stations and substations, respectively. $\varepsilon_t$ is an (f×1) vector of normally distributed noise terms with mean 0 and the identity matrix as its variance-covariance matrix. The noises $\varepsilon_t$ are independent in both time and space. A and B are (f×h) and (f×f) parameter matrices. The number of key stations h in the above equations can be more than one, so the above model can be used to disaggregate annual flows at several key stations to their corresponding flows at substations in a multivariate form which would be able to preserve the inter (cross) correlations among the stations.

The model parameter matrices A and B can be estimated by using the MOM as (Valencia and Schaake, 1973):

$$A = M_0(YX)\, M_0^{-1}(X) \tag{4.85}$$

$$B B^T = M_0(Y) - M_0(YX)\, M_0^{-1}(X)\, M_0(XY) \tag{4.86}$$

where

$$M_k(X) = E[X_t X_{t-k}^T]$$

$$M_k(Y) = E[Y_t Y_{t-k}^T]$$

$$M_k(YX) = E[Y_t X_{t-k}^T]$$

$$M_k(XY) = E[X_t Y_{t-k}^T]$$

Equations (4.85) and (4.86) can be used to obtain estimates of A and B by substituting the population moments $M_0(X)$, $M_0(Y)$, $M_0(XY)$, and $M_0(YX)$ by their corresponding sample estimates.
Mejia and Rousselle Model
This model can be expressed as

$$Y_t = A X_t + B\, \varepsilon_t + C\, Y_{t-1} \tag{4.87}$$

in which $Y_t$, $X_t$, $\varepsilon_t$, A, and B are defined in the same way as for the Valencia and Schaake model and C is an additional (f×f) parameter matrix. As for the Valencia and Schaake model, the number of key stations h in the above equations can be more than one, so the above model can be used to disaggregate annual flows at several key stations to their corresponding flows at substations.

The model parameter matrices A, B, and C can be estimated by using the MOM as:

$$A = \left[ M_0(YX) - M_1(Y)\, M_0^{-1}(Y)\, M_1^T(XY) \right] \left[ M_0(X) - M_1(XY)\, M_0^{-1}(Y)\, M_1^T(XY) \right]^{-1} \tag{4.88}$$

$$C = \left[ M_1(Y) - A\, M_1(XY) \right] M_0^{-1}(Y) \tag{4.89}$$

$$B B^T = M_0(Y) - A\, M_0(XY) - C\, M_1^T(Y) \tag{4.90}$$

Equations (4.88) through (4.90) can be used to obtain estimates of A, B, and C by substituting the population moments $M_0(X)$, $M_0(Y)$, $M_0(XY)$, $M_0(YX)$, $M_1(X)$, $M_1(Y)$, $M_1(XY)$, and $M_1(YX)$ by their corresponding sample estimates. Lane (1981) showed that some problems exist if one uses the above equations to estimate the parameters. Specifically, the problem is in using $M_1(XY)$. He showed that the generated moments are affected and some key moments are not preserved. As a result, he suggested that, instead of using a sample estimate of $M_1(XY)$, one should use the model (population) $M_1(XY)$ that would result from the model structure (for further details, the reader is referred to Lane and Frevert, 1991). In the final analysis, the suggested equation is

$$M_1^*(XY) = M_1(X)\, M_0^{-1}(X)\, M_0(XY) \tag{4.91}$$

The value of $M_1^*(XY)$ calculated from Eq. (4.91) should be used in place of $M_1(XY)$ in Eqs. (4.88) through (4.90) for estimating the model parameters. Lane also suggested that $M_1(Y)$ should be calculated as:

$$M_1^*(Y) = M_1(Y) + M_0(YX)\, M_0^{-1}(X) \left[ M_1^*(XY) - M_1(XY) \right] \tag{4.92}$$

The reader is referred to Lane and Frevert (1991) for more in-depth details about these adjustments.
Lane's Condensed Model
The model can be expressed as

$$Y_{\nu,\tau} = A_\tau X_\nu + B_\tau\, \varepsilon_{\nu,\tau} + C_\tau\, Y_{\nu,\tau-1} \tag{4.93}$$

in which $Y_{\nu,\tau}$ is an (n×1) column vector with elements $Y_{\nu,\tau}^i$, i = 1, ..., n; $X_\nu$ is an (n×1) column vector with elements $X_\nu^i$, i = 1, ..., n; and $\varepsilon_{\nu,\tau}$ is an (n×1) vector of normally distributed noise terms with mean 0 and the identity matrix as its variance-covariance matrix. The noises $\varepsilon_{\nu,\tau}$ are independent in time and space, and n is the number of sites.

The model parameter matrices A, B, and C can be estimated by using the MOM as (Lane and Frevert, 1991):

$$A_\tau = \left[ M_{0,\tau}(YX) - M_{1,\tau}(Y)\, M_{0,\tau-1}^{-1}(Y)\, M_{1,\tau}^T(XY) \right] \left[ M_0(X) - M_{1,\tau}(XY)\, M_{0,\tau-1}^{-1}(Y)\, M_{1,\tau}^T(XY) \right]^{-1} \tag{4.94}$$

$$C_\tau = \left[ M_{1,\tau}(Y) - A_\tau M_{1,\tau}(XY) \right] M_{0,\tau-1}^{-1}(Y) \tag{4.95}$$

$$B_\tau B_\tau^T = M_{0,\tau}(Y) - A_\tau M_{0,\tau}(XY) - C_\tau M_{1,\tau}^T(Y) \tag{4.96}$$

where

$$M_k(X) = E[X_\nu X_{\nu-k}^T]$$

$$M_{k,\tau}(Y) = E[Y_{\nu,\tau} Y_{\nu,\tau-k}^T]$$

$$M_{k,\tau}(YX) = E[Y_{\nu,\tau} X_{\nu-k}^T]$$

$$M_{k,\tau}(XY) = E[X_\nu Y_{\nu,\tau-k}^T]$$

The MOM parameter matrices can be estimated by substituting the population moments by their corresponding sample estimates and solving Eqs. (4.94) through (4.96) for the parameters.

In a similar manner as for the Mejia and Rousselle model, Lane (1981) suggested that the following moments should be adjusted as follows:

$$M_{1,\tau}^*(XY) = M_1(X)\, M_0^{-1}(X)\, M_{0,\tau-1}(XY) \tag{4.97}$$

$$M_{1,\tau}^*(Y) = M_{1,\tau}(Y) + M_{0,\tau}(YX)\, M_0^{-1}(X) \left[ M_{1,\tau}^*(XY) - M_{1,\tau}(XY) \right] \tag{4.98}$$

The above adjustments are needed only for the first season.

Adjustment for spatial disaggregation


Three approaches are available for the adjustment of spatially disaggregated data. They are:

approach 1:

$$\hat{q}_t^{*(i)} = \hat{q}_t^{(i)} + \left[ r \hat{q}_t - \sum_{j=1}^{n} \hat{q}_t^{(j)} \right] \frac{|\hat{q}_t^{(i)} - \hat{\mu}^{(i)}|}{\sum_{j=1}^{n} |\hat{q}_t^{(j)} - \hat{\mu}^{(j)}|} \tag{4.99}$$

approach 2:

$$\hat{q}_t^{*(i)} = \frac{\hat{q}_t^{(i)}\, (r \hat{q}_t)}{\sum_{j=1}^{n} \hat{q}_t^{(j)}} \tag{4.100}$$

approach 3:

$$\hat{q}_t^{*(i)} = \hat{q}_t^{(i)} + \left( r \hat{q}_t - \sum_{j=1}^{n} \hat{q}_t^{(j)} \right) \frac{\hat{\sigma}^{(i)2}}{\sum_{j=1}^{n} \hat{\sigma}^{(j)2}} \tag{4.101}$$

where

$$r = \frac{1}{N} \sum_{t=1}^{N} r_t \tag{4.102}$$

$$r_t = \frac{\sum_{j=1}^{n} q_t^{(j)}}{q_t} \tag{4.103}$$

and N is the number of observations, n is the number of substations (or subsequent stations), $q_t$ is the t-th observed value at a key station (or substation), $q_t^{(j)}$ is the t-th observed value at substation (or subsequent station) j, $\hat{q}_t$ is the generated value at the key station (or substation), $\hat{q}_t^{(i)}$ is the generated value at substation i (or subsequent station), $\hat{q}_t^{*(i)}$ is the adjusted generated value at substation i (or subsequent station), $\hat{\mu}^{(i)}$ is the estimated mean of $\hat{q}_t^{(i)}$ for site i, and $\hat{\sigma}^{(i)}$ is the estimated standard deviation of $\hat{q}_t^{(i)}$ for site i.

Adjustment for temporal disaggregation


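Three approaches are also available for the adjustment of temporally disaggregated data. They are:

approach 1:

$$\hat{q}_{\nu,\tau}^* = \hat{q}_{\nu,\tau} + \left( \hat{Q}_\nu - \sum_{t=1}^{\omega} \hat{q}_{\nu,t} \right) \frac{|\hat{q}_{\nu,\tau} - \hat{\mu}_\tau|}{\sum_{t=1}^{\omega} |\hat{q}_{\nu,t} - \hat{\mu}_t|} \tag{4.104}$$

approach 2:

$$\hat{q}_{\nu,\tau}^* = \frac{\hat{q}_{\nu,\tau}\, \hat{Q}_\nu}{\sum_{t=1}^{\omega} \hat{q}_{\nu,t}} \tag{4.105}$$

and approach 3:

$$\hat{q}_{\nu,\tau}^* = \hat{q}_{\nu,\tau} + \left( \hat{Q}_\nu - \sum_{t=1}^{\omega} \hat{q}_{\nu,t} \right) \frac{\hat{\sigma}_\tau^2}{\sum_{t=1}^{\omega} \hat{\sigma}_t^2} \tag{4.106}$$

where ω is the number of seasons, $\hat{Q}_\nu$ is the generated annual value, $\hat{q}_{\nu,\tau}$ is the generated seasonal value, $\hat{q}_{\nu,\tau}^*$ is the adjusted generated seasonal value, $\hat{\mu}_\tau$ is the estimated mean of $\hat{q}_{\nu,\tau}$ for season τ, and $\hat{\sigma}_\tau$ is the estimated standard deviation of $\hat{q}_{\nu,\tau}$ for season τ.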
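The three temporal adjustments can be sketched for one generated year as follows. The function and argument names are hypothetical. In each approach the difference between the generated annual value and the sum of the generated seasonal values is redistributed over the seasons, so that the adjusted seasonal values add exactly to the annual value.

```python
def adjust_seasonal(q, annual, mu, sigma, approach):
    """Adjust the generated seasonal values q of one year to sum to annual.

    mu and sigma hold the estimated seasonal means and standard deviations.
    """
    gap = annual - sum(q)
    if approach == 1:                    # Eq. (4.104): weights |q - mu|
        weights = [abs(qi - mi) for qi, mi in zip(q, mu)]
        total = sum(weights)
        return [qi + gap * wi / total for qi, wi in zip(q, weights)]
    if approach == 2:                    # Eq. (4.105): proportional scaling
        total = sum(q)
        return [qi * annual / total for qi in q]
    if approach == 3:                    # Eq. (4.106): weights sigma^2
        weights = [si * si for si in sigma]
        total = sum(weights)
        return [qi + gap * wi / total for qi, wi in zip(q, weights)]
    raise ValueError("approach must be 1, 2 or 3")


q = [10.0, 20.0, 30.0, 40.0]             # hypothetical four-season example
mu = [12.0, 18.0, 33.0, 35.0]
sigma = [2.0, 4.0, 6.0, 8.0]
for approach in (1, 2, 3):
    print(approach, adjust_seasonal(q, 110.0, mu, sigma, approach))
```

Approach 2 keeps all values positive when the generated values are positive, whereas approaches 1 and 3 add or subtract a share of the gap and can in principle produce negative values when the gap is large and negative.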

4.9 Model Testing
The fitted model must be tested to determine whether the model complies with the model
assumptions and whether the model is capable of reproducing the historical statistical properties
of the data at hand. Essentially the key assumptions of the models refer to the underlying
characteristics of the residuals such as normality and independence.
Testing the properties of the residuals
Testing the properties of the residuals generally involves testing the normality and the independence of the residuals. First, the residuals are obtained from the specified models after the parameters are estimated. For instance, in the case of the univariate PARMA model of Eq. (4.40), the residuals are the numbers $e_{1,1}, e_{1,2}, e_{1,3}, \ldots$ that are derived from the model. On the other hand, in the case of the MPAR model of Eq. (4.77), the residuals are the sets of numbers $e_{1,1}^{(i)}, e_{1,2}^{(i)}, e_{1,3}^{(i)}, \ldots$, i = 1, ..., n, each set i corresponding to a site or station. Testing the residual properties can be done in several ways depending on how the residuals are arranged.
Several tests are available for testing the normality of the residuals. Common normality
tests include the skewness test, the chi-square goodness of fit test, the Kolmogorov-Smirnov test,
and the product moment correlation test (Salas et al, 1999). For periodic-stochastic models, the
normality tests should be applied on a month-by-month basis. Often though the tests are applied
considering the entire sample of residuals. In the case of multivariate models, the normality tests
should be applied for each set of data (site by site). In SAMS, the skewness test of normality
is applied on a month-by-month basis and on a site by site basis.
Likewise, several tests are available for testing the independence of the residuals. The
Portmanteau lack of fit test and the Anderson test (Salas et al, 1980) are commonly used for
testing independence in time when the residuals are derived from stationary stochastic models.
On the other hand, the cross-correlation t-test may be used for testing independence in time when
the residuals are derived from periodic-stochastic models such as those described in the previous
sections. The t-test is applied for the correlation between the residuals of two successive months,
i.e. twelve tests for monthly data. However, the Portmanteau or Anderson tests may be also
applied for testing the independence of residuals derived from periodic-stochastic models, based
on the autocorrelation of the entire residual series. In SAMS, the Portmanteau test of independence is applied. For testing the independence between residuals of two different sites (independence in space), the usual test is based on the cross-correlation t-test. This test should also be applied for the cross-correlation between residuals of two sites on a season-by-season basis (twelve tests for monthly data), although the test can be applied based on the cross-correlation of the entire residual series for each pair of sites.
Testing ARMA model parsimony
For a fitted ARMA(p,q) model, SAMS tests model parsimony using the Akaike
Information Criterion (AIC) (Salas et al., 1980). For comparing among competing ARMA(p,q)
models, the following equation is used:

$$\mathrm{AIC}(p,q) = N \ln(\hat{\sigma}_\varepsilon^2) + 2(p + q) \tag{4.107}$$

where N is the sample size and $\hat{\sigma}_\varepsilon^2$ is the maximum likelihood estimate of the residual variance.
Under this criterion, the model that gives the minimum AIC is the one to be selected. SAMS
computes AIC values for the fitted model and for the models one step higher and one step
lower in order. For instance, for a fitted ARMA(1,1) model, SAMS will compute
the AIC values for the ARMA(1,1), ARMA(2,1), ARMA(1,2), ARMA(1,0), and ARMA(0,1)
models for comparison. In addition, to test the assumption of white noise, the AIC of the
ARMA(0,0) model is also computed.
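As a rough sketch of the comparison described above, the following Python fragment evaluates Eq. (4.107) and lists the candidate orders for a fitted ARMA(p,q) model. The helper names are hypothetical, and the residual variance of each candidate model is assumed to be supplied by a separate fitting step that is not shown.

```python
import math

def aic(n, resid_var, p, q):
    """Eq. (4.107): AIC(p,q) = N ln(residual variance) + 2(p + q)."""
    return n * math.log(resid_var) + 2 * (p + q)

def neighbor_orders(p, q):
    """Fitted order, one step higher and lower in p and q, plus ARMA(0,0)."""
    candidates = [(p, q), (p + 1, q), (p, q + 1), (p - 1, q), (p, q - 1), (0, 0)]
    orders = []
    for order in candidates:
        if min(order) >= 0 and order not in orders:   # drop negative and repeated orders
            orders.append(order)
    return orders

def select_order(n, resid_vars):
    """Pick the (p, q) with minimum AIC from a dict {(p, q): residual variance}."""
    return min(resid_vars, key=lambda pq: aic(n, resid_vars[pq], *pq))
```

For a standardized series, a white noise model leaves the residual variance equal to 1, so its AIC is zero under this formula; any competing model must reduce ln(variance) by more than its parameter penalty to be preferred.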
Testing the properties of the process
Testing the properties of the process generally means comparing the statistical properties
(statistics) of the process being modeled, for instance, the process $Y_{\nu,\tau}$ in Eq.(4.40), with
those of the historical sample. In general, one would like the model to be capable of reproducing
the necessary statistics that affect the variability of the data. Furthermore, the model should be
capable of reproducing certain statistics that are related to the intended use of the model.
If $Y_{\nu,\tau}$ has been previously transformed from $X_{\nu,\tau}$, the original non-normal process,
then one must test, in addition to the statistical properties of Y, some of the properties of X.
Generally, the properties of Y include the seasonal mean, seasonal variance, seasonal skewness,
and season-to-season correlations and cross-correlations (in the case of multisite processes), and
the properties of X include the seasonal mean, variance, skewness, correlations, and
cross-correlations (for multisite systems). Furthermore, additional properties of $X_{\nu,\tau}$ such as those
related to low flows, high flows, droughts, and storage may be included depending on the
particular problem at hand.
In addition, it is often the case that not only the properties of the seasonal processes
$Y_{\nu,\tau}$ and $X_{\nu,\tau}$ must be tested but also the properties of the corresponding annual processes
AY and AX. For example, this case arises when designing the storage capacity of reservoir
systems or when testing the performance of reservoir systems of given capacities, in which one
or more reservoirs are for over-year regulation. In such cases the annual properties considered
are usually the mean, variance, skewness, autocorrelations, cross-correlations (for multisite
systems), and more complex properties such as those related to droughts and storage.
The comparison of the statistical properties of the process being modeled versus the
historical properties may be done in two ways. Depending on the type of model, certain
properties of the Y process such as the mean(s), variance(s), and covariance(s) can be derived
from the model in closed form. If the method of moments is used for parameter estimation, the
mean(s), variance(s), and some of the covariances should be reproduced exactly. However,
except for the mean, that may not be the case for other estimation methods. Finding properties
of the Y process in closed form beyond the first two moments, for instance, drought related
properties, is complex, and such results are generally not available for most models. Likewise, except for
simple models, finding properties in closed form for the corresponding annual process AY is not
simple either. In such cases, the required statistical properties are derived by data generation.
Data generation studies for comparing statistical properties of the underlying process Y
(and other derived processes such as AY, X and AX) are generally undertaken based on samples
of the same length as the historical record and on a number of samples
large enough to estimate the statistical properties of concern with adequate precision. While
some statistical rules can be derived to determine the number of samples required, a
practical rule is to generate, say, 100 samples, which can give an idea of the distribution of the
statistic of interest, say $\theta$. In any case, the statistics $\theta(i)$, i = 1,...,100 are estimated from
the 100 samples and their mean $\bar{\theta}$ and variance $S^2(\theta)$ are determined. Then, the mean
deviation, $\mathrm{MD}(\theta)$,

$$\mathrm{MD}(\theta) = \bar{\theta} - \theta(H) \tag{4.108}$$

and the relative root mean square deviation, $\mathrm{RRMSD}(\theta)$,

$$\mathrm{RRMSD}(\theta) = \frac{\sqrt{\frac{1}{100}\sum_{i=1}^{100}\left[\theta(i) - \theta(H)\right]^2}}{\theta(H)} \tag{4.109}$$

are obtained, in which $\theta(H)$ is the statistic derived from the historical sample (historical
statistic). The statistics $\mathrm{MD}(\theta)$ and $\mathrm{RRMSD}(\theta)$ are useful for comparing the
historical and model statistics derived by data generation. In addition, one can observe
where $\theta(H)$ falls relative to $\bar{\theta} - S(\theta)$ and $\bar{\theta} + S(\theta)$. Graphical comparisons
such as the Box-Cox diagrams can also be useful.
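The comparison measures above can be sketched in a few lines of Python. This is an illustration only: the function name is hypothetical, the RRMSD is taken as the root of the mean squared deviation from the historical statistic divided by the historical statistic, and the sample variance is assumed to use the m − 1 divisor.

```python
import math

def compare_statistic(generated, historical):
    """Summarize generated values of one statistic against its historical value."""
    m = len(generated)
    mean = sum(generated) / m
    std = math.sqrt(sum((g - mean) ** 2 for g in generated) / (m - 1))  # assumed divisor
    md = mean - historical                                   # mean deviation
    rmsd = math.sqrt(sum((g - historical) ** 2 for g in generated) / m)
    return {
        "mean": mean,
        "std": std,
        "MD": md,
        "RRMSD": rmsd / historical,
        # does the historical value fall within one standard deviation of the mean?
        "within_one_std": mean - std <= historical <= mean + std,
    }
```

Here `generated` would hold one value of the statistic per generated sample (for example 100 values of the lag-1 correlation) and `historical` the value computed from the historical record.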

5 EXAMPLES
5.1 Statistical Analysis of Data
In this section, SAMS operations are used to model actual hydrologic data. The data
used are the monthly data of the Yakima basin, read from the file yakima.dat,
which can be obtained from the diskette accompanying this manual. The file contains data for
12 stations in the Yakima basin. Each station's record consists of 12 seasons per year and is 48 years long.
As an illustration, a sample of the data file is shown in Appendix A. SAMS was used to analyze
the statistics of the seasonal and annual data. Some of the statistics calculated by SAMS are
shown below.

Annual Statistics
Site Number: 1 KEECHELUS_RESERVOIR

Historical
Mean 242.9312
Standard Deviation 55.3134
Skewness Coefficient 0.3416
Coef. Variation 0.2277
Maximum 375.5001
Minimum 151.7000

Correlation Structure
LAG
0 1.0000
1 0.2773
2 -0.0591
3 0.0644
4 0.0104
5 0.0736
6 -0.1389
7 -0.1669
8 -0.0322
9 -0.1162
10 0.0034
Lag-0 Cross Correlations
Sites
1 and 1 (KE & KE) 1.0000
1 and 2 (KE & KA) 0.9877
1 and 3 (KE & YA) 0.7864
1 and 4 (KE & CL) 0.9826
1 and 5 (KE & YA) 0.9834
1 and 6 (KE & YA) 0.9525
1 and 7 (KE & BU) 0.9190
1 and 8 (KE & NA) 0.8831
1 and 9 (KE & TI) 0.8787
1 and 10 (KE & TI) 0.8698
1 and 11 (KE & NA) 0.8626
1 and 12 (KE & YA) 0.9243

Storage and Drought Statistics

Demand Level = 1.0000 * sample mean

Longest Drought 7.0000


Maximum Deficit 344.2187
Longest Surplus 6.0000
Maximum Surplus 244.0125
Storage Capacity 576.3561
Rescaled Range 10.4198
Hurst Coefficient 0.7375

Site Number: 2 KACHESS_RESERVOIR


***********
Historical
Mean 211.7479
Standard Deviation 52.4475
Skewness Coefficient 0.2010
Coef. Variation 0.2477
Maximum 324.6000
Minimum 120.1000

Correlation Structure
LAG
0 1.0000
1 0.2790
2 -0.0329
3 0.0957
4 -0.0304
5 0.0323
6 -0.1500
7 -0.1782
8 -0.0666
9 -0.1703
10 -0.0300

Lag-0 Cross Correlations


Sites
2 and 1 (KA & KE) 0.9877
2 and 2 (KA & KA) 1.0000
2 and 3 (KA & YA) 0.7712
2 and 4 (KA & CL) 0.9913
2 and 5 (KA & YA) 0.9923
2 and 6 (KA & YA) 0.9632
2 and 7 (KA & BU) 0.9470
2 and 8 (KA & NA) 0.9157
2 and 9 (KA & TI) 0.9072
2 and 10 (KA & TI) 0.9017
2 and 11 (KA & NA) 0.9027
2 and 12 (KA & YA) 0.9425

Storage and Drought Statistics

Demand Level = 1.0000 * sample mean

Longest Drought 7.0000


Maximum Deficit 310.3353
Longest Surplus 6.0000
Maximum Surplus 234.6083

Storage Capacity 503.2062
Rescaled Range 9.5945
Hurst Coefficient 0.7115

Seasonal Statistics
Site Number: 1 KEECHELUS_RESERVOIR
***********
Season Historical

Mean
1 21.6250
2 22.5979
3 17.8708
4 14.1542
5 15.5708
6 26.8333
7 47.4375
8 38.1917
9 14.9604
10 4.7375
11 5.4792
12 13.4729

Standard Deviation
1 13.5856
2 13.9981
3 10.2554
4 8.9925
5 8.5916
6 8.5001
7 14.4123
8 19.0200
9 11.6909
10 2.6210
11 4.3821
12 8.4761

Skewness Coefficient
1 1.0570
2 1.6400
3 0.8679
4 1.0953
5 2.2601
6 0.2109
7 0.1997
8 0.2420
9 1.1964
10 1.3112
11 2.8219
12 0.8688

Season to Season Correlations

LAG 1
1 0.5775
2 0.2969
3 0.2198

4 0.4555
5 0.4143
6 0.3211
7 -0.0872
8 0.5527
9 0.8343
10 0.8618
11 0.2814
12 0.4562

LAG 2
1 0.3728
2 0.4746
3 0.1630
4 0.0556
5 0.2264
6 -0.0199
7 -0.1219
8 -0.3637
9 0.3692
10 0.7047
11 0.2319
12 0.1770

Lag-0 Season to Season Cross Correlations

Sites 1 and 2 (KE & KA)

1 0.9853
2 0.9828
3 0.9793
4 0.9847
5 0.9924
6 0.9632
7 0.9788
8 0.9906
9 0.9888
10 0.8572
11 0.9504
12 0.9888

Storage and Drought Statistics

Demand Level = 1.0000 * sample mean

Longest Drought 11.0000


Maximum Deficit 123.7427
Longest Surplus 7.0000
Maximum Surplus 163.8901
Storage Capacity 640.1103
Rescaled Range 39.0407
Hurst Coefficient 0.6471
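The drought and surplus entries in the listings above are run statistics about a demand level, set here to 1.0 times the sample mean. The following sketch shows one common way to compute such quantities. It is not taken from SAMS: the function name is hypothetical, a drought is assumed to be a run of consecutive values below the demand, and the deficit of a run is assumed to be the accumulated shortfall over that run.

```python
def run_statistics(flows, demand_fraction=1.0):
    """Longest runs and largest accumulated departures about a demand level."""
    demand = demand_fraction * sum(flows) / len(flows)
    stats = {"longest_drought": 0, "max_deficit": 0.0,
             "longest_surplus": 0, "max_surplus": 0.0}
    run_len, run_sum, below = 0, 0.0, None
    for x in flows:
        now_below = x < demand          # assumption: x == demand counts as surplus
        if now_below != below:          # a new run starts when the side changes
            run_len, run_sum, below = 0, 0.0, now_below
        run_len += 1
        run_sum += abs(x - demand)
        if below:
            stats["longest_drought"] = max(stats["longest_drought"], run_len)
            stats["max_deficit"] = max(stats["max_deficit"], run_sum)
        else:
            stats["longest_surplus"] = max(stats["longest_surplus"], run_len)
            stats["max_surplus"] = max(stats["max_surplus"], run_sum)
    return stats
```

For the short series `[3, 1, 1, 5, 6, 2]` the demand is 3, the two values of 1 form the longest drought, and their accumulated shortfall of 4 is the maximum deficit.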

5.2 Stochastic Modeling and Generation of Streamflow Data

SAMS was used to model the annual and monthly flows of site 1 of the Yakima basin (refer
to file yakima.dat). Both the annual and monthly data used in the following examples were
transformed using the logarithmic transformation; the transformation coefficients are shown in
Appendix C.
5.2.1 Univariate ARMA(p,q) Model
SAMS was used to model the annual flows of site 1 with an ARMA(1,1) model. The
MOM was used to estimate the model parameters. SAMS was also used to generate 150
samples each 48 years long using the estimated parameters. The following is a summary of the
results of the model fitting and generation by using the ARMA(1,1) model.
Results of fitting an ARMA(1,1) model to the transformed and standardized annual
flows of site 1:
Model:ARMA

Number_of_sites: 1
Site(s)_ID: 1
Data_Transformations:
Site_1: LOG
a-coef= 49.000000
Data_Standardization: YES
Mean_of_the_process:
5.658607
Standard_deviation_of_the_process:
0.189585

Model_order(p,q): 1 1

phi_parameters: (Annual)

phi_1
-0.138036

theta_parameters: (Annual)

theta_1
-0.494947

Variance_of_the_residuals: (Annual)
0.885071
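The fitted values in the listing above can be cross-checked and used for generation with a short sketch. It assumes the model is written as z(t) = phi z(t-1) + e(t) − theta e(t-1) for the standardized series, that the LOG transformation has the form Y = ln(X + a), and that a warm-up period is discarded; the function names and the warm-up length are hypothetical. Under that sign convention the unit-variance residual variance formula gives about 0.88507, within about 1e-5 of the reported 0.885071.

```python
import math
import random

PHI, THETA = -0.138036, -0.494947                   # fitted values from the listing
MEAN_Y, STD_Y, A_COEF = 5.658607, 0.189585, 49.0    # process mean, std, a-coef

def arma11_residual_variance(phi, theta):
    """Residual variance of a unit-variance ARMA(1,1) with the assumed sign convention."""
    return (1.0 - phi ** 2) / (1.0 + theta ** 2 - 2.0 * phi * theta)

def generate_annual(n_years, rng, warmup=50):
    """Generate one sample of annual flows in original units (assumed back-transform)."""
    sigma = math.sqrt(arma11_residual_variance(PHI, THETA))
    z_prev, e_prev, flows = 0.0, 0.0, []
    for t in range(warmup + n_years):
        e = rng.gauss(0.0, sigma)
        z = PHI * z_prev + e - THETA * e_prev       # standardized ARMA(1,1) value
        z_prev, e_prev = z, e
        if t >= warmup:                             # keep values after the warm-up only
            y = MEAN_Y + STD_Y * z                  # undo the standardization
            flows.append(math.exp(y) - A_COEF)      # undo Y = ln(X + a)
    return flows

# 150 samples, each 48 years long, as in the example
samples = [generate_annual(48, random.Random(seed)) for seed in range(150)]
```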

Results of statistical analysis of the data generated from the ARMA(1,1) model:

Model: Univariate ARMA, (Statistical Analysis of Generated Data)

Site Number: 1 KEECHELUS_RESERVOIR


***********
Historical Generated
Mean 242.9312 242.9985
Standard Deviation 55.3134 53.8040
Skewness Coefficient 0.3416 0.4131

Coef. Variation 0.2277 0.2212
Maximum 375.5001 385.1967
Minimum 151.7000 138.7450

Correlation Structure
LAG
0 1.0000 1.0000
1 0.2773 0.2691
2 -0.0591 -0.0625
3 0.0644 -0.0349
4 0.0104 -0.0237
5 0.0736 -0.0202
6 -0.1389 -0.0310
7 -0.1669 -0.0308
8 -0.0322 -0.0448
9 -0.1162 -0.0426
10 0.0034 -0.0277

Storage and Drought Statistics

Demand Level = 1.0000 * sample mean

Longest Drought 7.0000 6.0267


Maximum Deficit 344.2187 287.2662
Longest Surplus 6.0000 5.2533
Maximum Surplus 244.0125 311.2614
Storage Capacity 576.3561 488.3525
Rescaled Range 10.4198 9.1089
Hurst Coefficient 0.7375 0.6879

SAMS was also used to model the transformed and standardized annual flows of site 7
with an ARMA(2,2) model using the Approximate LS method. The modeling results for this
site are shown below:

Model:ARMA

Number_of_sites: 1
Site(s)_ID: 7
Data_Transformations:
Site_7: LOG
a-coef= 450.000000
Data_Standardization: YES
Mean_of_the_process:
6.488171
Standard_deviation_of_the_process:
0.081923

Model_order(p,q): 2 2

phi_parameters: (Annual)

phi_1
0.316854
phi_2
-0.122860

theta_parameters: (Annual)

theta_1
-0.002752
theta_2
0.003944

Variance_of_the_residuals: (Annual)
0.918059

150 samples each 48 years long were generated using these estimated parameters. The
statistical analysis results of the generated data are shown below:

Model: Univariate ARMA, (Statistical Analysis of Generated Data)

Site Number: 7 BUMPING_RESERVOIR


***********
Historical Generated
Mean 209.5250 209.2238
Standard Deviation 53.9224 53.1033
Skewness Coefficient 0.1097 0.1912
Coef. Variation 0.2574 0.2541
Maximum 316.4000 338.9804
Minimum 112.1000 96.7291

Correlation Structure
LAG
0 1.0000 1.0000
1 0.2548 0.2532
2 -0.0238 -0.0711
3 0.0770 -0.0782
4 -0.0034 -0.0399
5 0.0430 -0.0203
6 -0.1625 -0.0320
7 -0.1544 -0.0294
8 -0.1121 -0.0311
9 -0.2085 -0.0229
10 -0.0532 -0.0273

Storage and Drought Statistics

Demand Level = 1.0000 * sample mean

Longest Drought 4.0000 5.8067


Maximum Deficit 255.5000 287.1717
Longest Surplus 6.0000 5.4667
Maximum Surplus 268.4500 295.3221
Storage Capacity 498.2249 461.1258
Rescaled Range 9.2397 8.7080
Hurst Coefficient 0.6996 0.6742

5.2.2 Univariate GAR(1) Model


A GAR(1) model was fitted to the annual data of site 1. Based on this model, the
skewness coefficient of the historical data can be preserved without data transformation. The
estimated parameters of the model are shown below:

Model:GAR

Number_of_sites: 1
Site(s)_ID: 1
Data_Transformations:
Site_1: NONE
Data_Standardization: NO
Mean_of_the_process:
242.931244
Standard_deviation_of_the_process:
55.313374
Skewness_coefficient_of_the_process:
0.341578
beta_parameters:
25.621111
alpha_parameters:
0.089647
lamda_parameters:
-42.867931
ph_parameters:
0.325271
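As a quick consistency check on the listing above, the mean of a three-parameter gamma distribution can be recomputed from the fitted parameters. The sketch assumes that beta is the shape parameter, alpha a rate-type scale parameter, and lamda the location parameter, so that the mean is location + shape / rate; the function name is hypothetical. Under that reading the result is about 242.932, close to the reported mean of 242.931244.

```python
BETA, ALPHA, LAMBDA = 25.621111, 0.089647, -42.867931   # values from the listing

def gamma3_mean(shape, rate, location):
    """Mean of a three-parameter gamma variable under the assumed parameterization."""
    return location + shape / rate

mean_from_parameters = gamma3_mean(BETA, ALPHA, LAMBDA)
```

The standard deviation and skewness implied by the same simple formulas do not match the listed values as closely, which may reflect bias corrections applied during estimation; that is not verified here.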

150 samples each 48 years long were generated using these estimated parameters. The
statistical analysis results of the generated data are shown below:
Model: Univariate GAR(1), (Statistical Analysis of Generated Data)

Site Number: 1 KEECHELUS_RESERVOIR


***********
Historical Generated
Mean 242.9312 241.6216
Standard Deviation 55.3134 55.1494
Skewness Coefficient 0.3416 0.3156
Coef. Variation 0.2277 0.2282
Maximum 375.5001 379.7300
Minimum 151.7000 131.1173

Correlation Structure
LAG
0 1.0000 1.0000
1 0.2773 0.2614
2 -0.0591 0.0640
3 0.0644 0.0105
4 0.0104 -0.0150
5 0.0736 -0.0288
6 -0.1389 -0.0384
7 -0.1669 -0.0416
8 -0.0322 -0.0436
9 -0.1162 -0.0421
10 0.0034 -0.0410

Storage and Drought Statistics

Demand Level = 1.0000 * sample mean

Longest Drought 7.0000 6.4067


Maximum Deficit 344.2187 332.5194

Longest Surplus 6.0000 6.0267
Maximum Surplus 244.0125 354.8967
Storage Capacity 576.3561 535.3652
Rescaled Range 10.4198 9.6257
Hurst Coefficient 0.7375 0.7045

5.2.3 Univariate PARMA(p,q) Model


A PARMA(1,1) model was fitted to the transformed and standardized monthly data of
site 1 of the Yakima basin using MOM. Part of the modeling results obtained by SAMS is
shown below:
Model:PARMA

Number_of_seasons: 12
Number_of_sites: 1
Site(s)_ID: 1
Data_Transformations:
Site_1: LOG
a-coef= 8.0 3.5 1.7 -1.7 -7.3 40.0 120.0 80.0 -1.4 0.0 -1.1 2.5
Data_Standardization: YES

Model_order(p,q): 1 1

parameters:
phi_1 theta_1 Variance_of_the_residuals
Season_1 0.799617 0.176386 0.576008
Season_2 0.601515 0.325764 0.802792
Season_3 0.562456 0.475817 0.931586
Season_4 0.351970 -0.173212 0.734578
Season_5 0.372740 0.031907 0.877790
Season_6 0.083416 -0.256987 0.897436
Season_7 -0.546931 -0.523320 0.968819
Season_8 4.603543 4.142385 0.133168
Season_9 0.845774 -0.414590 0.168387
Season_10 0.706037 0.137241 0.530972
Season_11 0.432385 -0.196226 0.702498
Season_12 0.265219 -0.149505 0.858247

The estimated parameters were used to generate 100 samples of seasonal (12 seasons)
data, each sample 48 years long. The statistical analysis results of the generated data are shown
below:

Model: Univariate PARMA, (Statistical Analysis of Generated Data)

Site Number: 1 KEECHELUS_RESERVOIR

***********

Season Historical Generated

Mean
1 21.6250 21.4531
2 22.5979 22.5754
3 17.8708 17.8748
4 14.1542 13.9850
5 15.5708 15.4822
6 26.8333 26.5404
7 47.4375 47.5850
8 38.1917 38.5255
9 14.9604 15.4387
10 4.7375 4.7413
11 5.4792 5.4180
12 13.4729 13.2594

Standard Deviation
1 13.5856 13.3797
2 13.9981 13.4388
3 10.2554 10.6862
4 8.9925 9.4890
5 8.5916 8.8690
6 8.5001 8.3496
7 14.4123 13.9888
8 19.0200 18.9623
9 11.6909 13.9993
10 2.6210 2.5598
11 4.3821 4.1501
12 8.4761 8.5231

Skewness Coefficient
1 1.0570 1.0899
2 1.6400 1.2611
3 0.8679 1.3163
4 1.0953 1.8644
5 2.2601 2.4466
6 0.2109 0.3551
7 0.1997 0.2544
8 0.2420 0.4822
9 1.1964 2.3082
10 1.3112 1.3478
11 2.8219 2.1814
12 0.8688 1.2928

Season to Season Correlations

LAG 1

1 0.5775 0.6249
2 0.2969 0.4015
3 0.2198 0.1513
4 0.4555 0.4693
5 0.4143 0.2756
6 0.3211 0.2770
7 -0.0872 -0.0946
8 0.5527 0.5754
9 0.8343 0.8147
10 0.8618 0.6320
11 0.2814 0.4625
12 0.4562 0.3269

LAG 2
1 0.3728 0.2491
2 0.4746 0.3682
3 0.1630 0.2180
4 0.0556 0.0162
5 0.2264 0.1639
6 -0.0199 0.0267
7 -0.1219 -0.1336
8 -0.3637 -0.3810
9 0.3692 0.4268
10 0.7047 0.6075
11 0.2319 0.2310
12 0.1770 0.1110

Storage and Drought Statistics

Demand Level = 1.0000 * sample mean

Longest Drought 11.0000 10.7400
Maximum Deficit 123.7427 131.8937
Longest Surplus 7.0000 6.5900
Maximum Surplus 163.8901 177.8746
Storage Capacity 640.1103 487.0978
Rescaled Range 39.0407 29.0030
Hurst Coefficient 0.6471 0.5907

5.2.4 Multivariate MAR(p) Model
SAMS was also used to model the transformed and standardized annual data of sites 3,
5, and 7 of the Yakima basin using the MAR (1) model. The modeling results are shown below:
Model:MAR

Number_of_sites: 3
Site(s)_ID: 3 5 7
Data_Transformations:
Site_3: LOG
a-coef= -205.000000
Site_5: LOG
a-coef= 2000.000000
Site_7: LOG
a-coef= 450.000000
Data_Standardization: YES
Mean_of_the_process:
6.096067
8.147832
6.488171
Standard_deviation_of_the_process:
0.461667
0.103274
0.081923

Model_order(p,q): 1 0

phi_parameters: (Annual)

phi_1
0.802852 -0.091863 -0.271925
0.180350 0.241441 -0.103272
0.127788 0.243420 -0.083069

Variance_of_the_residuals: (Annual)

0.716938 0.736062 0.704988
0.736062 0.900586 0.868521
0.704988 0.868521 0.919150
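Generation from the fitted MAR(1) model can be sketched as below. The sketch assumes the form z(t) = phi_1 z(t-1) + e(t) for the standardized transformed series, reads the 3 x 3 "Variance_of_the_residuals" block as the covariance matrix of e(t), and produces correlated noise through a Cholesky factor of that matrix. The helper names and the warm-up length are hypothetical, and the back-transformation to flow units is not shown.

```python
import math
import random

PHI1 = [[0.802852, -0.091863, -0.271925],
        [0.180350, 0.241441, -0.103272],
        [0.127788, 0.243420, -0.083069]]
G = [[0.716938, 0.736062, 0.704988],        # residual covariance (assumed reading)
     [0.736062, 0.900586, 0.868521],
     [0.704988, 0.868521, 0.919150]]

def cholesky(a):
    """Lower-triangular factor low such that low * low^T = a."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low

def generate_mar1(n_years, rng, warmup=50):
    """Standardized series for sites 3, 5, and 7; one row per year."""
    b = cholesky(G)
    z, out = [0.0, 0.0, 0.0], []
    for t in range(warmup + n_years):
        w = [rng.gauss(0.0, 1.0) for _ in range(3)]                       # independent noise
        eps = [sum(b[i][k] * w[k] for k in range(3)) for i in range(3)]   # correlated noise
        z = [sum(PHI1[i][j] * z[j] for j in range(3)) + eps[i] for i in range(3)]
        if t >= warmup:
            out.append(z)
    return out
```

The Cholesky factor is only one possible square root of the covariance matrix; any matrix B with B Bᵀ equal to that matrix yields noise with the same covariance.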

These estimated parameters were used to generate 150 samples of annual data, each 48
years long, for the three sites. The statistical analysis results of the generated data are shown
below:

Model: Multivariate AR (MAR), (Statistical Analysis of Generated Data)


Site Number: 3 YAKIMA_RIVER_AT_EASTON_DIVERSION_DAM
***********
Historical Generated
Mean 699.3479 687.1061
Standard Deviation 246.3507 218.7747
Skewness Coefficient 1.8333 1.1163
Coef. Variation 0.3523 0.3161
Maximum 1726.4000 1400.4399
Minimum 346.9000 367.8550

Correlation Structure
LAG
0 1.0000 1.0000
1 0.4976 0.3898
2 0.2140 0.1702
3 0.1931 0.0740
4 0.0206 0.0299
5 0.0005 0.0106
6 -0.1358 -0.0216
7 -0.1159 -0.0458
8 -0.0234 -0.0382
9 -0.0729 -0.0508
10 0.0363 -0.0590

Lag-0 Cross Correlations


Sites
3 and 3 (YA & YA) 1.0000 1.0000
3 and 5 (YA & YA) 0.8040 0.8653
3 and 7 (YA & BU) 0.7269 0.8142

Storage and Drought Statistics

Demand Level = 1.0000 * sample mean

Longest Drought 7.0000 8.8200


Maximum Deficit 1338.4355 1530.0778
Longest Surplus 6.0000 6.0600
Maximum Surplus 2412.7124 1740.6711
Storage Capacity 2420.0576 2481.6851
Rescaled Range 9.8236 11.1852
Hurst Coefficient 0.7189 0.7527

Site Number: 5 YAKIMA_RIVER_AT_CLE_ELUM


***********
Historical Generated
Mean 1474.3375 1461.5977
Standard Deviation 358.9830 348.3850
Skewness Coefficient 0.2136 0.2240

Coef. Variation 0.2435 0.2386
Maximum 2345.5000 2300.8103
Minimum 826.0001 732.6721

Correlation Structure
LAG
0 1.0000 1.0000
1 0.2872 0.2546
2 -0.0224 0.0445
3 0.1007 -0.0004
4 0.0092 -0.0187
5 0.0426 -0.0137
6 -0.1397 -0.0363
7 -0.1650 -0.0478
8 -0.0598 -0.0251
9 -0.1297 -0.0347
10 -0.0224 -0.0406

Lag-0 Cross Correlations


Sites
5 and 3 (YA & YA) 0.8040 0.8653
5 and 5 (YA & YA) 1.0000 1.0000
5 and 7 (YA & BU) 0.9536 0.9563

Storage and Drought Statistics

Demand Level = 1.0000 * sample mean

Longest Drought 7.0000 6.2200


Maximum Deficit 2220.5625 2088.8184
Longest Surplus 6.0000 5.6133
Maximum Surplus 1561.5746 2044.6892
Storage Capacity 3803.9871 3397.3022
Rescaled Range 10.5966 9.7093
Hurst Coefficient 0.7428 0.7070

Site Number: 7 BUMPING_RESERVOIR


***********
Historical Generated
Mean 209.5250 207.7169
Standard Deviation 53.9224 52.5678
Skewness Coefficient 0.1097 0.1658
Coef. Variation 0.2574 0.2534
Maximum 316.4000 332.2505
Minimum 112.1000 95.6784

Correlation Structure
LAG
0 1.0000 1.0000
1 0.2548 0.2156
2 -0.0238 0.0339
3 0.0770 -0.0114
4 -0.0034 -0.0204
5 0.0430 -0.0103
6 -0.1625 -0.0307
7 -0.1544 -0.0409
8 -0.1121 -0.0252
9 -0.2085 -0.0357
10 -0.0532 -0.0369

Lag-0 Cross Correlations

Sites
7 and 3 (BU & YA) 0.7269 0.8142
7 and 5 (BU & YA) 0.9536 0.9563
7 and 7 (BU & BU) 1.0000 1.0000

Storage and Drought Statistics

Demand Level = 1.0000 * sample mean

Longest Drought 4.0000 6.0933


Maximum Deficit 255.5000 303.3795
Longest Surplus 6.0000 5.5733
Maximum Surplus 268.4500 299.3968
Storage Capacity 498.2249 495.2468
Rescaled Range 9.2397 9.3981
Hurst Coefficient 0.6996 0.6966

5.2.5 Multivariate CARMA(p,q) Model


A CARMA(2,2) model was also fitted to sites 3, 5, and 7 of the Yakima basin. The
modeling results are shown below:
Model:CARMA

Number_of_sites: 3
Site(s)_ID: 3 5 7
Data_Transformations:
Site_3: LOG
a-coef= -205.000000
Site_5: LOG
a-coef= 2000.000000
Site_7: LOG
a-coef= 450.000000
Data_Standardization: YES
Mean_of_the_process:
6.096067
8.147832
6.488171
Standard_deviation_of_the_process:
0.461667
0.103274
0.081923

Model_order(p,q): 2 2

phi_parameters: (Annual)

phi_1
0.558511 0.000000 0.000000
0.000000 0.397362 0.000000
0.000000 0.000000 0.316854
phi_2
-0.222751 0.000000 0.000000
0.000000 -0.169891 0.000000
0.000000 0.000000 -0.122860

theta_parameters: (Annual)

tht_1

0.000000 0.000000 0.000000
0.000000 0.000000 0.000000
0.000000 0.000000 -0.002752
tht_2
-0.000000 0.000000 0.000000
0.000000 -0.000000 0.000000
0.000000 0.000000 0.003944

Variance_of_the_residuals: (Annual)

0.752099 0.716343 0.702230
0.716343 0.859100 0.847136
0.702230 0.847136 0.904820

These estimated parameters were used to generate 150 samples of annual data, each 48
years long, for the three sites. The statistical analysis results of the generated data are shown
below:
Model: Contemporaneous ARMA (CARMA), (Statistical Analysis of Generated Data)

Site Number: 3 YAKIMA_RIVER_AT_EASTON_DIVERSION_DAM


Historical Generated
Mean 699.3479 699.0977
Standard Deviation 246.3507 226.7137
Skewness Coefficient 1.8333 1.1653
Coef. Variation 0.3523 0.3232
Maximum 1726.4000 1436.5048
Minimum 346.9000 374.4970

Correlation Structure
LAG
0 1.0000 1.0000
1 0.4976 0.3893
2 0.2140 -0.0032
3 0.1931 -0.0993
4 0.0206 -0.0777
5 0.0005 -0.0441
6 -0.1358 -0.0301
7 -0.1159 -0.0107
8 -0.0234 -0.0129
9 -0.0729 -0.0340
10 0.0363 -0.0464

Lag-0 Cross Correlations


Sites
3 and 3 (YA & YA) 1.0000 1.0000
3 and 5 (YA & YA) 0.8040 0.8490
3 and 7 (YA & BU) 0.7269 0.7923

Storage and Drought Statistics

Demand Level = 1.0000 * sample mean

Longest Drought 7.0000 7.7133


Maximum Deficit 1338.4355 1302.8521
Longest Surplus 6.0000 5.2867
Maximum Surplus 2412.7124 1569.8881
Storage Capacity 2420.0576 2132.3733

Rescaled Range 9.8236 9.3943
Hurst Coefficient 0.7189 0.6974

Site Number: 5 YAKIMA_RIVER_AT_CLE_ELUM


***********
Historical Generated
Mean 1474.3375 1480.3129
Standard Deviation 358.9830 346.5107
Skewness Coefficient 0.2136 0.2796
Coef. Variation 0.2435 0.2344
Maximum 2345.5000 2343.4680
Minimum 826.0001 772.9506

Correlation Structure
LAG
0 1.0000 1.0000
1 0.2872 0.2943
2 -0.0224 -0.0618
3 0.1007 -0.0956
4 0.0092 -0.0564
5 0.0426 -0.0236
6 -0.1397 -0.0129
7 -0.1650 -0.0007
8 -0.0598 -0.0195
9 -0.1297 -0.0348
10 -0.0224 -0.0441

Lag-0 Cross Correlations


Sites
5 and 3 (YA & YA) 0.8040 0.8490
5 and 5 (YA & YA) 1.0000 1.0000
5 and 7 (YA & BU) 0.9536 0.9543

Storage and Drought Statistics

Demand Level = 1.0000 * sample mean

Longest Drought 7.0000 6.0867


Maximum Deficit 2220.5625 1927.5646
Longest Surplus 6.0000 5.4867
Maximum Surplus 1561.5746 1992.9727
Storage Capacity 3803.9871 3076.5337
Rescaled Range 10.5966 8.8885
Hurst Coefficient 0.7428 0.6790

Site Number: 7 BUMPING_RESERVOIR


***********
Historical Generated
Mean 209.5250 210.4516
Standard Deviation 53.9224 52.2904
Skewness Coefficient 0.1097 0.2260
Coef. Variation 0.2574 0.2489
Maximum 316.4000 338.8398
Minimum 112.1000 100.5427

Correlation Structure
LAG
0 1.0000 1.0000
1 0.2548 0.2406
2 -0.0238 -0.0693
3 0.0770 -0.0752

4 -0.0034 -0.0457
5 0.0430 -0.0238
6 -0.1625 -0.0168
7 -0.1544 -0.0106
8 -0.1121 -0.0201
9 -0.2085 -0.0345
10 -0.0532 -0.0279

Lag-0 Cross Correlations


Sites
7 and 3 (BU & YA) 0.7269 0.7923
7 and 5 (BU & YA) 0.9536 0.9543
7 and 7 (BU & BU) 1.0000 1.0000

Storage and Drought Statistics

Demand Level = 1.0000 * sample mean

Longest Drought 4.0000 5.8000


Maximum Deficit 255.5000 275.3951
Longest Surplus 6.0000 5.2667
Maximum Surplus 268.4500 284.3499
Storage Capacity 498.2249 453.1582
Rescaled Range 9.2397 8.6618
Hurst Coefficient 0.6996 0.6714

5.2.6 Disaggregation Models


A spatial-temporal disaggregation modeling and generation example using SAMS, based
on multivariate data of the Yakima basin, is demonstrated here. In this example both the annual and
monthly data being modeled are transformed using the logarithmic transformation. The schematic
representation of the station locations in the basin is shown in Fig.32. Stations 5 and
11 can be considered key stations, stations 3, 4, 8, and 10 are substations, and stations 1, 2, 7, and 9
are subsequent stations. Scheme 1 is used to model the key stations: the annual flows
of the key stations are added together to form one series of annual data for an index station.
The index station data are fitted with an ARMA(1,1) model and then a disaggregation model
(either Valencia and Schaake or Mejia and Rousselle) is used to disaggregate the annual
flows of the index station into the annual flows at the key stations. The key station to substation
disaggregation is done using two groups. The first group contains key station 5 and
substations 3 and 4. The second group contains key station 11 and substations 8 and 10. The
substation to subsequent station disaggregation is also done using two groups. The first
group contains substations 3 and 4 and subsequent stations 1 and 2. The second group contains
substations 8 and 10 and subsequent stations 7 and 9. The modeling results for the annual and
monthly data are summarized below.

Fig.32 Schematic representation of the river network for the
disaggregation example

Annual (spatial) disaggregation


Disaggregation Model: Valencia and Schaake

!Modeling of Key stations


Disaggregation scheme:1
Key stations Id : 5 and 11

Model of Index station: ARMA(1,1)


!Key station to substation disaggregation modeling
Number of groups: 2
Group #: 1
Keystations Id : 5
Substations Id : 3 and 4
Group #: 2
Keystations Id : 11
Substations Id : 8 and 10
!Substation to subsequent station disaggregation modeling
Number of groups: 2
Group #: 1
Substations Id : 3 and 4
Subsequent stations Id : 1 and 2

Group #: 2
Substations Id : 8 and 10
Subsequent stations Id : 7 and 9

Using the above configuration the estimated model parameters are given below.

Modeling of Key stations


Basic_statistics_of_the_index_station:

Mean:
16.337555
Standard_deviation:
0.196377

Model_order(p,q): 1 1

phi_parameters: (Annual)
phi_1
0.095154

theta_parameters: (Annual)
theta_1
-0.151970

Variance_of_the_residuals: (Annual)
0.036331

Disaggregation_of_index_to_key_stations:

A_matrix
0.515312
0.484688

B_matrix
0.020397 0.000000
-0.020397 0.000001
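The index-to-key-station parameters above can be checked for additivity with a short sketch. It assumes the Valencia and Schaake form Y = A X + B e applied to deviations from the mean, where X is the index station value and Y holds the two key station values; the function name is hypothetical. The key station means are taken from the group listings for stations 5 and 11 (8.147832 and 8.189722).

```python
A = [0.515312, 0.484688]
B = [[0.020397, 0.000000],
     [-0.020397, 0.000001]]
INDEX_MEAN = 16.337555
KEY_MEANS = [8.147832, 8.189722]

a_sum = sum(A)                                               # close to 1
b_col_sums = [sum(row[j] for row in B) for j in range(2)]    # close to 0
mean_gap = INDEX_MEAN - sum(KEY_MEANS)                       # close to 0

def disaggregate(index_dev, noise):
    """Split an index deviation into two key-station deviations (assumed form)."""
    return [A[i] * index_dev + sum(B[i][j] * noise[j] for j in range(2))
            for i in range(2)]
```

Because the entries of A sum to one and each column of B sums to nearly zero, the two disaggregated values add back to the index value whatever noise is drawn, which is the additivity property expected of this type of model.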

Disaggregation of Key stations to substations


Number_of_groups: 2

group_#: 1

Number_of_key_stations: 1
Key_stations_ID: 5
Data_Transformations:
Station_5: LOG
a-coef= 2000.000000

Basic_statistics_of_the_key_stations:
Mean_of_the_process:
8.147832
Standard_deviation_of_the_process:
0.103274

Number_of_sub_stations: 2
Sub_stations_ID: 3 4
Data_Transformations:

Station_3: LOG
a-coef= -205.000000
Station_4: LOG
a-coef= 1000.000000

Basic_statistics_of_the_sub_stations:
Mean_of_the_process:
6.096067
7.417926
Standard_deviation_of_the_process:
0.461667
0.094175

A_matrix

3.933447
0.904966

B_matrix

0.219627 0.000000
-0.002626 0.011408

group_#: 2

Number_of_key_stations: 1
Key_stations_ID: 11
Data_Transformations:
Station_11: LOG
a-coef= 2406.000000

Basic_statistics_of_the_key_stations:
Mean_of_the_process:
8.189722
Standard_deviation_of_the_process:
0.097387

Number_of_sub_stations: 2
Sub_stations_ID: 8 10
Data_Transformations:
Station_8: LOG
a-coef= 2500.000000
Station_10: LOG
a-coef= 100.000000

Basic_statistics_of_the_sub_stations:
Mean_of_the_process:
8.090804
6.195611
Standard_deviation_of_the_process:
0.072572
0.205165

A_matrix
0.738420
1.995138

B_matrix
0.010106 0.000000
0.052522 0.040097

Disaggregation of substations to subsequent stations
Number_of_groups: 2

group_#: 1

Number_of_sub_stations: 2
Sub_stations_ID: 3 4
Data_Transformations:
Station_3: LOG
a-coef= -205.000000
Station_4: LOG
a-coef= 1000.000000

Basic_statistics_of_the_sub_stations:
Mean_of_the_process:
6.096067
7.417926
Standard_deviation_of_the_process:
0.461667
0.094175

Number_of_subsequent_stations: 2
Subsequent_stations_ID: 1 2
Data_Transformations:
Station_1: LOG
a-coef= 49.000000
Station_2: LOG
a-coef= 210.000000

Basic_statistics_of_the_subsequent_stations:
Mean_of_the_process:
5.658607
6.036669
Standard_deviation_of_the_process:
0.189585
0.124544

A_matrix
0.027025 1.867341
0.005409 1.288824

B_matrix
0.033196 0.000000
0.008933 0.013417

group_#: 2

Number_of_sub_stations: 2
Sub_stations_ID: 8 10
Data_Transformations:
Station_8: LOG
a-coef= 2500.000000
Station_10: LOG
a-coef= 100.000000

Basic_statistics_of_the_sub_stations:
Mean_of_the_process:
8.090804
6.195611
Standard_deviation_of_the_process:

0.072572
0.205165

Number_of_subsequent_stations: 2
Subsequent_stations_ID: 7 9
Data_Transformations:
Station_7: LOG
a-coef= 450.000000
Station_9: LOG
a-coef= 40.000000

Basic_statistics_of_the_subsequent_stations:
Mean_of_the_process:
6.488171
5.980681
Standard_deviation_of_the_process:
0.081923
0.220482

A_matrix
0.841615 0.093955
-0.007637 1.071983

B_matrix
0.017719 0.000000
0.007666 0.020592

Seasonal disaggregation
For annual-monthly disaggregation modeling, the stations were divided into two groups.
The first group contains the stations 1, 2, 3, 4, and 5, while the second group contains stations
7, 8, 9, 10, and 11. Part of the annual-monthly disaggregation modeling results are shown below.
Disaggregation Model: Lane condensed Model
Number of groups: 2
Group #: 1
stations id :1, 2, 3, 4, and 5
Group #: 2
stations id : 7, 8, 9, 10, and 11

group #: 1
Season : 1

A matrix
0.100187 -5.591171 -0.175279 2.833674 7.173420
-0.790256 -5.794002 -0.220576 3.963215 8.640571
-1.087746 -6.127742 0.154692 3.974580 8.148174
-0.736562 -10.566408 -0.291538 9.683748 9.577835
-0.812205 -7.984121 -0.239491 5.462620 10.251024

B matrix
0.272571 0.000000 0.000000 0.000000 0.000000
0.316740 0.062122 0.000000 0.000000 0.000000
0.313677 0.037441 0.051319 0.000000 0.000000
0.337333 0.087106 0.001906 0.104401 0.000000
0.342454 0.064863 0.021514 0.041748 0.024983

C matrix
-0.553959 1.056880 0.162744 0.342263 -0.967182
-0.669728 1.411542 0.154738 0.321621 -1.119348
-0.739637 1.300074 0.319091 0.432287 -1.188065
-0.610305 1.144258 0.254001 0.507121 -1.012993
-0.730781 1.344898 0.245251 0.458153 -1.139128

group #: 1
Season : 5
A matrix
-7.485024 22.856783 1.439505 -12.453279 -6.186196
-6.270320 20.105253 1.168698 -11.709853 -5.100427
-8.476122 23.010061 1.391239 -15.063147 -1.557045
-8.248028 16.965302 0.467788 -8.150460 0.988259
-9.367930 24.837963 1.053099 -15.114661 -1.299358

B matrix
0.812630 0.000000 0.000000 0.000000 0.000000
0.701064 0.166081 0.000000 0.000000 0.000000
0.697859 -0.003222 0.186866 0.000000 0.000000
0.728257 0.054581 -0.011022 0.191075 0.000000
0.736211 0.000507 0.141137 0.098642 0.103991

C matrix
4.226117 -2.866502 -2.391560 0.463517 1.038808
3.171489 -2.221176 -2.099235 0.547366 1.114273
2.938035 -2.149964 -0.642270 0.927850 -0.697620
2.794332 -1.867283 -1.273152 1.024580 -0.114892
3.306370 -2.100979 -1.783531 0.725862 0.423544

group #: 2
Season : 1
A matrix
1.716307 16.572737 -3.392607 3.279173 -9.404758
-0.044698 23.627403 -2.921220 1.744206 -10.550832
-0.326232 20.584972 -1.793802 1.468768 -11.500857
-0.327360 15.794100 -2.042270 1.905422 -8.688657
-0.147128 22.826544 -3.548427 1.898677 -8.968649

B matrix
0.356707 0.000000 0.000000 0.000000 0.000000
0.429611 0.108808 0.000000 0.000000 0.000000
0.285874 0.072522 0.067637 0.000000 0.000000
0.248105 0.057508 0.059324 0.035298 0.000000
0.420281 0.099421 0.026707 0.014496 0.050552

C matrix
0.298232 -0.406830 -0.629304 0.654334 0.595607
0.272947 -0.334373 -0.576146 0.674303 0.592925
0.103876 -0.131294 -0.130068 0.182709 0.381739
0.106382 -0.066460 -0.164665 0.211029 0.244746
0.254331 -0.362385 -0.476519 0.584160 0.659288

group #: 2
Season : 5
A matrix
-3.034014 6.733072 -2.220287 1.720777 1.226575
-10.645264 11.803424 -3.768532 2.314499 6.803477

-3.917177 -0.284301 -0.828180 0.804703 6.347968
-5.067391 3.801157 -2.485281 2.491937 4.987008
-8.766766 9.480949 -3.613741 2.021952 6.793612

B matrix
0.490879 0.000000 0.000000 0.000000 0.000000
0.408971 0.285451 0.000000 0.000000 0.000000
0.435884 0.160045 0.127493 0.000000 0.000000
0.417643 0.209446 0.115327 0.075774 0.000000
0.338970 0.236938 -0.016704 -0.030911 0.049310

C matrix
2.416789 1.505506 -3.527168 2.026513 -2.032238
1.659479 1.830661 -2.907587 1.818996 -2.139591
1.618642 1.827437 -2.917736 2.342135 -2.449902
1.504343 1.679651 -2.833496 2.304887 -2.352898
1.396754 0.869733 -2.370874 1.566433 -1.248207
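
The B matrices listed above are all lower triangular. One common reading, assumed here and not confirmed by this listing, is that B is a square-root (Cholesky-type) factor of a residual covariance matrix G, so that G = B B^T. The sketch below, in plain Python, takes the B matrix printed for group 2, season 5 and rebuilds the implied G. The names `is_lower_triangular`, `times_own_transpose`, and `G` are illustrative only and are not part of SAMS.

```python
# B matrix for group 2, season 5, copied from the listing above.
B = [
    [0.490879, 0.000000, 0.000000, 0.000000, 0.000000],
    [0.408971, 0.285451, 0.000000, 0.000000, 0.000000],
    [0.435884, 0.160045, 0.127493, 0.000000, 0.000000],
    [0.417643, 0.209446, 0.115327, 0.075774, 0.000000],
    [0.338970, 0.236938, -0.016704, -0.030911, 0.049310],
]


def is_lower_triangular(m):
    """Return True if every entry above the main diagonal is zero."""
    n = len(m)
    return all(m[i][j] == 0.0 for i in range(n) for j in range(i + 1, n))


def times_own_transpose(m):
    """Return the product m * m^T as a list of lists."""
    n = len(m)
    cols = len(m[0])
    return [[sum(m[i][k] * m[j][k] for k in range(cols)) for j in range(n)]
            for i in range(n)]


# Assumption: G = B B^T is the residual covariance implied by B.
G = times_own_transpose(B)

print("lower triangular:", is_lower_triangular(B))
for row in G:
    print(" ".join(f"{v:9.6f}" for v in row))
```

By construction G is symmetric, and its first diagonal entry is the square of the first diagonal entry of B.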

These estimated parameters were used to generate 100 samples of monthly data, each 48 years
long, for the 10 sites. Part of the statistical analysis of the generated data is shown below:
Model: Seasonal Disaggregation,(Statistical Analysis of Generated Data)
Site Number: 2 KACHESS_RESERVOIR
Season   Historical   Generated
Mean
   1       16.8646      16.9702
   2       19.7521      20.0277
   3       16.1458      16.1352
   4       13.3875      13.4198
   5       15.2688      15.3021
   6       26.2375      26.3170
   7       44.6521      44.3891
   8       33.4583      33.1077
   9       11.4625      11.5481
  10        2.5000       2.5730
  11        3.0542       3.0691
  12        8.9646       9.0435

Standard Deviation
   1       12.1013      12.1722
   2       12.8655      12.9882
   3        9.2932       9.5560
   4        7.9937       8.6000
   5        8.5009       8.7934
   6        8.1718       7.9901
   7       14.4182      13.9078
   8       16.2124      16.0398
   9        9.5621      11.1432
  10        2.2897       2.8500
  11        3.0264       2.7782
  12        6.7700       7.4574

Skewness Coefficient
   1        1.1320       1.4399
   2        1.8967       1.5356
   3        0.7127       1.2846
   4        0.9671       1.7533
   5        2.2952       2.4070
   6        0.1600       0.2967
   7        0.3599       0.3659
   8        0.2885       0.5599
   9        1.1013       2.1439
  10        1.2974       2.2914
  11        3.1266       1.9573
  12        1.1720       1.8414

Season to Season Correlations

LAG 1
   1        0.6589       0.4416
   2        0.4100       0.5262
   3        0.3546       0.2913
   4        0.4388       0.3904
   5        0.4377       0.3056
   6        0.2811       0.1841
   7        0.0489       0.0670
   8        0.5925       0.6309
   9        0.8565       0.8436
  10        0.8978       0.6757
  11        0.3768       0.4301
  12        0.6031       0.4574

Storage and Drought Statistics

Demand Level = 1.0000 * sample mean

Statistic            Historical   Generated
Longest Drought         10.0000     10.7700
Maximum Deficit        112.4566    124.0420
Longest Surplus          8.0000      7.6300
Maximum Surplus        163.0347    181.7904
Storage Capacity       564.9718    593.0753
Rescaled Range          36.5715     37.6897
Hurst Coefficient        0.6356      0.6359
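
The drought and surplus statistics above are run statistics relative to a demand level. A minimal sketch of one way to compute them follows. It assumes that a drought is a run of consecutive periods with flow below the demand, that a surplus is a run above it, and that the deficit (or surplus) is the accumulated difference over the run. These definitions and the function name `run_statistics` are assumptions, not taken from the SAMS code. The input is only the first two years (1926-1927) of the Station 2 monthly flows listed in Appendix A, so the results are not expected to match the table.

```python
def run_statistics(flows, demand):
    """Return (longest_drought, max_deficit, longest_surplus, max_surplus).

    A run is a maximal sequence of consecutive values on the same side
    of `demand`. Values exactly equal to `demand` end any run.
    """
    longest = {"deficit": 0, "surplus": 0}
    largest = {"deficit": 0.0, "surplus": 0.0}
    state, length, total = None, 0, 0.0
    for x in flows:
        if x < demand:
            kind = "deficit"
        elif x > demand:
            kind = "surplus"
        else:
            kind = None
        if kind != state:
            # A new run starts: reset the counters.
            state, length, total = kind, 0, 0.0
        if kind is not None:
            length += 1
            total += abs(x - demand)
            longest[kind] = max(longest[kind], length)
            largest[kind] = max(largest[kind], total)
    return (longest["deficit"], largest["deficit"],
            longest["surplus"], largest["surplus"])


# First two years of Station 2 (KACHESS_RESERVOIR) monthly flows, Appendix A.
flows = [6.6, 27.8, 13.5, 11.7, 23.3, 29.2, 18.3, 8.3, 1.1, 0.4, 3.5, 14.4,
         12.3, 19.2, 8.5, 7.9, 7.1, 22.4, 45.8, 52.0, 14.2, 3.6, 8.9, 18.4]

# Demand level = 1.0 * sample mean, as in the listing above.
demand = 1.0 * sum(flows) / len(flows)

drought_len, max_deficit, surplus_len, max_surplus = run_statistics(flows, demand)
print(drought_len, round(max_deficit, 1), surplus_len, round(max_surplus, 1))
```

For this short excerpt, the longest run below the mean spans six consecutive months and the longest run above it spans three.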

Site Number: 11 NACHES_R_BELOW_TIETON_R_NEAR_NACHES


Season   Historical   Generated
Mean
   1       56.0854      56.1426
   2       76.6458      77.0317
   3       64.6104      64.7072
   4       62.5875      63.0430
   5       77.3479      77.0965
   6      151.5771     152.9251
   7      280.5313     280.5401
   8      236.7167     236.6873
   9      103.7792     103.1596
  10       40.6542      40.6756
  11       28.0208      28.1190
  12       36.2292      35.8892

Standard Deviation
   1       40.6182      44.2509
   2       70.4072      64.4993
   3       42.6869      41.9235
   4       39.8475      41.8386
   5       48.1696      43.1282
   6       59.2644      58.0738
   7       96.9022      94.0675
   8      103.3385     100.5874
   9       57.9129      57.7420
  10       17.0232      16.5544
  11       10.6177      10.5495
  12       20.6300      19.6031

Skewness Coefficient
   1        0.9662       1.8455
   2        3.4510       2.1245
   3        2.3219       1.8524
   4        1.3324       1.8464
   5        2.7078       1.7876
   6        0.4732       0.5181
   7        0.5250       0.6177
   8        0.2663       0.4145
   9        0.8792       1.2173
  10        0.0272       0.1560
  11       -0.4915       0.1691
  12        1.3003       1.3090

Season to Season Correlations

LAG 1
   1        0.6809       0.4309
   2        0.5099       0.6573
   3        0.7709       0.5482
   4        0.5542       0.5406
   5        0.4644       0.5438
   6        0.3119       0.4710
   7        0.3550       0.3264
   8        0.5396       0.5680
   9        0.8317       0.8473
  10        0.8759       0.8521
  11        0.7917       0.7977
  12        0.5972       0.6452

Storage and Drought Statistics

Demand Level = 1.0000 * sample mean

Statistic            Historical   Generated
Longest Drought         10.0000     11.2400
Maximum Deficit        693.8212    767.2662
Longest Surplus          7.0000      7.7000
Maximum Surplus       1063.9717   1208.4659
Storage Capacity      3839.3513   3697.9873
Rescaled Range          39.6617     38.2223
Hurst Coefficient        0.6499      0.6381
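
The historical Hurst coefficients printed for sites 2 and 11 are consistent with Hurst's K estimator, K = ln(R**) / ln(N/2), where R** is the rescaled range and N is the series length. The sketch below checks this under the assumption that N = 48 x 12 = 576 monthly values. Both the estimator form and the value of N are inferred from the printed numbers rather than quoted from the manual, and the function name `hurst_k` is illustrative.

```python
import math

# Assumption: the statistics are computed on 48 years x 12 months of data.
N = 48 * 12

# Historical rescaled range and Hurst coefficient as printed above.
reported = {
    "site 2": (36.5715, 0.6356),
    "site 11": (39.6617, 0.6499),
}


def hurst_k(rescaled_range, n):
    """Hurst's K estimator: K = ln(R**) / ln(n / 2)."""
    return math.log(rescaled_range) / math.log(n / 2)


for site, (r, k_printed) in reported.items():
    print(site, "computed:", round(hurst_k(r, N), 4), "printed:", k_printed)
```

The generated column is presumably an average over the 100 generated samples, so the same relation need not hold exactly between its rescaled range and Hurst coefficient.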

Lag-0 Season to Season Cross Correlations


Sites 1 and 2 (KE & KA)

Season   Historical   Generated
   1        0.9853       0.9844
   2        0.9828       0.9780
   3        0.9793       0.9725
   4        0.9847       0.9738
   5        0.9924       0.9650
   6        0.9632       0.9615
   7        0.9788       0.9761
   8        0.9906       0.9891
   9        0.9888       0.9578
  10        0.8572       0.6456
  11        0.9504       0.8028
  12        0.9888       0.9815

Sites 3 and 8 (YA & NA)

Season   Historical   Generated
   1        0.9068       0.3507
   2        0.8623       0.1565
   3        0.6949       0.2006
   4        0.8251       0.1275
   5        0.9108       0.0830
   6        0.7394       0.0994
   7        0.7722       0.2210
   8        0.8394       0.3087
   9        0.7933       0.2841
  10        0.4031       0.1972
  11        0.2735       0.2272
  12        0.8937       0.2705

Sites 5 and 11 (YA & NA)

Season   Historical   Generated
   1        0.9392       0.3422
   2        0.8905       0.1526
   3        0.8189       0.2284
   4        0.9137       0.1553
   5        0.9296       0.0695
   6        0.9286       0.0676
   7        0.9512       0.2773
   8        0.9699       0.3916
   9        0.9462       0.3967
  10        0.6776       0.4131
  11        0.3608       0.3005
  12        0.9007       0.2319

Sites 8 and 10 (NA & TI)

Season   Historical   Generated
   1        0.9755       0.9557
   2        0.9867       0.9086
   3        0.9796       0.9642
   4        0.9847       0.9656
   5        0.9827       0.9274
   6        0.9781       0.9773
   7        0.9833       0.9752
   8        0.9897       0.9888
   9        0.9770       0.9533
  10        0.7619       0.7168
  11        0.5003       0.4811
  12        0.9741       0.9092
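
A lag-0 season-to-season cross correlation compares the flows of two sites in the same season across years. As a minimal sketch, the code below computes the ordinary Pearson correlation for season 1 between Stations 1 and 2, using only the first six years (1926-1931) of the monthly data listed in Appendix A. Because it uses a short excerpt, and because the exact estimator used by SAMS is assumed rather than known here, it illustrates the calculation and does not reproduce the tabulated value. The function name `pearson` is illustrative.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Season 1 flows for 1926-1931, taken from Appendix A.
keechelus_season1 = [10.2, 14.8, 48.5, 8.5, 2.8, 10.2]   # Station 1
kachess_season1 = [6.6, 12.3, 39.2, 6.3, 1.4, 8.0]       # Station 2

r = pearson(keechelus_season1, kachess_season1)
print(f"lag-0 cross correlation, season 1 (6-year excerpt): {r:.4f}")
```

Repeating the same calculation for each of the 12 seasons, with all 48 years, would give one column of a table like those above.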

REFERENCES
Fernandez, B., and J. D. Salas, 1990, Gamma-Autoregressive Models for Stream-Flow
Simulation, ASCE Journal of Hydraulic Engineering, vol. 116, no. 11, pp. 1403-1414.
Frevert, D. K., M. S. Cowan, and W. L. Lane, 1989, Use of Stochastic Hydrology in Reservoir
Operation, J. Irrig. Drain. Eng., vol. 115, no. 3, pp. 334-343.
Gill, P. E., W. Murray, and M. H. Wright, 1981, Practical Optimization, Academic Press, New
York.
Grygier, J. C., and J. R. Stedinger, 1990, SPIGOT, A Synthetic Streamflow Generation Software
Package, technical description, version 2.5, School of Civil and Environmental
Engineering, Cornell University, Ithaca, N.Y.
Himmelblau, D. M., 1972, Applied Nonlinear Programming, McGraw-Hill, New York.
Hipel, K., and A. I. McLeod, 1994, Time Series Modeling of Water Resources and
Environmental Systems, Elsevier, Amsterdam, 1013 pp.
Kendall, M. G., 1963, The Advanced Theory of Statistics, vol. 3, 2nd Ed., Charles Griffin and Co.
Ltd., London, England.
Lane, W. L., 1979, Applied Stochastic Techniques (Last Computer Package); User Manual,
Division of Planning Technical Services, U.S. Bureau of Reclamation, Denver, Colo.
Lane, W. L., 1981, Corrected Parameter Estimates for Disaggregation Schemes, Inter. Symp.
on Rainfall Runoff Modeling, Mississippi State University.
Lane, W. L., and D. K. Frevert, 1990, Applied Stochastic Techniques, personal computer version
5.2, users manual, Bureau of Reclamation, U.S. Dep. of Interior, Denver, Colorado.
Loucks, D. P., J. R. Stedinger, and D. A. Haith, 1981, Water Resources Systems Planning and
Analysis, Prentice-Hall, Englewood Cliffs, N.J.
Lawrance, A. J., 1982, The innovation distribution of a gamma distributed autoregressive
process, Scandinavian J. Statistics, 9(4), 234-236.
Lawrance, A. J. and P. A. W. Lewis, 1981, A New Autoregressive Time Series Model in
Exponential Variables [NEAR(1)], Adv. Appl. Prob., 13(4), pp. 826-845.
Matalas, N. C., 1966, Time Series Analysis, Water Resour. Res., 3(4), pp. 817-829.
Mejia, J. M., and J. Rousselle, 1976, Disaggregation Models in Hydrology Revisited, Water
Resources Research, vol. 12, no. 2, pp. 185-186.
O’Connell, P. E., 1977, ARIMA Models in Synthetic Hydrology, Mathematical Models for
Surface Water Hydrology, in T. Ciriani, V. Maione, and J. Wallis, eds., Wiley & Sons,
N. Y., pp. 51-68.
Salas, J. D., 1993, Analysis and Modeling of Hydrologic Time Series, Handbook of Hydrology,
Chap. 19, pp. 19.1-19.72, edited by D. R. Maidment, McGraw-Hill, Inc., New York.
Salas, J. D., D. C. Boes, and R. A. Smith, 1982, Estimation of ARMA Models with Seasonal
Parameters, Water Resources Res., vol. 18, no. 4, pp. 1006-1010.
Salas, J. D., et al, 1999, Statistical Computer Techniques for Water Resources and
Environmental Engineering, forthcoming book.
Salas, J. D., J. W. Delleur, V. Yevjevich, and W. L. Lane, 1980, Applied Modeling of Hydrologic
Time Series, WWP, Littleton, Colorado.
Stedinger, J. R., D. P. Lettenmaier and R. M. Vogel, 1985, Multisite ARMA(1,1) and
Disaggregation Models for Annual Streamflow Generation, Water Resour. Res., 21(4),
pp. 497-509.
U. S. Army Corps of Engineers, 1971, HEC-4 Monthly Streamflow Simulation, Hydrologic
Engineering Center, Davis, Calif.
Valencia, D., and J. C. Schaake, Jr., 1973, Disaggregation Processes in Stochastic Hydrology,
Water Resources Research, vol. 9, no. 3, pp. 580-585.

APPENDIX A

This appendix contains a sample monthly input data file used in this manual. It corresponds to
12 stations of monthly flows for the Yakima basin, and the data file name is YAKIMA.DAT.
Only the data for two stations are printed below for illustration. Note that, except for the first
block entitled “station”, which lists the station names, all items must be included in the data
file.

station

1 KEECHELUS RESERVOIR
2 KACHESS RESERVOIR
3 YAKIMA RIVER AT EASTON DIVERSION DAM
4 CLE ELUM RESERVOIR
5 YAKIMA RIVER AT CLE ELUM
6 YAKIMA RIVER AT UMTANUM
7 BUMPING RESERVOIR
8 NACHES RIVER AT NACHES-SELAH DIVERSION
9 TIETON RESERVOIR
10 TIETON RIVER AT TIETON DIVERSION
11 NACHES R BELOW TIETON R NEAR NACHES
12 YAKIMA R AT SUNNYSIDE DIV(PARKER)

tot_num_stats 12
Years 48
Seasonal 12

Station 1
Station_id KEECHELUS_RESERVOIR
Duration 1926 1973
10.2 35 14.9 12.4 24.7 30.3 19.1 4.2 3.2 2.9 5.3 20.9
14.8 19.3 8.9 6.8 7.9 19.1 44.5 57.7 16.1 4.8 11.5 23
48.5 19 29.1 6.2 21.5 21 55.8 20.4 6 4.4 2 17.3
8.5 6.5 4.2 3.2 11.3 15.3 50.1 38.4 11.6 2 2.4 2.7
2.8 7.3 7.1 23.9 18.2 36.9 26.9 17 5.8 3.2 2.8 8.5
10.2 5.9 12.9 14 21.9 26.1 43.9 20.2 6.6 2.8 2.6 9.5
19.6 7.6 18.1 22.2 32.2 32.4 50.8 53 18.7 7.9 4.3 11.7
61.4 27.9 25.8 7.8 8.4 18.1 37.3 65 37 8.9 11.3 36.8
31.9 80 37 17.8 47.7 48.2 22.4 7.1 2.7 1.3 2.8 22.7
32.4 20.2 39.4 14.5 14.5 15 44 41.7 13.9 4.3 4.2 4.7
5.9 8.1 12.6 6.8 15.1 34.9 74.2 44.5 7.5 3.6 3.4 3.1
2.9 22.3 5.9 7.1 12 22.4 45.4 56.4 14 3.5 2.9 6.3
40.8 23.6 14.3 5.4 9.7 34.7 46.6 26.5 5.8 2.7 1.8 4
16.4 26.6 26.7 8.9 14.1 29.2 43.6 25.2 9.6 3.3 3.2 6.8
14 28.6 7.6 15.3 22.8 33.4 35.6 12.5 4.9 3.9 2 5.5
11.6 17.9 6.7 5.9 18.2 23.1 19.8 10 4.9 4 11 18.6
19.3 20.1 5.8 5.7 9.6 30.9 31.3 28.5 7.9 2.5 3 3.6
23.2 20.5 12.9 7.4 13.8 39.1 46.5 50.6 23 4.8 3.6 4.9
9.3 24.8 6.3 9.3 12 21.5 37.1 18.4 3.4 5.1 8.9 6
9.5 15.1 32.1 18.8 9.3 17 58.4 27.4 8.3 4.2 9.2 14.5
20.9 16.2 13.9 6.6 12.1 26.6 67 59.6 23.9 6.8 4.7 15.5
11.8 38.4 23.1 17.9 25.4 42 45.3 24.9 7.6 1.3 4.7 29.4
34.3 16.3 9.9 9.3 9.3 17.5 64.3 72.2 17 6.6 6.8 11.8
16 14.6 9.1 9.1 12.4 31.1 71.9 48.1 22.9 6.9 5.1 19.7
37.6 22.4 13.9 10.8 21.6 19.2 46.1 78.6 38.4 9 4.5 22.4
33.4 34.7 16.9 33.4 10.2 30.3 57.6 37.7 8.8 3.3 2.9 20.1
16.3 10.9 6.4 9.6 8.8 30.4 51.9 28 12.9 3.4 3.6 3.3
2.1 4.2 49.3 24.9 10.9 24.3 44.1 36.4 20.9 4.6 2.9 7.5
20.2 38.8 16.6 10.9 10.5 22 55.4 58 40.1 12.6 7.5 10.6
23.1 10.3 10.2 15.5 8.4 11.9 37.3 67.9 39.4 9.8 4.8 29.5
36.9 25.5 9.7 6.6 8.7 31.9 77.7 63 33.8 6.4 5 20
18.9 56.3 8.5 7.4 10.9 29.5 55.5 21.8 6.2 3.8 2.6 4.5
9.9 19.6 13.6 19.2 12.3 29.3 55 18.2 4.3 4.3 4.3 19.7
50 39.9 30.4 10.9 17.2 39.9 43.5 44.7 17.9 3.4 27.2 34
56.1 32.6 7.4 12.3 15.5 30 46.2 36.4 8.4 4.3 4.6 13.3
27.9 10.2 24.7 32 19.8 32.7 56.6 47.1 9.1 2.7 4.4 13.2
15.2 21.4 28.5 15.5 8.4 39 29.3 25.5 11.9 4.2 4.1 13.1
33 24.7 18.2 28.9 14.9 20.3 28.3 14.5 4.9 2.5 2 6.5
21.1 9.8 16 7.4 11 19.8 46.9 71.6 45 12.4 9 13.5
15.8 26.9 28.2 22.4 13 38.2 43.1 34.2 10.8 3.7 2.8 7.8
14.8 11.9 11.1 5.8 11.9 35.2 48.8 29 12.3 2 2 9.6
15.2 33.9 28.8 20.2 11.4 10.8 44.9 46.2 10.8 1.4 1.4 26.7
21.2 42.2 32.3 37 23.4 17.1 31.9 23.9 6.2 4.5 13.8 17.6
26 17 20.3 5.4 9.8 28.1 67.9 40.1 8.2 4.8 7 10.7
10.3 8.1 13.2 9.1 16.4 23.2 48.4 42.7 8.8 3.1 5.2 7.6
19.3 9.7 28.1 27.8 10.1 15.1 69 55.1 39.5 7.2 4.8 12.7
24.5 16.9 22 37.9 47.4 23.8 78.3 64.8 31.4 9.1 12.7 6.7
13 35 19.2 6.2 10.8 20.2 31.5 18.3 5.8 3.2 4.4 8.6

Station 2
Station_id KACHESS_RESERVOIR
Duration 1926 1973
6.6 27.8 13.5 11.7 23.3 29.2 18.3 8.3 1.1 0.4 3.5 14.4
12.3 19.2 8.5 7.9 7.1 22.4 45.8 52 14.2 3.6 8.9 18.4
39.2 25.3 28.5 5.8 20.9 22.1 54.3 21.5 4.4 0.5 0.2 10.9
6.3 5.7 4.3 3.8 11.9 16 44.7 32.6 7.9 0.3 0.2 1
1.4 6.5 3.7 19.7 19.1 34.5 24 15 5 0.3 1.1 4.5
8 4.8 8.6 12.5 21.8 24.3 40.5 17.8 5.2 1.5 1.2 5.8
12.7 7.5 11.8 25.6 34 34.4 48.3 41.1 14.2 2.6 1.5 6.5
48.1 27.3 24.5 8 7.4 20 38.9 57.9 28.4 6.9 8.5 27.8
27.6 75.9 35.2 18.8 46.9 47.6 24.4 8.2 2.1 0.8 1.4 15.6
28.8 19.1 39 13.9 15.7 16.8 45.6 35.8 12.1 1.7 2.5 2.2
3.5 5.6 10.3 6.8 15.9 38.1 72.2 40.1 7.3 3.3 2.7 1.5
1.5 14.2 5.1 7.4 11.3 21 44.9 49.1 12.4 4.1 1 2.3
29.2 21.4 13.6 5.9 10.3 33 46.3 26.8 2.6 0.4 0.6 2.2
11.1 19.5 22.8 9.2 15 30.1 39.9 20.5 7.8 1.5 1.7 4.1
9.3 21.2 6.5 12 20.2 30.3 34.4 12.2 1.8 0.9 0.4 4.3
7.3 13.3 6.4 4.6 17.2 23.2 17.4 8.2 1.6 2 7 11.9
15.7 18.5 5.8 5.9 9.1 27.9 27.3 20.3 5.4 1.6 1.6 2.1
16 18.1 12.5 8.4 13.5 41 44.7 43.3 16.2 1.2 0.5 2.6
5.2 17.7 5.1 7.7 11.1 20.9 29.6 15 2 0.7 3 3.8
6.6 11.4 27.4 16.6 10.5 16.8 47.8 22.3 3 0.3 4.8 8
16.1 13.6 14.4 6.6 11.4 28.8 67.9 44.9 16.9 2.1 0.9 10.7
10.1 34.6 21.8 16.6 24.1 33.7 38.8 18.8 5 1.4 3.1 21.8
28.3 14.4 10 10.1 9.2 18.5 60 62 14.7 3.5 2.4 6.8
13.9 14.3 6.7 11.4 11.4 32.5 73.6 42.6 18.4 4 2 12.9
31.3 23 13.6 9.7 19.8 18.5 48 67.5 29.9 5.9 2.2 15.7
28.4 31.9 17.7 31.8 10.7 31.8 54.5 31.9 6.1 0.4 2.7 14
14.7 10.4 5.9 8 7.9 29.3 44.8 24.6 8.8 1.7 1.9 1.1
1.2 4 38.1 24.6 11.8 23.5 41.6 33.2 18.4 4 2.8 5.2
12.7 32.6 17.2 11.7 9.8 22.9 55.6 50.7 34.1 8.9 4.2 6.4
17.9 9.6 7.4 14.1 8.4 12.3 37.5 60.4 29.6 5.6 3.1 20.5
36.8 23.5 10.6 7.3 9.3 35.5 78 51.2 24.3 3.5 3 13.2
13.4 49.3 9.1 7.3 10 29.1 53.3 19.1 3 0.7 0.7 3.4
6.4 16.2 11.1 18.2 11.4 26.6 54.4 18.2 3.6 0.3 3.9 12.8
41.6 37 27.4 10.7 16.8 35.5 40.2 39.9 14 3.8 18.7 28.8
52.3 29.8 8.3 10.8 15.4 30.2 37.3 31.4 6.4 2.4 2.1 8.5
22.2 8.4 20.8 29.1 20.2 32.2 50.4 42.3 8.6 1.7 2.3 9.1
11.1 17.3 27.4 16.7 8.2 37.2 28.7 27 9.6 2.8 2.5 8.9
28 21.5 14.9 24.7 14.7 17.1 26.5 13 2.5 0.6 1.6 3.5
13 9.3 16.3 7.5 11.7 21.1 44.4 62.1 33.5 9.4 5.3 8.5
11 23.2 22.8 21.7 15.3 35.6 38.2 30.2 7.6 3.2 2.1 4.1
9.8 8.2 9.4 5.1 9.9 31.3 43.3 24.3 8.2 0.4 1 6.4
9.9 28.1 25.3 18.4 11.3 9.3 39.4 40.9 8.8 0.5 1.2 20.7
18.1 35.7 31.6 32.7 21.8 14.2 27.5 20.6 4.3 2.9 6.7 12
21.5 16.4 20.3 5.5 9.5 28.4 62 37.2 6.4 1.4 3.6 5
6.1 6.3 10.7 7.8 13.3 21.6 42.8 39.7 7.7 0.3 2.9 4.1
13.2 9.2 25 26.2 10.7 15.6 66.1 48.9 32.2 6.1 3.3 6.8
15 11.8 20.3 30.9 46.9 21.2 72.3 59.8 29.1 6.9 5.8 4.6
9.1 28.5 17.8 5.2 9.8 16.3 26.9 15.6 3.8 1 2.3 4.9
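
The season 1 statistics reported for site 2 (KACHESS_RESERVOIR) in the statistical analysis, a historical mean of 16.8646 and a standard deviation of 12.1013, can be checked against the first column of the 48 Station 2 rows above. The sketch below uses Python's `statistics` module. The choice of the population form of the standard deviation (divisor N rather than N - 1) is inferred from the printed value and is not quoted from the SAMS documentation.

```python
import statistics

# Season 1 (first column) of the 48 Station 2 rows listed above, 1926-1973.
kachess_season1 = [
    6.6, 12.3, 39.2, 6.3, 1.4, 8.0, 12.7, 48.1, 27.6, 28.8, 3.5, 1.5,
    29.2, 11.1, 9.3, 7.3, 15.7, 16.0, 5.2, 6.6, 16.1, 10.1, 28.3, 13.9,
    31.3, 28.4, 14.7, 1.2, 12.7, 17.9, 36.8, 13.4, 6.4, 41.6, 52.3, 22.2,
    11.1, 28.0, 13.0, 11.0, 9.8, 9.9, 18.1, 21.5, 6.1, 13.2, 15.0, 9.1,
]

season1_mean = statistics.fmean(kachess_season1)

# pstdev divides by N. This choice is inferred from the printed value,
# not taken from the SAMS documentation.
season1_std = statistics.pstdev(kachess_season1)

print(f"mean = {season1_mean:.4f}")
print(f"std  = {season1_std:.4f}")
```

The same calculation applied to each of the other 11 columns would give the remaining seasonal means and standard deviations for this station.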

Remarks:
1. Data values are in free format, but they must be separated by at least one space.
2. The item titles, including “tot_num_stats”, “Years”, “Seasonal”, “Station”, “Station_id”, and
“Duration”, depend on the case at hand.
3. The station name following the item title “Station_id” must be one word. If the name has
more than one word, the words must be connected by an underscore “_”, as in
“KEECHELUS_RESERVOIR” in the sample file above.
4. The “Station_id” term is optional. Note that if a data file does not include the “Station_id”
term, the results in tables and graphs will not show the station’s identification.
5. The block of station names at the top of the sample data file above can be omitted. Omitting it
does not affect how SAMS reads the data.
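
To make the layout described in the remarks concrete, here is a minimal sketch of a reader for a file of this shape. It is not the reader used by SAMS. The function name `parse_sams_text` and the returned dictionary layout are assumptions made for illustration, and treating the “Annual” keyword as one season per year is also an assumption. Lines that appear before the first “Station” item and do not start with a known item title, such as the optional block of station names, are skipped.

```python
def parse_sams_text(text):
    """Parse a SAMS-style data file given as one string.

    Returns (header, stations). `header` holds tot_num_stats, Years and
    Seasonal. `stations` maps station number to a dict with 'id',
    'duration' and 'values' (a flat list of floats).
    """
    header, stations, current = {}, {}, None
    for line in text.splitlines():
        tokens = line.split()          # free format: any whitespace separates
        if not tokens:
            continue
        key = tokens[0]
        if key in ("tot_num_stats", "Years", "Seasonal"):
            header[key] = int(tokens[1])
        elif key == "Annual":
            header["Seasonal"] = 1     # assumption: annual = one season
        elif key == "Station":
            current = {"id": None, "duration": None, "values": []}
            stations[int(tokens[1])] = current
        elif key == "Station_id":      # optional item
            current["id"] = tokens[1]
        elif key == "Duration":
            current["duration"] = (int(tokens[1]), int(tokens[2]))
        elif current is not None:
            current["values"].extend(float(t) for t in tokens)
    return header, stations


# Excerpt of the sample file: header items and the first two data rows.
sample = """tot_num_stats 12
Years 48
Seasonal 12

Station 1
Station_id KEECHELUS_RESERVOIR
Duration 1926 1973
10.2 35 14.9 12.4 24.7 30.3 19.1 4.2 3.2 2.9 5.3 20.9
14.8 19.3 8.9 6.8 7.9 19.1 44.5 57.7 16.1 4.8 11.5 23
"""

header, stations = parse_sams_text(sample)
print(header)
print(stations[1]["id"], stations[1]["duration"], len(stations[1]["values"]))
```

A full reader would also check that each station supplies Years x Seasonal values. That check is left out here because the excerpt is deliberately truncated.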

APPENDIX B
This appendix contains a sample of an annual input data file used by SAMS
corresponding to 12 stations of annual flows for the Yakima basin. Printed below for illustration
are data for only two stations. Note that the data can also be arranged as a single column for
each station.

tot_num_stats 12

Years 48

Annual

Station 1
Station_id KEECHELUS_RESERVOIR
Duration 1926 1973
183.1 234.4 251.2 156.2 160.4 176.6 278.5 345.7
321.6 248.8 219.7 201.1 215.9 213.6 186.1 151.7
168.2 250.3 162.1 223.8 273.8 271.8 275.3 266.9
324.5 289.3 185.5 232.1 303.2 268.1 325.2 225.9
209.7 359.0 267.1 280.4 216.1 198.7 283.5 246.9
194.4 251.7 271.1 245.3 196.1 298.4 375.5 176.2

Station 2
Station_id KACHESS_RESERVOIR
Duration 1926 1973
158.1 220.3 233.6 134.7 134.8 152.0 240.2 303.7
304.5 233.2 207.3 174.3 192.3 183.2 153.5 120.1
141.2 218.0 121.8 175.5 234.3 229.8 239.9 243.7
285.1 261.9 159.1 208.4 266.8 226.4 296.2 198.4
183.1 314.4 234.9 247.3 197.4 168.6 242.1 215.0
157.3 213.8 228.1 217.2 163.3 263.3 324.6 141.2
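
In the printed samples, the annual values agree with the sums of the 12 monthly values of Appendix A for the same year. This relation is observed from the two listings rather than stated in the text. The short sketch below checks it for the first year (1926) of both stations. The variable names are illustrative.

```python
# First year (1926) of monthly flows from Appendix A.
monthly_1926 = {
    "KEECHELUS_RESERVOIR": [10.2, 35, 14.9, 12.4, 24.7, 30.3,
                            19.1, 4.2, 3.2, 2.9, 5.3, 20.9],
    "KACHESS_RESERVOIR": [6.6, 27.8, 13.5, 11.7, 23.3, 29.2,
                          18.3, 8.3, 1.1, 0.4, 3.5, 14.4],
}

# First annual value of each station as printed above.
annual_1926 = {"KEECHELUS_RESERVOIR": 183.1, "KACHESS_RESERVOIR": 158.1}

# Sum the 12 months and round to the one decimal used in the listing.
totals = {name: round(sum(flows), 1) for name, flows in monthly_1926.items()}

for name, total in totals.items():
    print(name, total, "matches annual file:", total == annual_1926[name])
```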

APPENDIX C

The logarithmic transformation coefficients for both annual and monthly flows for each
site of the example data file YAKIMA.DAT are given below. Refer to Eq. (4.1) for details.

Transformation coefficients for annual flows

Site Coef. a
1 49
2 210
3 -205
4 1000
5 2000
6 2500
7 450
8 2500
9 40
10 100
11 2406
12 2500

Transformation coefficients for monthly flows

                                          Month
Site      1      2      3      4      5      6      7      8      9     10     11     12
   1      8    3.5    1.7   -1.7   -7.3     40    120     80   -1.4      0   -1.1    2.5
   2    4.5      2      3     -2   -6.4     65     50     40    0.5   0.19    0.3    1.2
   3     13     10     10     -7    -21    210      0     25     -3      8   -1.8     12
   4      6      8      5     -4  -14.5     40    100    200      0     -4      2    2.7
   5     15     16     10    -11    -47    187    366    300   -2.5     12   -3.8      8
   6     15     14     12    -11  -62.5    105    380    290     15    100   27.5      5
   7   2.27   1.15    2.7   -0.6  -2.83   14.6     18     90      0  -0.88  -0.15  -1.25
   8    4.5    0.7    2.5    1.8  -11.1    112    133    340    9.4     55     86   -1.3
   9    4.7   -1.4   -3.3   -2.7   -9.2     70    -10     90     -6   -6.5      2   -4.4
  10   10.5    3.4     -4   -3.7  -10.5     44     -7    160   -7.5   -5.6    5.9   -3.5
  11    1.7   -4.7     -4   -4.5  -14.6    138    101    350     18    296     98    1.2
  12     -2      0      7    -30    -99    175    410    510      7    -20    -15    -35
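
As a minimal sketch of how such a coefficient is applied, the code below assumes that Eq. (4.1) has the usual shifted-logarithm form Y = ln(X + a). The equation itself is not reproduced in this appendix, so this form is an assumption. The functions `log_transform` and `inverse_log_transform` are illustrative names. The example uses the annual coefficients for sites 1 and 2 from the table above and the first annual flow of each station from Appendix B.

```python
import math


def log_transform(x, a):
    """Shifted log transform, assuming Eq. (4.1) is Y = ln(X + a)."""
    if x + a <= 0:
        raise ValueError("x + a must be positive for the logarithm to exist")
    return math.log(x + a)


def inverse_log_transform(y, a):
    """Back-transform: X = exp(Y) - a."""
    return math.exp(y) - a


# Annual coefficients a for sites 1 and 2 (table above) and the first
# annual flow of each station (Appendix B).
cases = {1: (183.1, 49), 2: (158.1, 210)}

for site, (flow, a) in cases.items():
    y = log_transform(flow, a)
    back = inverse_log_transform(y, a)
    print(f"site {site}: a = {a}, Y = {y:.4f}, back-transformed = {back:.1f}")
```

The guard on x + a matters for sites with negative coefficients, where sufficiently small flows would make the argument of the logarithm non-positive.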
