Data Analysis / Time Series Analysis 621
[Figure 1 shows Y(n) plotted against n for n up to 100.]

Figure 1 An example of a realization of a white-noise time series with 100 entries. The mean of this time series is 2.00 and the entries are drawn from a normal distribution with unit variance. In this plot, as in subsequent ones, the entries are joined by a continuous line.
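The kind of series shown in Figure 1 is easy to generate; a minimal sketch (NumPy assumed; series length, mean, and variance are taken from the caption, the seed is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)

# White noise: 100 independent draws from a normal distribution with
# unit variance, shifted to a mean of 2.00, as in Figure 1.
y = 2.0 + rng.standard_normal(100)

# Each entry is independent of every other, so the sample statistics
# should sit close to the population values.
print(y.mean(), y.var())  # near 2.0 and near 1.0, respectively
```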
In time series we actually have a random function, or string of random variables, with individual realizations being individual graphs of the function. The property that is so evident is that averaging along the time series is equivalent to fixing the time and averaging across realizations to find the mean. This property of ensemble averaging is a very powerful one, which enables many of the proofs and analyses of time series analysis. An example of a single realization of a white-noise process is shown in Figure 1.

An example of a white-noise time series is the heights of students standing in a cafeteria queue. On the other hand, consider the heights of succeeding first sons in an ancestral sequence. There is a genetically determined correlation between the height of a father and that of his son. Such a time series exhibits a positive serial correlation which diminishes to zero after a few generations, a phenomenon known as regression to the mean.

Stationary Time Series

Consider next a time series which is not necessarily white-noise. That is to say, an entry may not necessarily be statistically independent of its predecessor; there may be serial correlation. Nevertheless, the time series may not have any preferred origin. That is, as we look along the time series, statistically speaking, each time is equivalent to each other time. In such a time series the mean is independent of time and so is the variance. The covariance between an entry and the one a certain number of steps, say n, earlier depends only on the temporal separation or lag, n. A time series having the above properties is called a stationary time series, and these are common in nature. (Strictly speaking this only defines a second-moment stationary time series, since nonstationary properties of the probability distribution may still be present. If the random variables are normally distributed, these mean and covariance stationarity properties suffice to determine the strong forms of stationarity.) Perhaps some examples of nonstationary time series will help clarify the concept. The diurnal or seasonal data mentioned above are examples of nonstationary time series, since their means depend on local time of day or time of year. In addition, their variances will also have such a phase dependence; even their serial correlation structure may have a phase dependence. For example, the serial correlation between entries may be greater in winter than in summer. At first glance the sequence of heights of first sons across generations may seem like a stationary time series, but there is known to be a secular trend of increasing heights over generations, probably because of better nutrition.

Despite our ability to enumerate many time series that are nonstationary, the model of a stationary time series is very valuable in the geosciences. For example, annual averages of temperature at a location are likely to form a stationary time series, at least to a good approximation. The statistics of such a time series (mean, variance, serial correlation properties) make a good summary of the sequence and for many purposes may form an adequate substitute in practical applications. For example, an insurance company may want to know the likelihood of the temperature (or flood water level) exceeding a given threshold. The serial correlation structure is particularly important in drought, where sequences of dry years can be the most important indicator of consequences.
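Because averaging along a stationary series stands in for the ensemble average, these statistics can be estimated from a single realization. A minimal sketch (NumPy assumed; the white-noise example series and its length are illustrative choices, not from the text):

```python
import numpy as np

def lag_covariance(y, n):
    """Sample covariance between entries separated by lag n.

    For a stationary series this depends only on the lag n,
    not on the position along the series.
    """
    y = np.asarray(y, dtype=float)
    d = y - y.mean()
    # Average the products of deviations n steps apart.
    return np.mean(d[n:] * d[:-n]) if n > 0 else np.mean(d * d)

# On a white-noise series, the lag-0 covariance is just the variance,
# and covariances at positive lags should be near zero.
rng = np.random.default_rng(1)
y = rng.standard_normal(500)
print(lag_covariance(y, 0))  # near 1.0
print(lag_covariance(y, 1))  # near 0.0
```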
Autoregressive Processes

The most common type of time series encountered in the geosciences is the first-order autoregressive process (known as the AR1 process). In this process each new entry can be written mathematically as the sum of two terms, the first proportional to the previous entry, the second an additive white-noise term. Higher-order autoregressive processes (ARn) model the next entry as a sum of n + 1 terms, the first n of which are proportional to the previous n entries, along with the additive white-noise term. We concentrate here on the AR1 process because of its central importance. The parameters which describe the time series are its mean, its variance, and its so-called lag-one serial correlation. It is the job of the analyst to take the given data series and determine or fit the parameters to the data. That is, one wants to know the mean, variance, and lag-one serial correlation in the data. If the lag-one serial correlation turns out to vanish, then we infer that the series can be modeled by a white-noise time series (AR0). If the lag-one serial correlation is r, then the lag-two is r², and so on. In the limit of very small time steps in the series this tends to an exponential falloff of serial correlation. The value of n for which the serial correlation falls to 1/e ≈ 0.3678… is known as the autocorrelation time. The autocorrelation time is a measure of the memory of the system. It is often said that the system forgets its past values after a few autocorrelation times (Figures 2 and 3).

[Figure 2 shows Y(n) plotted against n for n up to 100.]

Figure 2 An example of a realization of an autoregressive process of order one. In this example the present entry is 0.75 times the previous entry with an added normally distributed variable of variance 0.25. The mean of the time series is 3.00.

[Figure 3 shows the autocorrelation plotted against lag, for lags up to 10.]

Figure 3 The autocorrelation function corresponding to the AR1 process depicted in Figure 2. The lag is treated as a continuous variable in this plot for clarity of the display. The autocorrelation time for this process is about 4.0.
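The AR1 recursion and its exponential correlation falloff can be checked numerically. A sketch using the parameters quoted for Figure 2 (coefficient 0.75, noise variance 0.25, mean 3.00; NumPy assumed, series length and seed arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
r, mean, n_steps = 0.75, 3.0, 20000

# AR1: each anomaly about the mean is r times the previous anomaly
# plus normally distributed white noise of variance 0.25.
x = np.zeros(n_steps)
noise = np.sqrt(0.25) * rng.standard_normal(n_steps)
for t in range(1, n_steps):
    x[t] = r * x[t - 1] + noise[t]
y = mean + x

# The lag-k serial correlation should fall off approximately as r**k.
d = y - y.mean()
def corr(k):
    return np.mean(d[k:] * d[:-k]) / np.mean(d * d)

print(corr(1))  # near 0.75
print(corr(2))  # near 0.75**2 = 0.5625

# Lag at which r**n reaches 1/e: n = -1/ln(0.75), about 3.5. The common
# approximation 1/(1 - r), valid for r near 1, gives 4.0, consistent
# with the autocorrelation time quoted in the Figure 3 caption.
print(-1.0 / np.log(r))
```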
Fourier Analysis of Time Series

If there are physical reasons to think that a time series of data is stationary, then Fourier analysis of the data can lead to a number of powerful techniques useful in applications. One begins the analysis by taking the finite-length segment of data in the sequence and estimating the Fourier coefficients for representing the data as a Fourier series on the segment. In this process one is representing the data in terms of the Fourier coefficients instead of the temporal entries. The two are equivalent ways of expressing the content of the data. Each Fourier coefficient is a component or amplitude of a certain sinusoidal waveform in the data stream. From the point of view of time series modeling, the Fourier coefficients are random variables, since from one realization of the process on the same segment to another the coefficients will differ. However, they will have certain statistical properties common across the ensemble of realizations. If the segment is sufficiently long and the series is stationary, it can be proven that the Fourier coefficients corresponding to different frequencies are uncorrelated. This permits us to perform an analysis of variance over the different frequency bands to examine how variance is distributed over frequencies. It is routine to plot a graph of the variance (sometimes known as power) as a function of frequency. This is known as spectral analysis.

The most common example is the white-noise time series. The white-noise spectrum is flat; that is, every frequency band is allotted the same variance. Hence, one way of determining whether a certain time series is white-noise is to perform the Fourier analysis and plot the spectrum (variance or power versus frequency). If the spectrum is flat, we can infer that the time series is white-noise. Of course, if the time series segment is short, there will be problems in estimating the spectrum of the underlying process because of sampling error. Analysts have devised many useful techniques for statistical testing of the white-noise hypothesis. The term white-noise spectrum derives from optics,
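The flat-spectrum check described above can be sketched with a periodogram; a minimal, informal illustration (NumPy's FFT assumed; the split into two frequency halves is just one crude flatness comparison, not a formal test):

```python
import numpy as np

rng = np.random.default_rng(3)
y = rng.standard_normal(4096)  # a white-noise segment

# Periodogram: squared magnitudes of the Fourier coefficients, i.e.
# the variance allotted to each frequency band.
coeffs = np.fft.rfft(y - y.mean())
power = (np.abs(coeffs) ** 2) / len(y)

# For white noise the spectrum is flat on average, so the mean power
# in the low-frequency half should match the high-frequency half.
half = len(power) // 2
print(power[1:half].mean())  # the two halves should be comparable
print(power[half:].mean())
```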