
Time Series Analysis

an introduction

A time series is a collection of observations made sequentially in time. The methods described here assume equal intervals of time between observations.

Types of Time Series Data


Continuous vs. Discrete

Continuous - observations made continuously in time. Examples:
1. Seawater level as measured by an automated sensor.
2. Carbon dioxide output from an engine.

Discrete - observations made only at certain times. Examples:
1. Animal species composition measured every month.
2. Bacteria culture size measured every six hours.

Stationary vs. Non-stationary

Stationary - data that fluctuate around a constant value.
Non-stationary - a series in which the parameters of the cycle (i.e., length, amplitude or phase) change over time.

Deterministic vs. Stochastic

Deterministic time series - data that can be predicted exactly.
Stochastic time series - data that are only partly determined by past values, so that future values have to be described with a probability distribution. This is the case for most, if not all, natural time series. So many factors are involved in a natural system that we cannot possibly account for all of them correctly.
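The distinction can be illustrated with a minimal sketch. The sine wave, the coefficient 0.5, and the function names below are hypothetical choices made only for illustration; they do not come from the original notes.

```python
import math
import random


def deterministic(t):
    """A deterministic series: the value at any time t is known exactly."""
    return math.sin(2 * math.pi * t / 12)


def stochastic_path(n, rng):
    """A stochastic series: each value depends partly on the previous one
    and partly on a random draw, so only its distribution is predictable."""
    y = [0.0]
    for _ in range(n - 1):
        y.append(0.5 * y[-1] + rng.gauss(0, 1))
    return y


# The deterministic value at t = 3 (a quarter of the 12-step cycle) is fixed.
print(deterministic(3))

# Two realizations generated by the same stochastic rule generally differ.
rng = random.Random(42)
print(stochastic_path(5, rng))
print(stochastic_path(5, rng))
```

Running the deterministic function twice always gives the same answer, whereas each call to the stochastic generator produces a new realization of the same underlying process.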

Autocorrelation
A series of data may have observations that are not independent of one another. Example: a population density on day 8 depends on what that population density was on day 7, which in turn depends on day 6, and so forth. The order of these data has to be taken into account so that we can assess the autocorrelation involved.

To find out whether autocorrelation exists, autocorrelation coefficients measure correlations between observations a certain distance apart. Based on the ordinary correlation coefficient $r$, we can see whether successive observations are correlated. An autocorrelation coefficient at lag $k$ can be found by:

$$r_k = \frac{\sum_{t=1}^{n-k} (x_t - \bar{x})(x_{t+k} - \bar{x})}{\sum_{t=1}^{n} (x_t - \bar{x})^2}$$

where $n$ is the number of observations and $\bar{x}$ is their mean. This is the covariance of $x_t$ and $x_{t+k}$ divided by the variance of $x_t$. An $r_k$ value outside $\pm 2/\sqrt{n}$ denotes a significant difference from zero and signifies an autocorrelation. Also note that as $k$ gets large, $r_k$ generally becomes smaller.
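As a minimal sketch of this calculation, the function below computes the lag-k coefficient as the lag-k covariance divided by the variance, and compares it with the conventional approximate bound of 2 divided by the square root of the number of observations. The function names and the simulated data are hypothetical and serve only as an illustration.

```python
import math
import random


def autocorr(x, k):
    """Lag-k autocorrelation coefficient r_k of the sequence x."""
    n = len(x)
    mean = sum(x) / n
    # Numerator: sum of cross-products of deviations k steps apart.
    num = sum((x[t] - mean) * (x[t + k] - mean) for t in range(n - k))
    # Denominator: sum of squared deviations over the whole series.
    den = sum((v - mean) ** 2 for v in x)
    return num / den


def significance_bound(n):
    """Approximate 5% bound: |r_k| > 2 / sqrt(n) is treated as significant."""
    return 2.0 / math.sqrt(n)


# Hypothetical data: each value is 0.8 times the previous one plus noise,
# so neighbouring observations resemble each other.
random.seed(0)
series = [0.0]
for _ in range(199):
    series.append(0.8 * series[-1] + random.gauss(0, 1))

r1 = autocorr(series, 1)
bound = significance_bound(len(series))
print(f"r_1 = {r1:.3f}, bound = {bound:.3f}, significant: {abs(r1) > bound}")
```

Both sums use the same divisor in this form, so the divisor cancels and the coefficient at lag 0 is exactly 1.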

Correlograms
The autocorrelation coefficient $r_k$ can then be plotted against the lag ($k$) to develop a correlogram. This gives a visual overview of the correlation coefficients across the relevant time lags so that significant values may be seen. The correlogram in Fig. 2 shows a short-term correlation that is significant at low $k$ and small at longer lags. Remember that an $r_k$ value outside $\pm 2/\sqrt{n}$, where $n$ is the number of observations, denotes a significant difference ($\alpha = 0.05$) from zero and signifies an autocorrelation. Some procedures may call for a stricter significance level, since at $\alpha = 0.05$ about one out of every twenty coefficients in a truly random data series is expected to appear significant.

Figure 2. A time series showing short-term autocorrelation together with its correlogram.

Fig. 3 shows an alternating (negative correlation) time series. The coefficient $r_k$ alternates in sign as the raw data do ($r_1$ is negative, $r_2$ is positive, and so on), which indicates negative autocorrelation between successive observations.

Figure 3. An alternating time series with its correlogram.
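Since the figures are not reproduced here, a rough text correlogram can stand in for them. The sketch below is an assumption-laden illustration: the alternating series is simulated with a hypothetical coefficient of -0.8, the function names are invented for this example, and the bars are scaled arbitrarily to 30 characters.

```python
import math
import random


def correlogram(x, max_lag):
    """Return [r_1, ..., r_max_lag] for the sequence x."""
    n = len(x)
    mean = sum(x) / n
    den = sum((v - mean) ** 2 for v in x)
    return [
        sum((x[t] - mean) * (x[t + k] - mean) for t in range(n - k)) / den
        for k in range(1, max_lag + 1)
    ]


def print_correlogram(x, max_lag):
    """Print one bar per lag; '*' marks values outside +/- 2/sqrt(n)."""
    bound = 2.0 / math.sqrt(len(x))
    for k, r in enumerate(correlogram(x, max_lag), start=1):
        bar = ("+" if r >= 0 else "-") * round(abs(r) * 30)
        flag = "*" if abs(r) > bound else " "
        print(f"lag {k:2d} {r:+.2f} {flag} {bar}")


random.seed(1)
# Hypothetical alternating series: each value tends to flip the sign of the last.
alternating = [0.0]
for _ in range(199):
    alternating.append(-0.8 * alternating[-1] + random.gauss(0, 1))

print_correlogram(alternating, 6)
```

For a series like this one, the printed bars should switch between negative and positive from one lag to the next, which is the pattern described for Fig. 3.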



Box-Jenkins Models (Forecasting)


Box and Jenkins developed the AutoRegressive Integrated Moving Average (ARIMA) model, which combines the AutoRegressive (AR) and Moving Average (MA) models developed earlier with a differencing factor that removes the trend in the data.

The time series data can be expressed as:

$Y_1, Y_2, Y_3, \ldots, Y_{t-1}, Y_t$

with random shocks ($a$) at each corresponding time:

$a_1, a_2, a_3, \ldots, a_{t-1}, a_t$

In order to model a time series, we must state some assumptions about these 'shocks'. They have:
1. a mean of zero
2. a constant variance
3. no covariance between shocks
4. a normal distribution (although there are procedures for dealing with this)

An ARIMA (p,d,q) model is composed of three elements:
p: Autoregression
d: Integration or Differencing
q: Moving Average

A simple ARIMA (0,0,0) model without any of the three processes above is written as:

$$Y_t = a_t$$

The autoregression process [ARIMA (p,0,0)] refers to how important previous values are to the current one over time. A data value at $t_1$ may affect the data values of the series at $t_2$ and $t_3$, but its influence decreases exponentially as time passes, so that the effect falls to near zero. The coefficient $\phi_1$ is constrained between -1 and 1, and as it becomes larger, the effects at all subsequent lags increase. A first-order model, ARIMA (1,0,0), is written as:

$$Y_t = \phi_1 Y_{t-1} + a_t$$

The integration process [ARIMA (0,d,0)] differences the data to remove trend and drift (i.e., it makes non-stationary data stationary). The first observation is subtracted from the second, the second from the third, and so on. The final form without AR or MA processes is the ARIMA (0,1,0) model:

$$Y_t = Y_{t-1} + a_t$$

The order of the process rarely exceeds one ($d < 2$ in most situations).

The moving average process [ARIMA (0,0,q)] is used for serially correlated data. The process is composed of the current random shock and portions of the $q$ previous shocks. An ARIMA (0,0,1) model is described as:

$$Y_t = a_t - \theta_1 a_{t-1}$$

As with the integration process, the MA process rarely exceeds the first order.
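The simple models above can be simulated with a short sketch. This is only an illustration under stated assumptions: the names `simulate`, `difference`, `phi`, and `theta` are hypothetical, the shocks are drawn from a standard normal distribution, and the coefficient values 0.7 and 0.5 are arbitrary.

```python
import random


def simulate(n, phi=0.0, theta=0.0, d=0, seed=0):
    """Simulate Y_t with an optional AR(1) coefficient phi, an optional
    MA(1) coefficient theta, and d rounds of integration (running sums),
    driven by N(0, 1) shocks. Returns the series and the shocks."""
    rng = random.Random(seed)
    shocks = [rng.gauss(0, 1) for _ in range(n)]
    y = []
    for t in range(n):
        prev_y = y[t - 1] if t > 0 else 0.0
        prev_a = shocks[t - 1] if t > 0 else 0.0
        # Y_t = phi * Y_{t-1} + a_t - theta * a_{t-1}
        y.append(phi * prev_y + shocks[t] - theta * prev_a)
    for _ in range(d):
        # Integration: each value becomes the previous total plus the new term.
        total, integrated = 0.0, []
        for v in y:
            total += v
            integrated.append(total)
        y = integrated
    return y, shocks


def difference(y):
    """First difference: subtract each observation from the one after it."""
    return [y[t] - y[t - 1] for t in range(1, len(y))]


white_noise, _ = simulate(200)          # ARIMA (0,0,0)
ar1, _ = simulate(200, phi=0.7)         # ARIMA (1,0,0)
ma1, _ = simulate(200, theta=0.5)       # ARIMA (0,0,1)
walk, shocks = simulate(200, d=1)       # ARIMA (0,1,0)

# Differencing the random walk recovers the shocks (up to rounding error).
recovered = difference(walk)
print(max(abs(r - a) for r, a in zip(recovered, shocks[1:])))
```

In the first-order autoregressive case, a unit shock contributes the coefficient raised to the power k to the value k steps later, which is the exponential decay described above. The last lines show why one round of differencing is enough to turn the ARIMA (0,1,0) series back into a stationary one.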

Reference: http://userwww.sfsu.edu/~efc/classes/biol710/ meseries/TimeSeriesAnalysis.html
