
ARIMA Model building blocks
Dr. T. A. S. Vijayaraghavan

• Autoregressive (AR) models
• Moving-average (MA) models
• Mixed ARMA models
• Non-stationary models (ARIMA models)
• The mean parameter
• The trend parameter
The Box-Jenkins model building process

• Model identification
  – Autocorrelations
  – Partial-autocorrelations
• Model estimation
• Model validation
  – Certain diagnostics are used to check the validity of the model
• Model forecasting

Partial-autocorrelations (PACs)

• Partial-autocorrelations are another set of statistical measures used to identify time series models
• A PAC is similar to an AC, except that, when calculating it, the effects of the elements within the lag are partialled out (Box & Jenkins, 1976)
Partial-autocorrelations (cont.)

• PACs can be calculated from the values of the ACs, where each PAC is obtained from a different set of linear equations that describe a pure autoregressive model of an order equal to the lag of the partial-autocorrelation being computed
• The PAC at lag k is denoted by φkk
  – The double subscript kk emphasizes that φkk is the autoregressive parameter φk of the autoregressive model of order k

Model identification

• The sample ACs and PACs are computed for the series and compared with the theoretical autocorrelation and partial-autocorrelation functions of the candidate models being investigated
• [Diagram: theoretical ACs and PACs; stationarity and invertibility conditions]
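To make the "different set of linear equations" concrete, here is a minimal sketch using only NumPy (the helper names sample_acf and pacf_from_acf are illustrative, not from any package): each PAC at lag k is obtained as the last coefficient of the order-k pure-AR (Yule-Walker) system built from the ACs.

```python
import numpy as np

def sample_acf(z, nlags):
    # Sample autocorrelations r_0 .. r_nlags of the series z
    z = np.asarray(z, dtype=float)
    zc = z - z.mean()
    denom = np.sum(zc ** 2)
    return np.array([np.sum(zc[k:] * zc[:len(z) - k]) / denom
                     for k in range(nlags + 1)])

def pacf_from_acf(rho):
    # phi_kk for k = 1 .. len(rho)-1: solve the order-k Yule-Walker equations
    # and keep the last AR coefficient, which is the PAC at lag k
    pacs = []
    for k in range(1, len(rho)):
        R = np.array([[rho[abs(i - j)] for j in range(k)] for i in range(k)])
        r = np.asarray(rho[1:k + 1])
        phi = np.linalg.solve(R, r)
        pacs.append(phi[-1])
    return np.array(pacs)
```

These sample quantities are what the identification step compares with the theoretical patterns summarized in the following slides.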
Stationarity requirements for an AR(1) model (cont.)

• For an autoregressive model of order p, the theoretical autocorrelation function satisfies the difference equation

      ρk = φ1 ρk−1 + φ2 ρk−2 + … + φp ρk−p

  which for p = 1 and with ρ0 = 1 has the solution ρk = φ1^k for k > 0, i.e., exponential decay
• For a stationary AR(1) model, the theoretical autocorrelation function therefore decays exponentially to zero
• However, the theoretical partial-autocorrelation function has a cut off after the 1st lag
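A quick numerical check of this AR(1) behaviour, as a sketch assuming statsmodels is available (ArmaProcess takes the full AR polynomial, so an illustrative φ1 = 0.7 enters with a minus sign):

```python
import numpy as np
from statsmodels.tsa.arima_process import ArmaProcess

phi1 = 0.7                                 # any |phi1| < 1 gives a stationary AR(1)
ar1 = ArmaProcess(ar=[1, -phi1], ma=[1])   # z_t = 0.7 z_{t-1} + a_t
print(np.round(ar1.acf(lags=8), 3))        # 1, 0.7, 0.49, ... -> exponential decay (0.7**k)
print(np.round(ar1.pacf(lags=8), 3))       # ~0.7 at lag 1, then ~0 -> cut off after lag 1
```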
Invertibility requirements for a MA(1) model (cont.)

• For an invertible MA(1) model, the theoretical autocorrelation function has a cut off after the 1st lag
• However, the theoretical partial-autocorrelation function decays exponentially to zero

Theoretical PACs

• The partial-autocorrelations of a time series produce patterns that are exactly the reverse of the autocorrelation patterns with respect to the AR and MA parameters
• That is, partial-autocorrelation patterns for AR models look like autocorrelation patterns for MA models, and vice versa
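The reversal can be seen numerically with the same kind of sketch (statsmodels assumed; θ1 = 0.7 is just an illustrative invertible value):

```python
import numpy as np
from statsmodels.tsa.arima_process import ArmaProcess

theta1 = 0.7                               # |theta1| < 1 -> invertible MA(1)
ma1 = ArmaProcess(ar=[1], ma=[1, theta1])  # z_t = a_t + 0.7 a_{t-1}
print(np.round(ma1.acf(lags=8), 3))        # nonzero only at lag 1 -> cut off after lag 1
print(np.round(ma1.pacf(lags=8), 3))       # tails off toward zero rather than cutting off
```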
Permissible regions for the AR and MA parameters

[Figure]

Higher order models

• For an AR model of order p > 1:
  – The autocorrelation function consists of a mixture of damped exponentials and damped sine waves
  – The partial-autocorrelation function has a cut off after lag p
• For an MA model of order q > 1:
  – The autocorrelation function has a cut off after lag q
  – The partial-autocorrelation function consists of a mixture of damped exponentials and damped sine waves
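For instance, an AR(2) whose characteristic roots are complex shows the damped sine-wave behaviour in the ACF and the cut off after lag p = 2 in the PACF. A minimal sketch, assuming statsmodels and the illustrative stationary values φ1 = 1.0, φ2 = −0.5:

```python
import numpy as np
from statsmodels.tsa.arima_process import ArmaProcess

ar2 = ArmaProcess(ar=[1, -1.0, 0.5], ma=[1])  # z_t = 1.0 z_{t-1} - 0.5 z_{t-2} + a_t
print(np.round(ar2.acf(lags=12), 3))          # damped, oscillating (sine-wave-like) decay
print(np.round(ar2.pacf(lags=12), 3))         # nonzero at lags 1 and 2, ~0 afterwards
```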
Theoretical ACs and PACs for an AR(1) process

[Figure]

Theoretical ACs and PACs for an AR(2) process

[Figure]

Theoretical ACs and PACs for a MA(1) process

[Figure]

Theoretical ACs and PACs for a MA(2) process

[Figure]

Theoretical ACs and PACs for an ARMA(1,1) process

[Figure]
ARMA(1,1)

[Figure]

Model estimation

• There are three objectives in estimating a specific Box-Jenkins model for a given series:
  1. Determine optimum values for the selected AR and/or MA parameters so that the sum of squared residuals is minimized. These parameters are called "the maximum likelihood parameters" or "the least squares parameters".
  2. Obtain residuals a_t that are not correlated with one another.
  3. Use as few parameters as necessary to obtain an adequate model; i.e., make sure the model is parsimonious (not overspecified).
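A minimal estimation sketch, assuming statsmodels; the simulated series z is only a stand-in for the analyst's own data, and the order (1, 0, 1) is assumed to have been suggested by the identification step:

```python
import numpy as np
from statsmodels.tsa.arima_process import ArmaProcess
from statsmodels.tsa.arima.model import ARIMA

np.random.seed(0)
z = ArmaProcess(ar=[1, -0.7], ma=[1, 0.4]).generate_sample(nsample=300)  # stand-in series

fit = ARIMA(z, order=(1, 0, 1)).fit()  # maximum likelihood / least squares estimates
print(fit.params)                      # estimated model parameters
resid = fit.resid                      # the residuals a_t used by the diagnostic checks
print(fit.aic)                         # lower AIC across candidate orders favours parsimony
```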
Model verification

Diagnostic checks:
• Parameter diagnostics
  – Confidence limits
  – Correlations
• Overfitting
• Residual diagnostics
  – Residual mean
  – Residual mean percent error
  – Correlogram
  – Q-statistic
  – Cumulative periodogram
  – Normality
  – Error variance
• Closeness of fit statistics
  – Average absolute error
  – Residual standard error
  – Average absolute percent error
  – Index of determination

Residual mean or Mean error (ME)

• The residual mean is simply the average of all the computed residuals
• If the residual mean is significantly nonzero, then the fitted values are consistently either higher or lower than the original series values
• To check whether the residual mean is significantly (with 95% confidence) nonzero, its magnitude can be compared with 2·S_a/√n, where S_a is the standard deviation of the residuals and n is the number of residuals
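A minimal sketch of this check, using only NumPy (residual_mean_check is an illustrative helper name):

```python
import numpy as np

def residual_mean_check(resid):
    # Compare the residual mean (ME) with the ~95% band 2 * S_a / sqrt(n)
    resid = np.asarray(resid, dtype=float)
    n = len(resid)
    me = resid.mean()
    limit = 2.0 * resid.std(ddof=1) / np.sqrt(n)
    return me, limit, abs(me) > limit   # True -> mean is significantly nonzero
```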
Correlogram of the residuals

[Figure]
• Solid lines represent the 95% confidence limits of two standard deviations

Q statistic of the residuals

• Is used to judge whether the autocorrelations of the residual series, as a whole, are significantly nonzero
• By comparing the Q-statistic with a critical test value (the chi-square value), we can determine (with a certain degree of confidence) whether the residual autocorrelations, tested as a whole, are significant
• Using the Ljung-Box formula:

      Q = n(n + 2) Σ_{k=1..m} r_k² / (n − k)

  where r_k is the residual autocorrelation at lag k and m is the number of lags tested
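A minimal sketch of the formula using only NumPy; statsmodels' acorr_ljungbox implements the same statistic and, assuming it is available, can serve as a cross-check:

```python
import numpy as np

def ljung_box_q(resid, m):
    # Q = n(n + 2) * sum_{k=1..m} r_k**2 / (n - k)
    resid = np.asarray(resid, dtype=float)
    n = len(resid)
    rc = resid - resid.mean()
    denom = np.sum(rc ** 2)
    r = np.array([np.sum(rc[k:] * rc[:n - k]) / denom for k in range(1, m + 1)])
    return n * (n + 2) * np.sum(r ** 2 / (n - np.arange(1, m + 1)))

# Cross-check against the chi-square critical value, e.g.:
# from statsmodels.stats.diagnostic import acorr_ljungbox
# print(acorr_ljungbox(resid, lags=[20]))
```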
Normality of the residuals

[Figure]

Error variance

[Figure: residuals plotted against predicted effluent TSS (mg/L)]
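Both checks are usually done graphically. A minimal sketch, assuming matplotlib and statsmodels are available, and reusing the fit and resid objects from the estimation sketch above:

```python
import matplotlib.pyplot as plt
import statsmodels.api as sm

# Normality: normal probability (Q-Q) plot of the residuals
sm.qqplot(resid, line="s")
plt.title("Normality of the residuals")

# Error variance: residuals against fitted values; a roughly constant spread
# (no funnel shape) suggests a constant error variance
plt.figure()
plt.scatter(fit.fittedvalues, resid, s=10)
plt.axhline(0.0, linestyle="--")
plt.xlabel("Fitted (predicted) values")
plt.ylabel("Residual")
plt.title("Error variance check")
plt.show()
```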
Index of determination (R²)

      R² = 1 − Σ a_t² / Σ (Z_t − µ)²

• The index of determination is a ratio that compares the amount of "variation" present in the original series values with the amount of that variation that has been accounted for by the fit
• A high index of determination does not necessarily mean a good model

Seasonal AC and PAC patterns

• The autocorrelation patterns associated with purely seasonal models are analogous to those for nonseasonal models, with the only difference being that the nonzero autocorrelations that form the pattern occur at lags that are multiples of the number of periods per season
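A minimal sketch of the index of determination formula above, using only NumPy; z and resid stand for the original series Z_t and the fitted residuals a_t from the estimation step:

```python
import numpy as np

def index_of_determination(z, resid):
    # R^2 = 1 - sum(a_t**2) / sum((Z_t - mu)**2)
    z = np.asarray(z, dtype=float)
    resid = np.asarray(resid, dtype=float)
    return 1.0 - np.sum(resid ** 2) / np.sum((z - z.mean()) ** 2)

# print(index_of_determination(z, resid))  # a high value alone does not guarantee a good model
```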
