ARIMA Models (2011-12)

Session 4

Model identification
Introduction

Model identification is not a single process. Box and Jenkins recommend the iterative sequence:

Identification → Estimation → Diagnostic checking → Identification → …

- In what follows we assume a stationary series.

- Main identification tool: the autocorrelations.

Preliminary estimation of autocorrelations

\[
r_k = \hat{\rho}_k = \frac{\sum_{t=k+1}^{n}(X_t-\bar{X})(X_{t-k}-\bar{X})}{\sum_{t=1}^{n}(X_t-\bar{X})^2},
\qquad
\bar{X} = \frac{1}{n}\sum_{t=1}^{n} X_t,
\]

n = length of the stationary series.

- The estimators are consistent irrespective of the underlying model.
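The estimator r_k above can be computed directly. A minimal sketch in Python (the function name and the simulated white-noise series are illustrative, not from the slides):

```python
import random

def sample_acf(x, max_lag):
    """Sample autocorrelations r_k (k = 1..max_lag) of a stationary series:
    r_k = sum_{t=k+1}^n (x_t - xbar)(x_{t-k} - xbar) / sum_{t=1}^n (x_t - xbar)^2."""
    n = len(x)
    xbar = sum(x) / n
    denom = sum((v - xbar) ** 2 for v in x)
    return [sum((x[t] - xbar) * (x[t - k] - xbar) for t in range(k, n)) / denom
            for k in range(1, max_lag + 1)]

# For a white-noise series, each r_k should lie within a few multiples of 1/sqrt(n)
random.seed(0)
wn = [random.gauss(0.0, 1.0) for _ in range(500)]
r = sample_acf(wn, 10)
```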

Asymptotic distribution of autocorrelation estimators

Theorem. As n → ∞ (n = length of the stationary series),

\[
r_k \sim N(\rho_k, C_{kk}),
\qquad
C_{kk} = \frac{1}{n}\sum_{i=-\infty}^{\infty}\left(\rho_i^2 + \rho_{i+k}\rho_{i-k} - 4\rho_k\rho_i\rho_{i-k} + 2\rho_k^2\rho_i^2\right).
\]

Denoting C_{k,k+j} = Cov(r_k, r_{k+j}),

\[
C_{k,k+j} = \frac{1}{n}\sum_{i=-\infty}^{\infty}\left(\rho_i\rho_{i+j} + \rho_{i+k+j}\rho_{i-k} - 2\rho_k\rho_i\rho_{i+k+j} - 2\rho_{k+j}\rho_i\rho_{i-k} + 2\rho_k\rho_{k+j}\rho_i^2\right).
\]
Special case: white noise (ρ_k = 0 for k ≠ 0):

\[
C_{kk} = \frac{1}{n}, \qquad C_{k,k+j} = 0.
\]

- This result is used extensively for testing the goodness of fit of the model fitted to the series, by applying it to the autocorrelations of the estimated model residuals.
Variance of Autocorrelations in MA(q) processes

For k > q,

\[
C_{kk} = \frac{1}{n}\Bigl[1 + 2\sum_{i=1}^{q}\rho_i^2\Bigr].
\]

- This result is used for testing that ρ_k = 0 from some lag (q+1) onwards. The variance formula applies approximately even when ρ_k ≠ 0, since ρ_k ≈ 0 for large k.

Graphical test

Standard computer software plots the values of r_k together with the bounds ±Z_{1−α/2}√Ĉ_kk for k = 1, 2, …, where Z_{1−α/2} is the upper 1−(α/2) percentile of the normal distribution. The computation of the variance Ĉ_kk assumes that ρ_k is the first autocorrelation that equals zero, i.e.,

\[
\hat{C}_{kk} = \frac{1}{n}\Bigl[1 + 2\sum_{i=1}^{k-1}\hat{\rho}_i^2\Bigr].
\]

Thus the variance, and hence the lengths 2 Z_{1−α/2}√Ĉ_kk of the confidence intervals, increase with k until they stabilize.


- A value r_k outside the ±Z_{1−α/2}√Ĉ_kk bounds indicates rejection of the hypothesis H₀: ρ_k = 0.
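As a sketch, the widening bands ±Z_{1−α/2}√Ĉ_kk can be computed as follows (the helper name and example values are ours, not from the slides; Z_0.975 ≈ 1.96 is assumed for α = 0.05):

```python
import math

def acf_confidence_bands(r, n, z=1.96):
    """Half-widths z * sqrt(C_kk_hat) for k = 1, 2, ..., where
    C_kk_hat = (1/n) * (1 + 2 * sum_{i=1}^{k-1} r_i^2); r_k is significant
    at the 5% level (z = 1.96) if |r_k| exceeds the k-th band."""
    bands, cum = [], 0.0
    for rk in r:
        bands.append(z * math.sqrt((1.0 + 2.0 * cum) / n))
        cum += rk ** 2
    return bands

# Band widths are nondecreasing in k, stabilizing once the r_i are near zero
bands = acf_confidence_bands([0.5, 0.1, 0.0], n=100)
```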
Testing significance of autocorrelations (cont.)

Even if ρ_k = 0 from some lag onwards and, for example, α = 0.05 (so 1−(α/2) = 0.975), about 1 out of 20 estimated correlations would fall outside the ±Z_{0.975}√Ĉ_kk boundaries, and hence the null hypothesis would be rejected for those correlations. In such cases it is important to check for which lags the hypothesis has been rejected. (Normally we wouldn't worry if, for example, r_17 turned out to be significant.)
Variances of Partial Correlations

For an AR(p) process,

\[
\hat{\phi}_{kk} \sim N\Bigl(\phi_{kk}, \frac{1}{n}\Bigr) \quad \text{for } k > p,
\]

and the estimators are approximately independent.

- This variance approximation is also used for ARMA(p,q) models, because after lag (p−q) the partial correlations decay to zero and hence we expect them to be close to zero after lag p.
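The slides take the estimated partial correlations φ̂_kk as given by the software. As a sketch, they can be obtained from the (estimated) autocorrelations via the Durbin-Levinson recursion (this helper is illustrative, not part of the slides):

```python
def pacf_from_acf(rho):
    """Partial autocorrelations phi_kk via the Durbin-Levinson recursion.
    rho = [rho_1, rho_2, ...] (rho_0 = 1 implied)."""
    phi_prev, pacf = [], []
    for k, rk in enumerate(rho, start=1):
        if k == 1:
            phi_kk = rk
            phi_cur = [phi_kk]
        else:
            num = rk - sum(phi_prev[j] * rho[k - 2 - j] for j in range(k - 1))
            den = 1.0 - sum(phi_prev[j] * rho[j] for j in range(k - 1))
            phi_kk = num / den
            # Update phi_{k,j} = phi_{k-1,j} - phi_kk * phi_{k-1,k-j}
            phi_cur = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j]
                       for j in range(k - 1)] + [phi_kk]
        pacf.append(phi_kk)
        phi_prev = phi_cur
    return pacf

# AR(1) with phi = 0.5 has rho_k = 0.5^k, so phi_11 = 0.5 and phi_kk = 0 for k > 1
p = pacf_from_acf([0.5, 0.25, 0.125])
```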
Model diagnostics

After the series has been transformed to a stationary series and a preliminary model has been identified and estimated, we need to test the goodness of fit of the model. The conclusion of this stage may be either to accept the model or to try one or more alternative models. Below are a few diagnostic procedures in common use.

1- Test the significance of the estimators of the model parameters.

The unknown parameters include the constant term (see Exercise 1 for the variance of the sample mean). Nonsignificant coefficients can be dropped from the model, but only one at a time.
Model diagnostics (cont.)

2- Test the significance of the estimators of the autocorrelations of the model residuals.

This step can be implemented either by testing the significance of each estimator separately (the estimator r_k = ρ̂_k is nonsignificant if it lies within the ±Z_{1−α/2}√Ĉ_kk range), or by testing a group of estimators jointly, using the Portmanteau test (see below).

- The problem with the individual tests is that the overall probability of rejecting at least one null hypothesis, even if they are all correct, is well beyond the assigned α-level used for each of the individual tests.
Portmanteau test for autocorrelations

\[
H_0: \rho_1 = \cdots = \rho_K = 0 \quad (\text{first } K \text{ autocorrelations} = 0),
\]

\[
Q = n(n+2)\sum_{k=1}^{K}\frac{r_k^2}{n-k} \;\overset{H_0}{\sim}\; \chi^2(K - P^*),
\qquad
r_k = \frac{\sum_{t=k+1}^{n}\hat{e}_t\hat{e}_{t-k}}{\sum_{t=1}^{n}\hat{e}_t^2};
\]

n = number of observations after differencing;
P* = total number of estimated parameters.

- Rejection of H₀ suggests that at least one of the autocorrelations is different from zero.

- The use of this test controls the overall rejection probability, but it does not indicate directly which correlation is significant. The power of the test is generally low. Thus, the individual tests and the Portmanteau test complement each other.
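A minimal sketch of the Q statistic in Python (names are illustrative; in practice Q would be compared with the χ²(K − P*) critical value from tables or software):

```python
def portmanteau_q(r, n):
    """Q = n(n+2) * sum_{k=1}^K r_k^2 / (n - k), where r holds the first K
    residual autocorrelations and n is the number of observations after
    differencing. Compare with the chi-square(K - P*) critical value."""
    return n * (n + 2) * sum(rk ** 2 / (n - k) for k, rk in enumerate(r, start=1))

# Two residual autocorrelations from a series of length 100
q = portmanteau_q([0.1, 0.2], n=100)
```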
Model diagnostics (cont.)

Example: Suppose that we fit the model

\[
\phi_p(B)\Phi_P(B^s)W_t = \theta_q(B)\Theta_Q(B^s)b_t,
\]

and when testing the significance of the residual autocorrelations we find that r_{k*} is significant for some k*, but all the other estimated autocorrelations are close to zero and nonsignificant. This suggests that b_t = ε_t − ϑε_{t−k*}, so we modify the initial model to

\[
\phi_p(B)\Phi_P(B^s)W_t = \theta_q(B)\Theta_Q(B^s)(1 - \vartheta B^{k^*})\varepsilon_t,
\]

and fit (and test) the modified model.
Model diagnostics (cont.)

3- Assess prediction performance

This is in many ways the most important test if the model is fitted for prediction purposes. As a simple criterion one may compute the statistics

\[
\mathrm{RMSE}(b) = \sqrt{\frac{1}{N-k}\sum_{t=k+1}^{N}\bigl(Y_t - \hat{Y}_{t|t-b}\bigr)^2}, \quad b = 1, 2, \ldots,
\]

for some constant k, or

\[
\mathrm{RelP}(b) = \frac{1}{N-k}\sum_{t=k+1}^{N}\Bigl|\frac{Y_t - \hat{Y}_{t|t-b}}{Y_t}\Bigr|, \quad b = 1, 2, \ldots
\]

- The prediction performance is assessed with respect to the original series.

- For b = 1, RMSE(b) and RelP(b) refer to the one-step-ahead prediction errors (innovations).
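The two statistics can be sketched as follows (a hedged sketch: the pairing of actual values Y_t with their b-step-ahead forecasts Ŷ_{t|t−b} is assumed to be done by the caller, and the names are ours):

```python
import math

def rmse(actual, forecast):
    """RMSE(b) over pairs (Y_t, Yhat_{t|t-b}) for t = k+1, ..., N."""
    return math.sqrt(sum((y - f) ** 2 for y, f in zip(actual, forecast)) / len(actual))

def relp(actual, forecast):
    """RelP(b): mean absolute relative prediction error over the same pairs."""
    return sum(abs((y - f) / y) for y, f in zip(actual, forecast)) / len(actual)

r1 = rmse([10.0, 20.0], [8.0, 24.0])   # sqrt((4 + 16)/2) = sqrt(10)
p1 = relp([10.0, 20.0], [8.0, 24.0])   # (0.2 + 0.2)/2 = 0.2
```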
Assessing prediction performance (cont.)

The statistics RMSE(b) and RelP(b) can be computed within the "fitting region", i.e., using the same data for model fitting (identification, estimation) and for assessing the prediction performance, or outside the "fitting region", i.e., using the first N−f observations for model fitting and assessing the prediction performance on the last f observations.

- The model residuals may not behave like independent errors and yet the predictions could be satisfactory.
Model diagnostics (cont.)

4- Model Extensions

Even when we are satisfied with the model, it is always advisable to extend it by adding extra AR or MA terms (regular or seasonal) and to check whether they improve the existing model. This can be done by testing the significance of the added coefficients and by comparing the prediction performance of the two models.

- It is important not to add regular AR and MA terms at the same time, and likewise for seasonal terms. Otherwise, the extended model could be "multicollinear", with very unstable (large-variance) estimators.
Model extension (cont.)

Example: Suppose that the true model is white noise, W_t = ε_t. Adding a regular MA term and a regular AR term simultaneously yields the model

\[
(1 - \phi B)W_t = (1 - \theta B)\varepsilon_t \quad [\mathrm{ARMA}(1,1)],
\]

which is a correct model for any pair φ = θ. Fitting the latter model is formidable. This can also be seen by noting that for ARMA(1,1),

\[
\mathrm{Cov}(\hat{\phi}, \hat{\theta}) \approx \frac{1}{n}\,\frac{1-\phi\theta}{(\phi-\theta)^2}
\begin{pmatrix}
(1-\phi^2)(1-\phi\theta) & (1-\phi^2)(1-\theta^2) \\
(1-\phi^2)(1-\theta^2) & (1-\theta^2)(1-\phi\theta)
\end{pmatrix}.
\]

As θ → φ, the variances tend to infinity.

- The risk of having to deal with multicollinear models is why we don't start with very extensive models and then test for nonsignificant coefficients.
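The variance explosion as θ approaches φ can be illustrated numerically from the diagonal of the covariance matrix above (a sketch; the function name is ours, and the formula is taken from the slide):

```python
def arma11_asy_var(phi, theta, n):
    """Asymptotic variances of (phi_hat, theta_hat) for ARMA(1,1),
    from the diagonal of the slide's covariance matrix:
    Var(phi_hat)   = (1/n) * (1-phi*theta)^2 * (1-phi^2)   / (phi-theta)^2,
    Var(theta_hat) = (1/n) * (1-phi*theta)^2 * (1-theta^2) / (phi-theta)^2."""
    c = (1 - phi * theta) / (n * (phi - theta) ** 2)
    return (c * (1 - phi ** 2) * (1 - phi * theta),
            c * (1 - theta ** 2) * (1 - phi * theta))

far = arma11_asy_var(0.5, 0.1, n=100)    # theta far from phi: moderate variances
near = arma11_asy_var(0.5, 0.49, n=100)  # theta close to phi: variances blow up
```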
Model diagnostics (cont.)

5- Choice between Alternative Plausible Models

At the end of the testing process we may end up with two or more plausible models that seem to fit the data "equally well". In such cases it is advisable to extend the existing models in such a way that the extended model includes both of them as special cases.

Example: Suppose that we hesitate between the following two models fitted to the series W_t = X_t − X_{t−1}:

\[
(1 - 0.11B - 0.34B^2 - 0.16B^3)W_t = \varepsilon_t \quad [\mathrm{AR}(3)],
\]

\[
(1 - 0.45B - 0.31B^2)W_t = (1 - 0.36B)\varepsilon_t \quad [\mathrm{ARMA}(2,1)].
\]

- A model that includes the two models as special cases is ARMA(3,1).
Choice between alternative models (cont.)

Fitting the ARMA(3,1) model gives

\[
(1 - 0.42B - 0.29B^2 - 0.04B^3)W_t = (1 - 0.32B)\varepsilon_t,
\]

with standard errors (0.31), (0.09), (0.14) and (0.31) respectively.

The large standard errors suggest that there is a redundant term. Note that the estimated coefficients are close to the estimates obtained for the ARMA(2,1) model.

The two original models are in fact very similar. To see this, write the ARMA(2,1) model as

\[
(1 - 0.36B)^{-1}(1 - 0.45B - 0.31B^2)W_t = \varepsilon_t,
\]

\[
(1 + 0.36B + 0.36^2B^2 + 0.36^3B^3 + \cdots)(1 - 0.45B - 0.31B^2)W_t = \varepsilon_t,
\]

\[
(1 - 0.09B - 0.34B^2 - 0.12B^3 - 0.04B^4 - 0.02B^5 - \cdots)W_t = \varepsilon_t,
\]

whose leading coefficients are close to those of the AR(3) model.
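The AR(∞) expansion above can be reproduced by a short recursion (our sketch, using the slide's convention φ(B) = 1 − Σφ_iB^i, θ(B) = 1 − Σθ_jB^j):

```python
def arma_pi_weights(phi, theta, m):
    """Coefficients c_0..c_m of pi(B) = theta(B)^{-1} phi(B), where
    phi(B) = 1 - sum_i phi[i-1] B^i and theta(B) = 1 - sum_j theta[j-1] B^j.
    Matching powers of B in theta(B) pi(B) = phi(B) gives the recursion
    c_k = -phi_k + sum_j theta_j * c_{k-j}, with c_0 = 1."""
    c = [1.0]
    for k in range(1, m + 1):
        phi_k = phi[k - 1] if k <= len(phi) else 0.0
        c.append(-phi_k + sum(theta[j - 1] * c[k - j]
                              for j in range(1, min(k, len(theta)) + 1)))
    return c

# ARMA(2,1) from the slide: (1 - 0.45B - 0.31B^2) W_t = (1 - 0.36B) eps_t
c = arma_pi_weights([0.45, 0.31], [0.36], 5)
# c[1:] rounds to [-0.09, -0.34, -0.12, -0.04, -0.02], matching the expansion
```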
Choice between models: AIC and SBC

Akaike's AIC criterion:

AIC = −2 log(maximized likelihood) + 2P*,

where P* = number of estimated parameters.

Schwarz's SBC criterion:

SBC = −2 log(maximized likelihood) + log(n*) · P*,

where n* = number of estimated residuals.
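The two criteria can be sketched as follows (the log-likelihood value in the comparison is illustrative, not from the slides):

```python
import math

def aic(log_lik, p_star):
    """AIC = -2 log(maximized likelihood) + 2 P*."""
    return -2.0 * log_lik + 2.0 * p_star

def sbc(log_lik, p_star, n_star):
    """SBC = -2 log(maximized likelihood) + log(n*) P*."""
    return -2.0 * log_lik + math.log(n_star) * p_star

# Hypothetical fit: SBC penalizes parameters more than AIC once log(n*) > 2
a = aic(-100.0, 3)        # 206.0
s = sbc(-100.0, 3, 100)   # 200 + 3*log(100)
```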

- Many other important diagnostic statistics are discussed and illustrated in the literature.
