
BABS 502

Lecture 8
ARIMA Forecasting II
March 16 and 21, 2011

Content
The Box-Jenkins Modeling Process
Seasonal ARIMA Models
Concluding comments on ARIMA models

Martin L. Puterman 2008

The Box-Jenkins Approach to Forecasting with ARIMA Models

Identification
Fitting
Diagnostics
Refitting if necessary
Forecasting


Identification
What does the data look like?
What patterns exist?
Is the data stationary?

Tools
Plots of data
PACF
ACF
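
These identification steps can be carried out in a few lines of Python. The sketch below is illustrative only: statsmodels is assumed to be available, and the wages series is a simulated stand-in, not the lecture data.

```python
# Identification sketch: plot the series and its ACF/PACF, and run an
# augmented Dickey-Fuller test for stationarity.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
from statsmodels.tsa.stattools import adfuller

# Simulated stand-in for the wages series (trend + monthly seasonality)
rng = np.random.default_rng(0)
t = np.arange(120)
wages = pd.Series(5 + 0.01 * t + 0.1 * np.sin(2 * np.pi * t / 12)
                  + 0.05 * rng.standard_normal(120))

wages.plot(title="Plot of Wages")      # look for trend and seasonal pattern
plot_acf(wages, lags=40)               # slow decay suggests differencing
plot_pacf(wages, lags=40)
plt.show()

stat, pvalue, *_ = adfuller(wages)     # H0: unit root (non-stationary)
print(f"ADF statistic = {stat:.3f}, p-value = {pvalue:.3f}")
```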


Model Fitting
Trial model is proposed
e.g. ARIMA(0,1,2)

Model parameters are estimated using statistical software
Output includes

Parameter estimates
Test statistics
Goodness of fit measures
Residuals
Diagnostics
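
The slides use NCSS; a rough Python equivalent of this fitting step, continuing the simulated wages series from the identification sketch, might look like this:

```python
# Fit the trial ARIMA(0,1,2) model; summary() reports parameter estimates,
# standard errors, test statistics and goodness-of-fit measures (AIC/BIC).
from statsmodels.tsa.arima.model import ARIMA

fit = ARIMA(wages, order=(0, 1, 2)).fit()
print(fit.summary())
residuals = fit.resid    # kept for the diagnostic checks on the next slides
```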

Diagnostics
Determines whether the model fits the data adequately.
The goal is to extract all information from the data and ensure that the residuals are white noise.

Key measures
ACF of Residuals
PACF of Residuals
Ljung-Box-Pierce Q Statistic (Portmanteau Test)
Tests whether a set of residual autocorrelations is significantly different from zero.
See next slide for details

If the model is deemed adequate, proceed with forecasting; otherwise try a new model.

Comments on Model Adequacy Testing (NCSS Documentation)
The Portmanteau Test (sometimes called the Box-Pierce-Ljung statistic) is used to determine if there is any pattern left in the residuals that may be modeled. This is accomplished by testing the significance of the autocorrelations up to a certain lag. In a private communication with Dr. Greta Ljung, we have learned that this test should only be used for lags between 13 and 24. The test is computed as:

Q(k) = N(N+2) * sum_{j=1..k} [ r_j^2 / (N - j) ]

where r_j is the jth residual autocorrelation.


Under H0 (all residual autocorrelations equal zero), Q(k) is distributed as chi-square with (k - p - q - P - Q) degrees of freedom, where p, q, P and Q are the model orders.
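
For example, the same test can be run on the residuals from the fitting sketch above. The choice of lag 24 and the model_df adjustment follow the formula on this slide; this is illustrative, not the NCSS output.

```python
# Ljung-Box Q test on the ARIMA(0,1,2) residuals; model_df = p + q + P + Q,
# so the chi-square degrees of freedom are k - (p + q + P + Q).
from statsmodels.stats.diagnostic import acorr_ljungbox

lb = acorr_ljungbox(residuals, lags=[24], model_df=2)
print(lb)    # a small p-value means the residuals are not white noise
```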


Forecasting with ARIMA models


ARIMA forecasting is done automatically in any statistical program.
You should try to figure out how this works in terms of the equation for the model.
It helps to write out the model equation.
This is complicated with seasonal models; we discuss it below.

In the AR portion of the model, forecasts use past values.
In the MA portion of the model, forecasts use past residuals.

Prediction intervals are usually very wide; out-of-sample forecast errors may be more reliable.
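
A minimal sketch of how a fitted statsmodels model produces these forecasts and prediction intervals, continuing the earlier fit object:

```python
# Point forecasts and 95% prediction intervals from the fitted model.
pred = fit.get_forecast(steps=12)
print(pred.predicted_mean)          # point forecasts
print(pred.conf_int(alpha=0.05))    # note how quickly the intervals widen
```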

Google Share Price Forecasting


Series: Weekly price (Jan 23, 2006 to March 3, 2008)
Model: Regular (1,1,0), no seasonal parameters
Observations: 111
Root Mean Square: 20.12511
Model Estimation Section
Parameter Name   Parameter Estimate   Standard Error   T-Value   Prob Level
AR(1)            0.2383635            .009             2.5978    0.009382

Forecast of price
Row   Date   Forecast   Lower 95% Limit   Upper 95% Limit
112   2094   423.5      360.7             486.3
113   2095   421.3      340.4             502.3
114   2096   420.8      324.8             516.8
115   2097   420.7      311.6             529.8
116   2098   420.7      299.9             541.4

Fitted Model
Xt+1 - Xt = .238 (Xt - Xt-1), or Xt+1 = Xt + .238 (Xt - Xt-1)

One-step-ahead forecast = 432.70 + .238 * (432.7 - 471.2) = 423.5
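
As a check on the arithmetic, iterating this difference equation reproduces the whole forecast column above. Only the estimated coefficient (0.2383635) and the last two prices (471.2 and 432.7) are taken from the slide:

```python
# Iterate X(t+1) = X(t) + 0.2383635 * (X(t) - X(t-1)) five steps ahead.
prev, last = 471.2, 432.7
forecasts = []
for _ in range(5):
    nxt = last + 0.2383635 * (last - prev)
    forecasts.append(round(nxt, 1))
    prev, last = last, nxt
print(forecasts)    # [423.5, 421.3, 420.8, 420.7, 420.7] -- matches the table
```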


Seasonal ARIMA Models


The basic concept is to add extra terms to the model that take into account a persistent seasonal pattern.

For example, an AR model for monthly data may contain information from lag 12, lag 24, etc.
i.e. Yt = A1 Yt-12 + A2 Yt-24 + et
This is referred to as an ARIMA(0,0,0)x(2,0,0)12 model.

The general form is ARIMA(p,d,q)x(ps,ds,qs)s.
This combines both non-seasonal and seasonal terms and provides a broader class of models.

The challenge is to select a model from this larger class.
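
In statsmodels this extra seasonal structure is passed as a seasonal_order. The sketch below (still using the simulated wages stand-in from earlier) fits the ARIMA(0,1,0)x(1,1,0)12 model that appears later in this lecture:

```python
# Seasonal ARIMA: regular order (p,d,q) plus seasonal order (P,D,Q,s).
from statsmodels.tsa.arima.model import ARIMA

seasonal_fit = ARIMA(wages, order=(0, 1, 0),
                     seasonal_order=(1, 1, 0, 12)).fit()
print(seasonal_fit.summary())
```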


Wages Data
[Figures: Autocorrelations of Wages (0,0,12,0,0); Partial Autocorrelations of Wages (0,0,12,0,0); Plot of Wages]
Observe that the data are non-stationary.

Differenced Wages Data


[Figures: Autocorrelations of Wages (1,0,12,0,0); Partial Autocorrelations of Wages (1,0,12,0,0)]

Autocorrelations of Wages (1,0,12,0,0)
Lag   Correlation   Lag   Correlation   Lag   Correlation   Lag   Correlation
 1    -0.055496     11     0.062967     21     0.013874     31     0.200640
 2    -0.004269     12     0.506937     22     0.152615     32     0.016009
 3     0.298826     13    -0.056564     23     0.077908     33     0.008538
 4     0.108858     14     0.041622     24     0.013874     34     0.494130
 5     0.073639     15     0.287086     25     0.118463     35    -0.024546
 6     0.121665     16     0.001067     26     0.328709     36     0.058698
 7     0.048026     17     0.088581     27    -0.086446     37     0.200640
 8     0.069370     18     0.092850     28     0.028815     38     0.003202
 9     0.218783     19     0.010672     29     0.205977     39     0.036286
10     0.044824     20     0.115261     30    -0.066169     40     0.091782
Significant if |Correlation| > 0.237356


Model Fitting ARIMA(0,1,3)x(0,0,1)12


Model Estimation Section
Parameter Name   Parameter Estimate   Standard Error   T-Value   Prob Level
MA(1)             0.1390065           0.1200761         1.1577   0.247006
MA(2)             1.547035E-02        0.1202638         0.1286   0.897645
MA(3)            -0.2083403           0.1170662        -1.7797   0.075128
SMA(1)           -0.5427189           0.1019158        -5.3252   0.000000
[Figures: Forecasts of Wages; Autocorrelations of Residuals]

Ljung-Box statistic: lag 24, DF 20, Q = 34.96, prob level = 0.020343


Inadequate Model

Model Fitting ARIMA(0,1,3)x(0,0,2)12


Model Estimation Section
Parameter Name   Parameter Estimate   Standard Error   T-Value   Prob Level
MA(1)             0.2134133           0.1150536         1.8549   0.06361
MA(2)             7.882232E-02        0.1161695         0.6785   0.49744
MA(3)            -0.3358605           0.1119487        -3.000    0.00269
SMA(1)           -0.4282575           0.1181367        -3.625    0.00028
SMA(2)           -0.8555523           6.011709E-02     -14.23    0.00000
[Figures: Wages Chart; Autocorrelations of Residuals]


Model Fitting ARIMA(0,1,0)x(1,1,0)12


Model Estimation Section
Parameter Name   Parameter Estimate   Standard Error   T-Value   Prob Level
SAR(1)           -0.5495576           8.447082E-02     -6.5059   0.000000

[Figures: Wages Chart; Autocorrelations of Residuals]


Model Comparison
Model               RMSE    Ljung-Box (24)   Residual ACF
(0,1,3)x(0,0,1)12   .0316   34.96            [Autocorrelations of Residuals plot]
(0,1,3)x(0,0,2)12   .0245   11.08            [Autocorrelations of Residuals plot]
(0,1,0)x(1,1,0)12   .0239   15.84            [Autocorrelations of Residuals plot]

But we are concerned about forecasting and should compare models out of sample (usually simpler models are better).
Also, forecasts from the last model look the most reasonable.
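
One way to make that out-of-sample comparison in code is sketched below; the holdout length is arbitrary, the series is the simulated stand-in from earlier, and the candidate orders are the three models above:

```python
# Hold out the last 12 observations, fit each candidate on the rest,
# forecast the holdout period and compare holdout RMSE.
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

train, test = wages[:-12], wages[-12:]
candidates = {
    "(0,1,3)x(0,0,1)12": ((0, 1, 3), (0, 0, 1, 12)),
    "(0,1,3)x(0,0,2)12": ((0, 1, 3), (0, 0, 2, 12)),
    "(0,1,0)x(1,1,0)12": ((0, 1, 0), (1, 1, 0, 12)),
}
for name, (order, sorder) in candidates.items():
    fc = ARIMA(train, order=order, seasonal_order=sorder).fit().forecast(steps=12)
    rmse = np.sqrt(np.mean((np.asarray(test) - np.asarray(fc)) ** 2))
    print(f"{name}: holdout RMSE = {rmse:.4f}")
```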


Interpreting Seasonal Models


What does an ARIMA(1,0,0)x(1,0,0)12 model mean in terms of the data xt?
We use the backshift operator Bxt = xt-1, the identity operator Ixt = xt, and the difference operator Dxt = (I - B)xt = xt - xt-1 to understand this.

An AR(1) model is written as
(I - a1B) xt = et
which becomes xt - a1xt-1 = et
which becomes xt = a1xt-1 + et

Note B^2 xt = B(Bxt) = Bxt-1 = xt-2.

An AR(2) model is written as
(I - a1B - a2B^2) xt = et

An MA(1) model is written as
xt = (I - b1B) et = et - b1et-1


What does an ARIMA(1,0,0)x(1,0,0)12 model mean in terms of the data xt?

It is written as:
(I - a12B^12)(I - a1B) xt = et
Note that the order of the terms on the left doesn't matter. The above can be rewritten as
(I - a1B)(I - a12B^12) xt = et
or
(I - a1B - a12B^12 + a1·a12B^13) xt = et
or
xt - a1xt-1 - a12xt-12 + a1·a12xt-13 = et
and finally
xt = a1xt-1 + a12xt-12 - a1·a12xt-13 + et
This is analogous to regressing xt on xt-1, xt-12 and xt-13, and forecasts will be based on past or predicted values of these quantities.
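
A tiny sketch of that regression-like forecast, just to show the lag structure; the coefficients and data below are made up:

```python
# One-step-ahead forecast from
#   x(t) = a1*x(t-1) + a12*x(t-12) - a1*a12*x(t-13) + e(t)
import numpy as np

rng = np.random.default_rng(1)
history = rng.standard_normal(24)    # hypothetical (mean-zero) series up to time T
a1, a12 = 0.5, 0.4                   # hypothetical coefficients

# history[-1] = x(T), history[-12] = x(T-11), history[-13] = x(T-12)
x_hat = a1 * history[-1] + a12 * history[-12] - a1 * a12 * history[-13]
print(f"one-step-ahead forecast: {x_hat:.3f}")
```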


Concluding Comments

ARIMA models are not designed for series with multiplicative seasonality. In such cases:
Use log transforms (see the sketch at the end of these comments).
De-seasonalize and apply ARIMA to the de-seasonalized data.

Models with persistent trends can be de-trended and ARIMA applied to the de-trended series.

Several automatic fitting programs do a good job fitting ARIMA models (not NCSS).

Parsimony is desirable: use models with as few terms as possible.
The AIC and BIC criteria penalize the number of terms in the model.
Theoretical result: any high-order MA model can be closely approximated by a low-order AR model and vice versa; e.g. an MA(6) can be closely represented by an AR(1) or AR(2) model.

Key point: the above approach to model selection is based on in-sample fitting.
We need to compare all models on the basis of out-of-sample forecasts on holdout data.
Simpler ARIMA models seem to work better out of sample even though they may not give the best fit.
Recall from earlier slides that fitting is different from forecasting.

Forecasts from ARIMA models can be pooled with those from one or more other models.
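
A sketch of the log-transform route mentioned above; the series and model orders are hypothetical, and the key point is simply fitting on the log scale and exponentiating the forecasts:

```python
# Fit an ARIMA model to log(y) and transform the forecasts back.
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(2)
t = np.arange(96)
y = pd.Series(np.exp(0.01 * t + 0.2 * np.sin(2 * np.pi * t / 12)
                     + 0.05 * rng.standard_normal(96)))   # multiplicative seasonality

log_fit = ARIMA(np.log(y), order=(0, 1, 1), seasonal_order=(0, 1, 1, 12)).fit()
forecast = np.exp(log_fit.forecast(steps=12))   # back to the original scale
print(forecast)
```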

