
ARMA models

• Basic elements of AR and MA models can be combined to produce a great variety of models
• ARMA(1,1) or ARIMA(1,0,1) model: Yt = c + φ1Yt−1 + et − θ1et−1

• Using the backshift notation: (1 − φ1B)Yt = c + (1 − θ1B)et

• Yt depends on the previous value Yt−1 and the previous error term et−1
• Series is assumed stationary in the mean and variance
• An ARMA model with higher-order terms: Yt = c + φ1Yt−1 + … + φpYt−p + et − θ1et−1 − … − θqet−q

Examples of ARIMA(1,0,1) models

• φ1 = 0.3, θ1 = −0.7, and c = 7
• φ1 = −0.8, θ1 = 0.8, and c = 18
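The behaviour of these parameter sets can be explored by simulation. A minimal sketch with NumPy, using the slide's convention Yt = c + φ1Yt−1 + et − θ1et−1 (the series length, seed, and N(0,1) errors are arbitrary choices, not from the original):

```python
import numpy as np

def simulate_arma11(phi1, theta1, c, n=5000, seed=0):
    """Simulate Y_t = c + phi1*Y_{t-1} + e_t - theta1*e_{t-1} with N(0,1) errors."""
    rng = np.random.default_rng(seed)
    e = rng.standard_normal(n)
    y = np.empty(n)
    y[0] = c / (1 - phi1)          # start at the stationary process mean
    for t in range(1, n):
        y[t] = c + phi1 * y[t - 1] + e[t] - theta1 * e[t - 1]
    return y

# First parameter set from the slide: phi1 = 0.3, theta1 = -0.7, c = 7
y = simulate_arma11(0.3, -0.7, 7.0)
# The long-run mean of a stationary ARMA(1,1) is c / (1 - phi1) = 7 / 0.7 = 10
print(y.mean())
```

With the second set (φ1 = −0.8, θ1 = 0.8, c = 18) the mean is 18/1.8 = 10 as well, but the series oscillates much more because of the negative AR coefficient.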

ARIMA models

• If non-stationarity is added to a mixed ARMA model, an ARIMA(p, d, q) model is obtained
• ARIMA(1,1,1) is given by (1 − φ1B)(1 − B)Yt = c + (1 − θ1B)et

• ARIMA(p, d, q) model yields a variety of patterns in the ACF and PACF, so that it is
unwise to state rules for identifying general ARIMA models
• Simpler AR(p) and MA(q) models do provide some identifying features that can help
zero in on a particular ARIMA model
• In practice, it is hardly necessary to deal with values of p, d, or q other than 0, 1, or 2
Seasonality and ARIMA models
• Data separated by a whole season may exhibit AR, MA, mixed ARMA, or mixed ARIMA properties, just as consecutive data points may in non-seasonal ARMA models

• The general seasonal model is written ARIMA(p, d, q)(P, D, Q)s, where s = number of periods per season


• ARIMA(1, 1, 1)(1, 1, 1)4 model

Seasonal ARIMA model
• ARIMA(1, 1, 1)(1, 1, 1)4 model: (1 − φ1B)(1 − Φ1B^4)(1 − B)(1 − B^4)Yt = (1 − θ1B)(1 − Θ1B^4)et

• Coefficients φ1, Φ1, θ1, and Θ1 are estimated from the data, and the equation is used for
forecasting
• A seasonal MA model, e.g., ARIMA(0,0,0)(0,0,1)12, will show a spike at lag 12 in the ACF
but no other significant spikes
• PACF will show exponential decay in the seasonal lags; that is, at lags 12, 24, 36, …
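This ACF signature can be checked by simulation. A sketch, using the slides' sign convention Yt = et − Θ1et−12 (the value Θ1 = −0.8, series length, and seed are illustrative choices):

```python
import numpy as np

def sample_acf(y, max_lag):
    """Sample autocorrelations r_1..r_max_lag."""
    y = y - y.mean()
    denom = np.sum(y * y)
    return np.array([np.sum(y[k:] * y[:-k]) / denom for k in range(1, max_lag + 1)])

# Simulate ARIMA(0,0,0)(0,0,1)_12: Y_t = e_t - Theta1 * e_{t-12}
rng = np.random.default_rng(42)
n, Theta1 = 3000, -0.8
e = rng.standard_normal(n + 12)
y = e[12:] - Theta1 * e[:-12]

r = sample_acf(y, 24)
# Only r_12 should stand out; its theoretical value is -Theta1/(1+Theta1^2) ~= 0.49
print(r[11])
```

All other sample autocorrelations should stay inside the white-noise bounds, which is exactly the "spike at lag 12, nothing else" pattern described above.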
An example dataset for ARIMA model identification

• Dataset of monthly French industry sales of printing and writing paper (in thousands
of francs) from 1963 to 1972

An example dataset for ARIMA model
identification
• Plot shows a very clear seasonal pattern and a general increasing trend
• Autocorrelations are almost all positive, and the dominant seasonal pattern shows clearly in the large values of r12, r24, and r36
• So, take the first difference to address the linear trend, and the 12th difference to address the periodicity
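The two differencing operations can be sketched with NumPy. The deterministic trend-plus-seasonal series below is a stand-in for the paper-sales data (it is not the original dataset):

```python
import numpy as np

def difference(y, lag=1):
    """Return the lag-d difference (1 - B^lag) applied to y."""
    return y[lag:] - y[:-lag]

# Stand-in series: linear trend plus a fixed monthly seasonal pattern
t = np.arange(120)
seasonal = np.tile(np.array([5., 3., 8., 1., 0., 2., 7., 4., 6., 9., 2., 5.]), 10)
y = 100.0 + 1.5 * t + seasonal

# First difference removes the linear trend; 12th difference removes the seasonality
w = difference(difference(y, lag=1), lag=12)
print(np.abs(w).max())   # a purely deterministic trend+seasonal series differences to 0
```

On real data the result is not exactly zero, of course; the point is that (1 − B)(1 − B^12) eliminates both the trend and the period-12 pattern, leaving a series that can be stationary.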

Identification of ARIMA model

• The model is identified to be an ARIMA(p, 1, q)(P, 1, Q)12 where values for p, q, P, and Q are yet to be determined
• From the PACF, the exponential decay of the first few lags suggests a non-seasonal MA(1) model; this suggests setting q = 1 and p = 0
• In the ACF, the value r1 is significant (reinforcing the non-seasonal MA(1) model) and r12 is significant (suggesting a seasonal MA(1) model)
• The tentative identification: ARIMA(0, 1, 1)(0, 1, 1)12

• Called the “airline model” because it was applied to international airline data by Box and Jenkins (1970)
• One of the most commonly used seasonal ARIMA models
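Written out, the airline model ARIMA(0,1,1)(0,1,1)12 says that the doubly differenced series (1 − B)(1 − B^12)Yt equals et − θ1et−1 − Θ1et−12 + θ1Θ1et−13. A small sketch verifying this identity on a simulated path (the coefficient values and seed are arbitrary, not estimates from the airline data):

```python
import numpy as np

rng = np.random.default_rng(1)
n, theta1, Theta1 = 200, 0.4, 0.6
e = rng.standard_normal(n)

# Build Y from the airline-model recursion:
# Y_t = Y_{t-1} + Y_{t-12} - Y_{t-13}
#       + e_t - theta1*e_{t-1} - Theta1*e_{t-12} + theta1*Theta1*e_{t-13}
y = np.zeros(n)
for t in range(13, n):
    y[t] = (y[t - 1] + y[t - 12] - y[t - 13]
            + e[t] - theta1 * e[t - 1] - Theta1 * e[t - 12]
            + theta1 * Theta1 * e[t - 13])

# Apply (1 - B)(1 - B^12) and compare with the moving-average part
d1 = y[1:] - y[:-1]
w = d1[12:] - d1[:-12]    # doubly differenced series, starting at t = 13
ma = (e[13:] - theta1 * e[12:-1] - Theta1 * e[1:-12]
      + theta1 * Theta1 * e[:-13])
print(np.allclose(w, ma))
```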
An example dataset that needs transformation before modeling
• Monthly shipments of pollution equipment from
January 1986 through October 1996

• Fluctuations increase as one moves from left to right on the graph – non-
stationarity in the variance
• For achieving stationarity in variance, a logarithmic or power transformation of
the data may be done
• After logarithmic transformation, the series has achieved stationarity in variance
• For achieving stationarity in mean, differencing may be done
• As there is no strong seasonality, we take a first difference rather than a seasonal difference
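A sketch of the log-then-difference step on a stand-in series whose fluctuations grow with its level (a geometric random walk, a rough analogue of the shipments data; the drift, volatility, and seed are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 400
# Geometric random walk: multiplicative shocks, so swings grow with the level
y = 100.0 * np.exp(np.cumsum(0.01 + 0.02 * rng.standard_normal(n)))

def half_std_ratio(x):
    """Std of the second half over std of the first half of a series."""
    h = len(x) // 2
    return x[h:].std() / x[:h].std()

raw_diff = np.diff(y)          # differencing alone: variance still grows
log_diff = np.diff(np.log(y))  # log first, then difference: variance stabilized

print(half_std_ratio(raw_diff), half_std_ratio(log_diff))
```

The first ratio is well above 1 (the raw differences fan out as the level rises), while the second stays close to 1, which is the sense in which the log transformation achieves stationarity in variance.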

• Series has achieved stationarity in both variance and mean

Tentative ARIMA model

• Significant spikes at lags 1 and 2 in the PACF indicate that an AR(2) might be a feasible non-seasonal component
• Single spike at lag 12 in the PACF indicates a seasonal AR(1) component
• A tentative model: ARIMA(2, 1, 0)(1, 0, 0)12 fitted to the logarithms of the data

Steps to identify a Box-Jenkins ARIMA model
• Make the series stationary
• Differencing (non-seasonal and/or seasonal) usually takes care of non-stationarity
in the mean
• Logarithmic or power transformations often take care of non-stationarity in the
variance
• Consider non-seasonal aspects
• An examination of the ACF and PACF of the stationary series obtained in the
previous step can reveal whether a MA or AR model is feasible
• Consider seasonal aspects
• Examination of the ACF and PACF at the seasonal lags can help identify AR and
MA models for the seasonal aspects of the data
• Not as easy to identify as in the case of the non-seasonal aspects
Estimating the ARIMA model parameters

• Suppose the class of model identified is ARIMA(0, 1, 1). This is a family of models
depending on one MA coefficient θ1

• Method of least squares
• Sum of squared errors is minimized
• However, for models involving an MA component (i.e., where q > 0), there is no
simple formula that can be applied to obtain the estimates as there is in regression,
so an iterative method must be used.
• A preliminary estimate is chosen and it is refined through iteration via a computer
program
• Method of maximum likelihood
• Determines the values of the parameters which maximize the likelihood L (probability) of obtaining the sample data
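The iterative idea behind least-squares estimation of an MA coefficient can be illustrated with a crude grid search for θ1 in an MA(1) model (a sketch only; real programs refine a preliminary estimate with smarter numerical optimizers). The true θ1 = 0.6, series length, and grid are illustrative choices:

```python
import numpy as np

def ma1_sse(y, theta1):
    """Conditional sum of squared errors for Y_t = e_t - theta1*e_{t-1}, with e_0 = 0."""
    e_prev, sse = 0.0, 0.0
    for yt in y:
        e_t = yt + theta1 * e_prev   # recover e_t from the model equation
        sse += e_t * e_t
        e_prev = e_t
    return sse

# Simulate an MA(1) series with true theta1 = 0.6
rng = np.random.default_rng(3)
e = rng.standard_normal(2001)
y = e[1:] - 0.6 * e[:-1]

# Crude iteration: evaluate the SSE over a grid and keep the minimizer
grid = np.linspace(-0.95, 0.95, 381)
best = grid[np.argmin([ma1_sse(y, th) for th in grid])]
print(best)
```

There is no closed-form formula for θ1, as the text notes: each candidate value changes every recovered error et, so the SSE must be re-evaluated numerically at each step.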
Is the selected ARIMA model the best?

• If some of the estimated parameters are insignificant (p-values larger than 0.05),
then a revised model with the insignificant terms omitted may be considered

• Mixed models are generally harder to identify than pure AR or MA models. So, we
may begin with either a pure AR or a pure MA model and extend the selected model
to a mixed ARMA/ARIMA model

• From the identified/shortlisted models, determine which one is preferred

What are the criteria?

Criteria for selecting the preferred ARIMA model
• Can we choose the model which gives the smallest sum of squared errors or the largest
value for the likelihood?
• Does not always work—often the MSE can be made smaller and the likelihood made
larger simply by increasing the number of terms in the model
• This is analogous to the problem of selecting a regression model by maximizing the R2
value – R2 value can be increased by adding another explanatory variable
• For ARIMA models, the likelihood may be penalized for each additional term in the model.
If the extra term does not improve the likelihood more than the penalty amount, it is not
worth adding.
• Akaike’s Information Criterion or AIC (Akaike, 1974): AIC = −2 log L + 2m
• where m = p + q + P + Q and L = likelihood


Criteria for selecting the preferred ARIMA
model
• An approximation to the AIC: AIC ≈ n log σ̂² + 2m
• where n is the number of observations and σ̂² is the variance of the residuals


• AIC is only useful in comparison to the AIC value for another model fitted to the
same data set
• A difference in AIC values of 2 or less is generally not regarded as substantial and
in such cases, the simpler model is preferred
• Apart from the AIC, there are also other similar criteria such as the Schwarz BIC (Bayesian information criterion) and the FPE (final prediction error)
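A sketch of the penalty at work, using the n log σ̂² + 2m approximation to compare a mean-only fit (m = 0) against an AR(1) fit (m = 1) on strongly autocorrelated data. The simulated series, seed, and the simple least-squares fit of φ1 are illustrative assumptions:

```python
import numpy as np

def approx_aic(residuals, m):
    """AIC approximation: n * log(residual variance) + 2m, m = number of ARMA terms."""
    n = len(residuals)
    return n * np.log(np.mean(residuals ** 2)) + 2 * m

# Simulate an AR(1) series with phi1 = 0.8
rng = np.random.default_rng(11)
n = 500
y = np.zeros(n)
eps = rng.standard_normal(n)
for t in range(1, n):
    y[t] = 0.8 * y[t - 1] + eps[t]

# Model A: mean only (m = 0).  Model B: AR(1) fitted by least squares (m = 1).
res_a = y - y.mean()
phi_hat = np.sum(y[1:] * y[:-1]) / np.sum(y[:-1] ** 2)
res_b = y[1:] - phi_hat * y[:-1]

aic_a = approx_aic(res_a, m=0)
aic_b = approx_aic(res_b, m=1)
print(aic_a, aic_b)   # the AR(1) term cuts the residual variance by far more than its 2-point penalty
```

The reverse situation is the interesting one: a term that shrinks σ̂² only slightly cannot repay its +2 penalty, which is how the AIC guards against overfitting.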
Sampling distribution of autocorrelations
and critical values
• Theoretically, all autocorrelation coefficients for a
series of random numbers must be zero.
• But because we have finite samples, each of the
sample autocorrelations will not be exactly zero
• It has been shown by Anderson (1942), Bartlett (1946), Quenouille (1949), and others, that the autocorrelation coefficients of white noise data have a sampling distribution that can be approximated by a normal curve with mean zero and standard error 1/√n, where n is the number of observations in the series
• Hence, 95% of all sample autocorrelation coefficients are expected to be within ±1.96/√n (for white noise)
Sampling distribution of autocorrelations
• n = 36
• Limits for the autocorrelations are at ±1.96/√36 = ±0.327
• So, r7 = 0.275 is also not significant

What if the number of observations was 360?
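The bounds can be computed directly, which also answers the question for n = 360 (the function name is just an illustrative label):

```python
import math

def acf_critical_bound(n, z=1.96):
    """95% critical bound +/- z / sqrt(n) for sample autocorrelations of white noise."""
    return z / math.sqrt(n)

print(round(acf_critical_bound(36), 3))    # 0.327 -> r7 = 0.275 is not significant
print(round(acf_critical_bound(360), 3))   # 0.103 -> the same r7 = 0.275 would now be significant
```

With ten times as many observations the bound shrinks by a factor of √10, so an autocorrelation of 0.275 that was inside the limits at n = 36 lies well outside them at n = 360.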

Portmanteau tests
• Rather than study the rk values one at a time, an alternative approach is to consider a whole set of rk values (say, k = 1 to 15 or 20) all at one time, and see whether the set is significantly different from a zero set
• Box-Pierce test: Q = n Σ rk² (summing over k = 1 to h)

• where h is the maximum lag being considered and n is the number of observations in the series. Usually h ≈ 20 is selected
• If each rk is close to zero, Q will be relatively small whereas if some rk values
are large (either positive or negative), the Q statistic will be relatively large
Box-Pierce test
• n = 36, h = 10; the computed statistic is Q = 5.62

• In the row corresponding to 10 df, we see that the probability of obtaining a chi-square value as large as or larger than 5.62 is more than 0.1
• So the set of rk values is not significantly different from a null set
Alternative portmanteau test
• Ljung-Box test: Q* = n(n + 2) Σ rk²/(n − k) (summing over k = 1 to h); its distribution is closer to chi-square in small samples than that of the Box-Pierce Q

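The Ljung-Box statistic reweights each rk² by (n + 2)/(n − k), a factor that is always greater than 1, so Q* is always somewhat larger than the Box-Pierce Q on the same data. A sketch (series length, h, and seed are illustrative):

```python
import numpy as np

def sample_acf(y, h):
    """Sample autocorrelations r_1..r_h."""
    y = y - y.mean()
    denom = np.sum(y * y)
    return np.array([np.sum(y[k:] * y[:-k]) / denom for k in range(1, h + 1)])

def box_pierce_q(y, h):
    """Box-Pierce statistic Q = n * sum_{k=1}^{h} r_k^2."""
    return len(y) * np.sum(sample_acf(y, h) ** 2)

def ljung_box_q(y, h):
    """Ljung-Box statistic Q* = n(n+2) * sum_{k=1}^{h} r_k^2 / (n - k)."""
    n = len(y)
    r = sample_acf(y, h)
    return n * (n + 2) * np.sum(r ** 2 / (n - np.arange(1, h + 1)))

rng = np.random.default_rng(9)
y = rng.standard_normal(200)
q, q_star = box_pierce_q(y, 10), ljung_box_q(y, 10)
print(q, q_star)   # q_star slightly exceeds q; both are referred to chi-square tables
```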
