The Autocorrelation Function and AR(1), AR(2) Models

Al Nosedal
University of Toronto

January 29, 2019

Motivation

Autocorrelation, or serial correlation, occurs in data when the error
terms of a regression forecasting model are correlated. When
autocorrelation occurs in a regression analysis, several possible problems
might arise. First, the estimates of the regression coefficients no longer
have the minimum variance property and may be inefficient. Second, the
variance of the error terms may be greatly underestimated by the mean
square error value. Third, the true standard deviation of the estimated
regression coefficient may be seriously underestimated. Fourth, the
confidence intervals and tests using t and F distributions are no longer
strictly applicable.

Motivation (cont.)

First-order autocorrelation results from correlation between the error terms
of adjacent time periods (as opposed to two or more previous periods). If
first-order autocorrelation is present, the error for one time period, $e_t$, is a
function of the error of the previous time period, $e_{t-1}$, as follows:

$$e_t = \rho e_{t-1} + w_t$$

The first-order autocorrelation coefficient, $\rho$, measures the correlation
between the error terms; $w_t$ is a Normally distributed independent error
term. If the value of $\rho$ is 0, $e_t = w_t$, which means there is no
autocorrelation and $e_t$ is just a random, independent error term.

Durbin-Watson Test

One way to test whether autocorrelation is present in a
time-series regression analysis is by using the Durbin-Watson test for
autocorrelation. The Durbin-Watson statistic is

$$D = \frac{\sum_{t=2}^{n} (e_t - e_{t-1})^2}{\sum_{t=1}^{n} e_t^2}$$

where n = the number of observations.
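
As a minimal sketch of the formula above, the statistic can be computed directly from a vector of residuals. The function name `dw_stat` is hypothetical and does not come from the slides.

```r
# Durbin-Watson statistic from a numeric vector of residuals e.
# dw_stat is an illustrative name, not a function from the slides.
dw_stat <- function(e) {
  num <- sum(diff(e)^2)  # sum over t = 2..n of (e_t - e_{t-1})^2
  den <- sum(e^2)        # sum over t = 1..n of e_t^2
  num / den
}

# Residuals that alternate in sign give a large D;
# residuals that change slowly give a D close to 0.
dw_stat(c(1, -1, 1, -1))
dw_stat(c(1, 1.1, 1.2, 1.3))
```

Applied to the residuals of a fitted regression, this should agree with the DW value reported by a packaged test.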

Durbin-Watson Test (cont.)

The range of values of D is 0 ≤ D ≤ 4 where small values of D (D < 2)
indicate a positive first-order autocorrelation and large values of D
(D > 2) imply a negative first-order autocorrelation. Positive first-order
autocorrelation is a common occurrence in business and economic time
series. The null hypothesis for this test is that there is no autocorrelation.
A one-tailed test is used:

H0 : ρ = 0 vs Ha : ρ > 0
In the Durbin-Watson test, D is the observed value of the Durbin-Watson
statistic using the residuals from the regression analysis. Our Tables are
designed to test for positive first-order autocorrelation by providing values
of dL and dU for a variety of values of n and k and for α = 0.01 and
α = 0.05.

Durbin-Watson Test (cont.)

The decision is made in the following way. If D < dL , we conclude that
there is enough evidence to show that positive first-order autocorrelation
exists. If D > dU , we conclude that there is not enough evidence to show
that positive first-order autocorrelation exists. And if dL ≤ D ≤ dU , the
test is inconclusive. The recommended course of action when the test is
inconclusive is to continue testing with more data until a conclusive
decision can be made.
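
The decision rule can be written as a small helper. This is only a sketch: `dw_decision` is a hypothetical name, and dL and dU still have to be read from a Durbin-Watson table for the given n, k, and α.

```r
# Decision rule for the one-tailed test of positive first-order
# autocorrelation. dw_decision is an illustrative name; dL and dU
# must come from a Durbin-Watson table.
dw_decision <- function(D, dL, dU) {
  if (D < dL) {
    "reject H0: evidence of positive autocorrelation"
  } else if (D > dU) {
    "do not reject H0"
  } else {
    "inconclusive"
  }
}

# Illustrative call with n = 25, k = 1, alpha = 0.05 table values
dw_decision(0.6873, dL = 1.288, dU = 1.454)
```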

Durbin-Watson Test (cont.)

To test for negative first-order autocorrelation, we change the critical
values. If D > 4 − dL , we conclude that negative first-order
autocorrelation exists. If D < 4 − dU , we conclude that there is not
enough evidence to show that negative first-order autocorrelation exists. If
4 − dU ≤ D ≤ 4 − dL , the test is inconclusive.

Consider Table 1, which contains crude oil production and natural gas
withdrawal data for the U.S. over a 25-year time period published by the
U.S. Energy Information Administration in an Annual Energy Review. A
regression line can be fit through these data to determine whether the
amount of natural gas withdrawals can be predicted by the amount of
crude oil production. The resulting errors of prediction can be tested by
the Durbin-Watson statistic for the presence of significant positive
autocorrelation by using α = 0.05.

Table 1

Year   Crude Oil            Natural Gas
       Production (1000s)   Withdrawals (1000s)
 1     8.597                17.573
 2     8.572                17.337
 3     8.649                15.809
 4     8.688                14.153
 5     8.879                15.513
 6     8.971                14.535
 7     8.680                14.154
 8     8.349                14.807
 9     8.140                15.467
10     7.613                15.709
11     7.355                16.054
12     7.417                16.018
13     7.171                16.165
14     6.847                16.691
15     6.662                17.351
16     6.560                17.282
17     6.465                17.737
18     6.452                17.844
19     6.252                17.729
20     5.881                17.590
21     5.822                17.726
22     5.801                18.129
23     5.746                17.795
24     5.681                17.819
25     5.430                17.739

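R Code (entering data)

# explanatory variable;
# oil= crude oil production;

oil=c( 8.597,8.572,8.649,8.688,8.879,8.971,8.680,
8.349,8.140,7.613,7.355,7.417,7.171,6.847,
6.662,6.560,6.465,6.452,6.252,5.881,5.822,
5.801,5.746,5.681,5.430);

# response variable;
# gas= natural gas withdrawals;

gas=c(17.573,17.337,15.809,14.153,15.513,14.535,14.154,
14.807,15.467,15.709,16.054,16.018,16.165,16.691,
17.351,17.282,17.737,17.844,17.729, 17.590,17.726,
18.129, 17.795, 17.819,17.739);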

R Code (fitting linear model)

# regress natural gas withdrawals on crude oil production;
lin.mod=lm(gas~oil);
names(lin.mod);

## [1] "coefficients" "residuals" "effects" "rank"
## [5] "fitted.values" "assign" "qr" "df.residual"
## [9] "xlevels" "call" "terms" "model"

R Code (finding residuals)

# round will round residual using 4 decimal places;
round(lin.mod$residuals,4);

## 1 2 3 4 5 6 7 ...
## 2.1492 1.8920 0.4295 -1.1933 0.3291 -0.5706 -1.1991 ...
## 10 11 12 13 14 15 16 ...
## -0.5518 -0.4263 -0.4096 -0.4718 -0.2215 0.2811 0.1254 ...
## 19 20 21 22 23 24 25
## 0.3104 -0.1443 -0.0584 0.3267 -0.0541 -0.0854 -0.3789

R Code (time series plot of residuals)

[Figure: time series plot of the residuals; horizontal axis from 1 to 25.]


R Code (installing library)

# run once; dwtest() is provided by the lmtest package;
install.packages("lmtest");

R Code (Durbin-Watson test)

library(lmtest);
dwtest(lin.mod);

## Durbin-Watson test
## data: lin.mod
## DW = 0.68731, p-value = 2.256e-05
## alternative hypothesis: true autocorrelation is greater than 0

Using table

Because we used a simple linear regression, the value of k = 1. The sample
size, n, is 25, and α = 0.05. The critical values in our Table A-2 are:

dL = 1.288 and dU = 1.454

Because the computed D statistic, 0.6873, is less than the value of
dL = 1.288, the null hypothesis is rejected. We have evidence that a
positive autocorrelation is present in this example.

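The sample autocovariance function is defined as

$$\hat{\gamma}(h) = \frac{1}{n} \sum_{t=1}^{n-h} (x_{t+h} - \bar{x})(x_t - \bar{x}),$$

with $\hat{\gamma}(-h) = \hat{\gamma}(h)$ for $h = 0, 1, \ldots, n - 1$.

The sample autocorrelation function is defined as

$$\hat{\rho}(h) = \frac{\hat{\gamma}(h)}{\hat{\gamma}(0)}.$$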
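
A minimal sketch of these two definitions in R, assuming the divisor n in the autocovariance (the same convention as `acf()` in base R). The names `sample_acov` and `sample_acf` are hypothetical.

```r
# Sample autocovariance at lag h, using divisor n.
# sample_acov and sample_acf are illustrative names.
sample_acov <- function(x, h) {
  n <- length(x)
  h <- abs(h)                 # gamma-hat(-h) = gamma-hat(h)
  d <- x - mean(x)            # deviations from the sample mean
  sum(d[(1 + h):n] * d[1:(n - h)]) / n
}

# Sample autocorrelation at lag h.
sample_acf <- function(x, h) sample_acov(x, h) / sample_acov(x, 0)

x <- c(3, 4, 5, 6, 7, 8, 9, 10, 11, 12)
sample_acf(x, 1)
sample_acf(x, 2)
```

Because the factor 1/n appears in both numerator and denominator, it cancels in the autocorrelation.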

Toy Example

To understand autocorrelation it is first necessary to understand what it
means to lag a time series.
Y (t) = Yt =(3,4,5,6,7,8,9,10,11,12)

Y lagged 1 = Y (t − 1) = Yt−1 =(*,3,4,5,6,7,8,9,10,11)
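
Lagging can be sketched in base R by shifting the vector one position and padding the front with `NA`, which plays the role of the `*` above. The variable name `y_lag1` is an assumption made for illustration.

```r
# Lag a series by one period: the first entry has no predecessor,
# so it is set to NA.
y <- c(3, 4, 5, 6, 7, 8, 9, 10, 11, 12)
y_lag1 <- c(NA, y[-length(y)])

# Side-by-side view of Y(t) and Y(t - 1)
cbind(y, y_lag1)
```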

Toy Example (cont.)

Find the sample autocorrelation at lag 1 for the following time series:
Y (t) = (3, 4, 5, 6, 7, 8, 9, 10, 11, 12).
ρ̂(1) = 0.7
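
The value can be checked by hand. With $\bar{y} = 7.5$ the deviations are $-4.5, -3.5, \ldots, 3.5, 4.5$, and the factor $1/n$ cancels between numerator and denominator:

```latex
\hat{\rho}(1)
  = \frac{\sum_{t=1}^{9} (y_{t+1} - \bar{y})(y_t - \bar{y})}
         {\sum_{t=1}^{10} (y_t - \bar{y})^2}
  = \frac{15.75 + 8.75 + 3.75 + 0.75 - 0.25 + 0.75 + 3.75 + 8.75 + 15.75}{82.5}
  = \frac{57.75}{82.5}
  = 0.7
```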

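R Code

### function to find autocorrelation when lag=1;
### (name acf.lag1 and body are a reconstruction; the source is cut off);
acf.lag1<-function(x){
d<-x-mean(x);
(t(d[-1])%*%d[-length(x)])/(t(d)%*%d);
}
acf.lag1(c(3,4,5,6,7,8,9,10,11,12));

## [,1]
## [1,] 0.7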

An easier way of finding autocorrelations

Using R, we can easily find the autocorrelation at any lag k.

[Figure: correlogram titled "Series y"; vertical axis from 0.0 to 1.0.]

## [1] 0.7
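
The call that produced this output is not shown. One way to obtain the same number, offered here as an assumption rather than the original code, is to extract the lag-1 element from the object returned by `acf()`.

```r
y <- c(3, 4, 5, 6, 7, 8, 9, 10, 11, 12)

# acf() returns autocorrelations for lags 0, 1, 2, ...;
# element k + 1 of $acf holds the autocorrelation at lag k.
r <- acf(y, lag.max = 3, plot = FALSE)$acf
r[2]   # autocorrelation at lag 1
```

Setting `plot = FALSE` suppresses the correlogram and only returns the numbers.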

It is natural to plot the autocovariances and/or the autocorrelations versus
lag. Further, it will be important to develop some notion of what would be
expected from such a plot if the errors are white noise (meaning no special
time series techniques are required), in contrast to the situation in which
strong serial correlation is present.

Graph of autocorrelations

A graph of the lags against the corresponding autocorrelations is called a
correlogram. The following lines of code can be used to make a
correlogram in R.

### autocorrelation function for
### a random series
### (the line generating y is missing in the source; n = 100 is an assumption)
y<-rnorm(100);
acf(y,lag.max=8,main="Random Series N(0,1)");

[Figure: correlogram titled "Random Series N(0,1)", lags 0 to 8.]


The AR(1) Model Autocorrelation and Autocovariance

For the AR(1) model, $x_j = a_1 x_{j-1} + w_j$. Substituting once gives
$x_j = a_1^2 x_{j-2} + a_1 w_{j-1} + w_j$, and in general

$$x_j = a_1^k x_{j-k} + \sum_{t=1}^{k} a_1^{t-1} w_{j-t+1}.$$

Furthermore,

$$\gamma(1) = E(x_j x_{j-1}) = E([a_1 x_{j-1} + w_j] x_{j-1}) = a_1 \sigma_{AR}^2 = a_1 \gamma(0).$$

The AR(1) Model Autocorrelation and Autocovariance

More generally,

$$\gamma(k) = E(x_j x_{j-k}) = E\left(\left[a_1^k x_{j-k} + \sum_{t=1}^{k} a_1^{t-1} w_{j-t+1}\right] x_{j-k}\right) = a_1^k \gamma(0).$$

So, in general, $\rho(k) = a_1^k$.
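
The geometric decay of the AR(1) autocorrelation can be checked numerically. The sketch below compares powers of the coefficient with the theoretical ACF returned by `ARMAacf()` from base R; the value 0.7 is just an illustrative choice.

```r
a1 <- 0.7
k <- 0:4

theory  <- a1^k                            # rho(k) as powers of a1
builtin <- ARMAacf(ar = a1, lag.max = 4)   # theoretical ACF, lags 0..4

cbind(lag = k, theory, builtin)
```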

The AR(1) Model Autocorrelation and Autocovariance

For AR(1), the relationship of variance in the series to variance in white
noise is

$$\sigma_{AR}^2 = E(x_j x_j) = E([a_1 x_{j-1} + w_j][a_1 x_{j-1} + w_j]) = a_1^2 \sigma_{AR}^2 + \sigma_w^2,$$

so

$$\sigma_{AR}^2 = \frac{\sigma_w^2}{1 - a_1^2}.$$

The Stationary AR(1) is an MA model of infinite order

Here we introduce the fundamental duality between AR and MA models.
We can keep going backward in time using the first-order autoregressive
model:

$$\begin{aligned}
x_t &= a_1 x_{t-1} + w_t \\
    &= a_1 (a_1 x_{t-2} + w_{t-1}) + w_t \\
    &= a_1^2 x_{t-2} + (a_1 w_{t-1} + w_t) \\
    &= a_1^2 (a_1 x_{t-3} + w_{t-2}) + (a_1 w_{t-1} + w_t) \\
    &= a_1^3 x_{t-3} + a_1^2 w_{t-2} + a_1 w_{t-1} + w_t
\end{aligned}$$

The Stationary AR(1) is an MA model of infinite order

Continuing back to minus infinity we would get

$$x_t = w_t + a_1 w_{t-1} + a_1^2 w_{t-2} + a_1^3 w_{t-3} + \ldots$$

which makes sense if $a_1^k \to 0$ as $k \to \infty$ rapidly enough
for the series to converge to a finite limit. This is our stationarity condition.

The Stationary AR(1) is an MA model of infinite order

The expression for $x_t$ is then

$$x_t = \sum_{i=0}^{\infty} a_1^i w_{t-i}.$$

The AR(1) model can thus be written as an MA(∞) model.
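
As a quick numerical illustration, base R's `ARMAtoMA()` returns the MA(∞) weights of an ARMA model. For an AR(1) with an illustrative coefficient of 0.7, the weights should be the successive powers of that coefficient.

```r
a1 <- 0.7

# MA(infinity) weights psi_1, ..., psi_5 (psi_0 = 1 is not returned)
psi <- ARMAtoMA(ar = a1, lag.max = 5)

psi
a1^(1:5)   # powers of a1, for comparison
```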

The AR(1) process is not always stationary

The variance of the AR(1) process is given by

$$\sigma_{AR}^2 = \mathrm{Var}(x_t) = \mathrm{Var}\left(\sum_{i=0}^{\infty} a_1^i w_{t-i}\right)$$

$$\sigma_{AR}^2 = \mathrm{Var}(x_t) = \sigma_w^2 (1 + a_1^2 + (a_1^2)^2 + (a_1^2)^3 + (a_1^2)^4 + \ldots)$$

If $|a_1| \geq 1$, this variance will increase without bound.
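
The behaviour of this series can be seen from its partial sums. The sketch below uses a hypothetical helper, `ar1_var_partial`, that adds the first K + 1 terms, with the white-noise variance set to 1 by default.

```r
# Partial sum of sigma_w^2 * (1 + a1^2 + a1^4 + ... + a1^(2K)).
# ar1_var_partial is an illustrative name.
ar1_var_partial <- function(a1, K, sigma2_w = 1) {
  sigma2_w * sum(a1^(2 * (0:K)))
}

ar1_var_partial(0.7, 200)   # settles near the closed form below
1 / (1 - 0.7^2)             # sigma_w^2 / (1 - a1^2) with sigma_w^2 = 1

ar1_var_partial(1.0, 99)    # equals K + 1, so it grows linearly in K
ar1_var_partial(1.1, 25)    # grows geometrically in K
ar1_var_partial(1.1, 50)
```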

The Yule-Walker equations

The development of these useful equations is not difficult. Write the
general AR(p) model as

$$x_t = a_1 x_{t-1} + a_2 x_{t-2} + \ldots + a_p x_{t-p} + w_t$$

where we assume that $x_t$ is a zero-mean process (or that the mean has
been subtracted), that $w_t$ is a white-noise process, and that
$E(w_t x_{t-k}) = 0$ for $k > 0$. Once again, compute $\gamma(k)$:

$$\gamma(k) = E(x_t x_{t-k})$$

The Yule-Walker equations

$$\gamma(k) = E(x_t x_{t-k}) = E[(a_1 x_{t-1} + a_2 x_{t-2} + \ldots + a_p x_{t-p} + w_t) x_{t-k}]$$

$$\gamma(k) = a_1 E[x_{t-1} x_{t-k}] + a_2 E[x_{t-2} x_{t-k}] + \ldots + a_p E[x_{t-p} x_{t-k}] + E[w_t x_{t-k}]$$

The Yule-Walker equations

From the definition of the autocovariance $\gamma(k)$ of a stationary process, it
is a function only of the lag between observations. Thus,
$\gamma(k) = E(x_t x_{t-k}) = E(x_{t+s} x_{t+s-k})$, so that, for example,
$E[x_{t-1} x_{t-k}] = \gamma(k-1)$. We can use this fact to simplify

$$\gamma(k) = a_1 E[x_{t-1} x_{t-k}] + a_2 E[x_{t-2} x_{t-k}] + \ldots + a_p E[x_{t-p} x_{t-k}] + E[w_t x_{t-k}]$$

to obtain

$$\gamma(k) = a_1 \gamma(k-1) + a_2 \gamma(k-2) + \ldots + a_p \gamma(k-p) + E(w_t x_{t-k})$$

The Yule-Walker equations

For $k > 0$ we know that the last term is zero, so

$$\gamma(k) = a_1 \gamma(k-1) + a_2 \gamma(k-2) + \ldots + a_p \gamma(k-p).$$

If we divide by the variance of the series, $\gamma(0) = \sigma_{AR}^2$, and recall the
definition of the autocorrelation, $\rho(k) = \gamma(k)/\gamma(0)$, we obtain the Yule-Walker
equations:

$$\rho(k) = a_1 \rho(k-1) + a_2 \rho(k-2) + \ldots + a_p \rho(k-p), \quad k > 0.$$

The Yule-Walker equations

These are extremely important equations.

For $k = 0$, we obtain

$$\gamma(0) = \sigma_{AR}^2 = a_1 \gamma(-1) + a_2 \gamma(-2) + \ldots + a_p \gamma(-p) + E(w_t x_t).$$

We can find $E(w_t x_t)$ as follows:

$$E(x_t w_t) = E[a_1 x_{t-1} w_t + a_2 x_{t-2} w_t + \ldots + a_p x_{t-p} w_t + w_t^2] = E(w_t^2) = \sigma_w^2$$

(because $E(x_{t-k} w_t) = 0$ for $k > 0$).

The Yule-Walker equations

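Recalling that $\gamma(-s) = \gamma(s)$, we have that

$$\gamma(0) = \sigma_{AR}^2 = a_1 \gamma(1) + a_2 \gamma(2) + \ldots + a_p \gamma(p) + \sigma_w^2,$$

and dividing by $\gamma(0)$ on both sides, we obtain

$$1 = a_1 \rho(1) + a_2 \rho(2) + \ldots + a_p \rho(p) + \frac{\sigma_w^2}{\sigma_{AR}^2},$$

or

$$\sigma_w^2 = \sigma_{AR}^2 (1 - a_1 \rho(1) - a_2 \rho(2) - \ldots - a_p \rho(p)).$$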

Why are Yule-Walker equations so important?

We can use the equations

$$\rho(k) = a_1 \rho(k-1) + a_2 \rho(k-2) + \ldots + a_p \rho(k-p), \quad k > 0,$$

and

$$\sigma_w^2 = \sigma_{AR}^2 (1 - a_1 \rho(1) - a_2 \rho(2) - \ldots - a_p \rho(p))$$

to estimate, say, $a_1$, $a_2$, and $\sigma_w^2$ from the autocorrelations if we knew that the
order of the model is $p = 2$.
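
A minimal sketch of this idea: for k = 1, ..., p the Yule-Walker equations form a linear system in the coefficients, which can be solved directly. The function name `yw_solve` and its arguments are hypothetical; in practice the sample autocorrelations would be plugged in for the theoretical ones.

```r
# Solve the Yule-Walker equations for an AR(p) model.
# rho:    autocorrelations rho(1), ..., rho(p)
# gamma0: variance of the series, sigma_AR^2
# yw_solve is an illustrative name, not a function from the slides.
yw_solve <- function(rho, gamma0 = 1) {
  p <- length(rho)
  R <- toeplitz(c(1, rho)[1:p])           # entries rho(|i - j|)
  a <- solve(R, rho)                      # coefficients a_1, ..., a_p
  sigma2_w <- gamma0 * (1 - sum(a * rho)) # white-noise variance
  list(a = a, sigma2_w = sigma2_w)
}

# Autocorrelations implied by a1 = 0.8 and a2 = -0.7
rho1 <- 0.8 / 1.7
rho2 <- 0.8 * rho1 - 0.7
yw_solve(c(rho1, rho2))
```

Base R also offers `ar()` with `method = "yule-walker"`, which estimates the coefficients from data in the same spirit.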

AR(2) Process

This process is defined by

$$x_t = a_1 x_{t-1} + a_2 x_{t-2} + w_t.$$

It can be shown that

$$a_1 = \frac{\rho(1)[1 - \rho(2)]}{1 - \rho(1)^2}$$

$$a_2 = \frac{\rho(2) - \rho(1)^2}{1 - \rho(1)^2}$$
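
A short derivation of these two expressions, using the Yule-Walker equations for $p = 2$ together with $\rho(0) = 1$ and $\rho(-1) = \rho(1)$:

```latex
\begin{aligned}
k = 1:\quad & \rho(1) = a_1 + a_2\,\rho(1)
  \;\Rightarrow\; a_1 = \rho(1)\,(1 - a_2), \\
k = 2:\quad & \rho(2) = a_1\,\rho(1) + a_2
  = \rho(1)^2 (1 - a_2) + a_2
  \;\Rightarrow\; a_2 = \frac{\rho(2) - \rho(1)^2}{1 - \rho(1)^2}, \\
& a_1 = \rho(1)\left(1 - \frac{\rho(2) - \rho(1)^2}{1 - \rho(1)^2}\right)
  = \frac{\rho(1)\,[1 - \rho(2)]}{1 - \rho(1)^2}.
\end{aligned}
```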

Stationarity conditions for an AR(2) process

We showed earlier that the equation for the variance of an AR(p)
process is

$$\sigma_{AR}^2 = \frac{\sigma_w^2}{1 - \rho(1) a_1 - \rho(2) a_2 - \ldots - \rho(p) a_p}.$$

For an AR(2) process this reduces to

$$\sigma_{AR}^2 = \frac{\sigma_w^2}{1 - \rho(1) a_1 - \rho(2) a_2}.$$

Stationarity conditions for an AR(2) process

Substituting the autocorrelations implied by $a_1$ and $a_2$ can be shown to yield (after a
little bit of algebra)

$$\sigma_{AR}^2 = \frac{(1 - a_2)\,\sigma_w^2}{(1 + a_2)(1 - a_1 - a_2)(1 + a_1 - a_2)}$$
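
The algebra can be spot-checked numerically. The sketch below uses illustrative coefficients (0.8 and -0.7) and the lag-1 relation ρ(1) = a1/(1 − a2), which follows from the Yule-Walker equation with k = 1; the two variance expressions should agree.

```r
a1 <- 0.8; a2 <- -0.7; sigma2_w <- 1   # illustrative values

# Autocorrelations implied by the Yule-Walker equations
rho1 <- a1 / (1 - a2)
rho2 <- a1 * rho1 + a2

# Variance written in terms of the autocorrelations
v_yw <- sigma2_w / (1 - rho1 * a1 - rho2 * a2)

# Variance written in terms of the coefficients only
v_closed <- (1 - a2) * sigma2_w /
  ((1 + a2) * (1 - a1 - a2) * (1 + a1 - a2))

c(v_yw, v_closed)
```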

Stationarity conditions for an AR(2) process

Using the facts that autocorrelations must be less than 1 in absolute value, and
that each factor in the denominator and numerator must be positive, we can
derive the conditions

$$-1 < a_2 < 1$$
$$a_2 + a_1 < 1$$
$$a_2 - a_1 < 1$$

These inequalities define the stationarity region for AR(2).
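
The three inequalities translate directly into a small check. The function name `ar2_is_stationary` is hypothetical, and the coefficient pairs in the calls are only illustrative.

```r
# TRUE when (a1, a2) lies inside the AR(2) stationarity region.
# ar2_is_stationary is an illustrative name.
ar2_is_stationary <- function(a1, a2) {
  (abs(a2) < 1) && (a2 + a1 < 1) && (a2 - a1 < 1)
}

ar2_is_stationary(0.8, -0.7)    # inside the region
ar2_is_stationary(0.5, 0.505)   # fails a2 + a1 < 1
ar2_is_stationary(0, -1.05)     # fails -1 < a2 < 1
```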

Scatterplot to Illustrate the AR(1) Model

The model AR(1) is about the correlation between an error and the
previous error. First consider white noise. In this case there should be no
correlation between the errors and the shifted errors.
On the other hand, consider AR(1) errors with a1 = 0.7.

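R Code

# simulating white noise;
# (n and the N(0,1) noise are assumptions; the original line is missing);
n <- 200;
noise <- rnorm(n);

# simulating AR(1);
error <- filter(noise,filter=0.7,method="recursive",init=0);

# scatterplot of each error against the next one;
plot(error[-n],error[-1],xlab="errors",
ylab="shifted errors");
title("AR(1) errors, a=0.7" );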

[Figure: scatterplot titled "AR(1) errors, a=0.7"; vertical axis "shifted errors"; both axes run from about −5 to 10.]


Autocorrelation Function

[Figure: correlogram titled "Series error", lags 0 to 30.]


Based on the formulas already derived and the choice used in the code
($a_1 = 0.7$):
$\rho(0) = 1$ by definition
$\rho(1) = a_1 = 0.7$
$\rho(2) = a_1^2 = 0.49$
$\rho(3) = a_1^3 = 0.343$
$\rho(4) = a_1^4 = 0.2401$

R Code

acf(error,lag.max=4,plot=FALSE);

## Autocorrelations of series 'error', by lag
## 0 1 2 3 4
## 1.000 0.690 0.461 0.319 0.216

Examples of Stable and Unstable AR(1) Models

Consider the following three cases of AR(1) data:

1. a1 = −0.9
2. a1 = 0.5
3. a1 = 1.1

Case 1

a1<- -0.9;
error <- filter(noise,filter=(a1),method="recursive",
init=0);

plot.ts(error,main="a = -0.9, n =100");


Case 1. Time Series

[Figure: time series plot titled "a = −0.9, n =100"; time axis 0 to 100.]

Case 1. Correlogram

[Figure: correlogram titled "Series error", lags 0 to 20.]


Case 2

a1<- 0.5;
error <- filter(noise,filter=(a1),method="recursive",
init=0);

plot.ts(error,main="a = 0.5, n =100");


Case 2. Time Series

[Figure: time series plot titled "a = 0.5, n =100"; time axis 0 to 100.]

Case 2. Correlogram

[Figure: correlogram titled "Series error", lags 0 to 20.]


Case 3

a1<- 1.1;
error <- filter(noise,filter=(a1),method="recursive",
init=0);

plot.ts(error,main="a = 1.1, n =100");


Case 3. Time Series

[Figure: time series plot titled "a = 1.1, n =100"; time axis 0 to 100.]

Case 3. Correlogram

[Figure: correlogram titled "Series error", lags 0 to 20.]


The AR(2) Model Autocorrelation and Autocovariance

The model is:

xt = a1 xt−1 + a2 xt−2 + wt ,
and the autocovariances can be characterized with recursive relationships.

γ(1) and ρ(1)

$$\gamma(1) = E(x_j x_{j-1}) = E([a_1 x_{j-1} + a_2 x_{j-2} + w_j] x_{j-1}) = a_1 \gamma(0) + a_2 \gamma(1),$$

so it is easy to see that $\rho(1) = \dfrac{a_1}{1 - a_2}$.

ρ(2) and ρ(3)

Doing something similar,

$$\rho(2) = \frac{a_1^2}{1 - a_2} + a_2$$

$$\rho(3) = \frac{a_1^3 + a_1 a_2}{1 - a_2} + a_1 a_2.$$

$\sigma_{AR}^2 = \gamma(0)$

It can be shown that, for the AR(2) model,

$$\sigma_{AR}^2 = \frac{\sigma_w^2}{1 - a_1 \rho(1) - a_2 \rho(2)}.$$

Simulating Data for AR(2) Models

a1<- 0.8;
a2<- -0.7;
error <- filter(noise,filter=c(a1,a2),method="recursive");


[Figure: correlogram titled "Series error", lags 0 to 20.]



Based on the formulas already derived and the choices used in the code
($a_1 = 0.8$, $a_2 = -0.7$):
$\rho(0) = 1$, $\rho(1) = \frac{a_1}{1 - a_2} = \frac{0.8}{1.7} \approx 0.4706$,
$\rho(2) = a_1 \rho(1) + a_2 \rho(0) \approx 0.8(0.4706) - 0.7(1) \approx -0.3235$,
$\rho(3) = a_1 \rho(2) + a_2 \rho(1) \approx 0.8(-0.3235) - 0.7(0.4706) \approx -0.5882$, etc.
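
These hand calculations can be reproduced with the recursion ρ(k) = a1 ρ(k−1) + a2 ρ(k−2) and compared with the theoretical ACF from base R's `ARMAacf()`. The vector name `rho` is an assumption made for illustration.

```r
a1 <- 0.8; a2 <- -0.7

rho <- numeric(4)            # rho[k + 1] stores rho(k), k = 0..3
rho[1] <- 1                  # rho(0)
rho[2] <- a1 / (1 - a2)      # rho(1)
for (k in 2:3) {
  rho[k + 1] <- a1 * rho[k] + a2 * rho[k - 1]
}

round(rho, 4)
round(unname(ARMAacf(ar = c(a1, a2), lag.max = 3)), 4)
```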

Stable and Unstable AR(2) Models

Recall the AR(2) process is stationary when:

i) a1 + a2 < 1,
ii) a2 − a1 < 1, and
iii) |a2 | < 1.

Examples of Stable and Unstable AR(2) Models

Stable: a1 = 1.60 and a2 = −0.63.
Stable: a1 = 0.30 and a2 = 0.40.
Stable: a1 = −0.30 and a2 = 0.40.
Unstable, violates (i): a1 = 0.500 and a2 = 0.505.
Unstable, violates (ii): a1 = −0.505 and a2 = 0.500.
Unstable, violates (iii): a1 = 0 and a2 = −1.05.
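
These six cases can be cross-checked with a standard equivalent condition that is not stated in the slides: an AR(2) process is stationary when every root of 1 − a1 z − a2 z² lies outside the unit circle. The sketch below uses base R's `polyroot()`; the helper name `ar2_roots_outside` is hypothetical.

```r
# TRUE when all roots of 1 - a1*z - a2*z^2 have modulus > 1.
# ar2_roots_outside is an illustrative name.
ar2_roots_outside <- function(a1, a2) {
  all(Mod(polyroot(c(1, -a1, -a2))) > 1)
}

# One row per example: (a1, a2)
cases <- rbind(c( 1.60, -0.63),
               c( 0.30,  0.40),
               c(-0.30,  0.40),
               c( 0.500, 0.505),
               c(-0.505, 0.500),
               c( 0,    -1.05))

stable <- apply(cases, 1, function(a) ar2_roots_outside(a[1], a[2]))
cbind(cases, stable)
```

The first three rows should come out stable and the last three unstable, matching the classification by the inequalities.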

Unstable case a1 = 0.500 and a2 = 0.505.

a1<- 0.500;
a2<- 0.505;
error <- filter(noise,filter=c(a1,a2),method="recursive");

plot.ts(error,main="a1 = 0.5 and a2=0.505, n =100");

[Figure: time series plot titled "a1 = 0.5 and a2=0.505, n =100"; time axis 0 to 100.]


Unstable case a1 = −0.505 and a2 = 0.500.

a1<- - 0.505;
a2<- 0.500;
error <- filter(noise,filter=c(a1,a2),method="recursive");

plot.ts(error,main="a1 = - 0.505 and a2=0.500, n =100");

[Figure: time series plot titled "a1 = − 0.505 and a2=0.500, n =100"; time axis 0 to 100.]


Unstable case a1 = 0 and a2 = −1.05.

a1<- 0;
a2<- -1.05;
error <- filter(noise,filter=c(a1,a2),method="recursive");

plot.ts(error,main="a1 = 0 and a2= -1.05, n =100");

[Figure: time series plot titled "a1 = 0 and a2= −1.05, n =100"; time axis 0 to 100.]


