HSTS423 - Unit 3 Autocorrelation and Generalised Least Squares Estimation (GLS)


UNIT 3

Autocorrelation and Generalised Least Squares Estimation (GLS)

Objectives

At the end of this unit students are expected to be able to

1. define autocorrelation or serial correlation,

2. describe at least five causes of autocorrelation,

3. describe common types of autocorrelation, in particular first order autocorrelation,

4. describe at least three consequences of autocorrelation,

5. estimate model parameters in the presence of autocorrelation, i.e. conduct appropriate estimation procedures for cases where the assumption of independence (no autocorrelation) is violated,

6. generate accurate forecasts with an autocorrelated model.



1.1 Introduction to Autocorrelation


In this section we describe a statistical phenomenon called auto-correlation. Recall that the main goal of econometric modelling is to obtain accurate and efficient estimates of relationships among variables of an economic system, on the basis of which the main aims of Econometrics, namely prediction, planning and control, can be effected. As indicated earlier, the degree of success or failure in achieving these goals depends, to a large extent, on the degree of success achieved at the specification, estimation and diagnostic stages of the model building process. At the estimation and inferential stages, the efficiency of parameter estimates and the validity of inference resulting therefrom depend largely on whether or not the fundamental assumptions of the GLM are satisfied. In the econometric regression model Yt = X′t β + ut, one of the basic assumptions is that the disturbance or error terms {ut} are uncorrelated, i.e.

Ω = E(uu′) = σ²I

where u = (u1, u2, . . . , un)′. This assumption is sometimes referred to as the assumption of spherical disturbances and, as indicated earlier, it involves the double assumption that the error terms are uncorrelated and have a constant variance. We will examine the assumption of constant variance in the next Unit. In this Unit we discuss the assumption of no auto-correlation. This assumption ensures, inter alia, i.e. among other things, that the least squares estimator β̂ = (X′X)⁻¹X′Y is an efficient estimator of β. The efficiency of β̂ follows from the Gauss-Markov theorem, which states that β̂ is the Best Linear Unbiased Estimator (BLUE) of β in the sense that it has the smallest variance among all unbiased linear estimators of β. Further, with the additional assumption of normality, the assumption implies that the conventional t-test and F-test used to make inference about the model are valid statistical tests. If the error terms in the regression model are correlated we say that there is auto-correlation.

Definition 1.1 Let {ut} be a stationary time series. Then the series is said to be auto-correlated if γ(k) = cov(ut, ut+k) ≠ 0 for some k ≥ 1.

Auto-correlation refers to non-zero correlation between successive values of the same


variable or series. For this reason auto-correlation is sometimes referred to as serial
correlation.
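Definition 1.1 can be checked numerically. The sketch below (a minimal illustration assuming NumPy; series lengths, seed and ρ are arbitrary choices of mine) computes the sample lag-k autocorrelation γ̂(k)/γ̂(0) for an uncorrelated series and for a persistent one.

```python
import numpy as np

def sample_autocorr(u, k):
    """Sample lag-k autocorrelation: gamma_hat(k) / gamma_hat(0)."""
    d = np.asarray(u, dtype=float)
    d = d - d.mean()
    return (d[:-k] @ d[k:]) / (d @ d)

rng = np.random.default_rng(0)
white = rng.normal(size=500)             # serially uncorrelated series
ar = np.zeros(500)
for t in range(1, 500):
    ar[t] = 0.9 * ar[t - 1] + white[t]   # persistent AR(1) series

print(round(sample_autocorr(white, 1), 2))  # near 0: no serial correlation
print(round(sample_autocorr(ar, 1), 2))     # near 0.9: serially correlated
```

The white-noise series has lag-1 autocorrelation near zero, while the AR(1) series shows lag-1 autocorrelation near its ρ of 0.9.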

Whenever there is serial correlation in the error terms all inference i.e. estimation,
hypothesis testing and forecasting must take into account the effects of auto-correlation
for the conclusions to be valid.

We examine below some common causes of auto-correlation and how they can be avoided or taken into account when making statistical inference.

1.2 Causes or Sources of Auto-correlation


In practice the assumption of independence or no auto-correlation can be violated i.e.
may not hold, for a number of reasons.

1. OMISSION OF RELEVANT EXPLANATORY VARIABLES: Most economic variables, such as cyclic variables, tend to be auto-correlated. If auto-correlated explanatory variables are omitted or excluded from the regression model then they are absorbed into the error term ut, which will then also be auto-correlated. For example, suppose that the model Yt = β0 + β1X1t + β2X2t + ut is wrongly specified as Yt = β0 + β1X1t + ut. Then if {X2t} is serially correlated, so will be {ut}.

Auto-correlation due to mis-specification, whether by exclusion of important explanatory variables or by assuming a linear relation where in fact a non-linear relationship exists, is quite common. The solution to the problem, if detected, is simply correcting the specification. Other treatments of the problem are also possible, as we will see later.
2. PROLONGED SHOCK EFFECTS: System shocks and serious natural or socio-economic events, such as strikes and accidents, can often be felt over fairly long periods of time. This may affect both the explanatory and dependent variables. Consequently the problem may or may not be easy to eliminate. Appropriate specification, or estimation and inference procedures that take such correlation into account, are recommended.

3. EXTRAPOLATION TENDENCIES: Traditions and various other forms of behavioural patterns established in the past will often affect current economies. For example, when the dependent variable depends on its past history and this dependency is not part of the specification, then auto-correlation is bound to be present. This is often an inherent or deeply rooted problem that may not be easy to eliminate. Appropriate estimation and inference procedures in the presence of such correlation may be the only solution.
4. DATA TREATMENT: Data manipulation such as data smoothing tends to
spread impacts over several periods. Sampling procedures such as systematic
sampling can also generate serial correlation.
5. ECONOMIC LINKAGES: Many of the points raised above refer to time-series data, but cross-sectional or spatial data can also contain autocorrelation. Sub-sets of an economic system may affect one another through their linkages and contacts. If a shock hits one or more strategic parts of the system, then all the other units to which it is linked will also be affected.

In order to appreciate the need for auto-correlation tests, on a routine basis, it is


necessary to understand consequences of auto-correlation, in particular properties of
the OLS estimator of β.

1.3 Consequences of Auto-correlation


A summary of the sampling properties of OLS estimators obtained in the presence of
auto-correlation is:

1. β̂ = (X′X)⁻¹X′Y = β + (X′X)⁻¹X′u,

2. E(β̂) = β, that is, β̂ is still unbiased for β,

3. cov(β̂) = (X′X)⁻¹(X′ΩX)(X′X)⁻¹,

4. β̂ ∼ Np[β, (X′X)⁻¹(X′ΩX)(X′X)⁻¹].

Thus, although the OLS estimator β̂ is still unbiased for β, the conventional formula for the parameter covariance, σ²(X′X)⁻¹, or its sample counterpart s²(X′X)⁻¹, no longer measures the sampling variance of the OLS estimator β̂, so any application of it is potentially misleading. It is also clear that β̂ now has larger standard errors: the minimum variance property of OLS no longer holds. We can now summarize the undesirable consequences of auto-correlation as follows.
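The mismatch between the conventional covariance formula and the true sampling variance can be seen in a small Monte Carlo sketch (the regressor, sample size, ρ and number of replications are illustrative choices of mine): with positively autocorrelated errors and a smooth regressor, s²(X′X)⁻¹ reports far less variance than the OLS slope actually exhibits across replications.

```python
import numpy as np

rng = np.random.default_rng(1)
n, reps, rho = 100, 2000, 0.8
x = np.linspace(0.0, 10.0, n)                 # a smooth (trending) regressor
X = np.column_stack([np.ones(n), x])
XtX_inv = np.linalg.inv(X.T @ X)

# AR(1) errors u_t = rho*u_{t-1} + e_t, generated for all replications at once
e = rng.normal(size=(reps, n))
u = np.zeros((reps, n))
for t in range(1, n):
    u[:, t] = rho * u[:, t - 1] + e[:, t]

Y = 1.0 + 2.0 * x + u                         # true beta = (1, 2)
B = Y @ X @ XtX_inv                           # OLS estimates, one row per replication
resid = Y - B @ X.T
s2 = np.sum(resid ** 2, axis=1) / (n - 2)

true_var = np.var(B[:, 1])                    # actual sampling variance of the slope
reported = np.mean(s2 * XtX_inv[1, 1])        # what the conventional formula reports
print(true_var > reported)                    # prints True
```

Here the empirical variance of the slope estimates is several times larger than the average of the conventionally reported variance, which is exactly the understatement described above.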

1. It results in inefficient parameter estimation. More specifically,

1.1 Model coefficients will have inflated variances or standard errors.

1.2 The error variance is underestimated by OLS estimation.

2. The estimated model has low forecasting power.

Although there are other consequences of auto-correlation, these are apparently the ones that, in general, have the more serious impact on estimation, hypothesis testing and forecasting.

It is clear from the foregoing that in econometric modelling there should be routine checks or tests for auto-correlation. Autocorrelation can, of course, take various forms. It is, however, customary in empirical work and in the literature to concentrate on first order autocorrelation, as many empirical studies have shown that first order auto-correlation is apparently the most common form of auto-correlation found in economic variables.

A series {ut} is said to have first order auto-correlation if it satisfies the first order auto-regressive AR(1) model

ut = ρut−1 + εt

The properties of an auto-regressive process were presented in Unit 1 and are discussed in detail in the course on Time Series Analysis. Here it suffices to say that the auto-correlation function (ACF) of an AR(1) decays either exponentially, if 0 ≤ ρ < 1, or as a damped oscillation alternating in sign, if −1 < ρ < 0. The Partial Auto-correlation Function (PACF) cuts off, i.e. becomes zero, after lag 1, that is, from lag 2 onwards.
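The exponential decay ρᵏ of the AR(1) ACF can be verified by simulation (a minimal sketch; the sample size, seed and ρ = 0.7 are arbitrary choices of mine):

```python
import numpy as np

rng = np.random.default_rng(2)
n, rho = 5000, 0.7
e = rng.normal(size=n)
u = np.zeros(n)
for t in range(1, n):
    u[t] = rho * u[t - 1] + e[t]      # AR(1): u_t = rho*u_{t-1} + eps_t

def acf(u, k):
    d = u - u.mean()
    return (d[:-k] @ d[k:]) / (d @ d)

# Empirical ACF tracks the theoretical value rho**k at each lag
for k in (1, 2, 3, 4):
    print(k, round(acf(u, k), 2), round(rho ** k, 2))
```

With a long series the empirical lag-k autocorrelations sit close to 0.7, 0.49, 0.34, 0.24, the geometric decay predicted by the theory.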

It is important to note that, in general, serial correlation may not be of a simple auto-regressive form. Consider therefore some time series models which are frequently encountered in practice, e.g. error terms which follow a general ARMA(p, q) model.

1.4 Testing for Auto-correlation: the Durbin-Watson test


In practice checks or tests for auto-correlation in the error terms {ut } can be conducted
by

1. Examining the correlogram, i.e. a plot of the ACF of the residuals {ût} against lag. If there is no auto-correlation then the ACF coefficients should lie within the 95% confidence band

±1.96/√n

where n is the sample size. Otherwise there is auto-correlation of some sort. The advantage of this graphical test is that it applies not only to first order auto-correlation, but to all forms of auto-correlation.

2. A more formal test for first order auto-correlation is provided by the Durbin-
Watson test which we discuss in some detail below.
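The correlogram check of item 1 is easy to automate. The sketch below (the helper name and test series are my own illustrative choices) flags lags whose sample ACF falls outside the ±1.96/√n band:

```python
import numpy as np

def correlogram_flags(resid, max_lag=10):
    """Lags whose sample ACF falls outside the 95% band +/- 1.96/sqrt(n)."""
    r = np.asarray(resid, dtype=float)
    r = r - r.mean()
    n, denom = len(r), r @ r
    band = 1.96 / np.sqrt(n)
    return [k for k in range(1, max_lag + 1)
            if abs((r[:-k] @ r[k:]) / denom) > band]

rng = np.random.default_rng(3)
e = rng.normal(size=400)
u = np.zeros(400)
for t in range(1, 400):
    u[t] = 0.8 * u[t - 1] + e[t]    # positively autocorrelated series

print(correlogram_flags(e))         # white noise: few or no lags flagged
print(correlogram_flags(u))         # low lags flagged: autocorrelation present
```

An empty (or nearly empty) list is consistent with no auto-correlation; flagged low-order lags point to serial correlation of some form.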

The Durbin-Watson test for auto-correlation has the following steps.


The hypothesis to be tested is

H0 : ρ = 0 versus H1 : ρ ≠ 0

The above hypotheses are tested indirectly by testing

H0 : µd = 2 versus H1 : µd ≠ 2

where µd = E(d) and the test statistic d is given by

d = Σ_{t=2}^{n} (ût − ût−1)² / Σ_{t=1}^{n} û²t = û′Aû / û′û = Y′(I − H)A(I − H)Y / Y′(I − H)Y

where H = X(X′X)⁻¹X′ and A is the n × n tridiagonal matrix

A =
[  1  −1   0   0  . . .  0 ]
[ −1   2  −1   0         . ]
[  0  −1   2  −1           ]
[        . . .             ]
[            −1    2   −1  ]
[  0  . . .  0   −1    1   ]

It is easy to show that, for large n, d ≈ 2(1 − ρ̂), or equivalently ρ̂ ≈ 1 − d/2 = (2 − d)/2, where ρ̂ = Σ_{t=2}^{n} ût ût−1 / Σ_{t=1}^{n} û²t.

[Figure 1.1: Lower and upper distributions of the Durbin-Watson statistic.]

The decision rule is: if the value of d ≤ 2, then, using an appropriate significance level,

1. reject H0 in favour of H1, i.e. in favour of positive auto-correlation, if d < dL;

2. accept H0 if dU < d < 4 − dU;

3. if dL < d < dU, the test is considered inconclusive.

The critical values dL and dU are given in the Durbin-Watson tables found in the appendix of most textbooks.

If the value of d > 2, compute d′ = 4 − d and compare this value with the tabulated values of dL and dU as if one were testing for positive auto-correlation.
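The statistic and its large-n link to ρ̂ can be sketched directly from residuals (a minimal illustration; the simulated series are my own):

```python
import numpy as np

def durbin_watson(resid):
    """d = sum(diff(u)^2) / sum(u^2); for large n, d is approx. 2*(1 - rho_hat)."""
    u = np.asarray(resid, dtype=float)
    return np.sum(np.diff(u) ** 2) / np.sum(u ** 2)

rng = np.random.default_rng(4)
white = rng.normal(size=500)
u = np.zeros(500)
for t in range(1, 500):
    u[t] = 0.9 * u[t - 1] + white[t]

print(round(durbin_watson(white), 1))  # near 2: no autocorrelation
print(round(durbin_watson(u), 1))      # near 2*(1 - 0.9) = 0.2: positive autocorrelation
```

Values of d near 2 are consistent with no autocorrelation; values well below 2 indicate positive, and values well above 2 negative, first order autocorrelation.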

Note that, in the inconclusive case, it is generally incorrect to say that {ut} is nearly uncorrelated or that there is little auto-correlation. A practical, conservative procedure is to simply reject H0, as the consequences of wrongly assuming no auto-correlation are more serious than those of assuming that it is present.

Example 1.1 Consider the following expenditure data presented in Unit 2.


income(x) expenditure(y)

2 2.0
3 2.5
4 2.6
5 2.9
6 3.0

(i) Using the fitted regression Y = 1.64 + 0.24X, compute the residual series {ût }.
(ii) Hence test for first order serial correlation. Use a 5% significance level.

Solution 1.1

(i) The residuals series and its computation are displayed in the following table.

Xt Yt Ŷt ût ût−1

2 2.0 2.12 -0.12 ?


3 2.5 2.36 0.14 -0.12
4 2.6 2.60 0.00 0.14
5 2.9 2.84 0.06 0.00
6 3.0 3.08 -0.08 0.06

(ii) Suppose that {ut} satisfies ut = ρut−1 + εt. Then the hypothesis to be tested is

H0 : ρ = 0

that is, the error terms are serially uncorrelated. The test statistic is

d = Σ_{t=2}^{5} (ût − ût−1)² / Σ_{t=1}^{5} û²t
  = [(0.14 − (−0.12))² + . . . + (−0.08 − 0.06)²] / 0.0440
  = 0.1104 / 0.0440
  = 2.51.

Since d = 2.51 > 2.0, we compute d′ = 4 − d = 1.49. From the Durbin-Watson table with α = 5%, k = 1 and n = 5 we obtain dL = 0.6 and dU = 1.4. Since d′ = 1.49 > dU, there is insufficient evidence of negative auto-correlation. That is, we keep H0 and conclude that the error terms are serially uncorrelated.
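The arithmetic of Example 1.1 can be reproduced in a few lines:

```python
import numpy as np

x = np.array([2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.0, 2.5, 2.6, 2.9, 3.0])

u = y - (1.64 + 0.24 * x)                      # residuals from the fitted line
d = np.sum(np.diff(u) ** 2) / np.sum(u ** 2)   # Durbin-Watson statistic

print(np.round(u, 2))    # [-0.12  0.14  0.    0.06 -0.08]
print(round(d, 2))       # 2.51
```

This reproduces the residual column of the table and the value d = 2.51 obtained by hand above.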

It is important to emphasize that the Durbin-Watson test deals with only one type of auto-correlation, the first order auto-regressive type. Auto-correlation tests are often defined for specific specifications, as is the case with the Durbin-Watson test, and such tests have limitations. For example, the Durbin-Watson test assumes that the regressors are non-stochastic and is thus invalid if, for instance, lagged values of the dependent variable appear among the regressors, which is quite common as we will see later.

We now turn to the treatment of auto-correlation i.e. estimation in the presence of


auto-correlation.

1.5 Generalised Least Squares (GLS): Inference in the presence of auto-correlation

If the reason for auto-correlation can be traced back to model specification, then correct the specification; using generalised least squares, as described below, would not be the correct solution. If genuine auto-correlation is present, then use a suitable form of (estimated) generalised least squares capable of bringing the non-zero covariances back to the ideal zero form. If auto-correlation of a given form is found to be present, it may be possible to transform the affected model into a form which does not violate the assumption of spherical disturbances. Since Ω = cov(u) is symmetric and positive definite, there exists a non-singular n × n matrix P such that

Ω⁻¹ = P′P

where P = Λ^(−1/2)E′, Λ is the diagonal matrix of eigenvalues of Ω and E is the matrix of (normalised) eigenvectors of Ω. Transforming the data into

Y* = PY and X* = PX

and applying OLS to the transformed model

Y* = X*β + v

yields an estimator with the desirable property of being the best linear unbiased estimator of β, where β is as in the original model. In the case of first order auto-correlation, it can be shown that, up to a scalar factor,

P =
[ √(1 − ρ²)   0    0   . . .   0 ]
[    −ρ       1    0           . ]
[     0      −ρ    1             ]
[          . . .                 ]
[  0   . . .   0     −ρ    1     ]
Frequently, however, ρ is not known. There are several ways of estimating ρ. The simplest is to use the estimate ρ̂ = (2 − d)/2 obtained from the Durbin-Watson statistic. Other methods include maximum likelihood estimation, iterative procedures, etc. The resulting estimator β̂ is called a Generalised Least Squares (GLS) estimator, or an Estimated Generalised Least Squares (EGLS) estimator if ρ is replaced by its sample estimate ρ̂. If the aim of the model is, for example, to forecast, then more accurate forecasts are obtained using the GLS-estimated model than with a model estimated using Ordinary Least Squares.
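A minimal sketch of the whole two-step EGLS procedure (function names, simulated data and parameter values are my own illustrative choices; NumPy only): estimate ρ from the Durbin-Watson statistic of an OLS fit, build the first-order transformation matrix P, and re-run OLS on the transformed data.

```python
import numpy as np

def prais_winsten_P(rho, n):
    """First-order transformation matrix; P'P is proportional to inv(Omega)."""
    P = np.eye(n)
    P[0, 0] = np.sqrt(1.0 - rho ** 2)
    for t in range(1, n):
        P[t, t - 1] = -rho
    return P

def egls_ar1(y, X):
    """Two-step EGLS: OLS -> rho_hat = (2 - d)/2 -> OLS on transformed data."""
    beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
    u = y - X @ beta_ols
    d = np.sum(np.diff(u) ** 2) / np.sum(u ** 2)
    rho_hat = (2.0 - d) / 2.0
    P = prais_winsten_P(rho_hat, len(y))
    beta_gls, *_ = np.linalg.lstsq(P @ X, P @ y, rcond=None)
    return beta_gls, rho_hat

# Simulated data with AR(1) errors, true beta = (1, 2), rho = 0.7
rng = np.random.default_rng(5)
n = 200
x = np.linspace(0.0, 10.0, n)
X = np.column_stack([np.ones(n), x])
e = rng.normal(size=n)
u = np.zeros(n)
for t in range(1, n):
    u[t] = 0.7 * u[t - 1] + e[t]
y = 1.0 + 2.0 * x + u

beta_gls, rho_hat = egls_ar1(y, X)
print(np.round(beta_gls, 2), round(rho_hat, 2))
```

Iterating the last two steps until ρ̂ stabilises gives the iterative procedures mentioned above; this sketch stops after one pass.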

1.6 Prediction, i.e. forecasting, using a model with auto-correlated errors


Suppose that we have detected auto-correlation, transformed the data and obtained estimates of the model parameters using GLS. Let Y*, X*1, X*2, . . . , X*k be the transformed variables, and let Xn+1 = (X1(n+1), X2(n+1), . . . , Xk(n+1))′ be the values of the explanatory variables at time t = n + 1. Then it can be shown that the most efficient prediction of Yn+1 is given by

Ŷ(Xn+1) = X′n+1 β̂* + ρ̂ v̂n

where β̂* is the GLS estimate of β, ρ̂ is the estimated auto-correlation coefficient and v̂n is the last residual.
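As a sketch of this formula (the numbers are purely illustrative: the coefficients borrow β̂ = (1.64, 0.24)′ from Example 1.1, while ρ̂ = 0.4 and v̂n = −0.08 are hypothetical values of mine):

```python
import numpy as np

def forecast_next(x_new, beta_hat, rho_hat, last_resid):
    """One-step forecast with AR(1) errors.

    Since E[u_{n+1} | u_n] = rho * u_n, the best prediction adds
    rho_hat * (last residual) to the usual regression forecast."""
    return x_new @ beta_hat + rho_hat * last_resid

beta_hat = np.array([1.64, 0.24])   # illustrative coefficient estimates
x_new = np.array([1.0, 7.0])        # intercept term and new regressor value
print(round(float(forecast_next(x_new, beta_hat, 0.4, -0.08)), 3))  # 3.288
```

Compared with the plain regression forecast of 3.32, the correction ρ̂ v̂n = −0.032 exploits the serial dependence of the errors to sharpen the prediction.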

1.7 Summary of the Unit

In this Unit we have learnt that in practice there are several causes or sources of auto-correlation. These include omission of relevant explanatory variables, inherent correlation structure in the dependent variable, prolonged effects of external shocks, data treatment, etc. The serious consequences of this phenomenon are inefficient parameter estimation and low forecasting power of the resulting model. Tests for auto-correlation can be conducted by examining the correlogram, or formally by conducting tests such as the Durbin-Watson test. Compensation for the effects of auto-correlation can be achieved by Generalised Least Squares, noting however that if the auto-correlation is generated by mis-specification then there is no effective remedy other than appropriate re-specification. Further, many problems may arise together, such as auto-correlation and non-constant, i.e. heteroscedastic, variance. These may have to be treated together.

Activity 1.1

1. The following table shows data on imports and Gross National Product (GNP).

imports GNP

Zt Xt
3748 21777
4010 22418
3711 22308
4004 23319
4151 24180
4569 24893
4582 25310
4697 25799
4753 25886
5062 26868
5669 28134
5628 29091
5736 29450
5946 30705
6501 32372
6549 33152
6705 33764
7104 34411
7609 35429
8100 36200

Fit the imports model imports = a + b·GNP + error to the data and test for first order auto-correlation.

2. Show that, for large n, d ≈ 2(1 − ρ̂), where ρ̂ = Σ_{t=2}^{n} ût ût−1 / Σ_{t=2}^{n} û²t−1.

3. Assuming that ρ = 0.5, compute the OLS estimate of the imports model of exercise 1 using the appropriately transformed data. Compare your results with those obtained in exercise 1.
4. Prove the Gauss-Markov theorem, that is, if the assumptions of the GLM are satisfied then the Ordinary Least Squares (OLS) estimator of β is efficient among the class of linear unbiased estimators.
5. Let

d = Σ_{t=2}^{n} (ût − ût−1)² / Σ_{t=1}^{n} û²t = y′(I − H)A(I − H)y / y′(I − H)y

where H = X(X′X)⁻¹X′ and A is the n × n tridiagonal matrix

A =
[  1  −1   0   0  . . .  0 ]
[ −1   2  −1   0         . ]
[  0  −1   2  −1           ]
[        . . .             ]
[            −1    2   −1  ]
[  0  . . .  0   −1    1   ]

be the Durbin-Watson test statistic. Show that the eigenvalues of A are

λj = 2[1 − cos(π(j − 1)/n)],  j = 1, 2, . . . , n.

Hence prove that 0 ≤ d ≤ 4.

6. The following table shows the annual consumption and the disposable income in
$million for a certain country.

Year C Yd
1957 11378 11617
1958 13012 13297
1959 15263 15790
1960 16873 18017
1961 17764 19314
1962 18857 20198
1963 20074 21512
1964 21439 23124
1965 22833 24724
1966 24205 26174
1967 25307 27219
1968 27020 28915

Application of OLS yields the following results

Ĉ = 8526 + 0.65Yd ,  R² = 0.953

(a) Find the residuals and test for autocorrelation.

(b) Estimate the value of ρ and use your estimate of it to transform your original
data

Y*t = Yt − ρ̂Yt−1 ,  X*t = Xt − ρ̂Xt−1

(c) Apply OLS to the transformed data and compare your results with the OLS
estimates obtained from the original sample observations.

7. The following data show the OLS residuals of the model Y = β0 + β1 X + u.



Year (ût )
1950 1.0
1951 -1.5
1952 -0.7
1953 -1.3
1954 -4.6
1955 -0.3
1956 -3.1
1957 -5.5
1958 -4.7
1959 -1.3
1960 -4.6
1961 -4.3
1962 1.9
1963 1.9
1964 2.9
1965 -2.6
1966 -2.3
1967 0.9
1968 1.4
1969 3.7

Calculate d and estimate ρ. Do the results support the use of first differences to estimate β0 and β1?

