ECON F342 AE CH 12


ECON F342 : Applied Econometrics

N V M RAO

1 / 23
Reading from the Text Book

For today


Serial correlation and heteroskedasticity in time series
regressions.
Chapter 12 (pp. 376 – 404).

2 / 23
Properties of OLS with Serially Correlated
Errors: Unbiasedness and Consistency

OLS is unbiased under the first 3 Gauss-Markov
assumptions for time series regression.
But these assumptions say nothing about the serial correlation
that is often present in economic data.
As long as the explanatory variables are strictly exogenous, the
β̂OLS are unbiased, regardless of the degree of serial
correlation in the errors.
Last lecture, we also relaxed strict exogeneity to
E(ut |xt ) = 0 and, by assuming weak dependence of the
data, showed that the β̂OLS are still consistent
(although not necessarily unbiased).
But what about the assumption on serial correlation?

3 / 23
Properties of OLS with Serially Correlated
Errors: Efficiency and Inference

The Gauss-Markov theorem requires both homoskedasticity and
serially uncorrelated errors.
Thus, OLS is no longer BLUE in the presence of serial
correlation...
...and standard errors and test statistics are not valid.
Let's assume a model with AR(1) errors:

yt = β0 + β1 xt + ut,
ut = ρut−1 + εt,

for t = 1, 2, . . . , n, where |ρ| < 1 and the εt are uncorrelated
random variables with zero mean and variance σε².

4 / 23
Properties of OLS with Serially Correlated
Errors: Efficiency and Inference cont.

The OLS estimator is then:

$$\hat{\beta}_1 = \beta_1 + SST_x^{-1}\sum_{t=1}^{n} x_t u_t,$$

where $SST_x = \sum_{t=1}^{n} x_t^2$.

5 / 23
Properties of OLS with Serially Correlated
Errors: Efficiency and Inference cont.
Variance of β̂1 conditional on X is:
$$\mathrm{Var}(\hat{\beta}_1) = SST_x^{-2}\,\mathrm{Var}\!\left(\sum_{t=1}^{n} x_t u_t\right)
= SST_x^{-2}\left[\sum_{t=1}^{n} x_t^2\,\mathrm{Var}(u_t) + 2\sum_{t=1}^{n-1}\sum_{j=1}^{n-t} x_t x_{t+j}\,E(u_t u_{t+j})\right]$$

$$= \underbrace{\sigma^2/SST_x}_{\text{variance of }\hat{\beta}_1} + \underbrace{2(\sigma^2/SST_x^2)\sum_{t=1}^{n-1}\sum_{j=1}^{n-t}\rho^j x_t x_{t+j}}_{\text{bias}},$$

where σ² = Var(ut) and we used the fact from last lecture that
E(ut ut+j) = Cov(ut, ut+j) = ρ^j σ².
If we ignore the serial correlation and estimate the variance in
the usual way, the variance estimator will be biased (as ρ ≠ 0).

6 / 23
Properties of OLS with Serially Correlated
Errors: Efficiency and Inference cont.

Consequences:
In most economic applications, ρ > 0, and the usual OLS
variance formula underestimates the true variance of the OLS
estimator.
We tend to think that the OLS slope estimator is more precise
than it actually is.
The main consequence is that standard errors are invalid ⇒ t
statistics for testing single hypotheses are invalid ⇒
statistical inference is invalid.

7 / 23
Testing for AR(1) Serial Correlation

We need to be able to test for serial correlation in the error
terms of the multiple linear regression model:

yt = β0 + β1 xt1 + . . . + βk xtk + ut,

with ut = ρut−1 + εt, t = 1, 2, . . . , n.
The null hypothesis is that there is no serial correlation:
H0 : ρ = 0
With strictly exogenous regressors, the test is very
straightforward: simply regress the OLS residuals ût on the
lagged residuals ût−1.
The t statistic on the ρ̂ coefficient can be used to test
H0 : ρ = 0 against HA : ρ ≠ 0 (or sometimes even HA : ρ > 0).
A short code sketch follows this slide.

8 / 23
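A minimal sketch of this residual-based test in Python with statsmodels, on simulated data (the variable names and the simulated model are illustrative, not from the lecture):

```python
import numpy as np
import statsmodels.api as sm

# Simulate a regression with AR(1) errors (true rho = 0.5).
rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
u = np.zeros(n)
for t in range(1, n):
    u[t] = 0.5 * u[t - 1] + rng.normal()
y = 1.0 + 2.0 * x + u

# Step 1: OLS, save the residuals.
res = sm.OLS(y, sm.add_constant(x)).fit()
uhat = res.resid

# Step 2: regress residuals on their first lag; the t statistic on the
# lag coefficient tests H0: rho = 0.
aux = sm.OLS(uhat[1:], sm.add_constant(uhat[:-1])).fit()
print("rho-hat:", aux.params[1], "t statistic:", aux.tvalues[1])
```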
Testing for AR(1) Serial Correlation cont.

An alternative is the Durbin-Watson (DW) statistic:

$$DW = \frac{\sum_{t=2}^{n}(\hat{u}_t - \hat{u}_{t-1})^2}{\sum_{t=1}^{n}\hat{u}_t^2}.$$

DW ≈ 2(1 − ρ̂).
ρ̂ ≈ 0 ⇒ DW ≈ 2.
ρ̂ > 0 ⇒ DW < 2.
The DW test is a little problematic: we have 2 sets of critical
values, dL (lower) and dU (upper):
DW < dL ⇒ reject H0 : ρ = 0 in favor of HA : ρ > 0.
DW > dU ⇒ fail to reject H0 : ρ = 0.
dL ≤ DW ≤ dU ⇒ the test is inconclusive.

9 / 23
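The DW statistic is a one-liner in statsmodels; a sketch reusing uhat from the example above:

```python
from statsmodels.stats.stattools import durbin_watson

# DW near 2 suggests little evidence of AR(1) serial correlation;
# DW well below 2 suggests rho > 0. Compare against tabulated dL/dU.
print("DW:", durbin_watson(uhat))
```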
Testing for AR(1) Serial Correlation cont.

In case we do not have strictly exogenous regressors (one or
more xtj is correlated with ut−1), neither the t test nor the DW
test works.
In this case, we can regress ût on xt1, xt2, . . . , xtk, ût−1 for all
t = 2, . . . , n.
The t statistic on the ρ̂ coefficient of ût−1 can be used to test
the null of no serial correlation.
The inclusion of xt1, xt2, . . . , xtk explicitly allows each xtj to
be correlated with ut−1 ⇒ no need for strict exogeneity.

10 / 23
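A sketch of this augmented regression, reusing the simulated data above (with the single regressor x):

```python
# Regress residuals on the regressors AND the lagged residual, so each
# regressor may be correlated with u_{t-1}.
Z = np.column_stack([np.ones(n - 1), x[1:], uhat[:-1]])
aux2 = sm.OLS(uhat[1:], Z).fit()
print("t statistic on lagged residual:", aux2.tvalues[-1])
```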
Testing for Higher Order Serial Correlation

We can easily extend the test to second order (AR(2))
serial correlation.
In the model ut = ρ1 ut−1 + ρ2 ut−2 + εt, we test
H0 : ρ1 = 0, ρ2 = 0.
We regress ût on xt1, xt2, . . . , xtk, ût−1, ût−2 for all
t = 3, . . . , n,
...and obtain an F test for the joint significance of ût−1 and
ût−2. If they are jointly significant, we reject the null ⇒ the
errors are serially correlated of order two (see the sketch
after this slide).

11 / 23
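A sketch of the AR(2) F test on the simulated data above; the restriction matrix selects the two residual-lag coefficients (columns 2 and 3 of Z2):

```python
# Regress uhat_t on const, x_t, uhat_{t-1}, uhat_{t-2} for t = 3..n.
Z2 = np.column_stack([np.ones(n - 2), x[2:], uhat[1:-1], uhat[:-2]])
aux_ar2 = sm.OLS(uhat[2:], Z2).fit()

# F test of H0: both lag coefficients are zero.
R = np.zeros((2, 4))
R[0, 2] = 1.0  # coefficient on uhat_{t-1}
R[1, 3] = 1.0  # coefficient on uhat_{t-2}
print(aux_ar2.f_test(R))
```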
Testing for Higher Order Serial Correlation cont.

We can include q lags to test for higher order serial correlation:
Regress ût on xt1, xt2, . . . , xtk, ût−1, ût−2, . . . , ût−q for all
t = (q + 1), . . . , n.
Use an F test for the joint significance of ût−1, ût−2, . . . , ût−q.
Or use the LM version of the test – the Breusch-Godfrey test:

LM = (n − q)R²û,

where R²û is the usual R² from the regression above.
Under the null hypothesis, LM ∼ χ²q asymptotically.

12 / 23
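statsmodels ships this LM test as acorr_breusch_godfrey; a sketch applied to the OLS results object from the first example:

```python
from statsmodels.stats.diagnostic import acorr_breusch_godfrey

# Breusch-Godfrey test with q = 4 lags; returns the LM statistic and its
# p-value, plus the F-test version.
lm_stat, lm_pval, f_stat, f_pval = acorr_breusch_godfrey(res, nlags=4)
print("LM:", lm_stat, "p-value:", lm_pval)
```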
Correcting for Serial Correlation

When serial correlation is detected, we need to treat it.
We know that OLS may be inefficient.
So how do we obtain a BLUE estimator in the AR(1) setting?
We keep the first 4 Gauss-Markov assumptions, but we relax
Assumption 5 (no serial correlation) and assume the errors
follow an AR(1): ut = ρut−1 + εt, t = 1, 2, . . .
Then Var(ut) = σε²/(1 − ρ²).
We need to transform the regression equation so that we have
no serial correlation in the errors.

13 / 23
Correcting for Serial Correlation cont.
Consider the following regression:

yt = β0 + β1 xt + ut,
ut = ρut−1 + εt

For t ≥ 2, we can write:

yt−1 = β0 + β1 xt−1 + ut−1,
yt = β0 + β1 xt + ut

Multiplying the first equation by ρ and subtracting it from the
second, we get:

ỹt = (1 − ρ)β0 + β1 x̃t + εt,

where ỹt = yt − ρyt−1 and x̃t = xt − ρxt−1.

This is called quasi-differencing. BUT we never know
the value of ρ.
14 / 23
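A sketch of quasi-differencing by hand, pretending ρ were known (here set to the true value 0.5 from the simulation above):

```python
# Quasi-difference y and x, and transform the intercept regressor to (1 - rho).
rho = 0.5
y_tilde = y[1:] - rho * y[:-1]
x_tilde = x[1:] - rho * x[:-1]
const_tilde = np.full(n - 1, 1.0 - rho)

# OLS on the transformed equation recovers beta0 and beta1 with
# serially uncorrelated errors.
gls = sm.OLS(y_tilde, np.column_stack([const_tilde, x_tilde])).fit()
print("beta0, beta1:", gls.params)
```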
Feasible GLS Estimation with AR(1) Errors
The problem with this GLS estimator is that we never
know the value of ρ.
But we already know how to obtain an estimator of ρ:
simply regress the OLS residuals on their lagged values
and get ρ̂.
Feasible GLS (FGLS) Estimation with AR(1) Errors:
1. Run the OLS regression of yt on xt1, . . . , xtk and obtain the
residuals ût, t = 1, 2, . . . , n.
2. Run the regression of ût on ût−1 to obtain the estimate ρ̂.
3. Run the OLS regression

ỹt = β0 x̃t0 + β1 x̃t1 + . . . + βk x̃tk + errort,

where x̃t0 = (1 − ρ̂) and x̃tj = xtj − ρ̂xt−1,j for t ≥ 2, and
x̃t0 = (1 − ρ̂²)^(1/2) and x̃1j = (1 − ρ̂²)^(1/2) x1j for t = 1
(ỹt is transformed analogously).

15 / 23
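In statsmodels, GLSAR implements this FGLS idea, iterating between estimating ρ from the residuals and re-running the transformed regression (Cochrane-Orcutt-style); a sketch on the simulated data:

```python
# GLSAR with one autoregressive lag (rho=1 means AR order 1, not its value).
X1 = sm.add_constant(x)
fgls = sm.GLSAR(y, X1, rho=1)
fgls_res = fgls.iterative_fit(maxiter=8)
print("rho-hat:", fgls.rho, "coefficients:", fgls_res.params)
```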
Feasible GLS Estimation with AR(1) Errors
GLS is BLUE under Assumptions 1 – 5, and we can use the t
and F tests from the transformed equation for inference.
These tests are asymptotically valid if Assumptions 1 – 5 hold
in the transformed model (along with stationarity and weak
dependence in the original variables).
Distributions conditional on X are exact (with minimum
variance) if Assumption 6 holds for εt.
This FGLS estimator is called the Prais-Winsten estimator.
If we just omit the first equation (t = 1), it is called the
Cochrane-Orcutt estimator.
FGLS estimators are not unbiased, but they are consistent.
Asymptotically, both procedures are the same, and FGLS is
more efficient than OLS.
This method can be extended to higher order serial
correlation, AR(q), in the error term.
16 / 23
Serial Correlation-Robust Standard Errors

Problem: If the regressors are not strictly exogenous,
FGLS is no longer consistent.
If strict exogeneity does not hold, it is possible to calculate
serial correlation (and heteroskedasticity) robust standard
errors for the OLS estimates. We know that OLS will be
inefficient.
The idea is to scale the OLS standard errors to take the serial
correlation into account.

17 / 23
Serial Correlation-Robust Standard Errors cont.
Estimate the model with OLS to obtain the residuals ût, σ̂, and
the usual standard errors “se(β̂1)”, which are incorrect.
Run the auxiliary regression of xt1 on xt2, xt3, . . . , xtk (with a
constant) and get the residuals r̂t.
For a chosen integer g > 0 (typically the integer part of n^(1/4)):

$$\hat{\nu} = \sum_{t=1}^{n}\hat{a}_t^2 + 2\sum_{h=1}^{g}\left[1 - h/(g+1)\right]\left(\sum_{t=h+1}^{n}\hat{a}_t\hat{a}_{t-h}\right),$$

where ât = r̂t ût, t = 1, 2, . . . , n.

Serial Correlation-Robust Standard Error:

$$se(\hat{\beta}_1) = \left[\text{“}se(\hat{\beta}_1)\text{”}/\hat{\sigma}\right]^2\sqrt{\hat{\nu}}$$

Similarly for β̂j.
SC-robust standard errors can behave poorly in small samples
in the presence of large serial correlation.
18 / 23
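In practice this estimator corresponds to Newey-West (HAC) standard errors; a sketch reusing y, X1, and n from above, with g taken as the integer part of n^(1/4):

```python
# HAC (Newey-West) covariance; "maxlags" is the truncation lag g.
g = int(n ** 0.25)
hac_res = sm.OLS(y, X1).fit(cov_type="HAC", cov_kwds={"maxlags": g})
print("SC-robust standard errors:", hac_res.bse)
```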
Heteroskedasticity in Time Series Regressions
OLS estimators are unbiased (under Ass. 1-3) and
consistent (under Ass. 1A-3A).
OLS inference is invalid if Ass. 4 (homoskedasticity) fails.
Heteroskedasticity-robust statistics can easily be derived in
the same manner as for cross-sectional data (if Ass.
1A, 2A, 3A and 5A hold).
However, we know that in small samples these robust
standard errors may be large ⇒ we want to test for
heteroskedasticity.
We can use the same tests as in the cross-sectional case,
but we need to have no serial correlation in the errors.
Also, for the Breusch-Pagan test, where we specify
u²t = δ0 + δ1 xt1 + . . . + δk xtk + νt and test
H0 : δ1 = δ2 = . . . = δk = 0, we need νt to be
homoskedastic and serially uncorrelated.
If we find heteroskedasticity, we can use heteroskedasticity-
robust statistics.
19 / 23
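A sketch of the Breusch-Pagan test via statsmodels, reusing the OLS residuals and the design matrix X1 from above:

```python
from statsmodels.stats.diagnostic import het_breuschpagan

# Regresses squared residuals on the regressors; returns the LM statistic,
# its p-value, and the F-test version.
bp_lm, bp_lm_pval, bp_f, bp_f_pval = het_breuschpagan(res.resid, X1)
print("BP LM:", bp_lm, "p-value:", bp_lm_pval)
```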
Autoregressive Conditional Heteroskedasticity
Many times, we find a dynamic form of heteroskedasticity
in economic data.
We can have E[u²t |X] = Var(ut |X) = Var(ut) = σ², but
still:
E[u²t |X, ut−1, ut−2, . . .] = E[u²t |X, ut−1] = α0 + α1 u²t−1.
Thus u²t = α0 + α1 u²t−1 + νt, where
E[νt |X, ut−1, ut−2, . . .] = 0.
Engle (1982) suggested looking at the conditional variance of
ut given past errors - the autoregressive conditional
heteroskedasticity (ARCH) model.
So even when the errors are not correlated (Ass. 5 holds),
their squares can be correlated.
OLS is still BLUE with ARCH errors, and inference is valid if
Ass. 6 (normality) holds.
Even if normality does not hold, OLS inference is
asymptotically valid under Ass. 1A – 5A, and we can still have
ARCH effects.
20 / 23
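Engle's ARCH LM test (squared residuals regressed on their own lags) is available in statsmodels; a sketch with one lag, reusing the residuals from the first example:

```python
from statsmodels.stats.diagnostic import het_arch

# ARCH(1) LM test on the OLS residuals; returns the LM statistic, its
# p-value, and the F-test version.
arch_lm, arch_lm_pval, arch_f, arch_f_pval = het_arch(res.resid, nlags=1)
print("ARCH LM:", arch_lm, "p-value:", arch_lm_pval)
```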
Autoregressive Conditional Heteroskedasticity
cont.

So why do we need to care about ARCH errors?
Because we can obtain asymptotically more efficient
estimators than OLS.
Details will be provided in Master's-level courses, not at the
Bachelor's level.
ARCH models have become important in empirical finance, as
they capture time-varying volatility in stock markets.
Rob Engle received a Nobel Prize in 2003 for this work. An
example of stock market returns is on the next slide.

21 / 23
Autoregressive Conditional Heteroskedasticity
cont.
[Figure: top panel - prices of the DJI stock market index (2000-2011);
bottom panel - returns of the DJI stock market index (2000-2011).]

22 / 23
Thank you

23 / 23
