Econ321 2017 Tutorial 2 Lab


ECON321: Advanced Econometrics

1st Semester 2017


Computer lab
Week 3

In the computer lab in Week 2, we learned how to read in a prepared dataset in STATA, compute descriptive statistics, estimate a regression model and perform various statistical tests. This week we return to this original dataset and continue this discussion. We’ll end with a non-computer-based question.

• We need to get back into STATA. Once it’s up and running, change the default directory to the one that contains the dataset that you’ll read into STATA. On the command line type:

cd j:\data

This will change the working directory to j:\data, where our data are stored.
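If you want to confirm that the change worked, the pwd command displays the current working directory:

pwd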

• Read in this dataset by typing:

insheet using labour.csv, clear
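It never hurts to check what came in. For example, describe lists the variable names and storage types, and summarize gives basic descriptive statistics; neither command changes the data:

describe
summarize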

• These are again time series data for the New Zealand economy over this 92-quarter period (23 years). Tell STATA that these are time series data by typing:

generate qtr=obs+103
tsset qtr, quarterly
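To confirm that the declaration worked, you can type tsset again with no arguments (it redisplays the time variable and sample range), or list the first few values of obs and qtr:

tsset
list obs qtr in 1/4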

• Begin with the simplest two-variable regression model we estimated last time:

regress lfpr ur

You should again get this output:



Source SS df MS Number of obs = 92


F( 1, 90) = 299.26
Model 173.761767 1 173.761767 Prob > F = 0.0000
Residual 52.2568688 90 .580631875 R-squared = 0.7688
Adj R-squared = 0.7662
Total 226.018636 91 2.48372127 Root MSE = .76199

lfpr Coef. Std. Err. t P>|t| [95% Conf. Interval]

ur -.6635147 .0383552 -17.30 0.000 -.7397139 -.5873155


_cons 69.86231 .2468887 282.97 0.000 69.37182 70.3528

• Since this is a time series regression, we might want to test for serial correlation in the disturbances. Unlike other software packages, STATA does not automatically produce the Durbin-Watson d statistic. You can ask for it by typing:

estat dwatson

You should get the following:

Durbin-Watson d-statistic( 2, 92) = .2696924

Recall that the DW statistic tests the null hypothesis of no autocorrelation against a first-order autoregressive (AR(1)) alternative:

$u_t = \rho u_{t-1} + \varepsilon_t$

where $\rho$ is the coefficient of autocovariance. The DW statistic is:

$d \approx 2(1 - \rho)$

$d = 2$ (i.e., $\rho = 0$) indicates an absence of autocorrelation.
$d < 2$ (i.e., $\rho > 0$) indicates positive autocorrelation (minimum value $d = 0$ if $\rho = 1$).
$d > 2$ (i.e., $\rho < 0$) indicates negative autocorrelation (maximum value $d = 4$ if $\rho = -1$).

In this example, we have severe positive autocorrelation (i.e., we can comfortably reject the null hypothesis of no AR(1) autocorrelation).
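As a rough worked check (this uses the approximation above, not an exact calculation), the reported d statistic implies a first-order autocorrelation of about

$\hat{\rho} \approx 1 - \frac{d}{2} = 1 - \frac{0.2697}{2} \approx 0.865,$

which is consistent with the very high estimate of $\rho$ that the Prais-Winsten procedure produces later in this lab.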

NOTE: Unlike t and F-tests, there is no single critical value for this test procedure. This is because the distribution of the test statistic depends on the explanatory variables in the regression. Instead, we get upper and lower thresholds (refer to a conventional DW table). If the d statistic is sufficiently far from 2, we reject the null in favour of positive or negative autocorrelation. If it is close to 2, we do not reject the null. Between these thresholds, the test is inconclusive.

An alternative test procedure has been suggested (Durbin (1970), Econometrica, 38(3): 422-429), which results in a chi-square test and produces a P-value. Type:

estat durbinalt

Durbin's alternative test for autocorrelation

lags(p) chi2 df Prob > chi2

1 250.634 1 0.0000

H0: no serial correlation

The default is a test against AR(1). This says that we can reject H0 (see this
listed at the bottom of this table) at better than a 0.01% level.

This procedure has another advantage: it allows us to test for higher orders of autocorrelation. For example, we could test against AR(2),

$u_t = \rho_1 u_{t-1} + \rho_2 u_{t-2} + \varepsilon_t,$

by typing:
estat durbinalt, lags(2)

Durbin's alternative test for autocorrelation

lags(p) chi2 df Prob > chi2

2 255.493 2 0.0000

H0: no serial correlation

Or against AR(4),

$u_t = \rho_1 u_{t-1} + \rho_2 u_{t-2} + \rho_3 u_{t-3} + \rho_4 u_{t-4} + \varepsilon_t$:

estat durbinalt, lags(4)

Durbin's alternative test for autocorrelation

lags(p) chi2 df Prob > chi2

4 266.509 4 0.0000

H0: no serial correlation

All of these forms are possible, but let’s go with the simplest AR(1) form.

• Let’s return to the original regression model:

$lfpr_t = \beta_1 + \beta_2\, ur_t + u_t$

But now recognise that the disturbances likely follow an AR(1) process:

$u_t = \rho u_{t-1} + \varepsilon_t$

The usual OLS standard errors are incorrect, because they assume no
autocorrelation. It’s easy to motivate what we should do in this situation.

This is known as Infeasible Generalised Least Squares (GLS). Suppose we know the value of the parameter $\rho$.

Multiply the regression model lagged one period by $\rho$:

$\rho\, lfpr_{t-1} = \rho\beta_1 + \rho\beta_2\, ur_{t-1} + \rho u_{t-1}$

Subtract this from the original regression model:

$lfpr_t - \rho\, lfpr_{t-1} = \beta_1 - \rho\beta_1 + \beta_2\, ur_t - \rho\beta_2\, ur_{t-1} + u_t - \rho u_{t-1}$

$lfpr_t - \rho\, lfpr_{t-1} = \beta_1(1 - \rho) + \beta_2(ur_t - \rho\, ur_{t-1}) + \varepsilon_t$

or, writing $lfpr_t^{*} = lfpr_t - \rho\, lfpr_{t-1}$, $ur_t^{*} = ur_t - \rho\, ur_{t-1}$ and $\delta_1 = \beta_1(1 - \rho)$,

$lfpr_t^{*} = \delta_1 + \beta_2\, ur_t^{*} + \varepsilon_t$

This ‘quasi-differencing’ of the variables with $\rho$ would allow us to produce unbiased and efficient estimators for the coefficients and their standard errors using OLS: the disturbance term in the transformed model is serially uncorrelated.

This is infeasible GLS because $\rho$ is unknown; the quasi-differenced variables in the transformed model cannot be computed.

Feasible GLS uses exactly the same steps as above, but replaces the unknown $\rho$ with an estimator $\hat{\rho}$. This can come from the original OLS estimation of the regression model, using the residuals to estimate the coefficient of autocovariance:

$e_t = \hat{\rho}\, e_{t-1} + \varepsilon_t$

With an estimate of $\rho$ we can transform the data and get estimates for the $\beta$ coefficients:

$lfpr_t - \hat{\rho}\, lfpr_{t-1} = \beta_1(1 - \hat{\rho}) + \beta_2(ur_t - \hat{\rho}\, ur_{t-1}) + \varepsilon_t$
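If you want to see what this involves, here is a minimal by-hand sketch of a single feasible-GLS pass. The variable and scalar names (e, rho, lfpr_star, ur_star) are our own illustrative choices, and a single pass like this (which also drops the first observation) is only an approximation to the iterated procedure described next:

* step 1: OLS and residuals
quietly regress lfpr ur
predict e, residuals

* step 2: estimate rho by regressing the residual on its own lag (no constant)
quietly regress e L.e, noconstant
scalar rho = _b[L.e]

* step 3: quasi-difference the data and re-estimate by OLS
* (note: the constant in this regression estimates b1*(1 - rho), not b1 itself)
generate lfpr_star = lfpr - rho*L.lfpr
generate ur_star = ur - rho*L.ur
regress lfpr_star ur_star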

But this improvement in our estimates of the $\beta$ coefficients would also improve our estimates of the disturbances, and therefore of $\rho$. This produces an iterative process known as the Cochrane-Orcutt or Prais-Winsten iterative procedure. The process continues until the estimates of $\rho$ change by less than a predetermined value (i.e., they converge).
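One note on the command below: by default STATA's prais command uses the Prais-Winsten transformation, which keeps (a rescaled version of) the first observation. If you want the Cochrane-Orcutt variant, which simply drops the first observation, prais has a corc option:

prais lfpr ur, corc vce(robust)

Either way, the iterative logic is the same as described above.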

Here’s the command for the GLS procedure:

prais lfpr ur, vce(robust)


Iteration 0: rho = 0.0000
Iteration 1: rho = 0.9143
Iteration 2: rho = 0.9834
Iteration 3: rho = 0.9888
Iteration 4: rho = 0.9892
Iteration 5: rho = 0.9893
Iteration 6: rho = 0.9893
Iteration 7: rho = 0.9893

Prais-Winsten AR(1) regression -- iterated estimates

Linear regression Number of obs = 92


F( 2, 90) = 1804.20
Prob > F = 0.0000
R-squared = 0.8829
Root MSE = .34667

Semi-robust
lfpr Coef. Std. Err. t P>|t| [95% Conf. Interval]

ur -.1268939 .0913175 -1.39 0.168 -.308312 .0545242


_cons 68.03752 1.147238 59.31 0.000 65.75833 70.31671

rho .9892771

Durbin-Watson statistic (original) 0.269692


Durbin-Watson statistic (transformed) 2.267114

This quasi-differencing seems to have eliminated the positive autocorrelation (compare the DW d statistics from OLS and GLS; the latter is close to 2).

The final estimate of ρ is just slightly less than 0.99.

The estimate of the slope coefficient has declined substantially in magnitude. With OLS we got an estimate of -0.664, significantly different from zero at better than a 0.1% level. With GLS we get an estimate of -0.127 with a P-value of 16.8%, so it is not significantly different from zero at any conventional level. This casts some doubt on whether the unemployment rate really has a negative impact on participation.

• We could also consider a Distributed Lag version of this model. The labour force participation rate may depend on both the current and lagged unemployment rates.

These lags are easy to create with time series data. Type the following:

generate ur1=l.ur
generate ur2=l2.ur

This will create lags for the unemployment rate for one and two quarters,
respectively.

It’s always easy to check that you’ve done the right thing by listing these
variables:
. list ur ur1 ur2

ur ur1 ur2

1. 4.2 . .
2. 4.1 4.2 .
3. 4.1 4.1 4.2
4. 4.1 4.1 4.1
5. 4 4.1 4.1

6. 4.1 4 4.1
7. 4.1 4.1 4
8. 4.3 4.1 4.1
9. 4.8 4.3 4.1
10. 5.3 4.8 4.3

11. 6.2 5.3 4.8


12. 6.2 6.2 5.3
13. 7 6.2 6.2
14. 7.3 7 6.2
15. 7.2 7.3 7

16. 7.1 7.2 7.3


17. 7 7.1 7.2
18. 7.5 7 7.1
19. 7.9 7.5 7
20. 8.7 7.9 7.5

Note that the lagged values are missing for the first quarter (both ur1 and ur2) and for the second quarter (just ur2).
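Incidentally, because the data are tsset, you could also refer to these lags directly with time-series operators instead of creating new variables, for example:

regress lfpr ur L.ur L2.ur

We create ur1 and ur2 explicitly here because the test command at the end of this lab refers to them by name (the same operators should work in prais as well, but we have not relied on that).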

Now estimate this regression model:

$lfpr_t = \beta_1 + \beta_2\, ur_t + \beta_3\, ur_{t-1} + \beta_4\, ur_{t-2} + u_t$

using the Cochrane-Orcutt or Prais-Winsten iterative procedure for AR(1). Type:

prais lfpr ur ur1 ur2, vce(robust)

These regression results suggest that the unemployment rate lagged one quarter is particularly important for participation. On average, a one-percentage-point increase in the unemployment rate reduces the labour force participation rate in the following quarter by about 0.2 percentage points; the P-value on this lag is about 2.5%.

Iteration 0: rho = 0.0000


Iteration 1: rho = 0.9224
Iteration 2: rho = 0.9680
Iteration 3: rho = 0.9786
Iteration 4: rho = 0.9804
Iteration 5: rho = 0.9807
Iteration 6: rho = 0.9807
Iteration 7: rho = 0.9807
Iteration 8: rho = 0.9807
Iteration 9: rho = 0.9807

Prais-Winsten AR(1) regression -- iterated estimates

Linear regression Number of obs = 90


F( 4, 86) = 1503.66
Prob > F = 0.0000
R-squared = 0.9307
Root MSE = .34484

Semi-robust
lfpr Coef. Std. Err. t P>|t| [95% Conf. Interval]

ur -.0110442 .1139873 -0.10 0.923 -.2376434 .2155551


ur1 -.1978072 .0865464 -2.29 0.025 -.3698557 -.0257587
ur2 -.096394 .0975713 -0.99 0.326 -.2903594 .0975713
_cons 68.53783 1.023672 66.95 0.000 66.50284 70.57282

rho .980731

Durbin-Watson statistic (original) 0.241805


Durbin-Watson statistic (transformed) 2.295892

Finally, the long-term cumulative effect of the unemployment rate on participation is approximately -0.30525 (adding up the three estimated slope coefficients). We can reject the null hypothesis that this long-term effect is equal to zero at better than a 1% level:

test ur+ur1+ur2=0

( 1) ur + ur1 + ur2 = 0

F( 1, 86) = 9.86
Prob > F = 0.0023
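As a quick arithmetic check of the cumulative effect reported above, summing the three (rounded) slope estimates gives

$\hat{\beta}_2 + \hat{\beta}_3 + \hat{\beta}_4 \approx -0.0110 - 0.1978 - 0.0964 \approx -0.305.$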