
Solutions to Problem Set 5

Econometrics (30413)

Spring 2021

The dataset used for this problem set contains data on car accidents in US states, observed
over the period 1982–1988. The variables of interest are:

• fatal = mortality rate in car crashes;

• beertax = beer’s tax rate;

• unrate = unemployment rate;

• perincK = personal income in thousands of dollars;

• state = US state;

• year = year.

The mortality rate is the outcome (dependent) variable of interest, and we seek to fit the
following linear panel model
$$y_{it} = x_{it}\beta + \alpha_i + \varepsilon_{it}$$

Question 1

a) What is $\alpha_i$?

b) The following tables report the Fixed effects, Random effects and Between estimation
results. Are the estimates from the three models different?

c) Briefly explain the difference between the three models.

d) Which model would you choose?

Solution

a) $\alpha_i$ is the individual-specific, time-invariant component of the error term. It can be
treated as a fixed parameter, or as a random variable (for example, Normally distributed).
It captures unobserved individual heterogeneity.

b) The estimates differ across the three models because each one exploits a different source
of variation in the data.

c) The Fixed effects (within) estimator uses only the variation within each individual over
time. The Between estimator uses only the variation across individuals (the estimated model
is based on individual means). The Random effects model uses both sources of variation,
over time and across individuals: the estimator is a weighted combination of the FE and
Between estimators.
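
To make the within/between distinction concrete, here is a minimal Python sketch on a made-up toy panel (three units, three periods, one regressor). The numbers are hypothetical and do not come from the problem set's dataset, and the helper names `slope` and `within_between` are illustrative assumptions. In the toy data the individual effect grows with x, so the two estimators give very different slopes.

```python
from statistics import mean

def slope(x, y):
    """OLS slope of y on a single regressor x (intercept included)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def within_between(panel):
    """panel: dict mapping unit -> list of (x, y) pairs over time.
    Returns (within slope, between slope)."""
    x_w, y_w, x_bar, y_bar = [], [], [], []
    for obs in panel.values():
        xs = [x for x, _ in obs]
        ys = [y for _, y in obs]
        mx, my = mean(xs), mean(ys)
        # Within (FE): deviations from each unit's own time average
        x_w += [x - mx for x in xs]
        y_w += [y - my for y in ys]
        # Between: one observation per unit, the time average
        x_bar.append(mx)
        y_bar.append(my)
    return slope(x_w, y_w), slope(x_bar, y_bar)

# Hypothetical data generated as y = 1 * x + alpha_i, with alpha = 0, 10, 20
toy = {
    "A": [(1, 1), (2, 2), (3, 3)],
    "B": [(2, 12), (3, 13), (4, 14)],
    "C": [(3, 23), (4, 24), (5, 25)],
}
b_fe, b_be = within_between(toy)
print(b_fe, b_be)
```

Demeaning removes the individual effect, so the within slope recovers the coefficient used to build the toy data, while the between slope also picks up the correlation between x and the individual effect.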

d) An econometric evaluation could be based on the Hausman test, which would allow us to
assess whether $\alpha_i$ is a source of endogeneity. Since a Hausman test is not performed
here, it is not easy to choose among the three models, because they answer different questions.

In general, from a data-fitting perspective, if we are interested in differences across
individuals, the Between estimator should be chosen; if the focus is on variation over time,
the FE model should be chosen instead. If we are interested in explaining the overall
variation, RE is the best choice. However, looking at the $R^2$ values in the three outputs,
the fit is quite high in both the FE and RE cases, suggesting that the data contain both
individual and time variation: the RE model could be the right choice, since it accounts for
both of them.
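
One way to see how RE sits between the other estimators is the standard quasi-demeaning weight used by the feasible GLS random effects estimator in a balanced panel, θ = 1 − sqrt(σ²_ε / (σ²_ε + T σ²_α)). With θ = 0 the RE transformation leaves the data untouched (pooled OLS); as θ approaches 1 it approaches full demeaning (FE). The sketch below is only illustrative: the function name `re_theta` and the variance values are assumptions, and T = 7 is taken from the 1982–1988 sample period.

```python
from math import sqrt

def re_theta(sigma2_eps, sigma2_alpha, T):
    """Quasi-demeaning weight for RE in a balanced panel with T periods.
    RE regresses (y_it - theta * ybar_i) on (x_it - theta * xbar_i)."""
    return 1.0 - sqrt(sigma2_eps / (sigma2_eps + T * sigma2_alpha))

# Hypothetical variance components, not estimated from the problem set data
print(re_theta(1.0, 0.0, 7))   # no individual effect: pooled OLS
print(re_theta(1.0, 1.0, 7))   # equal variances, T = 7
print(re_theta(1.0, 100.0, 7)) # dominant individual effect: close to FE
```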

Question 2

a) How can you test if a Fixed Effects model is more appropriate than a Random Effects model?
Describe the test procedure: the null and alternative hypotheses, the test statistic, and its
distribution.

b) The output of the test is reported in the following table. Comment on it.

Solution

a) We can perform a Hausman test. In general, this test compares two estimators, exploiting
their different properties under the null and alternative hypotheses. In this case,

$$H_0 : E(X_i \alpha_i) = 0$$

$$H_1 : E(X_i \alpha_i) \neq 0$$

Under the null, both RE and FE are consistent, and RE is efficient. Under the alternative,
only FE is consistent.

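The test statistic is

$$H = (\hat\beta_{FE} - \hat\beta_{RE})' \left[\hat V(\hat\beta_{FE}) - \hat V(\hat\beta_{RE})\right]^{-1} (\hat\beta_{FE} - \hat\beta_{RE}) \overset{H_0}{\approx} \chi^2_K$$

and, under the null, it is approximately distributed as a $\chi^2$ with K degrees of freedom,
where in our case K = 3 (number of regressors).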
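
The statistic can be sketched in a few lines of pure Python. This is a minimal illustration under stated assumptions, not the routine that produced the output discussed in part b): the function names `solve`, `hausman` and `chi2_sf_3df` are hypothetical, the input numbers are invented, and the p-value helper uses the closed form of the χ² survival function that holds only for 3 degrees of freedom (the case K = 3 here).

```python
from math import erfc, exp, pi, sqrt

def solve(a, b):
    """Solve a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[piv][col]) < 1e-12:
            raise ValueError("variance difference is singular")
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (m[i][n] - s) / m[i][i]
    return z

def hausman(b_fe, b_re, v_fe, v_re):
    """H = d' [V_FE - V_RE]^{-1} d, with d = b_FE - b_RE."""
    d = [f - r for f, r in zip(b_fe, b_re)]
    v_diff = [[vf - vr for vf, vr in zip(rf, rr)]
              for rf, rr in zip(v_fe, v_re)]
    z = solve(v_diff, d)
    return sum(di * zi for di, zi in zip(d, z))

def chi2_sf_3df(x):
    """P(chi2 with 3 df > x); closed form valid for 3 df only."""
    return erfc(sqrt(x / 2)) + sqrt(2 * x / pi) * exp(-x / 2)

# Hypothetical coefficients and covariance matrices (K = 3)
b_fe, b_re = [1.0, 2.0, 2.0], [0.0, 0.0, 0.0]
v_fe = [[2.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 2.0]]
v_re = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
H = hausman(b_fe, b_re, v_fe, v_re)
p_value = chi2_sf_3df(H)
print(H, p_value)
```

In finite samples the estimated difference of the two covariance matrices is not guaranteed to be positive definite, in which case the statistic can be negative and should be interpreted with care.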

b) Given that the p-value associated with the test is approximately zero, we reject the null.
We can therefore conclude that the $\alpha_i$'s are correlated with at least one of the
regressors, that RE suffers from endogeneity, and hence that FE is a better choice than RE.
