
November 24, 2017

ECON 301: ECONOMETRICS I


Assignment 7 Answer Key
End-of-Chapter 8 Questions
2. We divide everything by inc to get a homoscedastic error term:

beer/inc = β0(1/inc) + β1 + β2(price/inc) + β3(educ/inc) + β4(female/inc) + u/inc

Observe that Var(u/inc | X) = (1/inc²)·Var(u | X) = σ², since the problem assumes Var(u | X) = σ²·inc².
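The effect of the transformation can be checked with a short simulation (a sketch, not part of the text; the income distribution and σ = 2 are arbitrary choices):

```python
import random

random.seed(0)
sigma, n = 2.0, 100_000

# Assume Var(u | inc) = sigma^2 * inc^2, as in the problem.
inc = [random.uniform(1.0, 10.0) for _ in range(n)]
u = [random.gauss(0.0, sigma * x) for x in inc]       # heteroskedastic error
u_tilde = [ui / xi for ui, xi in zip(u, inc)]         # transformed error u/inc

# The transformed error has constant variance sigma^2, here 4.
var_tilde = sum(e * e for e in u_tilde) / n
```

The sample variance of u/inc comes out near σ² = 4 regardless of how income is distributed, while the variance of the raw u grows with inc².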

4. (i) These coefficients have the anticipated signs. If a student takes courses in which grades are, on average, higher – as reflected in a higher crsgpa – then his or her term GPA is expected to be higher. The better the student has done in the past – as measured by cumgpa – the better the student does (on average) in the current semester. Finally, tothrs is a measure of experience, and its positive coefficient indicates an increasing return to experience.

The t statistic for crsgpa is very large, over five using the usual standard error (which is the larger of the two). Using the robust standard error for cumgpa, its t statistic is about 2.61, which
is also significant at the 5% level. The t statistic for tothrs is only about 1.17 using either
standard error, so it is not significant at the 5% level.

(ii) This is easiest to see without other explanatory variables in the model. If crsgpa were the only explanatory variable, H0: βcrsgpa = 1 means that, without any other information about the student, the best predictor of term GPA is the average GPA in the student's courses; this holds essentially by definition. (The intercept would be zero in this case.) With additional explanatory variables it is not necessarily true that βcrsgpa = 1, because crsgpa could be correlated with characteristics of the student. For example, perhaps the courses students take are influenced by ability – as measured by test scores – and past college performance. It is still interesting, though, to test this hypothesis.

The t statistic using the usual standard error is t = (.900 – 1)/.175 ≈ –0.57; using the heteroskedasticity-robust standard error gives t ≈ –0.60. In either case we fail to reject the null at any reasonable significance level, certainly including 5%.
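The arithmetic can be verified directly; the inputs are the point estimate and usual standard error reported above, and the statistic is negative in sign because the estimate is below one:

```python
# t statistic for H0: beta_crsgpa = 1, using the reported point
# estimate (.900) and the usual standard error (.175).
beta_hat, se_usual = 0.900, 0.175
t_usual = (beta_hat - 1.0) / se_usual
print(round(t_usual, 2))  # -0.57
```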

(iii) The in-season effect is given by the coefficient on season, which implies that, other things
equal, an athlete’s GPA is about .16 points lower when his/her sport is competing. The t statistic
using the usual standard error is about –1.60, while that using the robust standard error is about
–1.96. Against a two-sided alternative, the t statistic using the robust standard error is just
significant at the 5% level (the standard normal critical value is 1.96), while using the usual
standard error, the t statistic is not quite significant at the 10% level (cv = 1.65). Thus, the standard error used makes a difference in this case. This example is somewhat unusual, as the robust standard error is more often the larger of the two.
6. (i) The proposed test is a hybrid of the BP and White tests. There are k + 1 regressors: the k original explanatory variables and the squared fitted values. So, the number of restrictions tested is k + 1, and this is the numerator df. The denominator df is n - (k + 2) = n - k - 2.

(ii) For the BP test, this is easy: the hybrid test has an extra regressor, so the R-squared will be no
less for the hybrid test than for the BP test. For the special case of the White test, the argument is
a bit more subtle. In regression (8.20), the fitted values are a linear function of the regressors
(where, of course, the coefficients in the linear function are the OLS estimates). So, we are
putting a restriction on how the original explanatory variables appear in the regression. This
means that the R-squared from (8.20) will be no greater than the R-squared from the hybrid
regression.
(iii) No. Let R² denote the R-squared from the auxiliary regression of û². The F statistic for joint significance of the regressors depends on R²/(1 − R²), and it is true that this ratio increases as R² increases. But the F statistic also depends on the df, and the df are different across the three tests: the BP test, the special case of the White test, and the hybrid test. So we do not know which test will deliver the smallest p-value.
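A numerical sketch makes the point concrete (the R-squared values and sample size below are hypothetical, chosen only to illustrate the formula): a larger auxiliary R-squared need not produce a larger F statistic once the df differ.

```python
def f_stat(r2: float, q: int, n: int) -> float:
    """F statistic for joint significance of q regressors in an
    auxiliary regression, with denominator df = n - q - 1."""
    return (r2 / q) / ((1.0 - r2) / (n - q - 1))

# Hypothetical auxiliary R-squared values with n = 100, k = 5:
n, k = 100, 5
f_bp     = f_stat(0.060, k, n)      # BP test: q = k
f_white  = f_stat(0.055, 2, n)      # special-case White test: q = 2
f_hybrid = f_stat(0.065, k + 1, n)  # hybrid test: q = k + 1
print(f_bp, f_white, f_hybrid)
```

Here the hybrid test has the largest R-squared but the smallest F statistic, so no ordering of p-values is guaranteed.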

C1. (i) The assumption that the variance of u given all explanatory variables depends only on gender is Var(u | totwrk, educ, age, yngkid, male) = Var(u | male) = δ0 + δ1male. Then the variance for women is simply δ0 and that for men is δ0 + δ1; the difference in variances is δ1.

(ii)
. reg sleep totwrk educ age agesq yngkid male

Source | SS df MS Number of obs = 706


-------------+---------------------------------- F(6, 699) = 16.30
Model | 17092058.6 6 2848676.43 Prob > F = 0.0000
Residual | 122147777 699 174746.462 R-squared = 0.1228
-------------+---------------------------------- Adj R-squared = 0.1152
Total | 139239836 705 197503.313 Root MSE = 418.03

------------------------------------------------------------------------------
sleep | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
totwrk | -.1634235 .0181634 -9.00 0.000 -.1990848 -.1277622
educ | -11.71327 5.871952 -1.99 0.046 -23.24205 -.1844947
age | -8.697402 11.32909 -0.77 0.443 -30.94053 13.54572
agesq | .1284415 .1346696 0.95 0.341 -.1359638 .3928469
yngkid | -.0228006 50.27641 -0.00 1.000 -98.73367 98.68807
male | 87.75455 34.66794 2.53 0.012 19.68877 155.8203
_cons | 3840.852 239.4139 16.04 0.000 3370.795 4310.909
------------------------------------------------------------------------------

. predict resid, residuals

. gen residsq=resid*resid
. reg residsq male

Source | SS df MS Number of obs = 706


-------------+---------------------------------- F(1, 704) = 1.12
Model | 1.4430e+11 1 1.4430e+11 Prob > F = 0.2909
Residual | 9.0942e+13 704 1.2918e+11 R-squared = 0.0016
-------------+---------------------------------- Adj R-squared = 0.0002
Total | 9.1086e+13 705 1.2920e+11 Root MSE = 3.6e+05

------------------------------------------------------------------------------
residsq | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
male | -28849.63 27296.51 -1.06 0.291 -82441.94 24742.69
_cons | 189359.2 20546.36 9.22 0.000 149019.8 229698.7
------------------------------------------------------------------------------
Because the coefficient on male is negative, the estimated variance is higher for women.

(iii) No. The t statistic on male is only about –1.06, which is not significant at even the 20% level
against a two-sided alternative.
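As a check, for a single restriction the F statistic is the square of the t statistic; the coefficient and standard error below are taken from the regression of residsq on male above:

```python
# With one restriction, F = t^2. Coefficient and standard error on
# male are from the auxiliary regression of residsq on male.
t_male = -28849.63 / 27296.51
f_from_t = t_male ** 2
print(round(t_male, 2), round(f_from_t, 2))  # -1.06 1.12
```

This matches the reported F(1, 704) = 1.12.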

C5. (i)
. reg sprdcvr

Source | SS df MS Number of obs = 553


-------------+---------------------------------- F(0, 552) = 0.00
Model | 0 0 . Prob > F = .
Residual | 138.119349 552 .250216212 R-squared = 0.0000
-------------+---------------------------------- Adj R-squared = 0.0000
Total | 138.119349 552 .250216212 Root MSE = .50022

------------------------------------------------------------------------------
sprdcvr | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
_cons | .5153707 .0212714 24.23 0.000 .473588 .5571534
------------------------------------------------------------------------------

The asymptotic t statistic for H0: μ = .5 is (.515 - .5)/.021 = .71, which is not significant at the
10% level, or even the 20% level.
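In a regression on a constant only, the reported standard error is just Root MSE/√n, which for a binary outcome is essentially √(p̂(1 − p̂)/n). Recomputing the t statistic from the unrounded output gives about .72, consistent with the rounded figure above:

```python
import math

# Standard error of the intercept in a constant-only regression:
# Root MSE / sqrt(n), using n and Root MSE from the output above.
n, root_mse = 553, 0.50022
se_cons = root_mse / math.sqrt(n)        # ~ .02127, as reported

# Asymptotic t statistic for H0: mu = .5
p_hat = 0.5153707
t = (p_hat - 0.5) / se_cons
print(round(se_cons, 4), round(t, 2))
```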

(ii)
. tab neutral

neutral | Freq. Percent Cum.


------------+-----------------------------------
0 | 518 93.67 93.67
1 | 35 6.33 100.00
------------+-----------------------------------
Total | 553 100.00
(iii)
. reg sprdcvr favhome neutral fav25 und25

Source | SS df MS Number of obs = 553


-------------+---------------------------------- F(4, 548) = 0.47
Model | .469633366 4 .117408341 Prob > F = 0.7597
Residual | 137.649716 548 .251185612 R-squared = 0.0034
-------------+---------------------------------- Adj R-squared = -0.0039
Total | 138.119349 552 .250216212 Root MSE = .50118

------------------------------------------------------------------------------
sprdcvr | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
favhome | .0345911 .0497192 0.70 0.487 -.0630724 .1322546
neutral | .117618 .0946631 1.24 0.215 -.068329 .3035651
fav25 | -.0234674 .0501925 -0.47 0.640 -.1220606 .0751258
und25 | .0178728 .0918841 0.19 0.846 -.1626154 .198361
_cons | .4895665 .0447585 10.94 0.000 .4016472 .5774858
------------------------------------------------------------------------------

. reg sprdcvr favhome neutral fav25 und25, robust

Linear regression Number of obs = 553


F(4, 548) = 0.48
Prob > F = 0.7489
R-squared = 0.0034
Root MSE = .50118

------------------------------------------------------------------------------
| Robust
sprdcvr | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
favhome | .0345911 .0498231 0.69 0.488 -.0632766 .1324588
neutral | .117618 .0931229 1.26 0.207 -.0653036 .3005396
fav25 | -.0234674 .0503824 -0.47 0.642 -.1224336 .0754988
und25 | .0178728 .0900634 0.20 0.843 -.1590389 .1947846
_cons | .4895665 .0448511 10.92 0.000 .4014655 .5776675
------------------------------------------------------------------------------
The variable neutral has by far the largest effect – if the game is played on a neutral court, the
probability that the spread is covered is estimated to be about .12 higher – and, except for the
intercept, its t statistic is the only t statistic greater than one in absolute value (about 1.24).

(iv) Under the null hypothesis, the response probability does not depend on any explanatory variables, which means neither the mean nor the variance depends on the explanatory variables: in the linear probability model, E(y|x) = p(x) and Var(y|x) = p(x)[1 − p(x)], so if p(x) is constant, both moments are constant.

(v)
. test favhome neutral fav25 und25

( 1) favhome = 0
( 2) neutral = 0
( 3) fav25 = 0
( 4) und25 = 0

F( 4, 548) = 0.48
Prob > F = 0.7489

The p-value is very large, so we fail to reject the null of homoskedasticity.


(vi) Based on these variables, it is not possible to predict whether the spread will be covered. The
explanatory power is very low, and the explanatory variables are jointly very insignificant. The
coefficient on neutral may indicate something is going on with games played on a neutral court,
but we would not want to bet money on it unless it could be confirmed with a separate, larger
sample.

C10. (i)
. reg e401k inc incsq age agesq male

Source | SS df MS Number of obs = 9,275


-------------+---------------------------------- F(5, 9269) = 192.96
Model | 208.430869 5 41.6861738 Prob > F = 0.0000
Residual | 2002.39458 9,269 .216031349 R-squared = 0.0943
-------------+---------------------------------- Adj R-squared = 0.0938
Total | 2210.82544 9,274 .238389632 Root MSE = .46479

------------------------------------------------------------------------------
e401k | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
inc | .0124464 .0005929 20.99 0.000 .0112843 .0136086
incsq | -.0000616 4.73e-06 -13.03 0.000 -.0000709 -.0000524
age | .0265061 .0039225 6.76 0.000 .0188173 .034195
agesq | -.0003053 .000045 -6.78 0.000 -.0003935 -.000217
male | -.0035328 .012084 -0.29 0.770 -.0272202 .0201545
_cons | -.5062895 .0810961 -6.24 0.000 -.6652556 -.3473233
------------------------------------------------------------------------------

. reg e401k inc incsq age agesq male, robust

Linear regression Number of obs = 9,275


F(5, 9269) = 209.32
Prob > F = 0.0000
R-squared = 0.0943
Root MSE = .46479

------------------------------------------------------------------------------
| Robust
e401k | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
inc | .0124464 .0006003 20.73 0.000 .0112697 .0136232
incsq | -.0000616 5.00e-06 -12.32 0.000 -.0000715 -.0000518
age | .0265061 .0038235 6.93 0.000 .0190113 .034001
agesq | -.0003053 .0000438 -6.98 0.000 -.000391 -.0002195
male | -.0035328 .0120525 -0.29 0.769 -.0271583 .0200927
_cons | -.5062895 .0785541 -6.45 0.000 -.6602728 -.3523061
------------------------------------------------------------------------------
There are no important differences; the robust standard errors are sometimes slightly smaller, sometimes slightly larger.

(ii) This is a general claim. Since Var(y|x) = p(x)[1 − p(x)], we can write E(u²|x) = p(x) − p²(x). Written in error form, u² = p(x) − p²(x) + v. In other words, we can write this as a regression model u² = δ0 + δ1p(x) + δ2p²(x) + v, with the restrictions δ0 = 0, δ1 = 1, and δ2 = −1. Remember that, for the LPM, the fitted values ŷi are estimates of p(xi) = β0 + β1xi1 + … + βkxik. When we run the regression of ûi² on ŷi and ŷi², the intercept estimate should be close to zero, the coefficient on ŷi should be close to one, and the coefficient on ŷi² should be close to −1. Moreover, the estimates converge to these values as the sample size grows, by the law of large numbers.
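This convergence can be illustrated with a stdlib-only simulation (a sketch; the probability function p(x) = .2 + .05x is an arbitrary choice, and for simplicity the regression uses the true p(x) rather than fitted values, which is the limit the estimates approach):

```python
import random

random.seed(1)
n = 100_000

# Simulate a linear probability model with p(x) = .2 + .05x, x in {0,...,10}.
xs = [random.randint(0, 10) for _ in range(n)]
ps = [0.2 + 0.05 * x for x in xs]
ys = [1.0 if random.random() < p else 0.0 for p in ps]

# u = y - p(x); the theory says E(u^2 | x) = p(x) - p(x)^2.
usq = [(y - p) ** 2 for y, p in zip(ys, ps)]

# OLS of u^2 on p and p^2 via the 3x3 normal equations.
X = [[1.0, p, p * p] for p in ps]
XtX = [[sum(row[i] * row[j] for row in X) for j in range(3)] for i in range(3)]
Xty = [sum(row[i] * u for row, u in zip(X, usq)) for i in range(3)]

def solve3(A, b):
    """Solve a 3x3 linear system by Gaussian elimination with pivoting."""
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, 3):
            f = M[r][col] / M[col][col]
            M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    x = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        x[i] = (M[i][3] - sum(M[i][j] * x[j] for j in range(i + 1, 3))) / M[i][i]
    return x

d0, d1, d2 = solve3(XtX, Xty)
print(round(d0, 2), round(d1, 2), round(d2, 2))  # near 0, 1, and -1
```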
(iii)
. predict fitted
(option xb assumed; fitted values)

. predict u, residuals

. gen fittedsq=fitted*fitted

. gen usq=u*u

. reg usq fitted fittedsq

Source | SS df MS Number of obs = 9,275


-------------+---------------------------------- F(2, 9272) = 310.32
Model | 14.7106003 2 7.35530013 Prob > F = 0.0000
Residual | 219.765799 9,272 .023702092 R-squared = 0.0627
-------------+---------------------------------- Adj R-squared = 0.0625
Total | 234.476399 9,274 .0252832 Root MSE = .15395

------------------------------------------------------------------------------
usq | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
fitted | 1.009682 .057717 17.49 0.000 .8965443 1.12282
fittedsq | -.9702863 .069728 -13.92 0.000 -1.106968 -.8336041
_cons | -.0090334 .0109145 -0.83 0.408 -.0304283 .0123615
------------------------------------------------------------------------------

. test fitted fittedsq

( 1) fitted = 0
( 2) fittedsq = 0

F( 2, 9272) = 310.32
Prob > F = 0.0000

With an F equal to 310, the test produces extremely significant results. The parameter estimates are
quite close to what we expect to find from the theory in part (ii).
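As a cross-check, the F statistic can be recovered (up to rounding of the reported R-squared) from the auxiliary regression output:

```python
# F = (R^2/q) / ((1 - R^2)/df), with q = 2 restrictions and
# denominator df = 9272, both from the regression of usq above.
r2, q, df = 0.0627, 2, 9272
f = (r2 / q) / ((1.0 - r2) / df)
print(round(f, 1))  # close to the reported 310.32
```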
