
Semester One Final Deferred Examinations, 2017 ECON7310 Elements of Econometrics

Part A:
Answer ALL Questions on the Multiple Choice Answer Sheet.
Each Question is worth 3 marks (30 Marks Total):

1. Suppose we have estimated the regression model,


yi = β1 + β2 xi2 + · · · + βK xiK + ei
Let ŷi be the fitted value of yi for each i. Now, we estimate the artificial model,
yi = β1 + β2 xi2 + · · · + βK xiK + γ1 ŷi² + γ2 ŷi³ + vi
to test H0 : γ1 = γ2 = 0 against H1 : H0 is wrong. Choose the correct statement.
(a) H1 can be equivalently rewritten as H1 : γ1 ≠ 0 and γ2 ≠ 0.
(b) An F-test cannot be appropriate for testing H0.
(c) This test is called the Augmented Dickey-Fuller test.
(d) Rejection of H0 suggests that there can be omitted variables. ⇐ Answer
(e) None of the above is correct.
2. Suppose we are interested in predicting the price of a house using its interior
area. Consider the regression model,
Pricei = β1 + β2 SQFTi + γ1 POOLi + γ2 (POOLi × SQFTi) + ei,
where Pricei is the market price of house i in $1000, SQFTi is the interior area of
house i in square feet, and POOLi is 1 if house i has a pool and 0 otherwise. Choose
the correct statement about the regression model for houses with a pool.
(a) The intercept is β1 and the slope is β2 .
(b) The intercept is γ1 and the slope is γ2 .
(c) The intercept is (β1 + γ1 ) and the slope is β2 .
(d) The intercept is (β1 + γ1 ) and the slope is (β2 + γ2 ). ⇐ Answer
(e) None of the above is correct.
3. Suppose we are interested in predicting the price of a house using its interior
area. Consider the regression model,
Pricei = β1 + β2 SQFTi + γ1 POOLi + γ2 (POOLi × SQFTi) + ei,
where Pricei is the market price of house i in $1000, SQFTi is the interior area of
house i in square feet, and POOLi is 1 if house i has a pool and 0 otherwise. We
want to test whether the regression model for houses with a pool is equivalent to the
regression model for houses without a pool. Choose the correct statement.
(a) An appropriate null hypothesis is H0 : γ1 = 0.
(b) An appropriate null hypothesis is H0 : γ2 = 0.
(c) An appropriate null hypothesis is H0 : γ1 = γ2 = 0. ⇐ Answer
(d) The test is called RESET.
(e) None of the above is correct.


4. Consider the regression model,

yi = β1 + β2 xi2 + · · · + βK xiK + ei

where errors may be heteroskedastic. Choose the most incorrect statement.

(a) The OLS estimators are consistent and unbiased.


(b) We should report the OLS estimates with the robust standard errors.
(c) The Gauss-Markov theorem may not apply.
(d) The GLS cannot be used because we do not know the error variances in practice.
(e) We should take care of heteroskedasticity only if homoskedasticity is rejected. ⇐ Answer

5. Consider the regression model,

yt = β1 + β2 xt2 + · · · + βK xtK + et,
et = ρ et−1 + vt with −1 < ρ < 1,

where the vt's are independent random error terms with mean zero and variance σv².
Choose the wrong statement.

(a) OLS is unbiased, consistent, and efficient. ⇐ Answer
(b) We may use the OLS estimates with the HAC standard errors.
(c) The errors follow an AR(1) process if ρ ≠ 0.
(d) Var(et) = σv²/(1 − ρ²) and Cov(et, et−k) = ρ^k σv²/(1 − ρ²) for k > 0.
(e) The Breusch-Godfrey test considers H0 : ρ = 0.
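
Illustration (not part of the original exam): the AR(1) moments in option (d) can be checked numerically. The sketch below is only a sanity check under assumed values; the function names and the choices ρ = 0.6 and σv = 1 are hypothetical.

```python
import random

def ar1_autocov(rho, sigma_v2, k):
    """Theoretical Cov(e_t, e_{t-k}) = rho^k * sigma_v^2 / (1 - rho^2) for a stationary AR(1)."""
    if not -1 < rho < 1:
        raise ValueError("stationarity requires -1 < rho < 1")
    return rho ** k * sigma_v2 / (1 - rho ** 2)

def simulate_ar1(rho, sigma_v, n, seed=0, burn_in=500):
    """Simulate e_t = rho * e_{t-1} + v_t with v_t ~ N(0, sigma_v^2), dropping a burn-in."""
    rng = random.Random(seed)
    e, path = 0.0, []
    for t in range(n + burn_in):
        e = rho * e + rng.gauss(0.0, sigma_v)
        if t >= burn_in:
            path.append(e)
    return path

def sample_autocov(x, k):
    """Sample autocovariance at lag k (the mean is zero by construction in this model)."""
    return sum(x[t] * x[t - k] for t in range(k, len(x))) / len(x)

rho, sigma_v = 0.6, 1.0  # hypothetical values, for illustration only
path = simulate_ar1(rho, sigma_v, n=100_000)
for k in range(3):
    # theoretical value next to its simulated counterpart
    print(k, round(ar1_autocov(rho, sigma_v ** 2, k), 4), round(sample_autocov(path, k), 4))
```

With k = 0 the formula reduces to Var(et) = σv²/(1 − ρ²), so the first printed row compares the theoretical and simulated variances.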

6. For each observation i = 1, · · · , n, the regressor xi,2 indicates “female,” and xi,3
indicates lipstick users. From the data, it is confirmed that xi,2 and xi,3 are similar, but
not the same. That is, in the sample, there are some women not using a lipstick, and
there are some men using a lipstick. Now, suppose that an econometrician regresses
yi on all the regressors including xi,2 and xi,3 and finds that the standard errors of
the OLS estimates are large. Choose the wrong statement.

(a) This is called near multicollinearity.
(b) One should find a way to reduce the standard errors to have significant estimates. ⇐ Answer
(c) The large standard errors arise naturally from the difficulty of disentangling the
effect of xi,2 on yi from the effect of xi,3 on yi.
(d) The OLS is unbiased and consistent.
(e) None of the above is wrong.


7. In order to estimate the wage equation, an econometrician regresses the log of wage
on individual’s observed characteristics including years of schooling. In this problem,
however, the error term contains unobserved characteristics such as motivation, and it
is likely that the error is correlated with the years of schooling, i.e., a highly motivated
person tends to study more and also make more money. Choose the wrong statement.

(a) The years of schooling is endogenous.


(b) The OLS estimator is still consistent and asymptotically normal. ⇐ Answer
(c) The OLS estimator is not unbiased.
(d) The IV estimator can be used to estimate the coefficients.
(e) None of the above is wrong.

8. An econometrician regressed hamburger sales on the hamburger price as well
as the expenditure on advertising and its square. She obtained the following fitted
regression equation,

$$\widehat{Sales}_i = 109.719 - 7.640\,Price_i + 12.151\,Advert_i - 2.768\,Advert_i^2$$

Economic theory says the firm should increase advertising expenditure to the point
where an extra $1 of expenditure results in an extra $1 of sales (i.e., marginal cost =
marginal revenue). Choose the wrong statement.

(a) The term Advert² captures some nonlinearity.
(b) This is an example of the (estimated) multivariate linear regression model.
(c) The estimated marginal revenue is 12.151 + 2(−2.768)Advert.
(d) The estimated optimal level of advertising is approximately 2.014.
(e) A t-test can be used to test the hypothesis that advertising does not affect sales. ⇐ Answer
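
Illustration (not part of the original exam): options (c) and (d) follow from setting the estimated marginal revenue equal to a marginal cost of 1 and solving for Advert. A minimal sketch, using the coefficients quoted in the question; the function names are hypothetical.

```python
def marginal_revenue(advert, b_advert=12.151, b_advert_sq=-2.768):
    """Derivative of fitted sales with respect to Advert: b_advert + 2 * b_advert_sq * Advert."""
    return b_advert + 2.0 * b_advert_sq * advert

def optimal_advert(b_advert=12.151, b_advert_sq=-2.768, marginal_cost=1.0):
    """Solve b_advert + 2 * b_advert_sq * A = marginal_cost for A."""
    return (marginal_cost - b_advert) / (2.0 * b_advert_sq)

a_star = optimal_advert()
print(round(a_star, 3))          # 2.014, matching option (d)
print(marginal_revenue(a_star))  # equals the marginal cost of 1, up to rounding error
```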

9. The time series yt satisfies

E[yt] = µ < ∞
Var(yt) = σ² < ∞
Cov(yt, yt−s) = γs (the covariance depends on s, not t)

Then, the series yt is said to be:

(a) cross sectional data.


(b) stationary. ⇐ Answer
(c) reliable.
(d) unreliable.
(e) None of the above.


10. Consider the autoregressive model,

yt = θ0 + θ1 yt−1 + θ2 yt−2 + · · · + θp yt−p + vt

where the vt are independent random error terms with zero mean and variance σv².
Choose the wrong statement.

(a) This model can be denoted as AR(p).


(b) We often choose the value of p by hypothesis tests, residual analysis, information
criteria, and parsimony.
(c) OLS is not appropriate to estimate the autoregressive model given here. ⇐ Answer
(d) The forecast of yT+1 is given as

ŷT+1 = θ̂0 + θ̂1 yT + θ̂2 yT−1 + · · · + θ̂p yT+1−p

where θ̂0, θ̂1, . . . , θ̂p are estimates of the coefficients.
(e) The forecast of yT+2 is given as

ŷT+2 = θ̂0 + θ̂1 ŷT+1 + θ̂2 yT + · · · + θ̂p yT+2−p

where θ̂0, θ̂1, . . . , θ̂p are estimates of the coefficients.
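
Illustration (not part of the original exam): the forecasts in (d) and (e) are built recursively, with each later forecast reusing earlier forecasts in place of unobserved values. A minimal sketch of that recursion; the function name, the AR(2) coefficients and the data are hypothetical.

```python
def ar_forecast(history, theta, steps):
    """Iterated forecasts from an AR(p) model.

    history: observed y_1, ..., y_T (most recent last), at least p values
    theta:   [theta_0, theta_1, ..., theta_p], the estimated coefficients
    steps:   number of periods ahead to forecast
    """
    p = len(theta) - 1
    if len(history) < p:
        raise ValueError("need at least p observations")
    values = list(history)
    forecasts = []
    for _ in range(steps):
        # Most recent value first, so lags[j - 1] pairs with theta_j
        lags = values[::-1][:p]
        y_hat = theta[0] + sum(c * y for c, y in zip(theta[1:], lags))
        forecasts.append(y_hat)
        values.append(y_hat)  # the next step treats this forecast as the latest value
    return forecasts

theta_hat = [1.0, 0.5, 0.25]  # hypothetical AR(2) estimates
y = [2.0, 3.0, 4.0]           # hypothetical data, so y_T = 4.0 and y_{T-1} = 3.0
print(ar_forecast(y, theta_hat, steps=2))  # [3.75, 3.875]
```

The second forecast uses ŷT+1 = 3.75 as its first lag, which is exactly the substitution shown in option (e).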


Part B:
Short answer questions – answer all questions in the Answer Booklet.
Marks are as indicated. Total value: 70 marks

1. Data on the weekly sales of a major brand of canned tuna by a supermarket chain in
a large mid-western US city during a mid-1990s calendar year are contained in the
file tuna.dta. The file contains 52 observations on the following variables:

sal1 = unit sales of brand no. 1 of canned tuna,
apr1 = price per can of brand no. 1 of canned tuna,
apr2, apr3 = price per can of brands nos. 2 and 3 of canned tuna,
disp = 1 if there is a store display for brand 1 but no newspaper ad, 0 otherwise,
dispad = 1 if there is a store display and a newspaper ad for brand 1, 0 otherwise.

We estimated the log-linear model:

log(sal1) = β1 + β2 apr1 + β3 apr2 + β4 apr3 + β5 disp + β6 dispad + e

and obtained the following Stata output.

(a) To see whether this model has omitted variables, we ran the following command,

estat ovtest

and obtained a p-value of 0.057. State precisely the null hypothesis associated
with this test (5 marks) and decide whether to reject the null hypothesis at a
significance level of your choice (5 marks).
(Answer)
H0 : the model has no omitted variables (5 marks)
Since the p-value is larger than the conventional 1% and 5% significance levels,
we do not reject the null hypothesis at those levels. (5 marks)
If students say we accept the null hypothesis, take 1 mark off.
If students use a 10% level of significance or larger, they should reject the null
hypothesis. Mark accordingly.



(b) Discuss and interpret the estimate of β3. In particular, show, step by step, that
the marginal effect is given by

∂sal1/∂apr2 = β3 sal1

(5 marks). Then, explain the effect of a one-unit increase in the price of Brand
2 on sales using the Stata output. (5 marks)
(Answer)
We can rewrite the regression equation as

sal1 = exp(β1 + β2 apr1 + β3 apr2 + β4 apr3 + β5 disp + β6 dispad + e).

So,

∂sal1/∂apr2 = exp(β1 + β2 apr1 + β3 apr2 + β4 apr3 + β5 disp + β6 dispad + e) β3 = β3 sal1.

Therefore,

β3 = (∂sal1/∂apr2) / sal1.

Hence, a one-unit change in apr2 leads to a change of approximately β3 × 100% in
sal1. The estimate says that a one-unit increase in the price of Brand 2 will
lead to an increase in sales of approximately 115%.
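
Illustration (not part of the original exam): the identity ∂sal1/∂apr2 = β3 sal1 can be checked with a finite-difference derivative. The sketch below uses made-up numbers throughout; `rest_of_index` is a hypothetical stand-in for all terms of the regression other than β3 apr2, and the value of β3 is assumed, not taken from the Stata output.

```python
import math

def sales(apr2, beta3, rest_of_index):
    """sal1 = exp(rest_of_index + beta3 * apr2); rest_of_index collects every other term."""
    return math.exp(rest_of_index + beta3 * apr2)

def numerical_derivative(f, x, h=1e-6):
    """Central finite-difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

beta3, rest, apr2 = 1.15, 7.0, 0.8  # hypothetical values, for illustration only
slope = numerical_derivative(lambda p: sales(p, beta3, rest), apr2)
analytic = beta3 * sales(apr2, beta3, rest)
print(slope, analytic)  # the two numbers should agree closely
```

The β3 × 100% reading is a small-change approximation. For a full one-unit change the exact effect is [exp(β3) − 1] × 100%, which can differ noticeably from β3 × 100% when the coefficient is large.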

(c) What is the estimated percentage increase in sales from a display and no
advertisement? (5 marks)
(Answer)

sal1|disp=1 = exp(β1 + β2 apr1 + β3 apr2 + β4 apr3 + β5 + β6 dispad + e)
sal1|disp=0 = exp(β1 + β2 apr1 + β3 apr2 + β4 apr3 + β6 dispad + e)

(sal1|disp=1 − sal1|disp=0) / sal1|disp=0 = exp(β5) − 1

Hence, the estimated percentage increase in sales from a display and no
advertisement is

[exp(β̂5) − 1] × 100% = [exp(0.4237) − 1] × 100% = [1.5276 − 1] × 100% = 52.76%
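
Illustration (not part of the original exam): the arithmetic above can be reproduced in a few lines. The coefficient 0.4237 is the value quoted in the answer; the function name is hypothetical.

```python
import math

def dummy_pct_effect(coef):
    """Exact % change in the level of y when a dummy switches 0 -> 1 in a log-linear model."""
    return 100.0 * (math.exp(coef) - 1.0)

beta5_hat = 0.4237  # coefficient on disp quoted in the answer
print(round(dummy_pct_effect(beta5_hat), 2))  # 52.76
```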


2. Mroz (1987, Econometrica, 55:765-800) investigated the labour force participation of
married women. Consider the model

hours = β1 + β2 log(wage) + β3 educ + β4 age + β5 kidsl6 + β6 kids618 + β7 nwiffinc + e

where the errors are assumed to be independent N(0, σ²) random variables. The
variable nwiffinc = faminc − wage × hours. The variable lfp is a binary variable
that takes the value 1 if a woman worked and 0 if she did not. The variable exper is
the years of experience.

(a) Explain why log(wage) can be endogenous. (5 marks)


(Answer)
For this problem it is important for students to explain why log(wage) can be
correlated with e. There could be many possible stories in principle. But, as
long as students correctly understand under what circumstances the variable is
endogenous (nonzero correlation with e) and give a reasonable story, they should
be given full marks. If either element is missing, take off 2 marks.
For example, motivation is not included in the equation, but it affects hours
and is therefore part of e. Moreover, a highly motivated person tends to earn a
higher wage. So, it can be argued that log(wage) is correlated with e and is endogenous.

(b) Explain why exper and exper² can be valid instruments to correct the
endogeneity problem in (a). (5 marks)
(Answer)
Similarly, it is important for students to show they understand the conditions
for a valid IV. Apply the same marking rule as in (a). Years of experience may not be
correlated with motivation, but it is correlated with log(wage). Hence, it can be
an IV.

(c) After estimating the regression equation by 2SLS with exper and exper² as
instrumental variables, we ran the Stata command and obtained the results
shown below. Interpret the First-stage regression summary statistics: in
particular, what can you say about the instrumental variables? (5 marks)
(Answer)
We consider the null hypothesis that the instruments are uncorrelated with the
endogenous variable. The usual rule of thumb is to reject the null hypothesis if
F > 10. We do not reject H0 and we conclude that we do not have any strong
instruments. This means the IV estimator is relatively inefficient.
It is most important to know the meaning of the F-statistic, which measures the
strength of the IV, or equivalently the underlying null hypothesis. If students show
they understand this, they should get at least 3 marks. But if they use the F-statistic
incorrectly (for example, they could mistakenly say a small F implies a strong IV),
take off 2 marks. Not following the rule of thumb is fine. (For example, they could
say that since F is very close to 10, or the p-value is close to zero, the IV is strong enough.)


(d) The following computer log and output relate to a hypothesis test of
whether log(wage) is endogenous, using exper and exper² as instrumental
variables. Precisely state the null hypothesis (5 marks), explain the test procedure
step by step, and draw a conclusion about the endogeneity using the Stata
outputs (5 marks).
(Answer)
The null hypothesis is that log(wage) is uncorrelated with the error term, i.e.,
log(wage) is not endogenous (5 marks).
Testing the null hypothesis is equivalent to testing whether the coefficient of
vhat is zero. This is a standard t-test. The p-value is < 0.05, so we reject H0
and conclude that log(wage) is correlated with the error term
(i.e., we have an endogeneity problem). (5 marks)



3. The file mexican.dta contains data collected in 2001 from the transactions of 754
Mexican sex workers. There is information on four transactions per worker. The data
are a subset of that used by Gertler, Shah and Bertozzi, ‘Risky Business: The Market
for Unprotected Commercial Sex,’ Journal of Political Economy, (2005), 113, 518–550.
The labels id and trans are used to describe a particular
woman and a particular transaction. There are three categories of variables:

• Sex worker characteristics: age, attractive (indicator variable), school (if Year
12 or higher).

• Client characteristics: regular, rich, alcohol (=1 if consumed before). All
indicator variables.

• Transaction characteristics: lnprice (log of the price), nocondom, bar (=1 if
initiated at a bar), street (=1 if initiated in the street).

We estimated a model with lnprice as the dependent variable using client and trans-
action characteristics as independent variables and obtained the following Stata log.

(a) Test the null hypothesis that all individuals have the same intercept at the
5% level and interpret your test result (5 marks).
(Answer)
The associated F-statistic and p-value appear at the bottom of the output. Since
the p-value is very small, we reject H0. Hence, there are fixed effects (for some
individuals).

(b) Explain why sex worker characteristics should be omitted (5 marks).
(Answer)
Sex worker characteristics are omitted because they are time-invariant over the
period in which the 4 transactions took place. Their effect cannot be separated
from the individual effects given by the coefficients of the fixed-effects dummy
variables.

(c) nocondom can be argued to be an indicator of a risk premium (a higher price
might be paid for not wearing a condom). What is the estimate of the risk
premium? (5 marks)
(Answer)
The estimated risk premium for not using a condom is approximately 17% (log-
linear model). The exact estimate is 100(exp(0.170282) − 1)% = 18.6%

(d) Briefly explain the main message of the Stata log below (5 marks).

(Answer)
Since the p-values are smaller than 0.05, we reject the null hypothesis that the RE
estimates and the FE estimates are equal for each of nocondom, rich, regular,
and alcohol.


End of Examination

Some Formulas:
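Simple Regression A:

$$b_1 = \bar{y} - b_2 \bar{x} \quad\text{and}\quad b_2 = \frac{N \sum_i x_i y_i - \sum_i x_i \sum_i y_i}{N \sum_i x_i^2 - \left(\sum_i x_i\right)^2}$$

$$\mathrm{Var}(b_1) = \frac{\sigma^2 \sum_i x_i^2}{N \sum_i (x_i - \bar{x})^2}, \quad \mathrm{Var}(b_2) = \frac{\sigma^2}{\sum_i (x_i - \bar{x})^2}, \quad\text{and}\quad \mathrm{Cov}(b_1, b_2) = \frac{-\bar{x}\,\sigma^2}{\sum_i (x_i - \bar{x})^2}$$

$$\hat{y}_i = b_1 + b_2 x_i$$

$$\hat{\sigma}^2 = \frac{1}{N-2} \sum_i (y_i - \hat{y}_i)^2 = \frac{1}{N-2} \sum_i \hat{e}_i^2$$

$$\widehat{\mathrm{Var}}(b_1) = \frac{\hat{\sigma}^2 \sum_i x_i^2}{N \sum_i (x_i - \bar{x})^2} \quad\text{and}\quad se(b_1) = \sqrt{\widehat{\mathrm{Var}}(b_1)}$$

$$\widehat{\mathrm{Var}}(b_2) = \frac{\hat{\sigma}^2}{\sum_i (x_i - \bar{x})^2} \quad\text{and}\quad se(b_2) = \sqrt{\widehat{\mathrm{Var}}(b_2)}$$

$$b_k \pm t_c\, se(b_k) \quad\text{and}\quad t = \frac{b_k - c}{se(b_k)} \text{ for } k = 1, 2 \quad\text{and}\quad \Pr(|N(0,1)| > 1.96) = 0.05$$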
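
Illustration (not part of the original formula sheet): the simple-regression expressions for b1, b2, σ̂² and the standard errors can be applied directly to data. A minimal sketch; the function name and the five data points are hypothetical.

```python
import math

def simple_ols(x, y):
    """OLS for y = b1 + b2*x + e, computed term by term from the formula-sheet expressions."""
    n = len(x)
    if n < 3 or n != len(y):
        raise ValueError("need matching samples with at least 3 observations")
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    b2 = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    x_bar, y_bar = sx / n, sy / n
    b1 = y_bar - b2 * x_bar
    resid = [b - (b1 + b2 * a) for a, b in zip(x, y)]
    sigma2_hat = sum(e * e for e in resid) / (n - 2)   # SSE / (N - 2)
    ssx = sum((a - x_bar) ** 2 for a in x)             # sum of squared deviations of x
    se_b1 = math.sqrt(sigma2_hat * sxx / (n * ssx))
    se_b2 = math.sqrt(sigma2_hat / ssx)
    return {"b1": b1, "b2": b2, "sigma2_hat": sigma2_hat, "se_b1": se_b1, "se_b2": se_b2}

# Hypothetical data, for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
fit = simple_ols(x, y)
print(fit)
```

A t-ratio for H0 : β2 = c then follows as (fit["b2"] − c) / fit["se_b2"].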

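Simple Regression B:

$$\hat{y}_0 = b_1 + b_2 x_0 \quad\text{and}\quad f = y_0 - \hat{y}_0$$

$$\mathrm{Var}(f) = \sigma^2 \left[ 1 + \frac{1}{N} + \frac{(x_0 - \bar{x})^2}{\sum_i (x_i - \bar{x})^2} \right]$$

$$\widehat{\mathrm{Var}}(f) = \hat{\sigma}^2 \left[ 1 + \frac{1}{N} + \frac{(x_0 - \bar{x})^2}{\sum_i (x_i - \bar{x})^2} \right] \quad\text{and}\quad se(f) = \sqrt{\widehat{\mathrm{Var}}(f)} \quad\text{and}\quad \hat{y}_0 \pm t_c\, se(f)$$

$$\sum_i (y_i - \bar{y})^2 = \sum_i (\hat{y}_i - \bar{y})^2 + \sum_i \hat{e}_i^2$$

$$\text{elasticity of } y \text{ to } x = \frac{\partial y}{\partial x} \cdot \frac{x}{y}$$

$$JB = \frac{N}{6} \left( S^2 + \frac{(K - 3)^2}{4} \right) \sim \chi^2(2) \quad\text{and}\quad \Pr\left(\chi^2(2) > 5.99\right) = 0.05$$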

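Multiple Regression A:

$$\hat{y}_i = b_1 + b_2 x_{i2} + \cdots + b_K x_{iK} \quad\text{and}\quad \hat{e}_i = y_i - \hat{y}_i$$

$$\mathrm{Var}(b_2) = \frac{\sigma^2}{(1 - r_{23}^2) \sum_i (x_{i2} - \bar{x}_2)^2} \quad\text{when } K = 3$$

$$\hat{\sigma}^2 = \frac{1}{N-K} \sum_i (y_i - \hat{y}_i)^2 = \frac{1}{N-K} \sum_i \hat{e}_i^2$$

$$\widehat{\mathrm{Var}}(b_2) = \frac{\hat{\sigma}^2}{(1 - r_{23}^2) \sum_i (x_{i2} - \bar{x}_2)^2} \quad\text{and}\quad se(b_2) = \sqrt{\widehat{\mathrm{Var}}(b_2)} \quad\text{when } K = 3$$

$$b_k \pm t_c\, se(b_k) \quad\text{and}\quad t = \frac{b_k - \beta_k}{se(b_k)} \text{ for } k = 1, \ldots, K$$

$$F = \frac{(SSE_R - SSE_U)/J}{SSE_U/(N - K)} \quad\text{and}\quad R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST} \quad\text{and}\quad \bar{R}^2 = 1 - \frac{SSE/(N - K)}{SST/(N - 1)}$$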


Multiple Regression B:

$$AIC = \log\left(\frac{SSE}{N}\right) + \frac{2K}{N}$$

$$SIC = \log\left(\frac{SSE}{N}\right) + \frac{K \log(N)}{N}$$
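
Illustration (not part of the original formula sheet): both criteria are simple functions of SSE, N and K. The sketch below uses the standard Schwarz penalty K log(N)/N for the SIC; the function names and the SSE values for the two candidate models are hypothetical.

```python
import math

def aic(sse, n, k):
    """AIC = log(SSE/N) + 2K/N."""
    return math.log(sse / n) + 2.0 * k / n

def sic(sse, n, k):
    """SIC = log(SSE/N) + K*log(N)/N (standard Schwarz penalty)."""
    return math.log(sse / n) + k * math.log(n) / n

# Hypothetical SSE values for two candidate models fitted to N = 52 observations
candidates = {"K=4": (12.0, 4), "K=6": (11.5, 6)}
for name, (sse, k) in candidates.items():
    print(name, round(aic(sse, 52, k), 4), round(sic(sse, 52, k), 4))
```

The model with the smaller criterion value is preferred. Because log(N) > 2 once N ≥ 8, the SIC penalises extra regressors more heavily than the AIC.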

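Autocorrelation:

$$\mathrm{Corr}(e_t, e_{t-k}) = \frac{\mathrm{Cov}(e_t, e_{t-k})}{\mathrm{Var}(e_t)}$$

$$r_k = \frac{\widehat{\mathrm{Cov}}(e_t, e_{t-k})}{\widehat{\mathrm{Var}}(e_t)} = \frac{\sum_{t=k+1}^{T} \hat{e}_t \hat{e}_{t-k}}{\sum_{t=k+1}^{T} \hat{e}_{t-k}^2}$$

$$d = \frac{\sum_{t=2}^{T} (\hat{e}_t - \hat{e}_{t-1})^2}{\sum_{t=2}^{T} \hat{e}_t^2} \approx 2(1 - r_1)$$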
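
Illustration (not part of the original formula sheet): r1 and the Durbin-Watson statistic d can be computed from a residual series as below. The summation limits follow the sheet as printed (both denominators start at t = 2); many textbooks sum the denominator of d from t = 1 instead. The function names and the residual series are hypothetical.

```python
def first_order_autocorr(resid):
    """r_1 = sum_{t=2}^T e_t * e_{t-1} / sum_{t=2}^T e_{t-1}^2."""
    num = sum(resid[t] * resid[t - 1] for t in range(1, len(resid)))
    den = sum(resid[t - 1] ** 2 for t in range(1, len(resid)))
    return num / den

def durbin_watson(resid):
    """d = sum_{t=2}^T (e_t - e_{t-1})^2 / sum_{t=2}^T e_t^2, as printed on the sheet."""
    num = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, len(resid)))
    den = sum(resid[t] ** 2 for t in range(1, len(resid)))
    return num / den

# Hypothetical residuals with perfect negative first-order autocorrelation
e_hat = [1.0, -1.0, 1.0, -1.0, 1.0]
r1 = first_order_autocorr(e_hat)
d = durbin_watson(e_hat)
print(r1, d, 2 * (1 - r1))  # -1.0 4.0 4.0
```

Values of d near 2 correspond to r1 near 0 (no first-order autocorrelation); values near 0 or 4 point to strong positive or negative autocorrelation respectively.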

Endogeneity:

In the case of the simple regression model y = β1 + β2 x + e and one IV zi, the variance of
the IV estimator is

$$\mathrm{Var}(\hat{\beta}_2) = \frac{\sigma^2}{r_{zx}^2 \sum_i (x_i - \bar{x})^2}.$$
