
Empirical Methods for Finance

Mock Exam - Solution Sketch

Prof. Virginia Gianinazzi

Questions
1. Companies release their earnings-per-share (EPS) on scheduled announcement dates, typically every quarter. Ahead of earnings announcements, financial analysts publish their forecasts of companies' earnings, based on their expectations about companies' growth and profitability.
You estimate the following model:

fe = β0 + β1 financials + ε

where fe is the analyst forecast error and financials is a dummy variable equal to 1 if the company is a financial company. The forecast error is computed as the difference between announced earnings (actual) and the consensus forecast (consensus), scaled by the announced earnings, in absolute value: fe = |(actual − consensus)/actual|.
Figure 1 reports the regression output. Interpret the coefficient on financials and the R² of the regression.
Solution. The coefficient on financials corresponds to the difference in average forecast error between financial firms and non-financial firms. In our sample, the average forecast error for financial firms is 1.7 percentage points lower than for non-financial firms. The R² represents the fraction of variation in the forecast error explained by the model. The current model explains less than 0.1% of the variation in the dependent variable.
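For intuition, here is a minimal statsmodels sketch on simulated data (the data-generating process, sample size, and the reuse of the names fe and financials are illustrative assumptions, not the exam dataset): regressing on a single dummy returns the difference in group means as the slope, and R² is the share of variance explained.

```python
# Minimal sketch with simulated data (not the exam sample): the coefficient
# on a dummy regressor equals the difference in group means.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 5000
financials = rng.integers(0, 2, n)                       # dummy: 1 = financial firm
fe = 0.10 - 0.017 * financials + rng.normal(0, 0.3, n)   # assumed DGP, for illustration only
df = pd.DataFrame({"fe": fe, "financials": financials})

res = smf.ols("fe ~ financials", data=df).fit()

# Slope = mean(fe | financials = 1) - mean(fe | financials = 0); R² = explained share
print(res.params["financials"])
print(df.loc[df.financials == 1, "fe"].mean() - df.loc[df.financials == 0, "fe"].mean())
print(res.rsquared)
```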
2. Comment on the statistical significance of β̂1. Specifically, how do we interpret a p-value of 0.177?
Solution. The difference in average forecast error between financial and non-financial firms is not statistically significant, i.e. it is indistinguishable from zero at any standard confidence level. The p-value is the lowest significance level at which we can still reject the null hypothesis of a zero effect of financials against a two-sided alternative. Under the null hypothesis of no effect (β1 = 0), the probability of observing a value of β̂1 as extreme as or more extreme than −0.017 is 17.7%.

Figure 1: Forecast error for financial versus non-financial firms
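As an illustration of the mechanics (a sketch; −1.35 is roughly the t-statistic implied by the reported p-value, and the large-sample normal approximation is used):

```python
# Two-sided p-value corresponding to a t-statistic of about -1.35
# (assumed example value; in large samples the normal approximation applies).
from scipy.stats import norm

t_stat = -1.35
p_value = 2 * (1 - norm.cdf(abs(t_stat)))
print(round(p_value, 3))   # approx. 0.177
```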
3. How do we get to the 95% confidence interval for β̂1 reported in the table? Write down the formula. At the end of the document you will find the table for the t-distribution.
Solution.

ci95 = β̂1 ± 1.96 × se(β̂1) = −0.017 ± 1.96 × 0.012 ≈ [−0.0411; 0.0076]
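For completeness, the same computation with the rounded estimates (a quick sketch; the exact endpoints [−0.0411; 0.0076] reported in the table are computed from the unrounded coefficient and standard error):

```python
# 95% CI from the rounded point estimate and standard error.
beta1_hat, se_beta1 = -0.017, 0.012
ci_low = beta1_hat - 1.96 * se_beta1
ci_high = beta1_hat + 1.96 * se_beta1
print(ci_low, ci_high)   # approx. -0.041, 0.007; differences vs. the table come from rounding
```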

4. How do you test the hypothesis that the forecast error is twice as large for financial firms as for non-financial firms?
Solution. The expected forecast error is β0 for non-financial firms and β0 + β1 for financial firms, so "twice as large" means β0 + β1 = 2β0. This corresponds to testing the following hypothesis about the model parameters (a linear combination of parameters): H0 : β1 = β0, or equivalently H0 : β1 − β0 = 0. The corresponding t-test is:

t = (β̂1 − β̂0) / se(β̂1 − β̂0)

In Stata, this can be done with the test command, e.g. test _b[financials] = _b[_cons]
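The same restriction can also be tested outside Stata; here is a minimal statsmodels sketch on simulated data (variable names and the data-generating process are assumptions; the DGP is built so that H0 is true):

```python
# Testing H0: beta1 = beta0 (coefficient on the dummy equals the intercept)
# as a single linear restriction. Simulated data for illustration only.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 5000
financials = rng.integers(0, 2, n)                        # dummy regressor
fe = 0.05 + 0.05 * financials + rng.normal(0, 0.2, n)     # assumed DGP with beta1 = beta0
df = pd.DataFrame({"fe": fe, "financials": financials})

res = smf.ols("fe ~ financials", data=df).fit()
# t-test of the restriction beta1 - beta0 = 0 ("Intercept" is beta0 here)
print(res.t_test("financials - Intercept = 0"))
```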


5. Now, you additionally control for coverage, defined as the number of analysts providing a forecast, and for market capitalization. How does the interpretation of the coefficient on financials change?

fe = β0 + β1 financials + β2 coverage + β3 marketcap + η (1)

Solution.
The coefficient on financials is now to be interpreted as the predicted difference in forecast error between financial and non-financial firms, holding analyst coverage and market capitalization constant. In other words, for the same level of analyst coverage and firm size, the model predicts a 1.1 percentage points lower forecast error for financial firms. This difference, however, is not statistically significant.
6. Currently, market capitalization is in thousands of dollars. How does β3 change if you re-run the regression using marketcap in billions of dollars? And how does the t-statistic change?
Solution. The coefficient gets multiplied by 1,000,000, since one billion dollars equals 1,000,000 thousand dollars. Rescaling the data has no effect on the t-statistic, because the standard error is scaled by the same factor as the coefficient.

fe = β0 + β1 fin + β2 cov + (β3 × 1,000,000) × (mktcap / 1,000,000) + η

fe = β0 + β1 fin + β2 cov + β̃3 mktcap_bn + η,   where β̃3 = 1,000,000 × β3 and mktcap_bn = mktcap / 1,000,000
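To see both claims at once, here is a small statsmodels sketch with simulated data (variable names, scales, and the data-generating process are assumptions): dividing the regressor by 1,000,000 multiplies its coefficient by 1,000,000 and leaves the t-statistic unchanged.

```python
# Rescaling a regressor: dividing it by 1,000,000 multiplies its coefficient
# by 1,000,000 and leaves the t-statistic unchanged. Simulated data.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 2000
mktcap_thousands = rng.lognormal(mean=15, sigma=1, size=n)   # assumed scale, in thousand $
fe = 0.08 - 1e-9 * mktcap_thousands + rng.normal(0, 0.05, n)

X1 = sm.add_constant(mktcap_thousands)                # regressor in thousands
X2 = sm.add_constant(mktcap_thousands / 1_000_000)    # same regressor in billions
r1 = sm.OLS(fe, X1).fit()
r2 = sm.OLS(fe, X2).fit()

print(r2.params[1] / r1.params[1])     # ratio of slopes = 1,000,000 (up to floating point)
print(r1.tvalues[1], r2.tvalues[1])    # identical t-statistics
```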

7. We re-run the regression, this time using the natural logarithm of market capitalization (logmktcap) instead of market capitalization in levels. Interpret the coefficient on logmktcap reported in the table below.
Solution.
The model predicts that a 1% increase in market capitalization leads to a decrease in the forecast error of 0.0355/100 = 0.000355, i.e. 3.55 basis points.
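The arithmetic behind this reading, taking the reported coefficient on logmktcap to be −0.0355 (a log-level specification, so a 0.01 change in the log is approximately a 1% change):

```latex
% Log-level specification: a 0.01 change in log(mktcap) is approximately a 1% change.
\[
\Delta \widehat{fe} \;\approx\; \hat{\beta}_3 \,\Delta \log(mktcap)
  \;=\; (-0.0355)\times 0.01 \;=\; -0.000355,
\]
% i.e. a decrease in the forecast error of 3.55 basis points.
```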

Figure 2: Regression with log market capitalization

8. Under which assumptions can we interpret the coefficient on coverage as
measuring the causal effect of analysts’ coverage on forecast accuracy in
the model of Figure 2?
Solution.
Under assumptions MLR.1-MLR.4, the OLS estimator identifies the causal effect of coverage on the forecast error. The key assumption for causality is the zero conditional mean assumption E[u|x] = 0. OLS is biased and inconsistent if the model omits relevant variables that are correlated with our regressors. In particular, here we are assuming that how many analysts cover a given firm is unrelated to company characteristics that affect the accuracy of the forecast.

9. What threats to the identification of the causal effect do you see in this
case?
Solution.
There may not be a causal relationship if coverage is higher for companies in which there is currently more investor interest. If this were the case, the observed smaller forecast errors may be explained by better reporting or more comprehensive information disclosure by companies that are under greater scrutiny by market participants and have more visibility in the financial press.
We could also not make a causal claim if analysts self-selected into those industries and companies for which it is easier to come up with an accurate forecast.
10. You compute the Shapiro-Wilk W test for the normality of the regression residuals. The figure below reports the results. What does it tell you about the normality assumption (MLR.6)? How does this affect the validity of the t-statistics from the previous question?

Solution.
The Shapiro-Wilk test provides strong evidence against the null hypothesis of normally distributed errors. The normality assumption is needed to derive the exact finite-sample distribution of the t-statistics: without normality of the errors, the t-statistics are no longer exactly t-distributed. Luckily, even if the errors are not normal, the t-statistics are approximately normally distributed in large enough samples, so inference remains asymptotically valid.
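A sketch of how such a check can be run outside Stata, using scipy's Shapiro-Wilk test on fitted residuals (simulated data with deliberately heavy-tailed errors; note that scipy's implementation is intended for moderate sample sizes):

```python
# Shapiro-Wilk test for normality of regression residuals (simulated data).
import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(3)
n = 2000
x = rng.normal(size=n)
y = 1.0 + 0.5 * x + rng.standard_t(df=3, size=n)   # heavy-tailed (non-normal) errors

res = sm.OLS(y, sm.add_constant(x)).fit()
W, p = stats.shapiro(res.resid)
print(W, p)   # small p-value -> reject normality of the residuals
```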

11. Consider the following population model that satisfies MLR.1-MLR.4:

yi = β0 + β1 Di + β2 xi + εi

where Di is a dummy variable. What would be the consequence of estimating the model without Di?
Solution.
If we omit Di, it ends up in the error term. This makes β̂2 a biased estimator of β2 if x is correlated with D. We can derive the bias to be δ̃ β1, where δ̃ = Cov(D, x)/Var(x) is the slope coefficient from a regression of D on x.
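The bias formula can be derived in one line in population terms (a sketch, using Cov(x, ε) = 0 from MLR.4):

```latex
% Short regression of y on x alone: beta_1 D + eps becomes part of the error term.
\[
\tilde{\beta}_2
  \;=\; \frac{\operatorname{Cov}(x, y)}{\operatorname{Var}(x)}
  \;=\; \frac{\operatorname{Cov}\!\left(x,\ \beta_0 + \beta_1 D + \beta_2 x + \varepsilon\right)}{\operatorname{Var}(x)}
  \;=\; \beta_2 \;+\; \beta_1 \underbrace{\frac{\operatorname{Cov}(D, x)}{\operatorname{Var}(x)}}_{\tilde{\delta}}
\]
% The second term, beta_1 * delta-tilde, is the bias from omitting D.
```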

12. To address the endogeneity of education in a model of wage on education, experience and tenure, you instrument education using a dummy variable for whether there is a university in the city where the person grew up. Discuss the validity of this instrument.
Solution.
The requirements for instrument validity are instrument exogeneity and instrument relevance: the university dummy must be uncorrelated with the error term of the wage equation (i.e. it must affect wages only through education), and it must be correlated with the level of education.
Instrument relevance can be tested (a strong first stage): it is plausible that people who grew up close to a university are more likely to attend one. Instrument exogeneity is not testable. It would be violated if family characteristics that correlate with both education and wage affect the decision to live in a city with a university. It is indeed quite possible that highly educated parents who value their children's education make this choice; at the same time, children of highly educated parents are expected to earn higher salaries, ceteris paribus.
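A sketch of how relevance can be checked and the IV estimate computed (simulated data; the variable name nearuni is an assumption, tenure is omitted for brevity, and the manual second stage below only illustrates the logic; its standard errors are not the correct 2SLS standard errors, so dedicated IV routines should be used for inference):

```python
# Checking instrument relevance (first stage) and computing 2SLS by hand.
# Simulated data for illustration; nearuni is an assumed instrument name.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(4)
n = 20000
nearuni = rng.integers(0, 2, n)               # grew up near a university (instrument)
ability = rng.normal(size=n)                  # unobserved, makes education endogenous
educ = 12 + 1.0 * nearuni + 1.5 * ability + rng.normal(size=n)
exper = rng.uniform(0, 20, n)
lwage = 0.5 + 0.08 * educ + 0.02 * exper + 0.30 * ability + rng.normal(0, 0.3, n)
df = pd.DataFrame({"lwage": lwage, "educ": educ, "exper": exper, "nearuni": nearuni})

# First stage: the instrument must predict education (relevance, testable).
first = smf.ols("educ ~ nearuni + exper", data=df).fit()
print(first.tvalues["nearuni"])               # should be comfortably large

# Second stage: replace educ with its first-stage fitted values.
df["educ_hat"] = first.fittedvalues
second = smf.ols("lwage ~ educ_hat + exper", data=df).fit()
print(second.params["educ_hat"])              # roughly recovers the true 0.08 in this simulation
```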

True or False
1. ∑_{i=1}^{N} ûi (ŷi − ȳ) = 0 always.
True or False
2. Consider the following population model that satisfies MLR.1-MLR.4:

yi = β0 + β1 Di + β2 xi + εi

where Di is a dummy variable equal to 1 if i is a woman. Omitting D from the regression causes a bias in the intercept (β̂0) only.
True or False

3. One must report heteroskedasticity-adjusted standard errors every time the assumption of normal errors is rejected in the data.
True or False

4. Correlation between the regressors and the residuals (û) causes the OLS estimators to be biased.
True or False
