Mock Exam1
1.1
Describe the features of the scatterplot shown in figure 1. In particular, mention the
sign of the correlation and the linearity/nonlinearity of the relationship of the two
quantities.
Figure 1: Scatter plot of the job openings rate (Jobop) plotted against the unemployment rate (Uemp)
```
##
## Call:
## lm(formula = Jobop ~ Uemp, data = Q1data)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -0.45309 -0.09979  0.02407  0.11671  0.25935
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  4.50257    0.06825   65.97   <2e-16 ***
## Uemp        -0.28680    0.01109  -25.85   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1561 on 58 degrees of freedom
## Multiple R-squared:  0.9202, Adjusted R-squared:  0.9188
## F-statistic: 668.4 on 1 and 58 DF,  p-value: < 2.2e-16
```
1.2
First, we consider the following linear model:
Y = β0 + β1 X + u, E(u|X) = 0.
Let’s call it model 1.1. The response variable Y corresponds to the job openings rate
Jobop and the explanatory variable X to the unemployment rate Uemp. The results
from this regression can be found in listing 1.
From the output, report the OLS estimates of β0 and β1.
1.3
a) Comment on the adequacy of the model specification using figures 2 and 3 by
referring to relevant assumptions. The solid black line in figure 2 represents the fitted
regression line and the solid black horizontal line in figure 3 represents the x-axis.
b) Based on your answer in a), what are the limitations of this model in terms of
interpreting the OLS estimate of β1?
[Figure 2: Jobop plotted against Uemp, with the fitted regression line of Model 1.1]
- **Homoscedasticity:** Residuals should be spread relatively evenly across all levels of the predicted
values. A cone or fan shape indicates heteroscedasticity, violating the assumption.
- **Linearity:** The residuals should be randomly scattered around zero, with no clear patterns. If you
observe a pattern (e.g., a curve or systematic deviation), it suggests a violation of linearity.
Figure 3: Residuals of Model 1.1 plotted against the unemployment rate (Uemp)
1.4
In fact, the relationship between job openings and unemployment has been studied in
economics and is known as the Beveridge curve. Assuming that the efficiency of the
labor market stays constant, the economic model for the Beveridge curve is written as
Y = β̃0 X^{β1},
with Y representing the job openings rate Jobop and X the unemployment rate Uemp.
Write down the corresponding econometric model in the format
log Y = β0 + β1 log X + u.
```
##
## Call:
## lm(formula = log(Jobop) ~ log(Uemp), data = Q1data)
##
## Residuals:
##       Min        1Q    Median        3Q       Max
## -0.120437 -0.027405  0.001482  0.029835  0.070186
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  2.37863    0.03612   65.86   <2e-16 ***
## log(Uemp)   -0.78819    0.02061  -38.24   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.04298 on 58 degrees of freedom
## Multiple R-squared:  0.9619, Adjusted R-squared:  0.9612
## F-statistic:  1462 on 1 and 58 DF,  p-value: < 2.2e-16
```
1.5
From here on, we refer to the econometric model of the Beveridge curve described in
problem 1.4 as Model 1.2. The results from this regression can be found in listing 2.
From the output, report the OLS estimate of β1.
1.6
a) Compare the model fit between Model 1.1 and Model 1.2. For Model 1.2, refer
to figures 4 and 5. When conducting the comparison, explicitly refer to relevant
assumption(s).
b) Does the homoskedasticity assumption V(u|X) = σ² seem to hold for Model 1.2?
Justify your answer by referring to features of figure 5.
[Figure 4: Jobop plotted against Uemp for Model 1.2]
Figure 5: Residuals of Model 1.2 plotted against the unemployment rate (Uemp)
1.7
Interpret the OLS estimate of β1 for Model 1.2 in the current context.
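In a log-log specification the slope is usually read as an elasticity. The following is a minimal sketch, written in Python rather than the R that produced the listings, that takes the slope estimate printed in listing 2 and compares the approximate reading with the exact change implied by the fitted curve for a 1% increase in Uemp. The function name is illustrative only.

```python
# Slope estimate of log(Uemp) copied from listing 2 (Model 1.2).
B1_LOGLOG = -0.78819


def pct_change_in_jobop(pct_change_in_uemp, b1=B1_LOGLOG):
    """Exact percentage change in Jobop implied by the log-log fit
    when Uemp changes by the given percentage (hypothetical helper)."""
    ratio = 1.0 + pct_change_in_uemp / 100.0
    return (ratio ** b1 - 1.0) * 100.0


# Approximate reading: a 1% change in Uemp goes with a b1% change in Jobop.
approx = B1_LOGLOG * 1.0
exact = pct_change_in_jobop(1.0)
print(f"approximate: {approx:.3f}%  exact: {exact:.3f}%")
```

For small changes the two numbers are close; the gap widens for larger changes in Uemp.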
1.8
For both models, predict the job openings rate when the unemployment rate is 15
percentage points.
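One way to check such predictions numerically is sketched below in Python, using only the coefficient estimates printed in listings 1 and 2. The function names are illustrative, and the back-transformation for Model 1.2 simply exponentiates the fitted log value, which is an assumption: it ignores any correction for the error term.

```python
import math

# Coefficient estimates copied from listing 1 (Model 1.1) and listing 2 (Model 1.2).
B0_LIN, B1_LIN = 4.50257, -0.28680
B0_LOG, B1_LOG = 2.37863, -0.78819


def predict_linear(uemp):
    """Fitted Jobop from the level-level fit (Model 1.1)."""
    return B0_LIN + B1_LIN * uemp


def predict_loglog(uemp):
    """Fitted Jobop from the log-log fit (Model 1.2), back-transformed
    with exp(); no retransformation correction is applied."""
    return math.exp(B0_LOG + B1_LOG * math.log(uemp))


uemp = 15.0
print(f"Model 1.1: {predict_linear(uemp):.3f}")
print(f"Model 1.2: {predict_loglog(uemp):.3f}")
```

Note that Uemp = 15 lies outside the range of roughly 5 to 10 shown in figure 1, so both numbers are extrapolations.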
1.9
In figure 6, more recent data from the Covid period (Apr 2020 to Aug 2021)
are plotted (represented as crosses) in addition to the original data shown in figure 1.
Provide econometric reasoning as to why Model 1.2 may not generalize to these
observations. Specifically, focus on the underlying assumptions at the population
level on the one hand and at the sample level on the other hand.
Figure 6: Scatter plot of job openings rate (Jobop) plotted against the unemployment
rate (Uemp) including recent data (shown as +)
```
##
## Call:
## lm(formula = PRICE ~ BATH, data = house)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -728368 -190684  -41785  143562 2019273
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)   130046      31876    4.08 4.62e-05 ***
## BATH          252578      14970   16.87  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 269600 on 3080 degrees of freedom
## Multiple R-squared:  0.08461, Adjusted R-squared:  0.08431
## F-statistic: 284.7 on 1 and 3080 DF,  p-value: < 2.2e-16
```
2.1
In listing 3 you can see a fit for the model
Y = β0 + β1 X_bath + u, E(u|X_bath) = 0.
This model will be referred to as model 2.1.
Report the OLS estimate for β1 and provide an interpretation for it in the current
context.
2.2
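In listing 4, you may find a fit for the model
Y = β0 + β1 X_bath + β2 X_size + u, E(u|X_bath, X_size) = 0.
This model will be referred to as model 2.2.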
b) Why are the OLS estimates for β1 different in models 2.1 and 2.2?
```
##
## Call:
## lm(formula = PRICE ~ BATH + SIZE, data = house)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -1081683  -143453    -7280   134330  1503722
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  -111842      27136  -4.122 3.86e-05 ***
## BATH          158924      12629  12.584  < 2e-16 ***
## SIZE            4072        108  37.716  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 223000 on 3079 degrees of freedom
## Multiple R-squared:  0.3739, Adjusted R-squared:  0.3735
## F-statistic: 919.3 on 2 and 3079 DF,  p-value: < 2.2e-16
```
2.3
Another model is fitted in listing 5. Let’s call it model 2.3. Provide the full model
equation for this model 2.3. That is, write down a formula of the form
Y = ...
2.4
In model 2.3, what is the dimension of the design matrix? (Provide the number of
rows and columns.)
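The dimension can be read off the regression output. The sketch below, in Python and using only numbers printed in listing 5, relies on the identity that the residual degrees of freedom equal the number of rows minus the number of columns of the design matrix. The variable names are illustrative.

```python
# Values read from listing 5 (model 2.3).
RESIDUAL_DF = 3076          # "on 3076 degrees of freedom"
N_SLOPES = 5                # BATH, SIZE, YEAR, BEDROOM, GARAGE
N_COLUMNS = N_SLOPES + 1    # plus the column of ones for the intercept

# Residual df = rows - columns, hence rows = residual df + columns.
n_rows = RESIDUAL_DF + N_COLUMNS
print(f"design matrix: {n_rows} x {N_COLUMNS}")
```

The same row count follows from listing 3 (3080 + 2) and listing 4 (3079 + 3), which is a quick consistency check that all three fits use the same sample.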
```
##
## Call:
## lm(formula = PRICE ~ BATH + SIZE + YEAR + BEDROOM + GARAGE, data = hous
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -925619 -125886   -9468  115620 1493701
##
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept) -5769475.2   352525.8  -16.37   <2e-16 ***
## BATH          110851.1    11655.7    9.51   <2e-16 ***
## SIZE            4461.2      101.9   43.79   <2e-16 ***
## YEAR            2702.9      183.6   14.72   <2e-16 ***
## BEDROOM        76732.9     6797.0   11.29   <2e-16 ***
## GARAGE        167951.7    12857.9   13.06   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 202500 on 3076 degrees of freedom
## Multiple R-squared:  0.4844, Adjusted R-squared:  0.4835
## F-statistic: 577.9 on 5 and 3076 DF,  p-value: < 2.2e-16
```
2.5
a) Report the coefficient of determination (R²) of model 2.3 and provide an
interpretation of it in the current context.
b) Compare it with the coefficients of determination of models 2.2 and 2.1 and
provide an interpretation in the current context.
2.6
In model 2.3, what is the (estimated) expected price for a house with 2 bedrooms, 2
bathrooms and 1 garage, which was built in 1920, if the size of the house is 150 square
meters?
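A fitted value of this kind is the intercept plus each coefficient times the corresponding regressor value. The sketch below does this in Python with the estimates printed in listing 5. It assumes that SIZE is measured in square meters and YEAR is the calendar year of construction, as the question suggests; the function name is illustrative.

```python
# Coefficient estimates copied from listing 5 (model 2.3).
COEF = {
    "(Intercept)": -5769475.2,
    "BATH": 110851.1,
    "SIZE": 4461.2,
    "YEAR": 2702.9,
    "BEDROOM": 76732.9,
    "GARAGE": 167951.7,
}


def predict_price(bath, size, year, bedroom, garage):
    """Estimated expected PRICE for the given regressor values
    (SIZE assumed in square meters, YEAR as calendar year)."""
    return (
        COEF["(Intercept)"]
        + COEF["BATH"] * bath
        + COEF["SIZE"] * size
        + COEF["YEAR"] * year
        + COEF["BEDROOM"] * bedroom
        + COEF["GARAGE"] * garage
    )


price = predict_price(bath=2, size=150, year=1920, bedroom=2, garage=1)
print(f"estimated expected price: {price:,.1f}")
```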
2.7
a) In model 2.3, report the standard error of the OLS estimator of the coefficient
associated with SIZE.
b) What is the implicit assumption made such that the standard error is a valid
estimate?
2.8
a) In model 2.3, test if the predictor variable BATH has a significant effect on the
price of a house, controlling for the effect of the remaining explanatory
variables. (Provide details: null hypothesis, significance level, test statistic,
p-value, interpretation.)
b) What are the implicit assumptions to make the testing procedure valid?
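The mechanics of such a test can be sketched from the BATH row of listing 5. The Python snippet below forms the t statistic as estimate divided by standard error for the null hypothesis that the coefficient is zero. As an assumption made for simplicity, the two-sided p-value and the 5% critical value use the standard normal distribution in place of the t distribution with 3076 degrees of freedom, which is very close at this sample size.

```python
import math

# Estimate and standard error of BATH copied from listing 5 (model 2.3).
ESTIMATE, STD_ERROR = 110851.1, 11655.7

# t statistic for H0: the coefficient of BATH equals zero.
t_stat = ESTIMATE / STD_ERROR

# Two-sided p-value under the normal approximation: 2 * (1 - Phi(|t|)),
# which equals erfc(|t| / sqrt(2)).
p_value_approx = math.erfc(abs(t_stat) / math.sqrt(2.0))

CRITICAL_5PCT = 1.96  # normal approximation to the two-sided 5% critical value
reject_h0 = abs(t_stat) > CRITICAL_5PCT
print(f"t = {t_stat:.2f}, approx. p = {p_value_approx:.2e}, reject H0: {reject_h0}")
```

The computed statistic agrees with the t value column of the listing, and the approximate p-value is consistent with the reported `<2e-16`.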
2.9
Suppose a real estate agent wants to improve model 2.3 by including a luxury score
for a house. The luxury score of a house is the sum of all bathrooms and bedrooms.
Explain what effect the inclusion of the luxury score into model 2.3 would have on
the OLS estimator.
3 True or false?
State if you deem the statement true or false. For true statements, provide a brief
explanation as to why they are true. For false statements, provide a correct statement
or an explanation as to why they are false.
3.1
Consider the simple linear regression model
Y = β0 + β1 log₂(X) + u,
where E(u|X) = 0 and log₂ is the logarithm with base 2 (e.g. log₂(2) = 1, log₂(4) = 2,
log₂(8) = 3). Suppose further that β0 = 1, β1 = 3 and E(log₂(X)) = 2.
a) The unconditional expectation of Y is 7.
3.2
Consider the simple linear regression model
Y = β0 + β1 X + u,
where E(u|X) = 0, V(u|X) = σ² > 0. Suppose you have a random sample (y_i, x_i),
i = 1, ..., N, from this population. Let β̂0 and β̂1 be the OLS estimators of β0 and
β1.
a) To check the assumption that the model is correctly specified (E(u|X) = 0),
one needs to check if the mean of all residuals is 0.
b) Increasing the sample size from N = 150 to N2 = 600 will decrease the standard
deviation of β̂1 by 50%.
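For statement b), recall that under homoskedasticity V(β̂1|X) = σ² / Σ(x_i − x̄)², and the denominator grows roughly in proportion to N. The Python sketch below encodes only this scaling argument; it assumes that σ² and the spread of X stay the same as the sample grows, and the function name is illustrative.

```python
import math


def sd_ratio(n_old, n_new):
    """Approximate ratio of the standard deviation of the slope estimator
    after and before changing the sample size, assuming it scales with
    1 / sqrt(N) (same sigma^2, same spread of X)."""
    return math.sqrt(n_old / n_new)


ratio = sd_ratio(150, 600)
print(f"standard deviation shrinks to about {ratio:.0%} of its former value")
```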
3.3
Consider the multiple linear regression model
Y = β0 + β1 X1 + β2 X2 + u,
where E(u|X1, X2) = 0, V(u|X1, X2) = σ² > 0. Let (y_i, x_{1,i}, x_{2,i}), i = 1, ..., N, be a
random sample.
Let β̂ be the OLS estimator for β = (β0, β1, β2)′.
a) It holds that E(β̂0) = E(β̂1) = E(β̂2).
c) The stronger X1 and X2 are correlated, the less precisely we can estimate β1
and β2 .