PS2 Solution
Question 1
Preliminary Analysis
Solution (1) Looking at the standard deviations of the log of GDP per capita
(table 1), Chile registered the greatest variation, whereas Paraguay registered
the lowest.
Solution (2) No, they did not. The highest variation in the log of CO2
emissions was registered by Honduras, while Mexico registered the lowest
(table 2).
Table 2: Summary statistics of the log of CO2 emissions per capita
Country          Mean        SD          Min         Max
Argentina        1.328376    .1020411    1.18661     1.566575
Bolivia          .0234759    .3002862    -.4931038   .4491379
Brazil           .4917929    .1555767    .2356698    .7655925
Chile            1.046245    .3199351    .5727751    1.451309
Colombia         .455186     .0949469    .2584696    .6320238
Costa Rica       .209913     .2648509    -.271057    .6527979
Honduras         -.3625969   .3342241    -.8597653   .201258
Mexico           1.328472    .0487545    1.23513     1.456606
Nicaragua        -.4283491   .1965379    -1.00708    -.186882
Panama           .5053506    .2397069    .0354755    .9628125
Paraguay         -.4852938   .2338397    -.8812061   -.1304859
Peru             .1497628    .184563     -.1051707   .6768439
Uruguay          .4541554    .2196969    .0477475    .9114031
Venezuela, RB    1.830554    .0986794    1.632516    2.031914
Total            .4582117    .7102654    -1.00708    2.031914
Solution (3) From table 2, Venezuela is the country that experienced the
highest average level of CO2 emissions per capita, whereas Paraguay saw the
lowest average level, with Nicaragua a close second-lowest. Venezuela also
registered the peak value of the log of CO2 emissions per capita, while the
bottom value was registered by Nicaragua. Overall, the summary statistics
do not confirm the Kuznets curve, as the countries with the lowest levels of
pollution can hardly be considered the most developed ones.
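The extremes of Table 2 can also be read off mechanically. The following is a minimal sketch that simply re-types the table; the dictionary name `TABLE_2` and the helper `extreme` are illustrative names, not part of the original solution.

```python
# Table 2 re-typed: country -> (mean, sd, min, max) of the log of CO2 per capita.
TABLE_2 = {
    "Argentina": (1.328376, 0.1020411, 1.18661, 1.566575),
    "Bolivia": (0.0234759, 0.3002862, -0.4931038, 0.4491379),
    "Brazil": (0.4917929, 0.1555767, 0.2356698, 0.7655925),
    "Chile": (1.046245, 0.3199351, 0.5727751, 1.451309),
    "Colombia": (0.455186, 0.0949469, 0.2584696, 0.6320238),
    "Costa Rica": (0.209913, 0.2648509, -0.271057, 0.6527979),
    "Honduras": (-0.3625969, 0.3342241, -0.8597653, 0.201258),
    "Mexico": (1.328472, 0.0487545, 1.23513, 1.456606),
    "Nicaragua": (-0.4283491, 0.1965379, -1.00708, -0.186882),
    "Panama": (0.5053506, 0.2397069, 0.0354755, 0.9628125),
    "Paraguay": (-0.4852938, 0.2338397, -0.8812061, -0.1304859),
    "Peru": (0.1497628, 0.184563, -0.1051707, 0.6768439),
    "Uruguay": (0.4541554, 0.2196969, 0.0477475, 0.9114031),
    "Venezuela, RB": (1.830554, 0.0986794, 1.632516, 2.031914),
}

# Column positions inside each tuple.
MEAN, SD, MIN, MAX = range(4)


def extreme(stat_index, pick):
    """Return the country with the largest (pick=max) or smallest (pick=min)
    value of the chosen statistic."""
    return pick(TABLE_2, key=lambda country: TABLE_2[country][stat_index])


for label, stat, pick in [
    ("highest mean", MEAN, max),
    ("lowest mean", MEAN, min),
    ("highest SD", SD, max),
    ("lowest SD", SD, min),
    ("overall maximum", MAX, max),
    ("overall minimum", MIN, min),
]:
    print(f"{label}: {extreme(stat, pick)}")
```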
Solution (4) The plot (figure 1) suggests that a quadratic term could help
the model. However, according to the Kuznets theory the parabola is expected
to open downward, whereas the figure suggests the opposite. Moreover, there
seems to be a positive relation between the two variables. Thus, the higher
a country's GDP per capita, the more it pollutes per capita.
Regression Analysis
Solution (1) The model shows an R² equal to 0.682, which is not particularly
high. The estimated coefficients are around -6.71 for the constant and 0.9 for
the log of GDP per capita. Both are significant, even at a 1% level of
significance, with p-values approximately equal to 0. Hence, there is a positive
correlation between the log of GDP per capita and the log of CO2 emissions.
Since both variables are in logs, β1 can be interpreted as the elasticity of
CO2 emissions with respect to GDP per capita. Therefore, when GDP per capita
increases by 1%, we can expect an increase of approximately β1 % in the level
of emissions.
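The elasticity reading can be illustrated numerically. The sketch below assumes natural logs and uses the rounded estimate 0.9 quoted above; the function name `pct_change_in_co2` is illustrative. It shows that "β1 % per 1%" is an approximation that is very accurate for small changes and slightly less so for larger ones.

```python
import math


def pct_change_in_co2(beta1, gdp_pct_increase):
    """Exact percentage change in CO2 implied by a log-log model
    log(CO2) = beta0 + beta1 * log(GDP) when GDP rises by gdp_pct_increase %."""
    growth_factor = 1.0 + gdp_pct_increase / 100.0
    return (math.exp(beta1 * math.log(growth_factor)) - 1.0) * 100.0


beta1 = 0.9  # rounded coefficient quoted in the text

# A 1% rise in GDP per capita: very close to beta1 %.
print(round(pct_change_in_co2(beta1, 1.0), 3))
# A 10% rise: the linear rule (9%) slightly overstates the exact change.
print(round(pct_change_in_co2(beta1, 10.0), 3))
```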
Solution (2) Adding the controls given by the problem, the elasticity of CO2
emission to GDP decreases from 0.89 to 0.13, though it still remains significant,
Figure 1: Scatterplot of GDP per capita’s log against CO2 per capita emissions’
log
Solution (3) If we add a second-order polynomial term of the log of GDP per
capita to the previous model, STATA gives us the following results. The
coefficient for the linear term of the log of GDP is positive and equal to
1.15, whereas the coefficient of the quadratic term (equal to -0.067) is
negative. Thus, the marginal effect of GDP (the elasticity of CO2 to GDP per
capita) becomes:
\[
\frac{\partial CO2_i}{\partial GDP_i} = 1.146218 - 2 \cdot 0.066824 \cdot GDP_i \tag{1}
\]
Notice that $CO2_i$ and $GDP_i$ are logs.
That seems to confirm the Kuznets curve hypothesis, because those coeffi-
cients generate a downward facing parabola.
However, neither of those terms is significant at a 5% level of significance
(with p-values equal to 0.053 for the linear one and 0.089 for the
quadratic one). This implies that we cannot infer with high certainty whether
the quadratic relationship between CO2 emissions and GDP is represented by a
downward-facing parabola or an upward-facing one. Moreover, we cannot infer
whether there is a quadratic relationship between the two variables at all.
The test for the hypothesis of insignificance for the quadratic term gives an F
statistic with p-value equal to 0.0887 (this is exactly the same test that STATA
does automatically when it runs the regressions). Hence, if we use a 5% level of
significance we should not include the quadratic term in the regression.
Equation 1 represents the marginal effect of GDP. Therefore, in order to test
whether the marginal effect is 0, we build a test as follows:
\[
\begin{cases}
H_0 : \beta_1 + 2\beta_2 GDP_i = 0 \\
H_1 : \beta_1 + 2\beta_2 GDP_i \neq 0
\end{cases}
\]
In order to compute it, we evaluate the marginal effect at the mean of the log
of GDP, which is equal to 7.98. The test gives a t statistic with p-value equal
to 0.092. Hence, using a 5% level of significance, we cannot reject the
hypothesis that the marginal effect of GDP is 0 in this model.
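For reference, the t statistic for a linear combination of two coefficients takes the standard textbook form below, where $c$ is the value of the log of GDP at which the marginal effect is evaluated (here $c = 7.98$). This is a sketch of the general formula; the source does not show the variance and covariance estimates, so no numbers are plugged in.

```latex
t = \frac{\hat\beta_1 + 2\hat\beta_2 c}
         {\sqrt{\widehat{\mathrm{Var}}(\hat\beta_1)
                + 4c^2\,\widehat{\mathrm{Var}}(\hat\beta_2)
                + 4c\,\widehat{\mathrm{Cov}}(\hat\beta_1, \hat\beta_2)}}
```

The covariance term matters: the linear and quadratic terms are usually strongly correlated, so the combination can be estimated more or less precisely than either coefficient alone.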
The turning point of a general parabola ($y = ax^2 + bx + c$) can be computed
as follows:
\[
x = -\frac{b}{2a}
\]
In our case, $b$ is $\hat\beta_1$ and $a$ is $\hat\beta_2$, thus the (estimated)
turning point ($\widehat{GDP}_0$) will be:
\[
\widehat{GDP}_0 = -\frac{\hat\beta_1}{2\hat\beta_2} \approx 8.58
\]
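The turning point can be reproduced directly from the coefficients in equation (1). This is a minimal sketch; the function and constant names are illustrative.

```python
BETA_1 = 1.146218   # coefficient on the log of GDP, from equation (1)
BETA_2 = -0.066824  # coefficient on the squared log of GDP, from equation (1)


def turning_point(b, a):
    """Vertex x = -b / (2a) of the parabola y = a*x^2 + b*x + c."""
    if a == 0:
        raise ValueError("no turning point: the quadratic coefficient is zero")
    return -b / (2.0 * a)


log_gdp_star = turning_point(BETA_1, BETA_2)
print(round(log_gdp_star, 2))

# Sanity check: the marginal effect beta1 + 2*beta2*x vanishes at the vertex.
print(abs(BETA_1 + 2.0 * BETA_2 * log_gdp_star) < 1e-12)
```

Since the value 8.58 is in logs, it lies above the sample mean of 7.98 used in the test of the marginal effect.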
effect is not 0 (at a 5% level of significance, yet we would not say so at a 1%
level). In the case c = 7, we get a t statistic with a p-value of 0, thus we can say
that, when the GDP’s log is equal to 7, its marginal effect is not 0 (even at a
1% level of significance). In the case c = 9, we get a t statistic with a p-value of
0.634, thus we cannot say that, when the GDP’s log is equal to 9, its marginal
effect is not 0.
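The point estimates behind these tests follow from equation (1). The sketch below evaluates the estimated marginal effect at the sample mean (7.98) and at c = 7 and c = 9; it reproduces only the point estimates, since the standard errors needed for the t statistics are not reported in the text.

```python
BETA_1 = 1.146218   # linear term, from equation (1)
BETA_2 = -0.066824  # quadratic term, from equation (1)


def marginal_effect(log_gdp):
    """Estimated elasticity of CO2 to GDP per capita at a given log of GDP."""
    return BETA_1 + 2.0 * BETA_2 * log_gdp


for c in (7.0, 7.98, 9.0):
    print(c, round(marginal_effect(c), 4))
```

The estimate is clearly positive at 7, close to zero at the mean, and slightly negative at 9, which is in line with the pattern of p-values reported above.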
Question 2
Regression Analysis
Solution (1) Since the regressions are run on standardised measures, the
interpretation is as follows: when a regressor (the original non-standardised
measure) changes by an amount equal to its standard deviation
($\Delta X_j = \sigma_{X_j}$), the provision of public goods (the original
non-standardised measure) changes by an amount equal to its standard deviation
times the estimated coefficient ($\Delta Y = \beta_j \cdot \sigma_Y$).
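The mapping between standardised and original units can be sketched as follows. All numbers here are hypothetical and chosen only to make the arithmetic visible; they are not estimates from the problem set.

```python
def effect_of_one_sd_change(beta_std, sd_y):
    """Change in Y (original units) when X rises by one of its standard
    deviations, given a coefficient estimated on standardised variables."""
    return beta_std * sd_y


def unstandardised_slope(beta_std, sd_x, sd_y):
    """Slope per one original unit of X implied by the standardised coefficient."""
    return beta_std * sd_y / sd_x


# Hypothetical values, for illustration only.
beta_std = 0.5  # coefficient from the regression on standardised measures
sd_x = 2.0      # standard deviation of the original regressor
sd_y = 4.0      # standard deviation of the original public goods measure

print(effect_of_one_sd_change(beta_std, sd_y))     # change in Y for a one-SD change in X
print(unstandardised_slope(beta_std, sd_x, sd_y))  # change in Y for a one-unit change in X
```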
Solution (2) The interpretation of the estimated coefficients for the regressors
already used in the previous regressions does not change.
For the legal origin dummies' coefficients, we can say that, for instance,
a country of French legal origin has a public goods' provision that is lower
than that of a country of British legal origin (the base group) by 1.673875
times the standard deviation of public goods' provision. We can give a similar
interpretation to the regional dummies' coefficients.
For the absolute latitude, we can say that a unit change in the absolute
latitude leads to a change in public goods' provision equal to 0.0419 times
its standard deviation.
Lastly, using robust errors or not gives the same coefficient estimates. What
changes are the standard errors, which become bigger.
Solution (3) In order to test the joint significance of the three standardised
measures, we test the following hypothesis:
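\[
\begin{cases}
H_0 : R\gamma = d \\
H_1 : R\gamma \neq d
\end{cases}
\]
Where:
\[
R = \begin{pmatrix}
0 & 1 & 0 & 0 & \mathbf{0}' \\
0 & 0 & 1 & 0 & \mathbf{0}' \\
0 & 0 & 0 & 1 & \mathbf{0}'
\end{pmatrix}
\qquad
\gamma = \begin{pmatrix} \gamma_0 \\ \gamma_1 \\ \gamma_2 \\ \gamma_3 \\ \gamma_4 \end{pmatrix}
\qquad
d = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}
\]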
Where $\mathbf{0}$ is a column vector of zeros. The F statistic computed has a
p-value of 0.096; thus we can reject the hypothesis of joint insignificance at
a 10% level of significance, but not at a 5% or 1% level.
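For reference, the F statistic for a set of linear restrictions of this kind has the standard form below, with $q = 3$ restrictions (the rows of $R$) and $\hat V(\hat\gamma)$ the estimated covariance matrix of the coefficients. Which covariance estimator (homoskedastic or robust) was used is not stated in the text, so the formula is given in its general form.

```latex
F = \frac{(R\hat\gamma - d)' \left[ R\, \hat V(\hat\gamma)\, R' \right]^{-1} (R\hat\gamma - d)}{q},
\qquad q = 3
```

With $d = 0$ and $R$ selecting $\gamma_1, \gamma_2, \gamma_3$, the numerator is a quadratic form in the three estimated coefficients only.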
Solution (4) The Breusch-Pagan test gives a χ2 statistic with p-value equal
to 0.033. Thus, at a 5% level of significance, we should reject the hypothesis of
homoskedasticity. Yet, White's test gives a χ2 statistic with p-value equal
to 0.317, which should lead us not to reject the homoskedasticity hypothesis.
This contradiction can be explained by two factors: the first is the number of
regressors and the second is the number of observations. Due to the number
of regressors, the χ2 statistic of White's test has 44 degrees of freedom, since
the auxiliary regression is estimated with that many regressors. Usually, this
test does not perform well in those cases. This is even more true if we look at
the number of observations (48). Thus the model built for the test would have
several problems. Moreover, having only 48 observations creates problems for
all types of tests, because the sample is too small to rely on the asymptotic
approximation given by the CLT.
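To see how quickly the auxiliary regression of White's test grows, the sketch below counts its terms under the assumption of k distinct, non-redundant regressors (levels, squares and pairwise cross-products). The value k = 8 is an illustrative choice that happens to reproduce 44 degrees of freedom; it is not a count taken from the source, and with dummy variables some squares and cross-products are redundant, so the true bookkeeping may differ.

```python
def white_test_terms(k):
    """Auxiliary regressors in White's general test for k distinct,
    non-redundant regressors: k levels + k squares + k*(k-1)/2 cross-products."""
    return k + k + k * (k - 1) // 2


n_obs = 48      # observations reported in the text
aux_terms = 44  # degrees of freedom of White's chi-squared statistic in the text

# Illustrative assumption: 8 non-redundant regressors give exactly 44 terms.
print(white_test_terms(8))

# Degrees of freedom left in the auxiliary regression (terms plus an intercept).
residual_df = n_obs - (aux_terms + 1)
print(residual_df)
```

With only a handful of residual degrees of freedom, the auxiliary regression is close to saturated, which is one way to read the remark that the test performs poorly here.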
Figure 2: Scatterplot of residuals against fitted values
Figure 2 shows the plot of residuals against fitted values. Since the number
of observations is low, we cannot say much. Still, homoskedasticity does not
seem to hold, judging by the concentration of points where the fitted value
is between 0 and 4.
\[
\frac{\hat\beta_j - \beta_j^*}{\sqrt{\hat\sigma^2 \left[ (X'X)^{-1} \right]_{jj}}} \xrightarrow{d} N(0, 1)
\]
Though, this is true because $\hat\sigma^2 I_n \xrightarrow{p} \sigma^2 I_n$, namely the homoskedastic variance
estimator is a consistent estimator for the covariance matrix of $\epsilon$. Yet, if there
is heteroskedasticity, this is not true anymore. Therefore, that statistic does not
tend to a standard Gaussian distribution. Hence, we cannot use it for testing.
With the robust standard errors (assuming that the CLT holds), as n → ∞,
we would have:
\[
\frac{\hat\beta_j - \beta_j^*}{\sqrt{\left[ \hat V^{HC}(\hat\beta) \right]_{jj}}} \xrightarrow{d} N(0, 1)
\]
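The difference between the two denominators can be made concrete in the simplest case. The sketch below computes the classical and the heteroskedasticity-robust standard error of the slope in a regression with one regressor and an intercept. It assumes the HC0 (White) form of the robust estimator, since the text does not say which variant $\hat V^{HC}$ denotes, and the data are made up so that the error spread grows with x.

```python
import math


def slope_standard_errors(x, y):
    """OLS slope of y on x (with intercept) and two standard errors for it:
    the classical homoskedastic one and the HC0 (White) robust one."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    dx = [xi - x_bar for xi in x]
    sxx = sum(d * d for d in dx)
    slope = sum(d * (yi - y_bar) for d, yi in zip(dx, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    # Classical: sigma^2-hat * [(X'X)^-1]_jj, with sigma^2-hat = SSR / (n - 2).
    sigma2_hat = sum(e * e for e in resid) / (n - 2)
    se_classical = math.sqrt(sigma2_hat / sxx)
    # HC0: sum of (x_i - x_bar)^2 * e_i^2, divided by Sxx^2.
    se_hc0 = math.sqrt(sum((d * e) ** 2 for d, e in zip(dx, resid))) / sxx
    return slope, se_classical, se_hc0


# Hypothetical data: deviations from the line y = 2x grow with x.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [2.1, 3.8, 6.3, 7.6, 10.5, 11.4, 14.7, 15.2]

slope, se_classical, se_hc0 = slope_standard_errors(x, y)
print(round(slope, 4), round(se_classical, 4), round(se_hc0, 4))
```

The coefficient estimate is the same in both cases; only the standard error, and hence the test statistic, changes.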
would rule out the risk. Moreover, as it is consistent even in the homoskedastic
case, we would not run the risk of making an error in the simpler case.
Solution (6) Table 3 contains all the regressions’ results from the previous
points.