Solution Basic Econometrics
𝛽̂2 = ∑ 𝑘𝑖 (𝛽1 + 𝛽2 𝑋𝑖 + 𝑢𝑖 )
𝛽̂2 = 𝛽2 + ∑ 𝑘𝑖 𝑢𝑖
(b) False. From the normal equations obtained by minimizing the residual sum of squares in the 2-variable regression model, the solution for β̂2 can be written as
β̂2 = ∑xi Yi / ∑xi² = ∑ki Yi; where ki = xi / ∑xi².
Substituting Yi = β1 + β2 Xi + ui,
𝛽̂2 = ∑ 𝑘𝑖 (𝛽1 + 𝛽2 𝑋𝑖 + 𝑢𝑖 )
𝛽̂2 = 𝛽2 + ∑ 𝑘𝑖 𝑢𝑖
𝐸(𝛽̂2 ) = 𝛽2
To find the variance we have
Var(β̂2) = E(∑ki ui)²
Var(β̂2) = E[k1²u1² + k2²u2² + ⋯ + kn²un² + 2k1k2u1u2 + ⋯ + 2kn−1knun−1un]
If the error terms are serially uncorrelated, E(ui uj) = 0 for i ≠ j, so the cross-product terms vanish and Var(β̂2*) = σ²∑ki². With serially correlated errors the cross-product terms do not vanish, so Var(β̂2*) < Var(β̂2): the usual OLS variance formula understates the true variance, and OLS is no longer the minimum-variance linear unbiased estimator. Hence serially correlated error terms make the slope estimator inefficient.
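This inefficiency can be illustrated with a small Monte Carlo sketch (not from the text): simulate the 2-variable model with AR(1) errors and compare the textbook variance σ²∑ki² with the actual sampling variance of β̂2. All numbers (n, ρ, σ, the X grid) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps, beta1, beta2, rho, sigma = 50, 5000, 1.0, 2.0, 0.8, 1.0
X = np.linspace(0, 10, n)
x = X - X.mean()
k = x / np.sum(x**2)                  # the k_i weights from the derivation

slopes = np.empty(reps)
for r in range(reps):
    # build AR(1) errors: u_t = rho * u_{t-1} + e_t
    e = rng.normal(0, sigma, n)
    u = np.empty(n)
    u[0] = e[0]
    for t in range(1, n):
        u[t] = rho * u[t - 1] + e[t]
    Y = beta1 + beta2 * X + u
    slopes[r] = np.sum(k * Y)         # beta2_hat = sum(k_i * Y_i)

formula_var = sigma**2 * np.sum(k**2)  # valid only without serial correlation
print(slopes.mean())                   # close to beta2 = 2: still unbiased
print(slopes.var(), formula_var)       # true variance exceeds the formula
```

The simulated mean stays near β2 (unbiasedness survives), while the empirical variance is well above σ²∑ki², which is exactly the inefficiency claimed above.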
(c) False. In the double-log model, log(Y) = log(β1) + β2 log(Xi) + log(ui), the slope coefficient is the elasticity of the dependent variable (Y) with respect to the independent variable (X), not the growth rate. For example, if Y is the demand for internet and X is the income of the consumer, then β2 is the income elasticity of demand for internet.
The elasticity β2 can also be written as β2 = (X/Y)(dY/dX), that is, the percentage change in the demand for internet (Y) due to a one-percent change in income (X).
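A quick numerical sketch of this point, with made-up constant-elasticity data Y = A·X^β2 (A = 3 and β2 = 1.5 are illustrative values): the log-log regression recovers β2 as the elasticity.

```python
import numpy as np

A, beta2 = 3.0, 1.5
X = np.array([1.0, 2.0, 4.0, 8.0, 16.0])    # e.g. income
Y = A * X**beta2                             # e.g. demand for internet

# slope of log(Y) on log(X) is the elasticity
slope, intercept = np.polyfit(np.log(X), np.log(Y), 1)
print(slope)                                        # 1.5, the elasticity
print((X / Y) * (A * beta2 * X**(beta2 - 1)))       # (X/Y)(dY/dX): all 1.5
```

Both the regression slope and the pointwise formula (X/Y)(dY/dX) give 1.5, confirming the slope is an elasticity rather than a growth rate.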
(d) False. The Durbin-Watson test for serial correlation is designed for first-order serial correlation only. It is based on two assumptions: (i) Y = β1 + β2 Xi + ui, where ut = ρut−1 + εt, with −1 < ρ < 1, or |ρ| < 1; ρ is called the coefficient of the error term lagged one period, also known as the first-order autocorrelation coefficient. (ii) E(εt) = 0, E(εt²) = σε² < ∞ and E(εt εt−s) = 0 for all s ≠ 0. The Durbin-Watson test takes the following steps: (i) Estimate the OLS model and compute the residuals, ût = Yt − β̂1 − β̂2 Xt2 − ⋯ − β̂k Xtk. (ii) Compute the Durbin-Watson statistic,
DW(d) = ∑(t=2 to n) (ût − ût−1)² / ∑(t=1 to n) ût² ≅ 2(1 − ρ̂).
(iii) If d < 2, the null and alternative hypotheses are H0: ρ = 0 and H1: ρ > 0; if d > 2, H0: ρ = 0 and H1: ρ < 0. Look up dL and dU from the Durbin-Watson table for k′ = number of independent variables (excluding the constant): reject the null if d ≤ dL and do not reject it if d ≥ dU; if dL < d < dU, the test is inconclusive. Similarly, for d > 2: do not reject the null if d ≤ 4 − dU, reject it if d ≥ 4 − dL, and if 4 − dU < d < 4 − dL, the test is inconclusive.
The major problem with DW(d) is that the result for first-order serial correlation can be inconclusive, in which case we need to use the Lagrange Multiplier test or another test that also handles higher-order serial correlation.
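The two computational steps can be sketched as follows, on simulated data with AR(1) errors (all parameter values are illustrative assumptions, not from the text):

```python
import numpy as np

def durbin_watson(resid):
    # d = sum((u_t - u_{t-1})^2) / sum(u_t^2)
    return np.sum(np.diff(resid)**2) / np.sum(resid**2)

rng = np.random.default_rng(1)
n = 200
X = np.column_stack([np.ones(n), np.arange(n, dtype=float)])
e = rng.normal(size=n)
u = np.empty(n); u[0] = e[0]
for t in range(1, n):                 # AR(1) errors with rho = 0.7
    u[t] = 0.7 * u[t - 1] + e[t]
Y = 1.0 + 0.5 * X[:, 1] + u

# step (i): OLS residuals
beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
resid = Y - X @ beta
# step (ii): the d statistic and its 2(1 - rho_hat) approximation
d = durbin_watson(resid)
rho_hat = np.sum(resid[1:] * resid[:-1]) / np.sum(resid[:-1]**2)
print(d, 2 * (1 - rho_hat))           # d well below 2: positive correlation
```

With positively autocorrelated errors, d falls well below 2 and tracks 2(1 − ρ̂) closely.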
(e) True. Otherwise there will be a dummy variable trap: with an intercept in the model, including a dummy for every category makes the regressors exactly collinear. Assume a model with qualitative and quantitative variables,
Yt = α1 + α2 D + βX + u (a)
where Y = wage earned, D = 1 if male and 0 otherwise, and X = experience. The estimated relationships for the two groups are
E(Y | D = 1, X) = (α1 + α2) + βX (males)
E(Y | D = 0, X) = α1 + βX (females),
so α2 measures the male-female difference in mean wages.
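The trap itself can be verified numerically; a sketch with made-up data, where an intercept together with dummies for both sexes leaves the design matrix rank-deficient:

```python
import numpy as np

male   = np.array([1, 1, 0, 0, 1, 0])
female = 1 - male                      # the two dummies sum to the intercept
exper  = np.array([2.0, 5.0, 3.0, 8.0, 1.0, 6.0])

X_trap = np.column_stack([np.ones(6), male, female, exper])  # both dummies
X_ok   = np.column_stack([np.ones(6), male, exper])          # one dropped

print(np.linalg.matrix_rank(X_trap))   # 3 < 4 columns: X'X is singular
print(np.linalg.matrix_rank(X_ok))     # 3 = full column rank: estimable
```

Dropping one dummy (or the intercept) restores full column rank, which is exactly why the number of dummies must be one less than the number of categories.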
(f) False. This follows from the relationship between R² and R̄², namely R̄² = 1 − [(n−1)/(n−k)](1 − R²). For example, with N = 26, k = 6 and R² = 0.1, we get R̄² = 1 − (25/20)(0.9) = −0.125, so the adjusted R² can be negative.
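The counterexample arithmetic is easy to verify; a minimal sketch:

```python
# Adjusted R-squared from R^2, n and k (k includes the intercept).
def adj_r2(r2, n, k):
    return 1 - (n - 1) / (n - k) * (1 - r2)

print(adj_r2(0.10, 26, 6))   # -0.125: adjusted R^2 can indeed be negative
```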
(g) False. Using the deviation form of the equation,
∑ŷi ûi = β̂2 ∑xi ûi = 0,
since ∑xi ûi = 0 by the normal equations; hence the fitted values and the residuals are uncorrelated.
(b) The Durbin-Watson test for serial correlation is designed for first-order serial correlation only. It is based on two assumptions: (i) Y = β1 + β2 Xi + ui, where ut = ρut−1 + εt, with −1 < ρ < 1, or |ρ| < 1; ρ is called the coefficient of the error term lagged one period, also known as the first-order autocorrelation coefficient. (ii) E(εt) = 0, E(εt²) = σε² < ∞ and E(εt εt−s) = 0 for all s ≠ 0. The test takes the following steps: (i) Estimate the OLS model and compute the residuals, ût = Yt − β̂1 − β̂2 Xt2 − ⋯ − β̂k Xtk. (ii) Compute the Durbin-Watson statistic,
DW(d) = ∑(t=2 to n) (ût − ût−1)² / ∑(t=1 to n) ût² ≅ 2(1 − ρ̂).
(iii) (a) If d < 2, the null and alternative hypotheses are H0: ρ = 0 and H1: ρ > 0; (b) if d > 2, H0: ρ = 0 and H1: ρ < 0. Look up dL and dU from the Durbin-Watson table for k′ = number of independent variables (excluding the constant): reject the null if d ≤ dL and do not reject it if d ≥ dU; if dL < d < dU, the test is inconclusive. Similarly, for case (iii)(b): do not reject the null if d ≤ 4 − dU, reject it if d ≥ 4 − dL, and if 4 − dU < d < 4 − dL, the test is inconclusive.
The major problem with DW(d) is that the result for first-order serial correlation can be inconclusive, in which case we need to use the Lagrange Multiplier test or another test that also handles higher-order serial correlation.
For the given values N = 57, k = 2, DW(d) = 0.802, the hypotheses are H0: ρ = 0 and H1: ρ > 0 because d < 2. From the Durbin-Watson table we find dL = 1.49 and dU = 1.64. Since d < dL, reject the null of no serial correlation and conclude that there is first-order positive serial correlation in the error term, with estimated value ρ̂ = 1 − d/2 = 0.599. [Note that in many colleges the Durbin-Watson tables may not have been provided, so if the students have found the value of ρ̂, marks should be given for this.]
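The decision rule applied to these numbers can be sketched directly (d, dL and dU are the values quoted above):

```python
d, dL, dU = 0.802, 1.49, 1.64

# one-sided test for positive serial correlation since d < 2
if d <= dL:
    decision = "reject H0: rho = 0 (positive serial correlation)"
elif d >= dU:
    decision = "do not reject H0"
else:
    decision = "inconclusive"

rho_hat = 1 - d / 2          # implied first-order autocorrelation
print(decision)
print(rho_hat)               # 0.599
```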
3. See Chapter 11, pp. 387-389; show through graphs as well. Consider a regression equation Yt = β1 + β2 Xt2 + ⋯ + βk Xtk + ut, where ut is a random variable and Var(ut | Xt) = σt² for t = 1, 2, 3, …, n. Each observation having a different error variance means the variance of the error term is heteroskedastic. This problem is more prevalent in cross-section and panel data; however, it can also arise in a time series. Methods of detecting heteroskedasticity are as follows: (i) graphical inspection of the residuals (students must have explained this); (ii) formal tests: Lagrange Multiplier tests (Breusch-Pagan test, Glejser test, Harvey-Godfrey test), Spearman's rank correlation test, Goldfeld-Quandt test, White test (students must have explained at least one).
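One of these tests, the Breusch-Pagan LM test, can be sketched by hand: regress the squared OLS residuals on the regressors; under homoskedasticity LM = n·R² of that auxiliary regression is approximately chi-squared. The simulated data (error standard deviation growing with X) are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 300
X = np.column_stack([np.ones(n), rng.uniform(1, 10, n)])
u = rng.normal(0, 0.5 * X[:, 1])          # heteroskedastic: sd grows with X
Y = 2.0 + 1.0 * X[:, 1] + u

# OLS residuals
beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
resid2 = (Y - X @ beta)**2

# auxiliary regression of squared residuals on X
g, *_ = np.linalg.lstsq(X, resid2, rcond=None)
fitted = X @ g
r2_aux = 1 - np.sum((resid2 - fitted)**2) / np.sum((resid2 - resid2.mean())**2)
LM = n * r2_aux
print(LM)    # compare with the chi-squared critical value, 3.84 at 5%, 1 d.f.
```

With this strongly heteroskedastic design the LM statistic comes out far above 3.84, so the null of homoskedasticity is rejected.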
Consequences of ignoring Heteroskedasticity –
(i) Effects on the properties of estimators – unbiasedness and consistency are not violated by ignoring heteroskedasticity and using OLS to estimate β1 and β2. However, since Var(ut | Xt) = σ² (constant variance) is used to establish the efficiency of β̂1 and β̂2, it is no longer possible to show that the Gauss-Markov theorem holds. This means the OLS estimators are inefficient: it is possible to find an alternative linear unbiased estimator with lower variance than the OLS estimator.
(ii) Effects on tests of hypotheses – the estimated variances and covariances of the regression coefficients will be biased and inconsistent, and hence tests of hypotheses (that is, t- and F-tests) are invalid.
(iii) Since the OLS estimators are still unbiased, forecasts based on these estimates are still unbiased; but because the estimators are inefficient, the forecasts will also be inefficient.
3. (b) (i) The instantaneous growth rate of India's population is 2.4%. The compound growth rate is antilog(0.024) − 1 = e^0.024 − 1 = 2.43%.
(ii) Before 1978 (Dt = 0): E(Log(Pop)t | Dt = 0, t); the fitted trend is Log(Pop)̂t = 4.77 + 0.015t.
4. (a) (i) The benchmark category is unmarried and Non-South resident. The mean hourly wage
of the benchmark is $8.81.
(ii) The mean hourly wage of those who are married is about $1.10 higher, giving an actual hourly wage of $9.91 ($8.81 + $1.10).
(iii) For those who live in the South, the mean hourly wage is lower by about $1.67, giving an actual wage of $7.14 ($8.81 − $1.67).
(b) (i) β̂2 = ∑xi yi / ∑xi² = 16800/33000 = 0.509.
Further, 𝛽̂1 = 𝑌̅ − 𝛽̂2 𝑋̅ = 111 − 0.509 ∗ 170 = 24.47. The estimated linear regression is
𝑌̂ = 24.47 + 0.509𝑋.
(ii) For the 2-variable regression equation, r²(X,Y) = R². So,
r(X,Y) = Cov(X,Y)/(σX σY) = 16800/(√33000 √17099) = 0.70725,
and r²(X,Y) = R² = 0.5002. [One can also calculate R² by R² = β̂2² ∑xi²/∑yi².]
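The arithmetic in (b)(i)-(ii) can be reproduced from the given sums; a short Python check (the sums and means are exactly as quoted in the question):

```python
import math

# deviation-form sums and means given in the question
sxy, sxx, syy = 16800.0, 33000.0, 17099.0
xbar, ybar = 170.0, 111.0

b2 = sxy / sxx                                 # slope, about 0.509
b1 = ybar - b2 * xbar                          # intercept, about 24.47
r = sxy / (math.sqrt(sxx) * math.sqrt(syy))    # correlation, about 0.70725
r2 = r**2                                      # R^2, about 0.5002
r2_alt = b2**2 * sxx / syy                     # same R^2 via beta2_hat
print(b2, b1, r, r2, r2_alt)
```

Both R² routes agree, since r² = (∑xy)²/(∑x²∑y²) = β̂2²∑x²/∑y² algebraically.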
The fitted value from the original equation is Ŷt = β̂1 + β̂2 Xt. With the new scaling Xt* = 10Xt, the fitted value is Ŷt* = β̂1* + β̂2* Xt*, where β̂1* = β̂1 and β̂2* = (1/10)β̂2, so Ŷt* = β̂1 + (β̂2/10)(10Xt) = Ŷt. That is, only the slope coefficient changes, not the intercept. Further, with the original equation the residual is ût = Yt − Ŷt, and with the new scaling ût* = Yt − Ŷt* = ût: the residuals do not change.
If 10 is added to the variable X, the new regressor is Xt* = Xt + 10 and the re-estimated equation is Ŷt = β̂1* + β̂2 Xt*, where β̂1* = β̂1 − 10β̂2, since then Ŷt = β̂1 − 10β̂2 + β̂2(Xt + 10) = β̂1 + β̂2 Xt. In this case the intercept is the only term that changes, not the slope coefficient, and there is no change in the residuals or fitted values when 10 is added to the variable X.
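Both claims can be checked numerically; a sketch with made-up data, using np.polyfit (which returns slope then intercept):

```python
import numpy as np

X = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

b2, b1 = np.polyfit(X, Y, 1)             # original fit
b2s, b1s = np.polyfit(10 * X, Y, 1)      # regressor multiplied by 10
b2a, b1a = np.polyfit(X + 10, Y, 1)      # regressor shifted by 10

print(np.isclose(b2s, b2 / 10), np.isclose(b1s, b1))        # slope /10 only
print(np.isclose(b2a, b2), np.isclose(b1a, b1 - 10 * b2))   # intercept only

# residuals are identical under rescaling
resid = Y - (b1 + b2 * X)
resid_s = Y - (b1s + b2s * 10 * X)
print(np.allclose(resid, resid_s))
```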
5. (a) (i) For the given R̄² = 0.277, find the value of R² first and then calculate the F-statistic for overall significance using
Fc = [R²/(k−1)] / [(1−R²)/(n−k)].
R² = 1 − [(n−k)/(n−1)](1 − R̄²) = 0.3222,
(iii) This might be because of the concept of backward bending labour supply curve with
respect to age or due to multicollinearity with experience (EXPER), or both.
(iv) Even though the t-statistic is low, the expected sign is correct and the result is conceptually correct. Removing AGE from the regression equation can create the problem of omitting a relevant variable and can cause omitted-variable bias in the remaining estimators.
(b) By definition, Cov(β̂1, β̂2) = E[(β̂1 − β1)(β̂2 − β2)]. Since β̂1 = Ȳ − β̂2 X̄, we have β̂1 − β1 = ū − X̄(β̂2 − β2), and because E[ū(β̂2 − β2)] = σ²∑ki/n = 0 (as ∑ki = 0),
Cov(β̂1, β̂2) = −X̄ E(β̂2 − β2)²
= −X̄ Var(β̂2).
6. (a) Standard Error(α̂) = α̂ / t-ratio = 26.034/14.955 = 1.7408, and
t-ratio(β̂) = β̂ / Standard Error(β̂) = 0.137/0.028 = 4.8928.
r(V,P) = C(V,P)/(σV σP), and C(V,P) = β̂ SP² = 126.84967, so
r(V,P) = 126.84967/(√31.954 √925.91) = 0.73744.
In the 2-variable regression equation we have R² = r²(V,P) = 0.5438. Next, with N = 22 and k = 2,
σ̂² = ∑ût²/(n−k) = ESS/(n−k) = 305.96/20 = 15.298,
where ESS is the error sum of squares as provided in the question. However, since the full form of ESS is not given in the question paper, students might have computed σ̂² taking ESS as the explained sum of squares. In that case σ̂² = 12.834 [that is, TSS = ESS/R², where ESS is the explained sum of squares, so TSS = 305.96/0.5438 = 562.6333; since RSS/TSS = 1 − R², RSS = (1 − R²)TSS = 0.4562 × 562.6333 = 256.6733, and σ̂² = RSS/(n−k) = 256.6733/20 = 12.834]. Both answers can be treated as correct, though only the former is consistent with the information provided.
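The two readings of "ESS" can be checked quickly; a sketch using only the figures quoted above:

```python
# given: ESS = 305.96, n = 22, k = 2, R^2 = 0.5438
ESS, n, k, R2 = 305.96, 22, 2, 0.5438

sigma2_err = ESS / (n - k)      # ESS read as error sum of squares
TSS = ESS / R2                  # ESS read as explained sum of squares
RSS = (1 - R2) * TSS
sigma2_expl = RSS / (n - k)
print(sigma2_err)               # 15.298
print(sigma2_expl)              # about 12.834
```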
𝑉̅ = 𝛼̂ + 𝛽̂ 𝑃̅
𝑉̅ = 26.034 + 0.137 ∗ 54.478
𝑉̅ = 33.4975
𝑌𝑡 = 𝑌̂𝑡 + 𝑢̂𝑡
Writing the equation in deviation form and then squaring and summing,
𝑦𝑡 = 𝑦̂𝑡 + 𝑢̂𝑡
∑yt² = ∑ŷt² + ∑ût² [since 2∑ŷt ût = 0]
and R̄² = 1 − [∑ût²/(n−k)] / [∑yt²/(n−1)] = 1 − [RSS/(n−k)] / [TSS/(n−1)] = 1 − [(n−1)/(n−k)](1 − R²).
R̄² is the adjusted R², that is, adjusted for the d.f. associated with the sums of squares entering R² = 1 − RSS/TSS. ∑ût² has (n−k) d.f. in a model involving k parameters including the intercept term, and ∑yt² has (n−1) d.f. From the formula for R̄²: (i) for k > 1, R̄² < R², implying that as the number of X variables increases, R̄² increases by less than the unadjusted R²; and (ii) R̄² can be negative, although R² is necessarily non-negative.
R̄² is a better measure of goodness of fit because it allows for the trade-off between increased R² and decreased d.f. Note that (n−1)/(n−k) is never less than 1, so R̄² will never be higher than R².
Suppose the third regressor is an exact multiple of the second, xt3 = 2xt2. The normal equations in deviation form are
β̂2 ∑xt2² + β̂3 ∑xt2 xt3 = ∑yt xt2 (a)
β̂2 ∑xt2 xt3 + β̂3 ∑xt3² = ∑yt xt3 (b)
Substituting xt3 = 2xt2 into Eq. (b) gives β̂2 ∑xt2(2xt2) + β̂3 ∑xt3(2xt2) = ∑yt(2xt2), that is,
2β̂2 ∑xt2² + 2β̂3 ∑xt2 xt3 = 2∑yt xt2 (c)
Eq. (c) is a linear transformation of Eq. (a), so Eq. (a) and Eq. (b) are not independent and cannot give separate estimates of β̂2 and β̂3. In matrix form we get a singular matrix: the determinant of (X′X) is ∆ = 0, hence the matrix cannot be inverted to solve for the coefficients. This is the problem of exact multicollinearity.
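The singularity is easy to exhibit numerically; a sketch with arbitrary deviation-form data and xt3 = 2xt2:

```python
import numpy as np

x2 = np.array([1.0, -2.0, 0.5, 3.0, -2.5])   # deviation form, made up
x3 = 2 * x2                                   # exact collinearity

# the (X'X) matrix of the two normal equations
XtX = np.array([[np.sum(x2 * x2), np.sum(x2 * x3)],
                [np.sum(x3 * x2), np.sum(x3 * x3)]])
print(np.linalg.det(XtX))    # 0 up to rounding: (X'X) cannot be inverted
```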
[Table notes: * denotes sum of squares; + denotes mean sum of squares, obtained by dividing each SS by its d.f.]
From the table, consider the following variable:
F = (MSS of ESS)/(MSS of RSS) = β̂2² ∑xt² / [∑ût²/(n−2)] = β̂2² ∑xt² / σ̂².
Assuming ui is normally distributed, and if the null hypothesis (H0) is that β2 = 0, it can be shown that the F-variable follows the F-distribution with 1 d.f. in the numerator and n−2 d.f. in the denominator.
It can be shown that if β2 = 0, the two mean sums of squares provide identical estimates of the true σ²: in this case X has no linear influence on Y, and all the variation in Y is explained by the random error term ui. If β2 ≠ 0, the two estimates will differ, and part of the variation in Y will be explained by X. The F-test therefore tests the null hypothesis H0: β2 = 0: if Fc > F* (the tabulated critical value), reject the null, with the probability of committing a type-I error bounded by the chosen significance level.
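In the 2-variable case this F-variable is just the square of the t-ratio on β̂2, a standard identity worth verifying; a short sketch with made-up data:

```python
import numpy as np

X = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
Y = np.array([1.2, 1.9, 3.4, 3.6, 5.1, 5.8])
n = len(X)

# OLS in deviation form
x = X - X.mean(); y = Y - Y.mean()
b2 = np.sum(x * y) / np.sum(x * x)
b1 = Y.mean() - b2 * X.mean()
resid = Y - (b1 + b2 * X)
sigma2 = np.sum(resid**2) / (n - 2)          # sigma_hat^2

F = b2**2 * np.sum(x**2) / sigma2            # the F-variable from the table
t = b2 / np.sqrt(sigma2 / np.sum(x**2))      # t-ratio on beta2_hat
print(np.isclose(F, t**2))                   # True: F(1, n-2) = t(n-2)^2
```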