Chapter One Part 2
4/10/2021
Causes of Multicollinearity
1. The data collection method employed, for example, sampling over a limited range of the values taken by the regressors in the population.
2. Constraints on the model or in the population being sampled. For example, in the regression of electricity consumption on income (X2) and house size (X3), high X2 always means high X3.
3. Model specification, for example, adding polynomial terms to a regression model, especially when the range of the X variable is small.
4. An overdetermined model. This happens when the model has more explanatory variables than the number of observations.
5. An additional reason for multicollinearity, especially in time series data, may be that the regressors included in the model share a common trend, that is, they all increase or decrease over time.

Consequences of Multicollinearity
1. The OLS estimators have large variances and covariances, making precise estimation difficult.
2. Because of consequence 1, the t ratio of one or more coefficients tends to be statistically insignificant.
3. Although the t ratio of one or more coefficients is statistically insignificant, R² can be very high.
4. The OLS estimators and their standard errors can be sensitive to small changes in the data.
5. Under perfect multicollinearity, the OLS estimators simply do not exist.
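The first consequence, inflated variances of the OLS estimators under collinear regressors, can be illustrated with a short simulation (a minimal Python sketch; the sample sizes, coefficients, and the `ols_slope_sd` helper are my own illustration, not from the chapter):

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps = 100, 1000

def ols_slope_sd(rho):
    """Std. dev. of the OLS estimate of beta2 across `reps` simulated samples,
    where the two regressors have correlation approximately `rho`."""
    slopes = []
    for _ in range(reps):
        x2 = rng.normal(size=n)
        x3 = rho * x2 + np.sqrt(1 - rho**2) * rng.normal(size=n)
        y = 1.0 + 0.5 * x2 + 0.5 * x3 + rng.normal(size=n)  # true beta2 = 0.5
        X = np.column_stack([np.ones(n), x2, x3])
        beta = np.linalg.lstsq(X, y, rcond=None)[0]
        slopes.append(beta[1])
    return np.std(slopes)

sd_low, sd_high = ols_slope_sd(0.1), ols_slope_sd(0.95)
print(sd_low, sd_high)  # the collinear design yields a much larger spread
```

With nearly uncorrelated regressors the slope estimates cluster tightly around 0.5; with correlation 0.95 their sampling spread is several times larger, which is why individual t ratios tend to become insignificant.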
Detecting Multicollinearity
1. High R² but few significant t ratios. If R² is high, say, in excess of 0.8, the F test in most cases will reject the hypothesis that the partial slope coefficients are simultaneously equal to zero, but the individual t tests will show that none or very few of the partial slope coefficients are statistically different from zero.
2. High pair-wise correlations among regressors. Another suggested rule of thumb is that if the pair-wise correlation coefficient between two regressors is high, say, in excess of 0.8, then multicollinearity is a serious problem.
3. The Variance Inflation Factor, VIF = 1/(1 − R²j), where R²j is the R² from regressing regressor j on the remaining regressors. VIF values that exceed 10 are generally viewed as evidence of problematic multicollinearity, which happens when R²j > 0.9.

Remedies for Multicollinearity
- Drop one of the collinear variables.
- Transform the highly correlated variables.
- Collect more data.
- Combine cross-sectional and time series data (pooling the data).
- Do nothing.
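The VIF rule of thumb can be sketched in code (a numpy-only sketch; the `vif` helper and the simulated data are my own illustration, not from the chapter):

```python
import numpy as np

def vif(X):
    """Return VIF_j = 1/(1 - R2_j) for each column of X (no intercept column),
    where R2_j comes from regressing column j on the remaining columns."""
    n, k = X.shape
    out = []
    for j in range(k):
        y = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        resid = y - others @ np.linalg.lstsq(others, y, rcond=None)[0]
        r2 = 1 - resid.var() / y.var()
        out.append(1.0 / (1.0 - r2))
    return np.array(out)

rng = np.random.default_rng(1)
x2 = rng.normal(size=200)
x3 = 0.95 * x2 + 0.2 * rng.normal(size=200)  # x3 is highly collinear with x2
v = vif(np.column_stack([x2, x3]))
print(v)  # both VIFs exceed the rule-of-thumb threshold of 10
```

With only two regressors the two VIFs coincide; with more regressors each column gets its own value.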
Heteroscedasticity

Model: Yi = β1 + β2Xi + Ui

Homoscedasticity: Var(Ui) = σ², or E(Ui²) = σ² (the error variance is the same for all observations).

Heteroscedasticity: Var(Ui) = σi², or E(Ui²) = σi² (the error variance differs across observations).

[Figure: scatter plot of consumption (Yi) against income (Xi), illustrating the spread of the disturbances.]
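The two definitions can be contrasted with simulated errors (a minimal Python sketch; the income range and scale values are my own illustration, following the consumption-income framing of the figure):

```python
import numpy as np

rng = np.random.default_rng(2)
x = np.linspace(1, 10, 5000)                  # "income"
u_homo = rng.normal(scale=2.0, size=x.size)   # Var(Ui) = sigma^2 for all i
u_hetero = rng.normal(scale=0.5 * x)          # Var(Ui) = sigma_i^2 grows with x

# compare the error spread at low vs high income
low, high = x < 3, x > 8
print(u_homo[low].std(), u_homo[high].std())      # roughly equal
print(u_hetero[low].std(), u_hetero[high].std())  # spread rises with income
```

Under homoscedasticity the scatter of the disturbances looks the same everywhere; under heteroscedasticity it fans out as income rises.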
1. White’s Test
Step 1: Estimate the model by OLS and obtain the residuals.
Step 2: Run the following auxiliary regression:
ût² = a1 + a2 X2t + a3 X3t + a4 X2t² + a5 X3t² + a6 X2t X3t + vt
Step 3: Compute LM = nR², where n and R² are from the auxiliary regression.
Step 4: If the LM statistic exceeds the χ² critical value with p − 1 degrees of freedom (p being the number of parameters in the auxiliary regression), reject the null and conclude that there is significant evidence of heteroscedasticity.

The White test, on the other hand, is more generic: it is able to detect more general forms of heteroscedasticity than the Breusch-Pagan test.

Remedies for Heteroscedasticity
- Check the data.
- Respecify the model: look for missing variables or other appropriate functional forms.
- Robust regression: robust estimation should be considered when there is a strong suspicion of heteroscedasticity.
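The four steps of White’s test can be sketched as follows (a numpy-only sketch; the simulated model and the hardcoded χ² 5% critical value with 5 degrees of freedom, 11.07, are my own illustration):

```python
import numpy as np

def white_lm(y, x2, x3):
    """LM = n*R^2 statistic of White's test for a model with regressors x2, x3."""
    n = len(y)
    # Step 1: estimate by OLS and obtain the residuals
    X = np.column_stack([np.ones(n), x2, x3])
    u = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
    # Step 2: auxiliary regression of u^2 on levels, squares, and cross product
    Z = np.column_stack([np.ones(n), x2, x3, x2**2, x3**2, x2 * x3])
    u2 = u**2
    fit = Z @ np.linalg.lstsq(Z, u2, rcond=None)[0]
    r2 = 1 - ((u2 - fit) ** 2).sum() / ((u2 - u2.mean()) ** 2).sum()
    # Step 3: LM = n * R^2 from the auxiliary regression
    return n * r2

rng = np.random.default_rng(3)
n = 500
x2, x3 = rng.normal(size=n), rng.normal(size=n)
# error standard deviation depends on x2, so the errors are heteroscedastic
y = 1 + 0.5 * x2 - 0.3 * x3 + rng.normal(size=n) * (0.5 + np.abs(x2))
# Step 4: compare to the chi-square 5% critical value with 5 df
print(white_lm(y, x2, x3) > 11.07)
```

Because the auxiliary regression here has six parameters, the test uses 5 degrees of freedom; with heteroscedastic errors of this form the LM statistic comfortably exceeds the critical value.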