
4/10/2021

1.6 Violation of Classical Assumptions

1. Multicollinearity
2. Heteroscedasticity

1.6.1 Multicollinearity

1. Nature of Multicollinearity
2. Causes of Multicollinearity
3. Consequences of Multicollinearity
4. Detecting Multicollinearity
5. Remedial Measures for Multicollinearity

Nature of Multicollinearity

 Assumption 10 of the CLRM requires that there are no exact linear relationships among the sample values of the explanatory variables (the Xs).
 When the explanatory variables are very highly correlated with each other (correlation coefficients very close to 1 or to -1), the problem of multicollinearity occurs.
 When one variable can be expressed as an exact linear function of one or more, or even all, of the other variables, it is called perfect multicollinearity (e.g. X2 = 2X1).
 Imperfect multicollinearity (or near multicollinearity) exists when the explanatory variables in an equation are correlated, but this correlation is less than perfect.

As a numerical example, consider the following hypothetical data:

X2    X3    X*3
10    50    52
15    75    75
18    90    97
24    120   129
30    150   152

It is apparent that X3i = 5X2i. Therefore, there is perfect collinearity between X2 and X3, since the coefficient of correlation r23 is unity. The variable X*3 was created from X3 by simply adding to it the following numbers, taken from a table of random numbers: 2, 0, 7, 9, 2. Now there is no longer perfect collinearity between X2 and X*3 (X*3i = 5X2i + vi). However, the two variables remain highly correlated: calculations show that the coefficient of correlation between them is 0.9959.
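The 0.9959 figure can be reproduced directly. Below is a minimal sketch in plain Python (the function name is our own, not from any library) that computes the Pearson correlation for the table above:

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient, computed from first principles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sx = sqrt(sum((xi - mx) ** 2 for xi in x))
    sy = sqrt(sum((yi - my) ** 2 for yi in y))
    return sxy / (sx * sy)

X2 = [10, 15, 18, 24, 30]
X3 = [5 * x for x in X2]                                   # X3i = 5*X2i
X3_star = [x3 + v for x3, v in zip(X3, [2, 0, 7, 9, 2])]   # X*3i = 5*X2i + vi

print(round(pearson_r(X2, X3), 4))       # 1.0: perfect collinearity
print(round(pearson_r(X2, X3_star), 4))  # 0.9959: near multicollinearity
```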


Causes of Multicollinearity

1. The data collection method employed, for example, sampling over a limited range of the values taken by the regressors in the population.
2. Constraints on the model or in the population being sampled. For example, in the regression of electricity consumption on income (X2) and house size (X3), high X2 generally goes together with high X3.
3. Model specification, for example, adding polynomial terms to a regression model, especially when the range of the X variable is small.
4. An overdetermined model. This happens when the model has more explanatory variables than the number of observations.
5. An additional reason for multicollinearity, especially in time series data, may be that the regressors included in the model share a common trend, that is, they all increase or decrease over time.

Consequences of Multicollinearity

1. The OLS estimators have large variances and covariances, making precise estimation difficult.
2. Because of consequence 1, the t ratios of one or more coefficients tend to be statistically insignificant.
3. Although the t ratios of one or more coefficients are statistically insignificant, R2 can be very high.
4. The OLS estimators and their standard errors can be sensitive to small changes in the data.
5. Under perfect multicollinearity, the OLS estimators simply do not exist.
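Consequence 5 can be made concrete: under perfect multicollinearity the cross-product matrix X'X is singular, so its inverse, and with it the OLS estimator, does not exist. A quick numeric check using the X3 = 5X2 data from the earlier slide:

```python
def det2(a, b, c, d):
    """Determinant of the 2x2 matrix [[a, b], [c, d]]."""
    return a * d - b * c

X2 = [10, 15, 18, 24, 30]
X3 = [5 * x for x in X2]   # exact linear dependence

# Entries of the (uncentered) cross-product matrix X'X
s22 = sum(x * x for x in X2)
s33 = sum(x * x for x in X3)
s23 = sum(a * b for a, b in zip(X2, X3))

print(det2(s22, s23, s23, s33))  # 0: X'X is singular, OLS breaks down
```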

DETECTION OF MULTICOLLINEARITY

1. High R2 but few significant t ratios. If R2 is high, say, in excess of 0.8, the F test in most cases will reject the hypothesis that the partial slope coefficients are simultaneously equal to zero, but the individual t tests will show that none or very few of the partial slope coefficients are statistically different from zero.
2. High pair-wise correlations among regressors. Another suggested rule of thumb is that if the pair-wise correlation coefficient between two regressors is high, say, in excess of 0.8, then multicollinearity is a serious problem.
3. The Variance Inflation Factor, VIF = 1/(1 - R2j). VIF values that exceed 10 are generally viewed as evidence of problematic multicollinearity, which happens when R2j > 0.9.

REMEDIAL MEASURES

 Drop one of the collinear variables.
 Transform the highly correlated variables.
 Collect more data.
 Combine cross-sectional and time series data (pool the data).
 Do nothing.
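The equivalence between the VIF > 10 cutoff and R2j > 0.9 can be checked numerically; the helper name below is our own, not a library function:

```python
def vif(r_squared_j):
    """Variance Inflation Factor: VIF = 1 / (1 - Rj^2), where Rj^2 is the
    R^2 from regressing Xj on the remaining regressors."""
    return 1.0 / (1.0 - r_squared_j)

# The rule-of-thumb cutoff VIF > 10 corresponds exactly to Rj^2 > 0.9:
print(round(vif(0.9), 1))          # 10.0, the borderline case
print(round(vif(0.9959 ** 2), 1))  # the X2 / X*3 example: far beyond 10
```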


1.6.2 Heteroscedasticity: Nature, Causes, Effects, Detection and Remedy

 Homoscedasticity is the assumption of equal variance, Var(ui) = σ2 for all i, which means "equal scatter" of the error terms ui around their mean 0.
 Ordinary least squares assumes that all observations are equally reliable.
 Heteroscedasticity is a systematic pattern in the errors where the variances of the errors are not constant.

Regression Model

Yi = β1 + β2Xi + Ui

Homoscedasticity: Var(Ui) = σ2, or E(Ui2) = σ2
Heteroscedasticity: Var(Ui) = σi2, or E(Ui2) = σi2

[Figure: homoscedastic pattern of errors. Scatter of consumption (Yi) against income (Xi), with constant spread around the regression line.]
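An illustrative simulation of the two error patterns (the numbers are made up for this sketch, not taken from the slides): homoscedastic errors keep the same spread everywhere, while the heteroscedastic errors below have a standard deviation proportional to income.

```python
import random

random.seed(42)

x = [i / 10 for i in range(1, 101)]               # "incomes" from 0.1 to 10.0
homo = [random.gauss(0, 1.0) for _ in x]          # Var(Ui) = sigma^2 for all i
hetero = [random.gauss(0, 0.5 * xi) for xi in x]  # Var(Ui) = sigma_i^2 grows with X

def sample_var(v):
    """Unbiased sample variance."""
    m = sum(v) / len(v)
    return sum((e - m) ** 2 for e in v) / (len(v) - 1)

# Spread in the top half of incomes relative to the bottom half:
print(round(sample_var(hetero[50:]) / sample_var(hetero[:50]), 1))  # far above 1
print(round(sample_var(homo[50:]) / sample_var(homo[:50]), 1))      # near 1
```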


[Figure: heteroscedastic pattern of errors. Scatter of consumption (Yi) against income (Xi), with the spread of the errors widening as income increases.]

Causes of Heteroscedasticity

1. Learning effects: as people learn, their errors of behavior become smaller over time, so σ2i is expected to decrease. For example, as the number of hours of typing practice increases, the average number of typing errors as well as their variance decreases.
2. As incomes grow, people have more discretionary income and hence more scope for choice about the disposition of their income, so σ2i is likely to increase with income. Similarly, companies with larger profits are generally expected to show greater variability in their dividend policies than companies with lower profits.

3. As data collecting techniques improve, σ2i is likely to decrease. Thus, banks that have sophisticated data processing equipment are likely to commit fewer errors in the monthly or quarterly statements of their customers than banks without such facilities.
4. Heteroscedasticity can also arise as a result of the presence of outliers, either very small or very large in relation to the other observations in the sample.
5. Incorrect functional form: another source of heteroscedasticity arises from violating Assumption 9 of the CLRM, namely, that the regression model is correctly specified. Very often what looks like heteroscedasticity may be due to the fact that some important variables are omitted from the model; if the omitted variables are included, that impression may disappear.
6. Skewness in the distribution of one or more regressors included in the model. Examples are economic variables such as income, wealth, and education. It is well known that the distribution of income and wealth in most societies is uneven, with the bulk of the income and wealth being owned by a few at the top.
7. Other sources of heteroscedasticity: as David Hendry notes, heteroscedasticity can also arise because of incorrect data transformation (e.g., ratio or first difference transformations) and incorrect functional form (e.g., linear versus log-linear models).


Consequences of Heteroscedasticity

1. The ordinary least squares estimators are not efficient.
2. The usual formulas give incorrect standard errors for least squares.
3. Confidence intervals and hypothesis tests based on the usual standard errors are wrong.

1. The Breusch-Pagan LM Test

The Breusch-Pagan test is designed to detect only linear forms of heteroscedasticity.

Step 1: Estimate the model by OLS and obtain the residuals.
Step 2: Run the following auxiliary regression: ûi² = a1 + a2X2i + a3X3i + vi
Step 3: Compute LM = nR2, where n and R2 are from the auxiliary regression.
Step 4: If the LM statistic exceeds the χ2 critical value with p-1 degrees of freedom, reject the null and conclude that there is significant evidence of heteroscedasticity.
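The four steps can be sketched for a single-regressor model using only the standard library. The data and function names below are illustrative assumptions, not from the slides; the errors are constructed so their magnitude grows with X.

```python
def ols_simple(x, y):
    """OLS intercept and slope for y = b1 + b2*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b2 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
         sum((xi - mx) ** 2 for xi in x)
    return my - b2 * mx, b2

def r_squared(x, y):
    """R^2 of the simple regression of y on x."""
    b1, b2 = ols_simple(x, y)
    my = sum(y) / len(y)
    ss_res = sum((yi - b1 - b2 * xi) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

def breusch_pagan_lm(x, y):
    # Step 1: estimate the model by OLS and obtain the residuals
    b1, b2 = ols_simple(x, y)
    u2 = [(yi - b1 - b2 * xi) ** 2 for xi, yi in zip(x, y)]
    # Steps 2-3: auxiliary regression of squared residuals on x; LM = n * R^2
    return len(x) * r_squared(x, u2)

# Errors whose magnitude grows with x: heteroscedastic by construction
x = list(range(1, 21))
y = [2 + 3 * xi + 0.5 * xi * (-1) ** xi for xi in x]

lm = breusch_pagan_lm(x, y)
print(round(lm, 2))  # Step 4: compare with the chi-square critical value (3.84 at 5%, 1 df)
```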

2. White's Test

Step 1: Estimate the model by OLS and obtain the residuals.
Step 2: Run the following auxiliary regression: ût² = a1 + a2X2t + a3X3t + a4X2t² + a5X3t² + a6X2tX3t + vt
Step 3: Compute LM = nR2, where n and R2 are from the auxiliary regression.
Step 4: If the LM statistic exceeds the χ2 critical value with p-1 degrees of freedom, reject the null and conclude that there is significant evidence of heteroscedasticity.

 The White test, on the other hand, is more generic: it is able to detect more general forms of heteroscedasticity than the Breusch-Pagan test.

Remedy

 Check the data.
 Respecify the model: look for other missing variables or other appropriate functional forms.
 Robust regression: robust estimation should be considered when there is a strong suspicion of heteroscedasticity.
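The only new ingredient relative to the Breusch-Pagan test is the expanded auxiliary regressor set of Step 2: levels, squares, and cross products. A sketch of its construction (the helper name is our own):

```python
def white_aux_terms(x2, x3):
    """Regressors for White's auxiliary regression, given two explanatory
    variables: levels, squares, and the cross product (plus a constant)."""
    return [
        x2,                                # a2 * X2t
        x3,                                # a3 * X3t
        [v ** 2 for v in x2],              # a4 * X2t^2
        [v ** 2 for v in x3],              # a5 * X3t^2
        [a * b for a, b in zip(x2, x3)],   # a6 * X2t * X3t
    ]

terms = white_aux_terms([1, 2, 3], [4, 5, 6])
print(len(terms))  # 5 regressors besides the constant
```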
