Chapter 6

MULTICOLLINEARITY
6.1 Introduction
A basic assumption in the multiple linear regression model is that the rank of the matrix of observations on the explanatory variables equals the number of explanatory variables; in other words, this matrix has full column rank. This in turn implies that the explanatory variables are linearly independent, i.e., there is no linear relationship among them. In that case the explanatory variables are said to be orthogonal.
In many practical situations, however, the explanatory variables fail to be independent for various reasons. The situation in which the explanatory variables are highly intercorrelated is referred to as multicollinearity. The problem of multicollinearity exists when two or more regressor variables are strongly correlated or nearly linearly dependent.
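The full-column-rank assumption can be checked numerically. The following is a minimal sketch with made-up data (the variables x1, x2, x3 are illustrative, not from the chapter): a third regressor constructed as an exact linear combination of the first two makes the observation matrix rank-deficient.

```python
import numpy as np

# Two independent regressors plus a third that is an exact
# linear combination of them: x3 = 2*x1 - x2.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0])
x3 = 2 * x1 - x2

X = np.column_stack([x1, x2, x3])
print(np.linalg.matrix_rank(X))  # 2, not 3: X is not of full column rank
```

With an exact dependence the rank drops below the number of columns and X'X is singular; with a near dependence the rank is technically full but X'X is nearly singular, which is the multicollinearity situation discussed below.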
Example 6.1:
Can we use the following data set to fit the model?
Among the consequences of multicollinearity:
(b) Multicollinearity tends to produce least squares estimates that are too far from the true parameter values.
(c) A model coefficient may carry a negative sign when a positive sign is expected.
(d) The global F-test may be highly significant even though none of the regressors is significant in the partial F-tests.
 1    8    6
 4    2    8
 9   -8    1
11  -10    0
 3    6    5
 8   -6    3
 5    0    2
10  -12   -4
 2    4   10
 7   -2   -3
 6   -4    5
(a) Examine the pairwise correlations between the regressors: a large correlation between a pair of regressors indicates a multicollinearity problem. Note, however, that pairwise correlations cannot detect every case: when a near-linear dependence involves more than two regressors, it can happen that none of the pairwise correlations is suspiciously large, so we get no indication of the near-linear dependence.
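To make the pairwise check concrete for the data of Example 6.1, we can compute the correlation matrix of the three columns of the table. (Which column plays the role of the response is not stated above, so treating every column as a variable here is purely illustrative.)

```python
import numpy as np

# Columns of the Example 6.1 table, in the order printed above.
c1 = np.array([1, 4, 9, 11, 3, 8, 5, 10, 2, 7, 6], dtype=float)
c2 = np.array([8, 2, -8, -10, 6, -6, 0, -12, 4, -2, -4], dtype=float)
c3 = np.array([6, 8, 1, 0, 5, 3, 2, -4, 10, -3, 5], dtype=float)

R = np.corrcoef(np.column_stack([c1, c2, c3]), rowvar=False)
print(np.round(R, 3))
# The correlation between the first two columns is about -0.973,
# a strong pairwise indication of a near-linear dependence.
```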
(b) Multicollinearity can also be detected from the eigenvalues of the X'X matrix in correlation form.
For a model with k regressors, there will be k eigenvalues λ_1, λ_2, ..., λ_k. If there are one or more near-linear dependences in the data, then one or more of the eigenvalues will be small.
Define the condition number of X'X as
κ = λ_max / λ_min.
As a general rule:
If κ < 100, there is no serious problem with multicollinearity.
If 100 ≤ κ ≤ 1000, there is moderate to strong multicollinearity.
If κ > 1000, there is a severe problem with multicollinearity.
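The rule above can be sketched numerically. The data here are synthetic (x1, x2, x3 and the noise scale are assumptions for illustration): a third regressor is built as a noisy copy of x1 + x2, and the correlation matrix stands in for X'X in correlation form.

```python
import numpy as np

rng = np.random.default_rng(0)
x1 = rng.normal(size=50)
x2 = rng.normal(size=50)
x3 = x1 + x2 + rng.normal(scale=0.01, size=50)  # near-linear dependence

X = np.column_stack([x1, x2, x3])
R = np.corrcoef(X, rowvar=False)        # X'X in correlation form
eigvals = np.linalg.eigvalsh(R)
kappa = eigvals.max() / eigvals.min()   # condition number
print(kappa)  # far above 1000, because x3 is nearly x1 + x2
```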
For the example data, the condition number of the X'X matrix indicates a severe problem with multicollinearity.
The condition indices of X'X are defined as
κ_j = λ_max / λ_j, j = 1, 2, ..., k.
Only one condition index exceeds 1000, so we conclude that there is only one near-linear dependence in the data.
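Counting large condition indices can be sketched the same way. In this synthetic example (variable names and noise scale are assumptions), only x3 is built from the other regressors, so only one index should be large.

```python
import numpy as np

rng = np.random.default_rng(1)
x1 = rng.normal(size=60)
x2 = rng.normal(size=60)
x3 = 2 * x1 - x2 + rng.normal(scale=0.01, size=60)  # one near dependence
x4 = rng.normal(size=60)                            # unrelated regressor

R = np.corrcoef(np.column_stack([x1, x2, x3, x4]), rowvar=False)
lam = np.linalg.eigvalsh(R)
indices = lam.max() / lam                 # condition indices
n_dependences = int(np.sum(indices > 1000))
print(n_dependences)  # 1: exactly one near-linear dependence detected
```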
Eigensystem analysis can also be used to identify the nature of the linear dependencies in the data. Write
X'X = T Λ T',
where Λ = diag(λ_1, λ_2, ..., λ_k) is the diagonal matrix of eigenvalues of X'X and T is an orthogonal matrix whose columns t_1, t_2, ..., t_k are the eigenvectors associated with those eigenvalues. If an eigenvalue λ_j is close to zero, the elements of the associated eigenvector t_j describe the nature of the linear dependence.
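This can be sketched with synthetic data as well (the dependence x3 ≈ x1 − 2·x2 and the noise scale are assumptions for illustration): the eigenvector belonging to the smallest eigenvalue has large entries exactly on the regressors involved in the dependence.

```python
import numpy as np

rng = np.random.default_rng(2)
x1 = rng.normal(size=80)
x2 = rng.normal(size=80)
x3 = x1 - 2 * x2 + rng.normal(scale=0.01, size=80)

X = np.column_stack([x1, x2, x3])
R = np.corrcoef(X, rowvar=False)
lam, T = np.linalg.eigh(R)       # columns of T are the eigenvectors
t_min = T[:, np.argmin(lam)]     # eigenvector of the smallest eigenvalue
print(np.round(t_min, 3))
# All three entries are sizable: x1, x2 and x3 all take part
# in the near-linear dependence x3 ~ x1 - 2*x2.
```

Note that because R is in correlation form, the eigenvector describes the dependence among the standardized regressors, so its entries are not the raw coefficients 1, −2, 1.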
The variance inflation factor of the j-th regressor,
VIF_j = 1 / (1 − R_j²),
where R_j² is the coefficient of determination obtained when x_j is regressed on the remaining regressors, can be viewed as the factor by which the variance of the estimated coefficient of x_j is increased due to the linear dependence among the regressors.
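The VIFs are conveniently obtained as the diagonal elements of the inverse correlation matrix of the regressors, since (R⁻¹)_jj = 1/(1 − R_j²). A minimal sketch with synthetic data (the regressors and the noise scale are assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)
x1 = rng.normal(size=100)
x2 = rng.normal(size=100)
x3 = x1 + x2 + rng.normal(scale=0.1, size=100)  # collinear regressor

X = np.column_stack([x1, x2, x3])
R = np.corrcoef(X, rowvar=False)
vif = np.diag(np.linalg.inv(R))  # VIF_j = 1 / (1 - R_j^2)
print(np.round(vif, 1))
# VIFs well above 10 are commonly taken to flag harmful multicollinearity.
```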