Possible Indirect Measures for Alleviating Multicollinearity
\[
\sigma_{\hat{\beta}_2}^2 = \frac{\sigma_u^2}{\sum (X_{2i} - \bar{X}_2)^2} \times \frac{1}{1 - r_{X_2,X_3}^2} = \frac{\sigma_u^2}{n\,\mathrm{MSD}(X_2)} \times \frac{1}{1 - r_{X_2,X_3}^2}
\]
In this sequence, we look at four possible indirect methods for alleviating a problem of
multicollinearity.
First, if the correlated variables are similar conceptually, it may be reasonable to combine
them into some overall index.
. reg S ASVABC SM SF
----------------------------------------------------------------------------
    Source |          SS    df          MS         Number of obs =       500
-----------+------------------------------         F(  3,   496) =     81.06
     Model |   1235.0519     3  411.683966         Prob > F      =    0.0000
  Residual |   2518.9701   496  5.07856875         R-squared     =    0.3290
-----------+------------------------------         Adj R-squared =    0.3249
     Total |    3754.022   499  7.52309018         Root MSE      =    2.2536
----------------------------------------------------------------------------
         S |      Coef.  Std. Err.      t     P>|t|     [95% Conf. Interval]
-----------+----------------------------------------------------------------
    ASVABC |   1.242527    .123587    10.05   0.000      .999708    1.485345
        SM |    .091353   .0459299     1.99   0.047     .0011119    .1815941
        SF |   .2028911   .0425117     4.77   0.000     .1193658    .2864163
     _cons |   10.59674   .6142778    17.25   0.000     9.389834    11.80365
----------------------------------------------------------------------------
That is precisely what has been done with the three cognitive ASVAB variables. ASVABC
has been calculated as a weighted average of scores on subtests: ASVABAR (arithmetic
reasoning), ASVABWK (word knowledge), and ASVABPC (paragraph comprehension).
The three components are highly correlated and by combining them as a weighted average,
rather than using them individually, one avoids a potential problem of multicollinearity.
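As a rough illustration of combining correlated variables into an index, the sketch below builds a weighted average from three simulated subtest scores. The data are simulated stand-ins, not the actual ASVAB scores, and the weights are hypothetical, since the weights used for ASVABC are not stated here.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500

# Simulated stand-ins for three highly correlated subtest scores
# (NOT the real ASVAB data): a common ability factor plus noise.
ability = rng.normal(0, 1, n)
ar = ability + rng.normal(0, 0.5, n)   # arithmetic reasoning
wk = ability + rng.normal(0, 0.5, n)   # word knowledge
pc = ability + rng.normal(0, 0.5, n)   # paragraph comprehension


def standardize(x):
    """Rescale a variable to mean 0 and standard deviation 1."""
    return (x - x.mean()) / x.std()


scores = np.column_stack([standardize(ar), standardize(wk), standardize(pc)])

# Hypothetical weights, chosen only for illustration.
weights = np.array([0.5, 0.25, 0.25])
composite = scores @ weights

# Pairwise correlations among the three components.
corr = np.corrcoef(scores, rowvar=False)
print(np.round(corr, 2))
```

Using `composite` as a single regressor in place of the three components means their mutual correlation no longer enters the regression.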
Dropping some of the correlated variables, if they have insignificant coefficients, may
alleviate multicollinearity.
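However, if the dropped variables truly belong in the model and their coefficients are
insignificant only because of multicollinearity, their omission may cause omitted variable
bias, to be discussed in Chapter 6.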
\[
Y = \beta_1 + \beta_2 X + \beta_3 P + u
\]
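A third possible measure is to use extraneous information, if available, about the
coefficient of one of the variables. For example, suppose that Y in the equation above is
the demand for a category of consumer expenditure, X is aggregate disposable personal
income, and P is a price index for the category.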
To fit a model of this type you would use time series data. If X and P are highly correlated,
which is often the case with time series variables, the problem of multicollinearity might be
eliminated in the following way.
Obtain data on income and expenditure on the category from a household survey and
regress Y' on X'. (The ' marks are to indicate that the data are household data, not
aggregate data.)
This is a simple regression because there will be relatively little variation in the price paid
by the households.
\[
Z = Y - \hat{\beta}_2' X = \beta_1 + \beta_3 P + u
\]
Now substitute the household estimate β̂2' for β2 in the time series model. Subtract β̂2'X
from both sides, and regress Z = Y − β̂2'X on price. This is a simple regression, so
multicollinearity has been eliminated.
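The two-step procedure described above can be sketched as follows. This is a minimal illustration on simulated data, not the author's implementation: the parameter values, sample sizes, and variable names (`X_hh`, `Y_hh`, `X_ts`, `P_ts`, `Y_ts`) are assumptions made for the example, and the sketch assumes that the same β2 applies to households and to the aggregate data.

```python
import numpy as np


def ols(y, X):
    """Return OLS coefficients for y on X (X must already include a constant column)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


rng = np.random.default_rng(0)
beta1, beta2, beta3 = 5.0, 0.6, -0.8   # hypothetical "true" parameters

# Step 1: household cross-section. Price barely varies across households,
# so expenditure Y' is regressed on income X' alone.
n_hh = 2000
X_hh = rng.uniform(10, 50, n_hh)
Y_hh = 2.0 + beta2 * X_hh + rng.normal(0, 1, n_hh)
b2_hh = ols(Y_hh, np.column_stack([np.ones(n_hh), X_hh]))[1]

# Step 2: aggregate time series in which X and P share a trend and are
# therefore highly correlated.
T = 40
t = np.arange(T)
X_ts = 100 + 2.0 * t + rng.normal(0, 1, T)
P_ts = 50 + 1.0 * t + rng.normal(0, 1, T)
Y_ts = beta1 + beta2 * X_ts + beta3 * P_ts + rng.normal(0, 1, T)

# Subtract the estimated income component and regress Z on price only.
Z = Y_ts - b2_hh * X_ts
b1_hat, b3_hat = ols(Z, np.column_stack([np.ones(T), P_ts]))

corr_xp = np.corrcoef(X_ts, P_ts)[0, 1]
print(f"corr(X, P) in the time series: {corr_xp:.3f}")
print(f"household estimate of beta2:   {b2_hh:.3f}")
print(f"estimate of beta3 from Z on P: {b3_hat:.3f}")
```

Because the second regression has a single explanatory variable, the correlation between X and P no longer affects the precision of the price coefficient.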
There are some problems with this technique. First, the β2 coefficients may be conceptually
different in time series and cross-section contexts.
Second, since we subtract the estimated income component β̂2'X, not the true income
component β2X, from Y when constructing Z, we have introduced an element of
measurement error in the dependent variable.
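The source of this measurement error can be made explicit. Substituting the time series model for Y in the definition of Z, and assuming that the same β2 applies in both contexts, a short derivation gives:

```latex
\[
Z = Y - \hat{\beta}_2' X
  = \beta_1 + \beta_2 X + \beta_3 P + u - \hat{\beta}_2' X
  = \beta_1 + \beta_3 P + u + (\beta_2 - \hat{\beta}_2') X
\]
```

The last term vanishes only if β̂2' happens to equal β2. Otherwise it is absorbed into the disturbance term of the regression of Z on P.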
The last, but by no means least, indirect method for alleviating multicollinearity is the use of
a theoretical restriction, which is defined as a hypothetical relationship among the
parameters of a regression model.
\[
S = \beta_1 + \beta_2 ASVABC + \beta_3 SM + \beta_4 SF + u
\]
According to the regression of S on ASVABC, SM, and SF, S is estimated to increase by
0.09 years for every extra year of schooling of the mother and by 0.20 years for every
extra year of schooling of the father.
Mother's education is generally held to be at least as important as, if not more important
than, father's education for educational attainment, so this outcome is unexpected.
. cor SM SF
(obs=500)
        |       SM       SF
--------+------------------
     SM |   1.0000
     SF |   0.5312   1.0000
However, assortative mating leads to correlation between SM and SF, and the regression may
be suffering from multicollinearity. This could lead to erratic estimates of the coefficients.
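To get a feel for the size of the effect, the factor 1 / (1 − r²) from the variance expression can be evaluated at the reported correlation of 0.5312. This is only a rough sketch: that expression is for a model with two explanatory variables, whereas this regression also includes ASVABC, so the figures are approximations.

```python
import math

r = 0.5312  # sample correlation between SM and SF reported above

# Two-regressor variance inflation factor: 1 / (1 - r^2)
vif = 1.0 / (1.0 - r ** 2)

# Standard errors scale with the square root of the variance factor.
se_inflation = math.sqrt(vif)

print(f"variance inflation factor: {vif:.3f}")
print(f"standard error inflation:  {se_inflation:.3f}")
```

On this approximation the variances are inflated by a factor of about 1.39, and the standard errors by about 1.18, relative to the case of uncorrelated SM and SF.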
Suppose that we hypothesize that mother's and father's education are equally important.
We can then impose the restriction β3 = β4.
\[
\begin{aligned}
S &= \beta_1 + \beta_2 ASVABC + \beta_3 (SM + SF) + u \\
  &= \beta_1 + \beta_2 ASVABC + \beta_3 SP + u
\end{aligned}
\]
Defining SP to be the sum of SM and SF, the equation may be rewritten as shown. The
problem caused by the correlation between SM and SF has been eliminated.
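The effect of imposing the restriction can be sketched on simulated data. The data below are not the data set used in the text: the parameter values, the correlation of 0.5 between the parental schooling variables, and the helper `ols_with_se` are all assumptions made for the illustration, and the restriction β3 = β4 holds by construction.

```python
import numpy as np


def ols_with_se(y, X):
    """OLS coefficients and conventional standard errors; X includes the constant."""
    n, k = X.shape
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    sigma2 = resid @ resid / (n - k)          # estimate of the disturbance variance
    cov = sigma2 * np.linalg.inv(X.T @ X)     # estimated covariance matrix of coef
    return coef, np.sqrt(np.diag(cov))


rng = np.random.default_rng(42)
n = 500

# Simulated parental schooling with correlation 0.5 (hypothetical values).
cov_parents = 6.25 * np.array([[1.0, 0.5], [0.5, 1.0]])
SM, SF = rng.multivariate_normal([12.0, 12.0], cov_parents, size=n).T
ASVABC = rng.normal(0, 1, n)
S = 10.5 + 1.2 * ASVABC + 0.15 * SM + 0.15 * SF + rng.normal(0, 2.25, n)

const = np.ones(n)

# Unrestricted regression: S on ASVABC, SM, SF.
b_u, se_u = ols_with_se(S, np.column_stack([const, ASVABC, SM, SF]))

# Restricted regression: S on ASVABC and SP = SM + SF.
SP = SM + SF
b_r, se_r = ols_with_se(S, np.column_stack([const, ASVABC, SP]))

print("unrestricted s.e. of SM, SF:", np.round(se_u[2:], 4))
print("restricted s.e. of SP:      ", np.round(se_r[2], 4))
```

When the restriction is true, as it is in this simulation, the standard error of the SP coefficient should be smaller than those of the separate SM and SF coefficients.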
. g SP=SM+SF
. reg S ASVABC SP
----------------------------------------------------------------------------
    Source |          SS    df          MS         Number of obs =       500
-----------+------------------------------         F(  2,   497) =    120.22
     Model |  1223.98508     2  611.992542         Prob > F      =    0.0000
  Residual |  2530.03692   497  5.09061754         R-squared     =    0.3260
-----------+------------------------------         Adj R-squared =    0.3233
     Total |    3754.022   499  7.52309018         Root MSE      =    2.2562
----------------------------------------------------------------------------
         S |      Coef.  Std. Err.      t     P>|t|     [95% Conf. Interval]
-----------+----------------------------------------------------------------
    ASVABC |   1.243199   .1237327    10.05   0.000     1.000095    1.486303
        SP |   .1500751   .0229866     6.53   0.000     .1049123    .1952379
     _cons |   10.50285      .6117    17.17   0.000     9.301009    11.70468
----------------------------------------------------------------------------
The standard error of SP (0.023) is much smaller than those of SM (0.046) and SF (0.043).
The use of the restriction has led to a large gain in efficiency and the problem of
multicollinearity has been eliminated.
The t statistic for SP, 6.53, is very high. Thus it would appear that imposing the
restriction has improved the regression results. However, the restriction may not be
valid, so we should test it. Testing theoretical restrictions is one of the topics in
Chapter 6.
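One common way to test a single linear restriction is an F test that compares the residual sums of squares of the restricted and unrestricted regressions. The sketch below applies it to the figures in the two regression outputs; whether Chapter 6 uses exactly this form of the test is an assumption.

```python
rss_u = 2518.9701      # residual sum of squares, regression on ASVABC, SM, SF
rss_r = 2530.03692     # residual sum of squares, regression on ASVABC, SP
n, k_u, q = 500, 4, 1  # observations, parameters in unrestricted model, restrictions

# F = [(RSS_R - RSS_U) / q] / [RSS_U / (n - k_u)]
f_stat = ((rss_r - rss_u) / q) / (rss_u / (n - k_u))

print(f"F({q}, {n - k_u}) = {f_stat:.2f}")
```

The statistic is about 2.18, which is below the 5 percent critical value of F(1, 496), roughly 3.86, so on this calculation the restriction would not be rejected.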
Copyright Christopher Dougherty 2016.
Individuals studying econometrics on their own who feel that they might benefit
from participation in a formal course should consider the London School of
Economics summer school course
EC212 Introduction to Econometrics
http://www2.lse.ac.uk/study/summerSchools/summerSchool/Home.aspx
or the University of London International Programmes distance learning course
EC2020 Elements of Econometrics
www.londoninternational.ac.uk/lse.
2016.04.30