Topic 9 Heteroscedasticity
HETEROSCEDASTICITY
DEFINITION OF HETEROSCEDASTICITY
EXAMPLES OF HET MODELS
TESTS FOR HETEROSCEDASTICITY
REMEDIAL METHODS: WEIGHTED AND LOGARITHMIC REGRESSIONS
[Figure: the line $Y = \beta_1 + \beta_2 X$, with five observations at X1, ..., X5 marked on the X axis]
This sequence introduces the topic of heteroscedasticity, which relates to the distribution
of the disturbance term in a regression model.
We will discuss it in the context of the regression model $Y = \beta_1 + \beta_2 X + u$. To keep the
diagram uncluttered, we will suppose that we have a sample of only five observations, the X
values of which are as shown.
If there were no disturbance term in the model, the observations would lie on the line as
shown.
Now we take account of the effect of the disturbance term. It will displace each observation
in the vertical dimension, since it modifies the value of Y without affecting X.
Three assumptions are made about the distribution of the disturbance term in each observation.
One is that the expected value of u in each observation is 0, that is, $E(u) = 0$. The second is
that the distribution in each observation is normal. We are not concerned with either of
these and we will assume them to be true.
The third is that the distribution is the same for each observation: $\mathrm{Var}(u_t) = \sigma^2$.
In the present case, that means that the normal distributions shown all have the same variance.
In other words, the variance of $u_t$ does not depend on X ($\sigma^2$ is not a function of $X_t$).
Each observation is then potentially (before the sample is drawn) an equally reliable guide
to the location of the line $Y = \beta_1 + \beta_2 X$.
Once the sample has been drawn, some observations will lie closer to the line than others,
but we have no way of anticipating in advance which ones these will be.
[Figure: the line $Y = \beta_1 + \beta_2 X$, with the distribution of u at X1, ..., X5 no longer having the same variance]
Now consider the situation illustrated by the diagram above. The distribution of u
associated with each observation still has expected value 0 and is normal. However, the
variance is no longer constant.
Obviously, observations where u has low variance, like that for X1, will tend to be better
guides to the underlying relationship than those like that for X5, where it has a relatively
high variance.
When the distribution is not the same for each observation, the disturbance term is said to
be subject to heteroscedasticity.
Examples of het models
Example 1. Consumption-expenditure model (the disturbance variance increases monotonically with income):
$CON_t = \beta_1 + \beta_2 INC_t + u_t$
It is likely that $\mathrm{Var}(u_t) > \mathrm{Var}(u_s)$ if $INC_t > INC_s$.
Example 3. Average hourly earnings vs years of education
(Data source: Current Population Survey)
CONSEQUENCES OF HETEROSCEDASTICITY
The OLS estimator is still linear and unbiased. There are two major consequences of
heteroscedasticity. One is that the standard errors of the regression coefficients are
estimated wrongly and the t tests (and F test) are invalid.
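Example:
For the simple regression model $Y_t = \beta_1 + \beta_2 X_t + u_t$:
If there is HET, where $\mathrm{var}(u_t) = \sigma_t^2$, then the usual formula for the variance
(and hence the standard error) of the OLS estimator,
$$\mathrm{var}(b_2) = \frac{\mathrm{var}(u_t)}{n\,\mathrm{var}(X)} = \frac{\sigma^2}{n\,\mathrm{var}(X)},$$
no longer applies, because there is no single $\sigma^2$ and in general
$$\mathrm{var}(b_2) \neq \frac{\sigma_t^2}{n\,\mathrm{var}(X)}.$$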
The other is that OLS is an inefficient estimation technique. An alternative technique which
gives relatively high weight to the relatively low-variance observations should tend to yield
more accurate estimates.
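The efficiency claim can be checked numerically. The sketch below is only an illustration under stated assumptions: five hypothetical X values, and a disturbance standard deviation assumed to be exactly proportional to X and known. It compares the exact variance of the OLS slope with that of a slope estimator that weights observation t by $1/\sigma_t^2$. The function names are invented for this example.

```python
def ols_slope_variance(x, sigma):
    """Exact variance of the OLS slope when observation t has disturbance s.d. sigma[t]."""
    xbar = sum(x) / len(x)
    sst = sum((xi - xbar) ** 2 for xi in x)
    # The OLS slope is beta2 + sum(k_t * u_t), with k_t = (x_t - xbar) / sst
    return sum(((xi - xbar) / sst) ** 2 * si ** 2 for xi, si in zip(x, sigma))


def wls_slope_variance(x, sigma):
    """Exact variance of the slope when observation t is weighted by 1 / sigma[t]**2."""
    w = [1.0 / si ** 2 for si in sigma]
    xbar_w = sum(wi * xi for wi, xi in zip(w, x)) / sum(w)
    return 1.0 / sum(wi * (xi - xbar_w) ** 2 for wi, xi in zip(w, x))


x = [1.0, 2.0, 3.0, 4.0, 5.0]          # hypothetical X values
sigma = [0.5 * xi for xi in x]         # assumption: s.d. proportional to X
v_ols = ols_slope_variance(x, sigma)   # 0.31 for these numbers
v_wls = wls_slope_variance(x, sigma)   # smaller than v_ols
```

With equal standard deviations in every observation the two functions return the same value, which is the homoscedastic case where OLS is already efficient.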
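Still in the example of a simple regression model, if there is no HET,
then the variance of $b_2$ is
$$\mathrm{var}(b_2) = \sigma^2 \sum k_t^2, \qquad \text{where } \sum k_t^2 = \frac{1}{\sum (X_t - \bar{X})^2}.$$
If there is HET in the disturbance terms $u_t$, let $b_2^*$ denote the estimator of the
parameter $\beta_2$ and suppose that
$$\mathrm{var}(u_t) = \sigma_t^2 = \lambda_t \sigma^2 \qquad (\lambda_t > 0,\ t = 1, 2, \ldots, n).$$
Then
$$\mathrm{var}(b_2^*) = \sum k_t^2 \sigma_t^2 = \sigma^2 \sum k_t^2 \lambda_t = \sigma^2 \sum k_t^2 \cdot \frac{\sum k_t^2 \lambda_t}{\sum k_t^2} = \mathrm{var}(b_2)\,\frac{\sum k_t^2 \lambda_t}{\sum k_t^2}.$$
Then if
$$\frac{\sum k_t^2 \lambda_t}{\sum k_t^2} > 1, \qquad \mathrm{var}(b_2^*) > \mathrm{var}(b_2).$$
Therefore, when there is HET, the OLS estimator is no longer efficient.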
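As a rough numerical check of the ratio in this derivation, the sketch below evaluates it for five hypothetical X values. Two things are assumptions of the example rather than part of the slides: the weights are taken to be the usual OLS weights $k_t = (X_t - \bar{X}) / \sum (X_t - \bar{X})^2$, and the variance multipliers are taken to be proportional to $X_t^2$, rescaled so that they average 1.

```python
def variance_ratio(x, lam):
    """Ratio sum(k_t^2 * lam_t) / sum(k_t^2), with k_t the usual OLS weights."""
    xbar = sum(x) / len(x)
    sst = sum((xi - xbar) ** 2 for xi in x)
    k2 = [((xi - xbar) / sst) ** 2 for xi in x]
    return sum(k * l for k, l in zip(k2, lam)) / sum(k2)


x = [1.0, 2.0, 3.0, 4.0, 5.0]                   # hypothetical X values
mean_x2 = sum(xi ** 2 for xi in x) / len(x)     # 11.0
lam = [xi ** 2 / mean_x2 for xi in x]           # assumption: variance proportional to X^2
ratio = variance_ratio(x, lam)                  # 12.4 / 11, about 1.13
```

The ratio exceeds 1 here because the observations furthest from the mean of X, which carry the largest weights, include those with the largest variances.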
Detection of Heteroscedasticity
[Figure: scatter diagram of manufacturing output against GDP]
In the scatter diagram manufacturing output is plotted against GDP, both measured in U.S.
$ millions, for 30 countries for 1997. (Data are from the UNIDO Yearbook. The sample is
restricted to countries with GDP at least $10 billion and GDP per capita at least $2000.)
[Figure: the same scatter diagram, with the observations for the USA and Japan labelled]
The scatter diagram is dominated by the observations for Japan and the USA and it is
difficult to detect any kind of pattern.
[Figure: rescaled scatter diagram of manufacturing output against GDP, with the USA and Japan omitted]
However, if those two countries are dropped and the scatter diagram rescaled, a clear
picture of heteroscedasticity emerges.
[Figure: the rescaled scatter diagram, with South Korea and Mexico labelled]
The reason for the heteroscedasticity is that variations in the size of the manufacturing
sector around the trend relationship increase with the size of GDP.
South Korea and Mexico are both countries with relatively large GDP. The manufacturing
sector is relatively important in South Korea, so its observation is far above the trend line.
The opposite was the case for Mexico, at least in 1997.
[Figure: the rescaled scatter diagram, with Singapore and Greece labelled]
Singapore and Greece are another pair of countries with relatively large and small
manufacturing sectors. However, because the GDP of both countries is small, their
variations from the trend relationship are also small.
GOLDFELD-QUANDT TEST FOR HETEROSCEDASTICITY
The disturbance term in a regression model is said to be homoscedastic if it has the same
potential distribution in all observations. If this condition is not satisfied, it is said to be
heteroscedastic, and clearly the possible types of heteroscedasticity are endless.
However, in one particularly common type the standard deviation of the distribution is
proportional to the size of one of the explanatory variables.
[Figure: the line $Y = \beta_1 + \beta_2 X$, with the standard deviation of the distribution of u increasing in proportion to X]
This type of heteroscedasticity is illustrated in the diagram above. The standard deviation
of the distribution is proportional to X.
The Goldfeld-Quandt test is a test for this type of heteroscedasticity. The sample is divided
into three ranges containing the 3/8 of the observations with the smallest values of the X
variable, the 3/8 of the observations with the largest values, and 1/4 in the middle.
In the present case with 28 observations, the lower, middle, and upper ranges have 11, 6,
and 11 observations, respectively.
You then fit regression lines to the lower and upper ranges of the observations, as shown.
The regression line for the lower range has been buried under the observations. Here it is,
in red.
[Figure: regression lines fitted to the lower and upper ranges of the rescaled scatter diagram; RSS1 = 157,000,000 and RSS2 = 13,518,000,000]
You then compare the residual sum of squares for the two regressions. We will denote
them RSS1 and RSS2 for the lower and upper ranges, respectively.
$$F(n_2, n_1) = \frac{RSS_2 / n_2}{RSS_1 / n_1} = \frac{13{,}518{,}000{,}000 / 9}{157{,}000{,}000 / 9} = 86.1, \qquad F(9, 9)_{\text{crit},\,0.1\%} = 10.1$$
If RSS2 is greater than RSS1, the question is whether it is significantly greater. The test
statistic is the F statistic shown above. n1 and n2 are the numbers of degrees of freedom in
the lower and upper regressions. (Normally n1 and n2 will be the same.)
In the present case we reject the null hypothesis of homoscedasticity at the 0.1% level. We
therefore need to find an alternative to straightforward OLS regression.
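The steps of the test can be sketched in a few lines. This is an illustrative implementation for a simple regression, not the code behind the slides: the function names are invented, the 3/8 split is rounded to the nearest whole observation, and the demonstration data are hypothetical (a disturbance whose size is proportional to X with alternating sign). The arithmetic at the end uses the RSS values and the critical value quoted in the slides.

```python
def rss_simple_ols(x, y):
    """Residual sum of squares from a simple OLS regression of y on x."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b2 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    b1 = ybar - b2 * xbar
    return sum((yi - b1 - b2 * xi) ** 2 for xi, yi in zip(x, y))


def goldfeld_quandt(x, y):
    """Return (F statistic, degrees of freedom per subsample) for a simple regression."""
    pairs = sorted(zip(x, y))          # order the observations by X
    n = len(pairs)
    m = (3 * n + 4) // 8               # 3/8 of n, rounded to the nearest integer
    x_lo, y_lo = zip(*pairs[:m])       # smallest X values
    x_hi, y_hi = zip(*pairs[-m:])      # largest X values
    df = m - 2                         # two parameters estimated in each subsample
    rss1 = rss_simple_ols(x_lo, y_lo)
    rss2 = rss_simple_ols(x_hi, y_hi)
    return (rss2 / df) / (rss1 / df), df


# Arithmetic from the slides: 28 observations, 11 in each outer range, 9 d.f. each
f_slides = (13_518_000_000 / 9) / (157_000_000 / 9)
reject_at_0_1_percent = f_slides > 10.1          # critical value quoted in the slides

# Hypothetical data whose disturbance grows in proportion to X
x_demo = list(range(1, 29))
y_demo = [1 + 2 * xi + xi * (-1) ** xi for xi in x_demo]
f_demo, df_demo = goldfeld_quandt(x_demo, y_demo)
```

For a multiple regression the same logic applies, with the degrees of freedom in each subsample reduced by the number of parameters estimated.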
Incidentally, why was the sample split into three ranges? Why not split it into two halves,
and compare RSS for the regressions using the two halves?
The reason is that, by omitting the central range, you increase the contrast between the
variances of the residuals, and you have a better chance of rejecting the null hypothesis of
homoscedasticity.
However, the larger the omitted central section, the smaller will be the number of degrees
of freedom in the subsample regressions, and this will make it more difficult to reject the
null hypothesis.
Thus there is a trade-off between making the omitted range too large and too small. On the
basis of experimentation, Goldfeld and Quandt recommend omitting about a quarter of the
observations.
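PARK TEST AND GLEJSER TEST FOR HETEROSCEDASTICITY
$$|e_i| = f(X_{ji}) + v_i$$
$$|e_i| = \alpha_1 + \alpha_2 X_i + v_i$$
$$|e_i| = \alpha_1 + \alpha_2 \sqrt{X_i} + v_i$$
$$|e_i| = \alpha_1 + \alpha_2 \frac{1}{X_i} + v_i$$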
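WHITE TEST FOR HETEROSCEDASTICITY (the most commonly used)
Unlike the G-Q test, which requires reordering the
observations with respect to the X variable that
supposedly caused HET, the White test detects
general HET and is easy to implement.
E.g., for the model
$$Y_i = \beta_1 + \beta_2 X_{2i} + \beta_3 X_{3i} + \varepsilon_i$$
regress the squared residuals on the regressors, their squares, and their cross product:
$$e_i^2 = \alpha_1 + \alpha_2 X_{2i} + \alpha_3 X_{3i} + \alpha_4 X_{2i}^2 + \alpha_5 X_{3i}^2 + \alpha_6 X_{2i} X_{3i} + v_i$$
Under the null that there is no HET,
$$n R^2 \sim \chi^2_{df}$$
asymptotically, where $R^2$ is that of the auxiliary regression and df is its number of
regressors excluding the constant (5 in this example).
If the observed chi-square value exceeds the critical chi-square
value at the chosen level of significance, then there is HET.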
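A minimal sketch of the procedure follows, using only the Python standard library. To keep it short it assumes a single regressor, so the auxiliary regression contains X and X squared and the statistic has 2 degrees of freedom (the two-regressor example above has 5). The helper names, the simulated data and the seed are all hypothetical. The 5% critical value of the chi-square distribution with 2 degrees of freedom is hard-coded as 5.991.

```python
import random


def solve(a, b):
    """Solve the linear system a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for j in range(c, n + 1):
                m[r][j] -= f * m[c][j]
    z = [0.0] * n
    for r in reversed(range(n)):
        z[r] = (m[r][n] - sum(m[r][j] * z[j] for j in range(r + 1, n))) / m[r][r]
    return z


def ols(columns, y):
    """Regress y on a constant plus the given columns; return (residuals, R squared)."""
    rows = [[1.0] + list(vals) for vals in zip(*columns)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    resid = [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(rows, y)]
    ybar = sum(y) / len(y)
    tss = sum((yi - ybar) ** 2 for yi in y)
    return resid, 1.0 - sum(e * e for e in resid) / tss


def white_statistic(x, y):
    """n * R^2 from regressing squared OLS residuals on x and x**2."""
    resid, _ = ols([x], y)
    e2 = [e * e for e in resid]
    _, r2 = ols([x, [xi * xi for xi in x]], e2)
    return len(x) * r2


rng = random.Random(0)                                   # hypothetical simulated sample
x = [rng.uniform(1, 10) for _ in range(500)]
y = [1 + 2 * xi + xi * rng.gauss(0, 1) for xi in x]      # disturbance s.d. proportional to x
stat = white_statistic(x, y)
CHI2_CRIT_5PCT_DF2 = 5.991
reject = stat > CHI2_CRIT_5PCT_DF2
```

With a disturbance this strongly heteroscedastic and 500 observations, the statistic lies far above the critical value, so the null of homoscedasticity is rejected.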
REMEDIAL METHODS: WEIGHTED AND LOGARITHMIC REGRESSIONS
$$Y = \beta_1 + \beta_2 X + u, \qquad \text{population variance of } u_i = \sigma_i^2$$
This sequence presents two methods for dealing with the problem of heteroscedasticity.
We will start with the general case, where the variance of the distribution of the disturbance
term in observation i is $\sigma_i^2$.
Dividing the model through by $\sigma_i$ gives
$$\frac{Y_i}{\sigma_i} = \beta_1 \frac{1}{\sigma_i} + \beta_2 \frac{X_i}{\sigma_i} + \frac{u_i}{\sigma_i}$$
$$\text{population variance of } \frac{u_i}{\sigma_i} = \frac{1}{\sigma_i^2} \times \text{population variance of } u_i = \frac{\sigma_i^2}{\sigma_i^2} = 1$$
The population variance of the disturbance term in the revised model is now equal to 1 in all
observations, and so the disturbance term is homoscedastic.
$$Y' = \beta_1 H + \beta_2 X' + u', \qquad Y' = \frac{Y_i}{\sigma_i},\quad H = \frac{1}{\sigma_i},\quad X' = \frac{X_i}{\sigma_i},\quad u' = \frac{u_i}{\sigma_i}$$
In the revised model, we regress Y' on X' and H, as defined. Note that there is no intercept
in the revised model. $\beta_1$ becomes the slope coefficient of the artificial variable $1/\sigma_i$.
The revised model is described as a weighted regression model because we are
weighting observation i by a factor $1/\sigma_i$. Note that we are automatically giving the highest
weights to the most reliable observations (those with the lowest values of $\sigma_i$).
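The transformation can be sketched as follows, assuming for illustration that the standard deviation of the disturbance in each observation is known. The function name and the data are hypothetical. The revised model has two regressors and no intercept, so the two normal equations are solved directly.

```python
def weighted_regression(x, y, sigma):
    """Regress Y' = Y/sigma on H = 1/sigma and X' = X/sigma with no intercept."""
    h = [1.0 / s for s in sigma]
    xp = [xi / s for xi, s in zip(x, sigma)]
    yp = [yi / s for yi, s in zip(y, sigma)]
    # Sums entering the normal equations of the two-regressor, no-intercept model
    shh = sum(a * a for a in h)
    shx = sum(a * b for a, b in zip(h, xp))
    sxx = sum(b * b for b in xp)
    shy = sum(a * c for a, c in zip(h, yp))
    sxy = sum(b * c for b, c in zip(xp, yp))
    det = shh * sxx - shx ** 2
    b1 = (shy * sxx - shx * sxy) / det   # coefficient of H: estimate of beta1
    b2 = (shh * sxy - shx * shy) / det   # coefficient of X': estimate of beta2
    return b1, b2


x = [1.0, 2.0, 3.0, 4.0, 5.0]            # hypothetical data
sigma = [0.5, 1.0, 1.5, 2.0, 2.5]        # assumed known standard deviations
y = [1.0 + 2.0 * xi for xi in x]         # noiseless, so the fit should be exact
b1, b2 = weighted_regression(x, y, sigma)
```

With noiseless data the function recovers the intercept 1 and slope 2 of the original model, which confirms that the coefficient of H plays the role of the original intercept.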
Of course, in practice we do not know the value of $\sigma_i$ in each observation. However, it may
be reasonable to suppose that it is proportional to some measurable variable, $Z_i$:
$$\sigma_i = \lambda Z_i$$
If this is the case, we can make the model homoscedastic by dividing through by $Z_i$:
$$\frac{Y_i}{Z_i} = \beta_1 \frac{1}{Z_i} + \beta_2 \frac{X_i}{Z_i} + \frac{u_i}{Z_i}$$
$$\text{population variance of } \frac{u_i}{Z_i} = \frac{1}{Z_i^2}\,\sigma_i^2 = \frac{\sigma_i^2}{\sigma_i^2 / \lambda^2} = \lambda^2$$
$$Y' = \beta_1 H + \beta_2 X' + u', \qquad Y' = \frac{Y_i}{Z_i},\quad H = \frac{1}{Z_i},\quad X' = \frac{X_i}{Z_i},\quad u' = \frac{u_i}{Z_i}$$
The disturbance term in the revised model has constant variance $\lambda^2$. We do not need to
know the value of $\lambda^2$. The crucial point is that, by assumption, it is constant.
We will illustrate this procedure with the UNIDO data on manufacturing output and GDP.
We will try scaling by population. A regression of manufacturing output per capita on GDP
per capita is less likely to be subject to heteroscedasticity.
[Figure: scatter diagram of manufacturing output per capita against GDP per capita]
Here is the revised scatter diagram. Does it look homoscedastic? Actually, no. This is still
a classic pattern of heteroscedasticity.
[Figure: Goldfeld-Quandt regressions fitted to the per capita scatter diagram; RSS1 = 5,378,000]
However, the subsamples are small and high ratios can occur on a pure chance basis. The
null hypothesis of homoscedasticity is only just rejected at the 5% level.
Often the X variable itself is a suitable scaling variable. After all, the Goldfeld-Quandt test
assumes that the standard deviation of the disturbance term is proportional to it:
$$\sigma_i = \lambda X_i$$
$$\frac{Y_i}{X_i} = \beta_1 \frac{1}{X_i} + \beta_2 + \frac{u_i}{X_i}$$
Note that when we scale through by it, the $\beta_2$ term becomes the intercept in the revised
model.
$$\text{population variance of } \frac{u_i}{X_i} = \frac{1}{X_i^2}\,\sigma_i^2 = \frac{\sigma_i^2}{\sigma_i^2 / \lambda^2} = \lambda^2$$
$$Y' = \beta_1 H + \beta_2 + u', \qquad Y' = \frac{Y_i}{X_i},\quad H = \frac{1}{X_i},\quad u' = \frac{u_i}{X_i}$$
It follows that when we interpret the regression results, the slope coefficient is an estimate
of $\beta_1$ in the original model and the intercept is an estimate of $\beta_2$.
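The exchange of roles between the slope and the intercept is easy to get wrong when reading output, so here is a small sketch with hypothetical noiseless data. The function name and the dictionary keys are invented for the example. It regresses Y/X on 1/X with an intercept and labels the results in terms of the original model.

```python
def fit_scaled_by_x(x, y):
    """Regress Y/X on 1/X with an intercept and map the results back to the original model."""
    h = [1.0 / xi for xi in x]
    yp = [yi / xi for xi, yi in zip(x, y)]
    n = len(x)
    hbar, ypbar = sum(h) / n, sum(yp) / n
    slope = (sum((a - hbar) * (b - ypbar) for a, b in zip(h, yp))
             / sum((a - hbar) ** 2 for a in h))
    intercept = ypbar - slope * hbar
    # Slope of the scaled model estimates beta1; its intercept estimates beta2
    return {"beta1": slope, "beta2": intercept}


x = [1.0, 2.0, 4.0, 5.0, 10.0]           # hypothetical data
y = [3.0 + 0.5 * xi for xi in x]         # original model with beta1 = 3, beta2 = 0.5
estimates = fit_scaled_by_x(x, y)
```

Here `estimates["beta1"]` is close to 3 and `estimates["beta2"]` is close to 0.5, matching the original intercept and slope respectively.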
[Figure: scatter diagram of Manufacturing/GDP against 1/GDP x 1,000,000, with Goldfeld-Quandt regressions; RSS1 = 0.065 and RSS2 = 0.070]
Heteroscedasticity is no longer apparent. The residual sums of squares for the two subsamples
are almost identical, indeed closer than one would usually expect on a pure chance basis under
the null hypothesis.
We will now consider an alternative approach to the problem. It is possible that the
heteroscedasticity has been caused by an inappropriate mathematical specification.
Suppose, in particular, that the true relationship is in fact logarithmic.
[Figure: scatter diagram of log manufacturing output against log GDP, with Goldfeld-Quandt regressions; RSS1 = 2.140 and RSS2 = 1.037]
The Goldfeld-Quandt test gives no evidence of heteroscedasticity in the logarithmic model. In
this case there is no point in calculating the conventional test statistic. RSS2 is smaller than
RSS1, so it cannot be significantly greater than RSS1.
The inverse ratio is not significant either:
$$F(n_1, n_2) = \frac{RSS_1 / n_1}{RSS_2 / n_2} = \frac{2.140 / 9}{1.037 / 9} = 2.06, \qquad F(9, 9)_{\text{crit},\,5\%} = 3.18$$
$$\log Y = \beta_1 + \beta_2 \log X + u$$
$$Y = e^{\beta_1} X^{\beta_2} e^{u}$$
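One way to see why a logarithmic specification can remove heteroscedasticity is to write the disturbance in the levels form of the model. The lines below are a sketch under two assumptions that go beyond the slides: u has the same distribution in every observation, and $\beta_2 > 0$.

```latex
\begin{aligned}
Y &= e^{\beta_1} X^{\beta_2} e^{u} \\
Y - e^{\beta_1} X^{\beta_2} &= e^{\beta_1} X^{\beta_2}\,(e^{u} - 1) \\
\operatorname{sd}(Y \mid X) &= e^{\beta_1} X^{\beta_2}\,\operatorname{sd}(e^{u})
\end{aligned}
```

The disturbance is multiplicative in levels, so the spread of Y around the curve grows with X even though u itself is homoscedastic. A linear regression in levels would then look heteroscedastic while the logarithmic regression would not.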
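$$\frac{\widehat{MANU}}{POP} = 612\,\frac{1}{POP} + 0.182\,\frac{GDP}{POP}, \qquad R^2 = 0.70$$
Standard errors: (1371) for the coefficient of 1/POP and (0.016) for the coefficient of GDP/POP.
$$\frac{\widehat{MANU}}{GDP} = 0.189 + 533\,\frac{1}{GDP}, \qquad R^2 = 0.02$$
Standard errors: (0.019) for the intercept and (841) for the coefficient of 1/GDP.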
Here is a summary of the regressions using the four alternative specifications of the model.
The first regression was subject to severe heteroscedasticity. Although the estimate
of the coefficient of GDP is unbiased, it is likely to be relatively inaccurate. Also, and this is
a separate effect of heteroscedasticity, the standard errors, t tests and F test are invalid.
In the second regression, the estimate of the slope coefficient was a little lower. However,
for this regression also the null hypothesis of homoscedasticity was rejected, but only at
the 5% level.
In the third regression the model was scaled through by GDP. As a consequence, the
intercept became an estimator of the original slope coefficient, and vice versa.
REMEDIAL METHODS: WHITE HETEROSCEDASTICITY-CONSISTENT COVARIANCE MATRIX ESTIMATOR
$$Y_i = \beta_0 + \beta_1 X_i + u_i$$
If the errors contain heteroscedasticity, then
$$\mathrm{Var}(u_i) = \sigma_i^2$$
White (1980) provided a way to estimate a valid variance of $\hat{\beta}_1$:
$$\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2\, \hat{u}_i^2}{\left[\sum_{i=1}^{n} (X_i - \bar{X})^2\right]^2}$$
Briefly, it can be shown that when the above expression is multiplied by the sample size n, it
converges in probability to $E[(X_i - \mu_x)^2 u_i^2] / (\sigma_x^2)^2$, which is the probability limit
of n times the true variance of $\hat{\beta}_1$. The law of large numbers and the central limit
theorem play key roles in establishing these convergences. A similar formula works in the
general multiple regression model.
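A minimal sketch of this estimator for the simple regression case is given below, alongside the conventional variance estimate for comparison. The function name and the three-point data set are hypothetical and chosen only so that the numbers can be verified by hand. No small-sample adjustment is applied.

```python
def white_slope_variance(x, y):
    """Return (slope, heteroscedasticity-robust variance, conventional variance)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sst = sum((xi - xbar) ** 2 for xi in x)
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sst
    b0 = ybar - b1 * xbar
    resid = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
    # Robust: sum of (x_i - xbar)^2 * residual_i^2, divided by the squared sum of squares of x
    robust = sum((xi - xbar) ** 2 * e ** 2 for xi, e in zip(x, resid)) / sst ** 2
    # Conventional: estimated common error variance divided by the sum of squares of x
    conventional = (sum(e ** 2 for e in resid) / (n - 2)) / sst
    return b1, robust, conventional


x = [0.0, 1.0, 2.0]                      # hypothetical data, small enough to check by hand
y = [0.0, 1.0, 0.0]
slope, var_robust, var_conventional = white_slope_variance(x, y)
```

For these data the fitted slope is 0, the robust variance is 1/18 and the conventional variance is 1/3. The two estimates differ because the residual at the middle observation, where X equals its mean, contributes nothing to the robust formula.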