ST 4011 – Econometrics (30L, 2C)
Department of Statistics

4. Goldfeld-Quandt Test

This is a popular method that can be applied to test for heteroscedasticity if the heteroscedastic variance σᵢ² can be assumed to be positively related to one of the explanatory variables in the regression model.

For simplicity, consider the following simple linear regression model:

Yᵢ = β₀ + β₁Xᵢ + uᵢ

For example, suppose σᵢ² is positively related to Xᵢ as

σᵢ² = σ²Xᵢ²,

where σ² is a constant. Here σᵢ² is proportional to the square of the Xᵢ variable. It has been discussed in the literature that this assumption is generally quite useful in practice.

If this positive relationship is appropriate, σᵢ² increases as the value of Xᵢ increases; in that case, heteroscedasticity is most likely to be present in the model. To test this explicitly, Goldfeld and Quandt suggest the following steps:

Step 1: Order or rank the observations according to the values of Xᵢ, from smallest to largest.

Step 2: Omit c central observations, where c is specified a priori, and divide the remaining (n − c) observations into two groups of (n − c)/2 observations each.

Step 3: Fit separate OLS regressions to the first (n − c)/2 observations and the last (n − c)/2 observations and obtain the respective residual sums of squares RSS₁ and RSS₂. Here, RSS₁ is the residual sum of squares of the regression model fitted to the smallest set of Xᵢ values and RSS₂ is the corresponding value for the largest set of Xᵢ values.

Note that each of these RSS has (n − c)/2 − k degrees of freedom (df), where k is the number of parameters to be estimated in the fitted model. For the simple linear regression case, k = 2.

Step 4: Compute the ratio

λ = (RSS₂/df) / (RSS₁/df)

If the assumption of homoscedasticity is valid and the error terms uᵢ; i = 1, 2, 3, …, n are normally distributed (the usual regression assumption), λ follows an F-distribution with numerator and denominator degrees of freedom both equal to (n − c)/2 − k, as mentioned above.

Hypothesis test
H0: Homoscedasticity    H1: Heteroscedasticity

We reject H0 if the computed λ value is greater than the critical F value at a given level of significance.
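The steps above can be sketched directly in code. The following is a minimal Python sketch (using numpy, scipy and statsmodels) of Steps 1–4 and the F comparison; the arrays y and x are hypothetical and the function name goldfeld_quandt is only illustrative, not part of the notes.

```python
# Minimal sketch of the Goldfeld-Quandt test for the simple linear regression model.
# y and x are hypothetical 1-D numpy arrays; c is the number of central observations to omit.
import numpy as np
import statsmodels.api as sm
from scipy import stats

def goldfeld_quandt(y, x, c, k=2):
    n = len(y)
    order = np.argsort(x)                  # Step 1: order the observations by X
    y_s, x_s = y[order], x[order]
    m = (n - c) // 2                       # Step 2: omit c central observations

    def rss(y_part, x_part):               # Step 3: fit OLS and return the residual sum of squares
        X = sm.add_constant(x_part)
        return sm.OLS(y_part, X).fit().ssr

    rss1 = rss(y_s[:m], x_s[:m])           # regression on the smallest X values
    rss2 = rss(y_s[n - m:], x_s[n - m:])   # regression on the largest X values

    df = m - k                             # (n - c)/2 - k degrees of freedom for each RSS
    lam = (rss2 / df) / (rss1 / df)        # Step 4: compute lambda
    p_value = stats.f.sf(lam, df, df)      # upper-tail probability of F(df, df)
    return lam, df, p_value

# Reject H0 (homoscedasticity) if lambda exceeds the critical F value,
# i.e. if p_value is below the chosen significance level:
# lam, df, p_value = goldfeld_quandt(y, x, c=4)
```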

Note: The performance of this test depends on how c is chosen. For simple linear regression, Goldfeld and Quandt suggested, based on Monte Carlo experiments, that if the sample size n is about 30, c can roughly be taken as 8; following the same proportion, c = 16 if n = 60. However, Judge et al. (1982) (see the reference below) noted that c = 4 if n = 30 and c = 10 if n = 60. The latter recommendation is used nowadays since it gives satisfactory results.

Reference: George G. Judge, R. Carter Hill, William E. Griffiths, Helmut Lütkepohl, and Tsoung-Chao Lee, Introduction to the Theory and Practice of Econometrics, John Wiley & Sons, New York, 1982, p. 422.

Note: If there is more than one explanatory variable, the ranking of observations (the first step in the test) can be done according to any one of them. However, if the suitable X variables for the analysis are unknown, they can be selected by plotting each variable against ûᵢ². The Goldfeld-Quandt test is then applied for any of the selected X variables. If only X₂ is related to ûᵢ², then perform the test for X₂ only.

Example: Consider modelling consumption expenditure using income for a cross section of 30 families. Assume that consumption expenditure is linearly related to income and that a heteroscedasticity problem may be present in the fitted linear model. Further assume that the nature of the heteroscedasticity is as given under the G-Q test. Therefore, the G-Q test is applied to test for heteroscedasticity. There are 30 observations, so c can be taken equal to 4, giving an equal number of observations (13) for the two regressions.


Regression based on the first 13 observations: [regression output]

Regression based on the last 13 observations: [regression output]

F-test: [test output]

Note: Understand the procedure and the interpretation of the results of the Goldfeld-Quandt test based on the given outputs.
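As a rough illustration (not part of the original outputs), the example could be reproduced with statsmodels' het_goldfeldquandt as sketched below. The data file consumption_income.csv and its column names are hypothetical, and the reading of the drop argument as the number of omitted central observations is my assumption about the statsmodels API.

```python
# Hypothetical reproduction of the consumption-income G-Q example (n = 30, c = 4).
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_goldfeldquandt

# Hypothetical data file and column names.
data = pd.read_csv("consumption_income.csv").sort_values("income")  # Step 1: order by income
y = data["consumption"]
X = sm.add_constant(data["income"])

# drop=4 is assumed to remove the 4 central observations, leaving 13 in each regression;
# alternative="increasing" tests whether the error variance increases with income.
fstat, pvalue, ordering = het_goldfeldquandt(y, X, drop=4, alternative="increasing")
print(f"lambda = {fstat:.3f}, p-value = {pvalue:.4f}")
```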
5. Breusch–Pagan–Godfrey Test (B-P-G Test)

The success of the above test depends on the value of c (the number of central observations to be ignored) as well as on identifying the correct X variable to be used for ordering the observations. These limitations can be avoided by applying the Breusch–Pagan–Godfrey test.

To illustrate this test, consider the following k-variable linear regression model

Yᵢ = β₀ + β₁X₁ᵢ + β₂X₂ᵢ + ⋯ + βₖXₖᵢ + uᵢ ……………………….. (01)

Assume that the error variance σᵢ² is described as some function of the non-stochastic variables Z; some or all of the X's can serve as Z's.

σᵢ² = f(α₀ + α₁Z₁ᵢ + α₂Z₂ᵢ + ⋯ + α_{m−1}Z_{(m−1)i})

Specifically, assume that

σᵢ² = α₀ + α₁Z₁ᵢ + α₂Z₂ᵢ + ⋯ + α_{m−1}Z_{(m−1)i} + vᵢ,

where vᵢ is the error term of this regression. Hence, σᵢ² is a linear function of the Z's.

If α₁ = α₂ = ⋯ = α_{m−1} = 0, then σᵢ² = α₀, which is a constant. Therefore, to test whether σᵢ² is homoscedastic, one has to test the following hypotheses.

H0: α₁ = α₂ = ⋯ = α_{m−1} = 0 (Homoscedasticity)
H1: at least one αᵢ ≠ 0, where i ≠ 0 (Heteroscedasticity)

Step 1: Estimate model (01) using OLS and obtain the residuals û₁, û₂, …, ûₙ.

Step 2: Compute the Maximum Likelihood (ML) estimator of σ², defined by σ̃² = (∑ᵢ₌₁ⁿ ûᵢ²)/n.

Step 3: Construct the variables pᵢ defined as

pᵢ = ûᵢ² / σ̃² ;  i = 1, 2, 3, …, n,

which is simply each squared residual divided by the average of the squared residuals.

Step 4: Regress pᵢ on the Z's as

pᵢ = α₀ + α₁Z₁ᵢ + α₂Z₂ᵢ + ⋯ + α_{m−1}Z_{(m−1)i} + vᵢ …………….. (02)

Step 5: Obtain the ESS (explained sum of squares) of the model in Step 4 and define

Θ = (1/2)ESS

If the assumption of homoscedasticity is valid, the error terms uᵢ; i = 1, 2, 3, …, n are normally distributed (the usual regression assumption), and the sample size n is very large, then Θ follows the chi-square distribution with (m − 1) degrees of freedom, which is equal to the number of non-stochastic variables in the fitted model (02).

If the computed value of (1/2)ESS exceeds the critical chi-square value at the chosen level of significance, the null hypothesis of homoscedasticity can be rejected. Otherwise, we do not have enough evidence to reject the null hypothesis.
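A minimal Python sketch of Steps 1–5 is given below, assuming hypothetical arrays y, X (the regressors of model (01)) and Z (the m − 1 chosen Z variables); the function name bpg_test is only illustrative.

```python
# Minimal sketch of the Breusch-Pagan-Godfrey test.
# y, X and Z are hypothetical numpy arrays; Z holds the m - 1 Z variables (one per column).
import numpy as np
import statsmodels.api as sm
from scipy import stats

def bpg_test(y, X, Z):
    resid = sm.OLS(y, sm.add_constant(X)).fit().resid   # Step 1: residuals of model (01)
    sigma2_tilde = np.sum(resid**2) / len(resid)        # Step 2: ML estimator of sigma^2
    p = resid**2 / sigma2_tilde                         # Step 3: p_i
    aux = sm.OLS(p, sm.add_constant(Z)).fit()           # Step 4: regress p_i on the Z's
    theta = 0.5 * aux.ess                               # Step 5: Theta = ESS / 2
    df = 1 if Z.ndim == 1 else Z.shape[1]               # m - 1 = number of Z variables
    p_value = stats.chi2.sf(theta, df)
    return theta, df, p_value

# Reject H0 (homoscedasticity) at level alpha if p_value < alpha:
# theta, df, p_value = bpg_test(y, X, Z)
```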

Understand the procedure and the interpretation of the results of the Breusch–Pagan–Godfrey test using the following output.

[B-P-G test output]

Here, 'asy' means asymptotically distributed.


6. White’s General Heteroscedasticity Test


Note that the above Breusch–Pagan–Godfrey test is sensitive to the normality assumption. White has proposed a general test that does not depend on the normality assumption and is easy to implement. Consider the following model with two explanatory variables (it can be generalized to any number of explanatory variables using the same argument).

Yᵢ = β₀ + β₁X₁ᵢ + β₂X₂ᵢ + uᵢ
Step 1: Given the data, first run the above model assuming no heteroscedasticity and obtain the residuals ûᵢ; i = 1, 2, 3, …, n.

Step 2: Run the following (auxiliary) regression model

ûᵢ² = α₀ + α₁X₁ᵢ + α₂X₂ᵢ + α₃X₁ᵢ² + α₄X₂ᵢ² + α₅X₁ᵢX₂ᵢ + vᵢ

and obtain the R² of this model. Note that the squares and the cross-product terms of the explanatory variables are included in the model.
Step 3: Conduct the hypothesis test

H0: Homoscedasticity    H1: Heteroscedasticity

Under H0, the statistic nR² asymptotically follows a chi-square distribution, i.e. nR² ~asy χ²(df). Here, the degrees of freedom (df) of the chi-square distribution equal the number of explanatory variables in the fitted model in Step 2 (five in the auxiliary model above).


If the chi-square value obtained exceeds the critical chi-square value at the
chosen level of significance, it can be concluded that there is
heteroscedasticity in the fitted model.
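A minimal Python sketch of the test is given below, using statsmodels' het_white with hypothetical arrays y, X1 and X2; het_white reports the LM statistic nR² together with its p-value.

```python
# Minimal sketch of White's general heteroscedasticity test for the two-regressor model.
# y, X1 and X2 are hypothetical 1-D numpy arrays.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_white

def white_test(y, X1, X2):
    X = sm.add_constant(np.column_stack([X1, X2]))
    resid = sm.OLS(y, X).fit().resid     # Step 1: residuals of the original model
    # het_white runs the auxiliary regression of the squared residuals on the regressors,
    # their squares and cross product (Step 2) and returns the LM statistic n*R^2 (Step 3).
    lm_stat, lm_pvalue, f_stat, f_pvalue = het_white(resid, X)
    return lm_stat, lm_pvalue

# lm, p = white_test(y, X1, X2); reject H0 (homoscedasticity) if p < 0.05.
```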

Limitations
1. If the model has several explanatory variables, then introducing all explanatory variables together with their squares and cross-product terms can be time consuming.
2. If the test is significant, it may be due to heteroscedasticity, specification error, or both. We will discuss specification error later.
3. It is important to note that both the B-P-G and White tests are appropriate for large n.
Refer to the following output and understand the application and interpretation of the results of White's general heteroscedasticity test.

[White test output]

Remedial Measures

Now consider the remedial measures for heteroscedasticity.

1. When the values of σᵢ² are known: Weighted Least Squares method

As we have seen at the beginning of the lesson, if σᵢ²; i = 1, 2, 3, …, n are known, the most straightforward method of correcting for heteroscedasticity is to apply the weighted least squares method to obtain estimators that are BLUE.
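A minimal sketch with statsmodels, assuming hypothetical arrays y, X and known error variances sigma2:

```python
# Minimal sketch of weighted least squares when the sigma_i^2 are known.
# y, X and sigma2 (the known error variances) are hypothetical arrays.
import statsmodels.api as sm

def wls_fit(y, X, sigma2):
    X_const = sm.add_constant(X)
    # Weight each observation by 1 / sigma_i^2 so the transformed errors are homoscedastic.
    return sm.WLS(y, X_const, weights=1.0 / sigma2).fit()

# results = wls_fit(y, X, sigma2); print(results.summary())
```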

2. When the values of σᵢ² are unknown: White's Heteroscedasticity-Consistent Variances and Standard Errors

It has been shown that White's heteroscedasticity-corrected standard errors, or robust standard errors, work better than the corresponding OLS standard errors for large samples. They are asymptotically valid (i.e., for large samples) for statistical inference about the true parameter values under heteroscedastic conditions. Nowadays most statistical packages produce White's heteroscedasticity-corrected standard errors together with the usual OLS standard errors.
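A minimal sketch of obtaining both sets of standard errors with statsmodels (hypothetical y and X; the HC1 covariance type is one common choice of White-type estimator):

```python
# Minimal sketch: OLS with White's heteroscedasticity-consistent (robust) standard errors.
# y and X are hypothetical arrays.
import statsmodels.api as sm

def ols_with_robust_se(y, X):
    X_const = sm.add_constant(X)
    usual = sm.OLS(y, X_const).fit()                  # usual OLS standard errors
    robust = sm.OLS(y, X_const).fit(cov_type="HC1")   # White's robust standard errors
    return usual.bse, robust.bse

# usual_se, robust_se = ols_with_robust_se(y, X)
```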
3. Under plausible assumptions about the heteroscedasticity pattern

a). If the error variance can be assumed proportional to Xᵢ² (i.e. V(uᵢ) ∝ Xᵢ²)

Then, divide the original model by Xᵢ (see the code sketch after item (d) below).

Exercise: Consider the model Yᵢ = β₀ + β₁Xᵢ + uᵢ and assume that the error variance is proportional to Xᵢ². Apply the above transformation and show that the constant-variance assumption is satisfied under the assumption that E(uᵢ) = 0.

b). If the error variance can be assumed proportional to Xᵢ (i.e. V(uᵢ) ∝ Xᵢ)

Exercise: Divide the original model by √Xᵢ and show that the error variance becomes a constant, similarly to (a).


c). If the error variance can be assumed proportional to the square of the mean of Yᵢ (i.e. V(uᵢ) ∝ [E(Yᵢ)]², that is, Var(uᵢ) = σ²[E(Yᵢ)]²)

Note: We hardly ever know the exact values of E(Yᵢ) at the different levels of the X variable. However, if n is large, E(Yᵢ) can be estimated using ŷᵢ, since ŷᵢ is a consistent estimator of E(Yᵢ).

Exercise: Divide the original model by ŷᵢ and show that the error variance becomes a constant.

d). Using the log transformation

The log transformation can be used to compress the scales in which the variables are measured. Therefore, a log transformation such as ln(Yᵢ) = β̃₀ + β̃₁ ln(Xᵢ) + ũᵢ often reduces the impact of heteroscedasticity compared with the regression model Yᵢ = β₀ + β₁Xᵢ + uᵢ.
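As a minimal sketch (hypothetical positive-valued arrays x and y), transformations (a) and (d) can be implemented as follows; dividing through by Xᵢ in case (a) is equivalent to weighted least squares with weights 1/Xᵢ².

```python
# Minimal sketch of remedial transformations (a) and (d).
# x and y are hypothetical positive-valued 1-D numpy arrays.
import numpy as np
import statsmodels.api as sm

# (a) Var(u_i) proportional to X_i^2: divide Y_i = b0 + b1*X_i + u_i through by X_i,
#     i.e. regress Y_i/X_i on 1/X_i with an intercept. The intercept of the transformed
#     model estimates b1 and the coefficient on 1/X_i estimates b0.
y_a = y / x
X_a = sm.add_constant(1.0 / x)
fit_a = sm.OLS(y_a, X_a).fit()

# (d) Log transformation: ln(Y_i) = b0~ + b1~*ln(X_i) + u_i~, which compresses the scales.
X_d = sm.add_constant(np.log(x))
fit_d = sm.OLS(np.log(y), X_d).fit()
```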
