Chapter 5 - Violations of Regression Assumptions
[Figure A: constant error variance; Figure B: error variance increasing with X]
As Figure A shows, the conditional variance of Yi
(which equals that of ui), given Xi, remains the
same regardless of the values taken by the
variable X.
In contrast, consider Figure B, which shows
that the conditional variance of Yi increases as
X increases. Here the variances of Yi are not
all the same. Hence, there is heteroskedasticity.
Assumption 5 of the LRM states that the
disturbances should have a constant (equal)
variance independent of t:
$\mathrm{Var}(u_t) = \sigma^2$
Note: as X increases, the variance of the error
term increases (the “goodness of fit” gets worse)
Homoskedasticity: The error has constant variance
Heteroskedasticity: Spread of error depends on X.
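To make the contrast concrete, here is a minimal simulation sketch in Python. All values are hypothetical (they are not the data behind Figures A and B); it simply generates one error series with constant variance and one whose spread grows with X:

```python
# Hypothetical illustration: homoskedastic vs heteroskedastic errors.
import numpy as np

rng = np.random.default_rng(42)
n = 200
x = np.linspace(1, 10, n)

# Homoskedastic: Var(u) = sigma^2 is the same at every value of X.
u_homo = rng.normal(0, 1.0, n)

# Heteroskedastic: the error spread increases with X (sd proportional to X).
u_hetero = rng.normal(0, 0.5 * x, n)

# The sample variance of the heteroskedastic errors rises across X bins,
# while the homoskedastic variance stays roughly flat.
for lo, hi in [(1, 4), (4, 7), (7, 10)]:
    m = (x >= lo) & (x < hi)
    print(f"X in [{lo},{hi}):  Var(u_homo)={u_homo[m].var():.2f}  "
          f"Var(u_hetero)={u_hetero[m].var():.2f}")
```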
[Figure: another form of heteroskedasticity]
Types of Heteroskedasticity
Unconditional Heteroskedasticity – occurs
when the heteroskedasticity of the error
variance is not correlated with the independent
variables in the multiple regression.
Conditional Heteroskedasticity – occurs when
the error variance is correlated with
(conditional on) the values of the independent
variables in the regression.
Consequences of Heteroskedasticity
Heteroskedasticity can lead to mistakes in inference.
When errors are heteroskedastic, the F-test for the
overall significance of the regression is unreliable.
Furthermore, t-tests for the significance of individual
regression coefficients are unreliable because
heteroskedasticity introduces bias into the estimators
of the standard errors of the regression coefficients.
If a regression shows significant heteroskedasticity,
the standard errors and test statistics computed by
regression programs will be incorrect unless they are
adjusted for heteroskedasticity.
In regressions with financial data, the most
likely result of heteroskedasticity is that the
estimated standard errors will be
underestimated and the t-statistics will be
inflated.
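A minimal sketch of the adjustment mentioned above, on simulated data (the series here are illustrative, not from the text), using statsmodels' White-type heteroskedasticity-consistent covariance option:

```python
# Classical vs robust (White/HC) standard errors under heteroskedasticity.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
x = np.linspace(1, 10, 200)
y = 2.0 + 3.0 * x + rng.normal(0, 0.5 * x, 200)   # error sd grows with X

X = sm.add_constant(x)
ols = sm.OLS(y, X)

plain  = ols.fit()                    # classical standard errors
robust = ols.fit(cov_type="HC1")      # White-type robust standard errors

print("classical SE:", plain.bse)     # typically understated here
print("robust SE:   ", robust.bse)    # typically larger, t-stats smaller
```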
Hypothesis Testing:
H0: Error variances are equal or constant
H1: Error variances are not equal
Detecting Heteroskedasticity
There are, in general, two ways.
The first is the informal way, which is done
through graphs; we therefore call it the
graphical method.
The second is through formal statistical tests
for heteroskedasticity (a sketch of both
approaches follows).
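The sketch below illustrates both approaches on simulated data. The Breusch-Pagan test is used here as one common formal test; it is an illustrative choice, not necessarily the specific test the slides name:

```python
# Graphical method (residuals vs fitted) and one formal test (Breusch-Pagan).
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
x = np.linspace(1, 10, 200)
y = 2.0 + 3.0 * x + rng.normal(0, 0.5 * x, 200)

X = sm.add_constant(x)
res = sm.OLS(y, X).fit()

# Graphical method: a funnel shape in this plot suggests heteroskedasticity.
plt.scatter(res.fittedvalues, res.resid, s=10)
plt.xlabel("fitted values")
plt.ylabel("residuals")
plt.show()

# Formal test: H0 is constant error variance.
lm_stat, lm_pvalue, f_stat, f_pvalue = het_breuschpagan(res.resid, X)
print(f"Breusch-Pagan LM p-value: {lm_pvalue:.4f}")  # small p => reject H0
```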
Blaisdell Company Example
$DW = \frac{\sum_{t=2}^{n} (e_t - e_{t-1})^2}{\sum_{t=1}^{n} e_t^2} = \frac{0.09794}{0.13330} = 0.735$
Using the Durbin-Watson table in your textbook,
for k = 1, n = 20, and α = .05, we find
d_L = 1.20 and d_U = 1.41.
If DW < d_L, reject H0, while if DW > d_U, do not
reject H0.
Since DW = .735 < d_L = 1.20,
we reject the null hypothesis of no
autocorrelation and conclude that the error
terms are positively autocorrelated.
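The DW statistic above can be reproduced directly from a residual series. A minimal sketch (the Blaisdell residuals are not reproduced in the slides, so the series below is a simulated AR(1) placeholder):

```python
# Durbin-Watson statistic computed from a residual series.
import numpy as np

def durbin_watson(e):
    """DW = sum_{t=2}^n (e_t - e_{t-1})^2 / sum_{t=1}^n e_t^2."""
    e = np.asarray(e, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

# Positively autocorrelated residuals (AR(1), rho = 0.7) give DW well
# below 2; independent residuals give DW near 2 (roughly 2*(1 - rho)).
rng = np.random.default_rng(3)
rho, n = 0.7, 20
e = np.zeros(n)
for t in range(1, n):
    e[t] = rho * e[t - 1] + rng.normal()

print(f"DW (autocorrelated): {durbin_watson(e):.3f}")
print(f"DW (independent):    {durbin_watson(rng.normal(size=n)):.3f}")
```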
Remedial Measures for Serial Correlation
Addition of one or more independent
variables to the regression model.
One major cause of autocorrelated error terms is the
omission from the model of one or more key
variables that have time-ordered effects on the
dependent variable.
Use transformed variables.
The regression model is specified in terms of
changes rather than levels.
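A minimal sketch of the second remedy, on hypothetical trending data: the model is re-estimated on first differences (changes) instead of levels, which removes the accumulated, serially correlated component of the error:

```python
# Re-specifying a levels regression in first differences.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
n = 100
x = np.cumsum(rng.normal(size=n))                   # trending regressor (levels)
y = 1.0 + 2.0 * x + np.cumsum(rng.normal(size=n))   # error is a random walk
                                                    # => strong serial correlation

dy, dx = np.diff(y), np.diff(x)                     # first differences
res = sm.OLS(dy, sm.add_constant(dx)).fit()
print(res.params)   # slope on dx still estimates the same beta (about 2),
                    # but the differenced errors are now white noise
```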
Multicollinearity
Multicollinearity (also collinearity) is a phenomenon
in which two or more predictor variables in a
multiple regression model are highly correlated,
meaning that one can be linearly predicted from the
others with a substantial degree of accuracy.
Multicollinearity violates the Classical Regression
Assumption which specifies that no explanatory
variable is a perfect linear function of any other
explanatory variables.
You cannot hold all the other independent variables
in the equation constant if every time one variable
changes, another changes in an identical manner.
There are several reasons why multicollinearity occurs:
It is caused by an inaccurate use of dummy
variables.
It is caused by the inclusion of a variable
that is computed from other variables in the
data set.
It generally occurs when the variables are highly
correlated with each other.
Problems of Multicollinearity
The computed t-scores will fall.
Estimates will become very sensitive to changes in
specification.
Multicollinearity will thus make confidence
intervals for the parameters very wide, and
significance tests might therefore give
inappropriate conclusions, and so make it
difficult to draw sharp inferences.
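A hypothetical simulation of these effects: the data-generating process is held fixed while the correlation between two regressors is raised, and the coefficient standard errors widen while the t-scores fall:

```python
# Rising correlation between regressors inflates SEs and shrinks t-scores.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)
n = 200
for r in (0.0, 0.90, 0.99):
    x1 = rng.normal(size=n)
    x2 = r * x1 + np.sqrt(1 - r**2) * rng.normal(size=n)  # corr(x1, x2) ~ r
    y = 1.0 + 1.0 * x1 + 1.0 * x2 + rng.normal(size=n)    # same true betas
    res = sm.OLS(y, sm.add_constant(np.column_stack([x1, x2]))).fit()
    print(f"corr={r:.2f}  SE(b1)={res.bse[1]:.3f}  t(b1)={res.tvalues[1]:.2f}")
```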
Multicollinearity Diagnostics
A widely used formal method of detecting the
presence of multicollinearity is the Variance
Inflation Factor (VIF).
– It measures how much the variances of the estimated
regression coefficients are inflated compared to when
the independent variables are not linearly related.
$\mathrm{VIF}_j = \frac{1}{1 - R_j^2}, \quad j = 1, 2, \ldots, k$
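A minimal implementation of this formula, assuming a plain NumPy regressor matrix: each column X_j is regressed on the remaining columns, and the auxiliary R²_j feeds 1 / (1 − R²_j). (statsmodels also provides variance_inflation_factor, which computes the same quantity.)

```python
# VIF_j = 1 / (1 - R_j^2), with R_j^2 from the auxiliary regression of
# column j on all the other regressors.
import numpy as np

def vif(X):
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    out = []
    for j in range(k):
        y = X[:, j]
        Z = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
        resid = y - Z @ beta
        r2 = 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)
        out.append(1.0 / (1.0 - r2))
    return out

# Example: two highly correlated regressors (corr ~ 0.95).
rng = np.random.default_rng(6)
x1 = rng.normal(size=300)
x2 = 0.95 * x1 + np.sqrt(1 - 0.95**2) * rng.normal(size=300)
print(vif(np.column_stack([x1, x2])))  # both VIFs near 1/(1-0.95^2) ~ 10
```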