1997 Bookmatter: Understanding Regression Analysis
$$\hat{y} = ua + x_1 b_1 + x_2 b_2$$

$$\mathrm{Mean}(\hat{y}) = \frac{1}{N}\,(u'\hat{y})$$

$$u'\hat{y} = Na + \sum x_{1i}\, b_1 + \sum x_{2i}\, b_2 = Na + b_1 \sum x_{1i} + b_2 \sum x_{2i} = Na + b_1 N \bar{x}_1 + b_2 N \bar{x}_2$$
This quantity can be substituted into the equation for the mean of a
variable such that:
$$\mathrm{Mean}(\hat{y}) = \frac{1}{N}\,(u'\hat{y}) = \frac{1}{N}\left(Na + b_1 N \bar{x}_1 + b_2 N \bar{x}_2\right) = a + b_1 \bar{x}_1 + b_2 \bar{x}_2$$
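This identity, that the mean of the predicted scores equals the regression function evaluated at the means of the independent variables, is easy to check numerically. A minimal sketch in Python; the data and coefficient values are arbitrary illustrations, not from the text:

```python
# Check: Mean(y_hat) = a + b1*mean(x1) + b2*mean(x2).
# Data and coefficients are arbitrary illustrative values.
x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [2.0, 1.0, 5.0, 2.0]
a, b1, b2 = 0.5, 1.2, -0.7

N = len(x1)
y_hat = [a + b1 * x1[i] + b2 * x2[i] for i in range(N)]

mean_y_hat = sum(y_hat) / N
identity = a + b1 * sum(x1) / N + b2 * sum(x2) / N

assert abs(mean_y_hat - identity) < 1e-9
```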
$$SS(\hat{y}) = \sum(\hat{y}_i - \bar{y})^2 = b_1 x_1'x_1 b_1 + b_1 x_1'x_2 b_2 + b_2 x_2'x_1 b_1 + b_2 x_2'x_2 b_2$$

where, for example,

$$b_1 x_1'x_1 b_1 = b_1^2\, x_1'x_1 = b_1^2 \sum x_{1i}^2$$

$$\mathrm{Var}(\hat{y}) = \frac{1}{N}\left(b_1^2 \sum x_{1i}^2\right) + \frac{1}{N}\left(b_2^2 \sum x_{2i}^2\right) + \frac{1}{N}\, 2\left(b_1 b_2 \sum x_{1i} x_{2i}\right)$$

$$= b_1^2 s_1^2 + b_2^2 s_2^2 + 2 b_1 b_2 c_{12} = b_1^2 s_1^2 + b_2^2 s_2^2 + 2 b_1 b_2 r_{12} s_1 s_2$$
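The decomposition of the variance of the predicted scores into the coefficient-weighted variances and covariance of the independent variables can be verified directly. A sketch with arbitrary illustrative data, using population (divide-by-N) variances as in the derivation:

```python
# Check: Var(y_hat) = b1^2*Var(x1) + b2^2*Var(x2) + 2*b1*b2*Cov(x1,x2),
# with population (divide-by-N) variances and covariances.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0]
a, b1, b2 = 1.0, 0.8, -0.3
N = len(x1)

def mean(v):
    return sum(v) / len(v)

def var(v):
    m = mean(v)
    return sum((vi - m) ** 2 for vi in v) / len(v)

def cov(u, v):
    mu, mv = mean(u), mean(v)
    return sum((u[i] - mu) * (v[i] - mv) for i in range(len(u))) / len(u)

y_hat = [a + b1 * x1[i] + b2 * x2[i] for i in range(N)]
lhs = var(y_hat)
rhs = b1**2 * var(x1) + b2**2 * var(x2) + 2 * b1 * b2 * cov(x1, x2)
assert abs(lhs - rhs) < 1e-9
```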
where
$$\hat{y}_i = a + b x_i$$
Obviously, the values of the predicted scores and, therefore, the values of the errors of prediction are determined by the choice of a particular intercept and regression coefficient.
If we substitute this linear function for the predicted value of the dependent variable, the problem becomes one of finding the regression coefficient, b, and the intercept, a, that minimize the following function:

$$S = \sum (y_i - a - b x_i)^2$$

Differentiating with respect to b and a:

$$\frac{\partial S}{\partial b} = \sum 2\,(y_i - a - b x_i)(-x_i)$$

$$\frac{\partial S}{\partial a} = \sum 2\,(y_i - a - b x_i)(-1)$$

Setting these derivatives equal to zero and simplifying:

$$\sum(-y_i + b x_i + a) = 0$$

$$b \sum x_i^2 - \sum x_i y_i + a \sum x_i = 0$$

$$aN - \sum y_i + b \sum x_i = 0$$
In mean deviation form, $\sum x_i = 0$, so that:

$$b = \frac{\sum x_i y_i}{\sum x_i^2}$$

$$a = \frac{\sum y_i}{N} = \bar{y}$$
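These least-squares solutions can be checked numerically: center x, compute b and a by the formulas above, and confirm that perturbing either estimate can only increase the sum of squared errors. The data below are arbitrary illustrations:

```python
# Least-squares estimates in mean deviation form:
# b = sum(x_i * y_i) / sum(x_i^2), a = mean(y). Illustrative data only.
x_raw = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
N = len(x_raw)

x_bar = sum(x_raw) / N
x = [xi - x_bar for xi in x_raw]   # mean deviation form: sum(x) == 0

b = sum(x[i] * y[i] for i in range(N)) / sum(xi ** 2 for xi in x)
a = sum(y) / N                     # with centered x, a = mean(y)

# The estimates minimize S = sum((y_i - a - b*x_i)^2): any small
# perturbation of b or a increases S.
S = lambda aa, bb: sum((y[i] - aa - bb * x[i]) ** 2 for i in range(N))
assert S(a, b) <= S(a, b + 0.01) and S(a, b) <= S(a + 0.01, b)
```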
THE LEAST-SQUARES REGRESSION COEFFICIENT
$$b = \frac{\mathrm{Cov}(x,y)}{\mathrm{Var}(x)}$$

$$a = \bar{y} - b\bar{x}$$

where

$$\bar{x} = 0 \qquad \bar{y} = a + b\bar{x}$$
These results were obtained using the simplifying assumption that the
independent variable is expressed in mean deviation form. However,
it can be demonstrated that this assumption is not necessary to obtain
these equations. Indeed, the covariance between two variables and the
variance of the independent variable are the same whether or not the
independent variable is expressed in mean deviation form.
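This invariance is easy to demonstrate. A sketch with arbitrary data, comparing the covariance, the variance, and the resulting regression coefficient before and after x is expressed in mean deviation form:

```python
# Check: Cov(x, y) and Var(x) are unchanged when x is shifted to
# mean deviation form, so b = Cov(x,y)/Var(x) does not depend on centering.
x = [3.0, 5.0, 2.0, 8.0, 7.0]
y = [1.0, 4.0, 2.0, 9.0, 6.0]
N = len(x)

def mean(v):
    return sum(v) / len(v)

def var(v):
    m = mean(v)
    return sum((vi - m) ** 2 for vi in v) / len(v)

def cov(u, v):
    mu, mv = mean(u), mean(v)
    return sum((u[i] - mu) * (v[i] - mv) for i in range(N)) / N

xc = [xi - mean(x) for xi in x]    # mean deviation form

assert abs(var(x) - var(xc)) < 1e-9
assert abs(cov(x, y) - cov(xc, y)) < 1e-9
assert abs(cov(x, y) / var(x) - cov(xc, y) / var(xc)) < 1e-9
```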
APPENDIX C
$$b = \frac{\mathrm{Cov}(x,y)}{\mathrm{Var}(x)}$$

$$b = \frac{N c_{xy}}{N s_x^2} = \frac{\sum x_i y_i}{\sum x_i^2}$$

$$b = \sum w_i y_i$$

where

$$w_i = \frac{x_i}{\sum x_i^2}$$
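The weighted-sum form of the regression coefficient can be checked against the ratio form. A sketch with made-up data, with x centered first as the derivation assumes:

```python
# Check: with w_i = x_i / sum(x_j^2) (x in mean deviation form),
# b = sum(w_i * y_i) reproduces b = sum(x_i * y_i) / sum(x_i^2).
x_raw = [1.0, 2.0, 4.0, 5.0]
y = [1.5, 2.0, 4.5, 5.0]
N = len(x_raw)

x_bar = sum(x_raw) / N
x = [xi - x_bar for xi in x_raw]
sxx = sum(xi ** 2 for xi in x)

w = [xi / sxx for xi in x]
b_weighted = sum(w[i] * y[i] for i in range(N))
b_ratio = sum(x[i] * y[i] for i in range(N)) / sxx
assert abs(b_weighted - b_ratio) < 1e-9
```

The weights make explicit that b is a linear function of the y values, which is what the variance derivation below exploits.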
$$\mathrm{Var}(b) = w_1^2 s_{y_1}^2 + w_2^2 s_{y_2}^2 + \cdots + w_n^2 s_{y_n}^2$$

$$s_{y_1}^2 = s_{y_2}^2 = \cdots = s_{y_n}^2 = s_y^2$$

$$\mathrm{Var}(b) = s_y^2 \sum \left(\frac{x_i}{\sum x_i^2}\right)^2 = s_y^2\, \frac{\sum x_i^2}{\left(\sum x_i^2\right)^2}$$

such that:

$$\mathrm{Var}(b) = \frac{s_y^2}{\sum x_i^2}$$
Consequently, it can be seen that the variance of the regression
coefficient is equal to the variance of the dependent variable divided
by the sum of squares of the independent variable. This equation can
be expressed in more familiar terms as follows:
$$\mathrm{Var}(b) = \frac{s_y^2}{N s_x^2}$$
The standard error of a statistic is equal to the square root of its vari-
ance. Therefore, the standard error of the regression coefficient is
given by:
$$\mathrm{S.D.}(b) = \sqrt{\frac{s_y^2}{N s_x^2}}$$

When the variance of the dependent variable is estimated from the errors of prediction, two degrees of freedom are lost to the estimated intercept and regression coefficient, leaving N − 2, so that:

$$s_b = \sqrt{\frac{s_y^2}{(N - 2)\, s_x^2}}$$
This is, of course, the familiar equation for the standard error of the
regression coefficient.
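A numeric sketch with made-up data; here $s_y^2$ is taken to be the population variance of the errors of prediction, an assumption under which the formula above agrees with the usual residual-based standard error $\sqrt{RSS / ((N-2)\sum x_i^2)}$:

```python
# Standard error of b: s_b = sqrt(s_y^2 / ((N - 2) * s_x^2)),
# with s_y^2 the (divide-by-N) variance of the residuals. Illustrative data.
import math

x_raw = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.2, 1.9, 3.2, 3.8, 5.1, 5.9]
N = len(x_raw)

x_bar = sum(x_raw) / N
x = [xi - x_bar for xi in x_raw]
sxx = sum(xi ** 2 for xi in x)

b = sum(x[i] * y[i] for i in range(N)) / sxx
a = sum(y) / N

rss = sum((y[i] - a - b * x[i]) ** 2 for i in range(N))
s_y2 = rss / N        # population variance of the errors of prediction
s_x2 = sxx / N

s_b = math.sqrt(s_y2 / ((N - 2) * s_x2))
# Algebraically identical to the familiar residual-based form:
assert abs(s_b - math.sqrt(rss / ((N - 2) * sxx))) < 1e-9
```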
It might be noted that the equation for the regression coefficient
as a linear function of the values of the dependent variable tells us a
great deal about how the value of the regression coefficient is affected
by the values of the dependent variable and the independent variable.
Once again, the simple regression coefficient is given by:
$$b = w_1 y_1 + w_2 y_2 + \cdots + w_n y_n$$
$$y = Xb + e$$

$$e = y - Xb$$

$$S = e'e = (y - Xb)'(y - Xb)$$

$$\frac{\partial S}{\partial b} = 0 - 2X'y + 2X'Xb$$

$$X'y - X'Xb = 0$$

$$b = (X'X)^{-1}X'y$$
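For two centered independent variables, $X'X$ is a 2 × 2 matrix of sums of squares and cross-products, so the solution can be written out by hand. A sketch with arbitrary data; the explicit 2 × 2 inverse stands in for a matrix library:

```python
# Solve b = (X'X)^(-1) X'y for two centered predictors, inverting the
# 2x2 matrix X'X explicitly. Illustrative data only.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 2.0, 6.0]
y = [3.0, 2.5, 6.0, 4.5, 9.0]
N = len(y)

def center(v):
    m = sum(v) / len(v)
    return [vi - m for vi in v]

x1, x2, yc = center(x1), center(x2), center(y)

# Elements of X'X (sums of squares and cross-products) and of X'y.
s11 = sum(v * v for v in x1)
s22 = sum(v * v for v in x2)
s12 = sum(x1[i] * x2[i] for i in range(N))
g1 = sum(x1[i] * yc[i] for i in range(N))
g2 = sum(x2[i] * yc[i] for i in range(N))

# Explicit inverse of the 2x2 matrix [[s11, s12], [s12, s22]].
det = s11 * s22 - s12 * s12
b1 = (s22 * g1 - s12 * g2) / det
b2 = (s11 * g2 - s12 * g1) / det

# The solution satisfies the normal equations X'y - X'X b = 0.
assert abs(g1 - (s11 * b1 + s12 * b2)) < 1e-9
assert abs(g2 - (s12 * b1 + s22 * b2)) < 1e-9
```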
It must be noted that the first quantity on the right, $(X'X)^{-1}$, is simply the inverse of the matrix whose main diagonal elements are the sums of squares of the independent variables and whose off-diagonal elements are the sums of the cross-products of the independent variables. Similarly, the second quantity on the right, $X'y$, is the column vector of the sums of the cross-products of the independent variables and the dependent variable. Therefore, the vector of partial regression coefficients, b, is equivalent to the inverse of the variance-covariance matrix of the independent variables, $C_{xx}^{-1}$, postmultiplied by the vector of covariances between the independent variables and the dependent variable, $c_{yx}$, such that:

$$b = C_{xx}^{-1}\, c_{yx}$$
$$b^* = R_{xx}^{-1}\, r_{yx}$$
This result gives us the equation for the standardized partial regression
coefficient.
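For two predictors the standardized solution can likewise be written out explicitly, since the correlation matrix $R_{xx}$ is 2 × 2. The correlation values below are arbitrary illustrations, not taken from the text:

```python
# Standardized partial regression coefficients, b* = R_xx^(-1) r_yx,
# for two predictors, inverting the 2x2 correlation matrix by hand.
r12 = 0.4    # correlation between x1 and x2 (illustrative)
ry1 = 0.6    # correlation between y and x1 (illustrative)
ry2 = 0.5    # correlation between y and x2 (illustrative)

det = 1.0 - r12 ** 2          # determinant of [[1, r12], [r12, 1]]
b1_star = (ry1 - r12 * ry2) / det
b2_star = (ry2 - r12 * ry1) / det

# b* satisfies R_xx b* = r_yx:
assert abs(b1_star + r12 * b2_star - ry1) < 1e-9
assert abs(r12 * b1_star + b2_star - ry2) < 1e-9
```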
Although the matrix algebra equation for the partial regression coefficients, derived from the normal equations, is rather complicated, it is analogous to the equation for the simple regression coefficient. Indeed, if multiplication by the inverse of a matrix is the matrix algebra analog of division in ordinary algebra, the equation for the partial regression coefficients can be viewed as the ratio of a function of the covariances between the dependent variable and the independent variables to a function of the variances and covariances of the independent variables.
APPENDIX E
Statistical Tables
Probability Level