
HASTS112/HSTS112 REGRESSION ANALYSIS AND ANOVA

CHAPTER 5: MULTIPLE LINEAR REGRESSION

Introduction
Up to now, we have been dealing with regression relationships in which only
two variables were involved: one dependent variable and one independent
variable. Multiple linear regression analysis is merely an extension of simple
linear regression; the difference is that more than one independent variable
is involved in the relationship.
The multiple linear regression model is
\[
y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \dots + \beta_{p-1} X_{p-1,i} + \varepsilon_i
\]

The term linear is used because the model is a linear function of the unknown
parameters, the $\beta$'s.

Example: Yield ($Y$) depends on many variables, for example the amount of
rainfall and the amount of fertilizer. Our model will be of the form
\[
y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \varepsilon_i
\]
that is, a multiple linear regression model with two regressor/independent
variables ($X_{1i}$ being the amount of rainfall and $X_{2i}$ being the amount
of fertilizer).

Estimation of Parameters
There are many ways of estimating the parameters in a regression model.
However, in this course we shall focus attention on the Least Squares (LS)
method. There are two ways to apply the LS method:

(i) Estimation by substitution.

(ii) The matrix approach.

However, with multiple linear regression, the matrix approach seems to be
more appropriate.

In matrix notation, our model is given by
\[
\mathbf{Y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon},
\]
where
\[
\mathbf{Y} = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}, \quad
\mathbf{X} = \begin{pmatrix}
1 & x_{1,1} & \dots & x_{p-1,1} \\
1 & x_{1,2} & \dots & x_{p-1,2} \\
\vdots & \vdots & & \vdots \\
1 & x_{1,n} & \dots & x_{p-1,n}
\end{pmatrix}, \quad
\boldsymbol{\beta} = \begin{pmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_{p-1} \end{pmatrix}
\quad \text{and} \quad
\boldsymbol{\varepsilon} = \begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{pmatrix}
\]
Consequently, the random vector $\mathbf{Y}$ has expectation
\[
E[\mathbf{Y}] = \mathbf{X}\boldsymbol{\beta}
\]

The fitted values are obtained from
\[
\hat{\mathbf{Y}} = \mathbf{X}\hat{\boldsymbol{\beta}}
\]
There is a standard formula we use to estimate the vector $\boldsymbol{\beta}$, namely
\[
\hat{\boldsymbol{\beta}} = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{Y}
\]
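As an illustration, this formula can be computed directly. The following is a
minimal sketch in Python with NumPy; the rainfall/fertilizer numbers are made
up purely for illustration and are not part of the course material.

```python
import numpy as np

# Illustrative (made-up) data: yield depending on rainfall and fertilizer.
rainfall = np.array([2.1, 3.4, 1.8, 4.2, 3.0])
fertilizer = np.array([10.0, 12.5, 9.0, 15.0, 11.0])
y = np.array([4.0, 6.1, 3.5, 7.8, 5.6])

n = len(y)
# Design matrix X: a column of ones (intercept), then one column per regressor.
X = np.column_stack([np.ones(n), rainfall, fertilizer])

# beta_hat = (X^T X)^{-1} X^T Y
beta_hat = np.linalg.inv(X.T @ X) @ X.T @ y

# Fitted values: Y_hat = X beta_hat
y_hat = X @ beta_hat
print("beta_hat =", beta_hat)
```

In practice one would use `np.linalg.solve(X.T @ X, X.T @ y)` or
`np.linalg.lstsq` rather than forming the inverse explicitly, for numerical
stability; the explicit inverse is used here only to mirror the formula above.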

Hypothesis Testing on the Parameters

We have hypotheses of the form

(A) $H_0: \beta_i = b$ versus $H_1: \beta_i \neq b$, for $i = 0, 1, \dots, p-1$

(B) $H_0: \beta_i \geq b$ versus $H_1: \beta_i < b$, for $i = 0, 1, \dots, p-1$

(C) $H_0: \beta_i \leq b$ versus $H_1: \beta_i > b$, for $i = 0, 1, \dots, p-1$

The estimate of the variance of $\hat{\beta}_i$ is given by
\[
\widehat{Var}(\hat{\beta}_i) = s^2 \times \left[(i+1), (i+1)\right]\text{th element of } (\mathbf{X}^T\mathbf{X})^{-1}
\]
where
\[
s^2 = \frac{SSE}{n-p} = \frac{\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}{n-p}
\]
It then follows that, for the above hypotheses,

(A) We reject $H_0$ if
\[
|t| = \frac{|\hat{\beta}_i - b|}{\sqrt{\widehat{Var}(\hat{\beta}_i)}} > t_{\alpha/2}(n-p)
\]

(B) We reject $H_0$ if
\[
t = \frac{\hat{\beta}_i - b}{\sqrt{\widehat{Var}(\hat{\beta}_i)}} < -t_{\alpha}(n-p)
\]

(C) We reject $H_0$ if
\[
t = \frac{\hat{\beta}_i - b}{\sqrt{\widehat{Var}(\hat{\beta}_i)}} > t_{\alpha}(n-p)
\]
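A sketch of how these t-tests could be carried out numerically, reusing
`X`, `y`, `beta_hat` and `y_hat` from the estimation sketch above; the
function name and interface are ours, not prescribed by the notes.

```python
import numpy as np
from scipy import stats

def t_test_coefficient(X, y, beta_hat, y_hat, i, b=0.0, alpha=0.05):
    """Two-sided t-test of H0: beta_i = b (case (A) above)."""
    n, p = X.shape
    s2 = np.sum((y - y_hat) ** 2) / (n - p)        # s^2 = SSE / (n - p)
    xtx_inv = np.linalg.inv(X.T @ X)
    var_beta_i = s2 * xtx_inv[i, i]                # (i+1),(i+1)th element, 0-indexed
    t = (beta_hat[i] - b) / np.sqrt(var_beta_i)
    t_crit = stats.t.ppf(1 - alpha / 2, df=n - p)  # t_{alpha/2}(n - p)
    return t, t_crit, abs(t) > t_crit
```

The one-sided tests (B) and (C) follow the same pattern, comparing $t$ with
$\mp t_{\alpha}(n-p)$ instead.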

Analysis of Variance Approach to Multiple Linear Regression
Analysis of Variance (ANOVA) is a highly useful and flexible mode of analysis
for regression models. We will use ANOVA to compute $s^2 = \frac{SSE}{n-p}$
(an estimate of $\sigma^2$) and to check whether there is a regression
relationship.

Partitioning the Total Sum of Squares


\[
SST = SSR + SSE
\]
The matrix approach puts this as
\[
SST = \mathbf{Y}^T\mathbf{Y} - n\bar{y}^2
\]
\[
SSR = \hat{\boldsymbol{\beta}}^T\mathbf{X}^T\mathbf{Y} - n\bar{y}^2
\]
and
\[
SSE = \mathbf{Y}^T\mathbf{Y} - \hat{\boldsymbol{\beta}}^T\mathbf{X}^T\mathbf{Y}
\]
Once $\boldsymbol{\beta}$ has been estimated, the sums of squares can be easily computed, as in the sketch below.
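A minimal sketch of the matrix formulas (the helper name is ours), continuing
with `X`, `y` and `beta_hat` from the earlier estimation sketch:

```python
import numpy as np

def sums_of_squares(X, y, beta_hat):
    """SST, SSR and SSE via the matrix formulas above."""
    n = len(y)
    ybar = y.mean()
    sst = y @ y - n * ybar**2               # SST = Y'Y - n*ybar^2
    ssr = beta_hat @ X.T @ y - n * ybar**2  # SSR = beta_hat' X'Y - n*ybar^2
    sse = y @ y - beta_hat @ X.T @ y        # SSE = Y'Y - beta_hat' X'Y
    return sst, ssr, sse
```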

Partitioning the degrees of freedom


SST has n − 1 degrees of freedom
SSR has p − 1 degrees of freedom
SSE has n − p degrees of freedom

Mean squares
A sum of squares divided by its degrees of freedom is called a mean square.
The two important mean squares are the Regression Mean Square (MSR) and the
Error Mean Square (MSE), and these are given by
\[
MSR = \frac{SSR}{p-1}, \qquad MSE = \frac{SSE}{n-p}
\]

The Basic ANOVA table

Source of Variation    SS     d.f.     MS     F
Regression             SSR    p − 1    MSR    F = MSR/MSE
Error                  SSE    n − p    MSE
Total                  SST    n − 1

To test for the significance of the regression, our hypotheses are of the
form
\[
H_0: \beta_1 = \beta_2 = \dots = \beta_{p-1} = 0
\]
\[
H_1: \beta_i \neq 0 \text{ for at least one } i
\]
at the $\alpha$ significance level.

Test Statistic: the F-ratio

Rejection Criterion: Reject $H_0$ if $F > F_{\alpha}(p-1, n-p)$

If we reject $H_0$, we go on to test the significance of each of the
parameters, to find out which variables led to the rejection of the null
hypothesis. A sketch of the overall F-test follows.
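The whole ANOVA F-test translates into a few lines of code, reusing the
`sums_of_squares` helper defined earlier; again, the function and its
interface are illustrative, not prescribed by the notes.

```python
from scipy import stats

def f_test(X, y, beta_hat, alpha=0.05):
    """Overall test of H0: beta_1 = ... = beta_{p-1} = 0 via the F-ratio."""
    n, p = X.shape
    sst, ssr, sse = sums_of_squares(X, y, beta_hat)
    msr = ssr / (p - 1)   # MSR = SSR / (p - 1)
    mse = sse / (n - p)   # MSE = SSE / (n - p)
    f = msr / mse
    f_crit = stats.f.ppf(1 - alpha, dfn=p - 1, dfd=n - p)  # F_alpha(p-1, n-p)
    return f, f_crit, f > f_crit
```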

Exercise 5.1
Consider the following data set, where $Y$ is the dependent variable and
$X_1$ and $X_2$ are the regressors:

Y     4.1   8.5   5.2   9.6   8.7
X1    2.5   3.7   2.6   5.5   4.0
X2    3.5   4.4   3.9   4.3   4.9

Suppose the data can be described by the model
$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \varepsilon_i$,
where $\varepsilon_i \sim N(0, \sigma^2)$ and $Cov(\varepsilon_i, \varepsilon_j) = 0$ if $i \neq j$.

(a) Express the above model in matrix form.

(b) Find the least squares estimator of $\boldsymbol{\beta}$, given that
\[
(\mathbf{X}^T\mathbf{X})^{-1} = \begin{pmatrix}
17.2124 & 0.5764 & -4.5529 \\
0.5764 & 0.2632 & -0.3666 \\
-4.5529 & -0.3666 & 1.4035
\end{pmatrix}
\]

(c) Construct the ANOVA table and test for the significance of the regression
using $\alpha = 0.05$.

(d) Test the hypothesis $H_0: \beta_0 = 0$ versus $H_1: \beta_0 \neq 0$ at $\alpha = 0.05$.

(e) Estimate $y$ at $X_1 = 3$ and $X_2 = 4.5$.
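If you want to check your hand computations, the exercise data plug directly
into the sketches above; a possible starting point (the numerical work is
left to you):

```python
import numpy as np

# Data from Exercise 5.1.
y  = np.array([4.1, 8.5, 5.2, 9.6, 8.7])
x1 = np.array([2.5, 3.7, 2.6, 5.5, 4.0])
x2 = np.array([3.5, 4.4, 3.9, 4.3, 4.9])

# (a) Matrix form: X has a column of ones, then x1 and x2.
X = np.column_stack([np.ones(len(y)), x1, x2])

# (b) Least squares estimate (compare with the given inverse).
beta_hat = np.linalg.inv(X.T @ X) @ X.T @ y

# (e) Point estimate of y at X1 = 3, X2 = 4.5.
x_new = np.array([1.0, 3.0, 4.5])
y_pred = x_new @ beta_hat
```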
