
Chapter 1
Introduction


1.1 Regression and Model Building

• Regression analysis is a statistical technique for investigating and modeling the relationship between variables.
• Equation of a straight line (classical):
  y = mx + b
  In regression we usually write this as
  y = β0 + β1x

1.1 Regression and Model Building

• Not all observations will fall exactly on a straight line:

  y = β0 + β1x + ε

  where ε represents error
  – it is a random variable that accounts for the failure of the model to fit the data exactly
  – ε ~ N(0, σ²)


1.1 Regression and Model Building


Delivery time example


1.1 Regression and Model Building


• Simple Linear Regression Model

  y = β0 + β1x + ε

  where y – dependent (response) variable
        x – independent (regressor/predictor) variable
        β0 – intercept
        β1 – slope
        ε – random error term

1.1 Regression and Model Building

• The mean response at any value, x, of the regressor variable is
  E(y|x) = β0 + β1x

• The variance of y at any given x is
  Var(y|x) = σ²


Figure 1.3 Linear regression approximation of a complex relationship.


Figure 1.4 Piecewise linear approximation of a complex relationship.


Figure 1.5 The danger of extrapolation in regression.



1.1 Regression and Model Building

• Multiple Linear Regression Model

  y = β0 + β1x1 + β2x2 + … + βkxk + ε


1.2 Data Collection

• Your analysis/model is only as good as the data.


• Three different methods can be used for data collection:
1. A retrospective study based on historical data
2. An observational study
3. A designed experiment


Designed Experiment


1.3 Uses of Regression


There are many uses of regression, including:
1. Data description
2. Parameter estimation
3. Prediction and estimation
4. Control
Regression analysis is perhaps the most widely used
statistical technique, and probably the most widely misused.


1.3 Uses of Regression


Cause and Effect Relationships
• Caution: just because you can fit a linear model to a set of data does not mean that you should.
• It is relatively easy to build “nonsense” relationships between variables.
• Regression does not necessarily imply causality.


Model building in regression


Chapter 2
Simple Linear Regression


2.1 Simple Linear Regression Model

• Single regressor, x; response, y. Population regression model:

  y = β0 + β1x + ε

• β0 – intercept: if x = 0 is in the range of the data, then β0 is the mean of the distribution of the response y when x = 0; if x = 0 is not in the range, then β0 has no practical interpretation
• β1 – slope: the change in the mean of the distribution of the response produced by a unit change in x
• ε – random error


2.1 Simple Linear Regression Model

• The response, y, is a random variable
• There is a probability distribution for y at each value of x
  – Mean (expectation is a linear operator):
    E(y|x) = E(β0 + β1x + ε) = β0 + β1x + E(ε) = β0 + β1x
  – Variance:
    Var(y|x) = Var(β0 + β1x + ε) = σ²  for all levels of x


2.2 Least-Squares Estimation of the Parameters


Derivation of the Least Squares estimators for β0 and β1

• β0 and β1 are unknown and must be estimated
• Sample regression model:

  yi = β0 + β1xi + εi,  i = 1, 2, …, n

• Least squares estimation seeks to minimize the sum of squares of the differences between the observed response, yi, and the straight line:

  S(β0, β1) = Σ εi² = Σ (yi − β0 − β1xi)²

  (What is the LS estimator of μ in univariate statistics? Answer: x̄, since Σ(xi − μ)² is minimized for μ = x̄.)


2.2 Least-Squares Estimation of the Parameters


• Let β̂0, β̂1 represent the least squares estimators of β0 and β1, respectively.
• These estimators must satisfy:

  ∂S/∂β0 |β̂0,β̂1 = −2 Σ (yi − β̂0 − β̂1xi) = 0
  ∂S/∂β1 |β̂0,β̂1 = −2 Σ (yi − β̂0 − β̂1xi) xi = 0


2.2 Least-Squares Estimation of the Parameters

• Simplifying yields the least squares normal equations:

  n β̂0 + β̂1 Σ xi = Σ yi
  β̂0 Σ xi + β̂1 Σ xi² = Σ yi xi

  (all sums run from i = 1 to n)


2.2 Least-Squares Estimation of the Parameters

• Solving the normal equations yields the ordinary least squares estimators:

  β̂0 = ȳ − β̂1 x̄

  β̂1 = [Σ yi xi − (Σ yi)(Σ xi)/n] / [Σ xi² − (Σ xi)²/n]

• Under the assumption of normal error terms, these are the LS estimators and also the Maximum Likelihood estimators.
• These computational formulas can be easily programmed, in Excel for example. See RegressionExampleShelfStocking.xlsx.

2.2 Least-Squares Estimation of the Parameters

• The fitted simple linear regression model:

  ŷ = β̂0 + β̂1x

• Sum of Squares Notation:

  Sxx = Σ xi² − (Σ xi)²/n = Σ (xi − x̄)²

  Sxy = Σ yi xi − (Σ yi)(Σ xi)/n = Σ yi (xi − x̄)

2.2 Least-Squares Estimation of the Parameters

• Then

  β̂1 = [Σ yi xi − (Σ yi)(Σ xi)/n] / [Σ xi² − (Σ xi)²/n] = Sxy / Sxx
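These computational formulas are easy to program. Below is a minimal Python/numpy sketch; the data values are hypothetical, chosen only to make the snippet runnable (they are not from the textbook or the spreadsheet mentioned above):

```python
import numpy as np

# Hypothetical data (illustration only; not from the textbook).
x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])
y = np.array([3.1, 5.9, 9.2, 11.8, 15.1])
n = len(x)

# Corrected sums of squares and cross products.
Sxx = np.sum(x**2) - np.sum(x)**2 / n            # = sum((x - xbar)^2)
Sxy = np.sum(y * x) - np.sum(y) * np.sum(x) / n  # = sum(y * (x - xbar))

# Least-squares estimates of slope and intercept.
b1 = Sxy / Sxx
b0 = y.mean() - b1 * x.mean()
print(f"beta1_hat = {b1:.4f}, beta0_hat = {b0:.4f}")
```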


2.2 Least-Squares Estimation of the Parameters

• Residuals:

  ei = yi − ŷi   (observed − fitted)

• Residuals will be used to determine the adequacy of the model


Example 2.1 – The Rocket Propellant Data


Example 2.1 – Rocket Propellant Data


Example 2.1 – Rocket Propellant Data

• The least squares regression line is

  ŷ = 2627.82 − 37.15x


2.2 Least-Squares Estimation of the Parameters

• Just because we can fit a linear model doesn’t mean that we should
  – How well does this equation fit the data?
  – Is the model likely to be useful as a predictor?
  – Are any of the basic assumptions (such as constant variance and uncorrelated errors) violated? If so, how serious is this?


2.2 Least-Squares Estimation of the Parameters


Computer Output (Minitab)

We will learn how to populate the ANOVA table below.

When p is low,
H0 must go.


2.2.2 Properties of the Least-Squares Estimators and the Fitted Regression Model
• The ordinary least-squares (OLS) estimator of the slope is a linear combination of the observations, yi:

  β̂1 = Sxy / Sxx = Σ ci yi

  where ci = (xi − x̄)/Sxx, Σ ci = 0, and Σ ci² = 1/Sxx

• This representation is useful in showing expected value and variance properties.
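A quick numerical check of these weight properties, using the same hypothetical data as the earlier sketch:

```python
import numpy as np

x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])  # hypothetical data
y = np.array([3.1, 5.9, 9.2, 11.8, 15.1])

Sxx = np.sum((x - x.mean())**2)
Sxy = np.sum(y * (x - x.mean()))
c = (x - x.mean()) / Sxx                    # weights c_i

print(np.isclose(np.sum(c * y), Sxy / Sxx)) # beta1_hat = sum(c_i * y_i)
print(np.isclose(np.sum(c), 0.0))           # sum of weights is zero
print(np.isclose(np.sum(c**2), 1 / Sxx))    # sum of squared weights is 1/Sxx
```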


2.2.2 Properties of the Least-Squares Estimators and the Fitted Regression Model
• The least-squares estimators are unbiased estimators of their respective parameters:

  E(β̂1) = β1   E(β̂0) = β0

• The variances are

  Var(β̂1) = σ²/Sxx   Var(β̂0) = σ² (1/n + x̄²/Sxx)

• The OLS estimators are Best Linear Unbiased Estimators (BLUE) – Gauss-Markov Theorem

2.2.2 Properties of the Least-Squares Estimators and the Fitted Regression Model

• Useful properties of the least-squares fit (verified numerically in the sketch below):

  1. Σ (yi − ŷi) = Σ ei = 0
  2. Σ yi = Σ ŷi
  3. The least-squares regression line always passes through the centroid (x̄, ȳ) of the data.
  4. Σ xi ei = 0
  5. Σ ŷi ei = 0
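A minimal numerical verification of all five properties, again with hypothetical data:

```python
import numpy as np

x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])  # hypothetical data
y = np.array([3.1, 5.9, 9.2, 11.8, 15.1])

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean())**2)
b0 = y.mean() - b1 * x.mean()
yhat = b0 + b1 * x
e = y - yhat

print(np.isclose(e.sum(), 0))                    # 1. residuals sum to zero
print(np.isclose(y.sum(), yhat.sum()))           # 2. fitted values preserve the total
print(np.isclose(y.mean(), b0 + b1 * x.mean()))  # 3. line passes through (xbar, ybar)
print(np.isclose(np.sum(x * e), 0))              # 4. residuals orthogonal to x
print(np.isclose(np.sum(yhat * e), 0))           # 5. residuals orthogonal to yhat
```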


2.2.3 Estimation of σ²

• Residual (error) sum of squares

  SSRes = Σ (yi − ŷi)² = Σ ei²
        = Σ yi² − n ȳ² − β̂1 Sxy
        = SST − β̂1 Sxy

  where SST = Σ (yi − ȳ)² = Σ yi² − n ȳ²


2.2.3 Estimation of σ²

• Unbiased estimator of σ²:

  σ̂² = SSRes / (n − 2) = MSRes = s²

  s is known as the estimated standard error of the regression model

• The quantity n − 2 is the number of degrees of freedom for the residual sum of squares.
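A short sketch of this estimate (hypothetical data, as in the earlier snippets):

```python
import numpy as np

x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])  # hypothetical data
y = np.array([3.1, 5.9, 9.2, 11.8, 15.1])
n = len(x)

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean())**2)
b0 = y.mean() - b1 * x.mean()
e = y - (b0 + b1 * x)

SS_res = np.sum(e**2)
MS_res = SS_res / (n - 2)   # unbiased estimator of sigma^2
s = np.sqrt(MS_res)         # estimated standard error of the regression
print(MS_res, s)
```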


2.2.3 Estimation of σ²

• σ̂² depends on the residual sum of squares. Then:
– Any violation of the assumptions on the model errors could
damage the usefulness of this estimate
– A misspecification of the model can damage the usefulness of
this estimate
– This estimate is model dependent


2.3 Hypothesis Testing on the Slope and Intercept

• Three assumptions are needed to apply procedures such as hypothesis testing and confidence intervals. The model errors, εi,
  – are normally distributed
  – are independently distributed
  – have constant variance
  i.e. εi ~ NID(0, σ²)


2.3.1 Use of t-tests


Slope
H0: β1 = β10   H1: β1 ≠ β10

• Standard error of the slope: se(β̂1) = √(MSRes / Sxx)

• Test statistic: t0 = (β̂1 − β10) / se(β̂1)

• Reject H0 if |t0| > tα/2,n−2
• Can also use the P-value approach
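A minimal sketch of this t-test for the common case β10 = 0, using scipy.stats (hypothetical data, as before):

```python
import numpy as np
from scipy import stats

x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])  # hypothetical data
y = np.array([3.1, 5.9, 9.2, 11.8, 15.1])
n = len(x)

Sxx = np.sum((x - x.mean())**2)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / Sxx
b0 = y.mean() - b1 * x.mean()
MS_res = np.sum((y - b0 - b1 * x)**2) / (n - 2)

se_b1 = np.sqrt(MS_res / Sxx)
t0 = (b1 - 0.0) / se_b1                 # H0: beta1 = 0
p = 2 * stats.t.sf(abs(t0), df=n - 2)   # two-sided P-value
print(f"t0 = {t0:.3f}, p = {p:.4f}")
```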


2.3.1 Use of t-tests


Intercept
H0: β0 = β00   H1: β0 ≠ β00

• Standard error of the intercept: se(β̂0) = √(MSRes (1/n + x̄²/Sxx))

• Test statistic: t0 = (β̂0 − β00) / se(β̂0)

• Reject H0 if |t0| > tα/2,n−2
• Can also use the P-value approach


2.3.2 Testing Significance of Regression

H0: β1 = 0   H1: β1 ≠ 0
• This tests the significance of regression; that is, is there a linear relationship between the response and the regressor? Does x help to predict y?
• Failing to reject H0: β1 = 0 implies that there is no linear relationship between y and x.
• Under H0, E(y) = β0: the regression line is horizontal, so x provides no information about y.


Testing significance of regression – Always plot the data!


2.3.3 The Analysis of Variance Approach


• Partitioning of total variability (add and subtract ŷi):

  yi − ȳ = (ŷi − ȳ) + (yi − ŷi)

  Σ (yi − ȳ)² = Σ (ŷi − ȳ)² + Σ (yi − ŷi)² + 2 Σ (ŷi − ȳ)(yi − ŷi)

  The cross-product term equals 0, based on the properties of the least-squares estimators and the fitted regression model. So

  Σ (yi − ȳ)² = Σ (ŷi − ȳ)² + Σ (yi − ŷi)²
      SST     =     SSR      +     SSRes

• It can be shown that SSR = β̂1 Sxy


2.3.3 The Analysis of Variance Approach


• Degrees of Freedom

  SST = SSR + SSRes
  n − 1 = 1 + (n − 2)

• Mean Squares

  MSR = SSR / 1   MSRes = SSRes / (n − 2)


2.3.3 The Analysis of Variance Approach

(Rely on Cochran's theorem for the underlying distribution of MSR/MSRes.)

• ANOVA procedure for testing H0: β1 = 0

  Source of Variation   Sum of Squares   DF      MS       F0
  Regression            SSR              1       MSR      MSR/MSRes
  Residual              SSRes            n − 2   MSRes
  Total                 SST              n − 1

• A large value of F0 indicates that regression is significant; specifically, reject H0 if F0 > Fα,1,n−2
• Can also use the P-value approach: compute the probability, under the null hypothesis, that the random variable F is greater than the computed F0. When p is low, H0 must go.
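A minimal sketch that fills in the ANOVA quantities and the F-test (hypothetical data, as in the earlier snippets):

```python
import numpy as np
from scipy import stats

x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])  # hypothetical data
y = np.array([3.1, 5.9, 9.2, 11.8, 15.1])
n = len(x)

Sxx = np.sum((x - x.mean())**2)
Sxy = np.sum((x - x.mean()) * (y - y.mean()))
b1 = Sxy / Sxx

SS_T = np.sum((y - y.mean())**2)   # total sum of squares
SS_R = b1 * Sxy                    # regression sum of squares
SS_res = SS_T - SS_R               # residual sum of squares

MS_R, MS_res = SS_R / 1, SS_res / (n - 2)
F0 = MS_R / MS_res
p = stats.f.sf(F0, 1, n - 2)       # P(F > F0) under H0
print(f"F0 = {F0:.2f}, p = {p:.4f}")
```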


2.3.3 The Analysis of Variance Approach


Relationship between t0 and F0:
• For H0: β1 = 0, it can be shown that t0² = F0
• So for testing significance of regression, the t-test and the ANOVA procedure are equivalent (true only in simple linear regression)


2.4 Interval Estimation in Simple Linear Regression

• 100(1 − α)% confidence interval for the slope:

  β̂1 − tα/2,n−2 se(β̂1) ≤ β1 ≤ β̂1 + tα/2,n−2 se(β̂1)

• 100(1 − α)% confidence interval for the intercept:

  β̂0 − tα/2,n−2 se(β̂0) ≤ β0 ≤ β̂0 + tα/2,n−2 se(β̂0)
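A sketch of both intervals at α = 0.05 (hypothetical data, as before):

```python
import numpy as np
from scipy import stats

x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])  # hypothetical data
y = np.array([3.1, 5.9, 9.2, 11.8, 15.1])
n, alpha = len(x), 0.05

Sxx = np.sum((x - x.mean())**2)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / Sxx
b0 = y.mean() - b1 * x.mean()
MS_res = np.sum((y - b0 - b1 * x)**2) / (n - 2)

t = stats.t.ppf(1 - alpha / 2, n - 2)               # t_{alpha/2, n-2}
se_b1 = np.sqrt(MS_res / Sxx)
se_b0 = np.sqrt(MS_res * (1 / n + x.mean()**2 / Sxx))
print("beta1 CI:", (b1 - t * se_b1, b1 + t * se_b1))
print("beta0 CI:", (b0 - t * se_b0, b0 + t * se_b0))
```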


Also, see page 30, text


2.4 Interval Estimation in Simple Linear Regression

• 100(1 − α)% confidence interval for σ²:

  (n − 2) MSRes / χ²α/2,n−2 ≤ σ² ≤ (n − 2) MSRes / χ²1−α/2,n−2
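A sketch using scipy's chi-square quantiles (hypothetical data; here χ²α/2,n−2 denotes the upper α/2 percentage point, i.e. chi2.ppf(1 − α/2, n − 2)):

```python
import numpy as np
from scipy import stats

x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])  # hypothetical data
y = np.array([3.1, 5.9, 9.2, 11.8, 15.1])
n, alpha = len(x), 0.05

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean())**2)
b0 = y.mean() - b1 * x.mean()
MS_res = np.sum((y - b0 - b1 * x)**2) / (n - 2)

lo = (n - 2) * MS_res / stats.chi2.ppf(1 - alpha / 2, n - 2)  # divide by upper point
hi = (n - 2) * MS_res / stats.chi2.ppf(alpha / 2, n - 2)      # divide by lower point
print(f"{lo:.4f} <= sigma^2 <= {hi:.4f}")
```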


2.4.2 Interval Estimation of the Mean Response

• Let x0 be the level of the regressor variable at which we want to estimate the mean response, i.e.

  E(y|x0) = μy|x0

  – Point estimator of E(y|x0) once the model is fit:

    μ̂y|x0 = β̂0 + β̂1x0

  – In order to construct a confidence interval on the mean response, we need the variance of the point estimator.


2.4.2 Interval Estimation of the Mean Response

• The variance of μ̂y|x0 is (using the fact that ȳ and β̂1 are uncorrelated):

  Var(μ̂y|x0) = Var(β̂0 + β̂1x0) = Var[ȳ + β̂1(x0 − x̄)]
             = Var(ȳ) + Var[β̂1(x0 − x̄)]
             = σ²/n + σ²(x0 − x̄)²/Sxx
             = σ² (1/n + (x0 − x̄)²/Sxx)


2.4.2 Interval Estimation of the Mean Response

• 100(1 − α)% confidence interval for E(y|x0):

  μ̂y|x0 − tα/2,n−2 √(MSRes (1/n + (x0 − x̄)²/Sxx)) ≤ E(y|x0)
  ≤ μ̂y|x0 + tα/2,n−2 √(MSRes (1/n + (x0 − x̄)²/Sxx))

Notice that the length of the CI depends on the location of the point of interest
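A sketch of this interval at an assumed point x0 = 7.0 (both x0 and the data are hypothetical):

```python
import numpy as np
from scipy import stats

x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])  # hypothetical data
y = np.array([3.1, 5.9, 9.2, 11.8, 15.1])
n, alpha, x0 = len(x), 0.05, 7.0          # x0 chosen for illustration

Sxx = np.sum((x - x.mean())**2)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / Sxx
b0 = y.mean() - b1 * x.mean()
MS_res = np.sum((y - b0 - b1 * x)**2) / (n - 2)

mu_hat = b0 + b1 * x0
half = stats.t.ppf(1 - alpha / 2, n - 2) * np.sqrt(
    MS_res * (1 / n + (x0 - x.mean())**2 / Sxx))
print(f"E(y|x0) in [{mu_hat - half:.3f}, {mu_hat + half:.3f}]")
```

The half-width grows with (x0 − x̄)², which is exactly the dependence on location noted above.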


See pages 32-33, text


2.5 Prediction of New Observations

• Suppose we wish to construct a prediction interval on a future observation, y0, corresponding to a particular level of x, say x0.
• The point estimate would be:

  ŷ0 = β̂0 + β̂1x0

• The confidence interval on the mean response at this point is not appropriate for this situation. Why?

2.5 Prediction of New Observations

• Let the random variable Ψ be Ψ = y0 − ŷ0
• Ψ is normally distributed with
  – E(Ψ) = 0
  – Var(Ψ) = Var(y0 − ŷ0) = σ² (1 + 1/n + (x0 − x̄)²/Sxx)
    (y0 is independent of ŷ0)


2.5 Prediction of New Observations


• 100(1 − α)% prediction interval on a future observation, y0, at x0:

  ŷ0 − tα/2,n−2 √(MSRes (1 + 1/n + (x0 − x̄)²/Sxx)) ≤ y0
  ≤ ŷ0 + tα/2,n−2 √(MSRes (1 + 1/n + (x0 − x̄)²/Sxx))

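The same sketch as for the mean response, with the extra "1 +" term that accounts for the new observation's own error (hypothetical data, assumed x0 = 7.0):

```python
import numpy as np
from scipy import stats

x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])  # hypothetical data
y = np.array([3.1, 5.9, 9.2, 11.8, 15.1])
n, alpha, x0 = len(x), 0.05, 7.0          # x0 chosen for illustration

Sxx = np.sum((x - x.mean())**2)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / Sxx
b0 = y.mean() - b1 * x.mean()
MS_res = np.sum((y - b0 - b1 * x)**2) / (n - 2)

y0_hat = b0 + b1 * x0
half = stats.t.ppf(1 - alpha / 2, n - 2) * np.sqrt(
    MS_res * (1 + 1 / n + (x0 - x.mean())**2 / Sxx))  # note the extra 1
print(f"y0 in [{y0_hat - half:.3f}, {y0_hat + half:.3f}]")
```

The prediction interval is always wider than the confidence interval on the mean response at the same x0.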

2.6 Coefficient of Determination


• R² – coefficient of determination:

  R² = SSR / SST = 1 − SSRes / SST

• Proportion of variation explained by the regressor, x
• For the rocket propellant data, R² ≈ 0.90: about 90% of the variability in the response is explained by the regressor
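A minimal R² computation on the hypothetical data used in the earlier sketches:

```python
import numpy as np

x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])  # hypothetical data
y = np.array([3.1, 5.9, 9.2, 11.8, 15.1])

Sxy = np.sum((x - x.mean()) * (y - y.mean()))
b1 = Sxy / np.sum((x - x.mean())**2)

SS_T = np.sum((y - y.mean())**2)
SS_R = b1 * Sxy
R2 = SS_R / SS_T              # equivalently 1 - SS_res / SS_T
print(f"R^2 = {R2:.4f}")
```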


2.6 Coefficient of Determination

• R² can be misleading!
  – Simply adding more terms to the model will increase R²
  – As the range of the regressor variable increases (decreases), R² generally increases (decreases)
  – R² does not indicate the appropriateness of a linear model


Two Interesting Applications

• 2.7 A Service Industry Application of Regression


– Patient satisfaction study
• 2.8 Does Pitching Win Baseball Games?
– Wins versus team ERA


2.10 Considerations in the Use of Regression


• Extrapolating
• Extreme points will often influence the slope.
• Outliers can disturb the least-squares fit.
• A linear relationship does not imply a cause-effect relationship (see the interesting mental-defective example in the book, pg. 45).
• Sometimes the value of the regressor variable is unknown and must itself be estimated or predicted.
• Example: let x = # of workers and y = # of items shipped at an Amazon location. An increase in x may not cause an increase in y (due to tighter quality control procedures), and a decrease in x may result in an increase in y due to more efficient processes.


2.11 Regression Through the Origin


• The no-intercept model is

  y = β1x + ε

• This model would be appropriate for situations where the origin (0, 0) has some meaning.
• A scatter diagram can aid in determining whether an intercept or no-intercept model should be used.
• In addition, the practitioner could fit both models and examine the t-tests and residual mean squares. (A minimal no-intercept fit is sketched below.)
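A minimal sketch of the no-intercept fit; the closed form β̂1 = Σ xiyi / Σ xi² follows from minimizing Σ (yi − β1xi)² (hypothetical data, illustration only):

```python
import numpy as np

x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])  # hypothetical data
y = np.array([3.1, 5.9, 9.2, 11.8, 15.1])

# No-intercept least-squares slope: minimizes sum((y_i - b1*x_i)^2).
b1_origin = np.sum(x * y) / np.sum(x**2)
print(f"no-intercept slope = {b1_origin:.4f}")
```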


2.12 Estimation by Maximum Likelihood


• The method of least squares is not the only method of parameter estimation that can be used for linear regression models.
• If the form of the response (error) distribution is known, then the maximum likelihood (ML) estimation method can be used.
• In simple linear regression with normal errors, the MLEs for the regression coefficients are identical to those obtained by the method of least squares.
• Details of finding the MLEs are on page 52.
• The MLE of σ² is biased. (Not a serious problem here, though.)
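A sketch contrasting the biased ML estimate SSRes/n with the unbiased MSRes = SSRes/(n − 2) (hypothetical data):

```python
import numpy as np

x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])  # hypothetical data
y = np.array([3.1, 5.9, 9.2, 11.8, 15.1])
n = len(x)

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean())**2)
b0 = y.mean() - b1 * x.mean()
SS_res = np.sum((y - b0 - b1 * x)**2)

sigma2_mle = SS_res / n        # ML estimate: biased (divides by n)
sigma2_ls = SS_res / (n - 2)   # MS_res: unbiased (divides by n - 2)
print(sigma2_mle, sigma2_ls)
```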


(From the derivation in the text: the MLEs of β0 and β1 are identical to the OLS estimators; the MLE of σ², SSRes/n, is biased.)


2.13 Case Where the Regressor is Random

• When x and y have an unknown joint distribution, all of the “fixed x’s” regression results hold if certain conditions (given in the text) are satisfied.


The sample correlation coefficient is related to the slope:

  r = Sxy / √(Sxx · SST) = β̂1 √(Sxx / SST)

Also,

  r² = R²
