
Chapter 1
Introduction


1.1 Regression and Model Building

• Regression analysis is a statistical technique for investigating and modeling the relationship between variables.
• Equation of a straight line (classical):
  y = mx + b
  In regression we usually write this as
  y = β0 + β1x

1.1 Regression and Model Building

• Not all observations will fall exactly on a straight line:

  y = β0 + β1x + ε

  where ε represents error
  – it is a random variable that accounts for the failure of the model to fit the data exactly
  – ε ~ N(0, σ²)


1.1 Regression and Model Building


Delivery time example


1.1 Regression and Model Building


• Simple Linear Regression Model

  y = β0 + β1x + ε

  where y – dependent (response) variable
        x – independent (regressor/predictor) variable
        β0 – intercept
        β1 – slope
        ε – random error term

1.1 Regression and Model Building

• The mean response at any value, x, of the regressor variable is
  E(y|x) = β0 + β1x

• The variance of y at any given x is
  Var(y|x) = σ²


Figure 1.3 Linear regression approximation of a complex relationship.


Figure 1.4 Piecewise linear approximation of a complex relationship.


Figure 1.5 The danger of extrapolation in regression.



1.1 Regression and Model Building

• Multiple Linear Regression Model

  y = β0 + β1x1 + β2x2 + … + βkxk + ε


1.2 Data Collection

• Your analysis/model is only as good as the data.


• Three different methods can be used for data collection:
1. A retrospective study based on historical data
2. An observational study
3. A designed experiment


Designed Experiment


1.3 Uses of Regression


There are many uses of regression, including:
1. Data description
2. Parameter estimation
3. Prediction and estimation
4. Control
Regression analysis is perhaps the most widely used
statistical technique, and probably the most widely misused.


1.3 Uses of Regression


Cause and Effect Relationships
• Caution: just because you can fit a linear model to a set of data does not mean that you should.
• It is relatively easy to build “nonsense” relationships between variables.
• Regression does not necessarily imply causality.


Model building in regression


Chapter 2
Simple Linear Regression


2.1 Simple Linear Regression Model

• Single regressor, x; response, y. Population regression model:

  y = β0 + β1x + ε

• β0 – intercept: if x = 0 is in the range of the data, then β0 is the mean of the distribution of the response y when x = 0; if x = 0 is not in the range, then β0 has no practical interpretation
• β1 – slope: the change in the mean of the distribution of the response produced by a unit change in x
• ε – random error


2.1 Simple Linear Regression Model

• The response, y, is a random variable
• There is a probability distribution for y at each value of x
  – Mean (expectation is a linear operator):
    E(y|x) = E(β0 + β1x + ε) = β0 + β1x + E(ε) = β0 + β1x
  – Variance:
    Var(y|x) = Var(β0 + β1x + ε) = σ²  for all levels of x


2.2 Least-Squares Estimation of the Parameters


Derivation of the Least Squares estimators for β0 and β1

• β0 and β1 are unknown and must be estimated
• Sample regression model:

  yi = β0 + β1xi + εi,  i = 1, 2, …, n

• Least squares estimation seeks to minimize the sum of squares of the differences between the observed response, yi, and the straight line:

  S(β0, β1) = Σ εi² = Σ (yi − β0 − β1xi)²

  (What is the LS estimator of μ in univariate statistics? Answer: x̄, since Σ(xi − μ)² is minimized for μ = x̄.)


2.2 Least-Squares Estimation of the Parameters


• Let β̂0, β̂1 represent the least squares estimators of β0 and β1, respectively.
• These estimators must satisfy:

  ∂S/∂β0 |β̂0,β̂1 = −2 Σ (yi − β̂0 − β̂1xi) = 0
  ∂S/∂β1 |β̂0,β̂1 = −2 Σ (yi − β̂0 − β̂1xi) xi = 0


2.2 Least-Squares Estimation of the Parameters

• Simplifying yields the least squares normal equations:

  n β̂0 + β̂1 Σ xi = Σ yi
  β̂0 Σ xi + β̂1 Σ xi² = Σ yi xi

  (all sums run from i = 1 to n)


2.2 Least-Squares Estimation of the Parameters

• Solving the normal equations yields the ordinary least squares estimators:

  β̂0 = ȳ − β̂1 x̄

  β̂1 = [Σ yi xi − (Σ yi)(Σ xi)/n] / [Σ xi² − (Σ xi)²/n]

• Under the assumption of normal error terms, these are the LS estimators and also the Maximum Likelihood estimators.
• These computational formulas can be easily programmed, in Excel for example. See RegressionExampleShelfStocking.xlsx.

2.2 Least-Squares Estimation of the Parameters

• The fitted simple linear regression model:

  ŷ = β̂0 + β̂1x

• Sum of Squares Notation:

  Sxx = Σ xi² − (Σ xi)²/n = Σ (xi − x̄)²

  Sxy = Σ yi xi − (Σ yi)(Σ xi)/n = Σ yi (xi − x̄)

2.2 Least-Squares Estimation of the Parameters

• Then

  β̂1 = [Σ yi xi − (Σ yi)(Σ xi)/n] / [Σ xi² − (Σ xi)²/n] = Sxy / Sxx
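These computational formulas are easy to program. Below is a minimal Python/numpy sketch; the data values are hypothetical, chosen only to make the snippet runnable (they are not from the textbook or the spreadsheet mentioned above):

```python
import numpy as np

# Hypothetical data (illustration only; not from the textbook).
x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])
y = np.array([3.1, 5.9, 9.2, 11.8, 15.1])
n = len(x)

# Corrected sums of squares and cross products.
Sxx = np.sum(x**2) - np.sum(x)**2 / n            # = sum((x - xbar)^2)
Sxy = np.sum(y * x) - np.sum(y) * np.sum(x) / n  # = sum(y * (x - xbar))

# Least-squares estimates of slope and intercept.
b1 = Sxy / Sxx
b0 = y.mean() - b1 * x.mean()
print(f"beta1_hat = {b1:.4f}, beta0_hat = {b0:.4f}")
```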


2.2 Least-Squares Estimation of the Parameters

• Residuals:

  ei = yi − ŷi   (observed − fitted)

• Residuals will be used to determine the adequacy of the model


Example 2.1 – The Rocket Propellant Data


Example 2.1 – Rocket Propellant Data


Example 2.1 – Rocket Propellant Data

• The least squares regression line is

  ŷ = 2627.82 − 37.15x


2.2 Least-Squares Estimation of the Parameters

• Just because we can fit a linear model doesn’t mean that we should
  – How well does this equation fit the data?
  – Is the model likely to be useful as a predictor?
  – Are any of the basic assumptions (such as constant variance and uncorrelated errors) violated? If so, how serious is this?


2.2 Least-Squares Estimation of the Parameters


Computer Output (Minitab)

We will learn how to populate the ANOVA table below.

When p is low,
H0 must go.


2.2.2 Properties of the Least-Squares Estimators and the Fitted Regression Model
• The ordinary least-squares (OLS) estimator of the slope is a linear combination of the observations, yi:

  β̂1 = Sxy / Sxx = Σ ci yi

  where ci = (xi − x̄)/Sxx, Σ ci = 0, and Σ ci² = 1/Sxx

• This representation is useful in showing expected value and variance properties.
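A quick numerical check of these weight properties, using the same hypothetical data as the earlier sketch:

```python
import numpy as np

x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])  # hypothetical data
y = np.array([3.1, 5.9, 9.2, 11.8, 15.1])

Sxx = np.sum((x - x.mean())**2)
Sxy = np.sum(y * (x - x.mean()))
c = (x - x.mean()) / Sxx                    # weights c_i

print(np.isclose(np.sum(c * y), Sxy / Sxx)) # beta1_hat = sum(c_i * y_i)
print(np.isclose(np.sum(c), 0.0))           # sum of weights is zero
print(np.isclose(np.sum(c**2), 1 / Sxx))    # sum of squared weights is 1/Sxx
```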


2.2.2 Properties of the Least-Squares Estimators and the Fitted Regression Model
• The least-squares estimators are unbiased estimators of their respective parameters:

  E(β̂1) = β1   E(β̂0) = β0

• The variances are

  Var(β̂1) = σ²/Sxx   Var(β̂0) = σ² (1/n + x̄²/Sxx)

• The OLS estimators are Best Linear Unbiased Estimators (BLUE) – Gauss-Markov Theorem

2.2.2 Properties of the Least-Squares Estimators and the Fitted Regression Model

• Useful properties of the least-squares fit (verified numerically in the sketch below):

  1. Σ (yi − ŷi) = Σ ei = 0
  2. Σ yi = Σ ŷi
  3. The least-squares regression line always passes through the centroid (x̄, ȳ) of the data.
  4. Σ xi ei = 0
  5. Σ ŷi ei = 0
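A minimal numerical verification of all five properties, again with hypothetical data:

```python
import numpy as np

x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])  # hypothetical data
y = np.array([3.1, 5.9, 9.2, 11.8, 15.1])

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean())**2)
b0 = y.mean() - b1 * x.mean()
yhat = b0 + b1 * x
e = y - yhat

print(np.isclose(e.sum(), 0))                    # 1. residuals sum to zero
print(np.isclose(y.sum(), yhat.sum()))           # 2. fitted values preserve the total
print(np.isclose(y.mean(), b0 + b1 * x.mean()))  # 3. line passes through (xbar, ybar)
print(np.isclose(np.sum(x * e), 0))              # 4. residuals orthogonal to x
print(np.isclose(np.sum(yhat * e), 0))           # 5. residuals orthogonal to yhat
```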


2.2.3 Estimation of σ²

• Residual (error) sum of squares

  SSRes = Σ (yi − ŷi)² = Σ ei²
        = Σ yi² − n ȳ² − β̂1 Sxy
        = SST − β̂1 Sxy

  where SST = Σ (yi − ȳ)² = Σ yi² − n ȳ²


2.2.3 Estimation of σ²

• Unbiased estimator of σ²:

  σ̂² = SSRes / (n − 2) = MSRes = s²

  s is known as the estimated standard error of the regression model

• The quantity n − 2 is the number of degrees of freedom for the residual sum of squares.
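A short sketch of this estimate (hypothetical data, as in the earlier snippets):

```python
import numpy as np

x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])  # hypothetical data
y = np.array([3.1, 5.9, 9.2, 11.8, 15.1])
n = len(x)

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean())**2)
b0 = y.mean() - b1 * x.mean()
e = y - (b0 + b1 * x)

SS_res = np.sum(e**2)
MS_res = SS_res / (n - 2)   # unbiased estimator of sigma^2
s = np.sqrt(MS_res)         # estimated standard error of the regression
print(MS_res, s)
```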


2.2.3 Estimation of σ²

• σ̂² depends on the residual sum of squares. Then:
– Any violation of the assumptions on the model errors could
damage the usefulness of this estimate
– A misspecification of the model can damage the usefulness of
this estimate
– This estimate is model dependent


2.3 Hypothesis Testing on the Slope and Intercept

• Three assumptions are needed to apply procedures such as hypothesis testing and confidence intervals. The model errors, εi,
  – are normally distributed
  – are independently distributed
  – have constant variance
  i.e. εi ~ NID(0, σ²)


2.3.1 Use of t-tests


Slope
H0: β1 = β10   H1: β1 ≠ β10

• Standard error of the slope: se(β̂1) = √(MSRes / Sxx)

• Test statistic: t0 = (β̂1 − β10) / se(β̂1)

• Reject H0 if |t0| > tα/2,n−2
• Can also use the P-value approach
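A minimal sketch of this t-test for the common case β10 = 0, using scipy.stats (hypothetical data, as before):

```python
import numpy as np
from scipy import stats

x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])  # hypothetical data
y = np.array([3.1, 5.9, 9.2, 11.8, 15.1])
n = len(x)

Sxx = np.sum((x - x.mean())**2)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / Sxx
b0 = y.mean() - b1 * x.mean()
MS_res = np.sum((y - b0 - b1 * x)**2) / (n - 2)

se_b1 = np.sqrt(MS_res / Sxx)
t0 = (b1 - 0.0) / se_b1                 # H0: beta1 = 0
p = 2 * stats.t.sf(abs(t0), df=n - 2)   # two-sided P-value
print(f"t0 = {t0:.3f}, p = {p:.4f}")
```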


2.3.1 Use of t-tests


Intercept
H0: β0 = β00   H1: β0 ≠ β00

• Standard error of the intercept: se(β̂0) = √(MSRes (1/n + x̄²/Sxx))

• Test statistic: t0 = (β̂0 − β00) / se(β̂0)

• Reject H0 if |t0| > tα/2,n−2
• Can also use the P-value approach


2.3.2 Testing Significance of Regression

H0: β1 = 0   H1: β1 ≠ 0
• This tests the significance of regression; that is, is there a linear relationship between the response and the regressor? Does x help to predict y?
• Failing to reject H0: β1 = 0 implies that there is no linear relationship between y and x.
• Under H0, E(y) = β0: the regression line is horizontal, so x provides no information about y.


Testing significance of regression – Always plot the data!


2.3.3 The Analysis of Variance Approach


• Partitioning of total variability (add and subtract ŷi):

  yi − ȳ = (ŷi − ȳ) + (yi − ŷi)

  Σ (yi − ȳ)² = Σ (ŷi − ȳ)² + Σ (yi − ŷi)² + 2 Σ (ŷi − ȳ)(yi − ŷi)

  The cross-product term equals 0, based on the properties of the least-squares estimators and the fitted regression model. So

  Σ (yi − ȳ)² = Σ (ŷi − ȳ)² + Σ (yi − ŷi)²
      SST     =     SSR      +     SSRes

• It can be shown that SSR = β̂1 Sxy


2.3.3 The Analysis of Variance Approach


• Degrees of Freedom

  SST = SSR + SSRes
  n − 1 = 1 + (n − 2)

• Mean Squares

  MSR = SSR / 1   MSRes = SSRes / (n − 2)


2.3.3 The Analysis of Variance Approach

(Rely on Cochran's theorem for the underlying distribution of MSR/MSRes.)

• ANOVA procedure for testing H0: β1 = 0

  Source of Variation   Sum of Squares   DF      MS       F0
  Regression            SSR              1       MSR      MSR/MSRes
  Residual              SSRes            n − 2   MSRes
  Total                 SST              n − 1

• A large value of F0 indicates that regression is significant; specifically, reject H0 if F0 > Fα,1,n−2
• Can also use the P-value approach: compute the probability, under the null hypothesis, that the random variable F is greater than the computed F0. When p is low, H0 must go.
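A minimal sketch that fills in the ANOVA quantities and the F-test (hypothetical data, as in the earlier snippets):

```python
import numpy as np
from scipy import stats

x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])  # hypothetical data
y = np.array([3.1, 5.9, 9.2, 11.8, 15.1])
n = len(x)

Sxx = np.sum((x - x.mean())**2)
Sxy = np.sum((x - x.mean()) * (y - y.mean()))
b1 = Sxy / Sxx

SS_T = np.sum((y - y.mean())**2)   # total sum of squares
SS_R = b1 * Sxy                    # regression sum of squares
SS_res = SS_T - SS_R               # residual sum of squares

MS_R, MS_res = SS_R / 1, SS_res / (n - 2)
F0 = MS_R / MS_res
p = stats.f.sf(F0, 1, n - 2)       # P(F > F0) under H0
print(f"F0 = {F0:.2f}, p = {p:.4f}")
```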


2.3.3 The Analysis of Variance Approach


Relationship between t0 and F0:
• For H0: β1 = 0, it can be shown that t0² = F0
• So for testing significance of regression, the t-test and the ANOVA procedure are equivalent (true only in simple linear regression)


2.4 Interval Estimation in Simple Linear Regression

• 100(1 − α)% confidence interval for the slope:

  β̂1 − tα/2,n−2 se(β̂1) ≤ β1 ≤ β̂1 + tα/2,n−2 se(β̂1)

• 100(1 − α)% confidence interval for the intercept:

  β̂0 − tα/2,n−2 se(β̂0) ≤ β0 ≤ β̂0 + tα/2,n−2 se(β̂0)
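A sketch of both intervals at α = 0.05 (hypothetical data, as before):

```python
import numpy as np
from scipy import stats

x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])  # hypothetical data
y = np.array([3.1, 5.9, 9.2, 11.8, 15.1])
n, alpha = len(x), 0.05

Sxx = np.sum((x - x.mean())**2)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / Sxx
b0 = y.mean() - b1 * x.mean()
MS_res = np.sum((y - b0 - b1 * x)**2) / (n - 2)

t = stats.t.ppf(1 - alpha / 2, n - 2)               # t_{alpha/2, n-2}
se_b1 = np.sqrt(MS_res / Sxx)
se_b0 = np.sqrt(MS_res * (1 / n + x.mean()**2 / Sxx))
print("beta1 CI:", (b1 - t * se_b1, b1 + t * se_b1))
print("beta0 CI:", (b0 - t * se_b0, b0 + t * se_b0))
```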


Also, see page 30, text


2.4 Interval Estimation in Simple Linear Regression

• 100(1 − α)% confidence interval for σ²:

  (n − 2) MSRes / χ²α/2,n−2 ≤ σ² ≤ (n − 2) MSRes / χ²1−α/2,n−2
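A sketch using scipy's chi-square quantiles (hypothetical data; here χ²α/2,n−2 denotes the upper α/2 percentage point, i.e. chi2.ppf(1 − α/2, n − 2)):

```python
import numpy as np
from scipy import stats

x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])  # hypothetical data
y = np.array([3.1, 5.9, 9.2, 11.8, 15.1])
n, alpha = len(x), 0.05

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean())**2)
b0 = y.mean() - b1 * x.mean()
MS_res = np.sum((y - b0 - b1 * x)**2) / (n - 2)

lo = (n - 2) * MS_res / stats.chi2.ppf(1 - alpha / 2, n - 2)  # divide by upper point
hi = (n - 2) * MS_res / stats.chi2.ppf(alpha / 2, n - 2)      # divide by lower point
print(f"{lo:.4f} <= sigma^2 <= {hi:.4f}")
```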


2.4.2 Interval Estimation of the Mean Response

• Let x0 be the level of the regressor variable at which we want to estimate the mean response, i.e.

  E(y|x0) = μy|x0

  – Point estimator of E(y|x0) once the model is fit:

    μ̂y|x0 = β̂0 + β̂1x0

  – In order to construct a confidence interval on the mean response, we need the variance of the point estimator.


2.4.2 Interval Estimation of the Mean Response

• The variance of μ̂y|x0 is (using the fact that ȳ and β̂1 are uncorrelated):

  Var(μ̂y|x0) = Var(β̂0 + β̂1x0) = Var[ȳ + β̂1(x0 − x̄)]
             = Var(ȳ) + Var[β̂1(x0 − x̄)]
             = σ²/n + σ²(x0 − x̄)²/Sxx
             = σ² (1/n + (x0 − x̄)²/Sxx)


2.4.2 Interval Estimation of the Mean Response

• 100(1 − α)% confidence interval for E(y|x0):

  μ̂y|x0 − tα/2,n−2 √(MSRes (1/n + (x0 − x̄)²/Sxx)) ≤ E(y|x0)
  ≤ μ̂y|x0 + tα/2,n−2 √(MSRes (1/n + (x0 − x̄)²/Sxx))

Notice that the length of the CI depends on the location of the point of interest
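A sketch of this interval at an assumed point x0 = 7.0 (both x0 and the data are hypothetical):

```python
import numpy as np
from scipy import stats

x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])  # hypothetical data
y = np.array([3.1, 5.9, 9.2, 11.8, 15.1])
n, alpha, x0 = len(x), 0.05, 7.0          # x0 chosen for illustration

Sxx = np.sum((x - x.mean())**2)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / Sxx
b0 = y.mean() - b1 * x.mean()
MS_res = np.sum((y - b0 - b1 * x)**2) / (n - 2)

mu_hat = b0 + b1 * x0
half = stats.t.ppf(1 - alpha / 2, n - 2) * np.sqrt(
    MS_res * (1 / n + (x0 - x.mean())**2 / Sxx))
print(f"E(y|x0) in [{mu_hat - half:.3f}, {mu_hat + half:.3f}]")
```

The half-width grows with (x0 − x̄)², which is exactly the dependence on location noted above.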


See pages 32-33, text


2.5 Prediction of New Observations

• Suppose we wish to construct a prediction interval on a future observation, y0, corresponding to a particular level of x, say x0.
• The point estimate would be:

  ŷ0 = β̂0 + β̂1x0

• The confidence interval on the mean response at this point is not appropriate for this situation. Why?

2.5 Prediction of New Observations

• Let the random variable Ψ be Ψ = y0 − ŷ0
• Ψ is normally distributed with
  – E(Ψ) = 0
  – Var(Ψ) = Var(y0 − ŷ0) = σ² (1 + 1/n + (x0 − x̄)²/Sxx)
    (y0 is independent of ŷ0)


2.5 Prediction of New Observations


• 100(1 − α)% prediction interval on a future observation, y0, at x0:

  ŷ0 − tα/2,n−2 √(MSRes (1 + 1/n + (x0 − x̄)²/Sxx)) ≤ y0
  ≤ ŷ0 + tα/2,n−2 √(MSRes (1 + 1/n + (x0 − x̄)²/Sxx))

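The same sketch as for the mean response, with the extra "1 +" term that accounts for the new observation's own error (hypothetical data, assumed x0 = 7.0):

```python
import numpy as np
from scipy import stats

x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])  # hypothetical data
y = np.array([3.1, 5.9, 9.2, 11.8, 15.1])
n, alpha, x0 = len(x), 0.05, 7.0          # x0 chosen for illustration

Sxx = np.sum((x - x.mean())**2)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / Sxx
b0 = y.mean() - b1 * x.mean()
MS_res = np.sum((y - b0 - b1 * x)**2) / (n - 2)

y0_hat = b0 + b1 * x0
half = stats.t.ppf(1 - alpha / 2, n - 2) * np.sqrt(
    MS_res * (1 + 1 / n + (x0 - x.mean())**2 / Sxx))  # note the extra 1
print(f"y0 in [{y0_hat - half:.3f}, {y0_hat + half:.3f}]")
```

The prediction interval is always wider than the confidence interval on the mean response at the same x0.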

2.6 Coefficient of Determination


• R² – coefficient of determination:

  R² = SSR / SST = 1 − SSRes / SST

• Proportion of variation explained by the regressor, x
• For the rocket propellant data, R² ≈ 0.90: about 90% of the variability in the response is explained by the regressor
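A minimal R² computation on the hypothetical data used in the earlier sketches:

```python
import numpy as np

x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])  # hypothetical data
y = np.array([3.1, 5.9, 9.2, 11.8, 15.1])

Sxy = np.sum((x - x.mean()) * (y - y.mean()))
b1 = Sxy / np.sum((x - x.mean())**2)

SS_T = np.sum((y - y.mean())**2)
SS_R = b1 * Sxy
R2 = SS_R / SS_T              # equivalently 1 - SS_res / SS_T
print(f"R^2 = {R2:.4f}")
```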


2.6 Coefficient of Determination

• R² can be misleading!
  – Simply adding more terms to the model will increase R²
  – As the range of the regressor variable increases (decreases), R² generally increases (decreases)
  – R² does not indicate the appropriateness of a linear model


Two Interesting Applications

• 2.7 A Service Industry Application of Regression


– Patient satisfaction study
• 2.8 Does Pitching Win Baseball Games?
– Wins versus team ERA


2.10 Considerations in the Use of Regression


• Extrapolating
• Extreme points will often influence the slope.
• Outliers can disturb the least-squares fit.
• A linear relationship does not imply a cause-effect relationship (see the interesting mental-defective example in the book, pg. 45).
• Sometimes the value of the regressor variable is unknown and must itself be estimated or predicted.
• Example: let x = # of workers and y = # of items shipped at an Amazon location. An increase in x may not cause an increase in y (due to tighter quality control procedures), and a decrease in x may result in an increase in y due to more efficient processes.


2.11 Regression Through the Origin


• The no-intercept model is

  y = β1x + ε

• This model would be appropriate for situations where the origin (0, 0) has some meaning.
• A scatter diagram can aid in determining whether an intercept or no-intercept model should be used.
• In addition, the practitioner could fit both models and examine the t-tests and residual mean squares. (A minimal no-intercept fit is sketched below.)
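A minimal sketch of the no-intercept fit; the closed form β̂1 = Σ xiyi / Σ xi² follows from minimizing Σ (yi − β1xi)² (hypothetical data, illustration only):

```python
import numpy as np

x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])  # hypothetical data
y = np.array([3.1, 5.9, 9.2, 11.8, 15.1])

# No-intercept least-squares slope: minimizes sum((y_i - b1*x_i)^2).
b1_origin = np.sum(x * y) / np.sum(x**2)
print(f"no-intercept slope = {b1_origin:.4f}")
```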


2.12 Estimation by Maximum Likelihood


• The method of least squares is not the only method of parameter estimation that can be used for linear regression models.
• If the form of the response (error) distribution is known, then the maximum likelihood (ML) estimation method can be used.
• In simple linear regression with normal errors, the MLEs for the regression coefficients are identical to those obtained by the method of least squares.
• Details of finding the MLEs are on page 52.
• The MLE of σ² is biased. (Not a serious problem here, though.)
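A sketch contrasting the biased ML estimate SSRes/n with the unbiased MSRes = SSRes/(n − 2) (hypothetical data):

```python
import numpy as np

x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])  # hypothetical data
y = np.array([3.1, 5.9, 9.2, 11.8, 15.1])
n = len(x)

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean())**2)
b0 = y.mean() - b1 * x.mean()
SS_res = np.sum((y - b0 - b1 * x)**2)

sigma2_mle = SS_res / n        # ML estimate: biased (divides by n)
sigma2_ls = SS_res / (n - 2)   # MS_res: unbiased (divides by n - 2)
print(sigma2_mle, sigma2_ls)
```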


(From the derivation in the text: the MLEs of β0 and β1 are identical to the OLS estimators; the MLE of σ², SSRes/n, is biased.)


2.13 Case Where the Regressor is Random

• When x and y have an unknown joint distribution, all of the “fixed x’s” regression results hold if certain conditions (given in the text) are satisfied.


The sample correlation coefficient is related to the slope:

  r = Sxy / √(Sxx · SST) = β̂1 √(Sxx / SST)

Also,

  r² = R²
