OSL

Slides Prepared by
JOHN S. LOUCKS
St. Edward's University
2002 South-Western/Thomson Learning

Chapter 14
Simple Linear Regression

Simple Linear Regression Model
Least Squares Method
Coefficient of Determination
Model Assumptions
Testing for Significance
Using the Estimated Regression Equation for Estimation and Prediction
Computer Solution
Residual Analysis: Validating Model Assumptions
Residual Analysis: Outliers and Influential Observations

The Simple Linear Regression Model

Simple Linear Regression Model
$$y = \beta_0 + \beta_1 x + \varepsilon$$

Simple Linear Regression Equation
$$E(y) = \beta_0 + \beta_1 x$$

Estimated Simple Linear Regression Equation
$$\hat{y} = b_0 + b_1 x$$

Least Squares Method
Least Squares Criterion
$$\min \sum (y_i - \hat{y}_i)^2$$
where:
$y_i$ = observed value of the dependent variable for the ith observation
$\hat{y}_i$ = estimated value of the dependent variable for the ith observation

The Least Squares Method
Slope for the Estimated Regression Equation
$$b_1 = \frac{\sum x_i y_i - (\sum x_i \sum y_i)/n}{\sum x_i^2 - (\sum x_i)^2/n}$$
y-Intercept for the Estimated Regression Equation
$$b_0 = \bar{y} - b_1 \bar{x}$$
where:
$x_i$ = value of independent variable for ith observation
$y_i$ = value of dependent variable for ith observation
$\bar{x}$ = mean value for independent variable
$\bar{y}$ = mean value for dependent variable
$n$ = total number of observations

Example: Reed Auto Sales
Simple Linear Regression
Reed Auto periodically has a special week-long sale. As part of the advertising campaign, Reed runs one or more television commercials during the weekend preceding the sale. Data from a sample of 5 previous sales are shown below.

Number of TV Ads    Number of Cars Sold
1                   14
3                   24
2                   18
1                   17
3                   27

Example: Reed Auto Sales
Slope for the Estimated Regression Equation
$$b_1 = \frac{220 - (10)(100)/5}{24 - (10)^2/5} = 5$$
y-Intercept for the Estimated Regression Equation
$$b_0 = 20 - 5(2) = 10$$
Estimated Regression Equation
$$\hat{y} = 10 + 5x$$
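
The slope and intercept for the Reed Auto data can be checked with a short Python sketch. It applies the computational formulas for $b_1$ and $b_0$ to the five observations; the function name `least_squares` and the variable names are illustrative choices, not part of the slides.

```python
# Reed Auto Sales sample: number of TV ads (x) and cars sold (y) for 5 sales.
tv_ads = [1, 3, 2, 1, 3]
cars_sold = [14, 24, 18, 17, 27]

def least_squares(x, y):
    """Return (b0, b1) from the computational least squares formulas."""
    n = len(x)
    sum_x = sum(x)
    sum_y = sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi ** 2 for xi in x)
    # Slope: (sum xy - (sum x)(sum y)/n) / (sum x^2 - (sum x)^2/n)
    b1 = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)
    # Intercept: y_bar - b1 * x_bar
    b0 = sum_y / n - b1 * sum_x / n
    return b0, b1

b0, b1 = least_squares(tv_ads, cars_sold)
print(f"y-hat = {b0:g} + {b1:g}x")
```

The intermediate sums are 220, 10, 100, and 24, the same values that appear in the slope calculation.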

Example: Reed Auto Sales
Scatter Diagram
[Figure: scatter diagram of Cars Sold (vertical axis, 0 to 30) versus TV Ads (horizontal axis, 0 to 4) with the fitted line $\hat{y} = 5x + 10$]

The Coefficient of Determination
Relationship Among SST, SSR, SSE
SST = SSR + SSE
$$\sum (y_i - \bar{y})^2 = \sum (\hat{y}_i - \bar{y})^2 + \sum (y_i - \hat{y}_i)^2$$
Coefficient of Determination
$$r^2 = \text{SSR}/\text{SST}$$
where:
SST = total sum of squares
SSR = sum of squares due to regression
SSE = sum of squares due to error

Example: Reed Auto Sales
Coefficient of Determination
$r^2$ = SSR/SST = 100/114 = .8772
The regression relationship is very strong since 88% of the variation in the number of cars sold can be explained by the linear relationship between the number of TV ads and the number of cars sold.
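
As a rough check of the figures 100 and 114, the three sums of squares can be computed directly from their definitions. This is only a sketch: it takes $b_0 = 10$ and $b_1 = 5$ from the estimated regression equation as given, and the variable names are illustrative.

```python
# Reed Auto Sales sample and the estimated regression equation y-hat = 10 + 5x.
tv_ads = [1, 3, 2, 1, 3]
cars_sold = [14, 24, 18, 17, 27]
b0, b1 = 10, 5

y_bar = sum(cars_sold) / len(cars_sold)
y_hat = [b0 + b1 * x for x in tv_ads]

sst = sum((y - y_bar) ** 2 for y in cars_sold)               # total sum of squares
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)                 # due to regression
sse = sum((y - yh) ** 2 for y, yh in zip(cars_sold, y_hat))  # due to error

r_squared = ssr / sst
print(sst, ssr, sse, round(r_squared, 4))
```

The decomposition SST = SSR + SSE gives SSE = 14 for this sample, a value the later significance tests rely on.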

The Correlation Coefficient
Sample Correlation Coefficient
$$r_{xy} = (\text{sign of } b_1)\sqrt{\text{Coefficient of Determination}}$$
$$r_{xy} = (\text{sign of } b_1)\sqrt{r^2}$$
where:
$b_1$ = the slope of the estimated regression equation $\hat{y} = b_0 + b_1 x$

Example: Reed Auto Sales
Sample Correlation Coefficient
$$r_{xy} = (\text{sign of } b_1)\sqrt{r^2}$$
The sign of $b_1$ in the equation $\hat{y} = 10 + 5x$ is "+".
$$r_{xy} = +\sqrt{.8772}$$
$$r_{xy} = +.9366$$
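
The sign-and-square-root rule is small enough to sketch in a few lines of Python. The helper name `sample_correlation` is hypothetical, and the inputs are the Reed Auto values $b_1 = 5$ and $r^2 = 100/114$.

```python
import math

def sample_correlation(b1, r_squared):
    """Return r_xy = (sign of b1) * sqrt(r^2)."""
    sign = 1 if b1 >= 0 else -1
    return sign * math.sqrt(r_squared)

r_xy = sample_correlation(b1=5, r_squared=100 / 114)
print(round(r_xy, 4))
```

A negative slope would flip the sign of the result while leaving its magnitude unchanged.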

Model Assumptions
Assumptions About the Error Term $\varepsilon$
The error $\varepsilon$ is a random variable with mean of zero.
The variance of $\varepsilon$, denoted by $\sigma^2$, is the same for all values of the independent variable.
The values of $\varepsilon$ are independent.
The error $\varepsilon$ is a normally distributed random variable.

Testing for Significance
To test for a significant regression relationship, we must conduct a hypothesis test to determine whether the value of $\beta_1$ is zero.
Two tests are commonly used:
t Test
F Test
Both tests require an estimate of $\sigma^2$, the variance of $\varepsilon$ in the regression model.

Testing for Significance
An Estimate of $\sigma^2$
The mean square error (MSE) provides the estimate of $\sigma^2$, and the notation $s^2$ is also used.
$$s^2 = \text{MSE} = \text{SSE}/(n - 2)$$
where:
$$\text{SSE} = \sum (y_i - \hat{y}_i)^2 = \sum (y_i - b_0 - b_1 x_i)^2$$

Testing for Significance
An Estimate of $\sigma$
To estimate $\sigma$ we take the square root of $s^2$.
The resulting $s$ is called the standard error of the estimate.
$$s = \sqrt{\text{MSE}} = \sqrt{\frac{\text{SSE}}{n - 2}}$$
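
For the Reed Auto sample, MSE and the standard error of the estimate can be worked out as in the sketch below. It assumes the fitted equation $\hat{y} = 10 + 5x$ from the example; the variable names are illustrative.

```python
import math

# Reed Auto Sales sample and the estimated regression equation y-hat = 10 + 5x.
tv_ads = [1, 3, 2, 1, 3]
cars_sold = [14, 24, 18, 17, 27]
b0, b1 = 10, 5

n = len(tv_ads)
# SSE = sum of (y_i - b0 - b1 * x_i)^2
sse = sum((y - b0 - b1 * x) ** 2 for x, y in zip(tv_ads, cars_sold))
mse = sse / (n - 2)   # s^2, the estimate of sigma^2
s = math.sqrt(mse)    # standard error of the estimate

print(round(mse, 3), round(s, 3))
```

The divisor is n - 2 rather than n because two parameters, $b_0$ and $b_1$, are estimated from the data.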

Testing for Significance: t Test
Hypotheses
$H_0: \beta_1 = 0$
$H_a: \beta_1 \neq 0$
Test Statistic
$$t = \frac{b_1}{s_{b_1}}$$
Rejection Rule
Reject $H_0$ if $t < -t_{\alpha/2}$ or $t > t_{\alpha/2}$
where $t_{\alpha/2}$ is based on a t distribution with n - 2 degrees of freedom.

Example: Reed Auto Sales
t Test
Hypotheses
$H_0: \beta_1 = 0$
$H_a: \beta_1 \neq 0$
Rejection Rule
For $\alpha$ = .05 and d.f. = 3, $t_{.025}$ = 3.182
Reject $H_0$ if $t < -3.182$ or $t > 3.182$
Test Statistic
$t$ = 5/1.08 = 4.63
Conclusion
Reject $H_0$
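
The value $s_{b_1} = 1.08$ is used above without its formula. The sketch below assumes the standard textbook expression $s_{b_1} = s / \sqrt{\sum (x_i - \bar{x})^2}$, which the slides do not show, and takes the critical value 3.182 from the example rather than computing it.

```python
import math

# Reed Auto Sales sample and the estimated regression equation y-hat = 10 + 5x.
tv_ads = [1, 3, 2, 1, 3]
cars_sold = [14, 24, 18, 17, 27]
b0, b1 = 10, 5
t_critical = 3.182  # t_.025 with 3 d.f., the table value quoted in the example

n = len(tv_ads)
x_bar = sum(tv_ads) / n
sse = sum((y - b0 - b1 * x) ** 2 for x, y in zip(tv_ads, cars_sold))
s = math.sqrt(sse / (n - 2))

# Assumption: standard formula for the estimated standard deviation of b1.
sxx = sum((x - x_bar) ** 2 for x in tv_ads)
s_b1 = s / math.sqrt(sxx)

t_stat = b1 / s_b1
reject_h0 = t_stat < -t_critical or t_stat > t_critical
print(round(s_b1, 2), round(t_stat, 2), reject_h0)
```

Under that assumption the sketch reproduces $s_{b_1} \approx 1.08$ and $t \approx 4.63$.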

Confidence Interval for $\beta_1$
We can use a 95% confidence interval for $\beta_1$ to test the hypotheses just used in the t test.
$H_0$ is rejected if the hypothesized value of $\beta_1$ is not included in the confidence interval for $\beta_1$.

Confidence Interval for $\beta_1$
The form of a confidence interval for $\beta_1$ is:
$$b_1 \pm t_{\alpha/2} s_{b_1}$$
where $b_1$ is the point estimate,
$t_{\alpha/2} s_{b_1}$ is the margin of error, and
$t_{\alpha/2}$ is the t value providing an area of $\alpha/2$ in the upper tail of a t distribution with n - 2 degrees of freedom

Example: Reed Auto Sales
Rejection Rule
Reject $H_0$ if 0 is not included in the confidence interval for $\beta_1$.
95% Confidence Interval for $\beta_1$
$b_1 \pm t_{\alpha/2} s_{b_1} = 5 \pm 3.182(1.08) = 5 \pm 3.44$
or 1.56 to 8.44
Conclusion
Reject $H_0$
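
The interval arithmetic can be sketched as a small helper. The function name `slope_confidence_interval` is hypothetical; the inputs $b_1 = 5$, $s_{b_1} = 1.08$, and $t_{.025} = 3.182$ are the rounded values used in the example.

```python
def slope_confidence_interval(b1, s_b1, t_critical):
    """Return (lower, upper) for b1 +/- t_critical * s_b1."""
    margin = t_critical * s_b1
    return b1 - margin, b1 + margin

lower, upper = slope_confidence_interval(b1=5, s_b1=1.08, t_critical=3.182)
# H0: beta_1 = 0 is rejected when 0 lies outside the interval.
reject_h0 = not (lower <= 0 <= upper)
print(round(lower, 2), round(upper, 2), reject_h0)
```

Because the interval and the two-tailed t test use the same critical value, they lead to the same conclusion about $H_0$.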

Testing for Significance: F Test
Hypotheses
$H_0: \beta_1 = 0$
$H_a: \beta_1 \neq 0$
Test Statistic
$$F = \text{MSR}/\text{MSE}$$
Rejection Rule
Reject $H_0$ if $F > F_{\alpha}$
where $F_{\alpha}$ is based on an F distribution with 1 d.f. in the numerator and n - 2 d.f. in the denominator.

Example: Reed Auto Sales
F Test
Hypotheses
$H_0: \beta_1 = 0$
$H_a: \beta_1 \neq 0$
Rejection Rule
For $\alpha$ = .05 and d.f. = 1, 3: $F_{.05}$ = 10.13
Reject $H_0$ if $F > 10.13$.
Test Statistic
$F$ = MSR/MSE = 100/4.667 = 21.43
Conclusion
We can reject $H_0$.
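
A minimal sketch of the F statistic follows, using SSR = 100 and SSE = 14 from the Reed Auto sums of squares. It assumes MSR = SSR divided by 1, the numerator degrees of freedom named in the rejection rule, and takes the critical value 10.13 from the example.

```python
ssr = 100          # sum of squares due to regression
sse = 14           # sum of squares due to error (SST - SSR = 114 - 100)
n = 5
f_critical = 10.13 # F_.05 with 1 and 3 d.f., the table value quoted in the example

msr = ssr / 1       # one independent variable, so 1 numerator d.f.
mse = sse / (n - 2)
f_stat = msr / mse
reject_h0 = f_stat > f_critical
print(round(f_stat, 2), reject_h0)
```

With a single independent variable the F statistic equals the square of the t statistic (about 4.63 squared here), so the two tests agree.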

Some Cautions about the Interpretation of Significance Tests
Rejecting $H_0: \beta_1 = 0$ and concluding that the relationship between x and y is significant does not enable us to conclude that a cause-and-effect relationship is present between x and y.
Just because we are able to reject $H_0: \beta_1 = 0$ and demonstrate statistical significance does not enable us to conclude that the relationship between x and y is linear.

Using the Estimated Regression Equation for Estimation and Prediction
Confidence Interval Estimate of $E(y_p)$
$$\hat{y}_p \pm t_{\alpha/2} s_{\hat{y}_p}$$
Prediction Interval Estimate of $y_p$
$$\hat{y}_p \pm t_{\alpha/2} s_{\text{ind}}$$
where the confidence coefficient is $1 - \alpha$ and $t_{\alpha/2}$ is based on a t distribution with n - 2 d.f.

Example: Reed Auto Sales
Point Estimation
If 3 TV ads are run prior to a sale, we expect the mean number of cars sold to be:
$\hat{y}$ = 10 + 5(3) = 25 cars
Confidence Interval for $E(y_p)$
95% confidence interval estimate of the mean number of cars sold when 3 TV ads are run is:
25 ± 4.61 = 20.39 to 29.61 cars
Prediction Interval for $y_p$
95% prediction interval estimate of the number of cars sold in one particular week when 3 TV ads are run is:
25 ± 8.28 = 16.72 to 33.28 cars
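
The margins 4.61 and 8.28 depend on $s_{\hat{y}_p}$ and $s_{\text{ind}}$, whose formulas the slides do not give. The sketch below assumes the standard textbook expressions $s_{\hat{y}_p} = s\sqrt{1/n + (x_p - \bar{x})^2 / \sum (x_i - \bar{x})^2}$ and $s_{\text{ind}} = s\sqrt{1 + 1/n + (x_p - \bar{x})^2 / \sum (x_i - \bar{x})^2}$; the function name `interval_margins` is hypothetical.

```python
import math

# Reed Auto Sales sample and the estimated regression equation y-hat = 10 + 5x.
tv_ads = [1, 3, 2, 1, 3]
cars_sold = [14, 24, 18, 17, 27]
b0, b1 = 10, 5
t_critical = 3.182  # t_.025 with 3 d.f., the table value quoted in the example

def interval_margins(x, y, x_p):
    """Return (confidence margin, prediction margin) at x_p.

    Assumes the standard textbook formulas for s_yp-hat and s_ind.
    """
    n = len(x)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))
    spread = 1 / n + (x_p - x_bar) ** 2 / sxx
    s_mean = s * math.sqrt(spread)      # for the mean value E(y_p)
    s_ind = s * math.sqrt(1 + spread)   # for an individual value y_p
    return t_critical * s_mean, t_critical * s_ind

y_p = b0 + b1 * 3
ci_margin, pi_margin = interval_margins(tv_ads, cars_sold, x_p=3)
print(y_p, round(ci_margin, 2), round(pi_margin, 2))
```

The prediction interval is wider because it covers the variation of a single week around the mean as well as the uncertainty in the estimated mean.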

Residual Analysis
Residual for Observation i
$$y_i - \hat{y}_i$$
Standardized Residual for Observation i
$$\frac{y_i - \hat{y}_i}{s_{y_i - \hat{y}_i}}$$
where:
$$s_{y_i - \hat{y}_i} = s\sqrt{1 - h_i}$$

Example: Reed Auto Sales
Residuals

Observation    Predicted Cars Sold    Residuals
1              15                     -1
2              25                     -1
3              20                     -2
4              15                      2
5              25                      2
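
The predicted values and residuals in the table can be regenerated with the short sketch below, which assumes the estimated equation $\hat{y} = 10 + 5x$ and the observation order of the original data table.

```python
# Reed Auto Sales sample and the estimated regression equation y-hat = 10 + 5x.
tv_ads = [1, 3, 2, 1, 3]
cars_sold = [14, 24, 18, 17, 27]
b0, b1 = 10, 5

predicted = [b0 + b1 * x for x in tv_ads]
residuals = [y - y_hat for y, y_hat in zip(cars_sold, predicted)]

for i, (y_hat, e) in enumerate(zip(predicted, residuals), start=1):
    print(i, y_hat, e)
```

The residuals sum to zero, as expected for a least squares fit that includes an intercept.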

Example: Reed Auto Sales
Residual Plot
[Figure: TV Ads Residual Plot, showing Residuals (vertical axis, -3 to 3) versus TV Ads (horizontal axis, 0 to 4)]

Residual Analysis
Detecting Outliers
An outlier is an observation that is unusual in comparison with the other data.
Minitab classifies an observation as an outlier if its standardized residual value is < -2 or > +2.
This standardized residual rule sometimes fails to identify an unusually large observation as being an outlier.
This rule's shortcoming can be circumvented by using studentized deleted residuals.
The |ith studentized deleted residual| will be larger than the |ith standardized residual|.
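
The ±2 rule can be applied to the Reed Auto data as a sketch. The slides do not define $h_i$; the code assumes the usual simple regression leverage $h_i = 1/n + (x_i - \bar{x})^2 / \sum (x_i - \bar{x})^2$ and the fitted equation $\hat{y} = 10 + 5x$. It covers standardized residuals only, not studentized deleted residuals.

```python
import math

# Reed Auto Sales sample and the estimated regression equation y-hat = 10 + 5x.
tv_ads = [1, 3, 2, 1, 3]
cars_sold = [14, 24, 18, 17, 27]
b0, b1 = 10, 5

n = len(tv_ads)
x_bar = sum(tv_ads) / n
sxx = sum((x - x_bar) ** 2 for x in tv_ads)
residuals = [y - b0 - b1 * x for x, y in zip(tv_ads, cars_sold)]
s = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))

std_residuals = []
for x, e in zip(tv_ads, residuals):
    h = 1 / n + (x - x_bar) ** 2 / sxx  # assumed leverage formula for h_i
    std_residuals.append(e / (s * math.sqrt(1 - h)))

# Flag observations (1-based) whose standardized residual is < -2 or > +2.
outliers = [i for i, z in enumerate(std_residuals, start=1) if z < -2 or z > 2]
print([round(z, 2) for z in std_residuals], outliers)
```

Under these assumptions no Reed Auto observation is flagged, since every standardized residual lies between -2 and +2.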

End of Chapter 14
