SESI 11 - CH 13 - SIMPLE-REGRESSION - Levine - Smume6 - ppt13

You might also like

Download as pptx, pdf, or txt
Download as pptx, pdf, or txt
You are on page 1of 56

Statistics for Managers using

Microsoft Excel
6th Edition

Chapter 13

Simple Linear Regression

Copyright ©2011 Pearson Education, Inc. publishing as Prentice Hall 13-1


Learning Objectives
In this chapter, you learn:
 How to use regression analysis to predict the value of
a dependent variable based on an independent
variable
 The meaning of the regression coefficients b0 and b1
 How to evaluate the assumptions of regression
analysis and know what to do if the assumptions are
violated
 To make inferences about the slope and correlation
coefficient
 To estimate mean values and predict individual values
Copyright ©2011 Pearson Education, Inc. publishing as Prentice Hall 13-2
Correlation vs. Regression
DCOVA
 A scatter plot can be used to show the
relationship between two variables
 Correlation analysis is used to measure the
strength of the association (linear relationship)
between two variables
 Correlation is only concerned with strength of the
relationship
 No causal effect is implied with correlation
 Scatter plots were first presented in Ch. 2
 Correlation was first presented in Ch. 3

Copyright ©2011 Pearson Education, Inc. publishing as Prentice Hall 13-3


Introduction to
Regression Analysis
DCOVA
 Regression analysis is used to:
 Predict the value of a dependent variable based on
the value of at least one independent variable
 Explain the impact of changes in an independent
variable on the dependent variable
Dependent variable: the variable we wish to
predict or explain
Independent variable: the variable used to predict
or explain the dependent
variable

Copyright ©2011 Pearson Education, Inc. publishing as Prentice Hall 13-4


Simple Linear Regression
Model
DCOVA
 Only one independent variable, X
 Relationship between X and Y is
described by a linear function
 Changes in Y are assumed to be related
to changes in X

Copyright ©2011 Pearson Education, Inc. publishing as Prentice Hall 13-5


13.1 Types of Relationships
DCOVA

Linear relationships Curvilinear relationships

Y Y

X X

Y Y

X X
Copyright ©2011 Pearson Education, Inc. publishing as Prentice Hall 13-6
Types of Relationships
DCOVA
(continued)
Strong relationships Weak relationships

Y Y

X X

Y Y

X X
Copyright ©2011 Pearson Education, Inc. publishing as Prentice Hall 13-7
Types of Relationships
DCOVA
(continued)
No relationship

X
Copyright ©2011 Pearson Education, Inc. publishing as Prentice Hall 13-8
13.2 Simple Linear Regression Model

DCOVA

Population Random
Population Independent Error
Slope
Y intercept Variable term
Coefficient
Dependent
Variable

Yi  β0  β1Xi  ε i
Linear component Random Error
component

Copyright ©2011 Pearson Education, Inc. publishing as Prentice Hall 13-9


Simple Linear Regression
Model DCOVA
(continued)

Y Yi  β0  β1Xi  ε i
Observed Value
of Y for Xi

εi Slope = β1
Predicted Value
Random Error
of Y for Xi
for this Xi value

Intercept = β0

Xi X
Copyright ©2011 Pearson Education, Inc. publishing as Prentice Hall 13-10
Simple Linear Regression
Equation (Prediction Line) DCOVA
The simple linear regression equation provides an
estimate of the population regression line

Estimated
(or predicted) Estimate of Estimate of the
Y value for the regression regression slope
observation i
intercept
Value of X for

Ŷi  b0  b1Xi observation i

Copyright ©2011 Pearson Education, Inc. publishing as Prentice Hall 13-11


The Least Squares Method
DCOVA

b0 and b1 are obtained by finding the values of


that minimize the sum of the squared
differences between Y and Ŷ :

min  (Yi Ŷi )  min  (Yi  (b 0  b1Xi )) 2 2

Copyright ©2011 Pearson Education, Inc. publishing as Prentice Hall 13-12


Finding the Least Squares
Equation
DCOVA

 The coefficients b0 and b1 , and other


regression results in this chapter, will be
found using Excel

Formulas are shown in the text for those


who are interested

Copyright ©2011 Pearson Education, Inc. publishing as Prentice Hall 13-13


Interpretation of the
Slope and the Intercept
DCOVA

 b0 is the estimated average value of Y


when the value of X is zero

 b1 is the estimated change in the


average value of Y as a result of a
one-unit increase in X

Copyright ©2011 Pearson Education, Inc. publishing as Prentice Hall 13-14


Simple Linear Regression
Example P 502: Sunflowers Apparel
DCOVA
 The business objective of the director of planning
is to forecast annual sales for all new stores,
based on store size
 A random sample of 14 stores is selected
 Dependent variable (Y) = annual sales in

millions of dollars
 Independent variable (X) = square feet

(thousands)

Copyright ©2011 Pearson Education, Inc. publishing as Prentice Hall 13-15


Simple Linear Regression
Example: Data
DCOVA
Store Square Feet in 1,000 Annual Sales in $1,000,000s
(X) (Y)
1 1.7 3.7
2 1.6 3.9
3 2.8 6.7
4 5.6 9.5
5 1.3 3.4
6 2.2 5.6
7 1.3 3.7
8 1.1 2.7
9 3.2 5.5
10 1.5 2.9
11 5.2 10.7
12 4.6 7.6
13 5.8 11.8
14 3.0 4.1
Copyright ©2011 Pearson Education, Inc. publishing as Prentice Hall 13-16
Simple Linear Regression
Example: Scatter Plot
Scatter Diagram for Site Selection DCOVA

Annual Sales in $1,000,000s (Y)


14

12

10

0
0 1 2 3 4 5 6 7

Copyright ©2011 Pearson Education, Inc. publishing as Prentice Hall 13-17


Simple Linear Regression Example:
Using Excel Data Analysis Function
DCOVA
1. Choose Data 2. Choose Data Analysis
3. Choose Regression

Copyright ©2011 Pearson Education, Inc. publishing as Prentice Hall 13-18


Simple Linear Regression Example:
Using Excel Data Analysis Function
(continued)
Enter Y’s and X’s and desired options DCOVA

Copyright ©2011 Pearson Education, Inc. publishing as Prentice Hall 13-19


Simple Linear Regression Example:
Using PHStat
Add-Ins: PHStat: Regression: Simple Linear Regression

Copyright ©2011 Pearson Education, Inc. publishing as Prentice Hall 13-20


Simple Linear Regression Example:
Excel Output
DCOVA
Regression Statistics
Multiple R 0.950883 The regression equation is:
R Square 0.904179
Adjusted R Square 0.896194 Yˆi  0.9645  1.6699 X i (square feet)
Standard Error 0.96638
Observations 14

ANOVA
  df SS MS F Significance F
Regression 1 105.7476 105.7476 113.2335 1.82269E-07
Residual 12 11.20668 0.93389
Total 13 116.9543     

  Coefficients Standard Error t Stat P-value Lower 95% Upper 95%


Intercept 0.9645 0.5262 1.8329 0.0917 -0.1820 2.1110
Square Feet 1.6699 0.1569 10.6411 0.0000 1.3280 2.0118

Copyright ©2011 Pearson Education, Inc. publishing as Prentice Hall 13-21


Simple Linear Regression Example:
Scatter Plot
Scatter Diagram for Site Selection DCOVA

Annual Sales in $1,000,000s (Y)


14

12 Y = 0.9645 + 1.6699X
R2 = 0.9042
10

0
0 1 2 3 4 5 6 7

Copyright ©2011 Pearson Education, Inc. publishing as Prentice Hall 13-22


Simple Linear Regression Example:
Interpretation of bo
DCOVA

  0.9645 + 1.6699
 b0 is the estimated average value of Y when the
value of X is zero (if X = 0 is in the range of observed
X values)
 Because the square footage of the store cannot be
0, this Y intercept has little or no practical
interpretation.

Copyright ©2011 Pearson Education, Inc. publishing as Prentice Hall 13-23


Simple Linear Regression Example:
Interpreting b1 DCOVA

  0.9645 + 1.6699

 b1 estimates the change in the average value


of Y as a result of a one-unit increase in X
 Here, b1 = 1.6699 tells us that for each increase of 1.0
thousand square feet in size of the store, the predicted
annual sales are estimated to increase by 1.6699 millions
of dollars.

Copyright ©2011 Pearson Education, Inc. publishing as Prentice Hall 13-24


Simple Linear Regression
Example: Making Predictions
DCOVA
Predict the annual sales for a
store with 4,000 square feet:

  0.9645 + 1.6699

0.9645 + 1.6699 = 7.644 or $7,644,000

A store with 4,000 square feet has


predicted annual sales of $7,644,000

Copyright ©2011 Pearson Education, Inc. publishing as Prentice Hall 13-25


SOAL LATIHAN
 Manajer penjualan mengumpulkan data penjualan (dalam $1,000) dan lamanya
pengalaman (dalam tahun) dari 10 orang sales.

Sales person Years (X) Sales ($1,000) (Y) Pertanyaan:


1 1 80 1. Tentukan estimasi
2 3 97 persamaan regresi linear
3 4 92 sederhananya.
4 4 102 2. Interpretasikan
5 6 103 persamaannya
6 8 111 3. Berapakah besarnya
7 10 119 penjualan jika seorang
8 10 123 sales dengan
9 11 117 pengalaman 5 tahun?
10 13 136

Copyright ©2011 Pearson Education, Inc. publishing as Prentice Hall 13-26


Jawaban
Years (X) Sales (Y) X2 Y2 XY
1 80 1 6,400 80
3 97 9 9,409 291
4 92 16 8,464 368
4 102 16 10,404 408
6 103 36 10,609 618
8 111 64 12,321 888
10 119 100 14,161 1190
10 123 100 15,129 1230
11 117 121 13,689 1287
13 136 169 18,496 1768
avg 7 108
sum 70 1080 632 119,082 8,128

SSXY 568
SSX 142
b1 4
b0 80

Copyright ©2011 Pearson Education, Inc. publishing as Prentice Hall 13-27


13.3 Measures of Variation
DCOVA
 Total variation is made up of two parts:

SST  SSR  SSE


Total Sum of Regression Sum Error Sum of
Squares of Squares Squares

SST   ( Yi  Y )2 SSR   ( Ŷi  Y )2 SSE   ( Yi  Ŷi )2


where:
=Y
Mean value of the dependent variable
Yi = Observed value of the dependent variable

i
= Predicted value of Y for the given Xi value
Copyright ©2011 Pearson Education, Inc. publishing as Prentice Hall 13-28
Measures of Variation
(continued)

DCOVA
 SST = total sum of squares (Total Variation)
 Measures the variation of the Yi values around their mean
Y
 SSR = regression sum of squares (Explained Variation)
 Variation attributable to the relationship between X and Y
 SSE = error sum of squares (Unexplained Variation)
 Variation in Y attributable to factors other than X

Copyright ©2011 Pearson Education, Inc. publishing as Prentice Hall 13-29


Measures of Variation
(continued)

DCOVA
Y
Yi  
SSE = (Yi - Yi )2 Y
_
SST = (Yi - Y)2

Y  _
_ SSR = (Yi - Y)2 _
Y Y

Xi X
Copyright ©2011 Pearson Education, Inc. publishing as Prentice Hall 13-30
Coefficient of Determination, r2
DCOVA
 The coefficient of determination is the portion of
the total variation in the dependent variable that is
explained by variation in the independent variable
 The coefficient of determination is also called r-
squared and is denoted as r2

SSR regression sum of squares


r 2

SST total sum of squares

note: 0 r 1 2

Copyright ©2011 Pearson Education, Inc. publishing as Prentice Hall 13-31


Examples of Approximate
r2 Values
DCOVA
Y
r2 = 1

Perfect linear relationship


between X and Y:
X
r2 = 1
Y 100% of the variation in Y is
explained by variation in X

X
r =1
2

Copyright ©2011 Pearson Education, Inc. publishing as Prentice Hall 13-32


Examples of Approximate
r2 Values
DCOVA
Y
0 < r2 < 1

Weaker linear relationships


between X and Y:
X
Some but not all of the
Y
variation in Y is explained
by variation in X

X
Copyright ©2011 Pearson Education, Inc. publishing as Prentice Hall 13-33
Examples of Approximate
r2 Values
DCOVA

r2 = 0
Y
No linear relationship
between X and Y:

The value of Y does not


X depend on X. (None of the
r2 = 0
variation in Y is explained
by variation in X)

Copyright ©2011 Pearson Education, Inc. publishing as Prentice Hall 13-34


Simple Linear Regression Example:
Coefficient of Determination, r2 in Excel
DCOVA
SSR 105.7476
Regression Statistics
r2    0.9042
Multiple R 0.9509 SST 116 .9543
R Square 0.9042
Adjusted R Square 0.8962 90.42% of the variation in
Standard Error 0.9664
annual sales is explained by
Observations 14
variability in the size of store
ANOVA
  df SS MS F Significance F
Regression 1 105.7476 105.7476 113.2335 0.0000
Residual 12 11.2067 0.9339
Total 13 116.9543      

  Coefficients Standard Error t Stat P-value Lower 95% Upper 95%


Intercept 0.9645 0.5262 1.8329 0.0917 -0.1820 2.1110
Square Feet 1.6699 0.1569 10.6411 0.0000 1.3280 2.0118

Copyright ©2011 Pearson Education, Inc. publishing as Prentice Hall 13-35


SOAL LATIHAN
 Manajer penjualan mengumpulkan data penjualan dan lamanya pengalaman
(dalam tahun) dari 10 orang sales.

Sales person Years (X) Sales (Y) Pertanyaan:


1 1 80
1. Tentukan
2 3 97
3 4 92
koefisien
4 4 102 determinasi
5 6 103
6 8 111
7 10 119
8 10 123
9 11 117
10 13 136

Copyright ©2011 Pearson Education, Inc. publishing as Prentice Hall 13-36


Jawaban Soal Latihan
Coefficient of Determination, r2 in Excel
DCOVA
SSR 2272
Regression Statistics
r2    0.9304
Multiple R 0.964565 SST 2442
R Square 0.930385
Adjusted R Square 0.921683 93.04% of the variation in
Standard Error 4.609772
Observations
annual sales is explained by
10
variability in the size of store
ANOVA
  df SS MS F Significance F
Regression 1 2272 2272 106.9176 0.0000
Residual 8 170 21.25
Total 9 2442     

  Coefficients Standard Error t Stat P-value Lower 95% Upper 95%


Intercept 80 3.0753 26.0133 0.0000 72.9082 87.0918
Square Feet 4 0.3868 10.3401 0.0000 3.1079 4.8921

Copyright ©2011 Pearson Education, Inc. publishing as Prentice Hall 13-37


Standard Error of Estimate
DCOVA
 The standard deviation of the variation of
observations around the regression line is estimated
by
n

SSE
 ˆ
(Yi  Yi ) 2

i 1
S YX  
n2 n2
Where
SSE = error sum of squares
n = sample size

Copyright ©2011 Pearson Education, Inc. publishing as Prentice Hall 13-38


Simple Linear Regression Example:
Standard Error of Estimate in Excel
DCOVA
Regression Statistics
Multiple R 0.9509 S YX  0.9664
R Square 0.9042
Adjusted R Square 0.8962
Standard Error 0.9664
Observations 14

ANOVA
  df SS MS F Significance F
Regression 1 105.7476 105.7476 113.2335 0.0000
Residual 12 11.2067 0.9339
Total 13 116.9543      

  Coefficients Standard Error t Stat P-value Lower 95% Upper 95%


Intercept 0.9645 0.5262 1.8329 0.0917 -0.1820 2.1110
Square Feet 1.6699 0.1569 10.6411 0.0000 1.3280 2.0118

Copyright ©2011 Pearson Education, Inc. publishing as Prentice Hall 13-39


Comparing Standard Errors
DCOVA
SYX is a measure of the variation of observed
Y values from the regression line
Y Y

smallSYX X large SYX X

The magnitude of SYX should always be judged relative to the


size of the Y values in the sample data

Copyright ©2011 Pearson Education, Inc. publishing as Prentice Hall 13-40


13.7 Inferences About the Slope
DCOVA
 The standard error of the regression slope
coefficient (b1) is estimated by

S YX S YX
Sb1  
SSX  i
(X  X ) 2

where:
S=bEstimate
1
of the standard error of the slope

SSE
S YX  = Standard error of the estimate
n2
Copyright ©2011 Pearson Education, Inc. publishing as Prentice Hall 13-56
Inferences About the Slope:
t Test
DCOVA
 t test for a population slope
 Is there a linear relationship between X and Y?
 Null and alternative hypotheses
 H0: β1 = 0 (no linear relationship)
 H1: β1 ≠ 0 (linear relationship does exist)
 Test statistic where:
b1  β1
t STAT  b1 = regression slope
coefficient
Sb β1 = hypothesized slope
1
Sb1 = standard
d.f.  n  2 error of the slope

Copyright ©2011 Pearson Education, Inc. publishing as Prentice Hall 13-57


Inferences About the Slope:
t Test Example
Square Feet in Annual Sales in
1,000 $1,000,000s DCOVA
(X) (Y)
1.7 3.7
Estimated Regression Equation:
1.6 3.9
2.8 6.7
Annual Sales  0.9645  1.6699 X i
5.6 9.5
1.3 3.4
2.2 5.6
1.3 3.7 The slope of this model is 1.6699
1.1 2.7
3.2 5.5
Is there a relationship between the
1.5 2.9 size of the store and the annual sales
5.2 10.7 at the 0.05 level of significance?
4.6 7.6
5.8 11.8
3.0 4.1

Copyright ©2011 Pearson Education, Inc. publishing as Prentice Hall 13-58


Inferences About the Slope:
t Test Example
H0: β1 = 0 DCOVA

From Excel output: H1: β1 ≠ 0


  Coefficients Standard Error t Stat P-value
Intercept 0.9645 0.5262 1.8329 0.0917
Square Feet 1.6699 0.1569 10.6411 0.0000

b1 Sb1

b 1  β 1 1.6699  0
t STAT    10.6411
S b1 0.1569

Copyright ©2011 Pearson Education, Inc. publishing as Prentice Hall 13-59


Inferences About the Slope:
t Test Example
DCOVA
H0: β1 = 0
Test Statistic: tSTAT = 10.6411
H1: β1 ≠ 0

d.f. = 14- 2 = 12

a/2=.025 a/2=.025
Decision: Reject H0

There is sufficient evidence


Reject H0
-tα/2
Do not reject H0
tα/2
Reject H0 that the size of the store
0
-2.1788 2.1788 10.6411 affects annual sales

Copyright ©2011 Pearson Education, Inc. publishing as Prentice Hall 13-60


Inferences About the Slope:
t Test Example
DCOVA
H0: β1 = 0
H1: β1 ≠ 0
From Excel output:
  Coefficients Standard Error t Stat P-value
Intercept 0.9645 0.5262 1.8329 0.0917
Square Feet 1.6699 0.1569 10.6411 0.0000

p-value
Decision: Reject H0, since p-value < α
There is sufficient evidence that the
size of the store affects annual sales.
Copyright ©2011 Pearson Education, Inc. publishing as Prentice Hall 13-61
F Test for Significance
DCOVA
 F Test statistic: MSR
FSTAT 
MSE

where SSR
MSR 
k
SSE
MSE 
n  k 1
where FSTAT follows an F distribution with k numerator and (n – k - 1)
denominator degrees of freedom

(k = the number of independent variables in the regression model)

Copyright ©2011 Pearson Education, Inc. publishing as Prentice Hall 13-62


F-Test for Significance
Excel Output
DCOVA

Regression Statistics
Multiple R 0.9509
MSR 105.7476
R Square 0.9042 FSTAT    113.2335
Adjusted R MSE 0.9339
Square 0.8962
Standard Error 0.9664
With 1 and 12 degrees p-value for
Observations 14 of freedom the F-Test

ANOVA Significance
  df SS MS F F
Regression 1 105.7476 105.7476 113.2335 0.0000
Residual 12 11.2067 0.9339
Total 13 116.9543     

Copyright ©2011 Pearson Education, Inc. publishing as Prentice Hall 13-63


F Test for Significance
(continued)

DCOVA
H0: β1 = 0 Test Statistic:
H1: β1 ≠ 0 MSR
FSTAT   113.2335
 = .05 MSE

df1= 1 df2 = 12 Decision:


Critical Reject H0 at  = 0.05
Value:
F = 4.75
Conclusion:
 = .05
There is sufficient evidence that
0 F the size of the store affects
Do not Reject H0
reject H0
F.05 = 4.75
annual sales
Copyright ©2011 Pearson Education, Inc. publishing as Prentice Hall 13-64
SOAL LATIHAN
 Manajer penjualan mengumpulkan data penjualan dan lamanya pengalaman
(dalam tahun) dari 10 orang sales.

Sales person Years (X) Sales (Y) Pertanyaan:


1 1 80
1. Lakukan uji t dan
2 3 97
3 4 92
uji F
4 4 102
5 6 103
6 8 111
7 10 119
8 10 123
9 11 117
10 13 136

Copyright ©2011 Pearson Education, Inc. publishing as Prentice Hall 13-65


Jawaban: t Test
H0: β1 = 0 DCOVA

From Excel output: H1: β1 ≠ 0


  Coefficients Standard Error t Stat P-value
Intercept 80 3.0753 26.0133 0.0000
Square Feet 4 0.3868 10.3401 0.0000

b1 Sb1

b1  β1 40
t STAT    10.3401
S b1 0.3868

Copyright ©2011 Pearson Education, Inc. publishing as Prentice Hall 13-66


Jawaban: t Test
DCOVA
H0: β1 = 0
Test Statistic: tSTAT = 10.3401
H1: β1 ≠ 0

d.f. = 10- 2 = 8

a/2=.025 a/2=.025
Decision: Reject H0

Ada bukti yang cukup


Reject H0
-tα/2
Do not reject H0
tα/2
Reject H0 bahwa lamanya pengalaman
0
-2.3060 2.3060 10.3401 sales mempunyai pengaruh
pada penjualan
Copyright ©2011 Pearson Education, Inc. publishing as Prentice Hall 13-67
Jawaban F-Test
DCOVA

Regression Statistics
Multiple R 0.9646
MSR 2272
R Square 0.9304 FSTAT    106.9176
Adjusted R MSE 21.25
Square 0.9217
Standard Error 4.6098
With 1 and 8 degrees p-value for
Observations 10 of freedom the F-Test

ANOVA Significance
  df SS MS F F
Regression 1 2272 2272 106.9176 0.0000
Residual 8 170 21.25
Total 9 2442     

Copyright ©2011 Pearson Education, Inc. publishing as Prentice Hall 13-68


F Test for Significance (continued)

DCOVA
H0: β1 = 0 Test Statistic:
H1: β1 ≠ 0 MSR
FSTAT   106.9176
 = .05 MSE

df1= 1 df2 = 8 Decision:


Critical Reject H0 at  = 0.05
Value:
F = 5.32
Conclusion:
 = .05
Ada bukti yang cukup bahwa
0 F lamanya pengalaman sales
Do not Reject H0
reject H0
F.05 = 5.32
berpengaruh pada penjualan.
Copyright ©2011 Pearson Education, Inc. publishing as Prentice Hall 13-69
Confidence Interval Estimate
for the Slope
DCOVA
Confidence Interval Estimate of the Slope:
b1  t α / 2 S b d.f. = n - 2
1

Excel Printout for annual sales:


  Coefficients Standard Error t Stat P-value Lower 95% Upper 95%
Intercept 0.9645 0.5262 1.8329 0.0917 -0.1820 2.1110
Square Feet 1.6699 0.1569 10.6411 0.0000 1.3280 2.0118

At 95% level of confidence, the confidence interval for


the slope is (1.3280, 2.0118)

Copyright ©2011 Pearson Education, Inc. publishing as Prentice Hall 13-70


Confidence Interval Estimate
for the Slope (continued)
DCOVA
  Coefficients Standard Error t Stat P-value Lower 95% Upper 95%
Intercept 0.9645 0.5262 1.8329 0.0917 -0.1820 2.1110
Square Feet 1.6699 0.1569 10.6411 0.0000 1.3280 2.0118

The confidence interval indicates that for each


increase of $1,000 square feet, predicted annual
sales are estimated to increase by at least
$1,328,000 but no more than $2,011,800.

Copyright ©2011 Pearson Education, Inc. publishing as Prentice Hall 13-71

You might also like