
Chapter 12 12-1

BUS210 Business Statistics
North Seattle Community College

Chapter 12
Simple Linear Regression

Learning Objectives

In this chapter, you learn:
• How to use regression analysis to predict the value of a dependent variable based on an independent variable
• The meaning of the regression coefficients b0 and b1
• How to evaluate the assumptions of regression analysis and know what to do if the assumptions are violated
• To make inferences about the slope and correlation coefficient
• To estimate mean values and predict individual values
BUS210: Business Statistics Simple Regression- 2

Correlation

• Correlation analysis…
  • is used to measure the association (linear relationship) between two variables
  • is only concerned with the strength of the relationship
  • does not imply cause and effect
  • can be visualized by the use of a scatter plot to show the relationship between two variables

Regression Analysis

• Regression analysis is used to…
  • predict the value of a dependent variable based on the value of at least one independent variable
  • explain the impact on the dependent variable from changes in an independent variable

Dependent variable (Y): the variable we wish to predict or explain
Independent variable (X): the variable used in order to predict or explain the dependent variable


Regression Analysis

Simple Linear Regression
• Only one independent variable, X
• Relationship between X and Y is described by a linear function
• Assumption:
  • Changes in Y are related to changes in X

Types of Relationships
[Figure: scatter plots of Y against X illustrating linear relationships and curvilinear relationships]

NSCC – BUS210 Simple Regression



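Strength of Relationships
[Figure: scatter plots of Y against X illustrating strong, weak, and no relationship]

Simple Linear Regression
For a population:

$$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$$

where
• $Y_i$ is the dependent variable
• $\beta_0$ is the intercept coefficient
• $\beta_1$ is the slope coefficient
• $X_i$ is the independent variable
• $\beta_0 + \beta_1 X_i$ is the linear component
• $\varepsilon_i$ is the random error component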


Simple Linear Regression
[Figure: the population regression line $Y_i = \beta_0 + \beta_1 X_i$ plotted as Y against X, marking the intercept ($\beta_0$), the slope ($\beta_1$), the observed value of Y for $X_i$, the predicted value of Y for $X_i$, and the random error $\varepsilon_i$ for this $X_i$ value. Commonly shown in math as y = mx + b.]

Linear Regression Equation
The simple linear regression equation…
• is based on a sample set
• gives an estimate of the population regression line

$$\hat{y}_i = b_0 + b_1 x_i$$

where
• $\hat{y}_i$ is the estimated (or predicted) Y value for observation i
• $b_0$ is the estimate of the regression intercept
• $b_1$ is the estimate of the regression slope
• $x_i$ is the value of X for observation i

The Least Squares Method
Least Squares Criterion: Minimize the sum of the squared differences between Y and Ŷ:

$$\min \sum (y_i - \hat{y}_i)^2$$

which equates to

$$\min \sum \bigl(y_i - (b_0 + b_1 x_i)\bigr)^2$$

The sum of the squared differences is minimized when:

$$b_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2} \quad \text{and} \quad b_0 = \bar{y} - b_1 \bar{x}$$
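The slides state the minimizing values of b0 and b1 without derivation. As a sketch of where they come from, the standard calculus argument (not taken from the slides; the name $S$ for the sum of squared differences is introduced here only for illustration) sets both partial derivatives to zero:

```latex
\begin{aligned}
S(b_0, b_1) &= \sum \bigl(y_i - b_0 - b_1 x_i\bigr)^2 \\
\frac{\partial S}{\partial b_0} &= -2 \sum \bigl(y_i - b_0 - b_1 x_i\bigr) = 0
  \;\Longrightarrow\; b_0 = \bar{y} - b_1 \bar{x} \\
\frac{\partial S}{\partial b_1} &= -2 \sum x_i \bigl(y_i - b_0 - b_1 x_i\bigr) = 0
  \;\Longrightarrow\; \sum x_i (y_i - \bar{y}) = b_1 \sum x_i (x_i - \bar{x}) \\
b_1 &= \frac{\sum x_i (y_i - \bar{y})}{\sum x_i (x_i - \bar{x})}
     = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}
\end{aligned}
```

The last equality uses $\sum (y_i - \bar{y}) = 0$ and $\sum (x_i - \bar{x}) = 0$, so subtracting $\bar{x}$ from each $x_i$ factor changes neither sum.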


Simple Linear Regression
Example:
• A real estate agent wishes to examine the relationship between the selling price of a home and its size (measured in square feet)
• A random sample of 10 houses is selected
  • X = square feet
  • Y = house price ($000s)

Square Feet   House Price ($000s)
x             y                     x - x̄     y - ȳ     (x - x̄)(y - ȳ)   (x - x̄)²
1400          245                   -315      -41.5     13072.5          99225
1600          312                   -115      25.5      -2932.5          13225
1700          279                   -15       -7.5      112.5            225
1875          308                   160       21.5      3440             25600
1100          199                   -615      -87.5     53812.5          378225
1550          219                   -165      -67.5     11137.5          27225
2350          405                   635       118.5     75247.5          403225
2450          324                   735       37.5      27562.5          540225
1425          319                   -290      32.5      -9425            84100
1700          255                   -15       -31.5     472.5            225
Sum                                 0.0       0.0       172500.0         1571500
Mean          1715    286.5
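The deviation columns and their sums can be checked with a few lines of Python. This is a minimal sketch using the ten houses from the table; the variable names (`s_xy`, `s_xx`, and so on) are chosen here for illustration and do not appear in the slides.

```python
# House data from the example: size in square feet and price in $000s.
square_feet = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]
price = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]

n = len(square_feet)
x_bar = sum(square_feet) / n   # mean of x
y_bar = sum(price) / n         # mean of y

# Deviations from the means (the "x - x-bar" and "y - y-bar" columns).
dx = [x - x_bar for x in square_feet]
dy = [y - y_bar for y in price]

# Column sums used by the least squares formulas.
s_xy = sum(a * b for a, b in zip(dx, dy))   # sum of (x - x-bar)(y - y-bar)
s_xx = sum(a * a for a in dx)               # sum of (x - x-bar)^2

print(x_bar, y_bar)   # 1715.0 286.5
print(s_xy, s_xx)     # 172500.0 1571500.0
```

The deviations in each column sum to zero, which is a quick sanity check on the means.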

Simple Linear Regression
From the table: $\sum (x_i - \bar{x})(y_i - \bar{y}) = 172500$, $\sum (x_i - \bar{x})^2 = 1571500$, $\bar{x} = 1715$, and $\bar{y} = 286.5$.

$$b_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2} = \frac{172500}{1571500} = 0.109768 \approx 0.11$$

$$b_0 = \bar{y} - b_1 \bar{x} = 286.5 - (0.109768)(1715) \approx 98.25$$

The intercept uses the unrounded slope; substituting the rounded value 0.11 would give 286.5 - (0.11)(1715) = 97.85 instead.

Scatter Plot
[Figure: House Price vs. Square Feet. Scatter plot of house price ($1000s, axis from 0 to 450) against square feet (axis from 0 to 3000).]
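The b1 and b0 calculation can be wrapped in a small function. The sketch below applies the least squares formulas to the example data; the function name `least_squares` is an illustrative choice, not something defined in the slides.

```python
def least_squares(x, y):
    """Return (b0, b1) for the fitted line y-hat = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx           # slope
    b0 = y_bar - b1 * x_bar    # intercept
    return b0, b1


square_feet = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]
price = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]

b0, b1 = least_squares(square_feet, price)
print(round(b1, 5), round(b0, 5))   # 0.10977 98.24833

# Rounding the slope to 0.11 before computing the intercept shifts the result.
print(round(286.5 - 0.11 * 1715, 2))   # 97.85
```

Carrying the unrounded slope through the intercept calculation is what reproduces the 98.25 reported on the slide.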


Using Excel

Excel output

Regression Statistics
Multiple R           0.76211
R Square             0.58082
Adjusted R Square    0.52842
Standard Error       41.33032
Observations         10

ANOVA
             df   SS           MS           F         Significance F
Regression   1    18934.9348   18934.9348   11.0848   0.01039
Residual     8    13665.5652   1708.1957
Total        9    32600.5000

              Coefficients   Standard Error   t Stat    P-value   Lower 95%   Upper 95%
Intercept     98.24833       58.03348         1.69296   0.12892   -35.57720   232.07386
Square Feet   0.10977        0.03297          3.32938   0.01039   0.03374     0.18580

The regression equation is:
house price = 98.25 + (0.11)(square feet)
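Several entries in the Excel output can be reproduced by hand. The sketch below uses the textbook formulas for simple regression with n - 2 residual degrees of freedom; these formulas are standard but are not stated on the slides, and the variable names are illustrative. P-values and confidence limits need the t distribution and are left out to keep the example within the standard library.

```python
import math

square_feet = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]
price = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]

n = len(square_feet)
x_bar = sum(square_feet) / n
y_bar = sum(price) / n
s_xx = sum((x - x_bar) ** 2 for x in square_feet)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(square_feet, price))
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Sums of squares for the ANOVA table.
ssr = sum((b0 + b1 * x - y_bar) ** 2 for x in square_feet)
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(square_feet, price))
mse = sse / (n - 2)            # residual mean square, 8 degrees of freedom

std_error = math.sqrt(mse)     # compare with Standard Error 41.33032
se_b1 = std_error / math.sqrt(s_xx)                         # compare with 0.03297
se_b0 = std_error * math.sqrt(1 / n + x_bar ** 2 / s_xx)    # compare with 58.03348
t_b1 = b1 / se_b1              # compare with t Stat 3.32938
t_b0 = b0 / se_b0              # compare with t Stat 1.69296
f_stat = ssr / mse             # regression has 1 df; compare with F 11.0848

print(std_error, se_b1, se_b0, t_b1, t_b0, f_stat)
```

With a single independent variable, the F statistic is the square of the slope's t statistic, which is why both rows of the output share the same p-value, 0.01039.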


Scatter Plot
[Figure: House Price vs. Square Feet. Scatter plot of house price ($1000s) against square feet with the fitted line house price = 98.25 + (0.11)(square feet); intercept = 98.25, slope = 0.11.]

Interpretation of b0
house price = 98.25 + 0.11 (square feet)
• b0 is…
  • the estimated mean value of Y
  • when the value of X is zero

Note: Because a house cannot have a square footage of zero, b0 has no practical application.



Interpretation of b1
house price = 98.25 + 0.11 (square feet)
• b1 is…
  • the estimated change in the mean value of Y
  • when the value of X changes by one unit

Here, b1 = 0.11 tells us that the mean value of a house increases by 0.11 × $1000 or about $110, on average, for each additional one square foot of size.

Making Predictions
Predict the price for a house with 2000 square feet:

house price = 98.25 + 0.11 (sq. ft.)
            = 98.25 + 0.11 (2000)
            ≈ 318

The predicted price for a house with 2000 square feet is about $318,000.
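The prediction and the slope interpretation can both be sketched as one small function. The coefficients are copied from the Excel output; the function name `predict_price` and its default arguments are illustrative assumptions rather than anything defined in the slides.

```python
B0 = 98.24833   # intercept from the Excel output
B1 = 0.10977    # slope from the Excel output


def predict_price(square_feet, b0=B0, b1=B1):
    """Predicted house price in $000s for a given size in square feet."""
    return b0 + b1 * square_feet


# Prediction with the five-decimal coefficients, then with the rounded ones.
print(round(predict_price(2000), 2))                # 317.79
print(round(predict_price(2000, 98.25, 0.11), 2))   # 318.25

# One extra square foot changes the prediction by b1 (in $000s).
extra_dollars = (predict_price(2001) - predict_price(2000)) * 1000
print(round(extra_dollars, 2))                      # 109.77
```

Both versions round to roughly $318,000; the small gap comes only from rounding the coefficients to two decimals.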

Making Predictions
• When using a regression model for prediction, only predict within the relevant range of data
[Figure: House Price vs. Square Feet scatter plot of house price ($1000s) against square feet, with the relevant range for interpolation marked.]
Do not try to extrapolate beyond the range of observed X's.

Measures of Variation

Coefficient of Determination
• Total variation of y is made up of two parts:

$$\sum (y_i - \bar{y})^2 = \sum (\hat{y}_i - \bar{y})^2 + \sum (y_i - \hat{y}_i)^2$$

Total Sum of Squares = Regression Sum of Squares + Error Sum of Squares

SST = SSR + SSE
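Returning to the advice about predicting only within the relevant range: one way to enforce it in code is to refuse predictions outside the observed X values. This is a minimal sketch; the function name `predict_in_range` and the choice to raise `ValueError` are illustrative assumptions, and the coefficients are the rounded values used on the slides.

```python
# Observed house sizes from the example define the relevant range.
square_feet = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]
X_MIN, X_MAX = min(square_feet), max(square_feet)   # 1100 and 2450


def predict_in_range(x, b0=98.25, b1=0.11):
    """Predict price ($000s) only when x lies inside the observed X range."""
    if not X_MIN <= x <= X_MAX:
        raise ValueError(
            f"{x} sq. ft. is outside the observed range {X_MIN} to {X_MAX}"
        )
    return b0 + b1 * x


print(round(predict_in_range(2000), 2))   # 318.25

try:
    predict_in_range(3000)                # extrapolation is rejected
except ValueError as err:
    print(err)
```

Raising an error is a deliberately strict choice; a softer design could return the prediction together with a warning.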

Coefficient of Determination (cont'd)
• SST = total sum of squares (Total Variation)
  • The variation of the Yi values around their mean
• SSR = regression sum of squares (Explained Variation)
  • Variation from the relationship between X and Y
• SSE = error sum of squares (Unexplained Variation)
  • Variation in Y attributable to factors other than X

Coefficient of Determination (cont'd)

$$SST = \sum (y_i - \bar{y})^2 \qquad SSR = \sum (\hat{y}_i - \bar{y})^2 \qquad SSE = \sum (y_i - \hat{y}_i)^2$$

[Figure: plot of y against x showing, for one observation at $x_i$, the observed value $y_i$, the fitted value $\hat{y}_i$ on the line $\hat{y}_i = b_0 + b_1 x_i$, and the mean $\bar{y}$, with the SST, SSR, and SSE distances labeled.]
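The three sums of squares can be computed directly for the house-price example to confirm that they add up. This is a sketch that refits the line from the data; variable names are illustrative.

```python
square_feet = [1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700]
price = [245, 312, 279, 308, 199, 219, 405, 324, 319, 255]

n = len(square_feet)
x_bar = sum(square_feet) / n
y_bar = sum(price) / n
b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(square_feet, price))
      / sum((x - x_bar) ** 2 for x in square_feet))
b0 = y_bar - b1 * x_bar

# Fitted value for each observation.
y_hat = [b0 + b1 * x for x in square_feet]

sst = sum((y - y_bar) ** 2 for y in price)                 # total variation
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)               # explained variation
sse = sum((y - yh) ** 2 for y, yh in zip(price, y_hat))    # unexplained variation

# Compare with the ANOVA table: 32600.5000, 18934.9348, 13665.5652.
print(sst, ssr, sse)
print(abs(sst - (ssr + sse)) < 1e-6)   # True
```

The identity SST = SSR + SSE holds for a least squares line; the comparison uses a small tolerance because of floating-point rounding.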

Coefficient of Determination (cont'd)
• The coefficient of determination is…
  • the portion of the total variation in the dependent variable that is explained by variation in the independent variable
  • also called r-squared and is denoted as r²

$$r^2 = \frac{\text{regression sum of squares}}{\text{total sum of squares}} = \frac{\sum (\hat{y}_i - \bar{y})^2}{\sum (y_i - \bar{y})^2} = \frac{SSR}{SST}$$

Note: Since r² is the ratio of SSR to SST, and SSR lies between 0 and SST, $0 \le r^2 \le 1$.

Coefficient of Determination (cont'd)
r² = 1
[Figure: two scatter plots of Y against X, each labeled r² = 1]
There is a perfect linear relationship between X and Y: 100% of the variation in Y is explained by variation in X.

Coefficient of Determination (cont'd)
0 < r² < 1
[Figure: two scatter plots of Y against X]
As r² decreases, there is a weaker linear relationship between X and Y: Some, but not all, of the variation in Y is explained by variation in X.

Coefficient of Determination (cont'd)
r² = 0
[Figure: scatter plot of Y against X, labeled r² = 0]
There is no linear relationship between X and Y: The value of Y does not depend linearly on X. None of the variation in Y is explained by variation in X.
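The three cases (r² = 1, 0 < r² < 1, r² = 0) can be illustrated numerically. The sketch below uses small made-up data sets that are hypothetical and chosen only so the arithmetic is easy to follow; the function name `r_squared` is likewise an illustrative choice.

```python
def r_squared(x, y):
    """Coefficient of determination SSR / SST for a least squares line."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    ssr = sum((b0 + b1 * xi - y_bar) ** 2 for xi in x)
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return ssr / sst


# Hypothetical data, made up for illustration only.
x = [1, 2, 3, 4, 5]
perfect = [3, 5, 7, 9, 11]   # exactly y = 2x + 1
partial = [2, 4, 5, 4, 5]    # upward trend with scatter
curved = [4, 1, 0, 1, 4]     # y = (x - 3)^2, symmetric about the mean of x

print(r_squared(x, perfect))             # 1.0
print(round(r_squared(x, partial), 2))   # 0.6
print(r_squared(x, curved))              # 0.0
```

The last case shows why r² = 0 is a statement about linear relationships only: Y is fully determined by X there, but a straight line explains none of the variation.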

Coefficient of Determination
Using Excel

$$r^2 = \frac{SSR}{SST} = \frac{18934.9348}{32600.5000} = 0.58082$$

58.08% of the variation in house prices is explained by variation in square feet.

Regression Statistics
Multiple R           0.76211
R Square             0.58082
Adjusted R Square    0.52842
Standard Error       41.33032
Observations         10

ANOVA
             df   SS           MS           F         Significance F
Regression   1    18934.9348   18934.9348   11.0848   0.01039
Residual     8    13665.5652   1708.1957
Total        9    32600.5000

              Coefficients   Standard Error   t Stat    P-value   Lower 95%   Upper 95%
Intercept     98.24833       58.03348         1.69296   0.12892   -35.57720   232.07386
Square Feet   0.10977        0.03297          3.32938   0.01039   0.03374     0.18580
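The first three lines of the Regression Statistics block follow from the ANOVA sums of squares. The sketch below recomputes them; the adjusted R Square formula is the standard one for k independent variables and is an addition here, since the slides do not state it.

```python
import math

ssr = 18934.9348   # regression sum of squares from the ANOVA table
sst = 32600.5000   # total sum of squares from the ANOVA table
n = 10             # observations
k = 1              # independent variables in simple regression

r2 = ssr / sst
multiple_r = math.sqrt(r2)   # equals |r| when there is one independent variable
adjusted_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)

print(round(r2, 5), round(multiple_r, 5), round(adjusted_r2, 5))
# 0.58082 0.76211 0.52842
```

Because the fitted slope is positive, the correlation coefficient r has the same sign, so r is approximately +0.762 for these data.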
