Simple Linear Regression
Types of Relationships

[Scatter plots of Y versus X showing linear and curvilinear relationships]
Copyright ©2011 Pearson Education 13-5
Types of Relationships
(continued)
Strong relationships vs. weak relationships

[Scatter plots of Y versus X showing strong and weak linear relationships]
Types of Relationships
(continued)
No relationship

[Scatter plot of Y versus X showing no relationship]
Simple Linear Regression Model

The population regression model:

Yi = β0 + β1 Xi + εi

where
  Yi = dependent variable (value of Y for observation i)
  Xi = independent variable
  β0 = population Y intercept
  β1 = population slope coefficient
  εi = random error term

β0 + β1 Xi is the linear component; εi is the random error component.

[Graph: the observed value of Y for Xi lies a vertical distance εi (the random error) from the population line, which has intercept β0 and slope β1; the predicted value of Y for Xi lies on the line]
Simple Linear Regression Equation (Prediction Line)

The simple linear regression equation provides an estimate of the population regression line:

Ŷi = b0 + b1 Xi

where
  Ŷi = estimated (or predicted) Y value for observation i
  b0 = estimate of the regression intercept
  b1 = estimate of the regression slope
  Xi = value of X for observation i
[Scatter plot: house price ($1000s) versus square feet]
ANOVA
             df   SS          MS          F        Significance F
 Regression   1   18934.9348  18934.9348  11.0848  0.01039
 Residual     8   13665.5652  1708.1957
 Total        9   32600.5000
[Scatter plot with fitted line: house price ($1000s) versus square feet; intercept = 98.248, slope = 0.10977]

house price = 98.25 + 0.1098(2000) = 317.85

The predicted price for a house with 2000 square feet is 317.85 ($1000s) = $317,850
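As a check on the slide's arithmetic, a short least-squares sketch in Python (numpy assumed available) reproduces the intercept, slope, and prediction from the ten-house data set given later in the deck:

```python
import numpy as np

# House price data from the slides: Y = price ($1000s), X = square feet
x = np.array([1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700], dtype=float)
y = np.array([245, 312, 279, 308, 199, 219, 405, 324, 319, 255], dtype=float)

# Least-squares estimates: b1 = SSXY / SSX, b0 = Ybar - b1 * Xbar
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
print(round(b0, 3), round(b1, 5))  # 98.248 0.10977

# Predicted price (in $1000s) for a 2000 square-foot house
y_hat = b0 + b1 * 2000
print(round(y_hat, 2))  # 317.78 (the slide's 317.85 uses the rounded coefficients)
```

The tiny difference from the slide's $317,850 comes only from rounding b0 and b1 before multiplying.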
Simple Linear Regression Example: Making Predictions

When using a regression model for prediction, predict only within the relevant range of the data.

[Scatter plot: house price ($1000s) versus square feet, with the relevant range for interpolation marked. Do not try to extrapolate beyond the range of observed X values.]
Measures of Variation

SST = total sum of squares = Σ(Yi − Ȳ)²        (total variation in Y)
SSR = regression sum of squares = Σ(Ŷi − Ȳ)²   (variation explained by the regression)
SSE = error sum of squares = Σ(Yi − Ŷi)²       (unexplained variation)

SST = SSR + SSE
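The decomposition SST = SSR + SSE can be verified numerically on the house-price data from this chapter (a sketch, assuming numpy):

```python
import numpy as np

# Verify SST = SSR + SSE on the house price data from the slides
x = np.array([1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700], dtype=float)
y = np.array([245, 312, 279, 308, 199, 219, 405, 324, 319, 255], dtype=float)

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x

sst = np.sum((y - y.mean()) ** 2)      # total variation
ssr = np.sum((y_hat - y.mean()) ** 2)  # explained variation
sse = np.sum((y - y_hat) ** 2)         # unexplained variation
print(round(sst, 1), round(ssr, 4), round(sse, 4))  # 32600.5 18934.9348 13665.5652
```

These are exactly the SS column entries in the ANOVA output shown elsewhere in the deck.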
Coefficient of Determination, r²

The coefficient of determination is the portion of the total variation in the dependent variable that is explained by variation in the independent variable. It is also called r-squared and is denoted r².

r² = SSR / SST = regression sum of squares / total sum of squares

note: 0 ≤ r² ≤ 1

[Scatter plots: r² = 1 when X explains all of the variation in Y (perfect linear relationship)]
Examples of Approximate r² Values

r² = 0: no linear relationship between X and Y; none of the variation in Y is explained by variation in X.
r² = SSR / SST = 18934.9348 / 32600.5000 = 0.58082

58.08% of the variation in house prices is explained by variation in square feet.

Regression Statistics
 Multiple R          0.76211
 R Square            0.58082
 Adjusted R Square   0.52842
 Standard Error      41.33032
 Observations        10

ANOVA
             df   SS          MS          F        Significance F
 Regression   1   18934.9348  18934.9348  11.0848  0.01039
 Residual     8   13665.5652  1708.1957
 Total        9   32600.5000
Standard Error of Estimate

S_YX = sqrt( SSE / (n − 2) ) = sqrt( Σ(Yi − Ŷi)² / (n − 2) )

where
  SSE = error sum of squares
  n = sample size

For the house price model, S_YX = sqrt(13665.5652 / 8) = 41.33032.
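A one-line check of that arithmetic, using the SSE and n from the ANOVA output:

```python
import math

# Standard error of the estimate for the house-price model,
# using SSE and n from the ANOVA output above
sse, n = 13665.5652, 10
s_yx = math.sqrt(sse / (n - 2))
print(round(s_yx, 5))  # 41.33032
```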
Assumptions of Regression

Linearity: the relationship between X and Y is linear
Independence of Errors: error values are statistically independent
Normality of Error: error values are normally distributed for any given value of X
Equal Variance (also called homoscedasticity): the probability distribution of the errors has constant variance
Residual Analysis for Linearity

[Residual plots versus X: a curved pattern in the residuals indicates the relationship is not linear; a random scatter indicates linearity]
Residual Analysis for Independence

[Residual plots versus X: a pattern in the residuals over observations indicates the errors are not independent; a random scatter indicates independence]

Residual Analysis for Normality

[Normal probability plot of the residuals: percent versus residual; an approximately straight line supports the normality assumption]
Residual Analysis for Equal Variance

[Residual plots versus X: a funnel-shaped spread indicates non-constant variance; a uniform band around zero indicates constant variance]
Standard Error of the Slope

S_b1 = S_YX / sqrt(SSX) = S_YX / sqrt( Σ(Xi − X̄)² )

where:
  S_b1 = estimate of the standard error of the slope
  S_YX = sqrt( SSE / (n − 2) ) = standard error of the estimate
Inferences About the Slope: t Test

Estimated regression equation: house price = 98.25 + 0.1098 (sq. ft.)

 House Price in $1000s (Y)   Square Feet (X)
 245                         1400
 312                         1600
 279                         1700
 308                         1875
 199                         1100
 219                         1550
 405                         2350
 324                         2450
 319                         1425
 255                         1700

The slope of this model is 0.1098. Is there a relationship between the square footage of the house and its sales price?
t_STAT = (b1 − β1) / S_b1 = (0.10977 − 0) / 0.03297 = 3.32938

H0: β1 = 0    H1: β1 ≠ 0
d.f. = 10 − 2 = 8,  α/2 = 0.025

Decision: reject H0, since the p-value < α. There is sufficient evidence that square footage affects house price.
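The slope t statistic can be recomputed from the raw data (a sketch, assuming numpy):

```python
import math
import numpy as np

# Slope t test from the raw data: t_STAT = (b1 - 0) / S_b1, with S_b1 = S_YX / sqrt(SSX)
x = np.array([1400, 1600, 1700, 1875, 1100, 1550, 2350, 2450, 1425, 1700], dtype=float)
y = np.array([245, 312, 279, 308, 199, 219, 405, 324, 319, 255], dtype=float)

ssx = np.sum((x - x.mean()) ** 2)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / ssx
b0 = y.mean() - b1 * x.mean()
sse = np.sum((y - (b0 + b1 * x)) ** 2)
s_yx = math.sqrt(sse / (len(x) - 2))  # standard error of the estimate
s_b1 = s_yx / math.sqrt(ssx)          # standard error of the slope
t_stat = (b1 - 0) / s_b1
print(round(s_b1, 5), round(t_stat, 4))  # ≈ 0.03297 3.3294, matching the slide
```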
F Test for Significance

F_STAT = MSR / MSE

where
  MSR = SSR / k
  MSE = SSE / (n − k − 1)

F_STAT follows an F distribution with k numerator and (n − k − 1) denominator degrees of freedom.

F_STAT = MSR / MSE = 18934.9348 / 1708.1957 = 11.0848, with 1 and 8 degrees of freedom; the p-value for the F test is 0.01039.

Regression Statistics
 Multiple R          0.76211
 R Square            0.58082
 Adjusted R Square   0.52842
 Standard Error      41.33032
 Observations        10

ANOVA
             df   SS          MS          F        Significance F
 Regression   1   18934.9348  18934.9348  11.0848  0.01039
 Residual     8   13665.5652  1708.1957
 Total        9   32600.5000
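A direct check of the F statistic from the ANOVA entries:

```python
# Overall F test: F_STAT = MSR / MSE, using SSR and SSE from the ANOVA table above
ssr, sse = 18934.9348, 13665.5652
k, n = 1, 10                 # one predictor, ten observations
msr = ssr / k                # MSR = SSR / k
mse = sse / (n - k - 1)      # MSE = SSE / (n - k - 1)
f_stat = msr / mse
print(round(f_stat, 4))  # 11.0848
# With a single predictor, F_STAT equals the squared slope t statistic: 3.32938**2 ≈ 11.0848
```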
Multiple Regression Model

Yi = β0 + β1 X1i + β2 X2i + … + βk Xki + εi

Estimated multiple regression equation:

Ŷi = b0 + b1 X1i + b2 X2i + … + bk Xki

In this chapter we will use Excel or Minitab to obtain the regression slope coefficients and other regression summary measures.
Multiple Regression Equation (continued)

Two-variable model:

Ŷ = b0 + b1 X1 + b2 X2

[3-D plot: the fitted regression plane over the (X1, X2) plane, with slope b1 for variable X1 and slope b2 for variable X2]
Example: 2 Independent Variables

A distributor of frozen dessert pies wants to evaluate factors thought to influence demand.

Dependent variable: pie sales (units per week)
Independent variables: price (in $), advertising (in $100s)

Estimated equation: Sales = 306.526 − 24.975(Price) + 74.131(Advertising)

Regression Statistics
 R Square            0.52148
 Adjusted R Square   0.44172
 Standard Error      47.46341
 Observations        15

ANOVA
             df   SS         MS         F        Significance F
 Regression   2   29460.027  14730.013  6.53861  0.01201
 Residual    12   27033.306  2252.776
 Total       14   56493.333
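The fitted equation can be used directly as a prediction function. The coefficients below are the ones reported in the slide output; the price/advertising inputs are illustrative, not from the slides:

```python
# Prediction from the fitted pie-sales equation reported above:
# Sales = 306.526 - 24.975(Price) + 74.131(Advertising)
def predict_sales(price_dollars, advertising_100s):
    return 306.526 - 24.975 * price_dollars + 74.131 * advertising_100s

# Illustrative inputs: price $5.50, advertising $350 (i.e. 3.5 in $100s)
print(round(predict_sales(5.50, 3.5), 2))  # 428.62 pies per week
```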
What is the net effect of adding a new variable to the model? We lose a degree of freedom when a new X variable is added. Did the new X variable add enough explanatory power to offset the loss of that degree of freedom?
[3-D plot: the residual ei = (Yi − Ŷi) is the vertical distance from the observed point to the fitted plane at (x1i, x2i)]

The best-fit equation is found by minimizing the sum of squared errors, Σe².
Multiple Regression Assumptions

The errors, ei = (Yi − Ŷi), are assumed to be normally distributed.

Use the residual plots to check for violations of the regression assumptions:
  Residuals vs. Ŷi
  Residuals vs. X1i
  Residuals vs. X2i
  Residuals vs. time (if time-series data)
Are Individual Variables Significant?

Use t tests of the individual variable slopes. This shows whether there is a linear relationship between the variable Xj and Y, holding constant the effects of the other X variables.

Hypotheses:
  H0: βj = 0 (no linear relationship)
  H1: βj ≠ 0 (a linear relationship does exist between Xj and Y)

Test statistic:

t_STAT = (bj − 0) / S_bj   (d.f. = n − k − 1)

Confidence interval for a slope:

bj ± t_{α/2} S_bj,  where t has (n − k − 1) d.f.
Example: form a 95% confidence interval for the effect of changes in price (X1) on pie sales:

−24.975 ± (2.1788)(10.832)

So the interval is (−48.576, −1.374). This interval does not contain zero, so price has a significant effect on sales, holding constant the effect of advertising.
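The interval can be reproduced with the t critical value from scipy (a sketch; b1 and S_b1 are taken from the slide output):

```python
from scipy import stats

# 95% CI for the price slope: b1 ± t(0.025, n-k-1) * S_b1, values from the output above
b1, s_b1, df = -24.975, 10.832, 12
t_crit = stats.t.ppf(0.975, df)            # ≈ 2.1788
lo, hi = b1 - t_crit * s_b1, b1 + t_crit * s_b1
print(round(lo, 3), round(hi, 3))  # -48.576 -1.374
```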
Testing a Portion of the Model: Partial F Test

SSR(X1 | X2) = SSR(all variables) − SSR(X2)

α = .05, df = 1 and 12, F0.05 = 4.75

ANOVA (for X1 and X2)
             df   SS           MS
 Regression   2   29460.02687  14730.01343
 Residual    12   27033.30647  2252.775539
 Total       14   56493.33333

ANOVA (for X2 only)
             df   SS
 Regression   1   17484.22249
 Residual    13   39009.11085
 Total       14   56493.33333
Relationship between a t statistic and an F statistic:

(t_a)² = F_{1,a}   where a = degrees of freedom

Intermediate Calculations
 SSR(X1, X2)   29460.02687
 SST           56493.33333
 SSR(X2)       17484.22249     SSR(X1 | X2)   11975.80438
 SSR(X1)       11100.43803     SSR(X2 | X1)   18359.58884
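The partial F statistic for price follows directly from these tables (a sketch using the slide's sums of squares):

```python
# Partial F test for the contribution of X1 (price) given X2 (advertising):
# F_STAT = SSR(X1 | X2) / MSE(all), values from the ANOVA tables above
ssr_all = 29460.02687
ssr_x2 = 17484.22249
mse_all = 2252.775539           # MSE from the two-variable model, df = 12

ssr_x1_given_x2 = ssr_all - ssr_x2
f_stat = ssr_x1_given_x2 / mse_all
print(round(ssr_x1_given_x2, 5), round(f_stat, 3))  # 11975.80438 5.316
# 5.316 > F(0.05; 1, 12) = 4.75, so X1 significantly improves the model
```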
Coefficients of Partial Determination
 r²_{Y1.2} = 0.307000188
 r²_{Y2.1} = 0.404459524
Dummy-Variable Example

Ŷ = b0 + b1 X1 + b2 X2

Let:
  Y = pie sales
  X1 = price
  X2 = holiday (X2 = 1 if a holiday occurred during the week; X2 = 0 if there was no holiday that week)

Holiday:     Ŷ = b0 + b1 X1 + b2(1) = (b0 + b2) + b1 X1
No holiday:  Ŷ = b0 + b1 X1 + b2(0) = b0 + b1 X1

The two lines have different intercepts but the same slope. If H0: β2 = 0 is rejected, then "Holiday" has a significant effect on pie sales.

[Plot of Y (sales) versus X1: two parallel lines, with intercept b0 + b2 for holiday weeks (X2 = 1) and intercept b0 for non-holiday weeks (X2 = 0)]
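A minimal sketch of the dummy-variable model. The coefficient values below are hypothetical placeholders, not fitted values from the slides; the point is only that the dummy shifts the intercept while leaving the slope unchanged:

```python
# Dummy-variable sketch. b0, b1, b2 are hypothetical, NOT the slide's fitted values.
b0, b1, b2 = 300.0, -30.0, 15.0

def sales(price, holiday):
    # holiday: 1 if a holiday occurred during the week, else 0
    return b0 + b1 * price + b2 * holiday

# Same slope in price; the intercept shifts by b2 in holiday weeks
print(sales(5.0, 1) - sales(5.0, 0))  # 15.0, i.e. b2
```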
Dummy variables can also encode more than two categories. Let:
  Y = house price
  X1 = square feet
  X2 = 1 if ranch, 0 otherwise
  X3 = 1 if split level, 0 otherwise

Ŷ = b0 + b1 X1 + b2 X2 + b3 X3

Adding an interaction term between X1 and X2:

Ŷ = b0 + b1 X1 + b2 X2 + b3 (X1 X2)
Effect of Interaction

Suppose Y = 1 + 2X1 + 3X2 + 4X1X2:

X2 = 0:  Y = 1 + 2X1 + 3(0) + 4X1(0) = 1 + 2X1
X2 = 1:  Y = 1 + 2X1 + 3(1) + 4X1(1) = 4 + 6X1

Slopes are different if the effect of X1 on Y depends on the X2 value.
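The slide's numeric example can be evaluated directly to see the slope change:

```python
# Interaction example from the slide: Y = 1 + 2*X1 + 3*X2 + 4*X1*X2
def y(x1, x2):
    return 1 + 2 * x1 + 3 * x2 + 4 * x1 * x2

# The slope in X1 depends on the value of X2:
print(y(1, 0) - y(0, 0))  # 2  (X2 = 0: Y = 1 + 2*X1)
print(y(1, 1) - y(0, 1))  # 6  (X2 = 1: Y = 4 + 6*X1)
```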
Significance of Interaction Term

Quadratic Regression Model

Yi = β0 + β1 X1i + β2 X1i² + εi

The second independent variable is the square of the first variable.

where:
  β0 = Y intercept
  β1 = regression coefficient for the linear effect of X on Y
  β2 = regression coefficient for the quadratic effect on Y
  εi = random error in Y for observation i
[Four plots of Y versus X1 showing the possible quadratic shapes:
 β1 < 0, β2 > 0   β1 > 0, β2 > 0   β1 < 0, β2 < 0   β1 > 0, β2 < 0]

β1 = the coefficient of the linear term
β2 = the coefficient of the squared term
Testing the Overall Quadratic Model

Estimate the quadratic model to obtain the regression equation:

Ŷi = b0 + b1 X1i + b2 X1i²
Purity example: purity increases as filter time increases.

 Purity   Filter Time
   3        1
   7        2
   8        3
  15        5
  22        7
  33        8
  40       10
  54       12
  67       13
  70       14
  78       15
  85       15
  87       16
  99       17

[Scatter plot: purity versus time]
Quadratic regression results:

               Coefficients   Standard Error   t Stat    P-value
 Time          1.56496        0.60179          2.60052   0.02467
 Time-squared  0.24516        0.03258          7.52406   1.165E-05

 Regression Statistics
 R Square            0.99494
 Adjusted R Square   0.99402
 Standard Error      2.59513
 F = 1080.7330, Significance F = 2.368E-13

[Time-squared residual plot: residuals scattered randomly around 0]

The quadratic term is significant and improves the model: adjusted r² is higher, S_YX is lower, and the residuals are now random.
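The quadratic fit can be reproduced from the tabulated data (a sketch, assuming numpy; the slide reports b(Time) = 1.56496, b(Time-squared) = 0.24516, r² = 0.99494):

```python
import numpy as np

# Quadratic fit to the purity / filter-time data tabulated above
time = np.array([1, 2, 3, 5, 7, 8, 10, 12, 13, 14, 15, 15, 16, 17], dtype=float)
purity = np.array([3, 7, 8, 15, 22, 33, 40, 54, 67, 70, 78, 85, 87, 99], dtype=float)

coefs = np.polyfit(time, purity, 2)   # [b2, b1, b0], highest power first
fitted = np.polyval(coefs, time)
r2 = 1 - np.sum((purity - fitted) ** 2) / np.sum((purity - purity.mean()) ** 2)
print(coefs[0] > 0, round(r2, 3))     # positive quadratic term, r-square near 1
```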
Using Transformations in Regression Analysis

Idea: non-linear models can often be transformed to a linear form, and can then be estimated by least squares.

Transform X or Y or both to get a better fit or to deal with violations of the regression assumptions. Transformations can be based on theory, logic, or scatter plots.
The Square Root Transformation

Used to:
  overcome violations of the constant variance assumption
  fit a non-linear relationship

Original model:     Yi = β0 + β1 X1i + εi
Transformed model:  Yi = β0 + β1 √X1i + εi

[Plots: shape of the original relationship (b1 > 0 and b1 < 0) versus the relationship after transformation]
The Log Transformation

Original (exponential) model:  Yi = e^(β0 + β1 X1i + β2 X2i) εi

Transformed model:  ln Yi = β0 + β1 X1i + β2 X2i + ln εi
Variance Inflation Factor (VIF)

VIF_j = 1 / (1 − R_j²)

where R_j² is the coefficient of determination from regressing X_j on all the other X variables.
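A minimal VIF sketch for a two-predictor case, using simulated (not textbook) data to show how collinearity drives the VIF above 1:

```python
import numpy as np

# VIF sketch: regress one X on the other and apply VIF = 1 / (1 - R^2).
# The data here are simulated for illustration, not from the text.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = 0.8 * x1 + rng.normal(scale=0.6, size=200)  # deliberately collinear with x1

def vif(target, other):
    X = np.column_stack([np.ones_like(other), other])
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ beta
    r2 = 1 - resid @ resid / np.sum((target - target.mean()) ** 2)
    return 1 / (1 - r2)

print(vif(x1, x2) > 1.5)  # collinearity inflates the variance of the slope estimate
```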
The Cp Statistic

Cp = (1 − Rk²)(n − T) / (1 − RT²) − (n − 2(k + 1))

where
  k = number of independent variables in the candidate model
  T = total number of parameters (including the intercept) in the full model
  Rk² = coefficient of determination for the candidate model
  RT² = coefficient of determination for the full model
[Model-building flowchart: if more than one variable has a high VIF, remove the variable with the highest VIF and re-fit; if only one does, remove that X. Add quadratic and/or interaction terms or transform variables as needed, then perform predictions.]
Pitfalls and Ethical Considerations

To avoid pitfalls and address ethical considerations:
Understand that interpretation of an estimated regression coefficient is performed holding all other independent variables constant
Evaluate residual plots for each independent
variable
Evaluate interaction terms
Function: y = 3x + 12x + 2

[Plot of y versus x]
Statistical Learning:

Y = f(x) + ε
Y1 = f1(x) + ε
Methods to Estimate Function “ f ”

Parametric
Non-Parametric
Likelihood function
Bayes' theorem

[K-nearest-neighbor classification with K = 1 and K = 9]

Unsupervised Learning
• K-Means
• Hierarchical clustering