
Business Statistics

Fourth Canadian Edition

Chapter 20
Multiple Regression

Copyright © 2021 Pearson Canada Inc.


Ch. 20: Multiple Regression
Learning Objectives
1) Model one variable in terms of multiple other variables
2) Test the significance of the model



20.1 The Linear Multiple Regression Model
(1 of 5)

For simple regression, the predicted value depends on only one predictor variable:

ŷ = b0 + b1x

For multiple regression, we write the regression model with more predictor variables:

ŷ = b0 + b1x1 + b2x2 + … + bkxk

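To make the model concrete, here is a minimal sketch, assuming made-up data shaped like the chapter's home-price example (the coefficients and spreads are invented), that fits a two-predictor model by ordinary least squares with NumPy:

```python
import numpy as np

# Synthetic stand-ins for the chapter's home-price data (made-up numbers).
rng = np.random.default_rng(0)
n = 200
bedrooms = rng.integers(1, 6, size=n).astype(float)
living_area = 600.0 * bedrooms + rng.normal(0, 300, size=n)
price = 20000 - 7000 * bedrooms + 90 * living_area + rng.normal(0, 50000, size=n)

# Design matrix with an intercept column, so y-hat = b0 + b1*x1 + b2*x2.
X = np.column_stack([np.ones(n), bedrooms, living_area])
b, *_ = np.linalg.lstsq(X, price, rcond=None)
print("b0, b1, b2 =", b)
```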


20.1 The Linear Multiple Regression Model
(2 of 5)

Simple Regression Example: Home Price vs. Bedrooms (1 of 2)


Random sample of 1057 homes. Can Bedrooms be used to predict
Price?

• Approximately linear relationship
• Equal Spread Condition is violated
• Be cautious about using inferential methods on these data

Figure 20.1 Side-by-side boxplots of Price against Bedrooms show that price increases, on average, with more bedrooms.



20.1 The Linear Multiple Regression Model
(3 of 5)

Simple Regression Example: Home Price vs. Bedrooms (2 of 2)


Computer regression output: Price = 14349.48 + 48218.91 × Bedrooms
Response variable: Price    R² = 21.4%
s = 68432.21 with 1057 − 2 = 1055 degrees of freedom

Table 20.1 Linear regression of Price on Bedrooms.


Variable Coeff SE(Coeff) t-ratio P-value
Intercept 14349.48 9297.69 1.5 0.1230
Bedrooms 48218.91 2843.88 16.96 ≤0.0001

• The variation in Bedrooms accounts for only 21% of the variation in Price.
• Perhaps the inclusion of another factor can account for a portion of the remaining variation.
20.1 The Linear Multiple Regression Model
(4 of 5)

Multiple Regression: include Living Area as a second predictor in the model. Computer regression output:
Response variable: Price    R² = 57.8%
s = 50142.4 with 1057 − 3 = 1054 degrees of freedom

Table 20.2 Multiple regression output for the linear model predicting Price
from Bedrooms and Living Area.
Variable Coeff SE(Coeff) t-ratio P-value
Intercept 20986.09 6816.3 3.08 0.0021
Bedrooms −7483.10 2783.5 −2.69 0.0073
Living area 93.84 3.11 30.18 ≤0.0001

Price = 20,986.09 − 7483.10 × Bedrooms + 93.84 × Living Area

• Now the model accounts for 57.8% of the variation in Price.
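Output like Table 20.2 comes from any regression package. As a minimal sketch, the snippet below fits the same two-predictor model with Python's statsmodels on made-up data; the column names mirror the example, but the numbers are synthetic, so the fitted coefficients will only be in the same ballpark as Table 20.2.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Made-up data with the example's column names; the generating coefficients
# are assumptions chosen to resemble Table 20.2, not the text's actual data.
rng = np.random.default_rng(8)
n = 1057
home = pd.DataFrame({"Bedrooms": rng.integers(1, 6, size=n).astype(float)})
home["LivingArea"] = 650 * home["Bedrooms"] + rng.normal(0, 300, size=n)
home["Price"] = (21000 - 7500 * home["Bedrooms"]
                 + 94 * home["LivingArea"] + rng.normal(0, 50000, size=n))

X = sm.add_constant(home[["Bedrooms", "LivingArea"]])
model = sm.OLS(home["Price"], X).fit()
print(model.summary())   # coefficients, SEs, t-ratios, P-values, R², s, df
```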
20.1 The Linear Multiple Regression Model
(5 of 5)

Multiple Regression:
• Residuals: e = y − ŷ (as with simple regression)
• Degrees of freedom: df = n − k − 1
  n = number of observations
  k = number of predictor variables
• Standard deviation of residuals:

  se = √( Σ(y − ŷ)² / (n − k − 1) )

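A minimal sketch of these residual computations, assuming an arbitrary synthetic setup (any design matrix with an intercept column would do):

```python
import numpy as np

# n observations, k predictors; X holds an intercept column plus the predictors.
rng = np.random.default_rng(1)
n, k = 100, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = X @ np.array([5.0, 2.0, -3.0]) + rng.normal(0, 1.5, size=n)

b, *_ = np.linalg.lstsq(X, y, rcond=None)    # least-squares coefficients
e = y - X @ b                                # residuals: e = y - y-hat
s_e = np.sqrt(np.sum(e**2) / (n - k - 1))    # divide by df = n - k - 1
print(f"s_e = {s_e:.2f}")                    # should be near the true error SD, 1.5
```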


20.2 Interpreting Multiple Regression
Coefficients (1 of 4)
NOTE: The meaning of the coefficients in multiple regression can be subtly different from their meaning in simple regression.

Price = 20,986.09 − 7483.10 × Bedrooms + 93.84 × Living Area

Price drops with increasing bedrooms? Counterintuitive?



20.2 Interpreting Multiple Regression
Coefficients (2 of 4)
For houses with similar-sized Living Areas, more bedrooms means smaller bedrooms and/or smaller common living space. Cramped rooms may devalue the home.

Figure 20.3 For the 96 houses with Living Area between 2500 and 3000 square feet, the slope of Price on Bedrooms is negative. For each additional bedroom, restricting data to homes of this size, we would predict that the house's Price was about $17,800 lower.



20.2 Interpreting Multiple Regression
Coefficients (3 of 4)
So, what’s the correct answer to the question:
“Do more bedrooms tend to increase or decrease the
price of a home?”

Correct answer:
• “increase” if “Bedrooms” is the only predictor (“more
bedrooms” may mean “bigger house”, after all!)
• “decrease” if “Bedrooms” increases for fixed Living Area
(“more bedrooms” may mean “smaller, more-cramped
rooms”)
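The flip in sign is easy to reproduce. Here is a small sketch on made-up data (the generating coefficients are assumptions chosen to mimic the pattern, not the text's estimates): the simple-regression slope of Price on Bedrooms comes out positive, while the multiple-regression coefficient, which adjusts for Living Area, comes out negative.

```python
import numpy as np

# Made-up data: bigger houses have more bedrooms, but for a fixed living
# area, extra bedrooms lower the price.
rng = np.random.default_rng(2)
n = 500
bedrooms = rng.integers(1, 6, size=n).astype(float)
living_area = 700 * bedrooms + rng.normal(0, 250, size=n)
price = 20000 - 7000 * bedrooms + 90 * living_area + rng.normal(0, 30000, size=n)

# Simple regression: slope of Price on Bedrooms alone (positive).
b_simple = np.polyfit(bedrooms, price, 1)[0]

# Multiple regression: Bedrooms coefficient after adjusting for Living Area (negative).
X = np.column_stack([np.ones(n), bedrooms, living_area])
b_multi = np.linalg.lstsq(X, price, rcond=None)[0][1]

print(f"simple slope: {b_simple:+.0f}, multiple-regression coefficient: {b_multi:+.0f}")
```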



20.2 Interpreting Multiple Regression
Coefficients (4 of 4)
Summarizing:
Multiple regression coefficients must be interpreted in
terms of the other predictors in the model.



20.3 Assumptions and Conditions for the
Multiple Regression Model (1 of 8)
Linearity Assumption: Check each of the predictors.
Home Prices Example:

Linearity Condition is well-satisfied for both Bedrooms and Living Area.



20.3 Assumptions and Conditions for the
Multiple Regression Model (2 of 8)
Linearity Assumption: Also check the residuals plot.
Home Prices Example:

Figure 20.4 A scatterplot of Residuals against the Predicted Values shows no obvious pattern.

Linearity Condition is well-satisfied.


20.3 Assumptions and Conditions for the
Multiple Regression Model (3 of 8)
Independence Assumption:
• As usual, there is no way to be sure the assumption is
satisfied
• But, think about how the data were collected to decide if
the assumption is reasonable
• Check the Randomization Condition as well. Does the data
collection method introduce any bias?



20.3 Assumptions and Conditions for the
Multiple Regression Model (4 of 8)
Equal Variance Assumption:
• The variability of the errors should be about the same for each
predictor.
• Use scatterplots to assess the Equal Spread Condition.

Home Price Example: scatterplot of Residuals vs. Predicted Values.



20.3 Assumptions and Conditions for the
Multiple Regression Model (5 of 8)
Normality Assumption:
• Check to see if the distribution of residuals is unimodal and
symmetric.



20.3 Assumptions and Conditions for the
Multiple Regression Model (6 of 8)
Home Price Example:
The “tails” of the distribution appear to be non-Normal.

Figure 20.5 A histogram of the residuals shows a unimodal, symmetric distribution, but the tails seem a
bit longer than one would expect from a Normal model. The Normal probability plot confirms that.



20.3 Assumptions and Conditions for the
Multiple Regression Model (7 of 8)
Summary of Multiple Regression Model and Condition Checks:
1) Check Linearity Condition with a scatterplot for each predictor. If
necessary, consider data re-expression
2) If the Linearity Condition is satisfied, fit a multiple regression
model to the data
3) Find the residuals and predicted values
4) Inspect a scatterplot of the residuals against the predicted values.
Check for nonlinearity and non-uniform variation.
5) Think about how the data were collected.
– Do you expect the data to be independent?
– Was suitable randomization utilized?
– Are the data representative of a clearly identifiable population?
– Is autocorrelation an issue?
20.3 Assumptions and Conditions for the
Multiple Regression Model (8 of 8)
Summary of Multiple Regression Model and Condition Checks:
6) If the conditions check out this far, feel free to interpret the
regression model and use it for prediction.
7) Check the Nearly Normal Condition by inspecting a histogram of the residuals and a Normal probability plot. If the sample size is large, Normality is less important for inference. Watch for skewness and outliers.
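A sketch of the residual checks in steps 4 and 7, using synthetic residuals purely for illustration; any fitted model's residuals and predicted values could be substituted.

```python
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

# Synthetic residuals and predictions, standing in for a fitted model's output.
rng = np.random.default_rng(7)
residuals = rng.normal(0, 1, size=300)
predicted = rng.uniform(0, 10, size=300)

fig, axes = plt.subplots(1, 3, figsize=(12, 3.5))
axes[0].scatter(predicted, residuals, s=8)    # look for curvature or "thickening"
axes[0].set(xlabel="Predicted", ylabel="Residual")
axes[1].hist(residuals, bins=25)              # unimodal and symmetric?
axes[1].set(xlabel="Residual")
stats.probplot(residuals, plot=axes[2])       # Normal probability plot
plt.tight_layout()
plt.show()
```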



20.4 Testing the Multiple Regression Model
(1 of 5)

• There are several hypothesis tests in multiple regression.
• Each is concerned with whether the underlying parameters (slopes and intercept) are actually zero.

The null hypothesis for the slope coefficients:

H0: β1 = β2 = … = βk = 0

Test the hypothesis with an F-test (a generalization of the t-test to more than one predictor).



20.4 Testing the Multiple Regression Model
(2 of 5)

• The F-distribution has two degrees of freedom:
  – k, where k is the number of predictors
  – n − k − 1, where n is the number of observations
• The F-test is one-sided: bigger F-values mean smaller P-values.
• If the null hypothesis is true, then F will be near 1.
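As a minimal sketch (an assumed synthetic setup, not the chapter's data), the F-ratio and its one-sided P-value can be computed directly from the sums of squares:

```python
import numpy as np
from scipy import stats

# Synthetic design with k = 2 predictors plus an intercept column.
rng = np.random.default_rng(3)
n, k = 120, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = X @ np.array([1.0, 0.8, -0.5]) + rng.normal(size=n)

b, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ b
SSE = np.sum((y - y_hat) ** 2)            # unexplained variability
SSR = np.sum((y_hat - y.mean()) ** 2)     # explained variability
F = (SSR / k) / (SSE / (n - k - 1))       # F = MSR / MSE
p_value = stats.f.sf(F, k, n - k - 1)     # one-sided upper-tail P-value
print(f"F = {F:.2f}, P-value = {p_value:.3g}")
```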



20.4 Testing the Multiple Regression Model
(3 of 5)

If a multiple regression F-test leads to a rejection of the null hypothesis, then check the t-test statistic for each coefficient:

t(n−k−1) = (bj − 0) / SE(bj)

Note that the degrees of freedom for the t-test is n − k − 1.

Confidence interval: bj ± t*(n−k−1) × SE(bj)


20.4 Testing the Multiple Regression Model
(4 of 5)

“Tricky” Parts of the t-tests:


• SE’s are harder to compute (let technology do it!)
• The meaning of a coefficient depends on the other predictors in
the model (as we saw in the Home Price example)
– If we fail to reject H0: βj = 0 based on its t-test, it does not mean that xj has no linear relationship to y
– Rather, it means that xj contributes nothing to modeling y after allowing for the other predictors



20.4 Testing the Multiple Regression Model
(5 of 5)

In Multiple Regression, it looks like each coefficient bj tells us the effect of its associated predictor, xj.

BUT
• The coefficient βj can be different from zero even when there is no correlation between y and xj.
• It is even possible that the multiple regression slope changes sign when a new variable enters the regression.



20.5 The F-Statistic and ANOVA (1 of 3)
Analysis of Variance (ANOVA) table is used to present various
measures of variability in a regression analysis.
Summary of Multiple Regression Variation Measures:

Parameter                          Significance
SSE (Sum of Squared Residuals)     Larger SSE = “noisier” data and less precise prediction
SSR (Regression Sum of Squares)    Larger SSR = stronger model correlation
SST (Total Sum of Squares)         Larger SST = larger variability in y, due to “noisier” data (SSE) and/or stronger model correlation (SSR)



20.5 The F-Statistic and ANOVA (2 of 3)
R² in Multiple Regression:

R² = SSR / SST = 1 − SSE / SST

R² = fraction of the total variation in y accounted for by the model (all the predictor variables included)

F and R²:
By using the expressions for SSE, SSR, SST, and R², it can be shown that:

F = [R² / (1 − R²)] × [(n − k − 1) / k]

So, the F-test of the null hypothesis that all slopes are zero is equivalent to testing whether R² = 0.

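A quick numeric check of this identity on synthetic data (a sketch, assuming an arbitrary made-up design and coefficients):

```python
import numpy as np

# Arbitrary synthetic design with k = 3 predictors plus an intercept.
rng = np.random.default_rng(5)
n, k = 60, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = X @ rng.normal(size=k + 1) + rng.normal(size=n)

b, *_ = np.linalg.lstsq(X, y, rcond=None)
SSE = np.sum((y - X @ b) ** 2)
SST = np.sum((y - y.mean()) ** 2)
R2 = 1 - SSE / SST

F_anova = ((SST - SSE) / k) / (SSE / (n - k - 1))   # MSR / MSE, since SSR = SST - SSE
F_identity = (R2 / (1 - R2)) * ((n - k - 1) / k)    # the formula above
print(np.isclose(F_anova, F_identity))              # True
```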

20.5 The F-Statistic and ANOVA (3 of 3)

Table 20.3 Typical ANOVA table in multiple regression analysis. The table indicates the formulas used by software to produce numerical results.

Source                               Degrees of Freedom (df)   Sum of Squares   Mean Square               F-Ratio         P-Value
Regression (explained variability)   k                         SSR              MSR = SSR / k             F = MSR / MSE   P
Errors (unexplained variability)     n − k − 1                 SSE              MSE = SSE / (n − k − 1)
Total                                n − 1                     SSTotal



20.6 R² and Adjusted R²
• Adding new predictor variables to a model never decreases R² and may increase it.
• But each added variable increases the model complexity, which may not be desirable.
• Adjusted R² imposes a “penalty” on the correlation strength of larger models, depreciating their R² values to account for an undesired increase in complexity:

R²adj = 1 − [SSE / (n − k − 1)] / [SST / (n − 1)] = 1 − (1 − R²)(n − 1) / (n − k − 1)

Adjusted R² permits a more equitable comparison between models of different sizes.
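As a small sketch on made-up data, adding a pure-noise predictor never lowers R², but it typically lowers adjusted R²:

```python
import numpy as np

# One real predictor and one pure-noise predictor (made-up data).
rng = np.random.default_rng(6)
n = 50
x1 = rng.normal(size=n)
noise = rng.normal(size=n)            # a predictor unrelated to y
y = 3 * x1 + rng.normal(size=n)

def adj_r2(X, y):
    n, p = X.shape                    # p = k + 1 columns, including the intercept
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    SSE = np.sum((y - X @ b) ** 2)
    SST = np.sum((y - y.mean()) ** 2)
    return 1 - (SSE / (n - p)) / (SST / (n - 1))   # n - p = n - k - 1

X1 = np.column_stack([np.ones(n), x1])
X2 = np.column_stack([np.ones(n), x1, noise])
print(adj_r2(X1, y), adj_r2(X2, y))   # adding noise usually lowers adjusted R²
```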



What Can Go Wrong? (1 of 2)
• It is sometimes a mistake to claim to “hold everything else
constant” for a single individual. (For the predictors Age
and Years of Education, it is impossible for an individual to
get a year of education at constant age.)
• Don’t interpret regression causally. Statistics assesses
correlation, not causality.
• Be cautious about interpreting a regression as predictive.
That is, be alert for combinations of predictor values that
take you outside the ranges of these predictors.



What Can Go Wrong? (2 of 2)
• Don’t think that the sign of a coefficient is special. The sign
of a predictor coefficient may depend on which predictors
are included in the model.
• If a coefficient’s t-statistic is not significant, don’t interpret it
at all.



What Else Can Go Wrong?
• Don’t fit a linear regression to data that aren’t straight.
Usually, we are satisfied when plots of y against the x’s are
straight enough.
• Watch out for plot thickening. If plots of residuals vs.
predictors all show thickening, then consider re-expressing
y. If the thickening is observed for just one predictor,
consider re-expressing that predictor.
• Make sure the errors are nearly normal.
• Watch out for high-influence points and outliers.



What Have We Learned? (1 of 2)
• The assumptions and conditions for multiple regression
are the same as those for simple regression.
• R² is still the fraction of the variation accounted for by the regression model.
• se is still the standard deviation of the residuals.
• The degrees of freedom (in the denominator of se and for
each t-test) is n minus the number of parameters
estimated.



What Have We Learned? (2 of 2)
• The regression table produced by any statistics package shows
a row for each coefficient, giving its estimate, a standard error, a
t-statistic, and a P-value.
• If all the conditions are met, we can test each coefficient against
the null hypothesis that its parameter value is zero with a
Student’s t-test.
• We can perform an overall test of whether the multiple
regression model provides a better summary for y than its mean
by using the F-distribution.
• We learned that R² may not be appropriate for comparing multiple regression models with different numbers of predictors.

