Chapter 8 Residual Analysis (Auto-Saved)
Chapter 8
Residual Analysis
HU, Qinlu
Email: qinlu.hu@link.cuhk.edu.hk
Date: 2024.03.18
DSME 2021
CV
Name: Qinlu Hu
PhD Candidate, Department of Decisions, Operations and Technology, CUHK
Business School, qinlu.hu@link.cuhk.edu.hk
Research interests:
two-sided online platforms
CONTENTS
01 Introduction
02 Regression Residuals
03 Detecting Lack of Fit
04 Detecting Unequal Variance
05
06 Outliers and Identifying Influential Observations
07
8.1 Introduction
The random error term $\varepsilon$ is assumed to satisfy the following:
• it is normally distributed;
• it has a mean of 0;
• its variance is constant;
• all pairs of error terms are uncorrelated.
Under these assumptions, least squares regression analysis produces reliable statistical tests and
confidence intervals. Violations of these assumptions may make the OLS estimators inefficient
and lead to incorrect inferences.
The objective of this chapter is to provide you with both graphical tools and statistical tests
that will aid in identifying significant departures from the assumptions.
8.2 Regression Residuals
• The regression residual is the observed value of the dependent variable minus the
predicted value:
$\hat{\varepsilon} = y - \hat{y} = y - (\hat{\beta}_0 + \hat{\beta}_1 x_1 + \cdots + \hat{\beta}_k x_k)$
(1) The mean of the residuals is equal to 0. This property follows from the fact
that the sum of the differences between the observed y-values and their
least squares predicted $\hat{y}$-values is equal to 0:
$\sum_{i=1}^{n} \hat{\varepsilon}_i = \sum_{i=1}^{n} (y_i - \hat{y}_i) = 0$
(2) The standard deviation of the residuals is equal to the standard deviation of
the fitted regression model, s:
$\sum_{i=1}^{n} \hat{\varepsilon}_i^2 = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \mathrm{SSE}, \qquad s = \sqrt{\mathrm{SSE} / [n - (k + 1)]}$
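The two residual properties can be checked numerically. The sketch below is only an illustration on simulated data: the sample size, coefficients, and noise level are made-up assumptions, and the model is fitted with NumPy's least squares solver rather than with whatever the course notebook uses.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 50, 2                                  # n observations, k predictors (assumed)
X = rng.normal(size=(n, k))
y = 1.0 + 2.0 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.3, size=n)

# Design matrix with an intercept column, then the least squares estimates
A = np.column_stack([np.ones(n), X])
beta_hat, *_ = np.linalg.lstsq(A, y, rcond=None)

residuals = y - A @ beta_hat                  # observed minus predicted
sse = np.sum(residuals ** 2)
s = np.sqrt(sse / (n - (k + 1)))              # estimated model standard deviation

print("sum of residuals:", residuals.sum())   # numerically zero
print("s:", s)
```

Property (1) depends on the model containing an intercept; without the column of ones the residuals need not sum to 0.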
• Examples:
• Google Colab: https://colab.research.google.com
• Python tutorial: https://colab.research.google.com/drive/1LBD-pZPYm_GopWQDv3THOp5waicPq_-S?usp=sharing
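8.3 Detecting Lack of Fit
• If the model is misspecified, the mean it hypothesizes, $E_m(y)$, differs from the true
mean $E(y)$, so the error term of the misspecified model, $\varepsilon_m$, does not have a mean of 0:
$E_m(y) \neq E(y)$
$E(\varepsilon_m) \neq 0$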
• Detecting Model Lack of Fit with Residuals:
Plot the residuals against each independent variable and against the predicted values $\hat{y}$.
In each plot, look for a trend, dramatic changes in variability, and/or more than
5% of the residuals lying more than 2s from 0. Any of these patterns indicates a
problem with model fit.
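The "more than 5% outside 2s" check and the search for a trend can also be done numerically alongside the plots. The following is a minimal sketch on simulated data; the helper name `fraction_outside_2s` and the curved data set are hypothetical, chosen only to show what a misspecified straight-line fit looks like.

```python
import numpy as np

def fraction_outside_2s(residuals, s):
    """Share of residuals lying more than 2s away from 0."""
    residuals = np.asarray(residuals, dtype=float)
    return float(np.mean(np.abs(residuals) > 2 * s))

# Hypothetical data with curvature, fitted with a straight line only
rng = np.random.default_rng(1)
x = np.linspace(0, 10, 200)
y = 1.0 + 0.5 * x ** 2 + rng.normal(scale=1.0, size=x.size)

A = np.column_stack([np.ones_like(x), x])     # misspecified: no x^2 term
beta_hat, *_ = np.linalg.lstsq(A, y, rcond=None)
resid = y - A @ beta_hat
s = np.sqrt(np.sum(resid ** 2) / (x.size - 2))

# A trend shows up as correlation between the residuals and the omitted term
centered_sq = (x - x.mean()) ** 2
trend = np.corrcoef(resid, centered_sq)[0, 1]

print("fraction outside 2s:", fraction_outside_2s(resid, s))
print("correlation of residuals with (x - mean)^2:", trend)
```

A strong correlation with the omitted squared term is the numerical counterpart of the U-shaped pattern one would see in the residual plot.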
• Partial residual plot (for a model with more than one independent variable):
Vertical axis: the partial residuals for $x_j$; horizontal axis: $x_j$.
A partial residual plot shows the trend between y and a single independent variable,
for example $x_1$, after the other independent variables are accounted for.
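As a rough sketch of how the vertical axis of such a plot can be computed, the code below uses the common definition of a partial residual, the ordinary residual plus $\hat{\beta}_j x_j$. That definition, the helper name `partial_residuals`, and the simulated data are assumptions for illustration; the slides do not spell out the formula.

```python
import numpy as np

def partial_residuals(X, y, j):
    """Partial residuals for column j of X: residual + beta_hat_j * x_j."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    A = np.column_stack([np.ones(len(y)), X])          # add intercept column
    beta_hat, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta_hat
    return resid + beta_hat[j + 1] * X[:, j]           # +1 skips the intercept

rng = np.random.default_rng(2)
X = rng.uniform(0, 5, size=(100, 2))
y = 3.0 + 1.5 * X[:, 0] + 0.8 * X[:, 1] + rng.normal(scale=0.2, size=100)

pr = partial_residuals(X, y, j=0)
# Plot pr (vertical axis) against X[:, 0] (horizontal axis). For a correctly
# specified model the points scatter around a straight line.
slope = np.polyfit(X[:, 0], pr, 1)[0]
print("slope of partial residuals on x1:", slope)
```

Curvature in the plotted points, rather than a straight band, would suggest that $x_1$ should enter the model through a different functional form.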
• Examples:
• Google Colab: https://colab.research.google.com
• Python tutorial: https://colab.research.google.com/drive/1LBD-pZPYm_GopWQDv3THOp5waicPq_-S?usp=sharing
8.4 Detecting Unequal Variances
• When data fail to be homoscedastic, the reason is often that the variance of
the response y is a function of its mean E(y).
• Examples:
1. If the response y is a count that has a Poisson distribution, the variance
will be equal to the mean E(y).
2. If the response y is a proportion generated by a binomial experiment with $n_i$ trials,
the variance will be equal to:
$\mathrm{Var}(y_i) = \frac{p_i (1 - p_i)}{n_i} = \frac{E(y_i)\,[1 - E(y_i)]}{n_i}$
3. If the response y follows a multiplicative model, the variance will be equal to:
$\mathrm{Var}(y) = [E(y)]^2 \sigma^2$
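A small simulation can make the link between variance and mean concrete for the Poisson and binomial cases. This is only a sketch: the means, the number of trials, and the number of draws are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
draws = 200_000

# Poisson counts: the sample variance tracks the mean
for lam in (2.0, 10.0, 50.0):
    y = rng.poisson(lam, size=draws)
    print(f"Poisson  mean={y.mean():.2f}  var={y.var():.2f}")

# Binomial proportions (successes / n_i): variance is p(1 - p) / n_i
n_i, p = 40, 0.3
prop = rng.binomial(n_i, p, size=draws) / n_i
print(f"Binomial proportion  var={prop.var():.5f}  "
      f"p(1-p)/n_i={p * (1 - p) / n_i:.5f}")
```

Because the variance changes whenever E(y) changes with the predictors, a regression on such responses cannot have constant error variance across observations.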
Poisson Distribution Formula
$P(X = x \mid \lambda) = \frac{e^{-\lambda} \lambda^{x}}{x!}$
where:
x = number of events in an area of opportunity
$\lambda$ = expected number of events
e = base of the natural logarithm (2.71828...)
• Mean: $\mu = \lambda$
• Variance and standard deviation: $\sigma^2 = \lambda$, $\sigma = \sqrt{\lambda}$
where $\lambda$ = expected number of events
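The Poisson formula and its mean and variance can be verified directly. The sketch below uses only the standard library; the value $\lambda = 3$ and the truncation of the support at 60 terms are arbitrary choices for illustration.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x | lambda) = exp(-lambda) * lambda**x / x!"""
    return exp(-lam) * lam ** x / factorial(x)

lam = 3.0
probs = [poisson_pmf(x, lam) for x in range(60)]   # truncated support
mean = sum(x * p for x, p in enumerate(probs))
var = sum((x - mean) ** 2 * p for x, p in enumerate(probs))

print("P(X = 2):", poisson_pmf(2, lam))
print("mean:", mean, "variance:", var)             # both close to lambda
```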
Binomial Distribution Formula
$P(X = x \mid n, p) = \frac{n!}{x!\,(n - x)!}\, p^{x} (1 - p)^{n - x}$
where:
n = number of experiments (trials)
x = number of successful experiments (x = 0, 1, 2, …, n)
p = probability of success in a single experiment
• Mean: $\mu = np$
• Variance and standard deviation: $\sigma^2 = np(1 - p)$, $\sigma = \sqrt{np(1 - p)}$
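The binomial mean and variance can be checked against the probability function in the same way. This is a minimal sketch using the standard binomial probability formula with illustrative values n = 10 and p = 0.3.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x | n, p) = C(n, x) * p**x * (1 - p)**(n - x)"""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 10, 0.3
probs = [binomial_pmf(x, n, p) for x in range(n + 1)]
mean = sum(x * q for x, q in enumerate(probs))
var = sum((x - mean) ** 2 * q for x, q in enumerate(probs))

print("mean:", mean, "expected n*p:", n * p)
print("variance:", var, "expected n*p*(1-p):", n * p * (1 - p))
```

Dividing the count by n turns it into a proportion, whose variance is $p(1 - p)/n$; that is the form that matters when the regression response is a proportion.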
Multiplicative Model Formula
• So far, the random error component has been assumed to be additive in all the models:
$y = E(y) + \varepsilon$
• Another useful type of model is the multiplicative model. In this model, the response
is written as the product of its mean and the random error component:
$y = [E(y)]\,\varepsilon$
• The variance of this response grows in proportion to the square of the mean:
$\mathrm{Var}(y) = [E(y)]^2 \sigma^2$
where $\sigma^2$ is the variance of $\varepsilon$.
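The variance formula for the multiplicative model can be illustrated by simulation. In the sketch below the error is assumed to be normal with mean 1 (so that the mean of y equals E(y)) and standard deviation 0.1; both choices are assumptions made for the example, not something the slides specify.

```python
import numpy as np

rng = np.random.default_rng(4)
sigma = 0.1            # assumed standard deviation of the multiplicative error
draws = 200_000

for mean_y in (5.0, 50.0, 500.0):
    eps = rng.normal(loc=1.0, scale=sigma, size=draws)   # E(eps) = 1
    y = mean_y * eps                                      # y = [E(y)] * eps
    ratio = y.var() / (mean_y ** 2 * sigma ** 2)          # close to 1 if the formula holds
    print(f"E(y)={mean_y:6.1f}  Var(y)={y.var():10.3f}  ratio={ratio:.3f}")
```

The variance rises a hundredfold each time the mean rises tenfold, which is the pattern of a funnel-shaped residual plot.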
8.4 Detecting Unequal Variances
• Solution:
• Variance-stabilizing transformations
• When the variance of y is a function of its mean, we can often satisfy the
least squares assumption of homoscedasticity by transforming the
response to some new response that has a constant variance.
• For example, if the response y is a count that follows a Poisson
distribution, the square root transform can be shown to have
approximately constant variance. Consequently, if the response is a
Poisson random variable, we would let
$y^* = \sqrt{y}$
and fit the model
$y^* = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + \varepsilon$
• This model will approximately satisfy the least squares assumption of
homoscedasticity.
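The effect of the square root transform on Poisson counts can be seen in a short simulation. The means used below are arbitrary assumptions; the point is only that the variance of the raw counts grows with the mean while the variance of $\sqrt{y}$ stays roughly flat.

```python
import numpy as np

rng = np.random.default_rng(5)
raw_vars, sqrt_vars = [], []

for lam in (5.0, 20.0, 80.0):
    y = rng.poisson(lam, size=200_000)
    raw_vars.append(y.var())                 # grows with the mean
    sqrt_vars.append(np.sqrt(y).var())       # roughly constant
    print(f"mean={lam:5.1f}  Var(y)={raw_vars[-1]:7.2f}  "
          f"Var(sqrt(y))={sqrt_vars[-1]:.3f}")
```

In practice the transformed response $y^*$ would then be regressed on the predictors, and the residual plots re-examined to confirm that the spread no longer changes with the fitted values.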
• Examples:
• Google Colab: https://colab.research.google.com
• Python tutorial: https://colab.research.google.com/drive/1LBD-pZPYm_GopWQDv3THOp5waicPq_-S?usp=sharing