Chapter 2 PDF Lecture Notes
• Fitting such a line to a set of data involves estimating the slope and
intercept to produce a line that is denoted by $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$
Fitting the Model: The Least-Squares Approach
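• Sum of squared errors: $SSE = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$
• The least-squares estimates are the values of the slope and intercept that minimize the SSE:
• $\hat{\beta}_1 = \dfrac{SS_{xy}}{SS_{xx}}$
• $\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$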
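The estimates above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original notes: the function names `fit_least_squares` and `sse` are hypothetical, and the slope is computed as $SS_{xy}/SS_{xx}$, the same formula the worked example in these notes uses.

```python
def fit_least_squares(x, y):
    """Return (b0, b1), the intercept and slope that minimize the SSE."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    # SS_xy and SS_xx in their computational (shortcut) forms
    ss_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y) / n
    ss_xx = sum(xi ** 2 for xi in x) - sum(x) ** 2 / n
    if ss_xx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    b1 = ss_xy / ss_xx                 # slope estimate
    b0 = sum(y) / n - b1 * sum(x) / n  # intercept: y_bar - b1 * x_bar
    return b0, b1


def sse(x, y, b0, b1):
    """Sum of squared errors of the line y_hat = b0 + b1 * x."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))


# Points lying exactly on y = 1 + 2x should be recovered with zero error.
b0, b1 = fit_least_squares([0, 1, 2, 3], [1, 3, 5, 7])
print(b0, b1, sse([0, 1, 2, 3], [1, 3, 5, 7], b0, b1))
```

Any other choice of slope and intercept gives a larger SSE on the same data, which is what "least squares" refers to.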
Example: Power Load and Temperature
Day   Maximum Temperature x (°F)   Peak Power Load y (MW)
1     95                           214
2     82                           152
3     90                           156
4     81                           129
5     99                           254
6     100                          266
7     93                           210
8     95                           204
9     93                           213
10    87                           150
Example: Power Load and Temperature (continued)
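• $\sum_{i=1}^{n} x_i = 915$, $\sum_{i=1}^{n} y_i = 1948$, $\sum_{i=1}^{n} x_i y_i = 180798$
• $\sum_{i=1}^{n} x_i^2 = 84103$, $\sum_{i=1}^{n} y_i^2 = 398734$
• $SS_{xy} = \sum_{i=1}^{n} x_i y_i - \dfrac{\left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)}{n} = 2556$
• $SS_{xx} = \sum_{i=1}^{n} x_i^2 - \dfrac{\left(\sum_{i=1}^{n} x_i\right)^2}{n} = 380.5$
• $SS_{yy} = \sum_{i=1}^{n} y_i^2 - \dfrac{\left(\sum_{i=1}^{n} y_i\right)^2}{n} = 19263.6$
• $r = \dfrac{SS_{xy}}{\sqrt{SS_{xx}\, SS_{yy}}} = 0.944$, hence a strong positive linear relationship
• $\hat{\beta}_1 = \dfrac{SS_{xy}}{SS_{xx}} = 6.7175$
• $\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} = -419.85$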
• The least-squares regression line is given by:
• $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x = -419.85 + 6.7175x$
• where $\hat{y}$ is the predicted peak power load and $x$ is the maximum
daily temperature.
• For every one-degree (°F) increase in the maximum temperature, the
peak power load increases on average by 6.7175 megawatts.
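The arithmetic in this example can be checked with a short script. The sketch below assumes only the ten (temperature, load) pairs from the table; the variable names are illustrative and do not come from the original notes.

```python
from math import sqrt

# Data from the table: maximum temperature (°F) and peak power load (MW)
temps = [95, 82, 90, 81, 99, 100, 93, 95, 93, 87]
loads = [214, 152, 156, 129, 254, 266, 210, 204, 213, 150]
n = len(temps)

# Raw sums
sum_x = sum(temps)
sum_y = sum(loads)
sum_xy = sum(x * y for x, y in zip(temps, loads))
sum_x2 = sum(x * x for x in temps)
sum_y2 = sum(y * y for y in loads)

# Corrected sums of squares and cross-products
ss_xy = sum_xy - sum_x * sum_y / n
ss_xx = sum_x2 - sum_x ** 2 / n
ss_yy = sum_y2 - sum_y ** 2 / n

# Correlation coefficient and least-squares estimates
r = ss_xy / sqrt(ss_xx * ss_yy)
b1 = ss_xy / ss_xx
b0 = sum_y / n - b1 * sum_x / n

print(f"SSxy={ss_xy:.1f}  SSxx={ss_xx:.1f}  SSyy={ss_yy:.1f}")
print(f"r={r:.3f}  b1={b1:.4f}  b0={b0:.2f}")
```

Rounded to the precision used in the notes, the script reproduces the values reported above.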
Using the model for prediction
• Consider the peak power load example. Predict the peak power load
required if tomorrow’s maximum temperature is expected to be
98 °F.
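One way to carry out this prediction is to substitute $x = 98$ into the fitted line. The sketch below assumes the rounded estimates reported earlier ($-419.85$ and $6.7175$); the constant and function names are hypothetical.

```python
# Rounded least-squares estimates from the worked example
B0_HAT = -419.85   # intercept (MW)
B1_HAT = 6.7175    # slope (MW per °F)


def predict_peak_load(max_temp_f):
    """Predicted peak power load (MW) for a given maximum temperature (°F)."""
    return B0_HAT + B1_HAT * max_temp_f


predicted = predict_peak_load(98)
print(f"Predicted peak load at 98 °F: {predicted:.1f} MW")  # about 238.5 MW
```

Because 98 °F lies inside the observed temperature range (81 to 100 °F), this is an interpolation rather than an extrapolation beyond the data.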
• $r^2 = 0.89$
• The sample variability of the peak loads about their mean is reduced
by 89% when the mean peak load is modeled as a linear function of
daily high temperature.
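The "reduced by 89%" reading can be illustrated by comparing the variability about the mean ($SS_{yy}$) with the variability left about the fitted line ($SSE$). The sketch below assumes the identity $r^2 = 1 - SSE / SS_{yy}$, which holds for a simple linear least-squares fit; the variable names are illustrative.

```python
# Data from the power load example
temps = [95, 82, 90, 81, 99, 100, 93, 95, 93, 87]
loads = [214, 152, 156, 129, 254, 266, 210, 204, 213, 150]
n = len(temps)

x_bar = sum(temps) / n
y_bar = sum(loads) / n

# Sums of squares in deviation form
ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(temps, loads))
ss_xx = sum((x - x_bar) ** 2 for x in temps)
ss_yy = sum((y - y_bar) ** 2 for y in loads)  # variability about the mean

# Least-squares line
b1 = ss_xy / ss_xx
b0 = y_bar - b1 * x_bar

# Variability remaining about the fitted line
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(temps, loads))

r_squared = 1 - sse / ss_yy
print(f"SSyy={ss_yy:.1f}  SSE={sse:.1f}  r^2={r_squared:.2f}")
```

The ratio $SSE / SS_{yy}$ is about 0.11, so roughly 11% of the original variability remains after fitting the line.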
• In a study of pollution in a water stream, the concentration of
pollution is measured at 5 different locations. The locations are at
different distances to the pollution source. In the table below, these
distances and the average pollution are given:
• B) $r = -0.931$
• $r^2 = 0.868$ (an estimate of the proportion of the variation in
concentration that can be explained by distance)