
Lecture 29

Chapter 17
Constrained Optimization
The Least-Squares
Regression
Sections 17-1 & 17-2

Examples 17.1 and 17.2


Problems 17.3, 17.4, 17.6

Copyright © 2006 The McGraw-Hill Companies, Inc. Permission required for reproduction or display.
Where substantial error is associated with data, polynomial
interpolation is inappropriate and may yield unsatisfactory results
when used to predict intermediate values. Experimental data are often
of this type. For example, Fig. 17.1a shows seven experimentally
derived data points exhibiting significant variability. Visual inspection
of these data suggests a positive relationship between y and x. That is,
the overall trend indicates that higher values of y are associated with
higher values of x.
Now, if a sixth-order interpolating polynomial is fitted to these data
(Fig. 17.1b), it will pass exactly through all of the points. However, because
of the variability in these data, the curve oscillates widely in the
interval between the points. In particular, the interpolated values at
x=1.5 and x=6.5 appear to be well beyond the range suggested by
these data.
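The oscillation can be reproduced numerically. The sketch below evaluates the sixth-order interpolating polynomial, written in Lagrange form, at x = 1.5 and x = 6.5. The seven (x, y) pairs are an assumption: they are taken to be the points of Table 17.1, which did not survive in these notes, so treat them as illustrative (they do reproduce the coefficients reported in Example 17.1). The function name `lagrange_eval` is a hypothetical helper.

```python
def lagrange_eval(xs, ys, x):
    """Evaluate the polynomial through all (xs[i], ys[i]) at x (Lagrange form)."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        basis = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                basis *= (x - xj) / (xi - xj)
        total += yi * basis
    return total

# Assumed data (not listed in the slides): seven noisy points.
xs = [1, 2, 3, 4, 5, 6, 7]
ys = [0.5, 2.5, 2.0, 4.0, 3.5, 6.0, 5.5]

# Values between the data points, where the oscillation shows up.
for x in (1.5, 6.5):
    print(f"p({x}) = {lagrange_eval(xs, ys, x):.4f}")
```

With these assumed values the polynomial passes through every data point, yet its value at x = 6.5 lies above the largest measured y.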
A better strategy for such cases is to derive an approximating function
that fits the shape or general trend of the data without necessarily
matching the individual points.
Fig. 17.1c illustrates how a straight line can be used to generally
characterize the trend of these data without passing through any
particular point. This line can be drawn by visual inspection.
Fig. 17.1 (a) Data exhibiting significant error. (b) Polynomial fit
oscillating beyond the range of the data. (c) More satisfactory result
using the least-squares fit.
The simplest approach is to inspect the plotted data visually and then
sketch a “best” line through the points. Although such “eyeball”
approaches have common-sense appeal and are valid for quick-estimate
calculations, they are deficient because they are arbitrary: different
analysts would draw different lines. A less subjective approach is to
derive a curve that minimizes the discrepancy between the data points
and the curve. A technique for accomplishing this objective is called
“least-squares regression”.

17.1 Linear Regression
The simplest example of a least-squares approximation is fitting a
straight line to a set of paired observations:
$(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$
The mathematical expression for the straight line is:
$y = a_0 + a_1 x + e$
where $a_0$ and $a_1$ are coefficients representing the intercept and the
slope, respectively, and $e$ is the error, or residual, between the
model and the observations, which can be represented by:
$e = y - a_0 - a_1 x$
The least-squares criterion for the "best" line is to minimize the sum
of the squares of the residuals between the measured y and the y
calculated with the linear model:
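$$S_r = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} \left(y_{i,\text{measured}} - y_{i,\text{model}}\right)^2 = \sum_{i=1}^{n} \left(y_i - a_0 - a_1 x_i\right)^2 \qquad 17.3\ (17.8)$$
The quantity
$$S_{y/x} = \sqrt{\frac{S_r}{n-2}} \qquad (17.9)$$
is called the standard error of the estimate.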
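For reference, the coefficients that minimize $S_r$ follow from setting its partial derivatives with respect to $a_0$ and $a_1$ to zero. The derivation below is the standard one, sketched here because the slides go directly to the result. The term "normal equations" is standard terminology rather than something taken from these notes.

```latex
\begin{align*}
\frac{\partial S_r}{\partial a_0} &= -2\sum_{i=1}^{n}\left(y_i - a_0 - a_1 x_i\right) = 0,\\
\frac{\partial S_r}{\partial a_1} &= -2\sum_{i=1}^{n}\left(y_i - a_0 - a_1 x_i\right)x_i = 0.
\end{align*}
Rearranging gives the normal equations
\begin{align*}
n\,a_0 + \Big(\sum x_i\Big)\,a_1 &= \sum y_i,\\
\Big(\sum x_i\Big)\,a_0 + \Big(\sum x_i^2\Big)\,a_1 &= \sum x_i y_i,
\end{align*}
whose solution is
\[
a_1 = \frac{n\sum x_i y_i - \sum x_i \sum y_i}{n\sum x_i^2 - \left(\sum x_i\right)^2},
\qquad
a_0 = \bar{y} - a_1\,\bar{x}.
\]
```

These are the expressions labeled (17.6) and (17.7) in the worked problem later in the lecture.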
Problem Statement. Fit a straight line to the x and y
values in the first two columns of Table 17.1.
TABLE 17.1 Computations for an error analysis of the linear fit

Therefore, the least-squares fit line $y = a_0 + a_1 x$ is:

y = 0.07142857 + 0.8392857x

The line, along with the data, is shown in Fig. 17.1c.
[Figure: Example 17.1, data points (yi) and the fit line.]
Example 17.2: Estimation of Errors for the Linear Least-Squares Fit
Problem Statement. Compute the total standard deviation, the standard error of the
estimate, and the correlation coefficient for the data in Example 17.1.
Solution. The summations are performed and presented in Table 17.1. The most common
measure of spread for a sample is the standard deviation ($S_y$) about the mean,
given by [Eq. (PT5.2)]:

$$S_y = \sqrt{\frac{S_t}{n-1}} = \sqrt{\frac{\sum (y_i - \bar{y})^2}{n-1}} = \sqrt{\frac{22.7143}{7-1}} = 1.94569$$

The standard error of the estimate follows from the sum of the squares of the
residuals about the regression line:

$$S_r = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} \left(y_i - a_0 - a_1 x_i\right)^2 = 2.9911$$

$$S_{y/x} = \sqrt{\frac{S_r}{n-2}} = \sqrt{\frac{2.9911}{7-2}} = 0.7735$$

Thus, because $S_{y/x} < S_y$, the linear regression model has merit. The extent of
the improvement can be quantified by:

$$r^2 = \frac{S_t - S_r}{S_t} \qquad (17.10)$$

Thus, from Eq. (17.10):

$$r^2 = \frac{22.7143 - 2.9911}{22.7143} = 0.868, \qquad r = 0.932$$

These results indicate that 86.8% of the original uncertainty has been explained by
the linear model.
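The arithmetic in Example 17.2 can be checked with a few lines of Python. This is a minimal sketch that takes S_t = 22.7143, S_r = 2.9911 and n = 7 from the example as given. The function name `error_measures` is a hypothetical helper, not something from the text.

```python
import math

def error_measures(s_t, s_r, n):
    """Return (S_y, S_y/x, r^2, r) for a straight-line fit to n points.

    Assumes a positive slope, so r is taken as the positive root of r^2.
    """
    s_y = math.sqrt(s_t / (n - 1))    # standard deviation about the mean
    s_yx = math.sqrt(s_r / (n - 2))   # standard error of the estimate, Eq. (17.9)
    r2 = (s_t - s_r) / s_t            # coefficient of determination, Eq. (17.10)
    return s_y, s_yx, r2, math.sqrt(r2)

# Sums quoted in Example 17.2.
s_y, s_yx, r2, r = error_measures(s_t=22.7143, s_r=2.9911, n=7)
print(f"S_y = {s_y:.5f}, S_y/x = {s_yx:.4f}, r^2 = {r2:.3f}, r = {r:.3f}")
```

The divisor n − 2 in the standard error reflects the two coefficients, $a_0$ and $a_1$, that were estimated from the data.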

Problem 17.6: Use least-squares regression to fit a straight line to
x 1 2 3 4 5 6 7 8 9
y 1 1.5 2 3 4 5 8 10 13
Along with the slope and intercept, compute the standard error of the estimate and
the correlation coefficient.
Solution: The equation for the least-squares fit is
$y = a_0 + a_1 x$
To find the least-squares fit, first compute the following quantities (here n = 9):

xi     yi      xi·yi    xi²    Fit line
1      1       1        1      -0.5556
2      1.5     3        4       0.9028
3      2       6        9       2.3611
4      3       12       16      3.8194
5      4       20       25      5.2778
6      5       30       36      6.7361
7      8       56       49      8.1944
8      10      80       64      9.6528
9      13      117      81     11.1111
Σ 45   47.5    325      285

$$a_1 = \frac{n\sum x_i y_i - \sum x_i \sum y_i}{n\sum x_i^2 - \left(\sum x_i\right)^2} \qquad (17.6)$$

$$a_0 = \bar{y} - a_1 \bar{x} \qquad (17.7)$$

With $\bar{x} = 5$ and $\bar{y} = 5.2778$:

$$a_1 = \frac{9(325) - 45(47.5)}{9(285) - (45)^2} = \frac{787.5}{540} = 1.4583$$

$$a_0 = 5.2778 - (1.45833)(5) = -2.01385$$

Therefore, the least-squares fit line $y = a_0 + a_1 x$ is:

y = -2.01385 + 1.4583x

The line, along with the data, is plotted on the next slide (Excel chart).
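The calculation for Problem 17.6 can be sketched in Python as follows. The code applies Eqs. (17.6) and (17.7) to the tabulated data and then evaluates the standard error of the estimate and the correlation coefficient that the problem asks for, using Eqs. (17.9) and (17.10). The function name `linear_regression` and its return layout are illustrative choices, not part of the lecture.

```python
import math

def linear_regression(x, y):
    """Least-squares straight line y = a0 + a1*x, with error measures."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)

    a1 = (n * sxy - sx * sy) / (n * sxx - sx ** 2)   # slope, Eq. (17.6)
    a0 = sy / n - a1 * sx / n                        # intercept, Eq. (17.7)

    y_mean = sy / n
    # Spread of the data about the mean and about the fitted line.
    s_t = sum((yi - y_mean) ** 2 for yi in y)
    s_r = sum((yi - a0 - a1 * xi) ** 2 for xi, yi in zip(x, y))

    s_yx = math.sqrt(s_r / (n - 2))                  # Eq. (17.9)
    r2 = (s_t - s_r) / s_t                           # Eq. (17.10)
    return a0, a1, s_yx, r2

# Data from Problem 17.6.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [1, 1.5, 2, 3, 4, 5, 8, 10, 13]

a0, a1, s_yx, r2 = linear_regression(x, y)
print(f"a0 = {a0:.4f}, a1 = {a1:.4f}")
print(f"S_y/x = {s_yx:.4f}, r^2 = {r2:.4f}, r = {math.sqrt(r2):.4f}")
```

The slope and intercept it returns agree with the hand calculation to within rounding. Taking r as the positive square root of r² is appropriate here because the slope is positive.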
[Figure: Problem 17.6, data points (yi) and the fit line (Excel chart).]
Hints to solution
