Lecture 11a
Linear Regression
Brian G. Higgins
Department of Chemical Engineering & Materials Science
University of California, Davis
April 2014, Hanoi, Vietnam
Background
A common task in engineering is to determine a formula that describes how a physical quantity, say heat capacity, varies as a function of a second quantity, say temperature.
As a result of an experiment you have made a set of measurements and have M data points of the form
$$\{\{x_1, y_1\}, \{x_2, y_2\}, \{x_3, y_3\}, \ldots, \{x_M, y_M\}\}$$
Here $x_i$ represents a temperature measurement and $y_i$ represents the heat capacity. When we plot the data we might get something like this:
[Plot: heat capacity $C_P$ versus Temperature, showing the measured data points together with a blue curve.]
Based on your understanding of the physics you may have an idea that the data should agree with the
blue line shown in the above plot.
We call the blue line the guess function g(x).
Since the points $\{x_i, y_i\}$ are experimental, they are not likely to lie on the curve given by the guess function. Thus
$$d_k = g(x_k) - y_k$$
where $d_k$ represents the vertical distance between the data point $\{x_k, y_k\}$ and the curve given by the guess function g(x). Here is a graphical representation:
[Plot: $C_P$ versus Temperature, showing a single data point $\{x_k, y_k\}$ and its vertical distance $d_k = g(x_k) - y_k$ from the guess curve.]
Types of Errors
We can define three types of error:

Absolute Error of g:
$$E_1(g) = \sum_{k=1}^{M} |d_k| = |d_1| + |d_2| + \cdots + |d_M| \tag{1}$$

Square Error of g:
$$E_2(g) = \sum_{k=1}^{M} d_k^2 \tag{2}$$

Maximum Error of g:
$$E_3(g) = \max\{|d_1|, |d_2|, |d_3|, \ldots, |d_M|\} \tag{3}$$
Note that if the graph passes through all the points $\{x_k, y_k\}$, then the error $E_i(g)$, no matter how it is defined, is zero. On the other hand, as $E_i(g)$ gets larger (some or all of the data points do not lie on the curve), the guess function g(x) does not fit the data as well. In short, the smaller $E_i(g)$ is, the better the fit.
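To make these definitions concrete, here is a minimal Wolfram Language sketch (our illustration, not part of the original notebook; the helper names residuals, absoluteError, squareError, and maximumError are ours) that evaluates all three errors for a data list of {x, y} pairs and a guess function g:

residuals[g_, data_] := (g[#[[1]]] - #[[2]]) & /@ data  (* the d_k = g(x_k) - y_k *)
absoluteError[g_, data_] := Total[Abs[residuals[g, data]]]  (* E1 *)
squareError[g_, data_] := Total[residuals[g, data]^2]  (* E2 *)
maximumError[g_, data_] := Max[Abs[residuals[g, data]]]  (* E3 *)

For example, squareError[Function[x, 1.7 x], {{1, 1.5}, {2, 3.9}}] evaluates $E_2$ for the line $g(x) = 1.7\,x$ on two data points.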
Squared Error
The square error $E_2(g)$ is the most widely used estimate of how well g(x) fits the data. There are statistical reasons why this is the best way to estimate the goodness of a fit, but we will not discuss them in these lectures.
Thus our goal will be to minimize
$$E_2(g) = \sum_{k=1}^{M} [g(x_k) - y_k]^2 \tag{4}$$
We say then that we seek a g(x) that has the least square error. Another way of saying this is that the g(x) with the minimum $E_2(g)$ is called the least squares g(x).
Example 1
Problem Statement
Suppose we are given the following list of data points $\{x_k, y_k\}$:
{{1, 1.5}, {2, 3.9}, {4, 6.6}, {7, 11.7}, {9, 15.6}}
(i) Determine the square error $E_2(g)$ as a function of k for
$$g(x) = k\,x$$
(ii) Find the parameter value of k that minimizes $E_2(g)$. What is $E_2(g)$ for this value of k?
Solution Step 1: Define the Sum of Squared Error function
Next we define a function for the error $E_2(g)$. Note we use a double-struck 𝔼, as the variable E in Mathematica is the protected exponential constant equal to 2.7182...
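The definition itself did not survive extraction; consistent with the calls 𝔼2[g, M] that appear below, it was presumably along these lines (a reconstruction, not verbatim from the notebook):

𝔼2[g_, M_] := Sum[(g[x[i]] - y[i])^2, {i, 1, M}]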
Solution Step 2: Compute the Sum of Squared Error using the data
Here we have supposed that $x_i$ and $y_i$ are our data points. These are defined from the given data list as follows:

x[i_] := data[[i, 1]]; y[i_] := data[[i, 2]]; M = Length[data];
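For these definitions to work, the data list itself must be entered first; from the problem statement it is presumably

data = {{1, 1.5}, {2, 3.9}, {4, 6.6}, {7, 11.7}, {9, 15.6}};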
The sum of squared errors for $g(x) = k\,x$ is then
$$E_2(g) = \sum_{i=1}^{M} [g(x_i) - y_i]^2 = \sum_{i=1}^{M} (k\,x_i - y_i)^2 = 441.27 - 516\,k + 151\,k^2 \tag{5}$$
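In Mathematica the same expansion can be reproduced (a sketch, assuming the guess function is entered as below):

g[x_] := k x
𝔼2[g, M] // Expand  (* ⇒ 441.27 - 516. k + 151. k^2 *)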
[Plot: the square error $E_2$ as a function of the parameter k.]
It is clear that $E_2$ has a minimum near $k \approx 1.8$. At the minimum
$$\frac{\partial E_2}{\partial k} = 0, \qquad \frac{\partial^2 E_2}{\partial k^2} > 0 \quad \text{at } k = k_{\min}$$
Differentiating $E_2$ gives

D[𝔼2[g, M], k] // Simplify
302. (-1.70861 + k)
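The notes go on to use a replacement rule sol for the minimizing value of k; a minimal sketch of how it is obtained (assuming Solve is applied to the stationarity condition; the name sol is taken from the extracted text):

sol = First[Solve[D[𝔼2[g, M], k] == 0, k]]  (* ⇒ {k -> 1.70861} *)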
[Plot: the data points together with the least squares line $g(x) = k_{\min}\,x$ versus x.]
Finally let us evaluate the least square error

𝔼2[g, M] /. sol
0.448808
Note that the actual value is not a small number, yet the fit looks good. We will address this point later.
Least Squares Fit of a Straight Line
Suppose now the guess function is a straight line $L(x) = m\,x + b$. The square error is
$$E_2(L) = \sum_{k=1}^{M} (m\,x_k + b - y_k)^2$$
At the minimum the partial derivatives with respect to the two parameters vanish:
$$\frac{\partial E_2(L)}{\partial b} = 0, \qquad \frac{\partial E_2(L)}{\partial m} = 0$$
Note that since the $x_k$'s and $y_k$'s are constant, we can evaluate the following sum:
$$\sum_{k=1}^{M} b = M\,b$$
Thus
$$\frac{\partial E_2(L)}{\partial b} = \sum_{k=1}^{M} \frac{\partial}{\partial b}(m\,x_k + b - y_k)^2 = 2\sum_{k=1}^{M} (m\,x_k + b - y_k) = 2\left(M\,b + m\sum_{k=1}^{M} x_k - \sum_{k=1}^{M} y_k\right)$$
Similarly we find
$$\frac{\partial E_2(L)}{\partial m} = 2\sum_{k=1}^{M} (m\,x_k + b - y_k)\,x_k = 2\left(m\sum_{k=1}^{M} x_k^2 + b\sum_{k=1}^{M} x_k - \sum_{k=1}^{M} y_k x_k\right)$$
Then setting the partial derivatives to zero gives the following linear set of equations for m and b:
$$M\,b + m\sum_{k=1}^{M} x_k - \sum_{k=1}^{M} y_k = 0$$
$$m\sum_{k=1}^{M} x_k^2 + b\sum_{k=1}^{M} x_k - \sum_{k=1}^{M} y_k x_k = 0$$
or, in matrix form,
$$\begin{pmatrix} M & \sum_{k=1}^{M} x_k \\ \sum_{k=1}^{M} x_k & \sum_{k=1}^{M} x_k^2 \end{pmatrix} \begin{pmatrix} b \\ m \end{pmatrix} = \begin{pmatrix} \sum_{k=1}^{M} y_k \\ \sum_{k=1}^{M} y_k x_k \end{pmatrix}$$
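Solving this 2 × 2 system explicitly (the closed-form expressions are not written out in the extracted notes, but they follow directly, e.g. by Cramer's rule) gives
$$m = \frac{M\sum x_k y_k - \sum x_k \sum y_k}{M\sum x_k^2 - \left(\sum x_k\right)^2}, \qquad b = \frac{\sum y_k - m\sum x_k}{M}$$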
Example 2
Problem Statement
Suppose we are given the following list of data points:
{{1, 5.12}, {3, 3}, {6, 2.48}, {9, 2.34}, {15, 2.18}}
and we want to find the parameters m and b that give the least square error for a guess function
$$L(x) = m\,x + b$$
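The notebook's intermediate solution steps for this example did not survive extraction; a minimal Wolfram Language sketch under the same setup as Example 1 (reusing the 𝔼2 definition reconstructed above) would be:

data = {{1, 5.12}, {3, 3}, {6, 2.48}, {9, 2.34}, {15, 2.18}};
x[i_] := data[[i, 1]]; y[i_] := data[[i, 2]]; M = Length[data];
L[x_] := m x + b  (* straight-line guess function *)
sol = First[Solve[{D[𝔼2[L, M], m] == 0, D[𝔼2[L, M], b] == 0}, {m, b}]]

Carrying out the algebra by hand with the normal equations above gives $m \approx -0.166$ and $b \approx 4.153$.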
[Plot: the data points together with the fitted line L(x) versus x.]
It is apparent from the plot that a straight line is not a particularly good fit of the data.
Final Comments
In these notes we have outlined the general principle for doing linear regression.
We have kept the discussion limited to fitting the data to a linear function
$$L(x) = m\,x + b$$
In this case there are two parameters, m and b, and they appear linearly in the fitting function. The term linear regression does not mean that the function is linear in the independent variable x, but that it is linear in the unknown parameters m and b.
For example, the same ideas and programming steps apply to fitting the data to a polynomial function of x:
$$P(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \cdots + a_n x^n$$
Note that the unknown parameters $a_0, a_1, \ldots, a_n$ appear linearly in the function P(x).
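For such linear-in-parameters fits, Mathematica's built-in Fit function carries out the least squares calculation directly; for instance, a quadratic fit of the Example 2 data would be obtained with (an illustration, not from the original notes):

Fit[data, {1, x, x^2}, x]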
References
These notes and the examples are adapted from Maron (1987).
M. J. Maron, Numerical Analysis: A Practical Approach, 2nd Edition, Macmillan Publishing Company, 1987.