
Least Squares Approximation

Numerical Analysis E3, I3 FMN050


Numerical Analysis
Centre for Mathematical Sciences
Lund University, Sweden

URL: http://www.maths.lth.se/na/

February 28, 2007

© 2003–2007 Gustaf Söderlind


Approximation problems

We will treat two basic problems:


• Overdetermined linear systems
• Interpolation

Examples:
• Fit straight line to measurement data
• Find continuous function that agrees with discrete
data (“digital-to-analogue conversion”)

Form & Norm: central questions of approximation


• Form: Which form should the approximant have?
Straight line, polynomial, trigonometric function,
3rd-degree surface, . . . ?
• Norm: How do we measure the “error”?
Which is the “best” approximation?

For example, $\|\cdot\|_2 \Rightarrow$ the least squares method

Fitting a straight line

Table of data:
x 1 2 3
y 1 2 2

[Figure: the data points (1, 1), (2, 2), (3, 2) plotted in the plane.]

Let $y^*(x) = c_0 + c_1 x$. Determine the parameters $c_0, c_1$!

Wishful thinking:
$y^*(1) = y(1) \Rightarrow c_0 + c_1 = 1$
$y^*(2) = y(2) \Rightarrow c_0 + 2c_1 = 2$
$y^*(3) = y(3) \Rightarrow c_0 + 3c_1 = 2$

“Overdetermined system” — what is a solution?

Three equations, two unknowns $c_0, c_1$.
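To see this concretely, here is a minimal NumPy sketch (an illustration added here, not part of the original notes): np.linalg.lstsq returns the least squares solution derived later in these slides, together with the nonzero residual showing that no exact solution exists.

```python
import numpy as np

# Overdetermined system A c ~ y for the data (1,1), (2,2), (3,2)
A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
y = np.array([1.0, 2.0, 2.0])

# lstsq minimizes ||A c - y||_2; with 3 equations and 2 unknowns
# the residual is nonzero, so no exact solution exists
c, res, rank, sv = np.linalg.lstsq(A, y, rcond=None)
print(c)    # approx [0.6667, 0.5], i.e. c0 = 2/3, c1 = 1/2
print(res)  # squared residual norm approx 0.1667 = 3/18 > 0
```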

Minimax approximation, $\|\cdot\|_\infty$
[Figure: the data points and the best minimax line in the plane.]

Error: $e(x) = y^*(x) - y(x) = c_0 + c_1 x - y(x)$.


Determine $c_0, c_1$ so that $\|e(x)\|_\infty$ is minimized!

$\min_{c_0, c_1} \|e\|_\infty \;\Rightarrow\; |e(1)| = |e(2)| = |e(3)|$

$e(1) = -e(2) \Rightarrow c_0 + c_1 - 1 = -(c_0 + 2c_1 - 2)$

$e(1) = e(3) \Rightarrow c_0 + c_1 - 1 = c_0 + 3c_1 - 2$

$2c_0 + 3c_1 = 3$
$2c_1 = 1 \;\Rightarrow\; c_1 = 1/2,\; c_0 = 3/4$

Best (minimax) approximation: $y^*(x) = \frac{3}{4} + \frac{1}{2}x$
Note: this equioscillation construction works only for three points; in the general case one uses the Remez algorithm.
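A quick numerical check of the equioscillation (again a NumPy sketch added for illustration):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([1.0, 2.0, 2.0])

c0, c1 = 3/4, 1/2       # minimax coefficients derived above
e = c0 + c1 * x - y     # error at the data points
print(e)                # [ 0.25 -0.25  0.25]: equal magnitude with
                        # alternating signs, so ||e||_inf = 1/4
```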

Least squares approximation

Error: $e(x) = y^*(x) - y(x) = c_0 + c_1 x - y(x)$.

Determine $c_0, c_1$ to minimize $\rho(c_0, c_1) = \|e(x)\|_2^2$:

$\rho = (c_0 + c_1 - 1)^2 + (c_0 + 2c_1 - 2)^2 + (c_0 + 3c_1 - 2)^2$

$\min_{c_0, c_1} \rho(c_0, c_1) = \min_{c_0, c_1} \sum_i e(x_i)^2$

$\rho$ is smooth, so the minimum satisfies $\partial\rho/\partial c_0 = \partial\rho/\partial c_1 = 0$:

$\partial\rho/\partial c_0 = 0 \;\Rightarrow\; 6c_0 + 12c_1 - 10 = 0$

$\partial\rho/\partial c_1 = 0 \;\Rightarrow\; 12c_0 + 28c_1 - 22 = 0$

Solution: $c_1 = 1/2$, $c_0 = 2/3$

Best (least squares) approximation: $y^*(x) = \frac{2}{3} + \frac{1}{2}x$

Note: minimax and least-squares are not the same!
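The two stationarity conditions form a $2 \times 2$ linear system; a short NumPy sketch (illustration only, not from the original slides) solves it directly:

```python
import numpy as np

# Stationarity: 6 c0 + 12 c1 = 10 and 12 c0 + 28 c1 = 22
M = np.array([[ 6.0, 12.0],
              [12.0, 28.0]])
rhs = np.array([10.0, 22.0])

c0, c1 = np.linalg.solve(M, rhs)
print(c0, c1)   # 0.6667 0.5, i.e. y*(x) = 2/3 + x/2
```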

Comparison

[Figure: dashed line: the minimax fit; solid line: the least squares fit, together with the data points.]

The optimal solution depends on the choice of norm!

Data
x   1    2    3
y   1    2    2

Minimax solution
y*  5/4  7/4   9/4
e   1/4  −1/4  1/4      $\|e\|_\infty = 1/4$,  $\|e\|_2^2 = 3/16$

Least squares solution
y*  7/6  10/6  13/6
e   1/6  −2/6  1/6      $\|e\|_\infty = 1/3$,  $\|e\|_2^2 = 3/18$
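The table entries can be reproduced with a few lines of NumPy (an added illustration, not part of the original slides):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([1.0, 2.0, 2.0])

fits = {"minimax": (3/4, 1/2), "least squares": (2/3, 1/2)}
for name, (c0, c1) in fits.items():
    e = c0 + c1 * x - y
    print(name, np.max(np.abs(e)), np.sum(e**2))
# minimax:       ||e||_inf = 0.25,   ||e||_2^2 = 0.1875 (= 3/16)
# least squares: ||e||_inf = 0.3333, ||e||_2^2 = 0.1667 (= 3/18)
```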

The least squares method

Overdetermined linear system:

$Ax \approx b$;  $A \in \mathbb{R}^{m \times n}$;  $m > n$

Usually no solution exists: a solution exists $\Leftrightarrow b \in \mathcal{R}(A)$.

Existence is the exception, because even if rank $A = n$, the range $\mathcal{R}(A)$
is only an $n$-dimensional subspace of $\mathbb{R}^m$, so a generic $b \in \mathbb{R}^m$
lies outside $\mathcal{R}(A)$.

Least squares approximate solution:

Determine $x$ so that the residual $\|Ax - b\|_2$ is minimal!

Problem:
$\min_x \|Ax - b\|_2$

The least squares method. . .

Minimize the residual $r$, or (more conveniently) $r^T r$.

$r = Ax - b \;\Rightarrow\; r^T r = \|Ax - b\|_2^2$

Quadratic form:

$r^T r = (Ax - b)^T (Ax - b) = x^T A^T A x - 2 b^T A x + b^T b$

Stationary points if and only if

$\operatorname{grad} r^T r = 2 x^T A^T A - 2 b^T A = 0.$

The normal equations: $A^T A x - A^T b = 0$.

The residual must be orthogonal to the columns of $A$:

$A^T (Ax - b) = 0 \;\Leftrightarrow\; A^T r = 0$

The residual $r$ is normal to $\mathcal{R}(A)$.
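A small sketch (NumPy, with a randomly generated test problem of my own choosing) solves the normal equations and verifies the orthogonality $A^T r = 0$:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 10, 3
A = rng.standard_normal((m, n))   # random test matrix, full rank
b = rng.standard_normal(m)

# Normal equations: A^T A x = A^T b
x = np.linalg.solve(A.T @ A, A.T @ b)

r = A @ x - b
print(A.T @ r)   # approx zero vector: the residual is normal to R(A)
```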

The normal equations

Overdetermined system: $Ax \approx b$;  $A \in \mathbb{R}^{m \times n}$

Normal equations: $A^T A x = A^T b$

Least squares solution: $x = (A^T A)^{-1} A^T b$.

Note: $A^T A$ is $n \times n$ (square), and rank $A = n$ $\Rightarrow$

• $\det A^T A \neq 0$
• $A^T A > 0$ (positive definite)
• the normal equations give the minimizing solution
• $(A^T A)^{-1} A^T$ is called the pseudoinverse of $A$.
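NumPy exposes the pseudoinverse directly as np.linalg.pinv (computed via the SVD, which is numerically more robust than forming $A^T A$); a sketch on the running example:

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
y = np.array([1.0, 2.0, 2.0])

x_normal = np.linalg.solve(A.T @ A, A.T @ y)  # via normal equations
x_pinv = np.linalg.pinv(A) @ y                # via the pseudoinverse
print(x_normal, x_pinv)                       # both give [2/3, 1/2]
```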

Example: Fitting a straight line

Fit $y^*(x) = c_0 + c_1 x$ to


x 1 2 3
y 1 2 2

Overdetermined system $Ac \approx y$:

$\begin{pmatrix} 1 & 1 \\ 1 & 2 \\ 1 & 3 \end{pmatrix}
\begin{pmatrix} c_0 \\ c_1 \end{pmatrix} \approx
\begin{pmatrix} 1 \\ 2 \\ 2 \end{pmatrix}$

$A = \begin{pmatrix} 1 & 1 \\ 1 & 2 \\ 1 & 3 \end{pmatrix}
\;\Rightarrow\;
A^T = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 2 & 3 \end{pmatrix}.$

Normal equations $A^T A c = A^T y$:

$\begin{pmatrix} 3 & 6 \\ 6 & 14 \end{pmatrix}
\begin{pmatrix} c_0 \\ c_1 \end{pmatrix} =
\begin{pmatrix} 5 \\ 11 \end{pmatrix}$

Solution: $c_0 = \frac{2}{3}$, $c_1 = \frac{1}{2}$ $\;\Rightarrow\;$ $y^*(x) = \frac{2}{3} + \frac{1}{2}x$.
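The same normal equations can also be solved in exact arithmetic; the sketch below (Python's standard fractions module and Cramer's rule, an added illustration) reproduces $c_0 = 2/3$, $c_1 = 1/2$ exactly:

```python
from fractions import Fraction

x = [1, 2, 3]
y = [1, 2, 2]

# Entries of A^T A and A^T y for the line fit y* = c0 + c1 x
m  = len(x)
sx = sum(x);  sxx = sum(t * t for t in x)
sy = sum(y);  sxy = sum(t * u for t, u in zip(x, y))
print(m, sx, sxx, sy, sxy)          # 3 6 14 5 11, as above

# Cramer's rule on [[m, sx], [sx, sxx]] c = [sy, sxy]
det = m * sxx - sx * sx             # = 6
c0 = Fraction(sy * sxx - sx * sxy, det)
c1 = Fraction(m * sxy - sx * sy, det)
print(c0, c1)                       # 2/3 1/2
```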

L2 approximation

Problem: Given $f$, find a $\varphi \in \Phi$ such that $\varphi \approx f$.

Standard questions of form and norm:

• Form: How to choose the system of functions $\Phi$?
  Example: polynomials, trigonometric functions. . .

• Norm: How do we measure the approximation error?
  Example: $\|f - \varphi\|_2$ (known as $L^2$ approximation).

Approximation with function systems in $L^2$ is a
generalization of the least squares method.

Important examples:
• Orthogonal polynomials
• Fourier analysis
• the Finite Element Method

Inner products and the norm $\|\cdot\|_2$

Inner product = generalized scalar product.


Properties of a real inner product:
1. $\langle f, g \rangle = \langle g, f \rangle$
2. $\langle f, \alpha g + \beta h \rangle = \alpha \langle f, g \rangle + \beta \langle f, h \rangle$
3. $\langle f, f \rangle \geq 0$
4. $\langle f, f \rangle = 0 \Rightarrow f \equiv 0$

Euclidean norm: $\|f\|_2^2 = \langle f, f \rangle$

Orthogonality: $f \perp g$ if $\langle f, g \rangle = 0$

Pythagorean theorem:

$\langle f, g \rangle = 0 \;\Rightarrow\; \|f + g\|_2^2 = \|f\|_2^2 + \|g\|_2^2.$

Proof:

$\langle f + g, f + g \rangle = \langle f, f \rangle + 2\langle f, g \rangle + \langle g, g \rangle$

Inner products. . .

Discrete case:

$\langle f, g \rangle = \sum_{i=1}^{m} f(x_i) g(x_i); \qquad \|f\|_2^2 = \sum_{i=1}^{m} f(x_i)^2$

Compare $f^T g = \sum_i f_i g_i$ and $\|f\|_2^2 = \sum_i f_i^2$.

Continuous case:

$\langle f, g \rangle = \int_0^1 f(x) g(x)\,dx; \qquad \|f\|_2^2 = \int_0^1 f(x)^2\,dx$

Orthogonal systems:
A set of functions $\Phi = \{\varphi_j\}$ is an orthogonal system
with respect to $\langle \cdot, \cdot \rangle$ if

$\langle \varphi_i, \varphi_j \rangle = 0$ for $i \neq j$.
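Both inner products translate directly into code. In the sketch below (SciPy's quad for the integral; the shifted Legendre polynomials on $[0,1]$ are my example, not from the notes), the three basis functions are pairwise orthogonal:

```python
from scipy.integrate import quad

def inner_discrete(f, g, xs):
    """Discrete inner product: sum of f(x_i) g(x_i)."""
    return sum(f(t) * g(t) for t in xs)

def inner_continuous(f, g):
    """Continuous inner product on [0, 1]."""
    return quad(lambda t: f(t) * g(t), 0.0, 1.0)[0]

# Shifted Legendre polynomials: an orthogonal system on [0, 1]
phi0 = lambda t: 1.0
phi1 = lambda t: 2*t - 1
phi2 = lambda t: 6*t**2 - 6*t + 1

print(inner_continuous(phi0, phi1))  # approx 0
print(inner_continuous(phi1, phi2))  # approx 0
print(inner_continuous(phi0, phi2))  # approx 0
print(inner_discrete(phi1, phi1, [0.0, 0.5, 1.0]))  # = 2.0, a discrete ||phi1||^2
```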

L2 approximation

Given a function $f$, an inner product $\langle \cdot, \cdot \rangle$ and an
orthogonal system $\Phi = \{\varphi_j\}$.

Approximate $f$ by $f^* = \sum_j c_j \varphi_j$!

$f^*$ is the best approximant in the least squares sense
if the residual satisfies the orthogonality condition

$f^* - f \perp \varphi_j \quad \forall j.$

Normal equations:

$\langle \varphi_i, f^* - f \rangle = 0 \;\;\forall i \;\Rightarrow$

$\langle \varphi_i, \sum_j c_j \varphi_j \rangle = \langle \varphi_i, f \rangle \;\Rightarrow$

$\sum_j \langle \varphi_i, \varphi_j \rangle c_j = \langle \varphi_i, f \rangle$

Compare the matrix formula “$\Phi^T \Phi c = \Phi^T f$”.
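When the system $\{\varphi_j\}$ is not orthogonal, the full Gram matrix must be assembled and solved. A sketch (monomial basis $1, x, x^2$ on $[0,1]$ and $f = e^x$, my choice of example) shows this; the Gram matrix of the monomials is in fact the notoriously ill-conditioned Hilbert matrix:

```python
import numpy as np
from scipy.integrate import quad

def inner(f, g):
    return quad(lambda t: f(t) * g(t), 0.0, 1.0)[0]

# Non-orthogonal basis: monomials 1, x, x^2 on [0, 1]
basis = [lambda t: 1.0, lambda t: t, lambda t: t**2]
f = np.exp                       # function to approximate

# Gram matrix a_ij = <phi_i, phi_j> and right-hand side <phi_i, f>
G = np.array([[inner(p, q) for q in basis] for p in basis])
rhs = np.array([inner(p, f) for p in basis])

c = np.linalg.solve(G, rhs)
print(G)   # the 3x3 Hilbert matrix: entries 1/(i + j + 1)
print(c)   # coefficients of the best L2 fit to e^x
```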

L2 approximation and Fourier coefficients

The normal equations

$\sum_j \langle \varphi_i, \varphi_j \rangle c_j = \langle \varphi_i, f \rangle$

form a linear system of equations for the coefficient vector $c$.

The system matrix has elements $a_{ij} = \langle \varphi_i, \varphi_j \rangle$.

If $\{\varphi_j\}$ is an orthogonal system, then $\langle \varphi_i, \varphi_j \rangle = 0$ unless
$i = j$. The system matrix becomes diagonal and

$c_i = \dfrac{\langle \varphi_i, f \rangle}{\langle \varphi_i, \varphi_i \rangle}$

The ci are called Fourier coefficients.

Note: In an orthogonal system the computation is reduced to a
minimum: only inner products need to be computed, and no
equation solving is required!
Examples: Fourier series, wavelet decompositions, etc.
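With an orthogonal system the same fit needs no linear solve; each coefficient is a single quotient of inner products (same assumed setup as before: shifted Legendre basis on $[0,1]$, SciPy quadrature, $f = e^x$):

```python
import numpy as np
from scipy.integrate import quad

def inner(f, g):
    return quad(lambda t: f(t) * g(t), 0.0, 1.0)[0]

# Orthogonal system on [0, 1]: shifted Legendre polynomials
basis = [lambda t: 1.0,
         lambda t: 2*t - 1,
         lambda t: 6*t**2 - 6*t + 1]
f = np.exp

# Fourier coefficients: quotients only, no system to solve
c = [inner(p, f) / inner(p, p) for p in basis]
print(c)
```

Up to rounding, the resulting $f^* = \sum_i c_i \varphi_i$ agrees with the fit from the Gram system above, since both bases span the same space of quadratic polynomials.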

L2 approximation. . .

“Theorem:” Let $\{\varphi_j\}$ be a linearly independent
system of basis functions in $L^2$. For every $f \in L^2$
there is a unique $f^* = \sum_j c_j^* \varphi_j$, given by the normal
equations $\langle \varphi_i, f^* - f \rangle = 0$, such that

$\|f^* - f\|_2^2 \leq \|g - f\|_2^2$

for any $g = \sum_j c_j \varphi_j$.

Proof: Write $g - f = g - f^* + f^* - f$, and note that
$g - f^* \in \operatorname{span} \Phi$, so by the normal equations
$g - f^* \perp f^* - f$. Apply the Pythagorean theorem:

$\|g - f\|_2^2 = \|g - f^*\|_2^2 + \|f^* - f\|_2^2.$

So $\|g - f\|_2^2$ is minimal precisely when $\|g - f^*\|_2^2 = 0$, i.e.,
$g = f^*$ by the linear independence of $\{\varphi_j\}$.
