
Least Squares Approximation

Numerical Analysis E3, I3 FMN050


Numerical Analysis
Centre for Mathematical Sciences
Lund University, Sweden

URL: http://www.maths.lth.se/na/

February 28, 2007

© 2003–2007 Gustaf Söderlind


Approximation problems

We will treat two basic problems:


• Overdetermined linear systems
• Interpolation

Examples:
• Fit straight line to measurement data
• Find continuous function that agrees with discrete
data (“digital-to-analogue conversion”)

Form & Norm: central questions of approximation


• Form: Which form should the approximant have?
Straight line, polynomial, trigonometric function,
3rd-degree surface, . . . ?
• Norm: How do we measure the “error”?
Which is the “best” approximation?

For example, $\|\cdot\|_2 \Rightarrow$ the least squares method

Fitting a straight line

Table of data:
x 1 2 3
y 1 2 2

[Figure: the data points (1, 1), (2, 2), (3, 2) plotted in the plane.]

Let $y^*(x) = c_0 + c_1 x$. Determine the parameters $c_0, c_1$!

Wishful thinking:
$y^*(1) = y(1) \Rightarrow c_0 + c_1 = 1$
$y^*(2) = y(2) \Rightarrow c_0 + 2c_1 = 2$
$y^*(3) = y(3) \Rightarrow c_0 + 3c_1 = 2$

“Overdetermined system” — what is a solution?

Three equations, two unknowns $c_0, c_1$.
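To see this concretely, here is a minimal NumPy sketch (an illustration added here, not part of the original notes): np.linalg.lstsq returns the least squares solution derived later in these slides, together with the nonzero residual showing that no exact solution exists.

```python
import numpy as np

# Overdetermined system A c ~ y for the data (1,1), (2,2), (3,2)
A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
y = np.array([1.0, 2.0, 2.0])

# lstsq minimizes ||A c - y||_2; with 3 equations and 2 unknowns
# the residual is nonzero, so no exact solution exists
c, res, rank, sv = np.linalg.lstsq(A, y, rcond=None)
print(c)    # approx [0.6667, 0.5], i.e. c0 = 2/3, c1 = 1/2
print(res)  # squared residual norm approx 0.1667 = 3/18 > 0
```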

Minimax approximation, $\|\cdot\|_\infty$
[Figure: the data points and the best minimax line in the plane.]

Error: $e(x) = y^*(x) - y(x) = c_0 + c_1 x - y(x)$.


Determine $c_0, c_1$ so that $\|e(x)\|_\infty$ is minimized!

$\min_{c_0, c_1} \|e\|_\infty \;\Rightarrow\; |e(1)| = |e(2)| = |e(3)|$

$e(1) = -e(2) \Rightarrow c_0 + c_1 - 1 = -(c_0 + 2c_1 - 2)$

$e(1) = e(3) \Rightarrow c_0 + c_1 - 1 = c_0 + 3c_1 - 2$

$2c_0 + 3c_1 = 3$
$2c_1 = 1 \;\Rightarrow\; c_1 = 1/2,\; c_0 = 3/4$

Best (minimax) approximation: $y^*(x) = \frac{3}{4} + \frac{1}{2}x$
Note: this equioscillation construction works only for three points; in the general case one uses the Remez algorithm.
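A quick numerical check of the equioscillation (again a NumPy sketch added for illustration):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([1.0, 2.0, 2.0])

c0, c1 = 3/4, 1/2       # minimax coefficients derived above
e = c0 + c1 * x - y     # error at the data points
print(e)                # [ 0.25 -0.25  0.25]: equal magnitude with
                        # alternating signs, so ||e||_inf = 1/4
```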

Least squares approximation

Error: $e(x) = y^*(x) - y(x) = c_0 + c_1 x - y(x)$.

Determine $c_0, c_1$ to minimize $\rho(c_0, c_1) = \|e(x)\|_2^2$:

$\rho = (c_0 + c_1 - 1)^2 + (c_0 + 2c_1 - 2)^2 + (c_0 + 3c_1 - 2)^2$

$\min_{c_0, c_1} \rho(c_0, c_1) = \min_{c_0, c_1} \sum_i e(x_i)^2$

$\rho$ is smooth, so the minimum satisfies $\partial\rho/\partial c_0 = \partial\rho/\partial c_1 = 0$:

$\partial\rho/\partial c_0 = 0 \;\Rightarrow\; 6c_0 + 12c_1 - 10 = 0$

$\partial\rho/\partial c_1 = 0 \;\Rightarrow\; 12c_0 + 28c_1 - 22 = 0$

Solution: $c_1 = 1/2$, $c_0 = 2/3$

Best (least squares) approximation: $y^*(x) = \frac{2}{3} + \frac{1}{2}x$

Note: minimax and least-squares are not the same!
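The two stationarity conditions form a $2 \times 2$ linear system; a short NumPy sketch (illustration only, not from the original slides) solves it directly:

```python
import numpy as np

# Stationarity: 6 c0 + 12 c1 = 10 and 12 c0 + 28 c1 = 22
M = np.array([[ 6.0, 12.0],
              [12.0, 28.0]])
rhs = np.array([10.0, 22.0])

c0, c1 = np.linalg.solve(M, rhs)
print(c0, c1)   # 0.6667 0.5, i.e. y*(x) = 2/3 + x/2
```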

Comparison

[Figure: dashed line: the minimax fit; solid line: the least squares fit, together with the data points.]

The optimal solution depends on the choice of norm!

Data
x   1    2    3
y   1    2    2

Minimax solution
y*  5/4  7/4   9/4
e   1/4  −1/4  1/4      $\|e\|_\infty = 1/4$,  $\|e\|_2^2 = 3/16$

Least squares solution
y*  7/6  10/6  13/6
e   1/6  −2/6  1/6      $\|e\|_\infty = 1/3$,  $\|e\|_2^2 = 3/18$
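The table entries can be reproduced with a few lines of NumPy (an added illustration, not part of the original slides):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([1.0, 2.0, 2.0])

fits = {"minimax": (3/4, 1/2), "least squares": (2/3, 1/2)}
for name, (c0, c1) in fits.items():
    e = c0 + c1 * x - y
    print(name, np.max(np.abs(e)), np.sum(e**2))
# minimax:       ||e||_inf = 0.25,   ||e||_2^2 = 0.1875 (= 3/16)
# least squares: ||e||_inf = 0.3333, ||e||_2^2 = 0.1667 (= 3/18)
```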

The least squares method

Overdetermined linear system:

$Ax \approx b$;  $A \in \mathbb{R}^{m \times n}$;  $m > n$

Usually no solution exists: a solution exists $\Leftrightarrow b \in \mathcal{R}(A)$.

Existence is the exception, because even if rank $A = n$, the range $\mathcal{R}(A)$
is only an $n$-dimensional subspace of $\mathbb{R}^m$, so a generic $b \in \mathbb{R}^m$
lies outside $\mathcal{R}(A)$.

Least squares approximate solution:

Determine $x$ so that the residual $\|Ax - b\|_2$ is minimal!

Problem:
$\min_x \|Ax - b\|_2$

The least squares method. . .

Minimize the residual $r$, or (more conveniently) $r^T r$.

$r = Ax - b \;\Rightarrow\; r^T r = \|Ax - b\|_2^2$

Quadratic form:

$r^T r = (Ax - b)^T (Ax - b) = x^T A^T A x - 2 b^T A x + b^T b$

Stationary points if and only if

$\operatorname{grad} r^T r = 2 x^T A^T A - 2 b^T A = 0.$

The normal equations: $A^T A x - A^T b = 0$.

The residual must be orthogonal to the columns of $A$:

$A^T (Ax - b) = 0 \;\Leftrightarrow\; A^T r = 0$

The residual $r$ is normal to $\mathcal{R}(A)$.
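A small sketch (NumPy, with a randomly generated test problem of my own choosing) solves the normal equations and verifies the orthogonality $A^T r = 0$:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 10, 3
A = rng.standard_normal((m, n))   # random test matrix, full rank
b = rng.standard_normal(m)

# Normal equations: A^T A x = A^T b
x = np.linalg.solve(A.T @ A, A.T @ b)

r = A @ x - b
print(A.T @ r)   # approx zero vector: the residual is normal to R(A)
```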

The normal equations

Overdetermined system: $Ax \approx b$;  $A \in \mathbb{R}^{m \times n}$

Normal equations: $A^T A x = A^T b$

Least squares solution: $x = (A^T A)^{-1} A^T b$.

Note: $A^T A$ is $n \times n$ (square), and rank $A = n$ $\Rightarrow$

• $\det A^T A \neq 0$
• $A^T A > 0$ (positive definite)
• the normal equations give the minimizing solution
• $(A^T A)^{-1} A^T$ is called the pseudoinverse of $A$.
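NumPy exposes the pseudoinverse directly as np.linalg.pinv (computed via the SVD, which is numerically more robust than forming $A^T A$); a sketch on the running example:

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
y = np.array([1.0, 2.0, 2.0])

x_normal = np.linalg.solve(A.T @ A, A.T @ y)  # via normal equations
x_pinv = np.linalg.pinv(A) @ y                # via the pseudoinverse
print(x_normal, x_pinv)                       # both give [2/3, 1/2]
```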

Example: Fitting a straight line

Fit $y^*(x) = c_0 + c_1 x$ to


x 1 2 3
y 1 2 2

Overdetermined system $Ac \approx y$:

$\begin{pmatrix} 1 & 1 \\ 1 & 2 \\ 1 & 3 \end{pmatrix}
\begin{pmatrix} c_0 \\ c_1 \end{pmatrix} \approx
\begin{pmatrix} 1 \\ 2 \\ 2 \end{pmatrix}$

$A = \begin{pmatrix} 1 & 1 \\ 1 & 2 \\ 1 & 3 \end{pmatrix}
\;\Rightarrow\;
A^T = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 2 & 3 \end{pmatrix}.$

Normal equations $A^T A c = A^T y$:

$\begin{pmatrix} 3 & 6 \\ 6 & 14 \end{pmatrix}
\begin{pmatrix} c_0 \\ c_1 \end{pmatrix} =
\begin{pmatrix} 5 \\ 11 \end{pmatrix}$

Solution: $c_0 = \frac{2}{3}$, $c_1 = \frac{1}{2}$ $\;\Rightarrow\;$ $y^*(x) = \frac{2}{3} + \frac{1}{2}x$.
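The same normal equations can also be solved in exact arithmetic; the sketch below (Python's standard fractions module and Cramer's rule, an added illustration) reproduces $c_0 = 2/3$, $c_1 = 1/2$ exactly:

```python
from fractions import Fraction

x = [1, 2, 3]
y = [1, 2, 2]

# Entries of A^T A and A^T y for the line fit y* = c0 + c1 x
m  = len(x)
sx = sum(x);  sxx = sum(t * t for t in x)
sy = sum(y);  sxy = sum(t * u for t, u in zip(x, y))
print(m, sx, sxx, sy, sxy)          # 3 6 14 5 11, as above

# Cramer's rule on [[m, sx], [sx, sxx]] c = [sy, sxy]
det = m * sxx - sx * sx             # = 6
c0 = Fraction(sy * sxx - sx * sxy, det)
c1 = Fraction(m * sxy - sx * sy, det)
print(c0, c1)                       # 2/3 1/2
```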

L2 approximation

Problem: Given $f$, find a $\varphi \in \Phi$ such that $\varphi \approx f$.

Standard questions of form and norm:

• Form: How to choose the system of functions $\Phi$?
  Example: polynomials, trigonometric functions. . .

• Norm: How do we measure the approximation error?
  Example: $\|f - \varphi\|_2$ (known as $L^2$ approximation).

Approximation with function systems in $L^2$ is a
generalization of the least squares method.

Important examples:
• Orthogonal polynomials
• Fourier analysis
• the Finite Element Method

Inner products and the norm $\|\cdot\|_2$

Inner product = generalized scalar product.


Properties of a real inner product:
1. $\langle f, g \rangle = \langle g, f \rangle$
2. $\langle f, \alpha g + \beta h \rangle = \alpha \langle f, g \rangle + \beta \langle f, h \rangle$
3. $\langle f, f \rangle \geq 0$
4. $\langle f, f \rangle = 0 \Rightarrow f \equiv 0$

Euclidean norm: $\|f\|_2^2 = \langle f, f \rangle$

Orthogonality: $f \perp g$ if $\langle f, g \rangle = 0$

Pythagorean theorem:

$\langle f, g \rangle = 0 \;\Rightarrow\; \|f + g\|_2^2 = \|f\|_2^2 + \|g\|_2^2.$

Proof:

$\langle f + g, f + g \rangle = \langle f, f \rangle + 2\langle f, g \rangle + \langle g, g \rangle$

Inner products. . .

Discrete case:

$\langle f, g \rangle = \sum_{i=1}^{m} f(x_i) g(x_i); \qquad \|f\|_2^2 = \sum_{i=1}^{m} f(x_i)^2$

Compare $f^T g = \sum_i f_i g_i$ and $\|f\|_2^2 = \sum_i f_i^2$.

Continuous case:

$\langle f, g \rangle = \int_0^1 f(x) g(x)\,dx; \qquad \|f\|_2^2 = \int_0^1 f(x)^2\,dx$

Orthogonal systems:
A set of functions $\Phi = \{\varphi_j\}$ is an orthogonal system
with respect to $\langle \cdot, \cdot \rangle$ if

$\langle \varphi_i, \varphi_j \rangle = 0$ for $i \neq j$.
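Both inner products translate directly into code. In the sketch below (SciPy's quad for the integral; the shifted Legendre polynomials on $[0,1]$ are my example, not from the notes), the three basis functions are pairwise orthogonal:

```python
from scipy.integrate import quad

def inner_discrete(f, g, xs):
    """Discrete inner product: sum of f(x_i) g(x_i)."""
    return sum(f(t) * g(t) for t in xs)

def inner_continuous(f, g):
    """Continuous inner product on [0, 1]."""
    return quad(lambda t: f(t) * g(t), 0.0, 1.0)[0]

# Shifted Legendre polynomials: an orthogonal system on [0, 1]
phi0 = lambda t: 1.0
phi1 = lambda t: 2*t - 1
phi2 = lambda t: 6*t**2 - 6*t + 1

print(inner_continuous(phi0, phi1))  # approx 0
print(inner_continuous(phi1, phi2))  # approx 0
print(inner_continuous(phi0, phi2))  # approx 0
print(inner_discrete(phi1, phi1, [0.0, 0.5, 1.0]))  # = 2.0, a discrete ||phi1||^2
```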

L2 approximation

Given a function $f$, an inner product $\langle \cdot, \cdot \rangle$ and an
orthogonal system $\Phi = \{\varphi_j\}$.

Approximate $f$ by $f^* = \sum_j c_j \varphi_j$!

$f^*$ is the best approximant in the least squares sense
if the residual satisfies the orthogonality condition

$f^* - f \perp \varphi_j \quad \forall j.$

Normal equations:

$\langle \varphi_i, f^* - f \rangle = 0 \;\;\forall i \;\Rightarrow$

$\langle \varphi_i, \sum_j c_j \varphi_j \rangle = \langle \varphi_i, f \rangle \;\Rightarrow$

$\sum_j \langle \varphi_i, \varphi_j \rangle c_j = \langle \varphi_i, f \rangle$

Compare the matrix formula “$\Phi^T \Phi c = \Phi^T f$”.
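When the system $\{\varphi_j\}$ is not orthogonal, the full Gram matrix must be assembled and solved. A sketch (monomial basis $1, x, x^2$ on $[0,1]$ and $f = e^x$, my choice of example) shows this; the Gram matrix of the monomials is in fact the notoriously ill-conditioned Hilbert matrix:

```python
import numpy as np
from scipy.integrate import quad

def inner(f, g):
    return quad(lambda t: f(t) * g(t), 0.0, 1.0)[0]

# Non-orthogonal basis: monomials 1, x, x^2 on [0, 1]
basis = [lambda t: 1.0, lambda t: t, lambda t: t**2]
f = np.exp                       # function to approximate

# Gram matrix a_ij = <phi_i, phi_j> and right-hand side <phi_i, f>
G = np.array([[inner(p, q) for q in basis] for p in basis])
rhs = np.array([inner(p, f) for p in basis])

c = np.linalg.solve(G, rhs)
print(G)   # the 3x3 Hilbert matrix: entries 1/(i + j + 1)
print(c)   # coefficients of the best L2 fit to e^x
```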

L2 approximation and Fourier coefficients

The normal equations

$\sum_j \langle \varphi_i, \varphi_j \rangle c_j = \langle \varphi_i, f \rangle$

form a linear system of equations for the coefficient vector $c$.

The system matrix has elements $a_{ij} = \langle \varphi_i, \varphi_j \rangle$.

If $\{\varphi_j\}$ is an orthogonal system, then $\langle \varphi_i, \varphi_j \rangle = 0$ unless
$i = j$. The system matrix becomes diagonal and

$c_i = \dfrac{\langle \varphi_i, f \rangle}{\langle \varphi_i, \varphi_i \rangle}$

The ci are called Fourier coefficients.

Note: In an orthogonal system the computation is reduced to a
minimum: only inner products need to be computed, and no
equation solving is required!
Examples: Fourier series, wavelet decompositions, etc.
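With an orthogonal system the same fit needs no linear solve; each coefficient is a single quotient of inner products (same assumed setup as before: shifted Legendre basis on $[0,1]$, SciPy quadrature, $f = e^x$):

```python
import numpy as np
from scipy.integrate import quad

def inner(f, g):
    return quad(lambda t: f(t) * g(t), 0.0, 1.0)[0]

# Orthogonal system on [0, 1]: shifted Legendre polynomials
basis = [lambda t: 1.0,
         lambda t: 2*t - 1,
         lambda t: 6*t**2 - 6*t + 1]
f = np.exp

# Fourier coefficients: quotients only, no system to solve
c = [inner(p, f) / inner(p, p) for p in basis]
print(c)
```

Up to rounding, the resulting $f^* = \sum_i c_i \varphi_i$ agrees with the fit from the Gram system above, since both bases span the same space of quadratic polynomials.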

L2 approximation. . .

“Theorem:” Let $\{\varphi_j\}$ be a linearly independent
system of basis functions in $L^2$. For every $f \in L^2$
there is a unique $f^* = \sum_j c_j^* \varphi_j$, given by the normal
equations $\langle \varphi_i, f^* - f \rangle = 0$, such that

$\|f^* - f\|_2^2 \leq \|g - f\|_2^2$

for any $g = \sum_j c_j \varphi_j$.

Proof: Write $g - f = g - f^* + f^* - f$, and note that
$g - f^* \in \operatorname{span} \Phi$, so by the normal equations
$g - f^* \perp f^* - f$. Apply the Pythagorean theorem:

$\|g - f\|_2^2 = \|g - f^*\|_2^2 + \|f^* - f\|_2^2.$

So $\|g - f\|_2^2$ is minimal precisely when $\|g - f^*\|_2^2 = 0$, i.e.,
$g = f^*$ by the linear independence of $\{\varphi_j\}$.
