Week 02: 2 and 4 September 2008


This week, we will review the fundamental theorem of the IVP

    y' = f(t, y), \qquad y(0) = y_0,    (1)

and begin to analyze and construct numerical methods.

2 Fundamental theorems of IVP theory

The fundamental theorems of the IVP guarantee that solutions of the IVP exist, are unique, and are stable, hence physically relevant. A superb reference for these theorems (and a host of other analytical facts about ODEs) is Chapter 1 of [HNW93].
Norms: We measure the size of error vectors, differences, and so forth in many ways. Each corresponds to a norm, most commonly to one of the p-norms

    \|x\|_p = \Big( \sum_{j=1}^{n} |x_j|^p \Big)^{1/p}    (2)

or the limiting case

    \|x\|_\infty = \max_{1 \le j \le n} |x_j| = \lim_{p \to \infty} \|x\|_p.    (3)

The usual Euclidean norm is p = 2, while p = 1 is called the Manhattan norm. Numerical analysts like the max-norm with p = \infty because it is easy to compute and gives a guaranteed error bound for every component, not just an average.
Given a choice of norm, we can specify the niceness of the data functions in ODEs (such as f) by degrees of continuity:

1. f is continuous at a iff \|f(x) - f(a)\| \to 0 as \|x - a\| \to 0.

2. f is differentiable at a iff there is some Jacobian matrix Df(a) such that \|f(x) - f(a) - Df(a)(x - a)\| / \|x - a\| \to 0 as \|x - a\| \to 0. The Jacobian is then uniquely given as the matrix of partial derivatives [\partial f_i / \partial x_j].

3. f is C^1 on a region iff Df(a) is a continuous function of a in that region. Here we define continuity of a matrix-valued function with the matrix norm

    \|A\|_p = \max_{\|x\|_p = 1} \|Ax\|_p,    (4)

so the size of a matrix is measured by how much it stretches vectors.


4. In ODE theory, a condition intermediate between C^0 and C^1 is extremely convenient: f is Lipschitz on a region iff there is a constant L such that \|f(x) - f(y)\| \le L \|x - y\| for all x and y in the region. A Lipschitz function is always C^0 and almost, but not quite, C^1.

Example: f(y) = \sqrt{y} is continuous but not Lipschitz at y = 0, because it develops an infinite slope there.

Example: f(x) = |x - a| on R is continuous and Lipschitz by the reverse triangle inequality | |x| - |y| | \le |x - y|, but not C^1 since the derivative does not exist at x = a.

Example: f(x) = x^2 is C^1 but not globally Lipschitz; it is, however, locally Lipschitz on bounded regions.

Example: f(t, y) = A(t)y + b(t) is Lipschitz as long as A(t) is bounded. If A is constant, the Lipschitz constant of f is called the matrix norm of A subordinate to the vector norm used to compute the Lipschitz constant. For example, in the p-norms with p = 1 and p = \infty, it turns out that the matrix norms can be computed directly by

    \|A\|_1 = \max_{1 \le j \le n} \sum_{i=1}^{n} |a_{ij}|, \qquad \|A\|_\infty = \max_{1 \le i \le n} \sum_{j=1}^{n} |a_{ij}|.

For other p the matrix norms are harder to evaluate.
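These formulas make the 1- and infinity-norms cheap to evaluate directly. A minimal sketch in Python (numpy assumed; the vector and matrix are arbitrary examples):

    import numpy as np

    x = np.array([3.0, -4.0, 1.0])
    print(sum(abs(x)**2)**0.5)       # ||x||_2, the Euclidean norm, per (2)
    print(max(abs(x)))               # ||x||_inf, the max-norm, per (3)

    A = np.array([[1.0, -2.0],
                  [3.0,  0.5]])
    print(abs(A).sum(axis=0).max())  # ||A||_1:   largest column sum of |a_ij|
    print(abs(A).sum(axis=1).max())  # ||A||_inf: largest row sum of |a_ij|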


These definitions come into play because the ODE y' = f(t, y) guarantees that y is C^1 whenever y and f are continuous. The fundamental theorems of ODE theory are as follows.
Theorem 1 Suppose f is a continuous function of t and y defined on the cylinder Q = \{0 \le t \le T, \|y - y_0\| \le r\}, with \|f\| \le M on Q.

1. Then the IVP (1) has a solution which exists for 0 \le t \le \min(r/M, T).

2. If in addition f is Lipschitz in y, then the solution is unique.

3. If in addition the Jacobian matrix Df(t, y) is continuous on Q, then the solution y is differentiable with respect to the initial condition y_0.

4. Suppose f also depends on parameters u \in R^m and D_u f exists and is continuous on Q for all u. Then the solution y is differentiable with respect to u.

The guaranteed interval of existence can be found by dimensional analysis: r/M is the only combination with units of time. It is also the largest time during which the trajectory (t, y(t)) cannot leave Q when it moves with velocity bounded by M starting from y_0. The general message is: if f is C^1 then everything is nice. The converse does not hold, and simple general necessary and sufficient conditions are rare in ODE theory.

If the solution exists, is unique, and depends continuously on the data y_0 and f, then we say the IVP is well-posed, and it is then worthy of numerical solution. Most problems are well-posed if properly formulated in the first place.

3 Integral equation

A key tool in proving many such theorems, and in constructing numerical methods, is the equivalent integral equation produced by integrating the IVP:

    y(t) = y_0 + \int_0^t f(s, y(s)) \, ds.    (5)

This equation does not require (and instead implies) differentiability of y, so analysis is much simpler than for the original IVP. Moreover, the initial condition and differential equation are contained in a single package.

For example, Picard iteration treats the integral equation as a fixed point equation:

    y_{n+1}(t) = y_0 + \int_0^t f(s, y_n(s)) \, ds.    (6)

When f is Lipschitz, it is easy to show that the Picard iterates converge uniformly to a unique continuous solution of the IVP on the interval of t specified in the theorem.
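As a concrete illustration, here is a minimal sketch of Picard iteration (6) in Python for the scalar IVP y' = y, y(0) = 1, whose iterates are the Taylor polynomials of e^t. The grid, quadrature rule, and tolerance below are illustrative choices, not part of the theory.

    import numpy as np

    t = np.linspace(0.0, 1.0, 201)        # fixed grid on [0, 1]
    f = lambda s, y: y                    # right-hand side f(s, y)
    y0 = 1.0

    y = np.full_like(t, y0)               # initial guess y_0(t) = y0
    for n in range(20):
        # y_{n+1}(t) = y0 + integral_0^t f(s, y_n(s)) ds, by cumulative trapezoid
        g = f(t, y)
        y_new = y0 + np.concatenate(([0.0],
            np.cumsum(0.5 * (g[1:] + g[:-1]) * np.diff(t))))
        done = np.max(np.abs(y_new - y)) < 1e-12
        y = y_new
        if done:
            break

    print(y[-1], np.exp(1.0))             # iterates converge toward e^t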

4 2d gravity example

The equations of motion for a particle in a 2d gravity well will be useful: at position (x, y) with velocity (u, v), we have

    x' = u,
    y' = v,
    u' = -x / (x^2 + y^2),
    v' = -y / (x^2 + y^2),

or

    \begin{pmatrix} x \\ y \\ u \\ v \end{pmatrix}' = z' = f(z) = \begin{pmatrix} u \\ v \\ -x/r^2 \\ -y/r^2 \end{pmatrix},    (7)

where r^2 = x^2 + y^2. The function f is singular when x = y = 0, but Lipschitz in any closed rectangle Q not containing x = y = 0.
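As code, the right-hand side of (7) is a direct transcription; a minimal sketch in Python (the function name is our own):

    import numpy as np

    def gravity_rhs(z):
        """Right-hand side f(z) of (7); z = (x, y, u, v), r^2 = x^2 + y^2.
        The singularity at x = y = 0 is deliberately left unguarded."""
        x, y, u, v = z
        r2 = x**2 + y**2
        return np.array([u, v, -x / r2, -y / r2])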

5 Differential-algebraic problems

In real life, we rarely have purely differential relations between the various solution components and derivatives. Instead, we have a trajectory governed by differential force laws and algebraic constraints. For example, we may have a linear differential-algebraic equation of the form

    A y'(t) + B y(t) = f(t).

If the matrix A is invertible, we can multiply by A^{-1} to get an ODE for y. But if A is not invertible, then only a subset of the components of y' are determined, and we have a combination of differential and algebraic equations. In the simplest case, A is the projection (y_1, \dots, y_n) \mapsto (y_1, 0, \dots, 0): only y_1 is governed by a differential equation, and the other components are determined by the algebraic rows of By = f.

Naturally most DAEs are nonlinear, and a similar selection of problems exists: IVPs, BVPs, control and inverse problems. Methods for DAEs usually come from methods for stiff ODEs, in the limit of infinite stiffness.
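The following toy sketch illustrates the projection case in Python; the 2-by-2 matrices are made up for illustration, with B[1,1] nonzero so the algebraic row determines y_2, and a forward Euler step (the method is derived in the next section) advances the differential row.

    import numpy as np

    # A = [[1, 0], [0, 0]]: row 1 of A y' + B y = f is differential,
    # row 2 is the algebraic constraint B[1,0]*y1 + B[1,1]*y2 = f2.
    B = np.array([[1.0, 2.0],
                  [1.0, 1.0]])
    f = lambda t: np.array([np.sin(t), 1.0])

    h, T = 1e-3, 1.0
    y1 = 0.0
    for n in range(int(T / h)):
        t = n * h
        y2 = (f(t)[1] - B[1, 0] * y1) / B[1, 1]            # algebraic row
        y1 += h * (f(t)[0] - B[0, 0] * y1 - B[0, 1] * y2)  # differential row
    print(y1, y2)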

6 Building methods for the IVP

We review techniques for building numerical formulas for the IVP and begin
to develop criteria for designing such formulas. These criteria involve the
cost of obtaining a solution as a function of the accuracy desired.
Many texts derive formulas by many different methods: see [SB93, Lam91,
HNW93] and so forth. The basic convergence theory is presented in Chap. 7
of [SB93], Sec. 3.4 of [HNW93], Chap. 5 of [IK94], and Chap. 2 of [Lam91].

6.1 Ad hoc approach

The most obvious way to advance the numerical solution u_n from time t_n to t_{n+1} = t_n + h is the Taylor expansion

    y(t_n + h) = y(t_n) + h y'(t_n) + \frac{1}{2} h^2 y''(t_n) + \cdots.    (8)

Using the ODE for y' and dropping higher-order terms gives the forward Euler method

    u_{n+1} = u_n + h f(t_n, u_n), \qquad u_0 = y_0.    (9)
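As a sketch, forward Euler (9) in Python (the fixed-step loop and the test problem y' = -y are illustrative choices):

    import numpy as np

    def forward_euler(f, y0, t0, T, h):
        """Advance u' = f(t, u), u(t0) = y0 to time T with fixed step h, per (9)."""
        t, u = t0, np.asarray(y0, dtype=float)
        while t < T - 1e-12:
            u = u + h * f(t, u)
            t = t + h
        return u

    # y' = -y, y(0) = 1: compare with the exact value exp(-1) at t = 1.
    print(forward_euler(lambda t, u: -u, [1.0], 0.0, 1.0, 1e-3), np.exp(-1.0))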

6.2 Taylor series approach

For more accuracy, we can differentiate the ith component of the vector ODE to get higher-order derivatives:

    y_i'' = \frac{d}{dt} f_i(t, y(t)) = \frac{\partial f_i}{\partial t} + \sum_{j=1}^{n} \frac{\partial f_i}{\partial y_j}(t, y(t)) \, f_j(t, y(t)),    (10)

where we used the ODE again to replace y_j' by f_j. Using the Jacobian matrix Df_{ij} = \partial f_i / \partial y_j and the shorthand f_t = \partial f / \partial t, this gives

    y'' = f_t + Df \, f.    (11)

Thus the second-order Taylor series method is

    u_{n+1} = u_n + h f(t_n, u_n) + \frac{1}{2} h^2 \big( f_t(t_n, u_n) + Df(t_n, u_n) \, f(t_n, u_n) \big).    (12)

Example: For 2d gravity, we have

    f(z) = \begin{pmatrix} u \\ v \\ -x/r^2 \\ -y/r^2 \end{pmatrix},    (13)

where r^2 = x^2 + y^2, so

    Df(z) = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ (x^2 - y^2)/r^4 & 2xy/r^4 & 0 & 0 \\ 2xy/r^4 & (y^2 - x^2)/r^4 & 0 & 0 \end{pmatrix}.    (14)

Clearly this approach will be messy and error-prone for complicated ODEs, and inconvenient for ODEs where the right-hand side is defined by experiments or computations which are not easy to differentiate.
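A sketch of the second-order Taylor method (12) applied to the 2d gravity system, using the Jacobian (14). Since this f does not depend on t, the f_t term vanishes. The circular-orbit initial condition and step size are our own test choices.

    import numpy as np

    def gravity_rhs(z):
        # Right-hand side (13); z = (x, y, u, v).
        x, y, u, v = z
        r2 = x**2 + y**2
        return np.array([u, v, -x / r2, -y / r2])

    def gravity_Df(z):
        # The Jacobian (14).
        x, y, u, v = z
        r4 = (x**2 + y**2)**2
        return np.array([
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0],
            [(x**2 - y**2) / r4, 2*x*y / r4, 0.0, 0.0],
            [2*x*y / r4, (y**2 - x**2) / r4, 0.0, 0.0]])

    def taylor2_step(z, h):
        # One step of (12) with f_t = 0.
        fz = gravity_rhs(z)
        return z + h * fz + 0.5 * h**2 * gravity_Df(z) @ fz

    # Radius 1 with unit speed gives a circular orbit.
    z = np.array([1.0, 0.0, 0.0, 1.0])
    for n in range(1000):
        z = taylor2_step(z, 1e-3)
    print(z)   # after t = 1: approximately (cos 1, sin 1, -sin 1, cos 1)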

6.3 Integral equation approach

A systematic approach is provided by applying numerical integration to the equivalent integral equation

    y(t) = y_0 + \int_0^t f(s, y(s)) \, ds,    (15)

where integrating a vector function of a real variable s simply means integrating each component. For example, the forward Euler method results from the left rectangle rule

    \int_0^1 f(s) \, ds \approx f(0),    (16)

and the right rectangle rule yields the backward or implicit Euler method

    u_{n+1} = u_n + h f(t_{n+1}, u_{n+1}).    (17)

Backward Euler is expensive, since the new solution is determined by solving an equation which is nonlinear if f is, but it has some decided advantages for certain stiff problems.
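As a sketch, one backward Euler step (17) in Python, with the implicit equation solved by simple fixed-point iteration. Note that this iteration only converges when hL < 1; for genuinely stiff problems, where backward Euler earns its keep, one would solve the implicit equation with Newton's method instead. The tolerances and test problem are illustrative.

    import numpy as np

    def backward_euler_step(f, t_next, u_n, h, iters=50, tol=1e-12):
        """Solve u = u_n + h f(t_{n+1}, u), i.e. (17), by fixed-point iteration."""
        u = u_n + h * f(t_next, u_n)        # explicit predictor as initial guess
        for _ in range(iters):
            u_new = u_n + h * f(t_next, u)
            if np.max(np.abs(u_new - u)) < tol:
                return u_new
            u = u_new
        return u

    # y' = -2y, y(0) = 1, ten steps of h = 0.1: compare with exp(-2).
    u, t, h = np.array([1.0]), 0.0, 0.1
    for n in range(10):
        u = backward_euler_step(lambda s, y: -2.0 * y, t + h, u, h)
        t += h
    print(u, np.exp(-2.0))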
Other numerical integration rules also yield useful ODE formulas: the Trapezoidal Rule

    \int_0^1 f(s) \, ds \approx \frac{1}{2} \big( f(0) + f(1) \big)    (18)

yields

    u_{n+1} = u_n + \frac{h}{2} \big( f(t_n, u_n) + f(t_{n+1}, u_{n+1}) \big),    (19)

and the midpoint rule

    \int_0^1 f(s) \, ds \approx f(1/2)    (20)

yields

    u_{n+1} = u_n + h f\Big( t_n + \frac{h}{2}, \; u_n + \frac{1}{2} (u_{n+1} - u_n) \Big).    (21)

Here we have approximated the midpoint value u_{n+1/2} by linear interpolation.
In general, a q-point numerical integration rule

    \int_0^1 f(s) \, ds \approx \sum_{p=1}^{q} w_p f(\tau_p)    (22)

yields an ODE formula

    u_{n+1} = u_n + h \sum_{p=1}^{q} w_p f(t_n + h \tau_p, \, y(t_n + h \tau_p)),    (23)

where the off-grid y values on the right-hand side can be treated by three approaches:

1. Approximate them by combinations of known (and unknown) values, for example by interpolation as above in the midpoint rule.

2. Predict them by another numerical scheme such as Euler's method. For example, the midpoint value u_{n+1/2} in the midpoint rule can be predicted by a half-step of forward Euler:

    u_{n+1/2} = u_n + \frac{h}{2} f(t_n, u_n)    (24)

to yield one of the simplest Runge-Kutta formulas (see the sketch after this list):

    u_{n+1} = u_n + h f\Big( t_n + \frac{h}{2}, \; u_n + \frac{h}{2} f(t_n, u_n) \Big).    (25)

Or we could predict by backward Euler to obtain a method which does not appear in the literature (as far as I know):

    u_{n+1} = u_n + h f\Big( t_n + \frac{h}{2}, \; u_n + \frac{h}{2} f(t_{n+1}, u_{n+1}) \Big).    (26)

It might have some interesting stability properties.

3. Forbid them by requiring all the quadrature points to be on the grid. For equidistant time steps, this requires the nodes \tau_p to be integers. For example, we can approximate \int_0^1 f(s) \, ds by an interpolatory rule based on the values f(-2), f(-1), f(0), and f(1). This seems strange in the context of numerical integration, but works very naturally for causal ODEs. It yields a multistep method known as 3-step implicit Adams or Adams-Moulton:

    u_{n+1} = u_n + h \Big( \frac{9}{24} f_{n+1} + \frac{19}{24} f_n - \frac{5}{24} f_{n-1} + \frac{1}{24} f_{n-2} \Big),    (27)

where f_{n-j} = f(t_{n-j}, u_{n-j}).
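Here is a minimal sketch of the Runge-Kutta formula (25) in Python. The test problem and step sizes are illustrative; since this method is second-order accurate, halving h should cut the error by about a factor of four.

    import numpy as np

    def rk_midpoint(f, y0, t0, T, h):
        """Advance u' = f(t, u) with the explicit midpoint formulas (24)-(25)."""
        t, u = t0, np.asarray(y0, dtype=float)
        while t < T - 1e-12:
            u_half = u + 0.5 * h * f(t, u)        # predictor, (24)
            u = u + h * f(t + 0.5 * h, u_half)    # full step, (25)
            t = t + h
        return u

    # y' = -y, y(0) = 1: error at t = 1 versus step size.
    for h in (0.1, 0.05, 0.025):
        err = abs(rk_midpoint(lambda t, u: -u, [1.0], 0.0, 1.0, h)[0]
                  - np.exp(-1.0))
        print(h, err)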

6.4 Undetermined coefficients

One of the most powerful and general techniques for generating numerical formulas is the method of undetermined coefficients. Here we decide precisely what form and properties we wish our formula to have, specify them mathematically as equations or inequalities, and select the coefficients of our formula to satisfy the requirements. Some design criteria which translate into equations for the coefficients will be developed over the course of the next few sections.
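In miniature, the technique looks as follows: fix the form of a quadrature rule \sum_p w_p f(\tau_p) for \int_0^1 f(s) ds, impose exactness on the monomials 1, s, s^2, ..., and solve the resulting linear (Vandermonde) system for the weights. A sketch in Python; with the grid nodes -2, -1, 0, 1, it recovers the Adams-Moulton coefficients of (27).

    import numpy as np

    tau = np.array([-2.0, -1.0, 0.0, 1.0])     # quadrature nodes on the grid
    q = len(tau)

    # Exactness on s^k: sum_p w_p tau_p^k = integral_0^1 s^k ds = 1/(k+1).
    V = np.vander(tau, q, increasing=True).T   # V[k, p] = tau_p^k
    rhs = 1.0 / np.arange(1, q + 1)
    w = np.linalg.solve(V, rhs)
    print(w * 24)   # -> [1, -5, 19, 9]: coefficients of f_{n-2}, ..., f_{n+1}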

6.5 Promotion to higher order

Later we will use techniques such as Richardson extrapolation and defect correction to promote low-order accurate methods such as forward Euler, backward Euler, and the midpoint rule to higher-order accurate methods. These techniques depend on the ideas of asymptotic error expansions and variational equations.

7 Convergence of Euler's method

Convergence is the first basic requirement that a reasonable numerical method must satisfy. It requires that we can get arbitrarily accurate solutions by putting sufficient effort into solving the problem, that is, by taking h sufficiently small.

Definition 1 A method for the IVP converges if for all sufficiently smooth right-hand sides f, the numerical solution u_n satisfies

    \max_{0 \le t_n \le T} \|u_n - y(t_n)\| \to 0    (28)

as h \to 0 and u_0 \to y_0. It is accurate of order p if

    \max_{0 \le t_n \le T} \|u_n - y(t_n)\| = O(h^p) + O(\|u_0 - y_0\|)    (29)

as h \to 0 and u_0 \to y_0.
Convergence means that as the mesh size decreases and the precision of the starting values increases, the numerical solution approaches the exact solution on the whole interval [0, T]. A method accurate of high order p produces a highly accurate solution faster than low-order accurate methods, provided the implied constant in O(h^p) is not too large for the particular problem being solved.

The proof of convergence for the forward Euler method is simple and demonstrates a standard approach: reduce convergence to consistency and stability, prove each separately, then put them together. This approach works for many PDEs as well as ODEs. A similar but more detailed proof is given in Section I.7 of [HNW93].
Theorem 2 Euler's method converges for any IVP where f is Lipschitz and the solution y is C^2.

Proof: First, we derive a recursion for the error at each step: the forward Euler method says that

    u_{n+1} = u_n + h f(t_n, u_n),

and Taylor expansion with remainder of each component gives

    y_j(t_{n+1}) = y_j(t_n) + h f_j(t_n, y(t_n)) + \frac{1}{2} h^2 y_j''(t_n + \theta_j h) = y_j(t_n) + h f_j(t_n, y(t_n)) + \tau_{nj}    (30)

for some unknown collection of numbers \theta_j between 0 and 1. This defines the local truncation error \tau_n, the amount by which the exact solution y fails to satisfy the method after one step. Subtraction gives a difference equation for the error:

    e_{n+1} = u_{n+1} - y(t_{n+1}) = e_n + h \big[ f(t_n, u_n) - f(t_n, y(t_n)) \big] - \tau_n.

Assume f is Lipschitz with constant L and the local truncation error satisfies a bound \|\tau_n\| \le \epsilon for all n (which depends only on the exact solution); then

    \|e_{n+1}\| \le \|e_n\| + hL \|e_n\| + \epsilon = (1 + hL) \|e_n\| + \epsilon
               \le (1 + hL)^2 \|e_{n-1}\| + \epsilon \big( 1 + (1 + hL) \big)
               \le \cdots
               \le (1 + hL)^{n+1} \|e_0\| + \epsilon \, \frac{(1 + hL)^{n+1} - 1}{(1 + hL) - 1}.

Since 1 + hL \le e^{hL} = 1 + hL + \frac{1}{2}(hL)^2 + \cdots, this gives

    \|e_n\| \le e^{nhL} \|e_0\| + \frac{e^{nhL} - 1}{hL} \, \epsilon \le e^{LT} \|e_0\| + \frac{e^{LT} - 1}{LT} \, n\epsilon    (31)

for 0 \le t_n = nh \le T. This shows stability: local errors in the numerical solution are bounded independently of the mesh size.

Now the local truncation error is

    \|\tau_n\| = \Big\| \frac{1}{2} h^2 y''(t_n + \theta_n h) \Big\| \le \frac{1}{2} M h^2,    (32)

assuming y is C^2 and \|y''\| \le M (i.e. f is C^1 with bounded derivatives). This condition, that the local truncation error is O(h^2), is called consistency.
Consistency gives a local bound on \epsilon, and stability allows us to conclude convergence:

    \|u_n - y(t_n)\| \le e^{LT} \|u_0 - y_0\| + \frac{e^{LT} - 1}{LT} \, \frac{T}{2} M h = O(\|u_0 - y_0\|) + O(h).    (33)

Convergence means that u_n \to y(t_n) on the whole interval [0, T] as h \to 0 and u_0 \to y_0. In other words, the errors in the computed solution can be made arbitrarily small by taking sufficiently small steps and sufficiently accurate starting values.
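An empirical check of Theorem 2 (the test problem y' = -y, y(0) = 1 is our own choice): forward Euler's error at T = 1 shrinks like O(h), roughly halving each time h is halved.

    import numpy as np

    def euler_error(h):
        u = 1.0
        for _ in range(round(1.0 / h)):
            u += h * (-u)                   # forward Euler for y' = -y
        return abs(u - np.exp(-1.0))

    for h in (0.1, 0.05, 0.025, 0.0125):
        print(h, euler_error(h))            # errors decrease at first order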

References

[HNW93] E. Hairer, S. P. Nørsett, and G. Wanner. Solving Ordinary Differential Equations I: Nonstiff Problems. Springer-Verlag, second revised edition, 1993.

[IK94] E. Isaacson and H. B. Keller. Analysis of Numerical Methods. Dover, 1994.

[Lam91] J. D. Lambert. Numerical Methods for Ordinary Differential Systems: The Initial Value Problem. John Wiley and Sons, 1991.

[SB93] J. Stoer and R. Bulirsch. Introduction to Numerical Analysis. Springer-Verlag, 1993.
