Initial Value Methods: 4.1 Introduction: Shooting
Initial Value Methods
In the previous chapter we saw that there is a close theoretical relationship between
BVPs and IVPs. Indeed, the basic theorems of solution existence and representation for
a BVP were given in terms of fundamental solutions, which are solutions of associated
IVPs. At the same time, BVPs may have a more complex structure than IVPs (see Sec-
tion 3.1). It makes sense, then, to construct a numerical method for a given BVP by
relating the problem to corresponding "easier" IVPs, and solving the latter numerically.
This chapter describes and analyzes such methods.
The intuitive appeal of this approach is strengthened by the advanced state of
numerical analysis for IVPs: As described in Section 2.7, good numerical methods for
such problems are well understood, and efficient, flexible, general-purpose software is
readily available in any mathematical software library. Thus, one is able, at least in
principle, to solve BVPs numerically with minimal problem analysis and preparation.
The simplest initial value method for BVPs is the single (or simple) shooting method,
briefly described below for linear and nonlinear problems. The method is analyzed in
more detail in further sections. It is a useful, easy-to-understand method. Unfor-
tunately, it can also (and often does) have stability drawbacks. These drawbacks are
alleviated by more complex initial value methods like multiple shooting, the stabilized
march, and the Riccati method. Some of the resulting variants lead to powerful,
general-purpose algorithms (unlike single shooting). At the same time, the charm of
single shooting is its simplicity, and this is lost as well.
Downloaded 06/07/16 to 130.237.29.138. Redistribution subject to SIAM license or copyright; see http://www.siam.org/journals/ojsa.php
u'' = c1(x)u' + c0(x)u + q(x),   a < x < b (4.1a)
u(a) = β1 (4.1b)
u(b) = β2 (4.1c)
Assume that (4.1a,b,c) has a unique solution u(x). In order to use an initial value
integrator for (4.1a) we need u(a) and u'(a). By (4.1b) we have u(a), but u'(a) is
unknown. Let us therefore guess a "shooting angle" s1,

u'(a) = s1 (4.1d)
The IVP (4.1a,b,d) can be solved, yielding a solution v(x), say. But, in general, v ≠ u
because v(b) ≠ β2. Let us then guess another value

u'(a) = s2 (4.1e)

and call w(x) the solution of the IVP (4.1a,b,e). Again, in general w(b) ≠ β2, so w ≠ u.
Linearity of the problem implies that
u(x) = θw(x) + (1 − θ)v(x),   a ≤ x ≤ b. (4.2)
Assuming (reasonably) that v(b) ≠ w(b), we can adjust the shooting angle by defining

θ := (β2 − v(b)) / (w(b) − v(b)),   i.e.,   β2 = θw(b) + (1 − θ)v(b) (4.3)
A direct substitution into (4.1a,b,c) shows that u(x) of (4.2) is indeed the sought solution of the BVP.
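The whole computation can be sketched as follows; the concrete coefficients c1 = 0, c0 = 1, q = 0 on [0, 1] (i.e. u'' = u, u(0) = 0, u(1) = 1) and the fixed-step RK4 integrator are assumptions chosen purely for illustration:

```python
import math

def integrate(s, a=0.0, b=1.0, n=1000):
    """RK4 for the system (u, u')' = (u', u) with u(a) = 0, u'(a) = s; returns u(b)."""
    h = (b - a) / n
    u, up = 0.0, s
    f = lambda u, up: (up, u)                 # right-hand side of the sample (4.1a)
    for _ in range(n):
        k1 = f(u, up)
        k2 = f(u + h / 2 * k1[0], up + h / 2 * k1[1])
        k3 = f(u + h / 2 * k2[0], up + h / 2 * k2[1])
        k4 = f(u + h * k3[0], up + h * k3[1])
        u += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        up += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return u

beta2 = 1.0
v_b = integrate(0.0)                          # first shooting angle  s1 = 0
w_b = integrate(1.0)                          # second shooting angle s2 = 1
theta = (beta2 - v_b) / (w_b - v_b)           # adjusted angle, as in (4.3)
s_star = theta * 1.0 + (1.0 - theta) * 0.0    # combined angle from (4.2)
print(abs(integrate(s_star) - beta2))         # essentially zero
```

By linearity, shooting once more with the combined angle reproduces β2 up to discretization and roundoff error, exactly as (4.2)-(4.3) predict.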
This is an instance of the single shooting method. The IVPs (4.1a,b,d) and
(4.1a,b,e) are solved numerically to yield a numerical algorithm. If a stable scheme of
order r is used (cf. Section 2.7) and the coefficients c1(x), c0(x), and q(x) of (4.1a) are
sufficiently smooth, then it follows from (4.3) that the obtained approximation to u(x)
also has an error of O(h^r), where h is the maximum step size used in the numerical
integration of the IVPs.
A general exposition and analysis of the shooting method for linear systems of
ODEs is given in Section 4.2. Here we note that instead of the IVP (4.1a,b,e) we could
solve (4.1a) under the initial conditions
In (4.3) we have explicitly used the linearity of the problem. Yet, the shooting principle
extends to nonlinear problems as well, as the following example illustrates.
Example 4.1
Consider the following very simple model of a chemical reaction
u'' + e^u = 0,   0 < x < 1 (4.5a)
This latter problem can be solved numerically by an iterative scheme (cf. Section 2.4).
Note that each function evaluation in this iterative scheme involves the (numerical) solu-
tion of an IVP. ❑
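A minimal sketch of such an iterative scheme; the secant iteration on the shooting angle is one concrete choice (any scheme from Section 2.4 would do), and the step count and tolerances are assumptions for illustration:

```python
import math

def shoot(s, n=2000):
    """RK4 for u'' = -exp(u), u(0) = 0, u'(0) = s; returns u(1)."""
    h = 1.0 / n
    u, up = 0.0, s
    f = lambda u, up: (up, -math.exp(u))
    for _ in range(n):
        k1 = f(u, up)
        k2 = f(u + h / 2 * k1[0], up + h / 2 * k1[1])
        k3 = f(u + h / 2 * k2[0], up + h / 2 * k2[1])
        k4 = f(u + h * k3[0], up + h * k3[1])
        u += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        up += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return u

# secant iteration on phi(s) := u(1; s); each evaluation solves an IVP
s0, s1 = 0.0, 1.0
f0 = shoot(s0)
for _ in range(20):
    f1 = shoot(s1)
    if abs(f1) < 1e-12:
        break
    s0, s1, f0 = s1, s1 - f1 * (s1 - s0) / (f1 - f0), f1
print(s1)   # shooting angle u'(0) of the small solution, about 0.549
```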
The shooting approach remains rather straightforward even for a general system of first
order nonlinear differential equations,
y' = f(x, y) a <x <b (4.7a)
subject to general two-point boundary conditions
Again, if we denote by y(x; s) the solution of (4.7a) subject to the initial conditions
y(a ; s) = s (4.7c)
the problem reduces to that of finding s = s* which solves the nonlinear algebraic equa-
tions
Each function evaluation in (4.8) again involves the solution of an IVP (4.7a, c).
Now, a user can combine a standard library routine for solving nonlinear equa-
tions (e. g. using Broyden updates — see Section 2.3.3) and a standard initial value
solver (cf. Section 2.7) to obtain a general code for solving BVPs, based on the method
outlined above. This is so straightforward and general that, if the method did not have
severe drawbacks, there would be hardly any justification for writing this book! How-
ever, as it turns out, the simple shooting method cannot always be applied, and its
numerical results cannot always be trusted.
One major drawback already arises in the linear case. The IVPs integrated in the
process could be unstable, even when the BVP is well-conditioned. The shooting
method is unstable in that case, as discussed in Section 4.2. With nonlinear problems,
there is another potential trouble: When shooting with the wrong initial values s, the
(exact) IVP solution y(x; s) might exist only on [a, c], where c < b. The iteration pro-
cess for (4.8) is obviously in trouble under such circumstances.
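A one-line model problem illustrates the phenomenon; the equation y' = y², the interval, and the blow-up threshold below are assumptions for illustration:

```python
# When shooting with a poor guess s, the exact IVP solution may cease to exist
# before x reaches b. Sample model: y' = y^2, y(0) = s, whose exact solution
# y(x) = s/(1 - s*x) blows up at x = 1/s.
def shoot(s, b=2.0, n=10000):
    """Forward Euler for y' = y^2 on [0, b]; returns None on numerical blow-up."""
    h = b / n
    y = s
    for _ in range(n):
        y += h * y * y
        if abs(y) > 1e12:      # the iterate has left any reasonable range
            return None
    return y

print(shoot(0.4))   # 1/s = 2.5 > b: the solution reaches x = b (about 2.0)
print(shoot(2.0))   # 1/s = 0.5 < b: blow-up inside [a, b] -> None
```

A nonlinear solver for (4.8) that requests y(b; s) for such an s receives no usable value, which is precisely the trouble referred to above.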
Initial value methods like multiple shooting and stabilized march attempt to
alleviate these difficulties by restricting the size of the intervals over which IVPs are
solved. The resulting schemes are described and analyzed in Sections 4.3, 4.4 and 4.6.
The Riccati method, described in Section 4.5, involves solving different, usually more
stable IVPs.
In this section we describe and analyze the shooting method for the general linear two-
point BVP
y' = A(x)y + q(x),   a < x < b (4.9a)

subject to the boundary conditions

Ba y(a) + Bb y(b) = β (4.9b)
4.2.1 Superposition
We describe the extension of (4.4) for the general problem (4.9a,b), since it is
mathematically more elegant than the extension of (4.3). Recall from Section 3.1 that
the general solution of (4.9a) can be represented as

y(x) = Y(x)s + v(x) (4.10)

where Y(x) is a fundamental solution and v(x) is a particular solution. For definiteness, let Y(x) = Y(x; a) be the fundamental solution
satisfying
satisfying
Y' = A(x)Y (4.11a)
Y(a) = I (4.11b)
The particular solution v(x) may be determined as the solution of the IVP
v' = A(x)v + q(x) (4.12a)
v(a) = α (4.12b)
for some vector α (e. g. α = 0). Thus, the n columns of Y(x) and the vector v(x) can
be computed as solutions of n + 1 IVPs. The BVP solution y(x) is given as their super-
position in (4.10). To determine the parameters s we substitute in (4.9b) to obtain
β = Ba[Y(a)s + v(a)] + Bb[Y(b)s + v(b)] = [Ba + Bb Y(b)]s + Ba v(a) + Bb v(b)
or
Q s = β̂ (4.13a)
where
Q := Ba + Bb Y(b) (4.13b)
β̂ := β − Ba α − Bb v(b) (4.13c)
Assuming that the BVP (4.9a,b) has a unique solution, we know that Q is nonsingular
[cf. Theorem 3.26]; hence s is well-defined by (4.13). Then the solution for our BVP is
given by (4.10).
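The whole procedure can be sketched as follows; the constant-coefficient test problem (u'' = u as a first-order system, u(0) = 0, u(1) = 1) and the fixed-step RK4 integrator are assumptions for illustration:

```python
import numpy as np

n = 2
A = np.array([[0.0, 1.0], [1.0, 0.0]])
q = np.zeros(n)
Ba = np.array([[1.0, 0.0], [0.0, 0.0]])     # first row of the BC:  u(0) = 0
Bb = np.array([[0.0, 0.0], [1.0, 0.0]])     # second row of the BC: u(1) = 1
beta = np.array([0.0, 1.0])
a, b, steps = 0.0, 1.0, 400

def rk4(f, y0):
    """Integrate y' = f(x, y) from a to b with a fixed step; return y(b)."""
    h = (b - a) / steps
    y = y0.astype(float).copy()
    for i in range(steps):
        x = a + i * h
        k1 = f(x, y)
        k2 = f(x + h / 2, y + h / 2 * k1)
        k3 = f(x + h / 2, y + h / 2 * k2)
        k4 = f(x + h, y + h * k3)
        y = y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return y

# n IVPs (4.11) for the columns of Y, one IVP (4.12) for v, with alpha = 0:
Yb = np.column_stack([rk4(lambda x, y: A @ y, e) for e in np.eye(n)])
vb = rk4(lambda x, y: A @ y + q, np.zeros(n))
Q = Ba + Bb @ Yb                             # (4.13b)
s = np.linalg.solve(Q, beta - Bb @ vb)       # (4.13a) with beta_hat of (4.13c)
print((Yb @ s + vb)[0])                      # y1(b): matches the BC value 1
```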
Example 4.2
Consider the ODE

y' = [[0, 1], [λ², 0]] y + (1 − λ²) [[0], [e^x]],   0 < x < b (4.14a)

subject to the BC

[[1, 0], [0, 0]] y(0) + [[0, 0], [1, 0]] y(b) = [[1], [e^b]] (4.14b)

for some λ > 0, b > 0. The reader should have no trouble in identifying here a special case
of (4.1a, b, c). It is straightforward to show that

Q = [[1, 0], [cosh λb, λ⁻¹ sinh λb]],   β̂ = (1, cosh λb + λ⁻¹ sinh λb)^T

with the solution s = (1, 1)^T. Hence the solution to (4.14a,b) is given by y(x) = (e^x, e^x)^T.
❑
1. Integrate (4.11a,b), (4.12a,b) numerically, using an IVP code, obtaining Y^h(b)
and v^h(b).
2. Form Q^h and β̂^h by (4.13b,c).
3. Solve Q^h s^h = β̂^h for s^h.
4. y^h(a) := s^h + α
5. Integrate the IVP for (4.9a) numerically, obtaining y^h(x_1), ..., y^h(x_J). ❑
The last two steps of the algorithm deserve a comment. Step 4 follows directly from
(4.10), but step 5, where the solution is evaluated (formed), solves an additional IVP,
instead of using (4.10). There are two reasons for this. One is that if there are many
evaluation points (e. g. we wish to plot or tabulate the solution), then storing Y^h(x_j) and
v^h(x_j) for each j during step 1 is impractical. The other reason is that when per-
forming steps 1-4 we do not need to know what evaluation points would be desired.
Note that theoretically, step 5 is equivalent to using (4.10), provided that the same mesh
is used in the IVP integration.
The quality of the approximate solution obtained by this algorithm depends on the
discretization error and on the roundoff error. In the next section we assume infinite
precision, which enables us to deal with the discretization error alone. The following
section then deals with roundoff errors.
As indicated in the single shooting algorithm, the IVPs have to be integrated numeri-
cally, and this introduces a discretization error. We have already indicated, following
Now, let the IVP integrator induce the (small) local truncation error t_i/(x_{i+1} − x_i) at x_i.
Then

y(x_{i+1}) − y_{i+1} = Γ_i [y(x_i) − y_i] + t_i (4.19)

Note that (4.19) has the form of (4.15) with the global error y(x_i) − y_i replacing y_i and
the local truncation error term t_i replacing the inhomogeneity g_i. Use of (4.18) then yields

y(x_j) − y_j = Σ_{i=1}^{N} G_{ji} t_i (4.20)
This expression relates the error in y_j to the local truncation errors in the IVP
integrations. But we still have to consider G_{ji}. It often occurs, for instance, that a fun-
damental solution grows exponentially. Yet, it can be shown that under reasonable
assumptions there is a constant C of moderate size such that
so from (4.20),
Therefore

|y(x_j) − y_j| ≤ K TOL ∫_a^b ||G(x_j, t)|| dt ≤ K TOL κ (4.21)
and the mode e^{−λx} essentially disappears. As a consequence, the step size will mainly
be determined by the dominating mode. It is sensible then to use the adaptive step
selection of the integrator only once, thus constructing a mesh, and compute the other
solution vectors on the same mesh. [Unexpectedly this supports the assumption made
for convenience of the analysis before, when obtaining (4.15).] Third, as most standard
integrators are designed for (usually stable, or at least not extremely unstable) IVPs,
great care has to be taken with respect to the stability of the numerical computation of
growing modes. In particular, methods for stiff IVPs are not necessarily good here.
For instance, if one uses the backward Euler scheme (cf. Section 2.7) then e^{λx} is
represented by a decaying discrete solution component if hλ > 2 (h the step size). This
may have very undesirable consequences, as some terminal condition may no longer
control this mode in the numerical approximation. An automatic stiff IVP integrator,
even with a relative step size control, proceeds with a very small step size, keeping
hλ < 2 for accuracy reasons. We return to this issue in Chapter 10.
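The growth-factor claim is easy to check directly; λ = 8 and the two step sizes are assumed sample values:

```python
# Discrete growth factor of backward Euler for y' = lam*y:
# y_{n+1} = y_n / (1 - h*lam), so the growing mode e^{lam*x} is represented
# by a *decaying* discrete component as soon as h*lam > 2.
lam = 8.0
for h in (0.0625, 0.375):
    g = 1.0 / (1.0 - h * lam)
    print(h, abs(g))   # |g| = 2.0 (growing), then |g| = 0.5 (decaying)
```

With h·λ = 0.5 the discrete mode grows, as it should; with h·λ = 3 it spuriously decays, so a terminal condition can no longer control it.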
Despite the negative tones in the above observations, (4.21) indicates that the
superposition method is usually in a satisfactory state regarding the effect of discretiza-
tion errors. The trouble is in roundoff error accumulation, which we treat next.
Consider the linear BVP (4.9a,b) and assume that it is well-conditioned (i. e. κ of
(3.34) is of moderate size). We shall see that accurate numerical solutions for such
problems can be obtained, e. g., by the finite difference techniques of Chapter 5; so, if
something goes wrong when one is using simple shooting, it is the method and not the
problem which is to be blamed for the difficulty.
Stability difficulties arise when the differential operator of (4.9a) contains rapidly
growing (and decaying) fundamental solution modes. Then the initial value solution is
very sensitive to small changes in the initial values s of (4.10). Thus, even if a numer-
ical approximation s^h is obtained for s such that

|s^h − s| = ε_M

where ε_M is the machine precision (this is the best we may hope for), the roundoff error
e(x) when forming the approximation to y(x) by (4.10) can be expected to grow like

|e(x)| ≈ ε_M ||Y(x)||, (4.22)

which may be acceptable relative to ||Y(x)|| but is not acceptable relative to |y(x)|!
Example 4.3
We make a slight modification to Example 4.2 to yield a symmetric coefficient matrix
A(x) = [[0, λ], [λ, 0]],   0 ≤ x ≤ 1

The fundamental solution modes for y1 are e^{±λx}, so the roundoff error when constructing
the solution is expected to grow like

(s − s^h) e^{λx}

Here

Y(x) = [[cosh λx, sinh λx], [sinh λx, cosh λx]]

Thus, as λx grows, Y(x) rapidly becomes large in norm and nearly singular. This is just
another manifestation of the presence of rapidly increasing and nonincreasing fundamental
modes.
Taking the inhomogeneous term and the boundary conditions as
0
q(x)= [
y1(0) = y1(1) = 0
cos 2 ttx +t cos 2nx
x — cos 2 irx
Y(x)= e M1)^- e - x
+ X sin 2^r
1+
so s1 = 0 and s2 = −1 if v(a) = 0 and λ is large.
computer with a 14-hexadecimal-digit mantissa. Under "e" the absolute errors in y1 are
listed. The errors in y2 are similar. Note that the error growth in the formed solution is
exponential in λ, as predicted. These results are clearly unacceptable.
Note that the matrix
Q = [[1, 0], [cosh λ, sinh λ]]
approximated by the numerical initial value solutions contains extremely large elements.
Indeed, one thing to beware of when using the shooting method in general is the possibility
of floating-point overflow in the course of initial value integration. This phenomenon cer-
tainly occurs in this example with λ = 500, say.
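The overflow is easy to reproduce; note that in IEEE double precision (an assumption here, unlike the hexadecimal machine cited above) the representable range ends near 1.8e308, so λ must be somewhat larger before cosh λ overflows:

```python
import math

# cosh(lam) grows like e^lam / 2; on IEEE doubles the overflow sets in
# around lam ~ 710, later than on machines with a smaller exponent range.
for lam in (50.0, 500.0, 800.0):
    try:
        print(lam, f"{math.cosh(lam):.3e}")
    except OverflowError:
        print(lam, "floating-point overflow")
```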
Suppose now that the BC are partially separated,

Ba = [[Ba1], [0]],   Bb = [[0], [Bb2]],   Ba1 ∈ R^{(n−v)×n},   Bb2 ∈ R^{v×n}.
Now, if there exists a mode that has grown much larger in magnitude than a comple-
mentary set of (n − 1) other modes at x = b, then Y(b) is nearly a matrix of rank 1 [just
like Y(1) of Example 4.3]. In fact, in such a case the effective (or numerical) rank of
Y(b) is 1, due to rounding errors. Thus, the effective rank of Bb2 Y(b) is also 1 and
that of

Q ≈ [[Ba1], [Bb2 Y(b)]]

is at most n − v + 1. So, if v > 1, the effective rank of Q is less than n, and we cannot
expect s^h to be calculated accurately!
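The numerical rank collapse is easy to observe with the fundamental solution of Example 4.3; λ = 400 is an assumed sample value:

```python
import numpy as np

# cosh and sinh of a large argument round to the same double, so the
# computed Y(b) of Example 4.3 is exactly a rank-1 matrix in floating point,
# although Y(b) is nonsingular in exact arithmetic.
lam = 400.0
Yb = np.array([[np.cosh(lam), np.sinh(lam)],
               [np.sinh(lam), np.cosh(lam)]])
print(np.linalg.matrix_rank(Yb))   # effective rank 1
```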
Example 4.4
Consider the problem
If A (x) varies with x, then even with one growing fundamental mode the
shooting method may be unable to reproduce s accurately in some cases, as the fol-
lowing example shows.
Example 4.5
Consider the ODE
y' = π [[1 − λ cos 2πx, 1 + λ sin 2πx], [−1 + λ sin 2πx, 1 + λ cos 2πx]] y,   0 < x < 1

with λ a constant, λ > 1. Here,

Y(x) = [[cos πx, sin πx], [−sin πx, cos πx]] diag(e^{(1−λ)πx}, e^{(1+λ)πx})
Thus, the upper right element of Y(x) is 0 at x = 1, but it is very large for λ = 50, say,
when x is slightly perturbed from 1. This component of Y^h(1), as well as its first column,
would not approximate the corresponding elements of Y(1) well, even relatively. Indeed,
with boundary matrices Ba = Bb = I, which yield a well-conditioned problem, the simple
shooting method fails to produce an s^h which accurately approximates s (see Example 4.9). ❑
For a BVP (4.9a,b) of order n, the superposition method calls for the integration of n+1
IVPs, namely, for n fundamental modes of (4.11) plus one particular solution of (4.12).
When the BC are partially separated, e. g. if n — v of the BC involve y(a) alone, for
some v < n , then it is possible to integrate only v + 1 ODEs, as described below. The
resulting procedure is called reduced superposition.
The reader may well ask at this point: Why waste time trying to improve upon the
efficiency of a method which has just been shown to be unstable in many cases? The
main answer is that similar considerations arise later on for methods which are designed
to improve upon the stability problems of single shooting.
Ba1 Y(a) = 0 (4.27b)

and s ∈ R^v.
The reason that we can write (4.26), in general, is that (4.25) (and the full rank of
Ba1) imply that v(x) should lie in a v-dimensional manifold, and a complementary solu-
tion space is found by superposition of the v columns of Y(x) which satisfy (4.27a,b).
Now, clearly y(x) of (4.26) satisfies the first n — v BC of (4.24). The parameters
s are then determined so that the remaining v BC are satisfied as well. Parallel to
(4.13), we have the system of order v
Q s = 0 (4.28a)
H Ba1^T = [[R], [0]] (4.29)
(cf. Section 2.4). Now, let Y (a) be the last v columns of H T . Then, transposing (4.29)
and considering the last v columns, (4.27b) is seen to be satisfied. Writing
H T =: [Y (a) I (a )] (4.30a)
and defining
Example 4.6
Consider the Dirichlet problem (4.1 a, b, c) again. When we described the shooting method
for this BVP in Section 4.1, only two IVPs were solved. Yet, when we convert to a first-
order system (of order n = 2) and apply the superposition method, three IVPs (= n + 1)
need to be integrated! This discrepancy is now resolved: in Section 4.1 we merely
showed a trivial case of reduced superposition. For y1 = u, y2 = u' we have from (4.1b, c)
It is not difficult to verify that the results obtained earlier for the full superposition
method with respect to discretization and roundoff errors remain qualitatively the same
for the reduced superposition method. The advantage of this method lies simply in
having fewer IVPs to integrate and a smaller Q to invert. The minor disadvantage is in
the added overhead of (4.29), (4.30).
As we saw in the previous section, a major disadvantage of the single shooting method
is the roundoff error accumulation, occurring when unstable IVPs have to be integrated.
A rough (but often approximately achievable) bound on this error is of the order

ε_M e^{L(b−a)}

where L is a bound on the growth rate of the fundamental modes. In order to reduce this
bound, it is natural to restrict the size of domains over which IVPs are integrated. Thus,
the interval of integration [a, b] is subdivided by a mesh

a = x_1 < x_2 < ... < x_{N+1} = b (4.31)
and then, as in shooting, initial value integrations are performed on each subinterval
[x; , x;+ 1 ], 1 <_ i <_ N (see Fig. 4.2). Recall that N integration steps are performed on the
one subinterval in Section 4.2.2. The resulting solution segments are patched up to
form a continuous solution over the entire interval [a, b]. This leads to the method of
multiple shooting.
[Fig. 4.2: the interval [a, b] subdivided by shooting points x_1, x_2, ..., x_{N+1}]
We consider again the general linear BVP (4.9a, b) and generalize the superposition
method (4.10)-(4.12) directly. On each subinterval [x_i, x_{i+1}], 1 ≤ i ≤ N, of the mesh
(4.31), we can express the general solution of (4.9a) as

y(x) = Y_i(x) s_i + v_i(x),   x_i ≤ x ≤ x_{i+1} (4.32)
If the matrices F i are chosen independently from each other, as in (4.35), then the
problems (4.33a, b) can be integrated in parallel for 1 <_ i <_ N. Indeed, the method is
sometimes called parallel shooting in that case. Other variants involve marching tech-
niques, discussed in the next section, where for each i, Fi+1 is determined only after
(4.33a, b) has been solved. If (4.35) holds then we call the method standard multiple
shooting. Note that Y_i(x) = Y(x; x_i) when F_i = I. It should be mentioned that Y_i(x)
here differs from Y_i in (4.16).
The nN parameters
s^T := (s_1^T, s_2^T, ..., s_N^T) (4.36)
are determined so that the (numerical) solution is continuous over the entire interval
[a,b] and so that the BC (4.9b) are satisfied. By using (4.32), the continuity (or
matching) conditions y(x_{i+1}⁻) = y(x_{i+1}), 1 ≤ i ≤ N − 1, are
[ Y_1(x_2)  −F_2                           ] [ s_1 ]   [ a_2 − v_1(x_2)         ]
[           Y_2(x_3)  −F_3                 ] [ s_2 ]   [ a_3 − v_2(x_3)         ]
[                ...        ...            ] [ ... ] = [ ...                    ]   (4.37b, c)
[              Y_{N−1}(x_N)  −F_N          ] [     ]   [ a_N − v_{N−1}(x_N)     ]
[ Ba F_1                      Bb Y_N(b)    ] [ s_N ]   [ β − Ba a_1 − Bb v_N(b) ]
Unlike single shooting, where after obtaining the initial values s the solution of
the BVP still has to be formed (recall the last step of the single shooting algorithm and
the discussion following), here the solution at x_i is given by

y(x_i) = F_i s_i + a_i

In particular, y(x_i) = s_i for the standard multiple shooting method.
As before, let us denote by Y_i^h and v_i^h the approximations to Y_i and v_i obtained by
using a numerical scheme. The corresponding solution to the linear system of alge-
braic equations (4.37b, c), denoted by s^h, yields approximate solution values
y^h(x_i), 1 ≤ i ≤ N. Thus, if we assume that the evaluation points (i. e. the points where
the solution is desired) are included in the mesh (4.31), the following algorithm is
obtained.
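For a concrete sketch, consider standard multiple shooting (F_i = I, a_i = 0) applied to the model problem u'' = λ²u, u(0) = 0, u(1) = 1, whose per-subinterval propagators are known in closed form; the model, λ, and the mesh are assumptions for illustration:

```python
import numpy as np

lam, N = 20.0, 10                    # sample stiffness and number of subintervals
x = np.linspace(0.0, 1.0, N + 1)
h = x[1] - x[0]

c, sh = np.cosh(lam * h), np.sinh(lam * h)
G = np.array([[c, sh / lam],         # exact propagator Y_i(x_{i+1}) for
              [lam * sh, c]])        # y1' = y2, y2' = lam^2 y1, with F_i = I

Ba = np.array([[1.0, 0.0], [0.0, 0.0]])     # encodes u(0) = 0
Bb = np.array([[0.0, 0.0], [1.0, 0.0]])     # encodes u(1) = 1
beta = np.array([0.0, 1.0])

A = np.zeros((2 * N, 2 * N))
rhs = np.zeros(2 * N)
for i in range(N - 1):               # matching rows of (4.37): G s_i - s_{i+1} = 0
    A[2*i:2*i+2, 2*i:2*i+2] = G
    A[2*i:2*i+2, 2*i+2:2*i+4] = -np.eye(2)
A[-2:, :2] = Ba                      # BC row: Ba s_1 + Bb G s_N = beta
A[-2:, -2:] = Bb @ G
rhs[-2:] = beta

s = np.linalg.solve(A, rhs).reshape(N, 2)   # s_i = y(x_i)
err = np.abs(s[:, 0] - np.sinh(lam * x[:N]) / np.sinh(lam))
print(err.max())                     # essentially machine precision
```

Here the propagators are exact, so the only error in s is the roundoff amplification of the sparse linear solve, which stays moderate even though e^λ is enormous.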
Example 4.7
Consider Example 1.8 of Section 1.2 (theoretical seismograms). For simplicity, we con-
centrate on the SH case. To recall, the BVP is given by (1.20a, b, c). Thus, the differential
equations are (with the notation of Example 1.8)
y1' = μ⁻¹ y2,   y2' = μν² y1,   0 < x < b

where

μ = ρβ²,   ν² = k² − ω²/β²
Here k and ω (horizontal wave number and angular frequency) are
known constants and ρ(x) and β(x) (density and velocity) are given functions generated
from measured data. The variable x is a space variable, pointing downward into the earth.
Let us assume that ν²(b) > 0. The BC are
and
β(x) = β_i,   ρ(x) = ρ_i,   x_i ≤ x < x_{i+1},   1 ≤ i ≤ N
Y_i(x) = [[cosh(ν_i(x − x_i)),  (μ_i ν_i)⁻¹ sinh(ν_i(x − x_i))],
          [μ_i ν_i sinh(ν_i(x − x_i)),  cosh(ν_i(x − x_i))]]

if k² − ω²/β_i² > 0, and

Y_i(x) = [[cos(ν_i(x − x_i)),  (μ_i ν_i)⁻¹ sin(ν_i(x − x_i))],
          [−μ_i ν_i sin(ν_i(x − x_i)),  cos(ν_i(x − x_i))]]

otherwise. Also v_i(x) ≡ 0, 1 ≤ i ≤ N, because the problem is homogeneous.
Thus, the exact solution for this particular model is obtained by solving the 2N×2N
system (4.37b, c), where each vector s_i has two components, each F_i is a 2×2 identity
matrix, the matrices Y_i(x_{i+1}) are given above,
Ba = [[0, 1], [0, 0]],   Bb = [[0, 0], [ν(b)μ(b), 1]],

and the right-hand side vector is (0, ..., 0, 1, 0)^T. The solution y(x) is given by
(4.32). ❑
4.3.2 Stability
A case like the simplified application in Example 4.7, where we know the exact solu-
tions of (4.33a, b) and (4.34a, b), is of course rather rare in practice. Usually, these
IVPs have to be integrated numerically as in the above algorithm. Thus we have, in
fact, a two-level discretization process: First, by a coarse mesh (4.31), and then by a
finer discretization in each subinterval for the numerical solution of the IVPs.
Consider now the resulting combined fine mesh for this two-level discretization
process, and suppose we applied single shooting (a one-level process) based on the
same fine mesh and using the same scheme for initial value integrations. It is not
difficult to see that, in the absence of any roundoff error, the single shooting method
would yield identical results to those of multiple shooting. Furthermore, since the main
concern in Section 4.2 was the instability of the single shooting method, it remains to
investigate how stability properties with respect to roundoff errors are improved with
the current method. For notational convenience, let us then assume in the sequel that
there is no discretization error, as in Example 4.7. Also assume that the initial value
integrations each use N̄ steps (as in Section 4.2.2), so the combined fine mesh of the
two-level multiple shooting process has

N̂ := N N̄ (4.38)

mesh points.
For simplicity, let us consider the standard multiple shooting method (F_i = I, a_i
= 0, for all i) and denote

Γ_i := Y(x_{i+1}; x_i),   1 ≤ i ≤ N (4.39a)
Thus, (4.37a) yields

s_{i+1} = Γ_i s_i + β̂_i,   1 ≤ i ≤ N − 1 (4.39b)

where β̂_i is the appropriate piece of β̂ (see (4.37)). Again we denote by s^h the com-
puted approximation of s, so the error in the solution y(x_i) is s_i^h − s_i, 1 ≤ i ≤ N. As
for single shooting, we assume that for each i there are matrices E_i with a small
norm and a vector c_i, |c_i| = 1, such that

−Γ_i(I + E_i)s_i^h + s_{i+1}^h = β̂_i^h := β̂_i + r_i c_i,   1 ≤ i ≤ N − 1 (4.40a)

Ba s_1^h + Bb Γ_N(I + E_N)s_N^h = β̂_N^h := β̂_N − Bb Γ_N E_N c_N (4.40b)
Thus, with the perturbation quantities

E^h := [ −Γ_1 E_1                          ]           [ c_1 ]
       [          −Γ_2 E_2                 ]    c^h := [ ... ]
       [                   ...             ]           [ c_N ]
       [          Bb Γ_N E_N               ]
(a) The inverse of A can be expressed in terms of Green's function (3.32), viz.
A⁻¹ = [ G(x_1, x_2)  ...  G(x_1, x_N)  Y_1(x_1) Q⁻¹ ]
      [     ...              ...            ...     ]   (4.42a)
      [ G(x_N, x_2)  ...  G(x_N, x_N)  Y_1(x_N) Q⁻¹ ]
Theorem 4.45  The standard multiple shooting method has a condition number
κ^h = N̂ κ K = cond(A) N̂. ❑
Thus, if the BVP is well-conditioned and if the shooting points are chosen so that K of
(4.43) is of moderate size (with N not extremely large), then the standard multiple
shooting method is stable, in the sense that there is a tolerable roundoff error
amplification. In contrast, with single shooting (N = 1) there are instances like Example
4.3 where K (and hence κ^h) becomes intolerably large.
Theorem 4.45 states that it is feasible to control the roundoff error accumulation by
using enough shooting points. Practically, the point is that given a well-conditioned
problem and a tolerance TOL which is a few orders of magnitude larger than the
machine precision ε_M, and choosing shooting points such that K is of moderate size,
then normally either

N̂ κ K ε_M < TOL
or N is too large for the method to be cost-effective anyway. The effect of using more
shooting points is demonstrated in the following example.
Example 4.8
Consider the BVP
y'(x) = [[1 − 19 cos 2x, 1 + 19 sin 2x], [−1 + 19 sin 2x, 1 + 19 cos 2x]] y(x) + q(x)

where q(x) is chosen such that the solution is y(x) = e^x (1, 1)^T. A fundamental solution is
given by

Y(x) = [[cos x, sin x], [−sin x, cos x]] diag(e^{−18x}, e^{20x})
and the problem is well-conditioned; in fact, κ ≈ 1. Suppose we have N equally spaced
shooting points. Then the stability of the multiple shooting system is quite well illustrated
by Table 4.2. The reader should realize that N = 1 actually gives single shooting, so the
large ||A|| ||A⁻¹|| is accounted for by the large ||A|| = ||Q||. It is also seen that the alge-
braic condition number ||A|| ||A⁻¹|| levels off already for quite moderate values of N. This
seems to indicate that not very many shooting points are necessary for this problem, even
if a very high accuracy is desired. ❑
TABLE 4.2 The effect of the number of shooting points on cond (A)
TOL / ε_M. That is, the IVP integration from a shooting point x_i proceeds until ||Y_i^h(x)||
gets large (≈ K), whereupon the integration stops and a new shooting point x_{i+1} is set.
However, taking K as large as possible in such a scheme (in an attempt to use as few
shooting points as possible) is not always the most efficient way to go. If an IVP
integration code with an absolute error control is used to solve a problem with
increasing modes, then it is not difficult to see that the steps tend to decrease. Thus,
longer shooting intervals may well be less desirable, as they require more integration
steps, hence more function evaluations and an increased total amount of computation.
On the other hand, one needs to restart the integration at each new shooting point, as
well as to solve (4.37b, c), so some compromise is necessary.
Other factors are involved as well: Consider the question of evaluating the com-
puted solution. If there are many output points, where the BVP solution value is
desired (say, for graphing or tabulating purposes), then incorporating them all in the
mesh (4.31), as in the standard multiple shooting algorithm, may be very wasteful.
(There may even be more such evaluation points than points in the fine shooting mesh.)
In that case, another solution evaluation procedure is necessary. One possibility is to
integrate, for a given output point x̄, x_i < x̄ ≤ x_{i+1}, the IVP
which is similar to, but much more stable than, step 5 of the single shooting algorithm.
Another possibility is to interpolate y^h(x) from the given values y^h(x_i) = s_i,
i = 1, ..., N. This implies different, additional criteria for choosing the shooting
points, viz., so that accurate interpolation can be made possible. Neither of these two
alternatives is easily automated.
One important asset of the multiple shooting method is that the initial value
integrations in different subintervals are independent of each other. Hence, they may be
performed concurrently, if parallel computing is available. No such parallel computa-
tion can be done with the marching techniques of Section 4.4. But to utilize this paral-
lelism feature, the shooting points have to be fixed in advance. An appropriate pro-
cedure in such circumstances is to first subdivide the interval [a , b ] coarsely, say
uniformly. Then, on each subinterval, a marching technique can be used.
An application where multiple shooting is both natural and convenient is in prob-
lems where the solution has jumps at certain prescribed points. A typical example is an
ODE
u"=f(x,u,u')
with jump discontinuities in u'(x). Such is the case in some structural mechanics prob-
lems with layered material or cracks. Considering a conversion to a first-order system
in (u , u'), if the jump in u'(x) is known, then one should fix a shooting point there and
apply the matching taking this jump into account. The only difference compared to the
continuous case is a change in the right-hand side vector β̂.
||Γ_i|| ≈ e^{Λ(x_{i+1} − x_i)}

and to keep K of (4.43) below some value M (say M = 10^7) we must choose
h := max_{1≤i≤N} (x_{i+1} − x_i) such that e^{Λh} ≤ M.
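A quick back-of-envelope computation: assuming the dominant mode grows like e^{Λx}, keeping the per-interval growth below M requires h ≤ ln(M)/Λ. The values of Λ and M below are assumed for illustration:

```python
import math

# h <= ln(M)/Lambda, so at least ceil(Lambda*(b - a)/ln(M)) shooting
# intervals are needed to keep the per-interval growth K below M.
Lambda, M, a, b = 100.0, 1e7, 0.0, 1.0
h_max = math.log(M) / Lambda
N = math.ceil((b - a) / h_max)
print(h_max, N)   # about 0.161, so at least 7 shooting intervals
```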
4.3.4 Compactification
In describing the multiple shooting method in Section 4.3.1, we have not discussed the
important question of how to solve the linear system (4.37b, c). The corresponding step
in the single shooting method requires no special attention, because Q is only n x n and
is dense. Thus, the usual Gaussian elimination method (Section 2.3) is adequate. In
(4.37b,c), on the other hand, only 2n²N out of n²N² elements of A are possibly
nonzero. Thus, A is sparse when N is large, and a special scheme is needed to ensure
its efficient storage and decomposition.
We defer the general treatment of this question to Chapter 7, because other BVP
algorithms also lead to linear systems with a similar structure. An exception is made
for two schemes, one discussed in Section 4.4 and the other, called the compactification
or condensing method, discussed next.
The idea is dangerously appealing: Use the recursion relation (4.37a) to express
(by repeated application for i = N − 1, N − 2, ..., 1) s_N in terms of s_1; substitute in the
BC and obtain an n×n system of linear equations for s_1; solve for s_1; finally, use
(4.37a) again (this time forward, i = 1, ..., N − 1) to obtain s. This algorithm cer-
tainly avoids any need for sophisticated storage schemes for A! We express it, for the
standard multiple shooting scheme, as a superposition method for the recursion (4.39b).
Thus, we compute a particular solution {φ_i}_{i=1}^{N+1} and a fundamental solution {Φ_i}_{i=1}^{N+1}
by
φ_{i+1} = Γ_i φ_i + v_i(x_{i+1}),   i = 1, ..., N (4.46a)
φ_1 = 0 (4.46b)
Φ_{i+1} = Γ_i Φ_i,   i = 1, ..., N (4.47a)
Φ_1 = I (4.47b)
and form
Alas, the compactification algorithm suffers from instability, much like the single
shooting method. To see this, consider the case where there is no discretization error,
as in Example 4.7. Then
and the matrix of (4.48b) is precisely Q of (4.13b). The solution formation by (4.48a)
is equivalent to (4.10). When discretization errors are present as well, there is a similar
error buildup in both {φ_i} and {Φ_i}. But, just as for single shooting, we may not
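The equivalence with single shooting is easy to check numerically: with exact per-interval propagators Γ_i, the fundamental recursion (4.47) reproduces the single-shooting fundamental solution Y(b), huge entries included. The model y'' = λ²y and the mesh are assumptions for illustration:

```python
import numpy as np

lam, N = 20.0, 10
h = 1.0 / N
c, sh = np.cosh(lam * h), np.sinh(lam * h)
G = np.array([[c, sh / lam], [lam * sh, c]])   # Gamma_i, identical on each piece

Phi = np.eye(2)                                # Phi_1 = I, (4.47b)
for _ in range(N):
    Phi = G @ Phi                              # Phi_{i+1} = Gamma_i Phi_i, (4.47a)

Yb = np.array([[np.cosh(lam), np.sinh(lam) / lam],
               [lam * np.sinh(lam), np.cosh(lam)]])   # single-shooting Y(b)
print(np.max(np.abs(Phi - Yb) / np.abs(Yb)))   # agreement to near machine precision
```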
Example 4.9
In Table 4.3 we list some results comparing the performance of simple shooting (SH), mul-
tiple shooting with Gaussian elimination and row-partial pivoting (MSH) and multiple
shooting with compactification (MSC). The initial value solver, tolerances, and computer
used are the same as in Example 4.3. The errors for the first solution component are listed
for three problems:
y1(0) + y2(0) = y1(1) + y2(1) = 1
whose solution is
1 '+a>n)eu -a)
y1(x) = e Zxa_1 ( (1+e(i-xrx)eo +x)n(x -n sin xx—(l +e~ ( ' cos nx 1
For MSH and MSC, 10 uniformly spaced shooting points were used. Note that s (or
s_1) is approximated well by all three methods for problems (1) and (3), whereas for
problem (2) only MSH does a good job. In general, the full multiple shooting method
performs well, whereas the other methods do not. The compactification algorithm fails,
just like simple shooting, to approximate s_1 well for problem (2). ❑
Despite this incriminating evidence, we should realize that MSC is not just SH.
In particular, the IVP integrations are done in a safer way. This point becomes more
important in nonlinear problems, as we shall see in Section 4.6.
The following two sections, Section 4.4 and Section 4.5, need a more involved
notation and use somewhat more sophisticated mathematical tools. Those readers who
are interested in an overview of initial value methods may want to read Section 4.6 first,
but it is advisable to come back: The next section contains important practical material,
including the numerical algorithm we most recommend among those discussed in this chapter.
Let us consider again the solution of the BVP (4.9a, b) by multiple shooting. To recall,
a marching technique for multiple shooting is one where the IVPs (4.33a, b)
are solved consecutively, with the initial matrix F_{i+1} for the (i + 1)st subinterval being
formed only after the IVP for the ith subinterval has been solved.
The motivation behind such a scheme is to construct not only F_{i+1}, but also the
next shooting point x_{i+1}, using the following reasoning: In order to make a multiple
shooting algorithm stable, it is necessary to monitor the growth of the fundamental solu-
tion modes to decide where the next shooting point should be. Since these basis func-
tions tend to resemble more and more the most dominant one, they become more and
more numerically linearly dependent as the integration interval gets larger (see Exam-
ple 2.5, for instance). Thus, when some tolerance of dependence is exceeded, a new
shooting point x_{i+1} should be set and the columns of Y_i(x) should be reorthogonalized
to restore their linear independence.
4.4.1 Reorthogonalization
We start integrating the IVPs (4.33a, b) and (4.34a, b) for i = 1, assuming that F_1 is an
orthogonal matrix. For the general step i, i = 1, ..., N, we begin the IVP integrations
with an orthogonal matrix F_i. Then, at the next shooting point x_{i+1}, we perform a QU-
decomposition of Y_i(x_{i+1}), i.e.,

    Y_i(x_{i+1}) = F_{i+1} Γ_i

with F_{i+1} orthogonal and Γ_i upper triangular (see Section 2.4). Regarding the particular
solution v_i(x) of (4.34), we do not impose any special requirements on α_i at this point
(so, e.g., take α_i = 0).
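In floating point the QU-decomposition above is just a QR factorization. A minimal Python/NumPy sketch (the matrix Y is an invented stand-in for Y_i(x_{i+1}) with nearly dependent columns), with the signs fixed so that the triangular factor has a nonnegative diagonal:

```python
import numpy as np

# Stand-in for Y_i(x_{i+1}): both columns are dominated by the same growing
# mode, so they are nearly linearly dependent.
Y = np.array([[1.0e6, 1.0e6 + 1.0],
              [2.0e6, 2.0e6 + 3.0]])

# Householder QR: Y = F * Gamma with F orthogonal, Gamma upper triangular.
F, Gamma = np.linalg.qr(Y)

# Normalize so that diag(Gamma) >= 0 (QR is unique only up to column signs).
signs = np.sign(np.diag(Gamma))
signs[signs == 0] = 1.0
F = F * signs                    # flip the corresponding columns of F
Gamma = signs[:, None] * Gamma   # flip the rows of Gamma

orth_err = np.linalg.norm(F.T @ F - np.eye(2))
recon_err = np.linalg.norm(F @ Gamma - Y)
```

Here F plays the role of the reorthogonalized initial matrix F_{i+1}, and Gamma the role of the incremental matrix Γ_i.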
Consider next the recursion relation (4.37a). Defining ŝ_i := F_i^T s_i and a correspondingly
transformed right-hand side r̂, we find

    ŝ_{i+1} = Γ_i ŝ_i + r̂_i                                      (4.51a)

and, collecting all the relations,

    A ŝ = r̂                                                      (4.51b)

where

        [ −Γ_1     I                                ]
        [       −Γ_2     I                          ]
    A = [              .       .                    ]            (4.51c)
        [                 −Γ_{N−1}     I            ]
        [ B_a F_1               B_b F_{N+1} Γ_N     ]

Note that A and r̂ here correspond to, but are not the same as, A and r of Sec-
tion 4.3. A skeptical reader may well ask at this point "so what?": A of (4.51c) does
not seem to be very different from A of (4.37b, c). The gain in using (4.51) is
explained next.
Let us examine the connection between Γ_i here and the fundamental solutions (4.38) of
the standard multiple shooting method. To allow a simpler notation, let us again assume
for the time being that no discretization errors occur. We have

    Y(x_{i+1}; x_i) = Y_i(x_{i+1}) [Y_i(x_i)]^{−1} = F_{i+1} Γ_i F_i^T        (F_i^{−1} = F_i^T)

so

    Y(x_{i+1}; a) = Y(x_{i+1}; x_i) Y(x_i; x_{i−1}) ⋯ Y(x_2; a) = F_{i+1} Γ_i Γ_{i−1} ⋯ Γ_1 F_1^T
    (Q^{21} | Q^{22}) M^1 = 0,    (Q^{21} | Q^{22})(M^1 + M^2) = R^{22}

Hence (Q^{21} | Q^{22}) M^2 = R^{22}, and the proof is completed using the 2-norm preserving pro-
perty of the orthogonal Q. ❑
implies that the right lower (n − k) × (n − k) block of Γ_i has a norm of the same order
of magnitude as the decaying modes. [This is considered more precisely in Theorem
6.15.]
This means that in (4.51a) we have effectively decoupled the recursion (4.37a),
where the decoupling corresponds to the splitting of growing and decaying modes. If
we write

    Γ_i = [ D_i   C_i ]                                          (4.54)
          [ 0     E_i ]

    ĝ_i = ( ĝ_i^1 ),    ĝ_i^1 ∈ R^k                              (4.55)
          ( ĝ_i^2 )
2. FOR i = 1, 2, ..., N DO
       ĝ_{i+1}^2 := E_i ĝ_i^2 + α_i^2
3. FOR i = N, N − 1, ..., 1 DO
       compute the remaining components ĝ_i^1, Φ_i^{11}, Φ_i^{12} by back substitution
       with the upper triangular D_i
4. M := B_a F_1 Φ_1 + B_b F_{N+1} Φ_{N+1};
   t := β − B_a (α_1 + F_1 ĝ_1) − B_b (F_{N+1} ĝ_{N+1} + v_N(b))
5. Solve M ω = t (an n × n linear system).
6. FOR i = 1, ..., N DO
       y(x_i) := F_i Φ_i ω + F_i ĝ_i + α_i ❑
The lower components of these quantities are calculated in step 2 (after an initialization in
step 1). The rest are calculated in step 3. Note that, since D_i is upper triangular, the
multiplications by D_i^{−1} in this step amount to back substitution. Then steps 4 and 5 are
the equivalent of (4.48b). Step 6 is a combination of the formulae for recovering y(x_i)
from the decoupled quantities. We now show that this algorithm is indeed much more stable
than the previous compactification algorithm (4.46)-(4.48).
Theorem 4.58 If the BVP (4.9a, b) has a conditioning constant κ [cf. (3.34)],
then the decoupling algorithm has a condition number κ_h ≈ 2κN [cf. (4.38), (4.43),
(4.44)].
Proof: First, let us note that the quantities involved in (4.56) are perturbed by
O(Nκε). This and step 3 of the algorithm leave us to show that M^{−1} is bounded essen-
tially by κ. The discussion preceding the algorithm shows that ||Φ_i|| and |ĝ_i| are
bounded in terms of κ. Since the F_i are orthogonal, the quantities appearing
in step 6 and in M are bounded by constants of moderate size as well (in Section 6.2
we give more precise bounds). So we are left to show that ||M^{−1}|| is bounded, because
this will imply that ω can be calculated stably. Within numerical precision we may
assume that

    ||Ψ_j M^{−1}||_2 ≤ κ,    j = 1, ..., N + 1

Hence, using again the orthogonality of F_j,

    ||(0 | I_{n−k}) M^{−1}||_2 ≤ ||Ψ_1 M^{−1}||_2 ≤ κ
Example 4.10
Consider the problem from Example 4.8 again. Using the routine MUSL from Appendix A
(where stable compactification is employed) with TOL = 10^{−4} and κ = max_i ||Γ_i||_2 ≤ 10^3, we
obtain the results listed in Table 4.4a, where e denotes the error in the max norm.

TABLE 4.4a
    x_i       e
    .0        .19−8

Consider next the ODE

    y' = [  ψ(x)      0    ] y + [ (1 − ψ(x)) e^x ]
         [ 2ψ(x)   −ψ(x)   ]     [      2e^x      ]

where ψ(x) = 20 sin x + 20x cos x, and the BC

    y(0) + y(T) = ( 1 ) (1 + e^T),    T > 0
                  ( 2 )

This BVP has the solution y(x) = (1, 2)^T e^x, although the ODE has modes growing like
e^{20x sin x} and e^{−20x sin x}. Thus, we only have a monotonic behaviour of these modes
(dichotomy with a dichotomy constant ≈ 1) on [0, L], L = 2.029. For x > L the increasing
mode starts to decay, and vice versa.
TABLE 4.4b  Incremental matrix and errors for Example 4.10, K = 10^3
    x_i       e
    0         .19−8
Here the shooting points are 0.1 apart. For T = 2 we see the (common) slight overkill of
a multiple shooting approach (i.e., the errors are smaller than the tolerance). For T = 3
we see a dramatic loss of accuracy, which is predicted by the estimate of the conditioning
constant.
One of the questions left to be dealt with, if we want to use this algorithm, is to
find an appropriate initial matrix F_1. Recall that the first k columns of F_1 have to span
a subspace that generates the growing solutions (assuming that we have an exponential
dichotomy). We will pursue this question for the case when the BC are separated, or
partially separated, and leave more general cases to Section 6.4.3. We are now ready to
generalize the reduced superposition method of Section 4.2.4.
Let us consider again the ODE (4.9a) subject to the partially separated BC (4.24),

    y' = A(x) y + q(x)

    [ B_a1 ] y(a) + [  0   ] y(b) = ( β_1 )
    [ B_a2 ]        [ B_b2 ]        ( β_2 )
We proceed to generalize the reduced superposition method into a multiple shooting set-
ting. Given a mesh (4.31) of shooting points, the solution y(x) is expressed in the form

    y(x) = Y_i(x) s_i + v_i(x),    x_i ≤ x ≤ x_{i+1},   1 ≤ i ≤ N        (4.59)

where each Y_i(x) is now an n × v, rather than n × n, matrix of fundamental solutions,
satisfying

    B_a1 v_1(a) = β_1                                            (4.61a)

    B_a1 F_1 = 0                                                 (4.61b)
How this choice is implemented has already been discussed [see (4.29), (4.30)1.
TABLE 4.5 Errors and conditioning constant estimates for Example 4.11
range(Y_{i+1}(x_{i+1})) = range(Y_i(x_{i+1})), and that the new particular solution v_{i+1}(x) lies in
range(Y_i(x_{i+1})) + v_i(x_{i+1}). This is done as follows: First, we orthogonalize the
columns of Y_i(x_{i+1}) using, say, Householder transformations. This gives

    Y_i(x_{i+1}) = F_{i+1} Γ_i                                   (4.62)
The relations (4.64a), (4.61c) give a linear system of order vN (not nN) for
s = (s_1^T, ..., s_N^T)^T,

    A s = r                                                      (4.64b)

where

        [ −Γ_1     I                                 ]
        [       −Γ_2     I                           ]
    A = [              .       .                     ]           (4.64c)
        [                 −Γ_{N−1}     I             ]
        [ B_a2 F_1              B_b2 F_{N+1} Γ_N     ]

The construction of the right-hand-side vector is straightforward.
In (4.64) we recognize a system like (4.37) and (4.51), but of a reduced order.
Hence, this implementation is more efficient with respect to the number of IVPs to be
integrated and with respect to the algebraic system to be solved. The latter is even
simpler if the BC are completely separated, i. e.,
B_a2 = 0
Then A is a simple (block) upper triangular matrix, suggesting a computation of s by
(block) back-substitution. This, however, is just another application of the com-
pactification algorithm, and would be judged unstable (in general), unless it happens
that this is a special case of the decoupling algorithm of the previous section.
The latter, in fact, turns out to be the case. Indeed, the orthogonalization process
here may be viewed as a special case of the process described in Section 4.4.1. Moreover,
Theorem 3.107 implies that, if the BVP is well-conditioned, then Y_1(x_1) must generate
the subspace of growing solution modes. Hence the relevant part of the decoupling algorithm
and the analysis of the previous section (and Section 6.2) shows that it is stable. We
summarize these results in an algorithm and a theorem.
Remark It is possible to show that Theorem 4.58 holds also for the stabilized
march algorithm. [The proof is slightly more complicated than that shown for Theorem
4.58.] Thus, the method is stable in the sense that it yields, under very reasonable
assumptions, agreeable roundoff error amplification. ❑
Note that in step 3 of the algorithm we need to solve N linear systems of order v,
but only B_b2 F_{N+1} Γ_N is a possibly full matrix to be inverted. The other matrices
encountered, the Γ_i, are all upper triangular, so this step is very fast. Most of the computa-
tional effort in this algorithm goes into the IVP integrations. The second largest
expense is the orthogonalizations.
Note that in the above algorithm, we have assumed that the mesh (4.31) is fixed
in advance. In practice, this algorithm should be modified so that the shooting points
are chosen adaptively. One way to do that is to monitor the linear dependence of the
columns of Y_i(x). (Instead, the growth of ||Y_i(x)||, i.e., the growth of ||D_i||, could be
monitored, essentially as discussed previously.) Another way is to use the IVP
integrator only for a fixed (small) number of steps, letting the next shooting point be
determined essentially by the (aggregated) adaptive step selection of the integrator.
This introduces possibly more shooting points, but the integrator may perform better,
using fewer steps overall.
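The growth-monitoring variant can be sketched as follows (Python; a scalar dominant mode e^{λx} stands in for the monitored quantity ||Y_i(x)||, and the values of λ, K, and the step size h are illustrative choices, not from the text):

```python
import numpy as np

lam = 20.0      # growth rate of the dominant mode
K = 1000.0      # growth threshold ("tolerance of dependence")
h = 0.01        # integrator step size

points = [0.0]  # shooting points, chosen adaptively
growth = 1.0    # growth of the fundamental mode since the last shooting point
x = 0.0
while x < 1.0 - 1e-12:
    x += h
    growth *= np.exp(lam * h)          # one integration step
    if growth >= K:                    # dependence tolerance exceeded:
        points.append(round(x, 10))    # set a new shooting point and
        growth = 1.0                   # reorthogonalize (reset the local scale)
points.append(1.0)
```

With these numbers a new shooting point appears whenever the local growth reaches about e^7, i.e., roughly every 0.35 units of x.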
The assumption of separated BC is crucial for Theorem 4.58 to hold for the
method of stabilized march [because it relies on Theorem 3.107]. If B_a2 ≠ 0 then examples
may be constructed where the compactification algorithm is unstable; for this, see
Exercise 13 and a related discussion in Section 6.4.2.
The last in our series of initial value methods for solving linear BVPs is related to what,
for historical reasons, has been called invariant imbedding. A more adequate name for
the method which we consider is the Riccati method. A feature common to all of the
previous methods is that the discretization was first applied to the original BVP
(4.9a, b), and only then some transformations were applied to resolve stability issues. In
contrast, here the ODEs are transformed first and only then discretization is done. The
hope is that the transformed system of ODEs is such that the IVPs are stable and hence
can be stably and efficiently integrated.
In what follows, we first show how one can derive an invariant imbedding
method. Then we give a general treatment of the Riccati method.
0<x <b
where the upper dot denotes differentiation with respect to b and all quantities are
evaluated at b, unless otherwise specified. We now subject (4.66b) to a similar treat-
ment and argue that the obtained coefficients of y 2 (0) and y 1 (x) at x = b should equal
zero. This yields the following ODEs for b > 0:
    ṡ = (a_22 − R a_12) s                                        (4.67d)

The way in which these ODEs are written is in anticipation of the more general case.
The quantities appearing in (4.67) are all functions of our new independent variable, b.
The equations (4.66) also furnish us with initial values for (4.67a, b, c, d). We get
Example 4.12
If we take

    A(x) = [ 0   1 ]
           [ 1   0 ]

so
Note that the transformed IVP in this example is stable, even though the original ODE is
not. Its kinematic eigenvalues are the eigenvalues of A, which are ±1. ❑
Let us consider now the ideas introduced above in a more general setting, which will
enable us to attack a general BVP of the form (4.9a, b) and to obtain solution values
also at intermediate points x.
For a given ODE
y'=A(x)y+q(x)
let T(x) be a linear transformation of the form

    T(x) = [  I     0 ]
           [ R(x)   I ]                                          (4.68)

where R(x) is an (n − k) × k matrix, to be defined later on. Define
    w(x) := T^{−1}(x) y(x)                                       (4.69)

Then w(x) satisfies

    w' = U(x) w + g(x)                                           (4.70a)

where U(x) (and T(x)) satisfy the Lyapunov equations [cf. (3.56)-(3.58)]

    T' = AT − TU                                                 (4.70b)

and

    g(x) := T^{−1}(x) q(x)                                       (4.70c)
Now, we require that U be "block upper triangular" in the sense that its lower left
block, corresponding in dimension to R (x), is zero. Thus, we introduce decoupling.
We write

    U(x) = [ U^{11}   U^{12} ]                                   (4.71)
           [ U^{21}   U^{22} ]

with U^{11} and U^{22} square submatrices, and use a similar partition notation for other
matrices and vectors in the sequel. Our requirement on U(x) is that U^{21}(x) ≡ 0. This
yields, upon substituting in (4.70b), the Riccati differential equations for R(x),

    R' = A^{21} + A^{22} R − R A^{11} − R A^{12} R               (4.72)

and the block form of U(x),
Let us now describe how to choose the end values R(a), w^2(a), and w^1(b) needed
in the above outline, for the case of separated BC. Thus, let (4.9b) read
and so steps 1 and 2 of the method can be readily applied. After finding R(b) and
w^2(b), it is not difficult to find w^1(b) from (4.75a) and apply step 3.
Before proceeding to discuss the properties of this method, let us show how it
corresponds to the invariant imbedding method described previously for the BVP
(4.65a, b, c). This BVP is, of course, an instance of the BVP (4.9), (4.75) with n = 2,
k = 1, A^{11} = a_{11}, C = 0, D = 1. Thus, the Riccati equations (4.67c) and (4.72) coincide.
Moreover, (4.76a) yields R(a) = 0 here, so R(x) = r(x). Further, we identify (4.67d)
with (4.74b), which is integrated forward, i.e., w^2(x) = c_1 s(x). Finally, to integrate
(4.74a) backward, consider the independent variable t := b − x, yielding
We now discuss the properties of the Riccati method, outlined above, under the assump-
tion that the BC are separated, i.e., that (4.75) holds. Some disadvantages are apparent in
(4.72): We have to solve a nonlinear matrix differential problem. This is potentially
expensive or even infeasible, i.e., it is possible that the IVP (4.72), (4.76a) simply
blows up at some c, a < c < b, as discussed further below. The question we address
next is whether the method offers any advantages. The answer is positive: The insta-
bility problem which was most haunting with the single shooting method (and even
with multiple shooting) is resolved here. Let us indicate why.
The method analysis is analogous to that of the stabilized march in Section 4.4.3.
Using Theorem 3.107 we conclude that, provided the BVP is well-conditioned, the first
k columns of T(a) should induce initial values of increasing modes only. Hence the choice
(4.76a) not only gives satisfactory initial values for w^2(a) in (4.76b) but also, in prin-
ciple, a stable algorithm, in that the kinematic eigenvalues of U^{11}(x) can be expected to
have positive real parts.
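This stabilization can be checked directly on a constant-coefficient model (Python with SciPy; the 2 × 2 system with k = 1, so that R is scalar, and the numbers λ = 20, A^{21} = 5 are our own choices). Although A has a mode growing like e^{20x}, the Riccati IVP itself is stable and R stays bounded:

```python
import numpy as np
from scipy.integrate import solve_ivp

# A = [[lam, 0], [gam, -lam]]: one mode e^{+lam x}, one mode e^{-lam x}.
lam, gam = 20.0, 5.0
A11, A12, A21, A22 = lam, 0.0, gam, -lam

def riccati(x, R):
    # Scalar form of (4.72): R' = A21 + A22 R - R A11 - R A12 R
    return A21 + A22 * R - R * A11 - R * A12 * R

sol = solve_ivp(riccati, (0.0, 1.0), [0.0], rtol=1e-8, atol=1e-10)
R_end = sol.y[0, -1]

# Closed form for this constant-coefficient case: R' = gam - 2 lam R, R(0) = 0
R_exact = gam / (2 * lam) * (1.0 - np.exp(-2.0 * lam))
R_max = float(np.max(np.abs(sol.y[0])))
```

The kinematic argument above is visible here: the R-equation decays like e^{−2λx}, so integrating it forward is stable even though the underlying ODE is not.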
The first k columns of the transformation matrix T (x) of (4.68) should maintain
the directions of the increasing modes. However, from a geometrical point of view,
considering possible rotational activity of these solution directions, it is not clear
whether or not this is true. To analyze this, let us consider a fundamental solution for
(4.9a)
    Y(x) = [ Y^{11}(x)   Y^{12}(x) ]
           [ Y^{21}(x)   Y^{22}(x) ]

such that

    range ( Y^{11}(a) )  =  range (  I_k  )
          ( Y^{21}(a) )          ( R(a)  )
Thus, we link the stability question of the method with the feasibility of integrating the
nonlinear Riccati equation (4.72), starting with (4.76a). If difficulties arise, this would
become apparent when integrating for R (x).
Example 4.13
Consider the ODE (cf. Example 3.12)

    y' = [ −λ cos 2ωx       ω + λ sin 2ωx ] y,    0 < x < π
         [ −ω + λ sin 2ωx   λ cos 2ωx     ]

whose general solution we write as y(x) = Y(x)(α, β)^T.
Now, considering (4.78b) (all quantities appearing there are now scalar), we realize
that the solution of (4.72), (4.76a) is

    R(x) = (α cos ωx e^{λx} + β sin ωx e^{−λx}) / (α sin ωx e^{λx} − β cos ωx e^{−λx})

It is simple to check that α ≠ 0 is necessary for a well-conditioned problem. Hence R(x)
has poles at all points x for which

    β/α = e^{2λx} tan ωx
From this expression we see that O(ω) restarts (or changes in imbedding) are needed for
the solution of the Riccati equation. Away from the restart points, however, the integration
should be fairly easy, as R(x) is smooth there. More importantly, the integration of R(x) is
hardly affected by λ, regardless of its size. Thus, the Riccati method may overcome fast
growth and decay of modes, but at the same time it could be plagued by rotational activity.
It is intriguing to note that the latter observation is contrary to the situation with
multiple shooting techniques, even though the decoupling principle is so similar here and
in the previous section. Let us consider two cases. If the rotation rate ω is large while λ
is moderate, the only effect on shooting techniques is that O(ω) steps are needed in the IVP
integrations. The single shooting method may even be applied.
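The pole count is easy to verify numerically (Python; α = β = 1, λ = 5, ω = 10 are illustrative values). Each sign change of the denominator of R(x) is a pole, i.e., a point where the imbedding must be restarted:

```python
import numpy as np

alpha, beta, lam, omega = 1.0, 1.0, 5.0, 10.0
x = np.linspace(0.0, np.pi, 20001)

# Denominator of R(x) from the closed form above; its zeros are the poles.
den = (alpha * np.sin(omega * x) * np.exp(lam * x)
       - beta * np.cos(omega * x) * np.exp(-lam * x))
num_poles = int(np.count_nonzero(np.sign(den[:-1]) != np.sign(den[1:])))
```

The zeros satisfy tan ωx = (β/α) e^{−2λx}, one per period of tan ωx, so about ω poles occur on [0, π] no matter how large λ is.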
The Riccati method is one of a larger class of methods which solve IVPs for
transformed problems. Some of its desirable properties are its basic simplicity and the
stability of (4.72), (4.74a, b). We defer a discussion of the latter to Section 10.4.2.
In this section we return to the general nonlinear BVP (4.7a, b), briefly considered
before in Section 4.1.3. Thus, the problem to be solved is
y' = f(x , y) a <x <b (4.79a)
Let us recall (4.7), (4.8). For any initial vector s, let y(x; s) satisfy

    y(a; s) = s                                                  (4.80b)

The required s = s* is that for which the BC (4.79b) are satisfied, i.e.,

    g(s*, y(b; s*)) = 0                                          (4.81)

From Theorem 3.16, the existence of isolated solutions for the BVP (4.79) corresponds
to the existence of simple roots of (4.81).
In general s* is not known, and therefore we may try to find it by an iterative
scheme like Newton's method. We postpone a thorough treatment of methods for
solving nonlinear equations to Chapter 8. Here we only briefly consider the basic
Newton algorithm to obtain a solution of
where
Recall that, given an initial guess s^0, Newton's method generates a sequence of
iterates s^1, s^2, ..., s^m, ..., hopefully converging to a solution s* of (4.82a), by

    s^{m+1} := s^m + ξ^m                                         (4.83a)

where
    B_a = ∂g(u, v)/∂u,    B_b = ∂g(u, v)/∂v,    evaluated at u = s, v = y(b; s)      (4.86)
In general, Ba , Bb, and Y (x) all depend on s. Also, note that in actual computation
y(x; s) and Y (x) are replaced by their numerical approximations obtained by solving the
initial value problems.
Examining (4.83)-(4.85), it is apparent that the nonlinear problem solution proceeds
by solving a sequence of linear problems by the method of superposition, as discussed
in Section 4.2.
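A compact single-shooting Newton iteration in this spirit (Python with SciPy), for a model problem of our own choosing, u'' = e^u, u(0) = u(1) = 0; the variational equations, cf. (4.87), are integrated alongside the IVP to supply the Newton derivative:

```python
import numpy as np
from scipy.integrate import solve_ivp

# State z = (u, u', p, q), where (p, q) solves the variational equations:
# p' = q, q' = e^u p, (p, q)(0) = (0, 1), so that p(1) = dF/ds.
def rhs(x, z):
    u, v, p, q = z
    return [v, np.exp(u), q, np.exp(u) * p]

def F_and_dF(s):
    sol = solve_ivp(rhs, (0.0, 1.0), [0.0, s, 0.0, 1.0],
                    rtol=1e-12, atol=1e-14)
    u1, _, p1, _ = sol.y[:, -1]
    return u1, p1          # F(s) = u(1; s) and F'(s)

s, iters = 0.0, 0          # shooting parameter s = u'(0)
F, dF = F_and_dF(s)
while abs(F) > 1e-9 and iters < 20:
    s -= F / dF            # Newton step, cf. (4.83)
    F, dF = F_and_dF(s)
    iters += 1
```

For this mildly nonlinear problem Newton converges in a handful of iterations to s = u'(0) ≈ −0.46; no convergence difficulties arise, much as in Example 4.15.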
Example 4.14
Recall the seismic ray tracing problem, presented as Example 1.7 in Section 1.2. In the
notation of (4.79), but with τ as the independent variable and x, y, z among the dependent
variables, we have
    y = (x, y, z, ξ, η, ζ, T, S)^T

    f = (Svξ, Svη, Svζ, Su_x, Su_y, Su_z, Su, 0)^T

We solve

    y' = f,    y(0) = s

for y = y(τ; s) and define
          [ 1       0       0       0      0      0      0   0 ]
          [ 0       1       0       0      0      0      0   0 ]
          [ 0       0       1       0      0      0      0   0 ]
    D_0 = [ 0       0       0       0      0      0      1   0 ]
          [ −2uu_x  −2uu_y  −2uu_z  2ξ(0)  2η(0)  2ζ(0)  0   0 ]
          [ 0       0       0       0      0      0      0   0 ]
          [ 0       0       0       0      0      0      0   0 ]
          [ 0       0       0       0      0      0      0   0 ]
and
                  [ Sv_xξ   Sv_yξ   Sv_zξ   Sv   0    0    0   vξ  ]
                  [ Sv_xη   Sv_yη   Sv_zη   0    Sv   0    0   vη  ]
                  [ Sv_xζ   Sv_yζ   Sv_zζ   0    0    Sv   0   vζ  ]
    A = ∂f/∂y  =  [ Su_xx   Su_xy   Su_xz   0    0    0    0   u_x ]
                  [ Su_yx   Su_yy   Su_yz   0    0    0    0   u_y ]
                  [ Su_zx   Su_zy   Su_zz   0    0    0    0   u_z ]
                  [ Su_x    Su_y    Su_z    0    0    0    0   u   ]
                  [ 0       0       0       0    0    0    0   0   ]
Then, we solve the eight linear initial value problems (4.87), form F'(s) of (4.85), solve
the linear system (4.83b) for ξ, add ξ to s, and continue iterating until |ξ| is sufficiently
small.
Note that if an analogue of reduced superposition (cf. Section 4.2.4) is used, then
only three, not eight, additional initial value problems have to be solved at each iteration
step. However, extra caution has to be exercised with the latter approach, as the stability
properties of the integrated IVPs for s "far away" from s* may vary significantly. Thus,
no guarantee exists that the computed particular solution remains on the stable manifold. ❑
Example 4.15
Consider once again the BVP (4.5) in Example 4.1. Converting to a first-order system, we
get
so

    F(s) = u(1; s)

    A(x; s) = [     0         1 ]
              [ −e^{u(x;s)}   0 ]

The root s* of F(s) = 0 is related to a parameter θ satisfying

    θ = √2 cosh(θ/4)

This equation has, in fact, two solutions, corresponding to two solutions of the differential
problem (4.5) (see Example 3.2). For the physically stable one we have θ = 1.5171646,
s* = u'(0) = 0.54935.
Table 4.6 displays some runs for different values of h and s^0, where "itn" lists the
number of iterations needed for convergence, and e := u^h(.5) − u(.5).
Observe the expected fourth-order reductions in the error in the last column. Also,
for this simple problem there are no convergence difficulties and the number of iterations is
essentially independent of h. If the iteration is started with s^0 = 10, convergence to the
second solution, with θ = 10.8469, is obtained. ❑
The reader should realize from Example 4.14 that Newton's method may some-
times be quite cumbersome to use. In other practical examples, the matrices Ba , Bb,
and especially A, may be much more complicated than in Example 4.14. In any case,
the need to solve n + 1 initial value problems at each iteration is not appealing. (Note,
TABLE 4.6
    s^0   h      itn   e           s^0   h      itn   e
    0     .2     6     −.426−5     5     .2     8     −.426−5
    0     .1     5     −.279−6     5     .1     7     −.279−6
    0     .05    5     −.174−7     5     .05    7     −.174−7
    0     .025   5     −.109−8     5     .025   7     −.109−8
where e_j is the jth unit vector and ε is small, but not too small. Frequently, one can
choose ε = √ε_M, with ε_M the machine precision. This still necessitates evaluating F,
i.e., solving an initial value problem, n + 1 times per iteration (in fact, now the n + 1
IVPs are all nonlinear). But the requirement to evaluate the Jacobian matrix F'(s) expli-
citly is avoided; hence the need to evaluate A(x) in (4.87c) is avoided as well.
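A sketch of this difference approximation (Python; the map F below is a cheap stand-in so the recipe is easy to check, whereas in practice each F-evaluation costs one IVP integration):

```python
import numpy as np

def F(s):
    # Stand-in for the shooting function; each call would normally solve an IVP.
    return np.array([s[0]**2 + np.sin(s[1]), s[0] * s[1]])

def jacobian_fd(F, s, eps=np.sqrt(np.finfo(float).eps)):
    """Forward differences: n + 1 evaluations of F in total."""
    n = len(s)
    F0 = F(s)
    J = np.zeros((len(F0), n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = 1.0
        h = eps * max(abs(s[j]), 1.0)   # scale the increment to s_j
        J[:, j] = (F(s + h * e) - F0) / h
    return J

s = np.array([1.2, 0.7])
J = jacobian_fd(F, s)
J_exact = np.array([[2.0 * s[0], np.cos(s[1])],
                    [s[1], s[0]]])
fd_err = float(np.abs(J - J_exact).max())
```

With the increment √ε_M the difference quotients match the exact Jacobian to about eight digits, which is ample for the Newton iteration.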
The stability difficulties encountered when one is using shooting for linear problems
were analyzed in Section 4.2.3. Here, using the same method for nonlinear problems,
we find that the same difficulties appear again. There is an additional concern, namely
that when one is solving (4.80) with the "wrong" initial values sm , the solution y(x; sm )
may not exist for all a <– x <– b (i. e., it becomes unbounded before x reaches the right
end b). In such a case it is not known how to correct sm , because the nonlinear itera-
tion cannot be completed. Newton's method (or any of its variants) then fails. The fol-
lowing example is famous for its simple and yet extreme nature.
Example 4.16
Consider

    u'' = λ sinh λu,    0 < x < 1
    u(0) = 0,    u(1) = 1

where λ is a positive constant; the problem's difficulty increases rapidly as λ is
increased. The solution remains almost constant for x ≥ 0 until x gets very close to 1,
where it rises sharply to u(1) = 1. The missing initial value s* = u'(0) satisfies s* > 0,
and |s*| is very small. Now, if we "overshoot," i.e., solve the initial value problem

    u'' = λ sinh λu
    u(0) = 0,    u'(0) = s

calling the solution u(x; s), with s > s*, then the solution is likely to blow up before x = 1
is reached. This is because, as it turns out, the initial value solution u(x; s) has a singular
point at

    x̄_s = (1/λ) ∫_0^∞ (s² + 2 cosh ξ − 2)^{−1/2} dξ ≈ (1/λ) ln(8/|s|)

Thus, to ensure that x̄_s > 1 and hence that the initial value integration can be successfully
completed, we need

    |s| ≤ 8 e^{−λ}

This is a very severe restriction indeed. ❑
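Both the singular-point formula and the severity of the restriction are easy to reproduce (Python with SciPy; the value λ = 5, the blow-up cutoff |u| = 10, and the quadrature truncation at ξ = 50 are our own choices):

```python
import numpy as np
from scipy.integrate import quad, solve_ivp

lam = 5.0

def rhs(x, z):
    u, v = z
    return [v, lam * np.sinh(lam * u)]

def blew_up(x, z):                 # terminal event: u is clearly off to infinity
    return abs(z[0]) - 10.0
blew_up.terminal = True

def shoot(s):
    return solve_ivp(rhs, (0.0, 1.0), [0.0, s], events=blew_up,
                     rtol=1e-8, atol=1e-10)

ok = shoot(0.5 * 8.0 * np.exp(-lam))   # |s| within the bound 8 e^{-lam}
bad = shoot(1.0)                       # |s| far beyond the bound

# Quadrature check of the singular-point formula for s = 1
s = 1.0
integral, _ = quad(lambda xi: 1.0 / np.sqrt(s**2 + 2.0 * np.cosh(xi) - 2.0),
                   0.0, 50.0)
x_singular = integral / lam            # should be close to ln(8/|s|)/lam
```

The run with s inside the bound reaches x = 1; with s = 1 the blow-up event fires near x̄_s ≈ ln(8)/λ ≈ 0.42, well short of the right end.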
Theorem 4.89 Let y(x) be an isolated solution of (4.79), and assume that
f ∈ C²(D), where D is a tube of radius r > 0 about y(x). Let

    ρ = r e^{−L(b−a)}                                            (4.89d)

Also, the fundamental solution Y(x; s) = ∂y(x; s)/∂s exists then and satisfies (4.87).
In short, to guarantee successful completion of a nonlinear iteration from s^m, it is
necessary that s^m be already sufficiently close to s* = y(a), as indicated by (4.89c),
(4.89d). Note that if L is large and/or r is not large, then ρ may easily be excessively
small!
An initial value solution of (4.80) satisfying (4.89b) may have exponentially
growing solutions, bounded only by

    |y(x; s)| ≤ e^{L(x−a)} |s|                                   (4.90)

An inspection of (4.89d) and (4.90) then suggests that one way to try to deal with the
difficulties of the shooting method is to restrict the size of the interval of integration of
initial value problems. As long as b is close enough to a, e^{L(b−a)} can be kept reason-
ably small. This again leads to the idea of multiple shooting, discussed next.
We have seen that the two major drawbacks of the single shooting method for nonlinear
problems lead to the same basic cure: The length of intervals over which IVPs are
integrated should be shortened. Thus we are led to a direct extension of the standard
multiple shooting method of Section 4.3.
Let us then subdivide the interval [a, b] by a mesh a = x_1 < x_2 < ⋯ < x_{N+1} = b
[as in (4.31)] and consider the initial value problems for 1 ≤ i ≤ N,

    y_i' = f(x, y_i),    x_i < x < x_{i+1}                       (4.91a)
[as in (4.36)], and they are to be determined so that the solution is continuous over the
entire interval [a, b] and so that the BC (4.79b) are satisfied. Thus, the desired solution
is defined by

    y(x) := y_i(x; s_i),    x_i ≤ x ≤ x_{i+1},   1 ≤ i ≤ N       (4.92a)

where the requirements on s are

    y_i(x_{i+1}; s_i) = s_{i+1},    1 ≤ i ≤ N − 1                (4.92b)

    g(s_1, y_N(b; s_N)) = 0                                      (4.92c)

So, defining

            [ s_2 − y_1(x_2; s_1)         ]
            [ s_3 − y_2(x_3; s_2)         ]
    F(s) := [          ⋮                  ]                      (4.93a)
            [ s_N − y_{N−1}(x_N; s_{N−1}) ]
            [ g(s_1, y_N(b; s_N))         ]
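A small working sketch of (4.92)-(4.93) (Python with SciPy; the test problem u'' = e^u with u(0) = u(1) = 0, the choice of two equal subintervals, and the difference Jacobian are all our own illustrative choices):

```python
import numpy as np
from scipy.integrate import solve_ivp

def f(x, y):
    # u'' = e^u as a first-order system y = (u, u')
    return [y[1], np.exp(y[0])]

def propagate(x0, x1, y0):
    sol = solve_ivp(f, (x0, x1), y0, rtol=1e-10, atol=1e-12)
    return sol.y[:, -1]

def F(s):
    # s stacks s_1, s_2 (values at x = 0 and x = 1/2), cf. (4.93a)
    s1, s2 = s[:2], s[2:]
    match = s2 - propagate(0.0, 0.5, s1)                # continuity at x_2
    bc = np.array([s1[0], propagate(0.5, 1.0, s2)[0]])  # u(0) = u(1) = 0
    return np.concatenate([match, bc])

def jac_fd(s, h=1e-7):
    F0 = F(s)
    J = np.zeros((4, 4))
    for j in range(4):
        e = np.zeros(4)
        e[j] = h
        J[:, j] = (F(s + e) - F0) / h
    return J

s = np.zeros(4)                     # crude initial guess
for _ in range(15):
    r = F(s)
    if np.abs(r).max() < 1e-9:
        break
    s -= np.linalg.solve(jac_fd(s), r)
residual = float(np.abs(F(s)).max())
```

The 4 × 4 Newton system here is the nonlinear analogue of the multiple shooting system (4.37); for larger N its sparse block structure is what the solvers of Section 4.3 exploit.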
generally hard to take advantage of sparseness with such methods. On the other hand,
with Newton's method, or the finite difference variant (4.88) which was designed to
avoid forming A (x), sparseness can be used to advantage. This finite difference variant
is the usual way to proceed, and is now described in more detail.
On each shooting interval (x_i, x_{i+1}) we first compute an approximate solution of
(4.91)

    y_i(x; s_i)

with

    y_i(x_i; s_i) = s_i

(s_i being the previously obtained, or initially guessed, approximate value of y(x_i)). Con-
currently we compute n more solutions of (4.91a), written in matrix form as

    Y_i(x; s_i)                                                  (4.96a)

with

    Y_i(x_i; s_i) = (s_i | ⋯ | s_i) + ε I                        (4.96b)
directly in the nonlinear case, and not just as a means to solve the linear system (4.94).
This can be done as follows: Given a matrix Y_{i−1}(x) as in (4.96b) (but with i − 1
replacing i), perform a QU-decomposition and use the orthogonal matrix for the perturbation
directions at the next interval. In more detail, suppose

    Y_{i−1}(x_i) = Q_i U_i
Of course, this choice has consequences for the variables in the Newton process. How-
ever, their transformation is closely related to what we saw in Section 4.4. The advan-
tages are obvious: we obtain a simple sparse matrix that can, e.g., be compactified in a
stable way. All aspects mentioned above are implemented in the code MUSN,
described in Appendix A.
Example 4.17
Consider Example 4.16 again. As an initial profile we choose

    u^0(x) = sinh(λx) / sinh λ

If λ = 5 and we let the shooting points be 0, .2, .4, .6, .8, .85, .9, .95, 1, then for an
accuracy of 10^{−6} the code MUSN needs 2634 function calls when the tolerance is chosen
equal to 10^{−2}, then 10^{−4}, and only then 10^{−6} (where the result for 10^{−2} is used as
an initial approximation for 10^{−4}, etc.). Using 10^{−6} directly we have 4505 function
calls. If the tolerance is decreased to 10^{−10}, these numbers are 9679 function calls and
25,489 function calls, respectively (in the former case we choose TOL = 10^{−2}, 10^{−4},
10^{−6}, 10^{−10}). This problem gets harder the larger λ is. As can already be seen from
linearization around the initial approximation, there is a rapid increase of the increment of
the fundamental solution over an interval. Moreover, the solution is more active for x
values close to 1 than to 0. Hence it is not very easy to pick the appropriate shooting
points in advance. In such a situation it might be useful to have a device that picks a new
shooting point when this growth is beyond a certain threshold, K, say. For instance, if we
choose λ = 10, we will find that we have to invest 50 percent of the points on [0, .972]
and 50 percent on [.972, 1] alone! The sensitivity with respect to K is clearly demonstrated
by the fact that MUSN takes 37 shooting points in this second interval when K = 1000 and
46 when K = 100. ❑
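MUSN itself is not reproduced here, but the continuation idea of the example, warm-starting a harder solve from an easier one, can be illustrated with SciPy's collocation solver solve_bvp on the same BVP, continuing in λ instead of in the tolerance (the λ sequence and the initial mesh are our own choices):

```python
import numpy as np
from scipy.integrate import solve_bvp

# u'' = lam * sinh(lam * u), u(0) = 0, u(1) = 1, cf. Example 4.16.
def solve_for(lam, x, y):
    fun = lambda t, z: np.vstack([z[1], lam * np.sinh(lam * z[0])])
    bc = lambda za, zb: np.array([za[0], zb[0] - 1.0])
    return solve_bvp(fun, bc, x, y, tol=1e-8, max_nodes=100000)

x = np.linspace(0.0, 1.0, 41)
y = np.vstack([x, np.ones_like(x)])      # crude initial profile u = x
for lam in (1.0, 3.0, 5.0):              # easy -> hard continuation
    sol = solve_for(lam, x, y)
    x, y = sol.x, sol.y                  # warm start for the next lam

u_mid = float(sol.sol(0.5)[0])
```

Each solve starts from the previous converged profile, exactly the spirit of running the code with TOL = 10^{−2}, then 10^{−4}, and so on.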
Example 4.18
A famous example for multiple shooting is the optimal reentry problem, Example 1.21. As
an initial guess for the Newton process employing MUSN, we use a sufficiently good
approximation with TOL = 10^{−1}. How to obtain a "sufficiently good" approximation for
Newton convergence is not a trivial question at all, but we concentrate here on other
aspects. If we let the code determine the shooting points (increment ≤ TOL/ε_M,
ε_M = 10^{−16}), then we obtain the following table:

TABLE 4.7
    TOL                  10^{−5}   10^{−6}   10^{−8}   10^{−9}
    Newton iterations        7         5         4         4
    Shooting points          6         9        20        31
For a fixed tolerance TOL = 10^{−5} and equispaced shooting points we obtain Table 4.8.

TABLE 4.8
    Newton iterations        6         5         4
    Shooting points         12        23        40
Hence, we may cautiously conclude that increasing the number of shooting points
improves the performance of the Newton process. On the other hand, choosing them in
advance may not give a proper distribution as far as the activity of the modes of the
variational system is concerned. Hence we slightly favor the former type of choice. ❑
EXERCISES
1. Describe shooting schemes for the second-order ODE (4.1a) under the following BC:
   (a) u(a) = β_1,   α_1 u(b) + α_2 u'(b) = β_2
   (b) α_1 u(a) + α_2 u'(a) = β_1,   u(b) = β_2
   (c) α_1 u(a) + α_2 u'(a) = β_1,   γ_1 u(b) + γ_2 u'(b) = β_2
   In all cases, assume that the BVP is well-conditioned and use only two IVPs.
2. Consider Example 1.4. Problem (1.10) for f can be solved by shooting without converting
it to a standard form. Describe a scheme to do this.
3. The superposition method is described in the text for two-point boundary value problems
(4.9a, b). Extend the method to multipoint boundary value problems (1.4a, e).
4. Consider the problem
Let b = 100.
(a) Convert the ODE into a first-order system, and find its fundamental solution satis-
fying Y(0) = I.
(b) Using this exact fundamental solution, construct (approximately) the system
    Q s = β of (4.13). Show that the resulting Q is extremely ill-conditioned and that,
    therefore, s cannot be accurately obtained by the superposition method.
           [  e^{−x}   e^{x−b}    e^{2(x−b)} ]
    Ψ(x) = [ −e^{−x}   e^{x−b}   2e^{2(x−b)} ]
           [  e^{−x}   e^{x−b}   4e^{2(x−b)} ]
(d) Solve the problem numerically for b = 1, 10 and 100. What are your conclusions?
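A quick numerical illustration of part (b) (my own construction, not the text's): the three modes e⁻ˣ, eˣ⁻ᵇ, e²⁽ˣ⁻ᵇ⁾ differ at x = 0 by factors e⁻ᵇ and e⁻²ᵇ, so any matrix built from them, such as the superposition matrix, inherits a condition number growing roughly like e²ᵇ, which for b = 100 is far beyond what double precision can resolve.

```python
import numpy as np

# Hedged sketch: condition number of Psi(0), the matrix of the modes and their
# first two derivatives at x = 0, for increasing interval lengths b.

def Psi(x, b):
    """Fundamental solution built from the modes e^{-x}, e^{x-b}, e^{2(x-b)}."""
    e1, e2, e3 = np.exp(-x), np.exp(x - b), np.exp(2 * (x - b))
    return np.array([[ e1, e2,     e3],
                     [-e1, e2, 2 * e3],
                     [ e1, e2, 4 * e3]])

conds = {b: np.linalg.cond(Psi(0.0, b)) for b in (1.0, 10.0, 100.0)}
for b, c in conds.items():
    print(f"b = {b:5.0f}   cond(Psi(0)) ~ {c:.2e}")
```

For b = 100 the reported condition number is limited only by machine precision; the true value is of order e²⁰⁰, which is the quantitative content of part (b).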
5. Solve the problem of the previous exercise again, this time performing the numerical
integrations from b to 0. Explain the observed results.
6. Show that the reduced superposition matrix Q of (4.28) is nonsingular if the BVP (4.9a),
(4.24) has a unique solution.
7. Consider a BVP (4.9a, b), where rank(B_b) = ν < n. Show that the BC can be converted to the form (4.24) (and therefore, reduced superposition can be applied).
8. Consider the P-SV problem of Application 1.8. The BVP is given in (1.21). Assuming, as in Example 4.7, a piecewise constant medium, construct the linear system (4.37b, c) for this case.
9. Prove Theorem 4.42. Hints: For part (a), consider (4.37) with unit vectors in place of the right-hand side. Use part (a) for part (b). For part (c) show that

    Σ_{j=2}^{N} e^{−α|x − x_j|} ≤ K h⁻¹ ∫_a^b e^{−α|x − t|} dt

for some α > 0, a ≤ x ≤ b, h := min_{1≤i<N} (x_{i+1} − x_i), and K ≥ 1 an appropriate constant.
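The sum-to-integral comparison in the hint for part (c) is easy to check numerically. The sketch below (mesh, α, and all constants are my own choices) estimates the observed constant K on an equispaced mesh, using the closed form of the integral:

```python
import numpy as np

# Hedged numerical check: on an equispaced mesh x_1 < ... < x_N with spacing h,
#   sum_{j>=2} exp(-alpha*|x - x_j|)  <=  K * h^{-1} * int_a^b exp(-alpha*|x - t|) dt
# for a moderate constant K.

a_, b_ = 0.0, 1.0
alpha = 5.0
N = 101
xs = np.linspace(a_, b_, N)
h = xs[1] - xs[0]

def integral(x):
    """Closed form of int_a^b exp(-alpha*|x - t|) dt."""
    return (2.0 - np.exp(-alpha * (x - a_)) - np.exp(-alpha * (b_ - x))) / alpha

ratios = []
for x in np.linspace(a_, b_, 37):
    lhs = np.sum(np.exp(-alpha * np.abs(x - xs[1:])))  # sum over j = 2..N
    rhs = integral(x) / h
    ratios.append(lhs / rhs)
K_observed = max(ratios)
print(K_observed)
```

The observed ratio stays close to 1, consistent with a mesh-independent constant K in the bound.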
10. Show that the inverse of the multiple shooting matrix A of (4.37b, c) is equal to

    diag(F₁⁻¹, ..., F_N⁻¹)

times the matrix given in (4.42a).
11. Theorem 4.42 is formulated for the exact A, without discretization errors. Let us introduce discretization errors as well, bounded by an integration tolerance TOL, and consider A_h of the standard multiple shooting algorithm.
(a) Let G_{ij} be the discrete Green's function, as in (4.16), (4.17). Show

    (A_h)⁻¹ = [ G_{11}  G_{12}  ...  G_{1N}   Y₁(x₁)(Q_h)⁻¹ ]
              [   .       .            .            .       ]
              [ G_{N1}  G_{N2}  ...  G_{NN}   Y₁(x_N)(Q_h)⁻¹ ]

(b) z(x_i) = y_i. Show that
(c) If |C F(x_i)| K TOL is sufficiently smaller than 1, then for the discrete fundamental solution [cf. (4.16)] we have |G_{ij}| ≤ C G(x_i, x_j), C a moderate constant.
TABLE 4.9
(Thus we have a constant slope for 0 ≤ z ≤ 4, a discontinuity in the coefficients at z = 4, and a uniform medium below.) The solution is to be evaluated at z = 1.012 and at z = 4.924. Try to obtain wave solutions for the phase velocities ω/k = 0.5, 1.5, 2.5, 3.5 for each of the frequencies ω = π, 2π, 5π, 10π. What do you observe?
16. Suppose that the BVP with separated BC (4.9a), (4.75a) has a unique solution. Show that the BVP can be written such that B_{a1} satisfies (4.75b) with D nonsingular.
17. Consider the nonlinear BVP
(a) Example 1.4. Observe the need for more and more shooting points as the Reynolds
number R increases. For which value of R does the code appear to be inefficient?
(b) Example 1.5 with the BC (1.13d). You will have to figure out what to do near
x = 0 (cf. Section 11.4.1).
... difference in conditioning?
(e) Example 1.21. You will have to figure out a good initial guess for Newton's method
to converge.