(Kyoto Workshop On Numerical Analysis of Odes (199


NUMERICAL ANALYSIS OF

ORDINARY DIFFERENTIAL
EQUATIONS AND ITS
APPLICATIONS

Editors

T Mitsui
Nagoya University, Japan

Y Shinohara
Tokushima University, Japan

World Scientific
Singapore • New Jersey • London • Hong Kong
Published by
World Scientific Publishing Co. Pte. Ltd.
P O Box 128, Farrer Road, Singapore 9128
USA office: Suite 1B, 1060 Main Street, River Edge, NJ 07661
UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE

British Library Cataloguing-in-Publication Data


A catalogue record for this book is available from the British Library.

NUMERICAL ANALYSIS OF ORDINARY DIFFERENTIAL EQUATIONS AND ITS APPLICATIONS
Copyright © 1995 by World Scientific Publishing Co. Pte. Ltd.
All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means,
electronic or mechanical, including photocopying, recording or any information storage and retrieval
system now known or to be invented, without written permission from the Publisher.

For photocopying of material in this volume, please pay a copying fee through the Copyright
Clearance Center, Inc., 222 Rosewood Drive, Danvers, Massachusetts 01923, USA.

ISBN 981-02-2229-7

This book is printed on acid-free paper.

Printed in Singapore by Uto-Print


Preface

Numerical solutions of ordinary differential equations (ODEs) are broadly recognized as
not only interesting in theoretical study but also useful in practical applications. This
is why the numerical analysis of ODEs has attracted so much research in the scientific
computation community. The reader may be aware that this year marks the centennial of
C. RUNGE's historical article "Über die numerische Auflösung von Differentialgleichungen",
which appeared in Mathematische Annalen as the pioneering work on more sophisticated and
effective numerical solution of ODEs.
Hoping that this volume contributes to the progress of numerical analysis of ODEs, we are
publishing it as a collection of original research articles. The contributions in this volume are
mainly based on those submitted to the 1994 Kyoto Workshop on Numerical Analysis
of ODEs, held in November 1994 at the Research Institute for Mathematical Sciences,
Kyoto University. The topics of the articles range widely, although all of them touch
more or less upon the numerical solution of ODEs. They reflect the state of the art of the
study in numerical analysis.
The topics treated in the volume are: discrete variable methods, Runge-Kutta methods,
linear multistep methods, stability analysis, parallel implementation, self-validating
numerical methods, analysis of nonlinear oscillation by numerical means, differential-algebraic
and delay-differential equations, stochastic initial value problems and so on. Readers will be
able to recognize the recent development of these topics.
Last, but not least, we express our sincere gratitude to the present authors of the volume
as well as to the contributors of the Workshop.

June 1995
Taketomo Mitsui, Nagoya University
Yoshitane Shinohara, Tokushima University


CONTENTS

Preface

Limiting Formulas of Eight-Stage Explicit Runge-Kutta Method of Order Seven
H. Ono

A Series of Collocation Runge-Kutta Methods
T. Mitsui and H. Sugiura

Fourth Order P-Stable Block Method for Solving the Differential Equation y" = f(x, y)
K. Ozawa

Two-Point Hermite-Birkhoff Quadratures and Its Applications to Numerical Solution of ODE
C. Suzuki

Improved SOR-like Method with Orderings for Non-Symmetric Linear Equations Derived from Singular Perturbation Problems
E. Ishiwata and Y. Muroya

Analysis of the Milne Device for the Finite Correction Mode of the Adams PC Methods I
M. Fuji

A New Algorithm for Differential-Algebraic Equations Based on HIDM
T. Watanabe and G. Gnudi

Semi-Explicit Methods for Differential-Algebraic Systems of Index 1 and Index 2
H. Shintani

Computational Challenges in the Solution of Nonlinear Oscillatory Multibody Dynamics Systems
J. Yen and L. Petzold

Existence and Uniqueness of Quasiperiodic Solutions to Quasiperiodic Nonlinear Differential Equations
Y. Shinohara, A. Kohda and H. Imai

Absolutely Stable Delay Differential Equations and Natural Runge-Kutta Methods
T. Koto

An Interval Method of Proving Existence of Solutions for Nonlinear Boundary Value Problems
S. Oishi

Experimental Studies on Guaranteed-Accuracy Solutions of the Initial-Value Problem of Nonlinear Ordinary Differential Equations
M. Iri and J. Amemiya

Numerical Validation for Ordinary Differential Equations Using Power Series Arithmetic
M. Kashiwagi

Statistical Error Analysis in Numerical Simulation for Stochastic Integral Processes
Y. Saito and T. Mitsui

LIMITING FORMULAS OF EIGHT-STAGE EXPLICIT
RUNGE-KUTTA METHOD OF ORDER SEVEN

HARUMI ONO
Faculty of Engineering, Chiba University
1-33 Yayoicho, Inage-ku, Chiba, 263, Japan
E-mail: aB9600Stansei.cc.u-tokyo.ac. jp

ABSTRACT
It is well known that eight-stage explicit Runge-Kutta formulas are of order at
most six. However, by taking the limit as the first abscissa approaches zero, the
formulas can achieve seventh order. Such formulas are called limiting formulas,
and they require the evaluation of the second derivatives of the solution. In this paper,
eight-stage seventh order limiting formulas using the second derivatives are derived.
Based on these limiting formulas, new eight-stage numerically seventh order
methods without derivatives are proposed.

1. Introduction

The attainable order of $s$-stage explicit Runge-Kutta methods is $s - 1$ for $s = 5$, 6 and 7.
However, they can achieve $s$th order in the limiting case where the
distance between some pairs of abscissas approaches zero. Such formulas are called
$s$-stage $s$th order limiting formulas. Previously [3], we derived five-stage fifth order
and six-stage sixth order limiting formulas. Furthermore, we presented five- and six-stage
formulas of orders numerically five and six. They are obtained by replacing
the second derivatives involved in the limiting formulas with the simplest numerical
differentiation. This replacement is possible because the values of the second
derivatives in the limiting formulas do not require the full significant figures carried in
the computation, and we can choose free parameters so as to minimize the error caused
by numerical differentiation.
In this paper, eight-stage seventh order limiting formulas are presented. Based on
these limiting formulas, new eight-stage numerically seventh order formulas
without derivatives are derived in the same way as in the five-stage case.

2. Limiting formulas

The problem is an initial value problem

$$\frac{dy}{dt} = f(t, y), \qquad y(t_0) = y_0,$$

where $f$ and $y$ are vectors and $f$ is assumed to be differentiable sufficiently often for
the definition to be meaningful. The parameters of an $s$-stage explicit Runge-Kutta
method are represented in the following Butcher array [2]:

$$\begin{array}{c|ccccc}
c_2 & a_{21} \\
c_3 & a_{31} & a_{32} \\
\vdots & \vdots & & \ddots \\
c_s & a_{s1} & a_{s2} & \cdots & a_{s,s-1} \\
\hline
 & b_1 & b_2 & \cdots & b_{s-1} & b_s
\end{array}$$

And, $y_i$ is used to denote the $y$ ordinate at the abscissa $c_i$, namely,

$$y_i = y_n + h \sum_{j=1}^{i-1} a_{ij} f_j,$$

where

$$f_1 = f(t_n, y_n), \qquad f_i = f(t_n + c_i h, y_i) \quad (i = 2, 3, \dots, s).$$

Using them, the method can be written as

$$y_{n+1} = y_n + h \sum_{i=1}^{s} b_i f_i.$$

Many eight-stage sixth order formulas are known and their properties are precisely
reported [5].
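The general explicit scheme above can be sketched in a few lines of Python. This is only an illustration of the stage recursion $y_i = y_n + h\sum_j a_{ij} f_j$; the function name `explicit_rk_step` and the use of the classical four-stage tableau as sample data are assumptions made for the example and do not come from the paper.

```python
import numpy as np

def explicit_rk_step(f, t_n, y_n, h, A, b, c):
    """One step of an s-stage explicit Runge-Kutta method.

    A is the strictly lower-triangular coefficient array (a_ij),
    b the weights and c the abscissas (with c[0] = 0).
    """
    y_n = np.asarray(y_n, dtype=float)
    s = len(b)
    stages = []                      # f_1, ..., f_s
    for i in range(s):
        # y_i = y_n + h * sum_{j<i} a_ij f_j
        y_i = y_n + h * sum(A[i][j] * stages[j] for j in range(i))
        stages.append(np.asarray(f(t_n + c[i] * h, y_i), dtype=float))
    # y_{n+1} = y_n + h * sum_i b_i f_i
    return y_n + h * sum(b[i] * stages[i] for i in range(s))

# Classical fourth order tableau, used here only as sample data.
A_RK4 = [[0, 0, 0, 0], [0.5, 0, 0, 0], [0, 0.5, 0, 0], [0, 0, 1, 0]]
B_RK4 = [1 / 6, 1 / 3, 1 / 3, 1 / 6]
C_RK4 = [0, 0.5, 0.5, 1]
```

Any explicit tableau can be passed in the same way; a limiting formula additionally needs the second-derivative term $F_2$, which this sketch does not handle.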
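An eight-stage limiting formula that uses the values of the second derivatives at
the point $(t_n, y_n)$ has the form

$$\begin{aligned}
f_1 &= f(t_n, y_n), \qquad F_2 = D(f(t_n, y_n)) \cdot v(f_1),\\
y_3 &= y_n + h(a_{31} f_1 + h \alpha_3 F_2),\\
f_3 &= f(t_n + c_3 h, y_3),\\
y_i &= y_n + h\Bigl(a_{i1} f_1 + \sum_{j=3}^{i-1} a_{ij} f_j + h \alpha_i F_2\Bigr),\\
f_i &= f(t_n + c_i h, y_i) \quad (i = 4, 5, \dots, 8),\\
y_{n+1} &= y_n + h\Bigl(b_1 f_1 + \sum_{i=3}^{8} b_i f_i + h \beta F_2\Bigr),
\end{aligned} \qquad (1)$$

where $D(f(t_n, y_n))$ and $v(f_1)$ denote the Jacobian matrix of $f$ at the point $(t_n, y_n)$ and
the vector $(1, f_1^1, \dots, f_1^r)^T$ respectively (the superscripts denote the component
numbers). The parameters of this limiting formula can be written in the following
array analogous to the Butcher array:

$$\begin{array}{c|cccccc|c}
c_3 & a_{31} & & & & & & \alpha_3 \\
c_4 & a_{41} & a_{43} & & & & & \alpha_4 \\
c_5 & a_{51} & a_{53} & a_{54} & & & & \alpha_5 \\
\vdots & \vdots & \vdots & & \ddots & & & \vdots \\
c_8 & a_{81} & a_{83} & a_{84} & \cdots & & a_{87} & \alpha_8 \\
\hline
 & b_1 & b_3 & b_4 & \cdots & & b_8 & \beta
\end{array}$$
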
2.1. Order conditions

We restrict ourselves to the case that

$$c_8 = 1, \qquad b_3 = 0$$

and the following simplifying assumptions hold:

$$\alpha_3 = \frac{c_3^2}{2}, \qquad (2)$$

$$\sum_{j=3}^{i-1} a_{ij} c_j + \alpha_i = \frac{c_i^2}{2} \quad (i = 4, 5, \dots, 8), \qquad (3)$$

$$\sum_{j=3}^{i-1} a_{ij} c_j^2 = \frac{c_i^3}{3} \quad (i = 4, 5, \dots, 8). \qquad (4)$$

Comparing the Taylor series expansion of Eq.(1) with that of the true value
$y(t_n + h)$ and matching the coefficients of each elementary differential, after tedious
computation, we get the following equations of condition for seventh order accuracy:
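$$a_{31} = c_3, \qquad a_{i1} + \sum_{j=3}^{i-1} a_{ij} = c_i \quad (i = 4, 5, \dots, 8), \qquad (5)$$

$$\sum_{i=j+1}^{8} b_i a_{ij} = b_j (1 - c_j) \quad (j = 4, 5, 6, 7), \qquad (6)$$

$$\sum_{i=4}^{8} b_i a_{i3} = 0, \qquad (7)$$

$$\sum_{i=5}^{8} b_i \sum_{j=4}^{i-1} a_{ij} a_{j3} = 0, \qquad (8)$$

$$\sum_{i=6}^{8} b_i \sum_{j=5}^{i-1} a_{ij} \sum_{k=4}^{j-1} a_{jk} a_{k3} = 0, \qquad (9)$$

$$\sum_{i=5}^{8} b_i \sum_{j=4}^{i-1} a_{ij} c_j a_{j3} = 0, \qquad (10)$$

$$b_1 + \sum_{i=4}^{8} b_i = 1, \qquad (11)$$

$$\beta + \sum_{i=4}^{8} b_i c_i = \frac{1}{2}, \qquad (12)$$

$$\sum_{i=4}^{8} b_i c_i^2 = \frac{1}{3}, \qquad (13)$$

$$\sum_{i=4}^{8} b_i c_i^3 = \frac{1}{4}, \qquad (14)$$

$$\sum_{i=4}^{8} b_i c_i^4 = \frac{1}{5}, \qquad (15)$$

$$\sum_{i=6}^{8} b_i \sum_{j=5}^{i-1} a_{ij} \sum_{k=4}^{j-1} a_{jk} c_k^2 = \frac{1}{60}, \qquad (16)$$

$$\sum_{i=4}^{8} b_i c_i^5 = \frac{1}{6}, \qquad (17)$$

$$\sum_{i=6}^{8} b_i \sum_{j=5}^{i-1} a_{ij} \sum_{k=4}^{j-1} a_{jk} c_k^3 = \frac{1}{120}, \qquad (18)$$

$$\sum_{i=7}^{8} b_i \sum_{j=6}^{i-1} a_{ij} \sum_{k=5}^{j-1} a_{jk} \sum_{l=4}^{k-1} a_{kl} c_l^2 = \frac{1}{360}, \qquad (19)$$

$$\sum_{i=4}^{8} b_i c_i^6 = \frac{1}{7}, \qquad (20)$$

$$\sum_{i=6}^{8} b_i \sum_{j=5}^{i-1} a_{ij} \sum_{k=4}^{j-1} a_{jk} c_k^4 = \frac{1}{210}, \qquad (21)$$

$$\sum_{i=7}^{8} b_i \sum_{j=6}^{i-1} a_{ij} \sum_{k=5}^{j-1} a_{jk} \sum_{l=4}^{k-1} a_{kl} c_l^3 = \frac{1}{840}, \qquad (22)$$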

m s=s *=4 l b M

2.2. Solutions

Hereafter, we assume that all abscissas are distinct and are not equal to 0. And
we use, for later convenience, the notations

$$\rho_j = \sum_{i=j+1}^{8} b_i a_{ij} = b_j (1 - c_j) \quad (j = 4, 5, 6, 7),$$

$$\sigma_k = \sum_{j=k+1}^{7} \rho_j a_{jk} = \sum_{j=k+1}^{7} \Bigl( \sum_{i=j+1}^{8} b_i a_{ij} \Bigr) a_{jk} \quad (k = 4, 5, 6),$$

$$\tau_l = \sum_{k=l+1}^{6} \sigma_k a_{kl} = \sum_{k=l+1}^{6} \Bigl( \sum_{j=k+1}^{7} \Bigl( \sum_{i=j+1}^{8} b_i a_{ij} \Bigr) a_{jk} \Bigr) a_{kl} \quad (l = 4, 5).$$


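Using the notation $\rho_j$, we can rewrite Eqs.(14), (15), (17) and (20) as

$$\sum_{j=4}^{7} \rho_j c_j^k = \frac{1}{(k+1)(k+2)} \quad (k = 2, 3, 4, 5).$$

From this system of equations, we get

$$\rho_i = - \frac{35 c_j c_k c_l - 21 (c_j c_k + c_j c_l + c_k c_l) + 14 (c_j + c_k + c_l) - 10}{420\, c_i^2 (c_i - c_j)(c_i - c_k)(c_i - c_l)}, \qquad (24)$$

where $(i, j, k, l)$ is a permutation of $(4, 5, 6, 7)$.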
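As a sanity check on Eq.(24), the short sketch below evaluates $\rho_i$ in exact rational arithmetic and tests it against the moment conditions $\sum_j \rho_j c_j^k = 1/((k+1)(k+2))$ for $k = 2, \dots, 5$. The moment form of the system is an assumption reconstructed from the order conditions, the function name `rho` is hypothetical, and the abscissas are those of the first parameter set chosen later in Section 3.

```python
from fractions import Fraction as Fr
from itertools import combinations

def rho(i, c):
    """rho_i from Eq.(24); c maps the stage index (4..7) to the abscissa c_i."""
    others = [c[m] for m in (4, 5, 6, 7) if m != i]
    e1 = sum(others)                                   # c_j + c_k + c_l
    e2 = sum(x * y for x, y in combinations(others, 2))
    e3 = others[0] * others[1] * others[2]             # c_j c_k c_l
    num = 35 * e3 - 21 * e2 + 14 * e1 - 10
    den = 420 * c[i] ** 2
    for x in others:
        den *= c[i] - x
    return -num / den

# Abscissas c_4, ..., c_7 of the first parameter set (c_8 = 1 is implied).
c_set1 = {4: Fr(2, 7), 5: Fr(2, 5), 6: Fr(4, 5), 7: Fr(1, 5)}
```

With these values, the weights follow from $b_i = \rho_i / (1 - c_i)$ without any rounding, which makes it easy to compare against tabulated rational parameters.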
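Using the notation $\sigma_k$, we can rewrite Eqs.(16), (18) and (21) as

$$\sum_{k=4}^{6} \sigma_k c_k^2 = \frac{1}{60}, \qquad \sum_{k=4}^{6} \sigma_k c_k^3 = \frac{1}{120}, \qquad \sum_{k=4}^{6} \sigma_k c_k^4 = \frac{1}{210}.$$

And we get

$$\sigma_i = \frac{14 c_j c_k - 7 (c_j + c_k) + 4}{840\, c_i^2 (c_i - c_j)(c_i - c_k)} \quad ((i, j, k) \text{ a permutation of } (4, 5, 6)). \qquad (25)$$
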

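In a similar way, from Eqs.(19) and (22) we get

$$\sum_{l=4}^{5} \tau_l c_l^2 = \frac{1}{360}, \qquad \sum_{l=4}^{5} \tau_l c_l^3 = \frac{1}{840}$$

and

$$\tau_i = \frac{3 - 7 c_j}{2520\, c_i^2 (c_i - c_j)} \quad ((i, j) = (4, 5), (5, 4)). \qquad (26)$$
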

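By using $\rho_j$, $\sigma_k$, $\tau_l$, Eqs.(6), (11), (12) and (13), the parameters of the method can
be expressed rationally in terms of the $c_i$'s, provided that all denominators of the $a_{ij}$'s do not
vanish; $b_i$ and $\beta$ are

$$b_i = \frac{\rho_i}{1 - c_i} \quad (i = 4, 5, 6, 7), \qquad b_8 = \frac{1}{3} - \sum_{i=4}^{7} b_i c_i^2, \qquad b_1 = 1 - \sum_{i=4}^{8} b_i, \qquad (27)$$

$$\beta = \frac{1}{2} - \sum_{i=4}^{8} b_i c_i, \qquad (28)$$

and $a_{ij}$ $(i = 6, 7, 8;\ j = 5, \dots, i - 1)$ are

$$a_{87} = \frac{\rho_7}{b_8}, \qquad (29)$$

$$a_{76} = \frac{\sigma_6}{\rho_7}, \qquad a_{86} = \frac{1}{b_8} (\rho_6 - b_7 a_{76}), \qquad (30)$$

$$a_{65} = \frac{\tau_5}{\sigma_6}, \qquad a_{75} = \frac{1}{\rho_7} (\sigma_5 - \rho_6 a_{65}), \qquad a_{85} = \frac{1}{b_8} (\rho_5 - b_6 a_{65} - b_7 a_{75}). \qquad (31)$$
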

The last equation of condition given by Eq.(23) must be satisfied with the solutions
obtained above under the assumption given by Eq.(4). By trivial manipulations, we
get the following relation between $c_4$ and $c_5$:

$$14 c_4^2 c_5 - 12 c_4 c_5 + 3 c_5 - c_4 = 0. \qquad (32)$$

The parameter $a_{54}$ obtained from Eq.(23) and $c_4$ can be rewritten using the relation
given by Eq.(32) as
c c
4l i ~ *l
• = - 3 (33)
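And the other $a_{i4}$'s are

$$a_{64} = \frac{1}{\sigma_6} (\tau_4 - \sigma_5 a_{54}), \qquad a_{74} = \frac{1}{\rho_7} \bigl(\sigma_4 - (\rho_5 a_{54} + \rho_6 a_{64})\bigr),$$

$$a_{84} = \frac{1}{b_8} \bigl(\rho_4 - (b_5 a_{54} + b_6 a_{64} + b_7 a_{74})\bigr). \qquad (34)$$

We get from Eq.(4)

$$a_{43} = \frac{c_4^3}{3 c_3^2}, \qquad a_{i3} = \frac{1}{c_3^2} \Bigl( \frac{c_i^3}{3} - \sum_{j=4}^{i-1} a_{ij} c_j^2 \Bigr) \quad (i = 5, 6, 7, 8). \qquad (35)$$

These $a_{i3}$'s are found to satisfy Eqs.(7), (8), (9) and (10) by a straightforward
computation. Finally, from Eqs.(5) and (3) we get

$$a_{i1} = c_i - \sum_{j=3}^{i-1} a_{ij} \quad (i = 4, 5, \dots, 8) \qquad (36)$$

and

$$\alpha_i = \frac{c_i^2}{2} - \sum_{j=3}^{i-1} a_{ij} c_j \quad (i = 4, 5, \dots, 8). \qquad (37)$$

Now, we have obtained a set of parameters of the eight-stage seventh order limiting
formula with four free parameters, $c_3$, $c_4$, $c_6$ and $c_7$.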
3. Determination of free parameters

In the solutions obtained in the previous section, four parameters $c_3$, $c_4$, $c_6$ and $c_7$
are free to be chosen in any way. In this section we will consider how to determine
these parameters.
The stability region depends on only one free parameter, $c_4$. It is desirable to
determine $c_4$ so as to maximize the stability region. At the same time it is preferable
that every parameter is a number requiring a small number of digits and is small in
magnitude. We intend to derive eight-stage formulas which achieve numerically
seventh order by replacing derivatives with numerical differentiation. The key point
in deriving these formulas is that the error caused by numerical differentiation does not
dominate over the leading error term of the limiting formula.
Here, we will present two sets of free parameters. One of them gives parameters
requiring a comparatively small number of digits, and the other gives a comparatively
large stability region.

3.1. Stability

The polynomial $r$ which determines the stability of the eight-stage seventh order
limiting formula given by Eq.(1) is

$$r(z) = 1 + z + \frac{z^2}{2!} + \cdots + \frac{z^7}{7!} + \gamma z^8,$$

where

$$\gamma = \frac{c_4 (3 - 7 c_4)}{15120\, (14 c_4^2 - 12 c_4 + 3)}$$

and $z$ is a complex number. The stability region is the set of points for which
$|r(z)| < 1$. Let the simply connected interval $(-d, 0)$ be the intersection of the
stability region with the negative part of the real axis. This interval is called the stability
interval. The boundaries of the stability regions for several values of $\gamma$ are shown in
Fig. 1 with the values of $\gamma$ attached. The graph indicates that

$$\gamma \in \Bigl( \frac{1}{70000}, \frac{1}{68000} \Bigr) \approx (0.143 \times 10^{-4},\ 0.147 \times 10^{-4}) \qquad (38)$$

gives the maximum stability region. In the case where the range of $c_5$ is restricted to
the interval (0,1), we get from Eq.(32)

$$0 < c_4 < \frac{3}{7} \quad \text{or} \quad \frac{1}{2} < c_4 < 1.$$

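The relations used in this subsection are easy to evaluate numerically. The sketch below assumes the closed forms $c_5 = c_4 / (14 c_4^2 - 12 c_4 + 3)$, obtained by solving Eq.(32) for $c_5$, and $\gamma = c_4 (3 - 7 c_4) / (15120 (14 c_4^2 - 12 c_4 + 3))$; the latter reproduces the two values of $\gamma$ quoted in the text for $c_4 = 2/7$ and $c_4 = 11/28$. The function names and the simple scan of the negative real axis are illustrative choices, not part of the paper.

```python
from fractions import Fraction as Fr
from math import factorial

def c5_from_c4(c4):
    """Solve Eq.(32), 14 c4^2 c5 - 12 c4 c5 + 3 c5 - c4 = 0, for c5."""
    return c4 / (14 * c4**2 - 12 * c4 + 3)

def gamma(c4):
    """Coefficient of z^8 in the stability polynomial as a function of c4."""
    return c4 * (3 - 7 * c4) / (15120 * (14 * c4**2 - 12 * c4 + 3))

def r(z, g):
    """Stability polynomial r(z) = sum_{k=0}^{7} z^k / k! + g z^8."""
    return sum(z**k / factorial(k) for k in range(8)) + g * z**8

def stability_interval(g, step=1e-3, zmax=20.0):
    """Approximate d such that |r(z)| <= 1 on (-d, 0), by scanning with a fixed step."""
    d = 0.0
    while d < zmax and abs(r(-(d + step), g)) <= 1.0:
        d += step
    return d
```

For example, `stability_interval(float(gamma(Fr(11, 28))))` and `stability_interval(float(gamma(Fr(2, 7))))` give rough lengths of the two stability intervals, accurate only to the scan step.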

Fig. 2. The graph of $\gamma(c_4)$.

Fig. 3. Stability boundaries for $c_4 = 2/7$ and $11/28$.

The values of $\gamma$ as a function of $c_4$ are given in Fig. 2.


We will choose the value of $c_4$ so that $\gamma$ is contained in the interval given by
Eq.(38), and both $c_4$ and $c_5$ are rational numbers requiring a comparatively small
number of digits; moreover, every parameter can be determined to be a rational number
within 16 digits. For this purpose, the most reasonable choice of the value of $c_4$ is
11/28. The value of $\gamma$ is then about $0.1455 \times 10^{-4}$. Another choice of the value of $c_4$ is
2/7. For this value of $c_4$, $\gamma$ is about $0.265 \times 10^{-4}$ and is outside the interval given by
Eq.(38). But all parameters then require a smaller number of digits
than those for $c_4 = 11/28$. The boundaries of the stability regions for $c_4 = 2/7$ and
$c_4 = 11/28$ are shown in Fig. 3.
From Eq.(32), we get

$$c_5 = \frac{2}{5} \quad \text{for } c_4 = \frac{2}{7} \qquad (39)$$

and

$$c_5 = \frac{22}{25} \quad \text{for } c_4 = \frac{11}{28}. \qquad (40)$$


3.2. Error caused by numerical differentiation

Next, we will consider the values of $c_6$ and $c_7$. We intend to derive an eight-stage
numerically seventh order method by replacing $h \cdot F_2$ with $(\tilde f_2 - f_1)/\varepsilon$, where

$$\tilde f_2 = f(t_n + \varepsilon h,\ y_n + \varepsilon h f_1)$$

with some small value of $\varepsilon$. So, it is desirable that the error caused by numerical
differentiation is as small as possible.
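The replacement of $h F_2$ by a difference quotient can be sketched as follows. The helper name `hF2_numerical` and the scalar test equation $y' = t y$ (for which $y'' = (1 + t^2) y$) are assumptions made for illustration; the value $\varepsilon = 2^{-25} = 1/33554432$ is the one the paper later gives for arithmetic with 14 hexadecimal digits.

```python
import numpy as np

def hF2_numerical(f, t_n, y_n, h, eps):
    """Approximate h * F_2 = h * y''(t_n) by (f~_2 - f_1) / eps."""
    y_n = np.asarray(y_n, dtype=float)
    f1 = np.asarray(f(t_n, y_n), dtype=float)
    # f~_2 = f(t_n + eps*h, y_n + eps*h*f_1)
    f2 = np.asarray(f(t_n + eps * h, y_n + eps * h * f1), dtype=float)
    return (f2 - f1) / eps

def f_example(t, y):
    """Sample right-hand side y' = t*y, so that y'' = (1 + t^2) * y."""
    return np.array([t * y[0]])

# At t = 0.5, y = 2, h = 0.1 the exact value of h*y'' is 0.1 * 1.25 * 2 = 0.25.
approx = hF2_numerical(f_example, 0.5, [2.0], 0.1, 2.0**-25)
```

The quotient costs one extra evaluation of $f$ and no Jacobian; its accuracy is limited by the balance between truncation error (growing with $\varepsilon$) and rounding error (growing with $1/\varepsilon$).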
The magnitudes of the optimum $\varepsilon_{opt}$ and the error $E_{opt}$ of $h F_2$ in Eq.(1) are roughly
estimated as

(9-digits to the base p) (41)


dt 2

and
d

E Oft 2L\ (42)


opt dt 2

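In Eq.(41), is taken to be 4 on the basis of numerical experience. By
using Eq.(42), the approximate value $\tilde y_{n+1}$ of $y_{n+1}$, the value of the limiting formula, can
be written

$$\tilde y_{n+1} = y_{n+1} + h \beta E_{opt} + h^2 \sum_{i=4}^{8} b_i \alpha_i\, E_{opt} \cdot G_1 + h^3 \sum_{i=4}^{8} b_i \sum_{j=3}^{i-1} a_{ij} \alpha_j\, E_{opt} \cdot G_2 + \cdots, \qquad (43)$$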

where $G_1$ and $G_2$ are vectors which depend on the function $f$. We see from the
previous section 2.2 that the coefficients of $h^2 E_{opt} \cdot G_1$ and $h^3 E_{opt} \cdot G_2$ are

$$\beta = \sum_{i=4}^{8} b_i \alpha_i = \frac{70 c_4 c_5 c_6 c_7 - 35 (c_5 c_6 c_7 + c_4 c_6 c_7 + c_4 c_5 c_7 + c_4 c_5 c_6) + 21 (c_4 c_5 + c_4 c_6 + c_4 c_7 + c_5 c_6 + c_5 c_7 + c_6 c_7) - 14 (c_4 + c_5 + c_6 + c_7) + 10}{420\, c_4 c_5 c_6 c_7} \qquad (44)$$

and

$$\sum_{i=4}^{8} b_i \sum_{j=3}^{i-1} a_{ij} \alpha_j = \frac{35 c_4 c_5 c_6 - 14 (c_4 c_5 + c_4 c_6 + c_5 c_6) + 7 (c_4 + c_5 + c_6) - 4}{840\, c_4 c_5 c_6} \qquad (45)$$

respectively. Unfortunately, under the assumption in section 2.2, we cannot choose
the values of $c_4$, $c_6$ and $c_7$ so that both Eqs.(44) and (45) vanish. So, if we choose
free parameters so that $\beta$ vanishes, not only the second term but also the third term
of the right-hand side of Eq.(43) vanishes. In this case, the leading term of the error
caused by numerical differentiation becomes

$$O(h^3 E_{opt}) \propto h^4 p^{-q/2}$$

and the coefficient of this term is given by Eq.(45).


Substituting each pair of $c_4$ and $c_5$ given by Eqs.(39) and (40) into Eq.(44), we
get

$$\beta = \frac{(25 c_6 - 18) c_7 - 18 c_6 + 14}{240\, c_6 c_7} \quad \text{for } c_4 = \frac{2}{7}$$

and

$$\beta = \frac{(65 c_6 + 63) c_7 + 63 c_6 - 56}{14520\, c_6 c_7} \quad \text{for } c_4 = \frac{11}{28}.$$

We want to find the values of $c_6$ and $c_7$ so that $\beta$ vanishes, and so that all
parameters which depend on $c_6$ and $c_7$ (that is, $a_{ij}$ $(i = 6, 7, 8;\ j = 4, \dots, i - 1)$ and
$b_i$) are not so large in magnitude and are rational numbers within 16 digits.
As the most suitable values for these conditions, we find

$$c_6 = \frac{4}{5}, \quad c_7 = \frac{1}{5} \quad \text{for } c_4 = \frac{2}{7}$$

and

$$c_6 = \frac{4}{5}, \quad c_7 = \frac{28}{575} \quad \text{for } c_4 = \frac{11}{28}.$$
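The claim that $\beta$ vanishes for both choices can be checked in exact arithmetic. The sketch below evaluates the symmetric expression of Eq.(44) with Python fractions; the function name `beta` is hypothetical, and since only the zeros of the numerator matter here, the check does not depend on the overall sign convention of Eq.(44).

```python
from fractions import Fraction as Fr
from itertools import combinations
from math import prod

def beta(c4, c5, c6, c7):
    """Right-hand side of Eq.(44) in terms of elementary symmetric functions."""
    c = [c4, c5, c6, c7]
    e1 = sum(c)
    e2 = sum(prod(p) for p in combinations(c, 2))
    e3 = sum(prod(p) for p in combinations(c, 3))
    e4 = prod(c)
    return (70 * e4 - 35 * e3 + 21 * e2 - 14 * e1 + 10) / (420 * e4)

beta_set1 = beta(Fr(2, 7), Fr(2, 5), Fr(4, 5), Fr(1, 5))
beta_set2 = beta(Fr(11, 28), Fr(22, 25), Fr(4, 5), Fr(28, 575))
```

Both values come out as exactly zero, so for these abscissas the second-derivative term drops out of the final update and the leading error contribution of the numerical differentiation is governed by Eq.(45).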

3.3. Two sets of parameters for eight-stage seventh order limiting formulas

Finally, $c_3$ is left to be determined. We will determine the value of $c_3$ in
consideration of the magnitudes of the parameters $a_{i1}$, $a_{i3}$ and $\alpha_i$, which depend on $c_3$. We
get

$$c_3 = \frac{1}{9} \quad \text{for } c_4 = \frac{2}{7}$$

and

$$c_3 = \frac{7}{20} \quad \text{for } c_4 = \frac{11}{28}.$$

Now, two sets of abscissas are obtained:

$$(c_3, c_4, c_5, c_6, c_7) = \Bigl( \frac{1}{9}, \frac{2}{7}, \frac{2}{5}, \frac{4}{5}, \frac{1}{5} \Bigr) \qquad (46)$$

and

$$(c_3, c_4, c_5, c_6, c_7) = \Bigl( \frac{7}{20}, \frac{11}{28}, \frac{22}{25}, \frac{4}{5}, \frac{28}{575} \Bigr). \qquad (47)$$

Substituting Eq.(46) into Eqs.(27), (29), (30), (31), (33), (34), (35), (36) and (37),
we get formula 1. The parameters of this formula are shown in Table 1. The stability

Table 1. Formula 1

c, an «i3 a,4 Oi5 a i7 ai


1 l l
9 9 162
ue 216 10
7 343 343 343
2 54 216 196 18
5 125 625 625 625
4 52 864 2499 o 68
5 125 625 625 o 625
1 4833 459 12789 17 1 37
5 8000 625 40000 64 64 625
1 5489 27 11879 9125 11125 500 9
j. 1344 7 1728 1.144 120% 63 14
3 16807 125 1625 125 7
h 64 0 25920 192 5184 Sin- 120

region of this formula is not so large, but its parameters are numbers requiring
a comparatively small number of digits. For Eq.(47), we get formula 2. This formula
is nearly the best formula from the viewpoint of the stability region. The parameters of
formula 2 are shown in Table 2.

4. Eight-stage numerically seventh order formulas

The eight-stage formulas which achieve numerically the same accuracy as the
seventh order limiting formulas are obtained by replacing derivatives with the simplest
numerical differentiations. Namely, in the formula given by Eq.(l) we compute as
follows:

/ i = /(**. if*),

m
where

t = V ? / I
= ( 33554432/, f o r 1 4 t o t h e b a s e 1 6
( w )

o r
A j 'giffi ^ 8 digits to the base 16.

Replacing $F_2$ in Eq.(1) with $\tilde F_2$, and using the parameters of formula 1 and formula 2,
we get two eight-stage numerically seventh order formulas without derivatives. We
will call them formula 1' and formula 2' respectively.

Table 2. Formula 2

Fig. 4. Largest errors in numerical solution of example 1 at the last step.

5. Numerical example and conclusions

To show that formula 1 and formula 2 achieve seventh order, we give the errors
in the numerical solution of a system of equations.
Example 1 Integrate

$$\frac{dy_1}{dt} = y_2 y_3, \quad y_1(0) = 0,$$

$$\frac{dy_2}{dt} = -y_1 y_3, \quad y_2(0) = 1,$$

$$\frac{dy_3}{dt} = -k^2 y_1 y_2, \quad y_3(0) = 1, \quad k^2 = 0.51$$

over the range [0,60]. The largest errors of both formulas at the last step for
various values of $h$ are shown in Fig. 4. From Fig. 4 we see that both formulas are
exactly of order seven because the accumulated truncation errors of both formulas
are proportional to $h^7$.
Next, to illustrate that the formulas given in section 4 achieve numerically
the same accuracy as the limiting formula, we present the results for an equation solved by
formula 1 and formula 1'.
Example 2 Integrate

dy = _ W I + 1)
J V W
dt 3j/ (te<-6) '

over the range [0, 1]. The errors in numerical solutions at $t = 1$ for various values
of $h$ are shown in Fig. 5. The computations were performed in double and quadruple
precision arithmetic, using $\varepsilon$ for double precision arithmetic given by Eq.(48).
Observations of Fig. 5 are as follows:

(i) Formula 1' achieves the same accuracy as formula 1 for all values of $h$ in
double precision arithmetic.

Fig. 5. Errors in numerical solution of example 2 at $t = 1$.

(ii) The accumulated error caused by numerical differentiation is insignificant in
double precision arithmetic for all values of $h$ and is proportional to $h^3$, as
shown by the results of quadruple precision arithmetic.
In conclusion, we are able to say that formula 1' is efficient for non-stiff
problems. The simplicity of parameters will be preferred, because the explicit Runge-Kutta
formulas are not suitable for stiff systems. Although automatic methods for
differentiation of functions are easily applied to evaluate the derivatives involved in
our limiting formulas, we can achieve the same accuracy by formula 1' without
derivatives. The error caused by the approximation does not become a significant part
of the total error throughout the computation.

Acknowledgments. The author is grateful to Prof. Linda Petzold for her


valuable comments. The author is also grateful to Prof. Taketomo Mitsui for his
kind suggestions.

6. References

1. R. Bulirsch and J. Stoer, Num. Math. 8 (1966) 1-13.
2. J.C. Butcher, The Numerical Analysis of Ordinary Differential Equations (Wiley, New York, 1987).
3. H. Ono, Journal of Information Processing 12 (1989) 251-260.
4. A. Ralston, Math. Comp. 16 (1962) 431-437.
5. M. Tanaka, K. Kasuga, S. Yamashita and H. Yamazaki, Trans. Information
Processing Society of Japan 34 (1993) 62-74 (in Japanese).


A Series of Collocation Runge-Kutta Methods


Taketomo MITSUI
Graduate School of Human Informatics, Nagoya University
Nagoya 464-01, Japan
e-mail: a41794a@nucc.cc.nagoya-u.ac.jp
and
Hiroshi SUGIURA
Department of Information Engineering, Nagoya University
Nagoya 464-01, Japan

ABSTRACT
Collocation Runge-Kutta formulae, a dominant class of implicit methods, are
considered for numerical initial value problems of ODEs. Since they are fully
characterized by their abscissae, we propose a systematic way to generate
collocation Runge-Kutta formulae of the same number of stages by a gradual
change of abscissae. Their orders increase along this change, until they finally
coincide with the Butcher-Kuntzmann formula of the specified number of stages.
Their A-stability is also investigated by representing the stability factor in terms of the
abscissae.

1. Introduction

In solving the numerical initial value problem of ODE

$$\frac{dy}{dx} = f(x, y) \quad (a < x < b), \qquad y(a) = y_0, \qquad (1)$$

the class of Runge-Kutta formulae (RK formulae, in short) is most important among
many discrete variable methods. Especially the stiff problem of Eq. (1) requires
highly sophisticated methods, one of which is the implicit RK formula.
It has the form

$$Y_i = y_n + h \sum_{j=1}^{s} a_{ij} f(x_n + c_j h, Y_j) \quad (i = 1, \dots, s), \qquad (2a)$$

$$y_{n+1} = y_n + h \sum_{i=1}^{s} b_i f(x_n + c_i h, Y_i). \qquad (2b)$$

Here, the interval of integration $[a, b]$ is divided by the step-size $h$ so that the step-points
are given by

$$x_n = a + nh \quad (n = 0, 1, \dots, N). \qquad (3)$$

The real parameters $a_{ij}$, $b_i$ and $c_i$ specify the method. Formula (2), which is called an
$s$-stage RK, is usually assumed to satisfy

$$c_i = \sum_{j=1}^{s} a_{ij} \quad (i = 1, \dots, s) \qquad (4)$$

so that the RK formula gives the same result for a non-autonomous ODE as well as
for its autonomous counterpart.
To specify the formula, i.e. to determine the parameters $a_{ij}$, $b_i$ and $c_i$, the collocation
method seems to be the most simple and powerful. Although the exact definition will
be given in the next section, the sense of the word "collocation" can be explained
by considering the polynomial which interpolates the numerical solution $Y_i$ in (2a)
at the collocation points $x_n$ and $x_n + c_i h$ $(i = 1, \dots, s)$. In the collocation method
one first chooses the distinct abscissae $c_1, c_2, \dots, c_s$; then the other parameters
are uniquely determined. Many known RK formulae, such as the Gauss-Legendre
(also known as Butcher-Kuntzmann), Radau and Lobatto types, belong to this class
(DEKKER-VERWER [4]). Hence the class of methods has attracted researchers' interest
(e.g. SCHNEID [7], WATTS-SHAMPINE [8] and WRIGHT [9]).
The aim of the present paper is to analyse the process of choosing the abscissae
so as to increase the order of consistency, to investigate the stability of the derived
collocation methods, and to reconsider a method proposed by a group of physicists [1],
who called it HIDM (= Higher order Implicit Difference Method).

2. A Construction of Collocation RK Method

2.1. Basic Properties

As usual, let us call the RK method consistent of order $p$ if $p$ is the largest
positive integer satisfying

$$\text{local truncation error of the method} = O(h^{p+1}) \quad \text{as } h \downarrow 0 \qquad (5)$$

for any sufficiently differentiable solution $y(x)$ of Eq.(1). BUTCHER [3] has introduced
the simplifying conditions to easily determine the order from very complicated algebraic
equations of the parameters.
Definition 1 An $s$-stage RK formula is said to satisfy simplifying condition

$A(p)$ if the formula is of order $p$, (6)

$B(p)$ if $\sum_{i=1}^{s} b_i c_i^{k-1} = \frac{1}{k}$ $(k = 1, 2, \dots, p)$, (7)

$C(p)$ if $\sum_{j=1}^{s} a_{ij} c_j^{k-1} = \frac{c_i^k}{k}$ $(i = 1, \dots, s;\ k = 1, 2, \dots, p)$, (8)

$D(q)$ if $\sum_{i=1}^{s} b_i c_i^{k-1} a_{ij} = \frac{b_j (1 - c_j^k)}{k}$ $(j = 1, \dots, s;\ k = 1, 2, \dots, q)$, (9)

$E(p, q)$ if $\sum_{i=1}^{s} \sum_{j=1}^{s} b_i c_i^{k-1} a_{ij} c_j^{l-1} = \frac{1}{l (k + l)}$ $(k = 1, \dots, p;\ l = 1, \dots, q)$. (10)

Definition 2 (BURRAGE [2]) If an $s$-stage RK formula satisfies the conditions $B(s)$
and $C(s)$, then it is called a collocation method.
The implication of the collocation RK method can be considered as follows. For the
sake of notational convenience, we introduce a scaling for the independent variable
by $x = x_n + th$. Then we will consider the differential equation and its approximate
solution in terms of the variable $t$. Moreover, for the RK methods, it suffices
to consider the interval $t \in [0, 1]$. Hereafter, within the present section, we will
restrict ourselves to this situation.
If the RK method (2) is regarded as a method of numerical quadrature, we may
apply the method to a differential equation depending only on $t$,

$$\frac{dy}{dt} = h f(t) \quad (0 < t < 1), \qquad y(0) = 0. \qquad (11)$$

Then it reduces the RK method to the quadrature rule

$$y_1 = h \sum_{i=1}^{s} b_i f(c_i), \qquad (12)$$

where the right-hand side is an approximation of the integral

$$y(1) = h \int_0^1 f(t)\, dt.$$

The condition $B(p)$ implies that Eq. (12) is exact if $f$ is merely a polynomial of $t$ of
degree at most $p - 1$. That is, the quadrature (12) is of order $p$.
From Eq. (11), we have

$$Y_i = h \sum_{j=1}^{s} a_{ij} f(c_j) \qquad (13)$$

for each $i$, which should be considered as an approximation of the integral

$$y(c_i) = h \int_0^{c_i} f(t)\, dt.$$

Let $P(t)$ be a polynomial interpolant of degree at most $s$. Then the derivative of $P(t)$
is integrated exactly at every internal point $c_i$. Thus we obtain from Eq. (13)

$$Y_i - y_n = P(c_i) - P(0) = h \int_0^{c_i} P'(t)\, dt = h \sum_{j=1}^{s} a_{ij} P'(c_j).$$

By virtue of Eq. (2a), the equation

$$f(c_j) = P'(c_j) \quad (j = 1, \dots, s)$$

holds, which means that the condition $C(s)$ implies the interpolancy of the derivative
$P'(t)$ at every internal point $c_i$ $(i = 1, \dots, s)$.

In conclusion, the collocation RK gives the exact values at $t = 1$ for the solution, and
at $t = c_i$ for its derivative, if the solution is a polynomial of degree at most $s$.
Furthermore, we have stronger statements which can be found in 3.2 of DEKKER-VERWER [4].
They lead to the following.
Theorem 1 The collocation RK method of $s$-stage is consistent of order at least $s$ for
general initial value problem Eq. (1) if it has distinct abscissae and nonzero weights.

Theorem 2 Let $s$ distinct abscissae $c_1, \dots, c_s$ be given. Then the simplifying
conditions $B(s)$ and $C(s)$ uniquely define an RK method.
Hence, we can focus on the determination of the abscissae for the collocation RK
method.
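Theorem 2 can be made concrete with a short numerical sketch: given distinct abscissae, the conditions $B(s)$ and $C(s)$ are linear systems with a Vandermonde matrix, and solving them yields the weights and the coefficient matrix. The function name `collocation_tableau` is an assumption for illustration, and NumPy is used only as a convenient linear solver.

```python
import numpy as np

def collocation_tableau(c):
    """Return (A, b) of the collocation RK method with distinct abscissae c.

    b solves B(s): sum_j b_j c_j^(k-1) = 1/k,          k = 1..s
    A solves C(s): sum_j a_ij c_j^(k-1) = c_i^k / k,   k = 1..s, i = 1..s
    """
    c = np.asarray(c, dtype=float)
    s = len(c)
    k = np.arange(1, s + 1)
    V = np.vander(c, s, increasing=True)   # V[j, k-1] = c_j^(k-1)
    b = np.linalg.solve(V.T, 1.0 / k)
    rhs = c[:, None] ** k / k              # rhs[i, k-1] = c_i^k / k
    A = np.linalg.solve(V.T, rhs.T).T      # A @ V = rhs
    return A, b

# Two-stage Gauss-Legendre abscissae as sample input.
g = np.sqrt(3.0) / 6.0
A_gauss, b_gauss = collocation_tableau([0.5 - g, 0.5 + g])
```

For the two Gauss points this reproduces the familiar two-stage Gauss-Legendre coefficients; for large $s$ the Vandermonde systems become ill-conditioned, so a production code would prefer a more stable construction.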

2.2. Collocation Method of Lower Order

Because of the interpolating property at the internal point $t = c_i$, we will consider
a polynomial on [0,1] of degree $s$ which interpolates the solution $y(t)$. Usually it can
be constructed on the fixed interpolating points $\xi_0, \xi_1, \dots, \xi_s$. Its Lagrange form is
given by

$$L_s(t) = \sum_{j=0}^{s} y(\xi_j)\, l_j(t) \quad \text{where} \quad l_j(t) = \prod_{k \ne j} \frac{t - \xi_k}{\xi_j - \xi_k}.$$

Theorem 3 For fixed distinct points $\xi_0, \dots, \xi_s$ on [0,1], the abscissae $c_1, \dots, c_s$ of
the $s$-stage collocation method satisfy the equation

$$\varphi_s'(c_i) = 0 \quad \text{where} \quad \varphi_s(t) = \prod_{j=0}^{s} (t - \xi_j) = (t - \xi_0)(t - \xi_1) \cdots (t - \xi_s).$$

Proof. The $s$-stage collocation method yields the condition $C(s)$. The solution $y(t)$
is written as

$$y(t) = L_s(t) + \frac{y^{(s+1)}(\theta)}{(s+1)!} \varphi_s(t) \quad (0 < \theta < 1)$$

if it is $(s+1)$-times continuously differentiable. Then we obtain

$$y'(t) = L_s'(t) + \frac{d}{dt} \Bigl\{ \frac{y^{(s+1)}(\theta)}{(s+1)!} \varphi_s(t) \Bigr\}$$

and the abscissae $c_1, \dots, c_s$ must satisfy the identities $y'(c_i) = L_s'(c_i)$ $(i = 1, \dots, s)$
because of the interpolancy of the internal points. □
This is another expression of Theorem 2 in NØRSETT-WANNER [5].
The most conceivable choice for the interpolating points on [0,1] is the
Newton-Cotes type equidistant distribution $\xi_j = j/s$ $(j = 0, 1, \dots, s)$. In this case,
the determining equation for the abscissae $c_1, \dots, c_s$ is the algebraic equation

$$\varphi_s'(t) = 0, \quad \text{where} \quad \varphi_s(t) = \prod_{j=0}^{s} \Bigl( t - \frac{j}{s} \Bigr). \qquad (14)$$


It is obvious that Eq. (14) has $s$ distinct roots on (0,1), situated as $(i - 1)/s < c_i < i/s$
$(i = 1, \dots, s)$ if they are labelled as $c_1 < c_2 < \cdots < c_s$.
Since the Newton-Cotes type formula of $s$-stage is a collocation one, it is consistent
of order at least $s$. The following theorem gives its actual order.
Theorem 4 The $s$-stage Newton-Cotes type formula is consistent of order $s + 1$ for
odd $s$, of order $s + 2$ for even $s$.
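The statement of Theorem 4 can be explored numerically. The sketch below computes the Newton-Cotes type abscissae as the roots of $\varphi_s'(t)$ and then finds the largest $p$ for which the quadrature condition $B(p)$ holds with the interpolatory weights. It checks only the quadrature order, which is the quantity the proof reduces to, and the function names and the tolerance are assumptions for illustration.

```python
import numpy as np

def newton_cotes_type_abscissae(s):
    """Roots of phi_s'(t), where phi_s(t) = prod_{j=0}^{s} (t - j/s)."""
    phi = np.poly1d(np.arange(s + 1) / s, r=True)
    return np.sort(phi.deriv().roots.real)

def quadrature_order(c, tol=1e-10):
    """Largest p such that B(p) holds for the interpolatory weights on c."""
    c = np.asarray(c, dtype=float)
    s = len(c)
    V = np.vander(c, s, increasing=True)
    b = np.linalg.solve(V.T, 1.0 / np.arange(1, s + 1))   # B(s) by construction
    p = s
    # B(p+1) adds the condition sum_i b_i c_i^p = 1/(p+1)
    while abs(b @ c**p - 1.0 / (p + 1)) < tol:
        p += 1
    return p

orders = [quadrature_order(newton_cotes_type_abscissae(s)) for s in (2, 3, 4, 5)]
```

For $s = 2, 3, 4, 5$ this gives quadrature orders 4, 4, 6, 6, in agreement with $s + 2$ for even $s$ and $s + 1$ for odd $s$.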
The proof of the Theorem requires a Lemma concerning a kind of orthogonality of polynomials.
Lemma 1 Let $\phi_{p,q}(t)$ be a polynomial of degree $(p + 2q - 1)$ defined by

$$\phi_{p,q}(t) = t^q (t - 1)^q \prod_{j=1}^{p-1} \Bigl( t - \frac{j}{p} \Bigr)$$

with $p, q \in \mathbb{N}$. Then we have

$$\int_0^1 \phi_{p,q}(t)\, dt \ \begin{cases} = 0 & \text{for even } p, \\ \ne 0 & \text{for odd } p. \end{cases}$$

Furthermore, let $\phi^*_{p,q}(t)$ be a polynomial of degree $(p + 2q)$ given by $\phi^*_{p,q}(t) = t \phi_{p,q}(t)$,
then we have

$$\int_0^1 \phi^*_{p,q}(t)\, dt \ne 0.$$

The Lemma can be shown through tedious but straightforward calculations on polynomials.
Hence we omit it.
Proof of Theorem 4. It is known that the simplifying conditions $B(p)$, $C(q)$ and
$D(r)$ imply the condition $A(p)$ provided the inequality $p \le \min(q + r + 1, 2q + 2)$ holds.
Hence it suffices to prove that the conditions $B(s + 1)$ and $B(s + 2)$ are satisfied for
the odd and the even cases, respectively.
The condition $B(s + 1)$ means that the Newton-Cotes formula is exact for a
polynomial $f(t)$ of degree at most $s$ as the numerical quadrature rule for Eq. (11).
So, let $f(t)$ be a polynomial of degree $s$. Then the division of $f(t)$ by $\varphi_s'(t)$ yields the
quotient and remainder polynomials of degree 0 and of degree less than $s$, respectively:
$f(t) = q \varphi_s'(t) + r_{s-1}(t)$. Thus

$$y(1) = h \int_0^1 \{ q \varphi_s'(t) + r_{s-1}(t) \}\, dt = h \int_0^1 r_{s-1}(t)\, dt.$$

On the other hand, the identity

$$\sum_{i=1}^{s} b_i f(c_i) = \sum_{i=1}^{s} b_i \{ q \varphi_s'(c_i) + r_{s-1}(c_i) \} = \sum_{i=1}^{s} b_i r_{s-1}(c_i)$$

holds. The polynomial $r_{s-1}$ is of degree less than $s$ and $c_1, \dots, c_s$ are distinct. Thus
$\sum_i b_i r_{s-1}(c_i)$ is equal to $\int_0^1 r_{s-1}(t)\, dt$.
Next, let $f(t)$ be a polynomial of degree $s + 1$. Then we have $f(t) = q(t) \varphi_s'(t) +
r_{s-1}(t)$, where $q(t) = q_1 t + q_0$ and $r_{s-1}$ is a polynomial of degree less than $s$. Similarly
to the above, we can deduce

$$y(1) = h \int_0^1 \{ q(t) \varphi_s'(t) + r_{s-1}(t) \}\, dt = -h q_1 \int_0^1 \varphi_s(t)\, dt + h \int_0^1 r_{s-1}(t)\, dt$$

because $\varphi_s(1) = \varphi_s(0) = 0$.
By virtue of Lemma 1 in the case of $p = s$ and $q = 1$, we have

$$\int_0^1 \varphi_s(t)\, dt \ \begin{cases} = 0 & \text{for even } s, \\ \ne 0 & \text{for odd } s. \end{cases}$$

This implies that the condition $B(s + 2)$ is satisfied for even $s$ while it cannot be for
odd $s$. But if $q(t)$ is of degree 2, i.e. if $f(t)$ is of degree $s + 2$, Lemma 1 states
$\int_0^1 t \varphi_s(t)\, dt \ne 0$. Hence $B(s + 3)$ can never be satisfied for even $s$. □
A physicists' group (ABE et al. [1]) presented a discrete variable method for ODEs
which can be found to be equivalent to the Newton-Cotes type collocation method.
They called it HIDM, which sounds slightly unsuitable. (See Appendix.)

2.3. A Series of Methods of the Same Stage Number

To increase the order of consistency for the s-stage collocation method, we will
make a gradual change of the distribution of the interpolating points £ o , . . . , £ which a

causes the change of the distribution of the collocation points Q , . . . , c . s

Let $\psi_k(t)$ be the polynomial of degree $(s - k - 1)$ given by

1 for $k = s - 1$.
Moreover, define the polynomial $\varphi_{s,k}(t)$ of degree $s+1$ by
$$\varphi_{s,k}(t) = \frac{d^k}{dt^k}\left\{t^{k+1}(t-1)^{k+1}\psi_k(t)\right\} \qquad (k = 0, 1, \ldots, s-1). \qquad (16)$$
Note that the identity $\varphi_{s,0}(t) = \varphi_s(t)$ holds for $\varphi_s(t)$ defined in Eq. (14). Since $t = 0$ and $t = 1$ are both $(k+1)$-ple roots of $t^{k+1}(t-1)^{k+1}\psi_k(t)$, we have $\varphi_{s,k}(0) = \varphi_{s,k}(1) = 0$. Obviously the $s+1$ distinct real roots of $\varphi_{s,k}(t) = 0$ are located as $\xi_0\,(= 0) < \xi_1 < \cdots < \xi_s\,(= 1)$.
Thus, $\varphi_{s,k}'(t) = 0$ can be a determining equation for the abscissae of the collocation method.
Theorem 5 The s-stage collocation formula determined by the equation
$$\varphi_{s,k}'(t) = 0 \qquad (17)$$
is consistent of order $s + k + 1$ if $s - k$ is odd, and of order $s + k + 2$ if $s - k$ is even.

Table 1: The order of consistency for (s, k).

s\k    0    1    2    3    4    5   ···
 2     4    4
 3     4    6    6
 4     6    6    8    8
 5     6    8    8   10   10
 6     8    8   10   10   12   12

Proof is similar to the one for the previous Theorem. The only difference is that
repeated application of integration by parts enables us to attain the exactness of the
quadrature for polynomials of required degree. Thus we omit it.
Theorems 4 and 5 bring a table showing the increase of the order of consistency
in this series. Refer to Table 1. Since the stretching of variable by $\tau = 2t - 1$
implies the equivalence of $\varphi_{s,s-1}'(\tau)$ and the Legendre polynomial of degree $s$ through
Rodrigues' formula, we can readily show the following.
Corollary 1 The s-stage collocation method determined by the equation
$$\varphi_{s,s-1}'(t) = 0$$
is nothing but the s-stage Gauß-Legendre (Butcher-Kuntzmann) method.
On closing the present section, we have established a way to generate a series of s-stage
collocation methods starting from the Newton-Cotes type ((s+1)-st or (s+2)-nd order)
up to the Gauß-Legendre formula (2s-th order), increasing the order of consistency
two by two.
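
The order rule of Theorem 5 can be tabulated mechanically. The following is only a small sketch in Python; the function names `consistency_order` and `order_table` are illustrative and do not appear in the paper, and the rule is assumed to cover the k = 0 column (Theorem 4) as Table 1 suggests.

```python
def consistency_order(s: int, k: int) -> int:
    """Order of consistency for the pair (s, k) as stated in Theorem 5:
    s + k + 1 if s - k is odd, s + k + 2 if s - k is even."""
    if not 0 <= k <= s - 1:
        raise ValueError("k must satisfy 0 <= k <= s - 1")
    return s + k + 1 if (s - k) % 2 == 1 else s + k + 2


def order_table(s_max: int) -> dict:
    """Rows of Table 1: for each s = 2..s_max the orders for k = 0..s-1."""
    return {s: [consistency_order(s, k) for k in range(s)]
            for s in range(2, s_max + 1)}
```

For k = s - 1 the rule always gives 2s, in agreement with Corollary 1 (the Gauß-Legendre case).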

3. A-Stability of the Collocation RK Method

We employ the Butcher array for the formula parameters of RK. Let $A$, $b$ and $e$
be the matrix and the vectors given by $A = (a_{ij})$ $(1 \le i, j \le s)$, $b = (b_1, \ldots, b_s)^T$ and
$e = (1, 1, \ldots, 1)^T$, respectively. The stability factor $R(z)$ of the RK method by Eq. (2)
is known to be given by
$$R(z) = \frac{\det(I - zA + z\,e\,b^T)}{\det(I - zA)}. \qquad (18)$$

3.1. Derivation of the Stability Factor

In the s-stage collocation method, $A$ and $b$ are uniquely determined by the set of
the abscissae $c_1, \ldots, c_s$ provided that they are distinct (Th. 2). Hence we can conjecture
that the stability factor Eq. (18) should be expressed only with the abscissae. Define
the polynomial
$$\pi_s(t) = \prod_{j=1}^{s}(t - c_j) = t^s + \sum_{j=0}^{s-1} p_j t^j, \qquad (19)$$
which is the derivative of $w_s(t)$ in Th. 3 divided by its leading coefficient to make it
monic. The coefficients $p_j$ $(j = 0, \ldots, s-1)$ can be represented by the fundamental
symmetric functions of $c_1, \ldots, c_s$.


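We introduce the Vandermonde matrix $V$ and the diagonal matrices $C$, $S$ as follows:
$$V = \begin{pmatrix} 1 & c_1 & \cdots & c_1^{s-1} \\ 1 & c_2 & \cdots & c_2^{s-1} \\ \vdots & \vdots & & \vdots \\ 1 & c_s & \cdots & c_s^{s-1} \end{pmatrix}, \quad C = \mathrm{diag}[c_1, c_2, \ldots, c_s] \quad\text{and}\quad S = \mathrm{diag}\left[1, \tfrac{1}{2}, \ldots, \tfrac{1}{s}\right]. \qquad (20)$$
Then, the simplifying conditions $B(s)$ and $C(s)$ are equivalent to the matrix identities
given by the following (DEKKER-VERWER$^{4}$):
$$B(s):\ b^T V = e^T S, \qquad C(s):\ AV = CVS. \qquad (21)$$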
Let $W$ be the diagonal matrix given by
$$W = \mathrm{diag}[1, 1, 2!, 3!, \ldots, (s-1)!]. \qquad (22)$$
A direct calculation leads to the identity
$$WV^{-1}CVSW^{-1} = WV^{-1}AVW^{-1} = \begin{pmatrix} 0 & & & & -p_0/s! \\ 1 & 0 & & & -1!\,p_1/s! \\ & \ddots & \ddots & & \vdots \\ & & 1 & 0 & -(s-2)!\,p_{s-2}/s! \\ & & & 1 & -(s-1)!\,p_{s-1}/s! \end{pmatrix}.$$
The right-hand side matrix is the companion matrix for the polynomial
$$d(z) = z^s + \sum_{j=0}^{s-1} \frac{j!}{s!}\,p_j z^j, \qquad (23)$$
that is, the identity
$$d(z) = \det\left[zI - WV^{-1}AVW^{-1}\right]$$
holds. Taking the identity
$$A(z) = \det(I - zA) = z^s \det\left(\frac{1}{z}I - WV^{-1}AVW^{-1}\right)$$
into account, we arrive at the following Lemma.


Lemma 2 The denominator polynomial $A(z)$ of the stability factor $R(z)$ is given by
$$A(z) = z^s\,d\!\left(\frac{1}{z}\right).$$

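Next, we will consider the numerator of $R(z)$. Eq. (21) yields
$$A - e\,b^T = (CV - e\,e^T)\,S\,V^{-1}. \qquad (24)$$
The $(i,j)$-th component of the matrix $CV - e\,e^T$ is given by $c_i^{\,j} - 1$. Then, putting
$\gamma_i = c_i - 1$ $(i = 1, \ldots, s)$, we introduce another Vandermonde matrix $U$ and the
diagonal matrix $\Gamma$ by
$$U = \begin{pmatrix} 1 & \gamma_1 & \cdots & \gamma_1^{s-1} \\ 1 & \gamma_2 & \cdots & \gamma_2^{s-1} \\ \vdots & \vdots & & \vdots \\ 1 & \gamma_s & \cdots & \gamma_s^{s-1} \end{pmatrix} \quad\text{and}\quad \Gamma = \mathrm{diag}[\gamma_1, \gamma_2, \ldots, \gamma_s]. \qquad (25)$$
Then we can obtain
$$(A - e\,b^T)\,U = (CV - e\,e^T)\,S\,V^{-1}U = \Gamma U S. \qquad (26)$$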

The shift of the variable $t$ in $\pi_s(t)$ by one derives the polynomial $\rho_s(t)$ by
$$\rho_s(t) = \pi_s(t + 1) = t^s + \sum_{j=0}^{s-1} r_j t^j, \qquad (27)$$
whose coefficients introduce another polynomial
$$e(z) = z^s + \sum_{j=0}^{s-1} \frac{j!}{s!}\,r_j z^j. \qquad (28)$$
Similar to the previous Lemma, we have the following.

Lemma 3 The numerator polynomial $N(z)$ of the stability factor $R(z)$ is given by
$$N(z) = z^s\,e\!\left(\frac{1}{z}\right).$$
Due to the above Lemmas, $R(z)$ is represented by $R(z) = N(z)/A(z)$. This expression
for $R(z)$ was already obtained by NØRSETT-WANNER$^{5}$ (their Theorem 4).
However, our proof is a self-contained algebraic one applying matrix notations, while
theirs is based on the real analysis.
The maximum modulus principle implies the necessary and sufficient conditions
for the A-stability of the collocation RK method as follows:
(S1) All the roots of $A(z)$ belong to the right half-plane $C^+$ of the complex $z$.

(S2) $|N(iy)| \le |A(iy)|$ for every real $y$. (The imaginary unit is written by $i$.)
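
Lemmas 2 and 3 give the stability factor directly from the coefficients of $\pi_s(t)$. The sketch below, in Python with exact rational arithmetic, is one possible way to evaluate them; the function name `stability_polynomials` and the coefficient ordering (lowest degree first) are assumptions made only for this illustration.

```python
from fractions import Fraction
from math import comb, factorial


def stability_polynomials(p):
    """Given p = [p_0, ..., p_{s-1}, 1], the coefficients of the monic
    polynomial pi_s(t) (lowest degree first), return (N, A): the coefficient
    lists, lowest degree first, of the numerator N(z) = z^s e(1/z) and the
    denominator A(z) = z^s d(1/z) of the stability factor R(z)."""
    s = len(p) - 1
    p = [Fraction(x) for x in p]
    # r_j: coefficients of rho_s(t) = pi_s(t + 1), via the binomial theorem
    r = [sum(p[i] * comb(i, j) for i in range(j, s + 1)) for j in range(s + 1)]

    def reversed_scaled(q):
        # coefficient of z^m in z^s * sum_j (j!/s!) q_j z^(-j) is ((s-m)!/s!) q_{s-m}
        return [Fraction(factorial(s - m), factorial(s)) * q[s - m]
                for m in range(s + 1)]

    return reversed_scaled(r), reversed_scaled(p)
```

For the two-stage Gauß-Legendre abscissae, $\pi_2(t) = t^2 - t + 1/6$, the call `stability_polynomials([Fraction(1, 6), -1, 1])` reproduces the familiar (2,2) Padé approximant of $\exp(z)$.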

3.2. A-Stability of Symmetric Methods

Practically occurring formulae have a special feature for the abscissae.


Definition 3 If the location of the abscissae $c_1, c_2, \ldots, c_s$ is symmetric with respect
to 1/2, the collocation RK method of these abscissae is called symmetric.
Theorem 6 If the collocation method is symmetric, it is A-stable iff all the zeros of
$A(z)$ are in $C^+$.
Sketch of Proof. The symmetricity means that if $c$ is a root of $\pi_s(t)$, then $1 - c$ is a
root of $\pi_s(t)$. Thus we have $(-1)^s\pi_s(1 - t) = \pi_s(t)$, which implies $\rho_s(t) = \pi_s(t + 1) = (-1)^s\pi_s(-t)$. This identity yields a relationship between $r_j$ and $p_j$, which implies the
equation $N(z) = A(-z)$. Therefore the condition (S2) follows immediately. □

By the Theorem it suffices to investigate the location of the roots of $A(z)$ alone to decide whether
the symmetric collocation method, which includes the class given in the previous
Section, is A-stable. For instance, a sufficient condition can be given by the following
proposition.
Theorem 7 If all the roots of the truncated Taylor series expansion of exp(z) of
order m

are in $C^+$, then any symmetric collocation method of m-stage is A-stable.
Proof. We note that $A(z)$, which can be written as

through Eq. (23), is composed by the following two polynomials

in the sense

Grace's theorem (e.g. PÓLYA-SZEGŐ$^{6}$, p. 60) states that every zero $z$ of $A(z)$ has the
form $z = -\theta_k\zeta$, where $\theta_k$ is a certain root of $g(z)$ and $\zeta$ is a suitably chosen point in
a circular domain including all the zeros of $f(z)$. We readily see
$$g(z) = \prod_{j=1}^{s}(1 - c_j z) = (1 - c_1 z)(1 - c_2 z)\cdots(1 - c_s z),$$

Table 2: A-stability of the series of collocation RK


u ]
2 a 4 s 1 7 8 —— 11 12 u 14 15 16 -rr 19 20
2 U 50 0
3 Q
4 0 o 0 Q
S 0 0 0 Q 0
6! 0 0 0 0 0 o
Q 0 0 o 0 0 0
OO
OO
OO
oo

oo

oo
a 0
OO

9 X o
10 M o o Q 0 o 0 0 o o
11 XM 0 0 0 o o o o Q o o
12 X o 0 0 o 0 o o o 0 0 o
13 X 0 o o o o o o o o o 0
14 X 0 o o o o o 5 0 o o 0 o
15 X X o 0 0 0 0 0 0 o 0 o 0 0 0
16 X X 0 o o o 0 0 Q 0 o o 0 o o 0
17 XX X X o o o o 0 o 0 o o 0 o 0 0 o
IS X • 0 o o o 0 o Q Q o 0 o 0 0 0 0
19 X X X 0 o o o 0 Q 5 o o 0 0 0 o 0 5 0
20 X X X 0 0 0 o o o 0 o 0 o o 0 0 0 o 0 0
21 X X • • o o o o o Q o 0 o o o o o o o 0 0

which implies $\theta_j = 1/c_j$ $(j = 1, \ldots, s)$. Hence every zero of $A(z)$ lies in $C^+$ under our
assumptions. □
The investigation of A-stability should be, however, carried out for each case.
For the Newton-Cotes type methods, the collocation points are given by Eq. (14).
The polynomial $\pi_s(t)$ is then given by $\pi_s(t) = \frac{1}{s+1}\varphi_s'(t)$, where $\varphi_s(t)$ is defined in Eq. (14).
Taking the symmetricity of the abscissae of the Newton-Cotes type methods
into account, we can apply Theorem 6 and Lemma 2 to investigate the stability of
the methods. The question is whether all of the zeros of $A(-z)$ are in $C^-$. The
symbolic and algebraic computation by computer gives the polynomials $A(-z)$ and,
furthermore, algebraic computations of the Routh-Hurwitz criteria give the following.
Theorem 8 All the Newton-Cotes type collocation methods whose stage number is
less than 9, that is, the methods whose abscissae are determined by Eq. (14) for $s \le 8$,
are A-stable.
Further computations of many principal minors for various pairs of indices (s, k) yield
Table 2 to discriminate the series of the collocation methods derived in Subsection 2.3
with respect to A-stability. Here the ○ mark means the formula of this pair (s, k) is
A-stable while the × mark means it fails. We remark that the Butcher-Kuntzmann
method has been known to be A-stable for any s.

4. Acknowledgement
The authors are indebted to W. HUNDSDORFER for his suggestion of the equivalence
of HIDM to implicit RK formula. They are also grateful to T. KOTO, CH. SUZUKI
and T. WATANABE for their stimulating discussions.

5. References

1. K. Abe, A. Ishida, T. Watanabe, Y. Kanada and K. Nishikawa, HIDM - New
numerical method for differential equations, Kakuyugo-kenkyu 57(1987) 85-95 (in
Japanese).
2. K. Burrage, High order algebraically stable Runge-Kutta methods, BIT 18(1978)
373-383.
3. J.C. Butcher, Implicit Runge-Kutta processes, Math. Comp. 18(1964) 50-64.
4. K. Dekker and J.G. Verwer, Stability of Runge-Kutta Methods for Stiff Nonlinear
Differential Equations (North-Holland, Amsterdam, 1984).
5. S.P. Nørsett and G. Wanner, The real-pole sandwich for rational approximations
and oscillation equations, BIT 19(1979) 79-94.
6. G. Pólya and G. Szegő, Problems and Theorems in Analysis II (Springer-V., Berlin,
1976).
7. J. Schneid, Stability properties of collocation methods, BIT 28(1988) 184-187.
8. H.A. Watts and L.F. Shampine, A-stable block implicit one-step methods, BIT
12(1972) 252-266.
9. K. Wright, Some relationships between implicit Runge-Kutta, collocation and
Lanczos τ methods, and their stability properties, BIT 10(1970) 217-227.

Appendix. Equivalence of HIDM to the collocation RK

Originally HIDM is a discrete variable method for Eq. (1) stated as follows:
Let $x_j$ be the step-points with the step-size $h$,
$$x_j = a + jh \qquad (j = 0, 1, \ldots, m),$$
and $y_j$ be the approximate solution at $x_j$. Let us denote the fractional points by $\eta_1, \eta_2, \ldots, \eta_m$ $(\eta_j = a + s_j h)$, and the approximate solution and derivative at $\eta_j$ for $y(x)$ by
$Y_j$ and $Y_j'$, respectively. Assume that $Y_j$ and $Y_j'$ are given by
$$Y_j = \sum_{k=0}^{m} C_{jk}\,y_k, \qquad hY_j' = \sum_{k=0}^{m} D_{jk}\,y_k. \qquad (A.1)$$
Here the coefficients $\{C_{jk}\}$ and $\{D_{jk}\}$ are chosen so that the relations
$$y(\eta_j) - \sum_{k=0}^{m} C_{jk}\,y(x_k) = O(h^{m+1}), \qquad h\,\frac{dy}{dx}(\eta_j) - \sum_{k=0}^{m} D_{jk}\,y(x_k) = O(h^{m+1}) \qquad (A.2)$$
hold for any sufficiently smooth function $y(x)$. The series $\{y_k\}$ is determined by
$$Y_j' = f(\eta_j, Y_j) \qquad (j = 1, 2, \ldots, m). \qquad (A.3)$$

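Once the coefficients $\{C_{jk}\}$, $\{D_{jk}\}$ and the points $\{\eta_j\}$ are specified, then the
process can be expressed as follows. Let $y$, $Y$, $Y'$ be the m-dimensional vectors given
by
$$y = (y_1, \ldots, y_m)^T, \qquad Y = (Y_1, \ldots, Y_m)^T, \qquad Y' = (Y_1', \ldots, Y_m')^T.$$
From the second identity in Eq. (A.1) we have
$$hY' = d\,y_0 + \mathcal{D}y,$$
where the column vector $d$ and the m-square matrix $\mathcal{D}$ are given by
$$d = \begin{pmatrix} D_{10} \\ \vdots \\ D_{m0} \end{pmatrix}, \qquad \mathcal{D} = \begin{pmatrix} D_{11} & \cdots & D_{1m} \\ \vdots & & \vdots \\ D_{m1} & \cdots & D_{mm} \end{pmatrix}.$$
Thus, provided that $\mathcal{D}$ is non-singular, we have
$$y = -\mathcal{D}^{-1}d\,y_0 + h\,\mathcal{D}^{-1}Y'. \qquad (A.4)$$
Similarly the identity
$$Y = c\,y_0 + Cy \qquad (A.5)$$
holds, where
$$c = \begin{pmatrix} C_{10} \\ \vdots \\ C_{m0} \end{pmatrix}, \qquad C = \begin{pmatrix} C_{11} & \cdots & C_{1m} \\ \vdots & & \vdots \\ C_{m1} & \cdots & C_{mm} \end{pmatrix}.$$
Substitution of Eq. (A.4) into Eq. (A.5) yields
$$Y = (c - C\mathcal{D}^{-1}d)\,y_0 + h\,C\mathcal{D}^{-1}Y'. \qquad (A.6)$$
Let the symbol $[\,\cdot\,]_j$ denote the j-th component of an m-dimensional column vector. Then
from Eq. (A.4) $y_m$ is given by
$$y_m = [-\mathcal{D}^{-1}d]_m\,y_0 + h\,[\mathcal{D}^{-1}Y']_m. \qquad (A.7)$$
If we can establish the identities $[-\mathcal{D}^{-1}d]_m = 1$ and $c - C\mathcal{D}^{-1}d = e$, Eqs. (A.3)
and (A.7) are written as
$$y_m = y_0 + h\,[\mathcal{D}^{-1}Y']_m, \qquad (A.8a)$$
$$Y' = \mathbf{f}\left(\eta,\ y_0 e + h\,C\mathcal{D}^{-1}Y'\right), \qquad (A.8b)$$
where $\mathbf{f}$ is the m-dimensional column vector whose j-th component is equal to
$f\left(\eta_j,\ y_0 + h\,[C\mathcal{D}^{-1}Y']_j\right)$. Thus, substituting $h = H/m$ and denoting $x_0 = x_n$, $x_m = x_{n+1}$, $y_0 = y_n$, $y_m = y_{n+1}$, we obtain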


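$$Y_j = y_n + H\sum_{k=1}^{m} a_{jk}\,f\!\left(x_n + \frac{s_k}{m}H,\ Y_k\right), \qquad y_{n+1} = y_n + H\sum_{k=1}^{m} b_k\,f\!\left(x_n + \frac{s_k}{m}H,\ Y_k\right),$$
where $a_{jk}$ is the $(j, k)$-th component of $C\mathcal{D}^{-1}/m$ and $b_j$ is the j-th component of the
last row vector of $\mathcal{D}^{-1}/m$. This is nothing but an implicit RK formula (2) of m-stage
with step-size $H\,(= mh)$.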
Next, considering the special case $y(x) = 1$ for the constraints (A.2), we have the
identities
$$\sum_{k=0}^{m} C_{jk} = 1 \quad\text{and}\quad \sum_{k=0}^{m} D_{jk} = 0 \qquad (j = 1, \ldots, m).$$
These imply
$$c + Ce = e \quad\text{and}\quad d + \mathcal{D}e = 0.$$
Hence we arrive at the identities
$$e = -\mathcal{D}^{-1}d \quad\text{and}\quad c - C\mathcal{D}^{-1}d = e,$$
which are expected in the above.


Moreover, the constraints (A.2) and (A.7) mean that if the solution $y(x)$ is a
polynomial of degree at most $m$, the approximations $y_m$ and $Y_j'$ $(j = 1, \ldots, m)$ are
exact for $y(x_m)$ and $\frac{dy}{dx}(\eta_j)$, respectively, at the next step-point and collocation points
of the step-size $H = mh$. Hence we have a collocation RK formula. From the
derivation of HIDM, it is obvious that the equidistant points $x_j = a + (j/m)H$ stand
for the interpolating points with the step-size $H$. This implies that the formula is
the Newton-Cotes type mentioned in Section 2, and that the formula parameter is
uniquely determined. It also assures the nonsingularity of the matrix $\mathcal{D}$.

FOURTH ORDER P-STABLE BLOCK METHOD FOR SOLVING
THE DIFFERENTIAL EQUATION y" = f(x,y)
Kazufumi OZAWA
Education Center for Information Processing, Tohoku University
Aoba-ku, Sendai 980-77, Japan
E-mail: 0zawa@dai3.is.tahDku.ac.jp

ABSTRACT
A certain type of P-stable block method is derived for solving the second order
initial value problem y" = f(x,y). The block method considered here computes
the numerical solutions simultaneously at two points of x, and is easily
parallelisable. A technique to reduce the local truncation error of the method
is also developed.

1. Introduction

The second order initial value problems (IVPs) of the form
$$y'' = f(x, y), \qquad y(x_0),\ y'(x_0)\ \text{given} \qquad (1)$$
are of frequent occurrence in practice, so it is particularly important to develop efficient
methods for these problems. To solve the problems one uses the family of linear
multistep (LM) methods of the form
$$\sum_{j=0}^{k} \alpha_j\,y_{n+j} = h^2\sum_{j=0}^{k} \beta_j\,f_{n+j}, \qquad (2)$$
$$x_i = x_0 + ih, \qquad f_i = f(x_i, y_i),$$
where $h$ is the step-size and $y_i$ is an approximation to the solution $y(x_i)$. Hereafter
we assume that the function $f(x, y)$ satisfies the Lipschitz condition with respect to
$y$. The LM method of the type (2) is said to be consistent if
$$\rho(1) = \rho'(1) = 0, \qquad \rho''(1) = 2\sigma(1),$$
where $\rho(\zeta)$ and $\sigma(\zeta)$ are the polynomials defined by
$$\rho(\zeta) = \alpha_k\zeta^k + \alpha_{k-1}\zeta^{k-1} + \cdots + \alpha_0, \qquad \sigma(\zeta) = \beta_k\zeta^k + \beta_{k-1}\zeta^{k-1} + \cdots + \beta_0.$$
The consistent LM method (2) is said to be of order $p$ if the power series expansion
of the operator
$$L[y(x); h] := \sum_{j=0}^{k} \alpha_j\,y(x + jh) - h^2\sum_{j=0}^{k} \beta_j\,y''(x + jh) \qquad (3)$$
satisfies $L[y(x); h] = O(h^{p+2})$ for all sufficiently differentiable functions $y(x)$.
In the family (2) the Störmer-Cowell methods (see e.g. Hairer$^{9}$ and Henrici$^{11}$) are
the most commonly used ones. It is, however, well known that the methods with step
number greater than 2 exhibit an orbital instability for the test equation
$$y'' = -\omega^2 y, \qquad (4)$$
where $\omega$ is real; the numerical solutions generated by such methods do not stay on
the circular orbit but spiral inwards. On the other hand, it is also known that the
Numerov method, the 2-step Störmer-Cowell type method, is unstable for a large step-size $h$.
The interval of $H^2 = (\omega h)^2$ is called the interval of periodicity, if the method
(2) with any $H^2$ within this interval gives a periodical solution. The method having
interval of periodicity $(0, \infty)$, which is expected to be stable for any step-size $h > 0$, is
said to be P-stable$^{13}$. For the P-stable method Dahlquist$^{7}$ and Lambert and Watson$^{13}$
independently established that the attainable order of the method is 2. However,
Cash$^{2}$ and Chawla$^{4}$ showed that higher order can be attained if one or more off-step
values are used. The multistep methods which use off-step values are often called
hybrid methods. For a certain type of the hybrid P-stable methods QI and Mitsui$^{15}$
gave the attainable order.
Many high order hybrid P-stable formulae have been derived (see e.g. Cash$^{2}$,
Chawla$^{5}$, Hairer$^{10}$, Khiyal$^{12}$, Simos$^{17}$ and Thomas$^{18}$). Among these methods Hairer's$^{10}$
one is the simplest, and is given by
$$y_{n+1} - 2y_n + y_{n-1} = \frac{h^2}{12}\{f_{n+1} + (10 - \gamma)f_n + f_{n-1} + \gamma f_*\}, \qquad (5)$$
$$f_i = f(x_i, y_i),\quad i = n,\ n \pm 1, \qquad f_* = f(x_n, y_*),$$
where the off-step value $y_*$ is given by

V* = » . - J^-A'C/m+l " 2 / + / n - l ) .
n (6)

The method(5) is shown to be P-stable provided that

7 ^ 0 , wk je s/n.

and is of order 4. In this article we shall develop a P-stable method which computes
$y_{n+2}$ using the information available at $x_n$ and can be executed simultaneously with
Hairer's method (5) on parallel computers.

Perhaps the most commonly used parallel algorithms for the numerical IVPs are
the block methods, which consist of two or more LM methods, each of which
computes the numerical solution simultaneously at a different point of $x$.
The block methods are, in general, suitable for coarse-grain parallel computations.
Although a number of block methods have been derived (see Burrage$^{1}$, Chartier$^{3}$,
Chu$^{6}$, Lu$^{14}$, Shampine$^{16}$, Watts$^{19}$ and Zhou$^{20}$), all the methods derived so far are not
for the second order equation such as (1) but for the first order equation $y' = f(x, y)$.
The aim of this paper is to develop the P-stable block method for the second order
IVPs (1).

2. Block method

The block method to be considered here consists of two LM methods, one of which
is Hairer's method (5), and the other one is
$$y_{n+2} - 2y_n + y_{n-2} = h^2\{b_2(f_{n+2} + f_{n-2}) + b_1(\hat f_{n+1} + f_{n-1}) + 2b_0 f_n\}, \qquad (7)$$
$$f_i = f(x_i, y_i),\quad i = n,\ n-1,\ n \pm 2, \qquad \hat f_{n+1} = f(x_{n+1}, \hat y_{n+1}),$$
where the off-step value $\hat y_{n+1}$ is an approximation to $y(x_n + h)$ and is given by
$$\hat y_{n+1} = a(y_{n+2} + y_{n-2}) + 2b\,y_n - y_{n-1} + h^2\{c\,(f_{n+2} + f_{n-2}) + 2d\,f_n\}. \qquad (8)$$
In our block method a pair of methods (5) and (7) compute $y_{n+1}$ and $y_{n+2}$ simultaneously
using the previously computed step values $y_i$ $(i = n-2, n-1, n)$. Note
that the second method (7) does not use the values $y_{n+1}$, $f_{n+1}$, which are the values
to be computed by the first one (5), since the use of these values in the second one
makes it difficult to execute the two methods (5) and (7) simultaneously on parallel
computers.
First of all, we must determine the coefficients $b_0$, $b_1$ and $b_2$ in the second method
(7) so that the method is of order 4, in accordance with that of the first method
(5). To do this, we associate with the method (7) the following difference operator
$L[y(x); h]$:
$$L[y(x); h] := y(x + 2h) - 2y(x) + y(x - 2h) - h^2\{b_2(y''(x + 2h) + y''(x - 2h)) + b_1(y''(x + h) + y''(x - h)) + 2b_0\,y''(x)\}. \qquad (9)$$
Assuming that $y(x)$ is sufficiently often differentiable, we expand the operator $L$ about
$x$ as the Taylor series
$$L[y(x); h] = 2(2 - b_0 - b_1 - b_2)\,y''(x)h^2 + \frac{-3b_1 - 12b_2 + 4}{3}\,y^{(4)}(x)h^4 + \frac{-15b_1 - 240b_2 + 32}{180}\,y^{(6)}(x)h^6 + O(h^8). \qquad (10)$$

The values $b_0$ and $b_1$ for which the terms $O(h^2)$ and $O(h^4)$ vanish are given by
$$b_0 = \frac{9b_2 + 2}{3}, \qquad b_1 = \frac{-12b_2 + 4}{3}. \qquad (11)$$
Substituting these into (10) we have
$$L[y(x); h] = \frac{-15b_2 + 1}{15}\,y^{(6)}(x)h^6 + O(h^8).$$
Therefore, under the assumptions that no previous errors have been made (localizing
assumption), and that $\hat f_{n+1}$ approximates $y''(x_{n+1})$ with an error of order at least
$h^6$, we have for the local truncation error of the method (7)
$$T_2 := y(x_{n+2}) - y_{n+2} = \frac{-15b_2 + 1}{15}\,y^{(6)}(x_n)h^6 + O(h^8). \qquad (12)$$

In order that $\hat f_{n+1}$ be such an approximation it is necessary that $\hat y_{n+1}$ approximates
the solution $y(x_{n+1})$ with an error of order at least $h^6$, since $f(x, y)$ has already
been assumed to satisfy the Lipschitz condition.

To get $\hat y_{n+1}$ satisfying the condition stated above we must consider the following
difference operator $L$:
$$L[y(x); h] := y(x + h) - a\{y(x + 2h) + y(x - 2h)\} - 2b\,y(x) + y(x - h) + a\,\frac{-15b_2 + 1}{15}\,y^{(6)}(x)h^6 - h^2\{c\,(y''(x + 2h) + y''(x - 2h)) + 2d\,y''(x)\}. \qquad (13)$$
Note that the definition of the operator is based on the practical consideration that
$y_{n+2}$ is always subject to the local truncation error when computing $\hat y_{n+1}$ by (8), even
if the localizing assumption has been made. The Taylor series expansion of (13) is
given by
$$L[y(x); h] = 2(-a - b + 1)\,y(x) + (-4a - 2c - 2d + 1)\,y''(x)h^2 + \frac{-16a - 48c + 1}{12}\,y^{(4)}(x)h^4 + \frac{-360ab_2 - 40a - 480c + 1}{360}\,y^{(6)}(x)h^6 + O(h^8). \qquad (14)$$
The coefficients for which the terms up to order 4 in (14) vanish are given by
$$a = \frac{-48d + 23}{80}, \qquad b = \frac{48d + 57}{80}, \qquad c = \frac{8d - 3}{40}. \qquad (15)$$
Substituting the coefficients into (14) we have
$$\hat T_2 := y(x_{n+1}) - \hat y_{n+1} = \frac{3b_2(48d - 23) - 48d + 17}{240}\,y^{(6)}(x_n)h^6 + O(h^8). \qquad (16)$$
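
As a sanity check on (15), the predictor formula (8) can be applied to monomials in the same way. This is a minimal Python sketch under the assumption that exact values of $y(x \pm 2h)$ are used, i.e. the extra term that (13) carries for the truncation error of $y_{n+2}$ is omitted; the function names are hypothetical.

```python
from fractions import Fraction


def predictor_coefficients(d):
    """Coefficients a, b, c of the off-step formula (8) as given in (15)."""
    d = Fraction(d)
    return (-48 * d + 23) / 80, (48 * d + 57) / 80, (8 * d - 3) / 40


def predictor_residual(q, d):
    """Residual of (8) applied to y(t) = t**q at x = 0 with h = 1,
    with exact data at t = -2, -1, 0, 2 (no error term for y_{n+2})."""
    d = Fraction(d)
    a, b, c = predictor_coefficients(d)

    def y(t):
        return Fraction(t) ** q

    def ypp(t):
        return q * (q - 1) * Fraction(t) ** (q - 2) if q >= 2 else 0

    return (y(1) - a * (y(2) + y(-2)) - 2 * b * y(0) + y(-1)
            - (c * (ypp(2) + ypp(-2)) + 2 * d * ypp(0)))
```

With these coefficients the residual vanishes for all powers up to 5 for any $d$, which is the statement that the terms up to order 4 in (14) vanish.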

Thus, we have a 4th order LM method which is suitable for the second method in our
block method.
In the next section we will investigate the conditions under which the method (7)
is P-stable.

3. P-stable analysis

If we solve the test equation (4) by the method (7) then the numerical solution $y_i$
satisfies the recurrence relation
$$R_0(H^2)\,y_{n+2} - 2R_1(H^2)\,y_n + R_0(H^2)\,y_{n-2} = 0, \qquad (17)$$
where $H = \omega h$, and the coefficients $R_0(H^2)$ and $R_1(H^2)$ are the polynomials in $H^2$
given by
$$R_0(H^2) = 1 + \frac{3(3b_2 - 1)(16d - 1) + 20}{60}H^2 + \frac{(3b_2 - 1)(8d - 3)}{30}H^4, \qquad (18)$$
$$R_1(H^2) = 1 + \frac{3(3b_2 - 1)(16d - 1) - 100}{60}H^2 - \frac{4(3b_2 - 1)d}{3}H^4. \qquad (19)$$
A necessary and sufficient condition that the $y_n$ defined by the relation (17) has a
solution of the form $y_n = e^{in\theta}$ ($\theta$ real) for any $H^2 > 0$, that is, the condition for
the method (7) being P-stable, is
$$\forall H^2 > 0, \qquad \left|\frac{R_1(H^2)}{R_0(H^2)}\right| < 1, \qquad (20)$$
or equivalently
$$\forall H^2 > 0, \qquad (R_0(H^2) + R_1(H^2))\,(R_0(H^2) - R_1(H^2)) > 0. \qquad (21)$$
It can be easily seen that the necessary condition for (21) is
$$d \in (-3/32,\ 1/16),$$
since
$$R_0(H^2) - R_1(H^2) = 2H^2 + \frac{(3b_2 - 1)(16d - 1)}{10}H^4,$$
$$R_0(H^2) + R_1(H^2) = 2 + \frac{3(3b_2 - 1)(16d - 1) - 40}{30}H^2 - \frac{(3b_2 - 1)(32d + 3)}{30}H^4.$$
2
Moreover, since R + Ri ~ 2 for small H , we are allowed to consider only the case
0

that RQ + RX > 0 and R -Ri > 0. When d e (-3/32,1/16), necessary and sufficient
0

conditions for both of Ro + Ri and Ro - Ri to be positive are

3*2 - 1 < 0, (22)


34

and

2
m ; = |3(36 -I)(16d-1)-4Q|
2 + 4(36 -l)(32d
2 + 3) < Q ^

The discriminant D(d) has the distinct real zeros d, and d (di < d ) 2 2
962 - 43 + 20^/-3(36 - 1) 2

(24)
48{36s - 1) '
9 6 , - 43 - 2 0 ^ - 3 ( 3 ^ - 1 )
25
* = « f ^ T ) " < )
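
The admissible range of $d$ for a given $b_2$ follows by intersecting $(d_1, d_2)$ with $(-3/32, 1/16)$. A short Python sketch, assuming conditions (22) and (23) together with $d \in (-3/32, 1/16)$ are the complete set of requirements stated in the text; the function name `p_stable_d_interval` is illustrative.

```python
from math import sqrt


def p_stable_d_interval(b2):
    """Return (lo, hi), the intersection of (d1, d2) from (24)-(25) with
    (-3/32, 1/16). The interval is empty when lo >= hi."""
    u = 3.0 * b2 - 1.0
    if u >= 0.0:
        raise ValueError("condition (22) requires 3*b2 - 1 < 0")
    root = 20.0 * sqrt(-3.0 * u)
    d1 = (9.0 * b2 - 43.0 + root) / (48.0 * u)
    d2 = (9.0 * b2 - 43.0 - root) / (48.0 * u)
    return max(d1, -3.0 / 32.0), min(d2, 1.0 / 16.0)
```

For $b_2 = -0.112$ this gives a narrow interval just below $1/16$ that contains the value $d = 0.062$ used in Example 1.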

Since $D(d)$ is negative on $(d_1, d_2)$, the method (7) is P-stable provided that

Here we must investigate the condition for the first interval defined above is being
nonempty. We have for any b 2

d i + 1 9
i=96(1^) { - ^ - ey=i(>r^iy} > o,
which is given by the identity

2 2 2
(19 - 96 ) - ( 8 ^ - 3 ( 3 6 2 - 1 ) ) = (96 + 13) .
2 2

Therefore, if d] < ^ then the interval is shown to be nonempty. A simple computation


shows that this is the case when 6 < —4, Moreover, we have for d
2 2

d 2 2 < 0
l i - ^ 1 2 ( l ^ ( - - ^ ^ ) -

Thus, we have proved that the second method (7) is P-stable, for the parameters 6 2

and d in the intervals


b><--, 1 d,
( 2 7 )

Example 1. Here we solve the equation defined by
$$y'' = -\omega^2 y, \qquad y(0) = 1, \quad y'(0) = 0, \qquad (28)$$
$$y(x) = \cos\omega x, \qquad \omega = 10,$$
to compare the errors of our P-stable block method with those of Hairer's P-stable
method. The errors at $x = 5\pi$, $10\pi$, $15\pi$ and $20\pi$ are shown in Tables 1 and 2. In
this experiment we set $b_2 = -0.112$ and $d = 0.062$.

Table 1. Errors of Hairer's P-stable method.

            h = π/40     h = π/80     h = π/160    h = π/320
x = 5π     -3.16E-03    -1.31E-05    -5.22E-08    -2.05E-10
x = 10π    -1.27E-02    -5.27E-05    -2.09E-07    -8.20E-10
x = 15π    -2.85E-02    -1.19E-04    -1.71E-07    -1.85E-09
x = 20π    -5.06E-02    -2.11E-04    -8.37E-07    -3.28E-09

Table 2. Errors of P-stable block method.

            h = π/40     h = π/80     h = π/160    h = π/320
x = 5π     -5.75E-01    -3.20E-03    -1.33E-05    -5.27E-08
x = 10π    -1.65E+00    -1.28E-02    -5.32E-05    -2.11E-07
x = 15π    -1.96E+00    -2.88E-02    -1.20E-04    -4.75E-07
x = 20π    -1.15E+00    -5.11E-02    -2.13E-04    -8.45E-07

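Example 2. Next we consider the 2-body problem:
$$y_1'' = -\frac{y_1}{r^3}, \qquad y_1(0) = 1, \quad y_1'(0) = 0,$$
$$y_2'' = -\frac{y_2}{r^3}, \qquad y_2(0) = 0, \quad y_2'(0) = 1, \qquad r = \sqrt{y_1^2 + y_2^2}, \qquad (29)$$
$$y_1(x) = \cos x, \qquad y_2(x) = \sin x.$$
In this example, we integrate the equation in the interval $x \in [0, 1000\pi]$, and in order
to compare the accuracies of the methods, compute the maximum orbital errors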
max (30)
D<z„<100*
where $y_{1,n}$ and $y_{2,n}$ are the numerical solutions corresponding to the exact solutions
$y_1(x_n)$ and $y_2(x_n)$, respectively. The results are shown in Tables 3.1 and 3.2.

Table 3.1. Maximum orbital errors (30) of Hairer's P-stable methods.


h = TT/10 h = TT/20 h = TT/40 h = TT/80 h = TT/160
3.407E-04 2.181E-05 1.371E-06 8.584E-08 5.367E-09

Table 3.2. Maximum orbital errors (30) of the P-stable block methods.
h = TT/10 h = TT/20 h = TT/40 h = TT/80 h = TT/160
4.317E-02 4.655E-04 6.613E-06 2.375E-07 1.348E-08

We can see from the tables that although the results of our P-stable block method
are slightly less accurate for both problems, our P-stable method integrates the equation
stably over a long range of intervals.

4. Local extrapolation for P-stable block method

We have seen in the previous section that our P-stable block method does not
necessarily give an accurate result compared with that of Hairer's one. The reason for
this is that the second method (7) of the block method has a large error constant
$(-15b_2 + 1)/15$; the value of the constant at $b_2 = -1/9$ is greater by a factor of 64
compared with that of the first one for the test problem (4). In order to improve
the accuracy of the second method we shall develop an extrapolation technique
such as Milne's device, which is often used to enhance the order of convergence in
conventional LM methods for the first order equations.
In our extrapolation technique we need another approximation to $y(x_{n+2})$ of the
same order. To get such an approximation we use the same method as (7) with a
different set of parameters. We attach the superscript * to all symbols relating to
the second approximation. The method for the second approximation is

$$y^*_{n+2} - 2y_n + y_{n-2} = h^2\{b_2^*(f^*_{n+2} + f_{n-2}) + b_1^*(\hat f^*_{n+1} + f_{n-1}) + 2b_0^* f_n\}, \qquad (31)$$
$$\hat y^*_{n+1} = a^*(y^*_{n+2} + y_{n-2}) + 2b^* y_n - y_{n-1} + h^2\{c^*(f^*_{n+2} + f_{n-2}) + 2d^* f_n\}. \qquad (32)$$
In the method above the free parameters $b_2^*$ and $d^*$ should be chosen in the range
which guarantees P-stability. The local truncation error of the method is given by
$$T_2^* := y(x_{n+2}) - y^*_{n+2} = \frac{-15b_2^* + 1}{15}\,y^{(6)}(x_n)h^6 + O(h^8). \qquad (33)$$
Using two methods (7) and (31) we can easily estimate the local truncation error of
(7). From (12) and (33) we have
$$h^6 y^{(6)}(x_n) = \frac{1}{C^* - C}\,(y_{n+2} - y^*_{n+2}) + O(h^8), \qquad (34)$$
where
$$C = \frac{-15b_2 + 1}{15}, \qquad C^* = \frac{-15b_2^* + 1}{15},$$
and therefore we get
$$y(x_{n+2}) = y_{n+2} + \frac{C}{C^* - C}\,(y_{n+2} - y^*_{n+2}) + O(h^8). \qquad (35)$$
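
The extrapolation step (35) is a simple linear combination of the two approximations. A minimal Python sketch follows; the function names `error_constant` and `extrapolate` are illustrative, and the check uses synthetic values whose errors are exactly proportional to the two error constants.

```python
def error_constant(b2):
    """Error constant (-15*b2 + 1)/15 of method (7), cf. (12) and (33)."""
    return (1.0 - 15.0 * b2) / 15.0


def extrapolate(y, y_star, b2, b2_star):
    """Combine y_{n+2} and y*_{n+2} according to (35).
    Requires b2 != b2_star so that C* - C is nonzero."""
    c = error_constant(b2)
    c_star = error_constant(b2_star)
    return y + c / (c_star - c) * (y - y_star)
```

If `y = exact - C*e` and `y_star = exact - C_star*e` for some common quantity `e`, the combination recovers `exact` up to rounding, which is the leading-order cancellation expressed by (35). No additional function evaluations are involved.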

Thus, we have the following modified block method:
$$\begin{aligned}
y_{n+1} &= 2y_n - y_{n-1} + \frac{h^2}{12}\{f_{n+1} + (10 - \gamma)f_n + f_{n-1} + \gamma f_*\}, \\
y_{n+2} &= 2y_n - y_{n-2} + h^2\{b_2(f_{n+2} + f_{n-2}) + b_1(\hat f_{n+1} + f_{n-1}) + 2b_0 f_n\}, \\
y^*_{n+2} &= 2y_n - y_{n-2} + h^2\{b_2^*(f^*_{n+2} + f_{n-2}) + b_1^*(\hat f^*_{n+1} + f_{n-1}) + 2b_0^* f_n\}, \\
\tilde y_{n+2} &= y_{n+2} + \frac{C}{C^* - C}\,(y_{n+2} - y^*_{n+2}),
\end{aligned} \qquad (36)$$
where the off-step values $y_*$, $\hat y_{n+1}$ and $\hat y^*_{n+1}$ are given by (6), (8) and (32), respectively.
The locally extrapolated value $\tilde y_{n+2}$ in the above algorithm is expected to be more
accurate than $y_{n+2}$. Therefore, if we compute $\tilde y_{n+2}$ and replace $y_{n+2}$ by it before
proceeding to the next block then we can improve the accuracy of our P-stable block
method. Note that the calculations of the three values $y_{n+1}$, $y_{n+2}$ and $y^*_{n+2}$ can be
performed in parallel using three processors, and that the calculation of $\tilde y_{n+2}$ requires
no function evaluations.

Example 3. Let us consider the same equation as that of Example 1. The results by
the modified algorithm are shown in Table 4. In this experiment we take $b_2^* = -0.2$
and $d^* = 0.0622$, and take the same values for $b_2$ and $d$ as those in Example 1.

Table 4. Errors of modified block method.

            h = π/40     h = π/80     h = π/160    h = π/320
x = 5π     -1.85E-02    -4.51E-06    -1.13E-09    -2.99E-13
x = 10π    -7.40E-02    -1.81E-05    -4.54E-09    -1.1?E-12
x = 15π    -1.65E-01    -4.08E-05    -1.02E-08    -2.51E-12
x = 20π    -2.87E-01    -7.26E-05    -1.82E-08    -4.46E-12

Example 4. Next we consider the same problem as that of Example 2. In this
experiment we also take the same values for $b_2$, $b_2^*$, $d$ and $d^*$ as those in the previous
examples. The maximum orbital errors are shown in Table 5.

Table 5. Maximum orbital errors (30) of modified P-stable block methods.

h = π/10     h = π/20     h = π/40     h = π/80     h = π/160
5.661E-02    6.585E-04    5.387E-06    4.412E-08    6.529E-10

From the results above we can see a very considerable improvement over those shown
in Tables 1, 2 and 3, in particular for small $h$.

5. Stability analysis of modified method

Next we study the stability of the modified method (36). If we solve the test
equation (4) by the algorithm, then $y_{n+2}$ and $y^*_{n+2}$ must satisfy the recurrence relations
$$y_{n+2} - 2S(H^2)\,y_n + y_{n-2} = 0, \qquad (37)$$
$$y^*_{n+2} - 2S^*(H^2)\,y_n + y_{n-2} = 0, \qquad (38)$$
$$S(H^2) = \frac{R_1(H^2)}{R_0(H^2)}, \qquad S^*(H^2) = \frac{R_1^*(H^2)}{R_0^*(H^2)},$$
where $R_0$ and $R_1$ are the polynomials defined by (18) and (19), and $R_0^*$ and $R_1^*$ are
the polynomials defined by the same Eqs., but $b_2$ and $d$ are replaced by $b_2^*$ and $d^*$,
respectively. From Eqs. (36), (37) and (38) we can find the recurrence relation to be
satisfied by $\tilde y_{n+2}$. The relation is
$$\tilde y_{n+2} - 2\tilde S(H^2)\,y_n + y_{n-2} = 0, \qquad (39)$$
where $\tilde S(H^2)$ is given by
$$\tilde S(H^2) = S(H^2) + \frac{C}{C^* - C}\,\{S(H^2) - S^*(H^2)\}.$$
The modified method is stable if and only if the function $\tilde S(H^2)$, which is often
called the stability function, is less than unity in modulus. A simple computation
shows that if $b_2 \to -\infty$ and $b_2^*$ remains finite
(if $b_2^* \to -\infty$ and $b_2$ remains finite), then $\tilde S(H^2) \to S^*(H^2)$ ($\tilde S(H^2) \to S(H^2)$), so that the modified method (36) is P-stable in these limits. There
may be some cases in which the method is not P-stable for finite $b_2$ and $b_2^*$. However, in
these cases there exist the intervals $(0, H_0^2)$ and $(H_1^2, H_2^2)$ $(H_0 < H_1 < H_2)$ in which
the method is stable. A further study on the stability of the modified method will be
needed.
The graphs of the stability functions $\tilde S(H^2)$ of the modified algorithm for some $b_2$
and $b_2^*$ are shown in Fig. 1-3. In these graphs we take $d = d^* = 0.062$.


6. Concluding remark

We have derived a certain type of P-stable block method for solving the second
order IVPs on parallel computers, and developed a procedure to reduce the local
truncation error of the method. The modified algorithm using this procedure
produces an excellent result, and seems to be P-stable for almost all parameters. Further
consideration of the implementation of the algorithm on parallel computers and its
performance evaluation will be necessary.

7. References

1. K. Burrage, J. Comput. Applied Math. 45(1993) 139.


2. J.R. Cash, Numer. Math. 37(1981) 355.
3. P. Chartier, SIAM J. Numer. Anal. 31(1994) 552.
4. M.M. Chawla, BIT 21(1981) 190.
5. M.M. Chawla and P.S. Rao, IMA J. Num. Anal. 5(1985) 215.
6. M.T. Chu and H. Hamilton, SIAM J. Sci. Stat. Comput. 3(1987) 342.
7. G.G. Dahlquist, BIT 18(1978) 133.
8. S.O. Fatunla, Numerical Methods for Initial Value Problems in Ordinary Dif-
ferential Equations, (Academic Press, New York, 1987).
9. E. Hairer, S.P. Nørsett and G. Wanner, Solving Ordinary Differential Equations I,
(Springer, Berlin, 1987).
10. E. Hairer, Numer. Math. 32(1979) 373.
11. P. Henrici, Discrete Variable Methods in Ordinary Differential Equations,
(John Wiley & Sons, New York, 1962).
12. M.S.H. Khiyal and R.M. Thomas, Computational Ordinary Differential Equations
ed. J.R. Cash and I. Gladwell, (Clarendon Press, Oxford, 1992).
13. J.D. Lambert and I.A. Watson, J. Inst. Maths. Applics. 18(1976) 189.
14. L. Lu, IMA J. Numer. Anal. 13(1993) 101.
15. Tie-Shan QI and T. Mitsui, JJAM 7(1990) 423.
16. L.F. Shampine and H.A. Watts, Math. Comp. 23(1969) 731.
17. T.E. Simos, JJIAM 10(1993) 289.
18. R.M. Thomas, BIT 27(1987) 599.
19. H.A. Watts and L.F. Shampine, BIT 12(1972) 252.
20. B. Zhou, J. of Comput. Math. 3(1985) 328.

TWO-POINT HERMITE-BIRKHOFF QUADRATURES
AND
ITS APPLICATIONS TO NUMERICAL SOLUTION OF ODE
CHISATO SUZUKI
Department of Computer Science, Shizuoka Institute of Science and Technology,
2200-2 Toyosawa, Fukuroi-shi, Shizuoka 437, Japan
E-mail: suzuki@cs.sist.ac.jp

ABSTRACT

In this paper a special class of Hermite-Birkhoff quadratures is investigated
and applied to a numerical method for solving directly any higher order ordinary
differential equation. That is, if an incidence matrix $E = (e_{ij})$ defined by $e_{ij} = 0$ or $1$ $(1 \le i \le 2,\ 0 \le j \le n-1)$ and $\sum_{j=0}^{n-1} e_{1,j} + \sum_{j=0}^{n-1} e_{2,j} = n$ is poised in the
viewpoint of the lacunary interpolation theory, it is shown that there exists a
quadrature formula in the form
$$\int_a^b P(x)\,dx = \sum_{e_{ij} = 1} w_{ij}\,P^{(j)}(x_i), \qquad (x_1 = a \ \text{and}\ x_2 = b),$$
to be exact for any polynomial $P$ with degree at most $n - 1$, where the $w_{ij}$'s are
weight coefficients independent of $P$. In addition, a numerical method for solving
directly the initial value problem of r-th order ordinary differential equations
$$\begin{cases} y^{(r)} = f(x, y, \ldots, y^{(r-1)}), & x > x_0, \\ y^{(i)}(x_0) = y_0^{(i)}, & i = 0, 1, \ldots, r-1, \end{cases}$$
is constructed from all the quadratures specified by $E = (e_{ij})$, $(e_{10} = \cdots = e_{1,s-1} = e_{20} = \cdots = e_{2,s-1} = 1)$, with $n = 2s$ $(s = 1, 2, \ldots, r)$. To verify efficiency
of the method, a numerical example is included.

1. Introduction

Let $I = [a, b]$, $(a < b)$, $X_n = \{x_i \mid a \le x_1 < x_2 < \cdots < x_n \le b\}$, $(n \ge 1)$. Let $\Pi_{n-1}$ be
a space consisting of polynomials with degree $\le n-1$ defined on $I$ and $E = (e_{ij})$ a
$k \times n$ matrix such that

(i) $e_{ij} = 0$ or $1$, $1 \le i \le k$, $0 \le j \le n-1$,

(ii) $\sum_{i=1}^{k} \sum_{j=0}^{n-1} e_{ij} = n$.

This $E$ is often called an incidence matrix of quadrature or interpolation. Let $Y_n^E$
denote the set of all $n$ real numbers $y_i^j$ indexed by $(i, j)$ such that $e_{ij} = 1$ for a given
$k \times n$ incidence matrix $E = (e_{ij})$, i.e., $Y_n^E = \{y_i^j \mid e_{ij} = 1\}$.
Given a $k \times n$ incidence matrix $E$ and $X_n$, we can define a quadrature formula in
the form

\[ \int_a^b P(x)\,dx = \sum_{e_{ij}=1} w_{ij} P^{(j)}(x_i) \]

for any polynomial $P \in \Pi_{n-1}$, where the $w_{ij}$'s are the weight coefficients independent of
$P$. This is called [1,2,3] the Hermite-Birkhoff quadrature formula (hereafter, merely
HB-QF) for the incidence matrix $E$.
In this paper, first we consider the question of existence and construction of HB-QF
specified by $E$ with $k = 2$ and $n \ge 1$. That is, this problem is a generalization
of the Euler-Maclaurin quadrature. As an application of HB-QF to ordinary differential
equations, next we develop a numerical integration method for the initial value
problem of the r-th order ordinary differential equations

\[ y^{(r)} = f(x, y, y', \ldots, y^{(r-1)}), \quad x > x_0, \]
\[ y^{(i)}(x_0) = y_0^{(i)}, \quad i = 0, 1, \ldots, r-1, \tag{1} \]

where $r$ is any fixed positive integer greater than 1. Then the numerical integration
formula in the form

Vm+i = 1 5 * + <*fcF + m m = 0,1,..., M ,


is given for the initial value problem, where $h$ is a step-size of integration, $x_m = x_0 + mh$,

tfm-H
Ym+i — and F m + i = i = 0,1,
J*)

here for each s (0 < s < T) is an approximation of yW(x i), and at, /3|, are m+

r-column vectors dependent upon h. To verify efficiency of this scheme, a numerical


example is included.

2. Lacunary Interpolation Problem


According to the theory of lacunary interpolation [4], an incidence matrix $E = (e_{ij})$,
$1 \le i \le k$, $0 \le j \le n-1$, is said to be unconditional-poised or, merely, poised if
there exists a unique polynomial $P \in \Pi_{n-1}$ such that for any $Y_n^E = \{y_i^j \mid e_{ij} = 1\}$ and
any $X_n = \{x_i \mid 1 \le i \le n\}$,

\[ P^{(j)}(x_i) = y_i^j, \quad 1 \le i \le k,\ 0 \le j \le n-1. \]

For a $k \times n$ incidence matrix $E = (e_{ij})$, let

\[ m_j^E = \sum_{i=1}^{k} e_{ij} \]

and

\[ M_p^E = \sum_{j=0}^{p} m_j^E, \quad 0 \le p \le n-1. \tag{2} \]

Then the following Pólya's result is well-known.
Theorem 1 (Pólya's theorem [4]) A $2 \times n$ incidence matrix $E$ is unconditional-poised
if and only if $M_p^E$ determined by $E$ satisfies the inequality

\[ M_p^E \ge p + 1, \quad 0 \le p \le n-1. \]

This inequality is called the Pólya condition.


For a $2 \times n$ incidence matrix $E = (e_{ij})$, we define a $2 \times n$ matrix $G = (g_{ij})$ as
follows:

\[ g_{ij} = 1 - e_{i,n-j-1}, \quad 1 \le i \le 2,\ 0 \le j \le n-1. \]

Then $G$ is also an incidence matrix since

\[ \sum_{i=1}^{2} \sum_{j=0}^{n-1} g_{ij} = n. \]

In this paper, $G$ is called a dual incidence matrix corresponding to $E$ or, merely,
a dual matrix of $E$. For example, for the incidence matrix

\[ E = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 1 \end{pmatrix} \]

its dual matrix becomes

\[ G = \begin{pmatrix} 1 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}. \]

The following theorem gives a relationship between a $2 \times n$ incidence matrix $E$ and
its dual matrix $G$.
Theorem 2 A $2 \times n$ incidence matrix $E = (e_{ij})$ is unconditional-poised if and
only if its dual matrix $G$ is unconditional-poised.
Proof To prove this theorem, we use the following relations obtained immediately
from the definition of the dual matrix $G$:

\[ m_j^G = 2 - \sum_{i=1}^{2} e_{i,n-j-1} = 2 - m_{n-j-1}^E, \tag{3} \]

for each $j$, $(0 \le j \le n-1)$.

On necessity: When $p = 0, 1, \ldots, n-2$, from Eq. (3) we have

\[ M_p^G = 2(p+1) - \sum_{j=0}^{p} m_{n-j-1}^E
        = 2(p+1) - \Big( n - \sum_{j=0}^{n-p-2} m_j^E \Big)
        = 2(p+1) - n + M_{n-p-2}^E. \tag{4} \]

Therefore we obtain

\[ M_p^G \ge p + 1, \quad 0 \le p \le n-2, \]

since

\[ M_{n-p-2}^E \ge n - p - 1 \]

by virtue of Theorem 1 for poisedness of $E$.

When $p = n-1$, on the other hand, we have

\[ M_{n-1}^G = \sum_{j=0}^{n-1} m_j^G = 2n - M_{n-1}^E = n. \]

Therefore $G$ satisfies the Pólya condition.

On sufficiency: Substituting $n - q - 2$ for $p$ in Eq. (4), we get

\[ M_q^E = 2(q+1) - n + M_{n-q-2}^G, \]

for $q = 0, 1, \ldots, n-2$. Thus we have $M_q^E \ge q + 1$ for each $q$ $(0 \le q \le n-2)$ since
$M_{n-q-2}^G \ge n - q - 1$ by poisedness of $G$.

When $p = n-1$, computing $M_{n-1}^E$ from Eq. (3), we have $M_{n-1}^E = n$ since $M_{n-1}^G = n$.

We have thus proved Theorem 2.
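
The duality and the Pólya condition are easy to check mechanically. The following is a minimal sketch, not part of the original paper: the function names `dual`, `satisfies_polya` and `all_incidence_matrices` are hypothetical, rows are indexed from 0 instead of 1, and the brute-force loop only confirms the equivalence of Theorem 2 (in terms of the Pólya condition) for $n \le 6$.

```python
from itertools import combinations

def dual(E):
    """Dual incidence matrix: g[i][j] = 1 - e[i][n-j-1]."""
    n = len(E[0])
    return [[1 - row[n - j - 1] for j in range(n)] for row in E]

def satisfies_polya(E):
    """Check M_p >= p + 1 for p = 0, ..., n-1, where M_p sums columns 0..p."""
    n = len(E[0])
    running = 0
    for p in range(n):
        running += sum(row[p] for row in E)
        if running < p + 1:
            return False
    return True

def all_incidence_matrices(n):
    """Yield every 2 x n matrix of zeros and ones with exactly n ones."""
    for ones in combinations(range(2 * n), n):
        flat = [1 if k in ones else 0 for k in range(2 * n)]
        yield [flat[:n], flat[n:]]

# The 2 x 3 example from the text and its dual.
E = [[1, 0, 0], [0, 1, 1]]
G = dual(E)
print(G, satisfies_polya(E), satisfies_polya(G))

# Brute-force check: E satisfies the Polya condition iff its dual does.
theorem2_holds = all(
    satisfies_polya(M) == satisfies_polya(dual(M))
    for n in range(1, 7)
    for M in all_incidence_matrices(n)
)
print(theorem2_holds)
```

The exhaustive loop is only feasible for small $n$; it is meant as a sanity check of the index bookkeeping in Eqs. (3) and (4), not as a proof.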

3. Existence and Construction of HB-QF

The following lemma can be shown by means of the Darboux formula [5].
Lemma 1 Let $x_1 = a$ and $x_2 = b$, $(a < b)$. Let $q$ be any monic polynomial of degree
$n$. Then for every $P \in \Pi_{n-1}$, it holds that

\[ \int_a^b P(x)\,dx = \sum_{j=0}^{n-1} \sum_{i=1}^{2} w_{ij} h^{j+1} P^{(j)}(x_i), \quad (h = b - a), \tag{5} \]

where

\[ w_{ij} = \frac{(-1)^{i+j}}{n!}\, q^{(n-j-1)}(t_i), \quad (t_1 = 0 \text{ and } t_2 = 1). \tag{6} \]
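Proof By using the Darboux formula it can be proved that, for any monic polynomial
$q$ of degree $n$ and any $(n+1)$-th continuously differentiable function $u$ on $I$,
it follows that

\[ u(b) = u(a) + \sum_{j=1}^{n} (-1)^j \frac{h^j}{q^{(n)}(0)} \left( q^{(n-j)}(0)\,u^{(j)}(a) - q^{(n-j)}(1)\,u^{(j)}(b) \right) + R_n, \tag{7} \]

where

\[ R_n = (-1)^n \frac{h^{n+1}}{q^{(n)}(0)} \int_0^1 q(x)\,u^{(n+1)}(a + xh)\,dx. \tag{8} \]

Now define

\[ u(x) = \int_a^x P(t)\,dt, \quad (P \in \Pi_{n-1}), \]

then we have $u^{(j)}(x) = P^{(j-1)}(x)$ for $j = 1, 2, \ldots, n+1$, and substituting $P^{(j-1)}$ for
$u^{(j)}$ in Eqs. (7) and (8), we can obtain

\[ u(b) = u(a) + \sum_{j=1}^{n} (-1)^j \frac{h^j}{q^{(n)}(0)} \left( q^{(n-j)}(0)\,P^{(j-1)}(a) - q^{(n-j)}(1)\,P^{(j-1)}(b) \right) + R_n, \tag{9} \]

and

\[ R_n = (-1)^n \frac{h^{n+1}}{q^{(n)}(0)} \int_0^1 q(x)\,P^{(n)}(a + xh)\,dx. \]

However since $P^{(n)}(x) = 0$, we obtain $R_n = 0$. Therefore by substituting $q^{(n)}(0) = n!$,
$u(a) = 0$, and

\[ u(b) = \int_a^b P(t)\,dt, \]

into Eq. (9), the proof of Lemma 1 is completed.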


A quadrature formula for any $n$-th continuously differentiable function $g$ defined
on $I$ is obtained as

\[ I_n(g) = \sum_{j=0}^{n-1} \sum_{i=1}^{2} w_{ij} h^{j+1} g^{(j)}(x_i) \]

by substituting $g$ for $P$ in Eq. (5). However, in this case, it should be noted that the
remainder term is not necessarily zero, that is, in general,

\[ \int_a^b g(x)\,dx = I_n(g) + R_n, \tag{10} \]

where

\[ R_n = (-1)^n \frac{h^{n+1}}{n!} \int_0^1 q(x)\,g^{(n)}(a + xh)\,dx, \quad h = b - a. \]

As an application of Lemma 1, we can use it for the proof of an existence theorem
of HB-QF. In fact, in order to show the existence of HB-QF specified by a $2 \times n$
incidence matrix $E = (e_{ij})$, it is sufficient to prove that $w_{ij} = 0$ if $e_{ij} = 0$. In
other words, if there exists a monic polynomial $q$ of degree $n$ such that $q^{(n-j-1)}(t_i) = 0$
for $e_{ij} = 0$, then the existence of HB-QF becomes clear by Lemma 1. Formulating
this assertion, we have the following lemma.
Lemma 2 There exists a HB-QF specified by a $2 \times n$ incidence matrix $E$ if and only
if there exists a monic polynomial $q$ of degree $n$ such that $q^{(j)}(t_i) = 0$ if $g_{ij} = 1$ for
the dual matrix $G = (g_{ij})$ of $E$, where $t_1 = 0$ and $t_2 = 1$.

Proof Suppose the existence of HB-QF specified by $E = (e_{ij})$, then we have the
expression in the form

\[ \int_a^b P(x)\,dx = \sum_{e_{ij}=1} w_{ij} h^{j+1} P^{(j)}(x_i), \quad P \in \Pi_{n-1}. \tag{11} \]

On the other hand, by Lemma 1 we also have the expression in the form

\[ \int_a^b P(x)\,dx = \sum_{e_{ij}=1} w_{ij} h^{j+1} P^{(j)}(x_i) + \sum_{e_{ij}=0} w_{ij} h^{j+1} P^{(j)}(x_i) \tag{12} \]

for any $P \in \Pi_{n-1}$. Therefore, comparing Eq. (11) with Eq. (12) we get the relation

\[ \sum_{e_{ij}=0} w_{ij} h^{j+1} P^{(j)}(x_i) = 0. \]

Since this relation must be satisfied for all $P \in \Pi_{n-1}$, it is necessary that $w_{ij} = 0$ if
$e_{ij} = 0$. This means by Lemma 1 that there must exist a monic polynomial $q$ of
degree $n$ such that

\[ q^{(n-j-1)}(t_i) = 0 \quad \text{if } e_{ij} = 0, \]

because $g_{i,n-j-1} = 1 - e_{ij} = 1$.
Conversely, if there exists a monic polynomial $q$ of degree $n$ such that $q^{(j)}(t_i) = 0$
if $g_{ij} = 1$, then it is trivial by Lemma 1 that the HB-QF prescribed by $E$ exists. We
have completed the proof of Lemma 2.
By virtue of Lemma 2, the following theorem is proved.
Theorem 3 A $2 \times n$ incidence matrix $E$ is poised if and only if there exists a
HB-QF specified by $E$.
Proof Let $G$ be the dual matrix of $E$ and $H = (h_1, h_2)^T$ with $h_1 = 1$ and $h_2 = 0$,
then we consider the horizontal sum $\hat{G} = (\hat{g}_{ij})$ consisting of $G$ and $H$, i.e.,

\[ \hat{g}_{ij} = \begin{cases} g_{ij}, & \text{if } 0 \le j \le n-1, \\ h_i, & \text{if } j = n. \end{cases} \]

This horizontal sum is often denoted by $\hat{G} = (G|H)$, and we recall that the horizontal
sum is poised if and only if both $G$ and $H$ are poised. Therefore if $G$ is poised then
the horizontal sum is also poised since $H$ is evidently poised. Consequently since $G$
is poised by Theorem 2 provided $E$ is poised, there exists uniquely a polynomial $q$ of
degree $n$ such that

\[ q^{(j)}(t_i) = 0, \quad \text{if } g_{ij} = 1, \quad (0 \le j \le n-1,\ i = 1, 2), \tag{13} \]

and

\[ q^{(n)}(t_1) = n!, \tag{14} \]

where $t_1 = 0$ and $t_2 = 1$.

Conversely, if $E$ is not poised then $G$ is also not poised. Therefore no polynomial
$q$ of degree $n$ satisfies the interpolation conditions in Eqs. (13) and (14), and
no HB-QF specified by $E$ exists by Lemma 2. We have
completed the proof of Theorem 3.
By Theorem 3 and Lemma 2, if a given incidence matrix $E$ is poised then it is
guaranteed that there exists a monic polynomial $q$ of degree $n$ such that $q^{(j)}(t_i) = 0$
if $g_{ij} = 1$ for the dual matrix $G$ of $E$. Especially it should be noted that finding such
$q$ is just equivalent to solving a homogeneous interpolation problem on $G$. Therefore if
the homogeneous interpolation problem can be solved, we can then obtain the HB-QF
specified by $E$ from the solution, by computing the coefficients $w_{ij}$ in the way shown
in Lemma 1.
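
The construction just described (solve the homogeneous interpolation problem on the dual matrix, then read the weights off Eq. (6)) can be sketched in exact rational arithmetic. This is an illustrative sketch under stated assumptions, not the author's implementation: all function names are hypothetical, the sign convention $w_{ij} = (-1)^{i+j} q^{(n-j-1)}(t_i)/n!$ is the one obtained from the Darboux formula with the factor $h^{j+1}$ kept outside the weight, and the linear solver assumes the incidence matrix is poised.

```python
from fractions import Fraction
from math import factorial

def dual(E):
    """Dual incidence matrix: g[i][j] = 1 - e[i][n-j-1]."""
    n = len(E[0])
    return [[1 - row[n - j - 1] for j in range(n)] for row in E]

def derivative_row(n, j, t):
    """Row r with sum_k r[k]*c[k] = j-th derivative at t of sum_k c[k] x^k, k = 0..n."""
    t = Fraction(t)
    return [Fraction(factorial(k), factorial(k - j)) * t ** (k - j) if k >= j else Fraction(0)
            for k in range(n + 1)]

def solve(A, rhs):
    """Gauss-Jordan elimination over the rationals (assumes a unique solution)."""
    n = len(A)
    M = [list(row) + [r] for row, r in zip(A, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [x / M[col][col] for x in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]

def monic_q(E):
    """Coefficients c[0..n] of the monic q with q^(j)(t_i) = 0 wherever g_ij = 1."""
    n = len(E[0])
    G = dual(E)
    rows = [derivative_row(n, j, t) for t, g in zip((0, 1), G) for j in range(n) if g[j]]
    low = solve([r[:n] for r in rows], [-r[n] for r in rows])
    return low + [Fraction(1)]

def weights(E):
    """w[i][j] = (-1)^(i+j) q^(n-j-1)(t_i) / n!, with rows for i = 1 (t = 0) and i = 2 (t = 1)."""
    n = len(E[0])
    c = monic_q(E)
    w = []
    for i, t in ((1, 0), (2, 1)):
        row = []
        for j in range(n):
            value = sum(a * b for a, b in zip(derivative_row(n, n - j - 1, t), c))
            row.append(Fraction((-1) ** (i + j)) * value / factorial(n))
        w.append(row)
    return w

# General Hermite case n1 = n2 = 3 (Example 1 of the next section).
E_GH = [[1, 1, 1, 0, 0, 0], [1, 1, 1, 0, 0, 0]]
print(monic_q(E_GH))
print(weights(E_GH))
```

For `E_GH` the sketch returns the coefficients of $x^6 - 3x^5 + 3x^4 - x^3$ and the weights $1/2$, $\pm 1/10$, $1/120$; the weights attached to entries with $e_{ij} = 0$ come out as zero, which is exactly the condition used in Lemma 2.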

4. Several Examples of HB-QF

In this section, quadrature formulas of general Hermite and Euler-Maclaurin-like
types are illustrated together with the Euler-Maclaurin quadrature formula as typical
examples of HB-QF. In particular a feature of the Euler-Maclaurin-like type quadrature
is to use even order derivatives in the formula, though the Euler-Maclaurin
quadrature uses odd order derivatives.

4.1. Quadrature Formula of General Hermite Type

Let $n_1$ and $n_2$ $(n_1 \le n_2)$ be non-negative integers and $n = n_1 + n_2 > 0$. Define
a $2 \times n$ incidence matrix $E_{GH}(n_1, n_2) = (e_{ij})$ by

\[ e_{ij} = \begin{cases} 1, & \text{if } j = 0, 1, \ldots, n_i - 1, \\ 0, & \text{otherwise}, \end{cases} \]

for $i = 1$ and $2$, then the interpolation problem on $E_{GH}(n_1, n_2)$ is known as a general
Hermite interpolation problem which always has a unique solution. Therefore the incidence
matrix $E_{GH}(n_1, n_2)$ is poised. Then the dual matrix $G = (g_{ij})$ of $E_{GH}(n_1, n_2)$
defined by

\[ g_{ij} = 1 - e_{i,n-j-1} = e_{3-i,j}, \quad (1 \le i \le 2,\ 0 \le j \le n-1) \]

is also poised by Theorem 2. The homogeneous interpolation problem on this $G$ can
be uniquely solved as

\[ q(x) = \sum_{k=0}^{n_1} (-1)^{n_1-k} \binom{n_1}{k} x^{n_2+k} \]

without the factor of multiplier. Then the (n-j- l)-th derivatives of q are given by

BI \(n + t - j - I ) ! K
-SS* 0 < j < Bl - 1

By virtue of Lemma 1, therefore the quadrature formula on $E_{GH}(n_1, n_2)$ is given as

\[ \int_a^b P(x)\,dx = \sum_{j=0}^{n_1-1} h^{j+1} w_{1j} P^{(j)}(a) + \sum_{j=0}^{n_2-1} h^{j+1} w_{2j} P^{(j)}(b), \]

where the weight coefficient $w_{ij}$ can be computed from Eq. (6) as follows

23
n\h {
' U +«i-J-l+9; J + (A + g i ) !
J+ '
here $p_j = \min\{n_1, j\}$ and $q_j = \max\{0, j - n_1\}$. This quadrature formula is said to be
of general Hermite type.

We are especially interested in the class of quadratures with $n_1 = n_2 = r$ and
$n = 2r$ because there is an important application for solving directly higher order
differential equations.

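Example 1: (Case of $n = 6$; $n_1 = n_2 = 3$)

(a) Incidence matrix:

\[ E_{GH}(3,3) = \begin{pmatrix} 1 & 1 & 1 & 0 & 0 & 0 \\ 1 & 1 & 1 & 0 & 0 & 0 \end{pmatrix} \]

(b) Dual incidence matrix $G$ corresponding to $E_{GH}(3,3)$:

\[ G = \begin{pmatrix} 1 & 1 & 1 & 0 & 0 & 0 \\ 1 & 1 & 1 & 0 & 0 & 0 \end{pmatrix} \]

(c) Solution of homogeneous interpolation problem on $G$:

\[ q(x) = x^6 - 3x^5 + 3x^4 - x^3 \]

(d) HB-QF of general Hermite type specified by $E_{GH}(3,3)$:

\[ \int_a^b P(x)\,dx = \frac{h}{2}\left(P(a) + P(b)\right) + \frac{h^2}{10}\left(P^{(1)}(a) - P^{(1)}(b)\right)
   + \frac{h^3}{120}\left(P^{(2)}(a) + P^{(2)}(b)\right). \]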

4.2. Quadrature Formula of Euler-Maclaurin Type

Since the Euler-Maclaurin quadrature formula is included as a special class of
HB-QF stated in the previous section, accordingly the incidence matrix prescribing
the formula is poised. In fact, the following theorem shows it.
Theorem 4 Let $r$ be a positive integer and $n = 2r$. Let $E_{EM} = (e_{ij})$ be the $2 \times n$
incidence matrix such that

\[ e_{ij} = \begin{cases} 1, & \text{if } j = 0, \\ 1, & \text{if } j = 2k-1 \text{ and } 1 \le k \le r-1, \\ 0, & \text{otherwise}, \end{cases} \]

for $i = 1$ and $2$, then $E_{EM}$ is poised.

Proof It is evident that $E_{EM}$ satisfies the Pólya condition. In fact, by
computing $M_p$ from Eq. (2) for $E_{EM}$, we obtain $M_0 = 2$ for $p = 0$, $M_{2p-1} = 2p + 2$
and $M_{2p} = 2p + 2$ for $p = 1, 2, \ldots, r-1$, and $M_{2r-1} = 2r$. Therefore
since $M_p \ge p + 1$ for $p = 0, 1, \ldots, 2r-1$, $E_{EM}$ satisfies Pólya's condition. We have
completed the proof of the theorem.

By virtue of Lemma 1, the quadrature formula specified by $E_{EM}$ is given as

\[ \int_a^b P(x)\,dx = h\left(w_{10} P(a) + w_{20} P(b)\right)
   + \sum_{k=1}^{r-1} h^{2k} \left( w_{1,2k-1} P^{(2k-1)}(a) + w_{2,2k-1} P^{(2k-1)}(b) \right), \]

where the weight coefficient $w_{ij}$ can be computed from Eq. (6). This formula is
well-known as the Euler-Maclaurin quadrature.

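Example 2: (Case of $n = 6$; $r = 3$)
(a) Incidence matrix:

\[ E_{EM} = \begin{pmatrix} 1 & 1 & 0 & 1 & 0 & 0 \\ 1 & 1 & 0 & 1 & 0 & 0 \end{pmatrix} \]

(b) Dual incidence matrix $G$ corresponding to $E_{EM}$:

\[ G = \begin{pmatrix} 1 & 1 & 0 & 1 & 0 & 0 \\ 1 & 1 & 0 & 1 & 0 & 0 \end{pmatrix} \]

(c) Solution of homogeneous interpolation problem on $G$:

\[ q(x) = x^6 - 3x^5 + \frac{5}{2}x^4 - \frac{1}{2}x^2 \]

(d) HB-QF of Euler-Maclaurin type specified by $E_{EM}$:

\[ \int_a^b P(x)\,dx = \frac{h}{2}\left(P(a) + P(b)\right) + \frac{h^2}{12}\left(P^{(1)}(a) - P^{(1)}(b)\right)
   - \frac{h^4}{720}\left(P^{(3)}(a) - P^{(3)}(b)\right). \]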

4.3. Quadrature Formula of Euler-Maclaurin-Like Type

Theorem 5 Let $r$ be a positive integer and $n = 2r$. Let $E_{EML} = (e_{ij})$, $1 \le i \le 2$
and $0 \le j \le n-1$, be the $2 \times n$ incidence matrix such that

\[ e_{ij} = \begin{cases} 1, & \text{if } j = 2k \text{ and } 0 \le k \le r-1, \\ 0, & \text{otherwise}, \end{cases} \]

for $i = 1$ and $2$, then $E_{EML}$ is poised.

Proof We shall show that $E_{EML}$ satisfies Pólya's condition. Computing $M_p$ from
Eq. (2) for $E_{EML}$, we obtain evaluations of $M_0 = 2 \ge 1$ for $p = 0$, $M_{2p-1} = 2p \ge 2p$
for $p = 1, 2, \ldots, r$, and $M_{2p} = 2p + 2 \ge 2p + 1$ for $p = 1, 2, \ldots, r-1$. Therefore $E_{EML}$
satisfies Pólya's condition. We have completed the proof of Theorem 5.

By virtue of Lemma 1, the quadrature formula specified by $E_{EML}$ is given as

\[ \int_a^b P(x)\,dx = \sum_{j=0}^{r-1} h^{2j+1} \left( w_{1,2j} P^{(2j)}(a) + w_{2,2j} P^{(2j)}(b) \right), \]

where the weight coefficient $w_{ij}$ can be computed as follows: Let $a_0 = 1/2$, and

\[ a_j = \frac{1}{(2j+1)!\,2(j+1)} - \sum_{k=0}^{j-1} \frac{a_k}{(2j-2k+1)!}, \quad (1 \le j \le r-1), \]

then $w_{1,2j} = w_{2,2j} = a_j$ for $j = 0, 1, \ldots, r-1$.
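
The recursion for the weights $a_j$ can be evaluated directly. The sketch below is illustrative only; the function name `eml_weights` is hypothetical, and the recursion is the one obtained by requiring the formula to be exact for $x^{2j+1}$ on $[0,1]$, which is assumed to agree with the intended recursion of the paper.

```python
from fractions import Fraction
from math import factorial

def eml_weights(r):
    """Weights a_0, ..., a_{r-1} of the Euler-Maclaurin-like formula with n = 2r."""
    a = [Fraction(1, 2)]
    for j in range(1, r):
        head = Fraction(1, factorial(2 * j + 1) * 2 * (j + 1))
        tail = sum(a[k] / factorial(2 * j - 2 * k + 1) for k in range(j))
        a.append(head - tail)
    return a

print(eml_weights(3))
```

For $r = 3$ this gives $a_0 = 1/2$, $a_1 = -1/24$, $a_2 = 1/240$, so that, for instance, $\int_0^1 x^5\,dx = \tfrac12 - \tfrac{20}{24} + \tfrac{120}{240} = \tfrac16$ is reproduced exactly.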

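Example 3: (Case of $n = 6$; $r = 3$)
(a) Incidence matrix:

\[ E_{EML} = \begin{pmatrix} 1 & 0 & 1 & 0 & 1 & 0 \\ 1 & 0 & 1 & 0 & 1 & 0 \end{pmatrix} \]

(b) Dual incidence matrix $G$ corresponding to $E_{EML}$:

\[ G = \begin{pmatrix} 1 & 0 & 1 & 0 & 1 & 0 \\ 1 & 0 & 1 & 0 & 1 & 0 \end{pmatrix} \]

(c) Solution of homogeneous interpolation problem on $G$:

\[ q(x) = x^6 - 3x^5 + 5x^3 - 3x \]

(d) HB-QF of Euler-Maclaurin-like type specified by $E_{EML}$:

\[ \int_a^b P(x)\,dx = \frac{h}{2}\left(P(a) + P(b)\right) - \frac{h^3}{24}\left(P^{(2)}(a) + P^{(2)}(b)\right)
   + \frac{h^5}{240}\left(P^{(4)}(a) + P^{(4)}(b)\right). \]

Remarks: In all the cases of examples illustrated above, it should be noted that each
dual incidence matrix coincides with the corresponding incidence matrix.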

5. Application to Numerical Solutions of Ordinary Differential Equations

As an application of the quadrature formulas obtained in the previous section,
first we consider a numerical method for the initial value problem of the ordinary
differential equation,

\[ y' = f(x, y), \quad x > x_0, \qquad y(x_0) = y_0. \tag{15} \]

Suppose that the solution $y$ to the problem is sufficiently smooth, then we can obtain
the following equation from the quadrature formula by substituting $y'$ for $g$ in Eq.
(10) and computing the integration of the left hand side.

\[ y(b) - y(a) = \sum_{j=1}^{n} w_{1,j-1} h^j y^{(j)}(a) + \sum_{j=1}^{n} w_{2,j-1} h^j y^{(j)}(b) + R_n, \]
\[ R_n = (-1)^n \frac{h^{n+1}}{n!} \int_0^1 q(x)\,y^{(n+1)}(a + xh)\,dx, \quad h = b - a, \tag{16} \]

where $R_n$ is the error term of the quadrature. In this equation, truncating the term
$R_n$, we have the numerical integration scheme

\[ y_{m+1} = y_m + \sum_{j=1}^{n} \alpha_j h^j y_m^{(j)} + \sum_{j=1}^{n} \beta_j h^j y_{m+1}^{(j)}, \quad (m = 0, 1, \ldots, M), \tag{17} \]

($\alpha_j = w_{1,j-1}$ and $\beta_j = w_{2,j-1}$), for the initial value problem (15), where
$x_{m+i} = x_0 + (m+i)h$ and

for i = 1 and 2. It follows from Eq. (16) that the order of local truncation error in
this scheme is greater than or equal to n.
In the scheme with n > 2, because some derivatives of higher order are used, we
usually need analytic computations to obtain the derivatives. Unfortunately
this is, in general, not easy, in particular when the differential
equation is a system, so that the scheme is not practical. But, as an application
of the scheme, we can design a useful scheme for computing directly a numerical
solution of the initial value problem of higher order differential equations. After
some preliminaries, we therefore give such a useful scheme in §5.2.

5.1. Numerical Integration Scheme of Quadrature of General Hermite Type

In the numerical integration scheme (17), we are especially interested in the numerical
integration scheme derived from the quadrature of general Hermite type with
$n_1 \le n_2$. Then the scheme can be written as follows

(18)

where

Cti =

n, Un + k-pjY.
ni+k-j + qj) {k + q,)\ '

$p_j = \min\{n_1, j\}$ and $q_j = \max\{0, j - n_1\}$. The reason why we pay attention to this
scheme is that it has strong numerical stability. To see it, we apply the scheme
to the test equation

\[ y' = -\lambda y, \quad (\lambda \in \mathbb{C},\ \mathrm{Re}\,\lambda > 0). \]

Then we can obtain the stability function $S(z)$,

S(z) =
1! P.
[k + n.-j + qj) (k + qj )l j

with $z = \lambda h$. For the stability function, we can easily show that $|S(t)| < 1$
$(t \in (0, +\infty))$ for any $n_1$ and $n_2$ with $n_1 \le n_2$. In addition, we can also show that the
scheme is A-stable if $n_1 = n_2$. The latter property is proved in the following theorem.

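Theorem 6 Let $p$ be a positive integer, then the scheme

\[ y_{m+1} = y_m + \sum_{j=1}^{p} \alpha_j h^j y_m^{(j)} + \sum_{j=1}^{p} \beta_j h^j y_{m+1}^{(j)} \tag{19} \]

is A-stable, where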
1 fp\

m = -±h-» (
p
k p
V 2 p +
* - » !

' ' \p+k-j) k\

Proof Let o^ = 1 and 01 = t. By using the relation of combination,

(A)=s-H w)rrO' + *
we can show that

- $ ~ t - l M (J = l , 2 , . . . , p ) .

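From this relation, then we have

\[ S(z) = \frac{\sum_{j=0}^{p} (-1)^j \binom{p}{j} (2p-j)!\, z^j}{\sum_{j=0}^{p} \binom{p}{j} (2p-j)!\, z^j}. \]

Evaluating this stability function at each point $z = i\omega$, $(-\infty < \omega < \infty)$, on the
imaginary axis, we obtain

\[ S(i\omega) = \frac{A + iB}{A - iB}, \]

where

\[ A = \sum_{j:\ \mathrm{even} \ge 0} (-1)^{j/2} \binom{p}{j} (2p-j)!\, \omega^j \quad \text{and} \quad
   B = \sum_{j:\ \mathrm{odd} > 0} (-1)^{(j+1)/2} \binom{p}{j} (2p-j)!\, \omega^j, \]

so that $|S(i\omega)| = 1$. On the other hand, we obtain

\[ S(1) = \frac{\sum_{j=0}^{p} (-1)^j \binom{p}{j} (2p-j)!}{\sum_{j=0}^{p} \binom{p}{j} (2p-j)!} < 1 \]

at $z = 1$. Therefore by the maximum principle for complex-valued functions,
it follows that $|S(z)| < 1$ for any $z \in \mathbb{C}$ $(\mathrm{Re}\,z > 0)$. We have completed the proof of
Theorem 6.
It should be noted that the scheme (18) with $n_1 = 0$ and $n_2 = n$ is equivalent to
the Taylor expansion method.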
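
The behaviour claimed in Theorem 6 can be probed numerically. The sketch below is not from the paper: it assumes that for $n_1 = n_2 = p$ the stability function is the ratio of the two sums with coefficients $\binom{p}{j}(2p-j)!$ (the diagonal Padé approximant of $e^{-z}$), with $z = \lambda h$ for the test equation $y' = -\lambda y$. The function name `stability` is hypothetical.

```python
from math import comb, factorial

def stability(p, z):
    """Assumed stability function S(z) of the scheme with n1 = n2 = p."""
    c = [comb(p, j) * factorial(2 * p - j) for j in range(p + 1)]
    num = sum((-1) ** j * c[j] * z ** j for j in range(p + 1))
    den = sum(c[j] * z ** j for j in range(p + 1))
    return num / den

for p in (1, 2, 3):
    # |S| at a real point, on the imaginary axis, and inside the right half-plane
    print(p, abs(stability(p, 1.0)), abs(stability(p, 2.5j)), abs(stability(p, 0.3 + 4j)))
```

For $p = 1$ the function reduces to $(2 - z)/(2 + z)$, the stability function of the trapezoidal rule. Sampling a few points is of course no substitute for the maximum-principle argument; it only illustrates $|S(i\omega)| = 1$ and $|S(z)| < 1$ for $\mathrm{Re}\,z > 0$.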

5.2. Application to Higher Order Ordinary Differential Equation

We construct a numerical integration scheme for solving an initial value problem


of the r-th order ordinary differential equation given in Eq. (1).
In Eq. (19), replacing y ^ with i £ f > and
m with yt~4 for J = 1,2, then +i)

we get

r )
= yir +i + i
j=i >=i
for p = 1 , 2 , . . . , r, and by setting t — p = s, we can moreover obtain

for s = 0 , 1 , . . . , r - 1. In addition, the right hand side of this equation can be


represented, using vector notation, in the form
L
+ (o.-.-.o,^-,...,^- ;;;_„/i'-' -;)f a Q m

+ ( o , . . . , o, ftiSp . . . , v—'ft:;.,, h'-'0;:;)Fm+l,


where
1
( wL ' \

('+1)
, and F m + I =

Arranging j^mti i " a column and setting j , ^ = / m + i , (j = 0,1), we have the desired
scheme
, l r
/" fca[ . . . k - a' _ h'a \
f 3/L°> \ r l T

0
2 2
: •• ka\ ha
0 ... 0 ha[ J \ fm }
Table 1: Absolute Errors in y_M^(0) and y_M^(1) with M = 5/h

              Absolute Error of y_M^(0)        Absolute Error of y_M^(1)
Step-Size h   This Method    Euler Method      This Method    Euler Method
0.005         7.47 x 10^-6   8.55 x 10^-3      4.52 x 10^-6   1.56 x 10^-2
0.010         2.99 x 10^-5   1.73 x 10^-2      1.81 x 10^-5   3.13 x 10^-2
0.050         7.47 x 10^-4   9.56 x 10^-2      4.52 x 10^-4   1.62 x 10^-1
0.100         2.58 x 10^-3   2.17 x 10^-1      1.81 x 10^-3   3.36 x 10^-1
0.500         8.66 x 10^-2   2.60 x 10^+0      9.81 x 10^-3   1.57 x 10^+0

f Aft A'# \ ua)

Sm + 1
0
+ J'-l)
Sm+1
(20)

0 0 Aft 1
V /m+li }

1
for $m = 0, 1, \ldots, M$, where $f_{m+i} = f(x_{m+i}, y_{m+i}^{(0)}, \ldots, y_{m+i}^{(r-1)})$ for $i = 0$ and $1$.
Since this scheme is of an implicit form, we must solve Eq. (20) as a system of
nonlinear equations with respect to the variables $y_{m+1}^{(0)}, \ldots, y_{m+1}^{(r-1)}$, provided $f$ in Eq.
(1) is nonlinear with respect to $y^{(0)}, y^{(1)}, \ldots, y^{(r-1)}$. However we can use this scheme
as a corrector in a predictor-corrector method, and then the predictor can be easily
obtained by replacing $f_{m+1}$ with $f_m$ in the right hand side of Eq. (20) as a system of
linear equations with an upper triangular coefficient matrix.

Numerical Example: In order to illustrate the efficiency of the numerical integration
scheme described above, we consider the application of the scheme with $r = 2$
to solve the simple initial value problem

\[ y''(x) = y(x), \quad 0 \le x \le 5, \]
\[ y(0) = 1, \quad y'(0) = 0. \]

That is, solve this problem by means of the scheme

¥2
,,(1)
Sm+1 7m" IA - A * fm+1
1) + 0 + 0 m+1
Sm+1

where $f_m = f(x_m, y_m^{(0)}, y_m^{(1)})$ and $f_{m+1} = f(x_{m+1}, y_{m+1}^{(0)}, y_{m+1}^{(1)})$. Then the numerical
results are shown in Table 1, together with the results solved by the Euler method.
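
A minimal sketch of such a computation follows. It is an assumption-laden illustration, not the author's program: the $r = 2$ scheme is taken to be $y_{m+1} = y_m + \tfrac{h}{2}(y'_m + y'_{m+1}) + \tfrac{h^2}{12}(f_m - f_{m+1})$ together with $y'_{m+1} = y'_m + \tfrac{h}{2}(f_m + f_{m+1})$, i.e. the general Hermite weights for $p = 2$ and $p = 1$, and the right-hand side is restricted to the linear case $f = c\,y$ so that the implicit step can be solved in closed form. The names `hermite_step`, `euler_step` and `integrate` are hypothetical, and the printed errors are not claimed to reproduce Table 1.

```python
import math

def hermite_step(y, v, h, c):
    """One implicit step for y'' = c*y; the 2 x 2 linear system is solved exactly."""
    a = 1.0 + c * h * h / 12.0
    b = h / 2.0
    rhs_y = a * y + b * v          # right-hand side of the y-equation
    rhs_v = b * c * y + v          # right-hand side of the y'-equation
    det = a - b * b * c
    return (rhs_y + b * rhs_v) / det, (b * c * rhs_y + a * rhs_v) / det

def euler_step(y, v, h, c):
    """Explicit Euler step for the first-order system y' = v, v' = c*y."""
    return y + h * v, v + h * c * y

def integrate(step, h, c=1.0, x_end=5.0):
    """Integrate from x = 0 with y(0) = 1, y'(0) = 0 up to x_end."""
    y, v = 1.0, 0.0
    for _ in range(round(x_end / h)):
        y, v = step(y, v, h, c)
    return y, v

for h in (0.1, 0.05):
    y_h, _ = integrate(hermite_step, h)
    y_e, _ = integrate(euler_step, h)
    print(h, abs(y_h - math.cosh(5.0)), abs(y_e - math.cosh(5.0)))
```

Halving the step size reduces the error of the sketched scheme by about a factor of four, which is consistent with the second-order behaviour visible in the "This Method" columns of Table 1, while the Euler error only halves.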

1. N. Dyn, J. Approx. Theory 31 (1981) 22-32.
2. N. Dyn, Math. Comp. 43 (1984) 168.
3. K. Jetter, SIAM J. Numer. Anal. 19 (1982) 1081-1089.
4. A. Sharma, SIAM Rev. 14 (1972) 129-151.
5. M. Mori, Numerical Analysis (in Japanese), (Kyoritsu-Shyuppan, 1972), p.280.

IMPROVED SOR-LIKE METHOD WITH ORDERINGS
FOR NON-SYMMETRIC LINEAR EQUATIONS
DERIVED FROM SINGULAR PERTURBATION PROBLEMS

EMIKO ISHIWATA
Department of Mathematics, Waseda University
Ohkubo 3-4-1 Sinjyuku-ku, Tokyo 169, JAPAN
E-mail: 63m502@cn.waseda.ac.jp
and
YOSHIAKI MUROYA
Department of Mathematics, Waseda University
Ohkubo 3-4-1 Sinjyuku-ku, Tokyo 169, JAPAN

ABSTRACT
We consider the linear system Ax = b derived from singular perturbation problems.
To solve such non-symmetric linear problems, we propose a generalized
SOR method, which we have named the "improved SOR method with orderings".
We use three ideas, that is, orderings, variable relaxation parameters and a not
necessarily strictly upper triangular splitting matrix U in A = D - L - U. The
basic theorem, the selection of the relaxation parameters, orderings and several
numerical experiments are also presented.

1. Introduction

We will be concerned with the non-symmetric linear system of equations $Ax = b$.
To solve such equations, we propose a generalized SOR method, which we have named
the "improved SOR method with orderings".
For an $n \times n$ matrix $A$, we choose the proper permutation matrix $P$ and put

\[ \hat{A} = PAP^T, \quad \hat{x} = Px, \quad \hat{b} = Pb. \]

Then the improved SOR method with orderings can be expressed as

\[ \hat{x}^{(m+1)} = (\hat{D} - \hat{\Phi}\hat{L})^{-1}\{(I - \hat{\Phi})\hat{D} + \hat{\Phi}\hat{U}\}\hat{x}^{(m)} + (\hat{D} - \hat{\Phi}\hat{L})^{-1}\hat{\Phi}\hat{b}, \quad m = 0, 1, 2, \cdots \tag{1} \]

where for $\hat{A} = \hat{D} - \hat{L} - \hat{U}$, $\hat{D}$ is a diagonal matrix, $\hat{U}$ is an upper triangular matrix, $\hat{L}$
is a strictly lower triangular matrix and $\hat{\Phi} = \mathrm{diag}(\hat{\omega}_1, \cdots, \hat{\omega}_n)$ is a diagonal relaxation
matrix.
Let us put

\[ x^{(m)} = P^T \hat{x}^{(m)}, \quad \Phi = P^T \hat{\Phi} P = \mathrm{diag}(\omega_1, \cdots, \omega_n), \quad D = P^T \hat{D} P, \quad L = P^T \hat{L} P, \quad U = P^T \hat{U} P. \]

Then Eq.(1) is expressed as follows.

\[ x^{(m+1)} = (D - \Phi L)^{-1}\{(I - \Phi)D + \Phi U\}x^{(m)} + (D - \Phi L)^{-1}\Phi b, \quad m = 0, 1, 2, \cdots \tag{2} \]

For example, if $P = I$, then Eq.(1) is called the improved SOR method with
natural orderings. In particular, if $\hat{\omega}_i = \omega$, $i = 1, \cdots, n$, then we call Eq.(1) the
usual SOR method with natural ordering for $\omega$. If

\[ P = \begin{pmatrix} 0 & & 1 \\ & ⋰ & \\ 1 & & 0 \end{pmatrix} \quad \text{and} \quad \hat{\omega}_i = \omega, \quad i = 1, \cdots, n, \]

then we call Eq.(1) the SOR method with inverse ordering for $\omega$.
Our method has the following three features compared with the usual SOR method.
1) We take orderings into account. This turns out to be very important for
non-symmetric matrices.
2) We change the relaxation parameters $\hat{\omega}_i$, $i = 1, \cdots, n$ of $\hat{\Phi}$ usefully.
3) $\hat{U}$ need not be a 'strictly' upper triangular matrix.
Recently, H. Han et al. [8] and H.C. Elman and M.P. Chernesky [6] studied the effect
of the partitioning and ordering of the unknowns on the convergence of the Gauss-Seidel
iterations and gave a general procedure to automate the partitioning and ordering
phase of the solution process, for not only one-dimensional problems but also
two-dimensional problems of the discrete convection-diffusion equation. K.R. James [11]
expressed the iteration to vary all relaxation parameters as in Eq.(2) and derived
a range of values of the relaxation parameters of the Gauss-Seidel and Jacobi type,
together with the bounds of the spectral radii of the corresponding iteration matrices.
P.H. Brazier [3], D.B. Russell [13], J.C. Strikwerda [14] and L.W. Ehrlich [5] respectively
proposed special selections of the relaxation parameters for two-dimensional problems,
but they have no analytic results. J.J. Buoni and R.S. Varga [4] commented that
the splitting matrices L, U need not be triangular matrices.
We use all three ideas at the same time to solve non-symmetric linear equations.
As a result, we have obtained an effective SOR-like method which converges more
rapidly and with fewer iterations than the usual SOR method. In this paper, we
only prove the basic theorem on the tridiagonal matrix case with constant coefficients
and apply this theorem to several numerical examples of blocked systems, using the
special relaxation parameter $\omega_b$ with proper orderings.
Further results on the improved SOR method with orderings, that is, general convergence
theorems, special selections of the relaxation parameters $\omega_i$, $i = 1, \cdots, n$ and
orderings in practical use and relationships between our method and the direct methods
such as Gaussian elimination, etc. will be published elsewhere (see E. Ishiwata
and Y. Muroya [9,10]).
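
The first two ideas (orderings and one relaxation parameter per unknown) can be illustrated in componentwise form. The sketch below is a simplified illustration, not the authors' code: it assumes the classical splitting with a strictly upper triangular U, realizes the permutation P simply as the sequence in which the unknowns are updated, and uses hypothetical names (`sor_sweep`, `tridiag`) and an arbitrary test matrix.

```python
def sor_sweep(A, b, x, omega, order):
    """One sweep: unknowns are updated in the sequence `order`, each with its own omega[i]."""
    n = len(b)
    for i in order:
        off_diag = sum(A[i][j] * x[j] for j in range(n) if j != i)
        gauss_seidel = (b[i] - off_diag) / A[i][i]
        x[i] = (1.0 - omega[i]) * x[i] + omega[i] * gauss_seidel
    return x

def tridiag(n, l, u):
    """Matrix [-l, 1, -u]: sub-diagonal -l, diagonal 1, super-diagonal -u."""
    return [[1.0 if i == j else -l if j == i - 1 else -u if j == i + 1 else 0.0
             for j in range(n)] for i in range(n)]

n = 6
A = tridiag(n, 0.2, 0.7)
exact = [1.0] * n
b = [sum(A[i][j] * exact[j] for j in range(n)) for i in range(n)]

# Natural ordering (P = I) and inverse ordering, both with omega_i = 1.
for name, order in (("natural", list(range(n))), ("inverse", list(range(n - 1, -1, -1)))):
    x = [0.0] * n
    for _ in range(60):
        sor_sweep(A, b, x, [1.0] * n, order)
    print(name, max(abs(xi - 1.0) for xi in x))
```

Passing a non-constant list `omega` gives the variable-parameter iteration of Eq.(2) for this splitting; the third idea, a U that is not strictly upper triangular, is not covered by this sketch.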

2. Difference Schemes Derived from Singular Perturbation Problems

We first consider the singularly perturbed two-point boundary value problem [12,15]

\[ -\varepsilon u''(x) - a(x)u'(x) + b(x)u(x) = f(x), \quad x \in (0,1), \]
\[ u(0) = \gamma_0, \quad u(1) = \gamma_1, \tag{3} \]

where $\varepsilon$ is a parameter in $(0,1]$ and the functions $a, b, f$ lie in $C^2[0,1]$ and are
independent of $\varepsilon$. We first assume that there exist constants $\alpha, \beta$ such that

\[ a(x) > \alpha > 0, \quad b(x) > \beta, \quad \alpha^2 + 4\varepsilon\beta > 0. \tag{4} \]

Under these hypotheses Eq.(3) has a unique solution $u(x)$. This solution has, in
general, a boundary layer at $x = 0$ for $\varepsilon$ near 0.
Now, as examples of non-symmetric difference equations, we show only two examples
of difference equations derived from singular perturbation problems.
Let $n$ be a positive integer and $h = 1/(n+1)$ be the uniform mesh width. The
nodes in $[0,1]$ are $x_i = ih$, $i = 0, 1, \cdots, n+1$.
i) Upwind difference scheme

\[ -\varepsilon\,\frac{y_{i-1} - 2y_i + y_{i+1}}{h^2} - a_i\,\frac{y_{i+1} - y_i}{h} + b_i y_i = f_i, \quad i = 1, \cdots, n, \tag{5} \]
\[ y_0 = \gamma_0, \quad y_{n+1} = \gamma_1, \]

where $a_i = a(x_i)$, $b_i = b(x_i)$ and $f_i = f(x_i)$. Then we obtain a difference equation
$-l_i y_{i-1} + y_i - u_i y_{i+1} = k_i$, $i = 1, \cdots, n$, where

\[ l_i = \frac{\varepsilon}{2\varepsilon + a_i h + b_i h^2}, \quad u_i = \frac{\varepsilon + a_i h}{2\varepsilon + a_i h + b_i h^2}. \]

If $b_i = 0$, then $l_i + u_i = 1$ holds.
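
The coefficients of the upwind scheme are simple to tabulate. The following sketch only evaluates the expressions for $l_i$ and $u_i$ given above; the function name `upwind_coefficients` and the sample data ($\varepsilon = 10^{-3}$, $a(x) = 1 + x$, $b(x) = 0$, $n = 9$) are hypothetical choices for illustration.

```python
def upwind_coefficients(eps, a, b, n):
    """Return the lists (l_i), (u_i), i = 1..n, on the uniform mesh h = 1/(n+1)."""
    h = 1.0 / (n + 1)
    l, u = [], []
    for i in range(1, n + 1):
        x = i * h
        denom = 2.0 * eps + a(x) * h + b(x) * h * h
        l.append(eps / denom)
        u.append((eps + a(x) * h) / denom)
    return l, u

l, u = upwind_coefficients(1e-3, lambda x: 1.0 + x, lambda x: 0.0, 9)
print(max(abs(li + ui - 1.0) for li, ui in zip(l, u)))
print(l[0], u[0])
```

For small $\varepsilon$ the super-diagonal coefficient $u_i$ is close to 1 and $l_i$ is close to 0, which is the strongly non-symmetric situation the orderings are meant to exploit.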
ii) The El-Mistikawy and Werle difference scheme
The piecewise constant approximation $\bar{a}(x)$ of $a(x)$ on $[0,1]$ is defined by

\[ \bar{a}(x) = \begin{cases} \bar{a}_i, & x \in [x_{i-1}, x_i), \quad i = 1, 2, \cdots, n+1, \\ \bar{a}_{n+1}, & x = 1, \end{cases} \]

where $\bar{a}_i = (a(x_{i-1}) + a(x_i))/2$ for $i = 1, 2, \cdots, n+1$.
Piecewise constant approximations $\bar{b}$ and $\bar{f}$ of $b(x)$ and $f(x)$ respectively are defined
analogously. The test functions $\{\psi_k\}_{k=1}^{n}$ are defined by

\[ -\varepsilon\psi_k'' + \bar{a}\psi_k' + \bar{b}\psi_k = 0, \quad \text{on } (x_j, x_{j+1}), \quad j = 0, 1, \cdots, n, \]
\[ \psi_k(x_j) = \delta_{kj}, \quad j = 0, 1, \cdots, n+1. \tag{6} \]

Under the assumption Eq.(4), the El-Mistikawy and Werle difference scheme is
f, 7
denoted by Aii — b where u = {^(xi),--
11
• ,u (a; )) '. This matrix A = [flyr] is an
h
n

irreducible tridiagonal matrix such that


a
jJ-i = - e ^ f o - i + ° ) < ° . %-irf = ^ l ( ^ - 0 ) < 0 5 j = l,---,n,u+l
<tt = - + °) + - »(0t*» " 0) > 0, j = V-1«
b = (6i, - - . O n ) ' , where 6,r = (/,i/<j) - fij,iai 7o - <5,>cm,n+i7i-
|0

This holds a > 3iJ + If 6(x) = 0 hold, then = |0*j-il +


Moreover, fljj-i — = " ( i j ) > a > 0.
This difference scheme for solving Eq.(3) on a uniform mesh in $[0,1]$ was proposed
in T.M. El-Mistikawy and M.J. Werle [7]. A.E. Berger et al. [2], and E. O'Riordan
and M. Stynes [12,15] gave independent proofs that the El-Mistikawy and Werle scheme
is uniformly second-order accurate (that is, all nodal errors are bounded by $Ch^2$,
where the constant $C$ is independent of $x$, $h$, and $\varepsilon$). If the function $a(x)$ is allowed
to have a zero in $(0,1)$ and the zero of $a(x)$ is assumed to be simple, and $a(0)a(1)$
must not vanish, then R.B. Kellogg et al. [1] proved that the modified El-Mistikawy and
Werle scheme is uniformly first-order accurate in the case of having a turning point
$x_k$ such that $\beta_k < 0$ or $\beta_k > 1$, where $\beta_k = b(x_k)/a'(x_k)$.

3. Basic Theorem for Simple Tridiagonal Matrices

In this section, we show the error estimates for the SOR method with the special
relaxation parameter $\omega = \omega_b$.

We use an $n \times n$ tridiagonal matrix expression as [16]

\[ A = [a_i, b_i, c_i] = \begin{pmatrix} b_1 & c_1 & & & 0 \\ a_2 & b_2 & c_2 & & \\ & \ddots & \ddots & \ddots & \\ & & a_{n-1} & b_{n-1} & c_{n-1} \\ 0 & & & a_n & b_n \end{pmatrix}. \]

In particular, let us consider an $n \times n$ simple tridiagonal matrix $A = [-l, 1, -u]$,
that is, $a_i = -l$, $b_i = 1$, $c_i = -u$, $i = 1, \cdots, n$. Then the eigenvalue $\mu_j$ of the Jacobi
matrix of $A$ is well known as

\[ \mu_j = 2\sqrt{lu}\,\cos\frac{j\pi}{n+1}, \quad j = 1, \cdots, n. \tag{7} \]

Assume $4lu < 1$. We set

\[ \omega_b = \frac{2}{1 + \sqrt{1 - 4lu}} \quad \text{and} \quad \omega_{opt} = \frac{2}{1 + \sqrt{1 - \mu_1^2}}. \]

Note that
$1 < \omega_{opt} < \omega_b = \lim_{n\to\infty} \omega_{opt} < 2$ if $lu > 0$ and $0 < \omega_b = \lim_{n\to\infty} \omega_{opt} < \omega_{opt} < 1$ if $lu < 0$.
Let us express $x$ to be the unique vector solution and $x^{(m)}$ to be the $m$-th iteration
vector in the SOR method for $\omega = \omega_b$, and the $p$-th element of the error vector
$e^{(m)} = x^{(m)} - x$, $m \ge 0$, be $[e^{(m)}]_p = e_p^{(m)}$. Now we have the basic theorem for
$\omega = \omega_b$.
Theorem 1 Assume lu / 0 and 4lu < 1. Then for w = Q , X = w - 1 and b b

n+li ,1+1
? = 0,1,2,---. i / m = 9(71 + 1 ) , thenwehave ef = |A|«< >. e<°> and

if m = q(n + 1) + k, 1 < fe <n, then

) U II 1<p < n - A
u
mi
u t At !
4 = /A u
p = n - fe+ 1

u k A; JO)
I <4°' | , n - Je + 2 < p < n
ZA u

8
Proo/. Let the eigenvalues of the SOR matrix C be \ = (w — lje* ', j = 1,- -• ,n. u
17 2
Then , i can be expressed by %j and
3 = cosfij + isindj, where i = —1.
A,r+w - 1 1
- 1)5 , i. (w - 1)5 ^
H = ' .. = - ( 1 + e"-)e-i > = i ^-2cos ^
e

w(A )a 3 "j w 2
c o s
Since / i , - is already expressed by Eq.(7), we obtain cos^ = U\J-£ti ^i- Be-
cause of Cij = lit, we have u)^Jluj{us — 1) = 1 and fly = 2jw/{n+ 1), 1 < j < n.
The eigenvector u , corresponding to A can be defined by }

pjTf ,p5
e ' sinpfly
I — I ^ n + l " U

A {isin2pfl +i(l-cos2pe? )| i >

where tJj = Pj/2 = j j r / ( n + 1), j= l , - - , n . Then we can get the next relation

n
2
£ < f « , where c[ — Y" ei (sin2A:e +tcos2*fl,-)-
0,
i (8)

The reason is as follows :

£ 4 % = £
—*—• £ {sin2p^ + i ( l - cos2pfl,)} (sin 2/fcff,- + t cos 2*9^)
3=1 3=0

0)
XA V'" e[ *
*=1 y=o

We use the relations £ J sin 2pt?y - cos 2kS = 0 and £ " cos 2k&j = 0 . If p = A
= 0 } = 0

holds, then 5Z™ cos 2(p —fc)fl -= n + 1. Otherwise, if p / k and 1 < p, k < n, then
=0 3

E"= cos2(p - k)g~ = 0. Hence we obtain Eq.(8).


0 s

Therefore each error vector at m-th iteration is expressed as

U
" F
j=l „ j=l \V /
Substitute the above expression of cf^ to this formula, then the error vector is rewrit-
ten as
-k P

v
u j=0 n + 1

where
*(2m >-2fc)Jy
e +J . a i n p 9 j . _ { c o s ( p + 2(m-fc))fl + £sin (p + 2(m - k))0j} • sinpfij
>

= -r {sin2(p + m -fc)(7y- sin 2(m - k)S -} + 5 {-cos2(p + m - } + cos2(m - .

Since 8, = jv/(n + 1), then E"= sin 2 ( p + m - Jfc)# = £ J s i n 2 ( m - fc)0 = 0. The


0 3 = 0 y

error vector is finally defined as


(0) , « n
6 J 1
''
+ i • J2 {cos2(m - k)9j - cos2(p + m - ,

from which each value of p, m determines the error vector explicitly and we can finally
obtain the proof. •
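
The periodic behaviour stated in the theorem for $m = q(n+1)$ can be observed numerically. The sketch below is an illustration under stated assumptions, not the authors' experiment: it takes $\omega_b = 2/(1 + \sqrt{1 - 4lu})$ and $\omega_{opt} = 2/(1 + \sqrt{1 - \mu_1^2})$ as the definitions, restricts itself to $lu > 0$ with the natural ordering, and iterates directly on the error equation (right-hand side zero). All names and the sample values $l = 0.4$, $u = 0.3$, $n = 4$ are hypothetical.

```python
import math

def omega_b(l, u):
    return 2.0 / (1.0 + math.sqrt(1.0 - 4.0 * l * u))

def omega_opt(l, u, n):
    mu1_sq = 4.0 * l * u * math.cos(math.pi / (n + 1)) ** 2
    return 2.0 / (1.0 + math.sqrt(1.0 - mu1_sq))

def sor_errors(l, u, n, omega, sweeps):
    """Error vectors of natural-order SOR for [-l, 1, -u] x = b (error equation, b = 0)."""
    e = [1.0 + 0.1 * p for p in range(n)]   # arbitrary initial error
    history = [list(e)]
    for _ in range(sweeps):
        for i in range(n):
            left = l * e[i - 1] if i > 0 else 0.0
            right = u * e[i + 1] if i < n - 1 else 0.0
            e[i] = (1.0 - omega) * e[i] + omega * (left + right)
        history.append(list(e))
    return history

l, u, n = 0.4, 0.3, 4
w = omega_b(l, u)
lam = w - 1.0
hist = sor_errors(l, u, n, w, n + 1)
print(omega_opt(l, u, n), w)
print([hist[n + 1][p] / hist[0][p] for p in range(n)], lam ** (n + 1))
```

After exactly $n + 1$ sweeps every component of the error has been multiplied by the same factor $\lambda^{n+1}$ with $\lambda = \omega_b - 1$, which matches the case $m = q(n+1)$ of the theorem; in between, the components behave differently, as the remaining cases describe.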
For practical purposes, we denote two error estimates more explicitly, one for
$|l| > |u|$ and the other for $|l| < |u|$ (cf. H.C. Elman and M.P. Chernesky [6]).
Corollary 1 Assume \l\ > \u\ on Theorem I . Then for q — 0,1,2, • • •,
n+1))
if m = q(n + 1), then we get simply e^ = • e<°>, and
if m = q{n + 1) + k, 1 < k < n, then

\M «(«+!> p = n —fc+ 1

1 + 4
wfcere d, - ^ ' " and |d,| - ~J\Xl/u\.

If\l + u\ = 1, then |d,| = 1. Otherwise, if\l + u\ < 1, men |A| < 1^| < 1,
and if\l-rv\ > 1, then |A| < 1 < \di\. (9)

Corollary 1 implies that if $|l + u| < 1$, then the convergence ratio per one iteration
of $\max_{1 \le p \le n} |e_p^{(q(n+1)+k)}|$ for $1 \le k \le n$ is $|\lambda/d_1|$ which is greater than $|\lambda|$, but the number
of iterations (see Section 5) is independent of $n$. On the other hand, if $|l + u| > 1$,
then $|d_1| > 1$ holds and the number of iterations increases according to $n$, but if $n$ is
sufficiently large, then the number of iterations is not more than $q(n+1)$, where the
integer $q$ satisfies $|\lambda|^{q(n+1)} \max_{1 \le p \le n} |e_p^{(0)}| < \delta$ for an admissible error bound $\delta$.

Corollary 2 Assume \l\ < \u\ on Theorem I . Then for q = 0,1,2, • • -,


n+1 1
if m= q(n + 1), then we get simply ej?< » = jA-j*** * • e<°>, and
if m = q(n+ 1) + k, l<k<n, then

• 4 • U& - (f )" • Jf ] , 1< p <n - k

H -
Iff ]
n+1

<t <r\. >•

replaced di by
v
1 + Vl —4lu
Corollary 2 implies that if $|l + u| < 1$, then the convergence ratio per one iteration
of $\max_{1 \le p \le n} |e_p^{(q(n+1)+k)}|$ for $1 \le k \le n$ is $|d_2|$, which is greater than $|\lambda|$, but in spite
of $|l| < |u|$, the number of iterations is independent of $n$. On the other hand, if
$|l + u| > 1$, then $|d_1| > 1$ holds and the number of iterations should theoretically be
almost $q(n + 1)$, where the smallest integer $q$ satisfies $|\lambda|^{q(n+1)} \max_{1 \le p \le n} |e_p^{(0)}| < \delta$. But in
practical computations, if $n$ is large, then this may not hold because of the computational
errors (see Example 2).
Since (|A|/|di|)/|o2| — |A| < 1, the case of \l\ < |u| is not better than the case of
|/j > |u|. Example 1 and 2 imply those results. That is, for p = 1, • • •, n — k, k —
P o f e m ) i n m
1, • • • , n , the term d\ • efl or jA|" • 4 " • 4 ° '
k Corollary 2 remains and e< >
p

does not decrease until m becomes a multiple of (n + 1). On the other hand, we note
in both cases of Corollary 1 and 2 that
m) m
e p = ]X\ ef\ m = (n + l),
q 9 = 1,2,3,---

4. The Permutation Matrices and the Relaxation Parameters

In this section, we describe how to determine the permutation matrices for good
orderings (cf. H. Han et al.$^{8}$) and the relaxation parameters. We first define the
turning points of the matrix.
Definition 1 Let us consider an $n \times n$ tridiagonal system $Ax = b$, where $A =
[-l_i, 1, -u_i]$, $i = 1, \dots, n$. If there is an integer $k$ such that $3 \le k \le n - 2$ and

$$(|l_{k-1}| - |u_{k-1}|)(|l_{k+1}| - |u_{k+1}|) < 0, \qquad (|l_k| - \tfrac{1}{2})(|u_k| - \tfrac{1}{2}) > 0,$$

then we call $x_k$ a turning point.
In particular, if $|l_k|, |u_k| < \frac{1}{2}$, $|l_{k-1}| < |u_{k-1}|$ and $|l_{k+1}| > |u_{k+1}|$, then we call $x_k$
a "stable" turning point and if $|l_k|, |u_k| > \frac{1}{2}$, $|l_{k-1}| > |u_{k-1}|$ and $|l_{k+1}| < |u_{k+1}|$, then
we call $x_k$ an "unstable" turning point.
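Definition 1 can be sketched in code as follows. This is a minimal illustration, not the authors' implementation: the function name `turning_points`, the 0-based indexing, and the extra label "unclassified" (for sign patterns that the paper excludes by assumption) are all assumptions made here.

```python
def turning_points(l, u):
    """Classify interior indices of a tridiagonal matrix [-l_i, 1, -u_i].

    Indices are 0-based, so the range 3 <= k <= n-2 of Definition 1
    becomes 2 <= k <= n-3 here.
    """
    n = len(l)
    found = []
    for k in range(2, n - 2):
        before = abs(l[k - 1]) - abs(u[k - 1])
        after = abs(l[k + 1]) - abs(u[k + 1])
        if before * after >= 0:
            continue  # dominance of l over u does not switch across k
        if (abs(l[k]) - 0.5) * (abs(u[k]) - 0.5) <= 0:
            continue  # |l_k| and |u_k| are not on the same side of 1/2
        if abs(l[k]) < 0.5 and before < 0 < after:
            found.append((k, "stable"))
        elif abs(l[k]) > 0.5 and before > 0 > after:
            found.append((k, "unstable"))
        else:
            found.append((k, "unclassified"))  # hypothetical label, not in the paper
    return found


# Small instance shaped like Example 4 (n = 12, turning point at index 7).
n, r = 12, 7
l = [0.9] * (r + 1) + [0.25] * (n - r - 1)
u = [0.1] * r + [0.75] * (n - r)
result = turning_points(l, u)
```

Swapping the roles of the two blocks (small `l` first, as in Example 5) makes the same index a stable turning point.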

In this paper, we assume that $|l_i|, |u_i| \ne \frac{1}{2}$, $i = 1, \dots, n$ and if there are turning
points $x_{r_k}$, $k = 1, \dots, p$ where $1 < r_1 < r_2 < \dots < r_p < n$, then each $x_{r_k}$ is
only a stable turning point or an unstable turning point such that if $p \ge 2$, then
$(|l_{r_k}| - \frac{1}{2})(|l_{r_{k+1}}| - \frac{1}{2}) < 0$, $k = 1, 2, \dots, p - 1$.
For the $n \times n$ tridiagonal matrix $A = [-l_i, 1, -u_i]$ with turning points, we now
show how to choose the permutation matrix. We call the orderings good orderings if
the permutation

$$\sigma = (\sigma(1), \sigma(2), \dots, \sigma(n))$$

corresponds to the permutation matrix $P$ and $\sigma$ satisfies all of the following conditions.

1) Suppose that $x_k$, $1 < k < n - 1$, is not a turning point. Then if $|l_k| > |u_k|$, then
$\sigma(k - 1) < \sigma(k) < \sigma(k + 1)$ and if $|l_k| < |u_k|$, then $\sigma(k - 1) > \sigma(k) > \sigma(k + 1)$.

2) Suppose that $x_k$ is a turning point. Then if $|l_{k-1}| < |u_{k-1}|$ and $|l_{k+1}| > |u_{k+1}|$,
then $\sigma(k - 1), \sigma(k + 1) > \sigma(k)$ and if $|l_{k-1}| > |u_{k-1}|$ and $|l_{k+1}| < |u_{k+1}|$, then
$\sigma(k - 1), \sigma(k + 1) < \sigma(k)$.

Then, by assumption, we can practically use good orderings.
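Conditions 1) and 2) can be checked mechanically for a candidate ordering. The sketch below is only an illustration under stated assumptions: the name `is_good_ordering` is hypothetical, lists are 0-based, the turning points are passed in by the caller as a dictionary, and every interior index is tested (the paper's exact index range for condition 1 is not fully legible).

```python
def is_good_ordering(l, u, sigma, turning):
    """Check conditions 1) and 2) for the ordering sigma (0-based lists).

    turning maps an index k to "stable" or "unstable"; every other
    interior index is treated as a non-turning point.
    """
    n = len(sigma)
    for k in range(1, n - 1):
        left, mid, right = sigma[k - 1], sigma[k], sigma[k + 1]
        kind = turning.get(k)
        if kind == "stable":
            ok = left > mid and right > mid      # local minimum of sigma
        elif kind == "unstable":
            ok = left < mid and right < mid      # local maximum of sigma
        elif abs(l[k]) > abs(u[k]):
            ok = left < mid < right              # increasing where |l_k| > |u_k|
        elif abs(l[k]) < abs(u[k]):
            ok = left > mid > right              # decreasing where |l_k| < |u_k|
        else:
            ok = True  # |l_k| == |u_k| is not covered by the conditions
        if not ok:
            return False
    return True


# Matrix shaped like Example 4 with n = 12 and an unstable turning point at index 7.
l = [0.9] * 8 + [0.25] * 4
u = [0.1] * 7 + [0.75] * 5
good = [0, 1, 2, 3, 4, 5, 6, 11, 10, 9, 8, 7]
natural = list(range(12))
```

Here `good` rises up to the unstable turning point and falls afterwards, whereas the natural ordering violates condition 2) at the turning point.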
We now show examples of the turning points and the permutation matrices. We
apply the n x n tridiagonal matrix A = \—lj, 1, —Vj\ such that t

lj = h (2 < 3
< r\) where ii+wi = l , ii,fii>0
= Hi (l<j<n-D
h =h (r, + l < j < n )
where l + u^=l, l ,Ui>0
Uj = u\ (n<j<n-l) 2 2

Ci ~ |)(t/2 - I) > 0, then x , is a turning point. We call (r, - 1) x ( n - 1)


| E
T

submatrix Ai = [—li, 1, - U i ] and (n - ri) x (n - r,) submatrix A = |-i ,1, -uaj 2 2

respectively the first block and the second block of A.


As simple examples, the permutation matrix P is expressed as P, if 0 < Fi.fij <
and P if (,,uj > | .
u

/ 0 ••• 0 1 0 ••• 0\ r 1
1 0

0 0
1 0
0 0
0
01
0 1 0
^0-010

Note that the turning points of matrices in Definition 1 are similar to the turning
points of the singular perturbation problem, Eq.(3).
Now let us consider an $n \times n$ tridiagonal matrix $A = [-l_i, 1, -u_i]$ with $l_1 = 1 - u_1$
and $u_n = 1 - l_n$ and assume $4 l_i u_i < 1$, $i = 1, \dots, n$. Assume this matrix $A$ has several
blocks as mentioned above. Then we select $\omega_{b,j}$ for the $j$-th block $[-l_j, 1, -u_j]$ of $A$
as

$$\omega_{b,j} = \frac{2}{1 + \sqrt{1 - 4 l_j u_j}}.$$

Note that if $l_j + u_j = 1$, then $\omega_{b,j} = \dfrac{1}{\max(|l_j|, |u_j|)}$.

Finally, we propose the selection of $\omega_i$ for an $n \times n$ tridiagonal matrix $A =
[-l_i, 1, -u_i]$ (see Example 9).

i) If $x_i$ is not a turning point, then $\omega_i = \dfrac{2}{1 + \sqrt{1 - 4 l_i u_i}}$. If $l_i + u_i = 1$, then
$\omega_i = \dfrac{1}{\max(|l_i|, |u_i|)}$.

ii) If there are turning points $x_k$, then we choose $\omega_k$ as

$$\omega_k = \frac{1}{1 - (l_k + u_k)} \quad \text{if } x_k \text{ is a stable turning point,}$$

$$\omega_k = \frac{1}{(l_k + u_k) - 1} \quad \text{if } x_k \text{ is an unstable turning point and } l_k, u_k > 0.$$

These selections are derived in consideration of a priori error estimates.
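The selection rules above are easy to evaluate numerically. The following is a small sketch; the function names `omega_block` and `omega_turning` are hypothetical, and the sample values are the ones that appear in Examples 4 to 7.

```python
import math


def omega_block(l, u):
    """Relaxation parameter 2 / (1 + sqrt(1 - 4 l u)) for a block [-l, 1, -u]."""
    return 2.0 / (1.0 + math.sqrt(1.0 - 4.0 * l * u))


def omega_turning(l, u, kind):
    """Relaxation parameter at a turning point, following rule ii)."""
    s = l + u
    if kind == "stable":
        return 1.0 / (1.0 - s)
    if kind == "unstable":
        return 1.0 / (s - 1.0)
    raise ValueError("kind must be 'stable' or 'unstable'")


w_block_1 = omega_block(0.9, 0.1)                # block with l = 0.9, u = 0.1
w_block_2 = omega_block(0.67, 0.33)              # block with l = 0.67, u = 0.33
w_unstable = omega_turning(0.9, 0.75, "unstable")
w_stable = omega_turning(0.25, 0.33, "stable")
```

With these inputs the functions reproduce the tabulated values 1.111..., 1.4925..., 1.5384... and 2.3809... quoted in the examples, which also serves as a check on the reconstructed formulas.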

5. Numerical Experiments

In this section, we show several numerical experiments satisfying the previous conditions
for tridiagonal matrices with turning points.
For simplicity, in all examples we specify the solution vector $x = [x_i]$ of $Ax = b$
as $x_i = 1$, $i = 1, \dots, n$ and the starting vector $x^{(0)} = [x_i^{(0)}]$ as $x_i^{(0)} = 0$, $i = 1, \dots, n$.
We iterate until we reach the first positive integer $\hat{m}$ such that each component $e_i^{(\hat{m})}$
is less than the admissible error bound $\delta = 10^{-8}$ in magnitude. We call this $\hat{m}$ the
number of iterations.
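The experimental setup just described can be sketched as a plain SOR sweep in natural ordering. This is an illustrative reconstruction, not the authors' program: the name `sor_iterations`, the iteration cap `max_iter`, and the use of the maximum componentwise error as the stopping test are assumptions.

```python
def sor_iterations(l, u, omega, delta=1e-8, max_iter=100000):
    """Count SOR sweeps (natural ordering) for A = [-l_i, 1, -u_i].

    The right-hand side is b = A * (1, ..., 1), so the exact solution is
    x_i = 1; the start vector is zero. Returns None if max_iter is reached.
    """
    n = len(l)
    b = [1.0 - (l[i] if i > 0 else 0.0) - (u[i] if i < n - 1 else 0.0)
         for i in range(n)]
    x = [0.0] * n
    for m in range(1, max_iter + 1):
        for i in range(n):
            left = l[i] * x[i - 1] if i > 0 else 0.0       # already updated value
            right = u[i] * x[i + 1] if i < n - 1 else 0.0  # value from previous sweep
            x[i] = (1.0 - omega) * x[i] + omega * (b[i] + left + right)
        if max(abs(xi - 1.0) for xi in x) < delta:
            return m
    return None


# Constant-coefficient case l = 0.9, u = 0.1 with omega = 1 / 0.9 and n = 10.
count = sor_iterations([0.9] * 10, [0.1] * 10, 1.0 / 0.9)
```

A per-component relaxation parameter, as in the improved method, would replace the scalar `omega` by a list indexed by `i`; that variant is not shown here.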
We first apply the SOR method for $\omega = \omega_b$ to a simple tridiagonal system $Ax = b$,
where $A = [-l, 1, -u]$, $l + u = 1$.
Example 1. For $A = [-l, 1, -u]$, $l + u = 1$, we apply $\omega = \omega_b$.
Table 1. The number of iterations for $\omega = \omega_b$

   n    l=0.9  l=0.1  l=0.75  l=0.25  l=0.67  l=0.33  l=0.6  l=0.4
   10     9     11     17      22      27      33     46     55
   50     9     51     17      51      27      51     46     51
  100     9    101     17     101      27     101     46    101
  200     9    201     17     201      27     201     46    201
  500     9    501     17     501      27     501     46    501
 1000     9   1001     17    1001      27    1001     46   1001

If $|l| > |u|$, then the number of iterations is small and independent of $n$. But if
$|l| < |u|$, then it depends on $n$ and equals a multiple of $(n + 1)$.
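For the case $|l| > |u|$ with $l + u = 1$, the counts in Table 1 are consistent with a simple estimate: with $\omega_b = 1/l$ and $|\lambda| = \omega_b - 1$, the error shrinks by $|\lambda|$ per iteration, so about $\log\delta / \log|\lambda|$ iterations are needed. The sketch below assumes this reading and rounds up; the function name `predicted_iterations` and the rounding rule are assumptions, not statements from the paper.

```python
import math


def predicted_iterations(l, delta=1e-8):
    """Smallest integer m with (omega_b - 1)^m <= delta, for l + u = 1, l > u > 0."""
    omega_b = 1.0 / l           # omega_b = 1 / max(|l|, |u|) when l + u = 1
    lam = omega_b - 1.0         # modulus of the eigenvalues of the SOR matrix
    return math.ceil(math.log(delta) / math.log(lam))


predictions = [predicted_iterations(v) for v in (0.9, 0.75, 0.67, 0.6)]
```

Under these assumptions the predictions agree with the columns $l = 0.9, 0.75, 0.67, 0.6$ of Table 1.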
Example 2. For $A = [-l, 1, -u]$, $l + u \ne 1$ and $lu > 0$, we apply $\omega = \omega_b$.

Table 2. The number of iterations for $l + u \ne 1$ and $lu > 0$

        |             l + u < 1             |              l + u > 1
   n    | l=0.8  0.1   0.6   0.1   0.6  0.3 |  1.0   0.1    1.1   0.2    1.5   0.1
        | u=0.1  0.8   0.1   0.6   0.3  0.6 |  0.1   1.0    0.2   1.1    0.1   1.5
   10   |   9    11     9    11    17   22  |   9    11     29    33     16    22
   30   |   9    31     9    31    20   31  |  10    31     28    31     18    31
   50   |   9    51     9    42    20   51  |  10    51     37   100*    24    57*
   70   |   9    71     9    42    20   71  |  13    71     45   130*    30   109*
  100   |   9   101     9    42    20   76  |  14   101     75   172*    49   147*
  200   |   9   141     9    42    20   76  |  20   204*   131   355*    86   276*
  300   |   9   141     9    42    20   76  |  26   310*   207   514*   126   411*

If $|l + u| < 1$, then the numbers of iterations in both cases $|l| > |u|$ and $|l| < |u|$ are
independent of $n$. On the other hand, if $|l + u| > 1$ and $|l| < |u|$, then the number of
iterations must theoretically be a multiple of $(n + 1)$. But in these cases, if $n$ becomes
sufficiently large, then we may need more iterations than a multiple of $(n + 1)$ because
of the computational errors. We mark such cases with a superscript * on the numbers of
iterations in Table 2.
Hence in practical computations, we should transform the case of $l + u \ne 1$ into
the case of $l + u = 1$, which can be done by using $d$ such that $dl + u/d = 1$, as defined in
Corollaries 1 and 2.
Example 3. For $A = [-l, 1, -u]$, $l + u = 1$, we change $\omega$, $0 < \omega < 2$.
Table 3. The case of $0 < \omega < 2$ and $n = 100$

   ω    l=0.1  l=0.9  m_d | l=0.25  l=0.75  m_d | l=0.33  l=0.67  m_d
  0.7    245    146    99 |  412     314     98 |  650     553     97
  0.8    205    106    99 |  341     243     98 |  536     439     97
  0.9    172     73    99 |  285     187     98 |  446     349     97
  1.0    144     45    99 |  240     142     98 |  373     276     97
  1.1    114     16    98 |  201     103     98 |  312     215     97
  1.2    128     29    99 |  167      69     98 |  260     163     97
  1.3    153     55    98 |  132      34     98 |  214     117     97
  1.4    187     86   101 |  126      28     98 |  171      74     97
  1.5    229    132    97 |  150      52     98 |  120      27     93
  1.6    293    198    95 |  192      93     99 |  146      49     97
  1.7    407    308    99 |  248     147    101 |  186      89     97

Here $m_d$ means the difference between the numbers of iterations in the cases $|l| > |u|$ and
$|l| < |u|$. In Table 3, $m_d$ is almost $n - 1$. H. C. Elman and M. P. Chernesky$^{6}$ considered
the case of $\omega = 1$ and obtained similar results.



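Remark 1 For the $n \times n$ tridiagonal matrix $A = [-l, 1, -u]$, let $lu > 0$ and
$\omega_{opt} \le \omega_b \le \tilde{\omega} < 2$; then for any eigenvalue $\lambda$ the fact $|\lambda| = \tilde{\omega} - 1$ is known.
Let $m_{\tilde{\omega}}$ be the number of iterations for $\omega = \tilde{\omega}$. Let $\lambda_b = \omega_b - 1$ and $\tilde{m}$ be a constant
such that $|\lambda|^{\tilde{m}} = \delta$. Then under the assumptions we can guess $m_{\tilde{\omega}}$ as $m_{\tilde{\omega}} = \tilde{m} + \alpha n$,
where a parameter $\alpha$ is determined by $\sqrt{|\lambda_b/\lambda|} = |\lambda|^{\alpha}$ and does not depend on $n$.
For example in Table 3, if $l = 0.9$, $\tilde{\omega} = 1.3$ and $\omega_b = 1.111111111111111$, then
$\lambda = 0.3$, $\alpha = 0.4124$, $\tilde{m} = 15$ and $m_{\tilde{\omega}} = 56$.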
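The worked numbers in Remark 1 can be reproduced with a few lines. This is a sketch under assumptions: the function name `remark1_estimate` is hypothetical, and rounding $\tilde{m}$ and the final estimate to the nearest integer is a guess chosen to match the quoted values 15 and 56.

```python
import math


def remark1_estimate(omega, omega_b, n, delta=1e-8):
    """Return (alpha, m_tilde, estimate) following the recipe of Remark 1."""
    lam = omega - 1.0        # |lambda| for the relaxation parameter actually used
    lam_b = omega_b - 1.0    # |lambda| for omega_b
    m_tilde = round(math.log(delta) / math.log(lam))       # |lam|^m_tilde ~ delta
    # sqrt(lam_b / lam) = lam**alpha  =>  alpha = 0.5 * log(lam_b / lam) / log(lam)
    alpha = 0.5 * math.log(lam_b / lam) / math.log(lam)
    return alpha, m_tilde, round(m_tilde + alpha * n)


alpha, m_tilde, estimate = remark1_estimate(1.3, 1.111111111111111, 100)
```

The estimate of 56 iterations is close to the measured value 55 listed in Table 3 for $\omega = 1.3$ and $l = 0.9$.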
Next we apply our method Eq.(1) to various tridiagonal matrices with turning
points. Before doing that, we explain the notation used in all the following
tables for the numbers of iterations under each condition. The parameter $\omega_{opt}$ in
all examples is the calculated value for the $n \times n$ tridiagonal matrices $A = [-l_i, 1, -u_i]$.
For $\omega = \omega_{opt}$, let $m_{ord}$, $m_{inv}$ and $m_{good}$ be the numbers of iterations for the SOR
method with natural ordering, with inverse ordering and with good ordering for each
block. Let $m_{\Phi}$ be the number of iterations for the improved SOR method with good
ordering and $\Phi = \mathrm{diag}(\omega_1, \dots, \omega_n)$ in each example.
We actually compare the numerical experiments with the theoretical numbers
which we guess by Remark 1 in the following tables.
Example 4. We apply the $n \times n$ tridiagonal matrix $A = [-l_j, 1, -u_j]$ such that

  l_j = 0.9,   u_j = 0.1,   ω_j = ω_{b,1} = 1.111111111111111,   (1 ≤ j ≤ [2n/3] - 1)
  l_j = 0.9,   u_j = 0.75,  ω_j = 1.538461538461539,             (j = [2n/3])
  l_j = 0.25,  u_j = 0.75,  ω_j = ω_{b,2} = 1.333333333333333,   ([2n/3] + 1 ≤ j ≤ n).

We denote by $[x]$ the maximal integer less than $x$.
Table 4. The case of an unstable turning point at $j = [(2n)/3]$

   n    value of ω_opt       m_ord  m_inv  m_good  m_Φ
   60   1.320605864340204      35     75     34     17
  120   1.329710139543094      54    136     55     17
  150   1.330960730282924      64    166     65     17
  210   1.332102986243628      83    226     85     17
  240   1.332394148600067      93    256     95     17
  300   1.332749378400198     115    316    115     17
  360   1.332954574021790     134    376    125     17

We apply good orderings and $ = diag(u>i, • • - , u ) denoted above to A. We guess


n

by Remark 1 that the number of iterations equals to that for the second block
Ai = [—0.25,1, —0.75] with good orderings because of Qhy. = maxuib,, =f Wopt.
For iii = let m ,\ be the number of iterations for the first block A\ =
b

[-0.9,1,-0.1] and m be that for the second block A = [-0.75,1,-0.25]. Then


b 2

by Remark 1, we guess m = m& = rn and m^, = mt + n/3 where a = 0.5. Similarly,


t b
n
we guess m^i =• rftt + / 3 and = m + 2n/3 = mj, + n. We can see that the
t l

computational results satisfy these relations.



Example 5. We apply the $n \times n$ tridiagonal matrix $A = [-l_j, 1, -u_j]$ such that

  l_j = 0.1,   u_j = 0.9,   ω_j = ω_{b,1} = 1.111111111111111,   (1 ≤ j ≤ [2n/3] - 1)
  l_j = 0.1,   u_j = 0.25,  ω_j = 1.538461538461539,             (j = [2n/3])
  l_j = 0.75,  u_j = 0.25,  ω_j = ω_{b,2} = 1.333333333333333,   ([2n/3] + 1 ≤ j ≤ n).

Table 5. The case of a stable turning point at $j = [(2n)/3]$

   n    value of ω_opt       m_ord  m_inv  m_good  m_Φ
   60   1.320605854262298      71     34     34     18
  120   1.329710155009154     130     53     55     18
  150   1.330960720871857     161     64     64     18
  210   1.332102985834606     221     84     83     18
  240   1.332394148552820     251     94     95     18
  300   1.332749378434252     312    115    115     18
  360   1.332954574049759     371    135    132     18

Similarly, we guess the numbers of iterations by Remark 1 to be m ^ i — rhi +


n/3, T/ionj = rfte + n and m™ = i h + rt/3. Practical computational results also
b

satisfy these relations.


Next we show the cases of having two turning points.
Example 6. We apply the n x n tridiagonal matrix A = [—lj, 1,—Uj] such that

l = 0.9,
3 u = 0.1, = C7J = 1.111111111111111, (1 < 3 < [ ] " 1)
C|1 3

lj = 0.9, u = 0.75, uj = 1.538461538461539,


1^ = 0.25, u = 0.75, Oy = i J , - 1.333333333333333, m\ + 1 < J < i f i - 1 )
t 2
2

ij — 0.25, u = 0.33, u>j = 2.380952380952381,


lj = 0.67, u = 0.33, u>j ~ W6,3 = 1.492537313432836, (lf] + l<j<n).

Table 6. The case with an unstable and a stable turning point

   n    value of ω_opt       m_ord  m_inv  m_good  m_Φ
   60   1.468250161411775      48     75     40     26
  120   1.485146834836888      69    135     61     27
  180   1.489059929933442      97    144     83     27
  240   1.490556405149029     122    185    105     27
  300   1.491295304620208     148    226    125     27
  360   1.491724815483151     173    268    147     27
  420   1.492005472669140     201    309    170     27

We show another interesting example. The next table presents the results for a
tridiagonal matrix which has a stable and an unstable turning point.

Example 7. We apply the $n \times n$ tridiagonal matrix $A = [-l_j, 1, -u_j]$ such that

  l_j = 0.012195,  u_j = 0.987805,  ω_j = ω_{b,1} = 1.012345554031413,   (1 ≤ j ≤ [n/3] - 1)
  l_j = 0.012195,  u_j = 0.33,      ω_j = 1.520207356283397,             (j = [n/3])
  l_j = 0.67,      u_j = 0.33,      ω_j = ω_{b,2} = 1.492537313432836,   ([n/3] + 1 ≤ j ≤ [2n/3] - 1)
  l_j = 0.67,      u_j = 0.9,       ω_j = 1.754385964912281,             (j = [2n/3])
  l_j = 0.1,       u_j = 0.9,       ω_j = ω_{b,3} = 1.111111111111111,   ([2n/3] + 1 ≤ j ≤ n)

where $\omega_j = \omega_{opt} = 1.492537260055542$, $j = 1, \dots, n$ for $m_{ord}$, $m_{inv}$ and $m_{good}$.

Table 7. With turning points at $j = [n/3], [(2n)/3]$

   n    m_ord  m_inv  m_good  m_Φ
  120    246    125    167     27
  180    360    181    241     27
  240    473    234    314     27
  300    582    284    386     27
  360    695    336    456     27
  420    808    389    529     27
  480    921    442    602     27
  540   1035    496    676     27
  600   1145    549    746     27
  660   1255    601    816     27
  690    845    616    617     27
  750    917    668    669     27
  840   1022    746    747     27
  900   1095    799    800     27

We consider the numbers of iterations given by Remark 1. Now for w = Wopi = Qt,2,
let fin,!, m ,2 and m ^ be the numbers of iterations for the first block, the second
0 D

block and the third block of A. Then we guess these to be respectively mt,! =
mi + 8n/9, m ,2 =? >n& and 771^3 = rftj, + n/3. Similarly we guess m d = m n + n / 3 =
0 or

m i + l l n / 9 and m = t%, + n/3 < rh + 8n/9.


inv h

For this tridiagonal matrix, the differences of the $\omega_{b,j}$ of two adjoining blocks are
very large. In such a case, we usually get a greater number of iterations if $n$ is not
so large. Because we guess the number of iterations under the assumption that $n$ is
sufficiently large, we note that in Table 7 if $n \le 660$, then $m_{ord}$ and $m_{good}$ are much
more than our guess and monotone increasing according to $n$. But if $n \ge 690$, then
we get the numbers of iterations which we guess by Remark 1.
The number of iterations $m_{inv}$ is monotone increasing according to $n$. But the
number of iterations $m_{\Phi}$ is much less than $n$ and independent of $n$.


Next, we consider the case of the tridiagonal matrix which has no turning points.
We divide the matrix $A$ into four blocks by $[n/4]$, $[n/2]$, and $[3n/4]$ as indicated below.
Since $l_i > u_i > 0$ holds, we can iterate with natural ordering, which is a good ordering.

Example 8. We apply the n x n tridiagonal matrix A = 1, —Uj] such that

j = 0.67, *** = 0.33, Wj) = 1^,1 = 1.492537313432836, (l<j<i?]-l)


i = 0.67, *»i = 0.25, Wf = 1.492537313432836, (J = tm
i = 0.75, «J = 0.25, Wj = (7J6, = 1.333333333333333,
2 ([fi+i<j<[f]-i)
j = 0.75, «* = 0.1, Wj = 1.333333333333333, o=in)
i = 0.9, Uj = 0.1, Wj = iLt = 1.111111111111111,
|3 a?i+i<j<[^]-i)
s = 0.9, v
i = 0.012195, w
i = 1.111111111111111,
i = 0.987805, *i = 0.012195, W
J = w = 1.012345554031413
M : (IfI + 1 < 3 < ")

where = + w * ) / 2 , fc=l*i/4], |n/2], [3n/4].


+1

Table 8. The case with no turning points


n value of Wopt "lint, raz b

60 1.450762203852421 68 132 71 25
100 1.474511840524642 129 212 132 24
160 1.484800375838160 196 330 199 24
200 1.487445518273925 243 412 246 24
240 1.488946709089202 290 491 293 24

We guess that the number of iterations usually coincides with the number
of iterations for the block with $\omega = \max_j \omega_{b,j}$, but in this case, $m_{\Phi}$ is a little less than
this.
Finally, we consider the case of changing uj , i = 1, • • - ,n for each entry. What
t

happens then? We consider the coefficient matrix which is derived from the upwind
difference scheme and the lower elements lj, j = l , - - , n are monotone decreasing
such that Ij + Uj = 1 and 0 < lj < j = 1,
Example 9. We apply the n x n tridiagonal matrix A = [—lj, 1, —u,) such that
E + Ojft 1 ., 1
=—rr>
L 2

T>i=—-—r. h
£= ai-th, u =—, i = l , - -,n.
2e + o /i'
j it + mri n +1 f
iij
Table 9. The case of changing all uii
n value of uJopi "lord mi mj m 4Ji

80 1.221880878114550 142 63 64 18 16 6
120 1.221880565772298 221 102 103 22 17 5
160 1.221880876205672 301 142 143 23 18 5
200 1.221880875020172 382 183 183 27 21 5
240 1.221880880390542 463 224 224 30 22 5
280 1.221880879444867 544 265 264 35 24 5
320 1.221880888127118 626 307 307 40 23 5
360 1.221880877534967 708 349 349 44 25 5
400 1.221880892204558 790 391 391 48 27 5

Let m be the number of iterations to apply the above w j — 1, • - •, n. Moreover,


ulj it

mi is the number of iterations to apply each UJ — H\, corresponding to each i-th


block which is split by the points [n/2), [3n/4], |9n/10] and m is the number of
7

iterations to use each ui = H> corresponding to each block which is split by the points
bii

[n/2], [3n/4], [4n/5], [9n/10], [19n/30], where we choose u\j = & • min u<
and the non-diagonal entries of i-th block are l , Uj for r i < j < ri.
}

It is clear that changing Wj, i = 1, • • •, n for each block is efficient from comparing
with m i or m j . But the most efficient case is to change all w , i =!,••• ,n.
t

Since the non-symmetry of this matrix is very strong, our method performs very
efficiently for these types.
All the above examples imply that the improved SOR method with orderings is
more rapidly convergent than the usual SOR method.

6. References

1. A. E. Berger, H. Han, and R. B. Kellogg, Math. Comp. 42, (1984) 465-492.
2. A. E. Berger, J. M. Solomon, and M. Ciment, Math. Comp. 37, (1981) 79-94.
3. P. H. Brazier, Comput. Meth. Appl. Mech. Engrg. 3, (1974) 335-347.
4. J. J. Buoni and R. S. Varga, in Numerical Mathematics, ed. R. Ansorge,
K. Glashoff and B. Werner (Birkhäuser, Basel, 1979) 65-75.
5. L. W. Ehrlich, J. Comput. Phys. 44, (1981) 31-45.
6. H. C. Elman and M. P. Chernesky, in Recent Advances in Iterative Methods,
ed. G. Golub, A. Greenbaum and M. Luskin (1994) 45-57.
7. T. M. El-Mistikawy and M. J. Werle, AIAA J. 16, (1978) 749-751.
8. H. Han, V. P. Il'in, W. Yuan and R. B. Kellogg, J. Comput. Math. 10, (1992)
57-76.
9. E. Ishiwata and Y. Muroya, Tech. Report 95-15, Advanced Research Center
for Science and Engineering, Waseda University (1995).
10. E. Ishiwata and Y. Muroya, Tech. Report 95-16, Advanced Research Center
for Science and Engineering, Waseda University (1995).
11. K. R. James, SIAM J. Numer. Anal. 10, (1973) 478-484.
12. E. O'Riordan and M. Stynes, Numer. Math. 50, (1986) 1-15.
13. D. B. Russel, Ministry of Aviation, Aeronautical Research Council, Reports
and Memoranda no.3331, (1963).
14. J. C. Strikwerda, SIAM J. Sci. Statist. Comput. 1, (1980) 119-130.
15. M. Stynes and E. O'Riordan, Math. Comp. 46, (1986) 81-92.
16. T. Torii, Information Processing in JapanG, (1965) 187-193.
17. R. S. Varga, Matrix Iterative Analysis, (Englewood Cliffs, New Jersey, 1962).

ANALYSIS OF THE MILNE DEVICE FOR THE FINITE
CORRECTION MODE OF
THE ADAMS PC METHODS I
Masatomo FUJII
Department of Mathematics, Fukuoka University of Education
Munakata, Fukuoka 811-41, Japan
E-mail: hijiini@lukuok&-edu ac.jp

ABSTRACT
The behavior of the difference between the values of the predictor and of the
corrector for the Adams predictor-corrector method in the $P(EC)^m$ mode is
analysed. This leads not only to an accurate estimation of local truncation errors
but also to that of global truncation errors.

1. Introduction
In the previous paper$^{1}$, the author discussed an accurate method for estimating
local truncation errors, and as its application, an accurate method for estimating
global truncation errors. In that paper, he mentioned two theorems on the behavior
of the difference between the values of the predictor and of the corrector for the
Adams PC methods both in the $P(EC)^mE$ mode and in the $P(EC)^m$ mode. The
proofs, however, were not given there.
The purpose of this paper is to give the proof in the $P(EC)^m$ mode. In Section 2,
some preliminaries are given. In Section 3, the order of the error in the $i$-th correction
($i = 0, 1, \dots, m$) is investigated. In our discussion we need the asymptotic formula of
the error in the $(m - 1)$-st correction. In Section 4, the existence of the formula is
shown. In Section 5, the behavior of the difference between the values of the predictor
and of the corrector is analysed.

2. Preliminaries

We consider the initial value problem of the differential equation

$$y' = f(x, y), \quad y(a) = y_0 \quad (a \le x \le b), \qquad (1)$$

where we denote by $y(x)$ the solution of this problem. The step points are given by

$$x_n = a + nh \quad (n = 0, 1, \dots, N), \qquad h = (b - a)/N,$$

where $N$ is the total number of the steps. Let $p$ be the order of the Adams PC
methods. Put

$$v = n + p - 1.$$

In what follows, we assume that $f(x, y)$ in Eq.(1) is sufficiently smooth on the
regions in question. We assume that the solution $y(x)$ of Eq.(1) exists. Let

$$y_\mu \quad (\mu = 0, 1, \dots, p - 1)$$

be $p$ starting values and let

$$e_\mu = y_\mu - y(x_\mu) \quad (\mu = 0, 1, \dots, p - 1).$$

We also assume that the $y_\mu$ are chosen so that

$$e_\mu = O(h^q) \quad (q \ge p + 1;\ \mu = 0, 1, \dots, p - 1).$$

Let

$$g(x) = f_y(x, y(x)), \quad g_v^{(k)} = g^{(k)}(x_v), \quad g_v = g(x_v).$$
m
The formulae of the Adams predictor-corrector method of order p in the P(EC)
mode are given as follows:

;='
and

J=0 3=0

]
where y$ is the i-th correction of yf , /£' = J{xk,y£) , V is the backward difference
operator,

and
,
7 =^ (s-l)s---(s+j-2)ds/j!.
J

For the solution let


p
Jpi(a:, •/(*); ft) = y{x) - y(x - ft) - ft | ] a / ( x - jft, y ( i - jfc))
H

3=1

and
r ( z , j,(x); ft) = y(x) - y(x - ft) -
p2 ft£6 /(x
w - jft, y(x - jh)).
3=0

For the formulae Eqs.(2) and (3), we define the local truncation errors at x by n

Tph, = T i{x , T v y(x );h)


v

and

T 2n = T 2(xv,y(x );
P P a ft)
respectively.
For preparations of the succeeding discussion, we give three lemmas.
1
Lemma 1 (M.Fujii ) For the Adams-Bashforth-MovXton pair of order p in the P(EC)*'
mode, the identity

3=0

holds, and for the exact solution y{x), the identity

V - r P h = - v i W V w

holds.
The following lemma concerns the growth of solutions of the nonhomogeneous
linear difference equation
k
Zn+* - Zti+fc-i = h^Pj^+t-jZn+k-j + A n (n = 0 , l , . . . , N — k). (4)
3=0

3
Lemma 2 (P.Henrici ) Let B', 0 and A be the constants such that

El/M < B'> \h»\<P (» = 0 , l , . . . , i V ) ,

|A*| < A (« = 0 , 1 , . . . , AT-A)

and let 0h < 1. TVien euen/ solution of Eq.(i) for which

\z,\ <Z (i = 0,l,...,Jfc- 1)

satisfies
kL
\z \ < K'e" '
n ( = 0,l,... JV),
n 1

where
K' = I - ( J V A + 2fcz), t* = r - e * . r * = 1/(1 - ph).

The following lemma plays an important role in the proof of Theorem 3.



1
Lemma 3 {M.Fujii ) For any polynomial Pi(x) of degree i, the equality

= 0 for 0 < i < p - 1

holds.

3. Order of errors i n the i - t h correction

For the convenience of the later discussion, let us suppose that

(ft = 0 , l , . . . , p - l ; i = 0 , l , . .
and put
1
eg = $ - y(x ) n (n = 0 , 1 , . . . , N; i = 0 , 1 , . . . , m ) .
Then we have the following theorem.
Theorem 1 In the $P(EC)^m$ mode, under the assumptions in Section 2, for a suitably
chosen $h$, there exists a positive constant $K$ such that

$$|e_n^{[i]}| \le K h^p \quad (n = 0, 1, \dots, N;\ i = 0, 1, \dots, m). \qquad (5)$$

Proof. Under the assumptions in Section 2, we may suppose that there exists the
solution y(x) on the interval a < x < b and that for a small 6 > 0 the function f (x,y) y

is continuous on
Vi = {{x,y)\ a<x<b, \y-y(x)\<6}.
Let us put
0 < ft <ft<>< 1, I = \a,b], A^I^T-^l,
ft
where [ ] is Gaussian symbol. We may also suppose that C, (i = 0,1,2,3) be the
constants such that

\f (x,y)\<C
y 0 for [x,y)eV , s

p + 1
|e | < C , n
M for fte (O.ftr,] (u = 0 , 1 , . . . ,p - 1),
T+]
\T (x,y(x);h)\<C k
pl 2 for x e I, 0 < h < hn,
,+1
\T (x y{x) h)\<C h'
pl 1 1 3 for x e / , 0 < h < ho .
Furthermore put

-4 = E M , B = El*wl. C = max(C ,C3), 2


i=i j=0

J
J-IWft, « = £ W t*=o,i„4.

te = max (Idpil.loyl) (» = 1,2,... , p - 1 ) , fcp = |<ipp|,


,+1
ft- = ^(5(6 - )k" + 2pC,ft' , a

3=1 i=0

Let us choose ft, so that K' exp{(b — a)L'} < 6 for h satisfying 0 < ft < fti (fci < ho).
Suppose that
(XjJ^eVs 0 = 0 , 1 , . . . , N; 1 = 0 , 1 , . . . , l i t ) . (6)
The validity of Eq.(6) will be shown later. Since

01
4 = e£f, + ftE^.; ' 1
- /(^->,y(x ))} - w r p l B (7)
3=1

and
1 1
$ = el^+ftVI/r -/^,!/!^))}

-IJ^ (» = 1,2 m), (8)

we obtain
01 m| m 1| +1
|ei | < | e i 1 | - r f t C E | o | | e i 7 | +C7 h''
0 P3 2 (9)
3=1

and
i]
\e®\<s v + h3\et \ (i= 1,2,....m), (10)
where
p+1
s„ = n&i + ''Co E M k l V l + ^ f t . (ii)
J=I

From Eq.(10), we have


1
|eWf < rr^s,, + (flftJVl? ! (i = 1,2,... , m ) . (12)

From Eqs.(9), (11) and (12), it follows that

1*1 < * < | e & l + ^ 0 E + (' = 0,1, - . . , m ) . (13)


3=1

Put

Then we obtain

,+i
d <a {^
v m + hC f^k d -
n i v j + Ch> ) (v=p,...,N). (14)

In order to estimate the left-hand side of Eq.(14), let us consider the equation
m-l
Zv = 2 - i + ft|{<-„Cb*i + B E W K _ i
v

i=0

+ £ CoffmfcjZ^j] + (t) = R . . . , AT)


,=2

and let {z„} be a solution of this equation with the starting values Zj = |e | 3

(j = 0 , 1 , . . .,p- 1). Then it follows that

d,<Zj U= 0,l,...,N).

By Lemma 2, we see that

5, < hL
K-e" '
< K'exp{(b-a)L'} <S (n = 0 , 1 , . . . , JV).

Hence the validity of Eq.(6) is shown and Eq.(5) holds.

4. Asymptotic formula
For the asymptotic formula of $e_n^{[m-1]}$, we have the following theorem.
Theorem 2 In the $P(EC)^m$ mode, under the assumptions in Section 2, the relation

$$e_n^{[m-1]} = h^p e(x_n) + O(h^{p+1}) \quad (n = 0, 1, \dots)$$

holds. Here $e(x)$ is the magnified error function, which is the solution of the differential
equation

$$e' = g(x)e - C y^{(p+1)}(x), \quad e(x_0) = 0, \qquad (15)$$

where $C$ is the error constant.
Proof. First, we shall consider the case m > 2. For 0 < j < p — \, put y = Bp* " ', 3

U =/j m _ 1
', Wj = ur"~ and e, = ef'
n
and also put A

wf = £ f ( ,y(x ) y Xj 3 + 9ef)de, (IS)



1
5 W = n 7 (t,i + l ) -
J 3

Let us choose /12 (/12 < hi) so that

1
. < 1 for fi g (O.A,]. (17)

-11
Now we make the difference equation on ej[" . Here we consider h which satisfies
Eq.(17). From Eqs.(7) and (16), we have

and from Eq.(8) for i = m and Eq.(16), we have


p-i

J=0

Since

+ Tpi.z-p+i — 7Ja _ i, iZ p+ (19)

p- I p
>v * t \m—II \nt—11 L \—» fm—11 Im— 1
j=0 j=l

+ 7J,i,3-p-)-i T 2,z-p+i
f

m
-hV7z(0,m)(4 l-ei?)
1
+ Wto7*(nt - l,m)(eW - e j - ' ) (20)
and

e M _ [m-H
e = »J,( - m l)(ef - 4°1), (21)
substituting Eq.(19) into Eq.(20), we have

j=0

+ r i , - p i -TpLz-p+i}
P I +

+ hbpo^im - l , m ) ( H _ l — l ) e e { 2 2 )

and substituting Eq.(22) into Eq.(21), we also have


1
{1 - (ftSo)~7,(m - 1, m)SAm - - e^ )

= (hbto^SM - 1)(1 - ^,{0, )){k±b wtfe£f


m pi

3=0

p
- ftj^ flpjwi™J ei™7 + r i^-p+i - 7 p _ i } .
1| 11
p 2iI p+

3=1

Put
A= = (bpor-rAm - l,m)S«(m - 1)
and
1
tT, - (fi )"- 5,(m - 1)(1 - /i6po7-(0. m ) ) .
p0

Then we obtain

3=0

- hj2<hi*£f^ +?P -p i - W P - H }
V + (23)
3=1

Substituting Eq.(23) and the formula obtained by replacing z by z + 1 in Eq.(23) into


Eq.(18), we have

j=0

- ft E 0pjwlT7-'' iTw} + TO/(1 - A - A . )


e

3=1

j=0 j'=l
p-1
+ A E ^3 l+W L+W -
3=0
W E
7
P2,!+p-2

X
+ W ®4{\ - frA:,)}{T ^- v p+1 - T , _p }.
p2 I +1 (24)

By Theorem 1, we may put

e ! T - ft"e(^) + u , (n = 0,1,...). (25)


Substituting Eq.(25) into Eq.(24) and using the relation
2
e ( i „ ) = e{x ) + he'(x ) +
+1 n n 0{h ),

we find the nonhomogeneous linear difference equation

3=0

E lm-11 i

J=l
, n
+ tr- u /(i-h' \ )
n n

3=0 3=1

p + 2
+ /t A _ n p Cn=j»,p+i ,,.,]¥-l), 1

and there exists a constant A' such that

|A _„|<A-
n £» = p , p 4 - l , . . , , J ¥ - l ) .

Since
1 ^ 1 £<W** U* = o,i p-i),

then from Eq.(13), there exists a constant C such that


, + 1
|e^-«| <C/i'

and since

e(^) = j / t ^ ' e'(io + dd 0 = 0 , 1 , . . - , p ) ,

there exists a constant C such that

M = to - i f t ^ f f l ^ W ( J - o, i , . . . , ) ,P

Put
B" = BCb + {2(5A2r-V(l - WWHA + B)C .0

Since the coefficient of /iMn+i in the right-hand side of Eq.(27) is given by


l| m
•\o*&7 (i - ' V - ' t W I i - A A i ) ) . n +

we may take

Put

K' = r*{(6 - a)A' + 2(p + . L* = T'T?*

Then, by Lemma 2, it follows that

< /fexpHft-aJL*}^ - 4 1
(R = 0,1,...,JV).

Second, let us consider the case m = 1. Let e(x) be the solution of Eq.(15). Put
]
ef = tfeixi) + WP+V (t' = 0,l,...,A0. (28)

Since
e{x )=jk[
3 e'(x + 9jk}d6 0 (j = 0 , 1 , . . . , p ) ,
JO
there exists a constant A"o such that

|%| < C + | j - y e'(«o + 8jh)d0 \ <Ko ( j = 0 , l , . . . , ) . P

For brevity, let us put

Since
P
loi 101 . , n /L
loi
w e
ep+k+i = V * E^J P+*+W p-U+i-.i
+

p—J P
+ / l 1u 0 ft a 3
E V j.+*-j4 r*-> ~ £ p;Wp+*- ep _ J rJ: J

j=0 j=l
+ Tpum - - T, (A = 0,.. -, N - p - 1), I J b + a (29)

substituting Eq.(28) into Eq.(29) and using Eq.(26), we find


r
Wp+fc+i = iip+fr + ^ % % i t i - j « w w - , '
j=l

P-i P

i w <
+ ftE PJ i'+*-j™i>+*-j - * E - w * v + * - j % * * ^
J=0 j=l
+ A* (* = 0 , l , . . . j V - p - l )
1
and there exists a constant Kg such that

\*k\<K b,2 (k = 0,l,...,N-p-l).

By L^mma 2, there exists a constant such that

\*\<Ka (i = 0,h...,N)

and we obtain
] +
ef = tVe^) + 0(h* ').
Thus the proof is completed.

5. Behavior of y® - j , H

At the beginning o f this section, we give the following proposition.

Proposition 1 When z > p and m > 2, the equality

eW-efl = {l-hVr.(0,">)}
/ { l - (hb^^im - l,ra)S (m - 1)} ;

1 1
x {^V^f-'leL'"- -Tpj,,.^} (30)

/lo/ds.
Proo/. Prom Eqs.(2), (3), (16), (7), (8) and Lemma 1, it follows that
m| 01 p m ,| ,|
ei - e' = ft7p- V i - ^-
1 W e + Tfi,*w - W**,. (31)

We denote both sides of the sign o f equality in Eq.(31) by D. Then we have the
following equalitiy:

e m_ |o| e = /j_ft 0p07i (o, )( M- loi)


m e e

m| m_11
+ hbfnM - 1. ™)(4 - 4 )> (32)

Substituting Bq.(31) into Eq.(32), we obtain

e IH_e|o] = { l - f t 6 p 0 7 x ( 0 , m ) } D

+ Mtfh£« " l.m)( H - ef-'l). e (33)

Substituting Eq.(21) into Eq.(33), we also obtain

{1 - - hm)S,(m - - e»)

= {i-hb ^(Q,m)}D.
p0
Therefore we have

/ { I - ( A V r 7 * ( ™ - l,m)5,(m - 1)}

This completes the proof.

m
For j j f — j / [ ' , we have the following theorem,
1
Theorem 3 In the P(EC)" mode, under the assumptions in Section 2, the
2 +1
$ - = - T„ + e ln ppv + 0(ft " )

holds,where p = p and

P( 1
™ ' \v>t\j>-l) + l; l<t<p,m>2.

Proof. FVom Eqs.(2) and (3) by Lemma 1, we have


1
e - ^ - -H-iVVJ - m l i

= - kfp-iVfixv, y(x ))v

- h^Vifl^ - f(x ,y(x»))\


v

= - r p l n - ft -. V V " " ' ^ ! " - ' ! ,


7p

First, we shall consider the case m = 1. For v > p, k > 0, it follows that
fc-ip-i
J
«+r-j
1=0 j=o
P

— Tpi^+fc.p+i

Hence, we see that

1
-rp^-p+j + o ^ ) .
87

1
When we regard e„-i as ej, !,, the relation mentioned above holds for v = p. Since

^ - f f ^ + o f l ^ W=o,i,...),
for v > p, it follows that

e = —
5+t *1j8,v-p-ri — Tpi.u-^+i

+2 Ip+1
= P,(fc) + 0(/i" ) + 0 ( f t ),

where Pi(k) is a polynomial in k of degree at most 1, Let P\(k) = «o + ajk.


For v > 2p , k > 0, it follows that

s
E E *wl* + « " J'Jftffv + 0(ft )]|P,(« - j) + 0 ( ^ ) 1
*=0 J=0

r=o 3=0

+2
= E k « o + « - kooAfc + « m ] +0(ft" )]

r=o *

and
+2
E " w*^
- i * - ^ * - , = 9«"o + {* - ^)(c hg' 0 v + aig,) + 0 ( . V )
3=1 E Tpi,v+t-p+i + T i,v+t- +i r P

+ *<i.v- P + 1 +0(O.

where I J ^ H = 7 ^ { i , v ( i „ ) ; n ) ( i = 1,2), Hence,


u

e s
lit' expressed as follows:
p + 3 i p + 1
«£U = ft(*> 4- 0 ( n ) + 0(/i ) (p > 2).

By induction on j , if v > jp, we can show that


+ 1 1
4 * = I J W + 0 ( f t ^ ) + Otft** ) (p > j ) ,

where Pj{k) is a polynomial in k of degree at most j . For v > 0, we see that

+ , 1
e™, = h'efx,) + 0 ( h ' ) = Fo(fc) + O ^ ) -
For v > p , by Lemma 3, it is seen that

= D - l V ( J ) I * + 0(h)We(x ) v + 0(ft" ))+1

+I
= 0(A* ).

Therefore, from Eq.(34), we have


0] 2 +1
y[ - a!" = - T rln + 0(h*+*) + 0(fc * ) (p > l ) .

For v > ip, it follows that

V ^ e ™ = E(-l) J (j)[ffv-jV ;4--- l9

+ - l ) ' } * ? - " + W M I f l - i O j ) + OCA'*)]


- 0 ( 0 .
Therefore, we have

p+ +l 2 1
vS" - J/!" = T , * . - 7 > , „ + 0 ( f t - ) + O f A ^ ) (p > i ) .

Second, we shall consider the case m > 2. For v > 0, > 0, we see that
1 + 1
- A"e(x„) + O f A ^ ) = P (k) + 0 ( A ' ) . 0

For ti > p,k > 0, by Proposition 1, we have


* p-l k

J"'-'\ Jm] , l i „,|m-l| |m-l| _v>r.


r=ij=o f=i

- (hj^r-'s^m - i)(«EU - «5L)

H
- 4 +AE E ^ « t £ i £ # - E w
(=lj=0 (=1

- 1
- (nVP - D O - hV7«+fc(0, ™)}
/ { l - (nVJ'Vt-fcfm - l,m)S (Tn - 1)} v+t

6=1 j=0 *=1

_ 1
- («*>"-W)" (i - ftv^+*)/{i - ( / i v n ^ n
,, 11
x {/t7 -iV '/ fcei,+t + Tpi.v+t-p+i - Tp2,„ - i}
r t+ +k p+

2
-HOt^HOt/i '*™).
From Theorem 2, for v > p, we see that

p + 2 2p+1
+ 0(/i ) + 0(/t )
p + 2 2 p + l
= P,(fc) ; - 0 ( h )+0(A ) (p>l).
By induction on j , for v > j(p — 1) + 1, it follows that
f l p+ +1 1
e£* = fy*) + 0(fc > ) + © ( f t ^ ) (p > j).
In a similar way as in the case m = 1, for v > i(p − 1) + 1, we have

i $ - J/1" = - r p l n + o ( t f ^ ) + o(/t + 1 2p+1


) (p>»).

Thus the proof is completed.


As is well known, the Milne device estimates a local truncation error by $y_n^{[0]} - y_n^{[m]}$
multiplied by a suitably chosen constant. Theorem 3 shows that the estimation of
$T_{p2,n} - T_{p1,n}$ by $y_n^{[0]} - y_n^{[m]}$ is improved as the step proceeds. An application is shown
in the paper [1].
In general, the case m = 1 is often used. According to Theorem 3, however, the
case m ≥ 2 is superior to the former in accuracy.
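
The classical Milne device mentioned above can be illustrated on a low-order pair. The following is a minimal sketch, not the p-th order Adams scheme analysed in this paper: it assumes a two-step Adams-Bashforth predictor, a trapezoidal corrector applied once (m = 1), and the test problem y' = −y. The function names are hypothetical.

```python
import math

def f(x, y):
    # Test problem y' = -y, whose exact solution is exp(-x).
    return -y

def abm2_step_with_milne(x_prev, y_prev, x_n, y_n, h):
    """One PECE step: AB2 predictor, trapezoidal corrector, Milne estimate."""
    f_prev = f(x_prev, y_prev)
    f_n = f(x_n, y_n)
    y_pred = y_n + h * (1.5 * f_n - 0.5 * f_prev)        # predictor
    y_corr = y_n + 0.5 * h * (f(x_n + h, y_pred) + f_n)  # corrector, applied once
    # Error constants are 5/12 (AB2) and -1/12 (trapezoidal), so
    # y_corr - y_pred ~ (1/2) h^3 y''' and the corrector error ~ -(y_corr - y_pred)/6.
    estimate = -(y_corr - y_pred) / 6.0
    return y_corr, estimate

h = 0.01
y_corr, estimate = abm2_step_with_milne(0.0, 1.0, h, math.exp(-h), h)
true_error = math.exp(-2.0 * h) - y_corr
print("Milne estimate:", estimate, " true local error:", true_error)
```

With exact starting values the estimate and the true local error agree in their leading term; the constant 1/6 is specific to this assumed pair.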

6. Acknowledgements

The author would like to express his gratitude to Professor Hisayoshi Shintani for
his invaluable advice.

7. References

1. M. Fujii, An Extension of Milne's Device for the Adams Predictor-Corrector
   Methods, Japan J. Indust. Appl. Math. 8 (1991) 1-18.
2. P. Henrici, Discrete Variable Methods in Ordinary Differential Equations,
   Wiley, New York, (1962).

A NEW ALGORITHM FOR DIFFERENTIAL-ALGEBRAIC
EQUATIONS BASED ON HIDM

WATANABE Tsuguhiro
National Institute for Fusion Science
Chikusa-ku, Nagoya, 464-01, Japan
E-mail: wata@lsimsun.nifs.ac.jp
Giovanni GNUDI
National Institute for Fusion Science
Chikusa-ku, Nagoya, 464-01, Japan
E-mail: gnudi@srhatori.nifs.ac.jp

ABSTRACT
A new algorithm is proposed to solve differential-algebraic equations. The
algorithm is an extension of the general purpose HIDM (higher order implicit
difference method) algorithm. A computer program named HDMTDV, based on the new
algorithm, is constructed, and its high performance is demonstrated through
several numerical computations, including an index-2 problem of
differential-algebraic equations and the connected rigid pendulum equations.
The new algorithm is also free of secular error when applied to dissipationless
dynamical systems. This property is demonstrated numerically by computation of
the Kepler motion. The new code can solve the initial value problem

$$0 = L(\varphi(x), \varphi'(x), \varphi''(x), x),$$

where $L$ and $\varphi$ are vectors of length $N$. The values of first or second
derivatives of $\varphi(x)$ are not always necessary in the equations.

1. Introduction

Computer analysis is playing an increasingly important role in the development
of science and technology. High speed and large scale computers together with
powerful algorithms are extending the field of activity of numerical
computations. Many types of equations are waiting to be solved numerically in
the course of research and development.
There are many excellent algorithms to solve the initial value problems
described by non-stiff ordinary differential equations. We can usually get good
solutions for such problems with excellent ready-made computer programs.
However, we sometimes encounter serious numerical difficulties if the problems
are reduced to stiff ordinary differential equations, or to
differential-algebraic equations. Differential-algebraic equations frequently
arise in many physical problems, such as optimal control problems, dynamical
systems with constraints and so on. The present status of the research on
differential-algebraic equations is described in references [1-3].
In a previous paper [4] we have constructed a new computer program named
HIDMDV (HIDM with second derivative) to solve stiff ordinary differential
equations or differential-algebraic equations, based on the algorithm HIDM
(higher order implicit difference method) [5-8]. The program HIDMDV can solve
the equation

$$0 = L(\varphi(x), \varphi'(x), \varphi''(x), x), \qquad (1)$$

where $L$ and $\varphi$ are vectors of length $N$. To solve Eq.(1), we have
introduced the difference scheme as shown in Fig.1.

[Figure 1: axis with the points $0$, $s_1 h$, $h$, $s_3 h$, $2h$; the quantities
marked are $\varphi(0)$, $\varphi(h)$, $\varphi(2h)$; $\varphi'(0)$, $\varphi'(2h)$;
$\varphi''(0)$, $\varphi''(2h)$.]

Fig.1 The difference scheme for HIDMDV. The values $\varphi(0)$ and
$\varphi'(0)$ are given as initial values for rank-2 ordinary differential
equations. The remaining 5 function values at grid points shown by ○ (equally
separated points) are obtained numerically, by solving the differential
equations at 5 intermediate points shown by • (unequally separated points).
The values $s_i$ (i = 1, 3) are uniquely determined from the minimization of
the truncation error for $\varphi''(s_i h)$.

The computer program HIDMDV has shown good performance and has been
extended to be able to solve boundary-value and eigenvalue problems [4]. However,
practical applications have revealed that the algorithm of HIDMDV should be
improved from the point of view of accuracy and ease of use.
The algorithm of HIDMDV is proved to be A-stable but not secular error free
for dissipationless dynamical systems. For long time tracing of dynamical
systems, symplectic integrators [9-12] have attracted considerable attention
because they are free from secular errors. Recently, Watanabe and Gnudi [13]
have extended the algorithm HIDMDV to satisfy the no secular error property by
introducing the idea of a time-reversal integrator.
The computer program HIDMDV is designed to solve for the second derivative
$\varphi''(x)$ at the grid points (see Fig.1). Additional equations are needed
if Eq.(1) contains no second derivatives $\varphi''(x)$. This requirement
occasionally makes the use of HIDMDV complicated in applications.

In this paper, we have extended the algorithm of HIDMDV in order to satisfy the
time reversibility and to make it easier to use. A computer program named
HDMTDV (HIDM, time reversal with second derivative) is constructed based on
this new algorithm.
In section 2 we summarize the principle of HDMTDV. Numerical examples
of HDMTDV are shown in section 3. Section 4 is devoted to a short summary.

2. Principle of HDMTDV

The principle of HIDM is shown in detail in references [5-7]. Here we summarize
the principle of HDMTDV, which can solve differential-algebraic equations without
the trouble accompanying non adaptive initial conditions. Furthermore, HDMTDV
has a linearly symplectic nature, and guarantees absence of secular errors for
recursive motions of dissipationless dynamical systems. There are 3 types of
HDMTDV difference scheme, depending on the highest derivatives of each variable.
These are discussed in the following subsections.

2.1. Difference Scheme for Variables with Second Derivative

Here, we consider the difference scheme for variables which have second derivatives
in Eq.(l). In this case, we use the difference scheme shown in Fig.2.
[Figure 2: axis t/h with the points $s_0, s_1, \ldots, s_6$; the initial values
$\varphi(-h)$ and $\varphi'(-h)$ are marked at the left end.]

Fig.2 The difference scheme for variables having second derivatives in
Eq.(1). The values $\varphi(-h)$ and $\varphi'(-h)$ are given as initial values
for rank-2 ordinary differential equations. The remaining 7 function values at
grid points shown by ○ (equally separated points) are obtained numerically,
by solving the differential equations at 7 intermediate points, shown by
• (unequally separated points). The values $s_i$ (i = 1, 2, 4, 5) are uniquely
determined from the condition that the truncation error for $\varphi''(s_i h)$
should be minimized ($s_0 = -1$, $s_3 = 0$, $s_6 = 1$).

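Expressions of the function and its derivatives at the points $x = s_i h$
$(i = 0, \ldots, 6)$ are given by linear combinations of function values at grid
points $x = -h, 0, h$ as follows

$$\varphi''(sh) = \frac{1}{h^2}\sum_{j=-1}^{1} P_j(s)\,\varphi(jh) + \frac{1}{h}\sum_{j=-1}^{1} Q_j(s)\,\varphi'(jh) + \sum_{j=-1}^{1} R_j(s)\,\varphi''(jh), \qquad (2)$$

$$\varphi'(sh) = \frac{1}{h}\sum_{j=-1}^{1} D_j(s)\,\varphi(jh) + \sum_{j=-1}^{1} E_j(s)\,\varphi'(jh) + h\sum_{j=-1}^{1} F_j(s)\,\varphi''(jh), \qquad (3)$$

$$\varphi(sh) = \sum_{j=-1}^{1} A_j(s)\,\varphi(jh) + h\sum_{j=-1}^{1} B_j(s)\,\varphi'(jh) + h^2\sum_{j=-1}^{1} C_j(s)\,\varphi''(jh). \qquad (4)$$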

The difference scheme for the second derivative $\varphi''$, Eq.(2), has a total
of 9 parameters $(P_j(s), Q_j(s), R_j(s))$. Then the truncation error for Eq.(2)
becomes $O(h^7)$. To reduce this truncation error we introduce a relation which
determines the value of $s$ as follows

$$1 - 9s^2 + 12s^4 = 0, \qquad (5)$$

that is

$$s_1 = -0.7838\cdots, \quad s_2 = -0.3682\cdots, \quad s_4 = 0.3682\cdots, \quad s_5 = 0.7838\cdots. \qquad (6)$$
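
The values in Eq.(6) can be checked directly, since Eq.(5) is a quadratic in $s^2$. The following is a small sketch of that check; the function name `embedded_points` is hypothetical.

```python
import math

def embedded_points():
    """Real roots of 1 - 9 s^2 + 12 s^4 = 0, sorted in ascending order."""
    disc = math.sqrt(81.0 - 48.0)  # sqrt(33), discriminant of 12 u^2 - 9 u + 1 with u = s^2
    s_squared = [(9.0 + disc) / 24.0, (9.0 - disc) / 24.0]
    roots = [sign * math.sqrt(u) for u in s_squared for sign in (-1.0, 1.0)]
    return sorted(roots)

s1, s2, s4, s5 = embedded_points()
print(s1, s2, s4, s5)
```

The appearance of $\sqrt{33}$ in the discriminant is consistent with the constant defined later in Eq.(18).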

Then the values of $(P_j(s_i), Q_j(s_i), R_j(s_i))$ $(j = -1, 0, 1;\ i = 0, \ldots, 6)$
are determined uniquely, and the truncation error of Eq.(2) becomes $O(h^8)$.
The difference scheme for $\varphi'$, Eq.(3), has a total of 9 parameters
$(D_j(s), E_j(s), F_j(s))$. Then the truncation error for Eq.(3) becomes $O(h^8)$.
This order is compatible with the one of Eq.(2). Then, the parameters
$(D_j(s), E_j(s), F_j(s))$ are also determined uniquely.
The difference scheme for $\varphi$, Eq.(4), has a total of 9 parameters
$(A_j(s), B_j(s), C_j(s))$. Then the truncation error for Eq.(4) can be reduced to
$O(h^9)$. This order is one order higher than the one of Eq.(2) and Eq.(3). Then
one parameter, for example $C_1(s)$, becomes free if we are satisfied with the
same order of accuracy of the discretization scheme for $\varphi$, $\varphi'$,
$\varphi''$. When we impose the time reversal condition on the discretization
scheme Eqs.(2-4), we obtain the conditions

U
<m ~ =- \ ^ s < , (7)

1 1 1
Ci{8 )-C { )
1 1 gt =- ~ f*8 .
1 6 (8)

Two coefficients are still left undetermined for the parameters
$(A_j(s_i), B_j(s_i), C_j(s_i))$ $(j = -1, 0, 1;\ i = 0, \ldots, 6)$, if we request
the compatibility of the truncation errors with those of the representations
Eqs.(2-3). We discuss this point in some detail in section 4.
In the following, we determine the parameters in Eq.(4) in order to minimize the
truncation error for $\varphi(s)$. In this case the truncation error of Eq.(4)
becomes $O(h^9)$, which is one order higher than the one of $\varphi'$ and
$\varphi''$. Coefficients for the discretization scheme of HDMTDV are reduced to
the form

, , , 4278 + 5 9 8 A ± (7569 + 513A)s . . . 2469 - 299A


L M S ) ( 9 )
3^64 - = 4608 •

.• . _ ( 2 9 3 7 + 361A) - (6372 + 164A)s T _ (405 - 59A)*


B ± 1 { S )
- 73728 ' M s )
- Tro2 ' ( 1 0 )

_ 16H-17A±(444-4A)s 283-21A
C ± , ( S ) = L C d W =
73728 ' ^ 0 7 2 ~ ' <">

( m 5 + 6 4 S l
^ ) = - ^ . D 0 { S ) = J J ^ , m

_ , , 351 - 5 2 9 A ± ( 2 0 8 8 - 600A)s „,..., 261 + 53A


E ± , ( S ) = £ o ! s ) = ( 1 3 )
^8432 ' 1152—'

^ ^ . ^ - ^ W = - « , (14)

p ± i W = _312 + 4 4 0 A T ( 9 4 - 735A), 5 i ^ = 35 + 55A _ ^

^ i ( s ) = ± ( 2 0 9 + 65AK(173 + 93A^ ^ = _(^l_19A s ^ ] ( w )

_ 117 + 3 7 A ± ( 9 9 + 51A)s 9 - 7A

where

$$A = \begin{cases} \sqrt{33} & (\text{for } s = s_1 \text{ or } s_5), \\ -\sqrt{33} & (\text{for } s = s_2 \text{ or } s_4). \end{cases} \qquad (18)$$

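Time reversibility conditions given by

$$A_j(s_i) = A_{-j}(s_{6-i}), \quad B_j(s_i) = -B_{-j}(s_{6-i}), \quad C_j(s_i) = C_{-j}(s_{6-i}), \qquad (19)$$

$$D_j(s_i) = -D_{-j}(s_{6-i}), \quad E_j(s_i) = E_{-j}(s_{6-i}), \quad F_j(s_i) = -F_{-j}(s_{6-i}), \qquad (20)$$

$$P_j(s_i) = P_{-j}(s_{6-i}), \quad Q_j(s_i) = -Q_{-j}(s_{6-i}), \quad R_j(s_i) = R_{-j}(s_{6-i}), \qquad (21)$$

$$(j = -1, 0, 1), \quad (i = 1, 2, 4, 5)$$

are completely satisfied.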
Next we consider the stability of the discretization scheme given by Eq.(9-17) by
solving for the characteristic frequency of the harmonic oscillator

$$\varphi''(x) + \omega^2 \varphi(x) = 0, \qquad (22)$$

where $\omega$ is some constant. If we express the eigenfunction of Eq.(22) under
the discretization scheme by

$$\varphi(nh) \propto \exp(i\,n h\,\Omega), \qquad (23)$$

we get the dispersion relation

$$\cos(2h\Omega) = \frac{457228800 - 881118000\,g + 239415750\,g^2 - 20934585\,g^3 + 724410\,g^4 - 9792\,g^5 + 38\,g^6}{457228800 + 33339600\,g + 1275750\,g^2 + 33075\,g^3 + 540\,g^4 - 18\,g^5 + 2\,g^6}, \qquad (24)$$

where $g = (h\omega)^2$.
[Figure 3: dispersion curve; the abscissa value 1.570792... is marked.]

Fig.3 The dispersion curve of the harmonic oscillator given by Eq.(22)
under the discretization scheme of HDMTDV. $\omega$ is the frequency of
the harmonic oscillator and $\Omega$ is the eigenfrequency of the numerical
solution; $h$ is the step size of the numerical integration. When
$|\omega| h > 1.570792120\cdots$, $\Omega$ becomes occasionally complex, and the
periodic nature of the numerical solution begins to break down.

When $\cos(2h\Omega)$ is real and the condition

$$|\cos(2h\Omega)| < 1 \qquad (25)$$

is satisfied, the numerical solution given by the above discretization scheme
becomes periodic with the correct amplitude (= 1). The relation given by Eq.(24)
is shown in Fig.3 when $\omega h$ is real. This figure shows that the largest step
size $h_{max}$ which guarantees the purely periodic solution of Eq.(22) is given by

$$h_{max} = \frac{1.57079212078280060208152\cdots}{|\omega|} = \frac{\pi}{2|\omega|}\,(1 - 2.67763\cdots \times 10^{-6}). \qquad (26)$$

In other words, the largest step size which guarantees the periodic solution of
Eq.(22) is 1/4 of the period of oscillation, and the relative error for the
period is $2.67763\cdots \times 10^{-6}$. The local error of the discretization
scheme becomes

cos(2 a -cos(2
f t ) M = - ^ | g 2 0 + - (27)

This analysis leads to the following conclusion. When we adopt the time step $h$
as 1/20 of one period ($h = 0.1\pi/\omega$), the local error for the function
value is of the order of $4.4 \times 10^{-14}$ and the error for the period of the
solution is of the order of $1.1 \times 10^{-13}$. Furthermore, this
discretization scheme can be proved to be linearly symplectic.

2.2. Difference Scheme for Variables with First Derivatives

In the previous subsection, we have derived a discretization scheme for functions
with second derivatives. Here, we consider the difference scheme for variables
with no second order derivative, which appear in an equation like

$$0 = L(\varphi(x), \varphi'(x), \psi(x), \psi'(x), \psi''(x), x). \qquad (28)$$

In this case, the value $\varphi'(0)$ in Fig.2 cannot be specified as an initial
value. Then additional equations become necessary if we use the discretization
scheme of the preceding subsection. Sometimes, this process becomes a nuisance.
We therefore introduce a separate discretization scheme for this case, as shown
in Fig.4.
[Figure 4: axis t/h with the points $s_0, s_1, \ldots, s_6$ and the embedded
points; values of $\varphi'$ are marked at the grid points.]

Fig.4 The difference scheme for variables having first derivatives
(without second derivative) in Eq.(1), which should be solved at 7 points
($x = s_i h$, $i = 0, \ldots, 6$), shown by • (unequally separated points).
The value $\varphi(-h)$ is given as the initial value for rank-1 ordinary
differential equations. The remaining 7 function values at grid points shown
by ○ are obtained numerically. The values $s_i$ (i = 1, 2, 4, 5) are uniquely
determined by the difference scheme for the variables with second derivative
as shown in Eq.(5). The additional embedded points $x = \pm h/\sqrt{3}$ are
determined from the condition that the truncation error for $\varphi'(s_i h)$
(i = 1, 2, 4, 5) should be minimized.

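Expressions of the function and its derivatives at the points $x = s_i h$
$(i = 0, \ldots, 6)$ are given by linear combinations of function values at grid
points $x = \pm h, \pm hu, 0$ as follows

$$\varphi'(sh) = \frac{1}{h}\Big[\sum_{j=-1}^{1} d_j(s)\,\varphi(jh) + f_-(s)\,\varphi(-hu) + f_+(s)\,\varphi(hu)\Big] + \sum_{j=-1}^{1} e_j(s)\,\varphi'(jh), \qquad (29)$$

$$\varphi(sh) = \sum_{j=-1}^{1} a_j(s)\,\varphi(jh) + c_-(s)\,\varphi(-hu) + c_+(s)\,\varphi(hu) + h\sum_{j=-1}^{1} b_j(s)\,\varphi'(jh). \qquad (30)$$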

The difference scheme for the first derivative $\varphi'$, Eq.(29), has a total
of 8 parameters $(d_j(s), e_j(s), f_\pm(s))$. Then the truncation errors for
Eq.(29) become $O(h^7)$. To reduce this truncation error by one more order, we
choose the value of the embedded points $hu$ appropriately. It is slightly
surprising that the value $u = 1/\sqrt{3}$ guarantees that all truncation errors
for $\varphi'(s_i h)$ (i = 1, 2, 4, 5) are reduced by one order, and the
truncation errors of the above expressions become $O(h^8)$, which is just the
same order as for the discretization scheme of functions with second derivative
shown in the previous subsection. Coefficients for this discretization are
reduced to the form

2 9 3 + 8 5 a 8 + 9 4 A ) s 6
M . ) - * g . " . W - ^ , (3D

. . , (25 + 9 A ) ( ^ - S) . , , (61 - 19A)s


± l W = i b o { s ) ( 3 3 )
" Wi = 384 •
, , . ±(1137+ 225A) + (1380 + 292A)s .. . ( 1 4 7 - 29A)s ....
d±M = , d (s) = — , 0 (34)

. T(39 + 55A)3\/3 + (216-168A)35


UM =~ , m (35)
,
- 1 4 3 - 3 1 A ( 1 6 8 + 40A)s . . ( - 6 1 + 19A) T

C i i , S ) = L ( 3 6 )
3072 > *<«>- 384
where the value A is given by Eq.(18).
We study the stability of the discretization scheme given by Eq.(31-36) using the
equation

$$\varphi'(x) = \lambda\varphi(x), \qquad \varphi(0) = 1, \qquad (37)$$

where $\lambda$ is some constant. The difference scheme Eq.(31-36) gives the
following solution

$$\varphi(2h) = \frac{7560 + 7560\lambda h + 3465(\lambda h)^2 + 945(\lambda h)^3 + 165(\lambda h)^4 + 18(\lambda h)^5 + (\lambda h)^6}{7560 - 7560\lambda h + 3465(\lambda h)^2 - 945(\lambda h)^3 + 165(\lambda h)^4 - 18(\lambda h)^5 + (\lambda h)^6}. \qquad (38)$$

This solution (the amplification factor of the difference scheme) has the
following characteristics

$$|\varphi(2h)| < 1 \quad \text{when } \Re\,\lambda h < 0, \qquad (39)$$

$$|\varphi(2h)| = 1 \quad \text{when } \lambda h \text{ is pure imaginary}. \qquad (40)$$

The relation (39) shows that the difference scheme is A-stable, and Eq.(40)
guarantees purely periodic numerical solutions independent of the step size $h$
when $\lambda$ is pure imaginary. The linear symplectic nature is also verified
for this discretization scheme. The local error of the difference scheme is given
by the difference between the discretized solution Eq.(38) and the analytic
solution $\exp(2\lambda h)$

« - e ^ 2 A / 1 ) = - I J ^ + .... (41)

These results show the excellent nature of the difference scheme of Eq.(31-36).
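
The properties (39) and (40) can be spot-checked numerically. The sketch below evaluates the amplification factor of Eq.(38) at a few sample values of $z = \lambda h$; it only samples points and is not a proof of A-stability. The name `amplification` is hypothetical.

```python
import cmath

# Coefficients of (lambda*h)^0 ... (lambda*h)^6 in the numerator of Eq.(38).
COEFFS = [7560, 7560, 3465, 945, 165, 18, 1]

def amplification(z):
    """Ratio P(z)/P(-z) of Eq.(38), where z = lambda*h may be complex."""
    numerator = sum(c * z**k for k, c in enumerate(COEFFS))
    denominator = sum(c * (-z)**k for k, c in enumerate(COEFFS))
    return numerator / denominator

for z in (-1.0, -10.0, -1.0 + 2.0j, 3.0j, 0.05):
    print(z, abs(amplification(z)), abs(cmath.exp(2.0 * z)))
```

Because the denominator is the numerator evaluated at $-z$ and the coefficients are real, the modulus is exactly one on the imaginary axis, which is the algebraic reason behind Eq.(40).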

2.3. Difference Scheme for Variables with no Derivatives

Here, we consider the difference scheme for the variables with no derivatives, which
appear in the equation like
0 = lfax% ib(x), ^(4 &x% f ( s ) , ?(?), x) (42)
In this case the difference scheme is very simple, as shown in Fig.5, and no
truncation errors are included.

[Figure 5: axis t/h with the points $-1, s_1, s_2, 0, s_4, s_5, 1$; the values
$\varphi(s_i h)$ are marked at each point.]

Fig.5 The difference scheme for variables having no derivatives in
Eq.(1), which should be solved at a total of 7 points ($x = s_i h$,
$i = 0, \ldots, 6$), shown by • (unequally separated points). The values
$\varphi(s_i h)$ $(i = 0, 1, \ldots, 6)$ shown by ○ are obtained numerically.
The values $s_i$ (i = 1, 2, 4, 5) are uniquely determined by the difference
scheme for the variables with second derivative as shown in Eq.(5)
($s_0 = -1$, $s_3 = 0$, $s_6 = 1$).

2.4. Remarks on the coding of the program

In the previous sections, we have considered the truncation error of the
discretization method. In actual numerical computations the roundoff errors
deteriorate the accuracy of derivatives. To reduce these effects, increments of
variables are treated as the practical unknown quantities in the actual program
coding. For example, the quantities $\tilde\varphi(jh)$ and $\tilde\varphi'(jh)$
are treated as unknown quantities introduced by the relations

$$\varphi(jh) = \varphi(-h) + (j+1)\,h\,\varphi'(-h) + \tilde\varphi(jh), \qquad (43)$$

$$\varphi'(jh) = \varphi'(-h) + \tilde\varphi'(jh), \qquad (44)$$

$$(j = 0, 1).$$
If it is possible to determine all the highest derivatives using the given system
of equations (1), we can get solutions directly by the above mentioned algorithm.
There are, however, problems in which we cannot determine the highest derivatives
only by the given system of equations. An example is given by

G(</,^>,tM)=0, (45)

/r(vMfr,j)=0. (46)
In this case, both variables $\varphi(x)$ and $\psi(x)$ have second derivatives,
so the discretization scheme given by Eq.(9-17) is applied. This discretization
scheme assumes that $\varphi(-h)$, $\varphi'(-h)$, $\psi(-h)$ and $\psi'(-h)$ are
given as initial conditions. In this example, however, we have only two true
initial conditions, for example $\varphi(-h)$ and $\varphi'(-h)$. The other
quantities $\psi(-h)$ and $\psi'(-h)$ are not initial conditions, and should be
determined consistently from Eq.(46). In this case, two additional equations are
necessary to determine the values $\psi(-h)$ and $\psi'(-h)$. An example of a set
of additional equations is

^ff(,M,x)=0, (47)

£pH( ib,x)=0.
Vl (48)

Since the program should be informed of these facts, an index for each variable
is prepared in the program. An example of the index is the rank-2 array variable
named JVR, as shown in the following.

JVR(1, n) = highest derivative of the n-th variable.

JVR(2, n) = number of additional equations for the n-th variable.
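
To make the role of the index concrete, the following sketch stores JVR for three cases described in this paper: the two-variable example of Eqs.(45-46), the Kepler problem (Table 1) and the index-2 problem (Table 3). It is an illustration in Python, not the layout used inside HDMTDV. The counting rule in `true_initial_conditions` (highest derivative minus number of additional equations, summed over variables) is one reading of the discussion above and should be taken as an assumption.

```python
# Each entry holds (JVR(1, n), JVR(2, n)) for the n-th variable.
EXAMPLE_45_46 = {"phi": (2, 0), "psi": (2, 2)}
KEPLER = {"x": (2, 0), "y": (2, 0)}   # Table 1
INDEX2 = {"y": (1, 1), "z": (0, 0)}   # Table 3

def true_initial_conditions(jvr):
    """Count values that must be supplied by the user (assumed rule)."""
    return sum(highest - extra for highest, extra in jvr.values())

def additional_equations(jvr):
    """Total number of additional equations the user has to provide."""
    return sum(extra for _, extra in jvr.values())

for name, jvr in (("Eqs.(45-46)", EXAMPLE_45_46), ("Kepler", KEPLER), ("index-2", INDEX2)):
    print(name, true_initial_conditions(jvr), additional_equations(jvr))
```

For the example of Eqs.(45-46) this reproduces the statement that only two true initial conditions exist while two additional equations are required.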

When the equation is of the standard form

$$\varphi'' = f(\varphi, \varphi', x), \qquad (49)$$

and high speed computation is requested, a separate program should be prepared
which does not treat the second derivatives as unknown functions, because their
values are given by Eq.(49). In this case the computation speed can exceed the
speed of a standard Runge-Kutta program.

3. Numerical Examples of HDMTDV

In this section we show several numerical examples of HDMTDV. Computation
was carried out on a Fujitsu M-1800 with double precision (1 word is 64 bits). The
'exact value' is calculated by long double accuracy (1 word is 128 bits). First we
show the accuracy of the discretization scheme of HDMTDV. Giving the 'exact
values' of $\varphi(nh)$, $\varphi'(nh)$, $\varphi''(nh)$ on each grid point
($\varphi(x) = \sin(x)$, n = integer and $h = \pi/32$), we have calculated the
numerical error of the discretization scheme given by Eqs.(9-17) and plotted it
in Fig.6.
[Figure 6: three panels, each computed with $h = \pi/32$.]

Fig.6 An example of numerical error of the discretization scheme
given by Eqs.(9-17). First and second derivatives ($\varphi'$ and $\varphi''$)
have almost the same order of accuracy and the accuracy for $\varphi$ is one
order higher compared to them.

Next, we have calculated the numerical error for the variables with first order
derivative (without second order derivative). The discretization scheme is given
by Eqs.(31-36). In this case, we give the 'exact values' of $\varphi(nh)$ and
$\varphi'(nh)$ on each grid point and $\varphi((2n + 1 \pm 1/\sqrt{3})h)$ on the
embedded points ($\varphi(x) = \sin(x)$, n = integer and $h = \pi/32$), and
calculate $\varphi((2n + 1 + s_i)h)$ and $\varphi'((2n + 1 + s_i)h)$ by the
discretization scheme.

Fig.7 An example of numerical error of the discretization scheme
given by Eqs.(31-36). $\varphi$ and $\varphi'$ have almost the same order of
accuracy.

3.1. Kepler Motion

As an example of dissipationless dynamical systems, we have integrated the
equations for the Kepler motion

$$\frac{d^2 x}{dt^2} = -\frac{\mu x}{(x^2 + y^2)^{3/2}}, \qquad (50)$$

$$\frac{d^2 y}{dt^2} = -\frac{\mu y}{(x^2 + y^2)^{3/2}}, \qquad (51)$$

$$\mu = \pi^2/16,$$

where the constant $\mu$ and the initial conditions are chosen such that the
analytic solution has period T = 64 and a relatively large value for the
eccentricity. In this system energy and angular momentum are conserved and it is
possible to check the accuracy of the numerical computations.
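
The conserved quantities used for this check, and the relation between $\mu$ and the period, can be written down in a few lines. The sketch below assumes unit mass; the initial state (start at apocenter with eccentricity 0.5) is a hypothetical choice made only to obtain T = 64, since the paper does not list its initial conditions.

```python
import math

MU = math.pi ** 2 / 16.0  # constant of Eqs.(50-51)

def energy(x, y, vx, vy):
    """Specific energy of the Kepler motion (unit mass assumed)."""
    return 0.5 * (vx * vx + vy * vy) - MU / math.hypot(x, y)

def angular_momentum(x, y, vx, vy):
    """Specific angular momentum about the center of force."""
    return x * vy - y * vx

def period(x, y, vx, vy):
    """Kepler's third law: T = 2*pi*sqrt(a^3/mu) with a = -mu/(2E)."""
    a = -MU / (2.0 * energy(x, y, vx, vy))
    return 2.0 * math.pi * math.sqrt(a ** 3 / MU)

# Hypothetical initial state: semi-major axis a = 4, eccentricity e = 0.5.
a, e = 4.0, 0.5
r0 = a * (1.0 + e)
v0 = math.sqrt(MU * (2.0 / r0 - 1.0 / a))
print(period(r0, 0.0, 0.0, v0), energy(r0, 0.0, 0.0, v0), angular_momentum(r0, 0.0, 0.0, v0))
```

With $\mu = \pi^2/16$, a period of 64 corresponds to a semi-major axis of 4 regardless of the eccentricity, so any bound orbit with that energy would serve equally well.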
For this system the index of the variables is shown in Table 1. No additional
equations are necessary.

variables     x(t)   y(t)
n              1      2
JVR(1,n)       2      2
JVR(2,n)       0      0

Table 1 Index JVR of each variable to solve the Kepler motion
given by Eqs.(50-51) by HDMTDV.

For the numerical computations by HDMTDV, we have used step size h = 0.25
(= T/256) and total time step number $10^6$ ($0 \le t \le 5 \times 10^5$). Plots
of the orbit (x(t), y(t)) and of the error for energy and angular momentum are
shown in Fig.8 and Fig.9.
[Figure 8: orbit plot; panel title: $\mu = \pi^2/16$, h = 0.25,
$0 \le t \le 5 \times 10^5$, ($T_{analytic}$ = 64).]

Fig.8 Plot of the orbit (x(t), y(t)) of the Kepler motion Eq.(50-51).
Because the period of the numerical solution is not the same as the
analytical one (= 64), the phase of the motion gradually shifts from the
analytical position. The discrete plots of (x(t), y(t)) appear like a
continuous line. But the motion is guaranteed to come back to the initial state.
No secular errors are present in the numerical results. The center of the
force is marked by '+'.

Fig.9 Plot of error for energy and angular momentum of the numer-
ical solution of Eq.(50-51) (Kepler Motion). Because the period of the
numerical solution is very close to the analytical value ( = 64), the recur-
sion time of the numerical solution is very long. This figure shows the
secular error free computation characteristics of H D M T D V .

3.2. Numerical Solution of a Connected Rigid Pendulum

In this section we solve the motion of a connected rigid body pendulum as an
example of differential-algebraic equations. The equations are

$$m_1\Big(\frac{d^2 x_1}{dt^2} - g\Big) = -T_1 x_1 + T_2\,(x_2 - x_1), \qquad (52)$$

$$m_1 \frac{d^2 y_1}{dt^2} = -T_1 y_1 + T_2\,(y_2 - y_1), \qquad (53)$$

$$m_2\Big(\frac{d^2 x_2}{dt^2} - g\Big) = -T_2\,(x_2 - x_1), \qquad (54)$$

$$m_2 \frac{d^2 y_2}{dt^2} = -T_2\,(y_2 - y_1), \qquad (55)$$

$$x_1^2 + y_1^2 = \ell_1^2, \qquad (56)$$

$$(x_2 - x_1)^2 + (y_2 - y_1)^2 = \ell_2^2, \qquad (57)$$

where $\ell_1$ and $\ell_2$ represent the lengths of the mass-less rigid rods.
$T_1$ and $T_2$ correspond to the tensions of the rigid rods; $m_1$, $m_2$, $g$
are constants. The unknown variables are the position of the tip of each rod
($x_1$, $y_1$, $x_2$, $y_2$) and the tension of each rod ($T_1$ and $T_2$). The
former group of variables has second derivatives, but the latter group of
variables has no derivatives. In this case, the index JVR is given in Table 2.

variables     x1   y1   x2   y2   T1   T2
n              1    2    3    4    5    6
JVR(1,n)       2    2    2    2    0    0
JVR(2,n)       0    2    0    2    0    0

Table 2 Index JVR of each variable to solve the differential-algebraic
equations given by Eqs.(52-57) by HDMTDV.

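The system of Eqs.(52-57) has an energy conservation law given by

$$\frac{m_1}{2}\big(x_1'^2 + y_1'^2\big) + \frac{m_2}{2}\big(x_2'^2 + y_2'^2\big) - m_1 g x_1 - m_2 g x_2 = \text{constant}, \qquad (58)$$

which is used to check the accuracy of the numerical results.

Numerical examples are shown in Figs.10 and 11.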

Fig.10 Numerical solution of Eq.(52-57) by HDMTDV with step
size h = 0.001. $x_1$ and $x_2$ are plotted as a function of time. $m_1 = 65$,
$m_2 = 35$, $\ell_1 = 10$, $\ell_2 = 5$, g = 9.8. Initial angle of rod-1 = 175
(deg), rod-2 = 187 (deg). Initial velocities of rods are assumed to be 0.

Fig.11 Numerical solution of Eq.(52-57) by HDMTDV with step
size h = 0.001. $y_1$ and the error for the energy are plotted as a function of
time. Parameters and initial conditions are the same as those of Fig.10.

This system is dissipationless, but numerical results show that the numerical
error suddenly increases at special points. This will break the time reversal
nature of the motion. The reason for this phenomenon will be discussed in the
next section.

3.3. Differential-Algebraic Equation of Index 2

As an example of a differential-algebraic equation of index 2, we have integrated
the following equations by HDMTDV,

2 2
0= Lifolf'.M) =^- cos(t)z -8exp(-t)y ,
a (59)

0= t (v,()
2 = i-(i- m{t)
Qii + 3exp(-tj)-y, (60)

where $\alpha$ and $\beta$ are constants. The analytical solution of this system is given by

rtt) t)B ( 6 1 )
^ l-a^W
In this case the variable y(t) contains the first order derivative, but it is
determined by the algebraic Eq.(60). Then, an additional equation is necessary to
determine the value y(t). So, the index of the variables y(t) and z(t) becomes as
shown in Table 3.

variables     y(t)   z(t)
n              1      2
JVR(1,n)       1      0
JVR(2,n)       1      0

Table 3 Index JVR of each variable to solve the differential-algebraic
equations of index-2 given by Eqs.(59-60) by HDMTDV.

We adopted as additional equation

= o, n : positive integer. (62)

Numerical results of HDMTDV for these differential-algebraic equations are shown
in Fig.12.

[Figure 12: two panels, each with $\alpha = 0.9$, $\beta = 0.2$, h = 0.01;
horizontal axis: Time.]

Fig.12 Plot of numerical solution and its relative error for
differential-algebraic equations of index-2, Eq.(59-60). Since the variable z(t)
is solved by the algebraic equation, Eq.(60), the error is only due to roundoff
error, i.e., of the order of $10^{-15}$.

4. Summary and Discussion

We have developed a new integration method with high accuracy and high
applicability. A new program named HDMTDV can solve dissipationless dynamical
systems without secular error. Stiff ordinary equations or differential-algebraic
equations can also be solved by the same program. These properties are
demonstrated by several numerical examples.

Fig.13 Numerical solution of Eq.(52-57) by HDMTDV under automatic change of
the value of JVR guided by Table 4. $y_1$ and the error for the energy are
plotted as a function of time t. Parameters and initial conditions are the same
as those of Fig.10.

In subsection 3.2, we have observed a sudden increase of numerical errors. Let
us consider the reason for this phenomenon. When we treat the constraint given by
Eq.(56), we use two additional equations,

$$x_1 x_1' + y_1 y_1' = 0, \qquad (63)$$

$$x_1 x_1'' + x_1' x_1' + y_1 y_1'' + y_1' y_1' = 0. \qquad (64)$$

These equations are expected to determine the values $y_1'(t)$ and $y_1''(t)$.
But, when the rod is nearly vertical ($x_1 \approx \ell_1$ and $y_1 \approx 0$),
the left hand side of Eqs.(63-64) becomes very close to zero independently of the
values of $y_1'(t)$ and $y_1''(t)$, and it becomes difficult to determine accurate
values of $y_1'(t)$ and $y_1''(t)$. This will be the reason for the sudden
increase of numerical errors shown in Fig.11.
A quick treatment for this problem is provided by switching the value of
the index JVR according to the relation $|y_1| > |x_1|$ or $|y_1| < |x_1|$, i.e.,
in the former case, we treat $y_1(t)$ as a rank-2 variable; on the other hand, in
the latter case, we treat $x_1(t)$ as a rank-2 variable, as shown in Table 4.

                      |y1| > |x1|              |y1| < |x1|
variables   n     JVR(1,n)   JVR(2,n)      JVR(1,n)   JVR(2,n)
x1          1        2          0             2          2
y1          2        2          2             2          0

                  |y2-y1| > |x2-x1|        |y2-y1| < |x2-x1|
variables   n     JVR(1,n)   JVR(2,n)      JVR(1,n)   JVR(2,n)
x2          3        2          0             2          2
y2          4        2          2             2          0

Table 4 An improved index JVR to solve the differential-algebraic
equations given by Eqs.(52-57) by HDMTDV. The index JVR for the
values $T_1$ and $T_2$ is the same as in Table 2.

The physical meaning of this process is the following. We treat $y_1(t)$ as a
rank-2 variable when $|y_1(t)| > |x_1(t)|$. In this case, $y_1(-h)$ is treated as
an initial condition and $y_1''(t)$ is determined by the equations of motion. The
values $x_1'(-h)$ and $x_1''(-h)$ are determined by the additional equations
Eqs.(63-64). When $|y_1(t)| < |x_1(t)|$, $x_1(t)$ is treated as a rank-2
variable. We show a numerical example in Fig.13, using this choice of the value
of the index JVR.

Fig.14 Numerical solution of Eq.(52-57) by HDMTDV introducing
the polar angles $\theta_1$ and $\theta_2$ defined in Eqs.(65-66). $y_1$ and the
error for the energy are plotted as a function of time. Parameters and initial
conditions are the same as those of Fig.10.
A more fundamental treatment for this problem is the introduction of the polar
angles $\theta_1$ and $\theta_2$ as unknown variables instead of $x_1$, $y_1$,
$x_2$ and $y_2$,

$$x_1 = \ell_1 \cos\theta_1, \qquad y_1 = \ell_1 \sin\theta_1, \qquad (65)$$

$$x_2 - x_1 = \ell_2 \cos\theta_2, \qquad y_2 - y_1 = \ell_2 \sin\theta_2. \qquad (66)$$

In this case, Eqs.(56-57) are automatically satisfied. A highly accurate
numerical solution is obtained, as shown in Fig.14.
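
That the polar-angle parametrization satisfies the constraints identically can be confirmed with a short computation. The sketch below uses the rod lengths and initial angles quoted in the caption of Fig.10; the function names are hypothetical.

```python
import math

def tip_positions(theta1, theta2, l1, l2):
    """Rod tip coordinates from the polar angles, following Eqs.(65-66)."""
    x1 = l1 * math.cos(theta1)
    y1 = l1 * math.sin(theta1)
    x2 = x1 + l2 * math.cos(theta2)
    y2 = y1 + l2 * math.sin(theta2)
    return x1, y1, x2, y2

def constraint_residuals(x1, y1, x2, y2, l1, l2):
    """Residuals of the two rod-length constraints, Eqs.(56-57)."""
    r1 = x1 ** 2 + y1 ** 2 - l1 ** 2
    r2 = (x2 - x1) ** 2 + (y2 - y1) ** 2 - l2 ** 2
    return r1, r2

l1, l2 = 10.0, 5.0
theta1, theta2 = math.radians(175.0), math.radians(187.0)
x1, y1, x2, y2 = tip_positions(theta1, theta2, l1, l2)
res1, res2 = constraint_residuals(x1, y1, x2, y2, l1, l2)
print(res1, res2)
```

Both residuals vanish up to roundoff for any pair of angles, so no additional equations are needed for the constraints in these variables.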
In subsection 2.1, we found that two of the coefficients
$(A_j(s_i), B_j(s_i), C_j(s_i))$ $(j = -1, 0, 1;\ i = 0, \ldots, 6)$ are
undetermined (for example the values of $C_1(s_4)$ and $C_1(s_5)$), if we are
satisfied with the compatibility of the truncation errors with those of the
representations Eqs.(2-3). In this case, we can change the dispersion relation
Eq.(24) so as to satisfy

$$|\cos(2h\Omega)| < 1, \quad \text{for } h|\omega| \le \frac{3\pi}{2}, \qquad (67)$$

by an appropriate choice of the values of $C_1(s_4)$ and $C_1(s_5)$. In this
case, the largest step size $h_{max}$ which guarantees the purely periodic
solution of Eq.(22) is given by 3/4 of the period of oscillation.
It will be easy to extend HDMTDV to solve boundary value and eigenvalue
problems. This will be published elsewhere.
The next big task is the construction of a general purpose computer program to
solve the time evolution of multi-dimensional boundary value problems described by
partial differential equations. When the space dimension is 1-D, we have already
constructed such a general purpose computer program based on HIDM [14, 15]. The
present work also represents an important contribution towards accomplishing this
task.

5. Acknowledgements

G. Gnudi acknowledges the Japan Society for the Promotion of Science for the
financial support.

6. References

1. K. E. Brenan, Annals of Numerical Mathematics, 1 (1994) 247.
2. S. L. Campbell, Annals of Numerical Mathematics, 1 (1994) 265.
3. R. März, Annals of Numerical Mathematics, 1 (1994) 279.
4. T. Watanabe, Annals of Numerical Mathematics, 1 (1994) 293.
5. K. Abe, A. Ishida, T. Watanabe, Y. Kanada and K. Nishikawa, Kakuyugo
   Kenkyu (In Japanese), 57 (1987) 85.
6. T. Watanabe and M. Takagi, Trans. JPN Soc. Ind. and Appl. Mat. (In
   Japanese), 1 (1991) 135.
7. T. Watanabe, K. Abe, A. Ishida, Y. Kanada and K. Nishikawa, Kakuyugo
   Kenkyu (In Japanese), 58 (1987) 265-278.
8. T. Watanabe, RIMS koukyuuroku (Research Institute for Mathematical
   Science, Kyoto University) (In Japanese), 841 (1993) 43.
9. J. M. Sanz-Serna, BIT, 28 (1988) 877.
10. S. Saito, H. Sugiura and T. Mitsui, BIT, 345 (1992) 345.
11. H. Yoshida, Cel. Mech. and Dyn. Astr., 56 (1993) 27.
12. G. Gnudi and T. Watanabe, J. Phys. Soc. JPN, 62 (1993) 3492.
13. T. Watanabe and G. Gnudi, ISM Cooperative Research Report (In Japanese),
    55 (1994) 211.
14. T. Watanabe, Trans. JPN Soc. Ind. and Appl. Mat. (In Japanese), 2 (1992)
    93.
15. T. Watanabe, GAKUTO International Series, Mathematical Science and
    Applications, 1 (1993) 189.

Semi-explicit Methods for Differential-Algebraic
Systems of Index 1 and Index 2

Hisayoshi SHINTANI
Department of Mathematics, Faculty of School Education
Hiroshima University, Higashi-Hiroshima 739, Japan

Abstract
A-stable semi-explicit methods are constructed for differential-algebraic sys-
tems of index 1 and for those of index 2 and their convergence is shown.

1. Introduction

Consider the differential-algebraic system of index 1

$y' = f(y, z), \quad g(y, z) = 0$ (1)

and that of index 2

$y' = f(y, z), \quad g(y) = 0,$ (2)

with the initial condition

$y(x_0) = y_0, \quad z(x_0) = z_0,$

where $y$ and $f$, $z$ and $g$ are vectors of the same dimension respectively, $f$ and
$g$ are sufficiently smooth, and for (1) $g_z(y, z)$ has a bounded inverse in the
convex closed domain $D$ and so does $g_y(y) f_z(y, z)$ for (2). The initial value
$(y_0, z_0)$ is said to be consistent if $g(y_0, z_0) = 0$ for (1) and if $g(y_0) = 0$ and
$g_y(y_0) f(y_0, z_0) = 0$ for (2). Let

$x_t = x_0 + th \quad (0 < h, \ 0 < t, \ th < C).$

We are concerned with the case where the approximations $(y_n, z_n)$ to $(y(x_n), z(x_n))$
$(n = 1, 2, \dots)$ are obtained by semi-explicit methods. Rosenbrock-type methods
are convenient because they are non-iterative, but they are liable to be less
accurate because they do not solve the equation $g = 0$ directly, so that at the
$n$-th step of integration we cannot always expect $(y_{n-1}, z_{n-1})$ to be a consistent
initial value. Rosenbrock methods for (1) with inconsistent initial values are
described in the literature$^{1}$.
The object of this paper is to construct A-stable semi-explicit methods
for (1) and (2) that correct the errors of the initial values and to show their
convergence. We also construct interpolation formulas for approximating $y(x_t)$
and $z(x_t)$ $(t \ne 0, 1, \dots)$ and obtain the formulas useful for stepsize control.

2. Methods for Systems of Index 1

Let $d_0 = -(g_z^{-1} g)(y_0, z_0)$, and assume that $\|d_0\|$ is so small that there exists a
unique $\hat z_0$ such that $g(y_0, \hat z_0) = 0$, and put $\bar z_0 = z_0 + d_0$. Then $\hat z_0 = \bar z_0 + O(\|d_0\|^2)$.
Let $(y_0(x), z_0(x))$ be the solution of (1) satisfying $y_0(x_0) = y_0$, $z_0(x_0) = \hat z_0$. Put

$G = (-g_z)^{-1}, \quad K = G g_y, \quad L = f_z G, \quad T = f_y + L g_y, \quad A = (I - ahT)^{-1} \quad (a > 0),$

$F = f + f_z d_0$, where all function values are evaluated at $(y_0, z_0)$. Then we have

$y_0'(x_0) = F + O(\|d_0\|^2), \quad z_0'(x_0) = KF + O(\|d_0\|), \quad y_0''(x_0) = TF + O(\|d_0\|).$
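
As a rough illustration of the correction $d_0$, the following Python sketch applies one Newton step to a scalar constraint and checks that the corrected value is within $O(\|d_0\|^2)$ of the consistent one. The constraint $g(y, z) = z^2 + z - y$, the starting values and the helper name `newton_correction` are assumptions chosen only for illustration; they do not come from the paper.

```python
def newton_correction(g, g_z, y0, z0):
    """One Newton step d0 = -g(y0, z0) / g_z(y0, z0) for a scalar constraint."""
    return -g(y0, z0) / g_z(y0, z0)


# Hypothetical scalar constraint g(y, z) = z^2 + z - y; for y = 2 its root is z = 1.
def g(y, z):
    return z * z + z - y


def g_z(y, z):
    return 2.0 * z + 1.0


y0, z0 = 2.0, 1.01            # slightly inconsistent initial value
d0 = newton_correction(g, g_z, y0, z0)
z_bar = z0 + d0               # corrected value z0 + d0
z_exact = 1.0                 # consistent value, g(2, 1) = 0

print("|d0|            =", abs(d0))
print("|z_bar - z_hat| =", abs(z_bar - z_exact))
```

In this toy case the remaining error after the correction is smaller than $d_0^2$, in line with the estimate stated above.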

We construct A-stable semi-explicit methods for approximating $(y_0(x_1), z_0(x_1))$
of the form

$y_1 = y_0 + \sum_{i=1}^{m} \sum_{j=1}^{n} p_{ij} k_{ij},$ (3)

$z_1 = z_0 + m_1 + \sum_{i=1}^{m} \sum_{j=1}^{n} p_{ij} l_{ij} + \sum_{i=2}^{m} r_i m_i,$ (4)

where

$k_{ij} = (A - I)^{j-1} A h (f_i + L g_i), \quad l_{ij} = K k_{ij} \quad (i = 1, 2, \dots, m; \ j = 1, 2, \dots, n),$

$m_i = G g_i, \quad f_i = f(u_i, v_i), \quad g_i = g(u_i, v_i), \quad u_1 = y_0, \quad v_1 = z_0,$

$u_i = y_0 + \sum_{j=1}^{i-1} \sum_{k=1}^{n} c_{ijk} k_{jk}, \quad v_i = z_0 + m_1 + \sum_{j=1}^{i-1} \sum_{k=1}^{n} c_{ijk} l_{jk} + \sum_{j=2}^{i-1} e_{ij} m_j \quad (i = 2, 3, \dots, m),$

$y_1 - y_0(x_1) = O(h^{p+1} + h^2 \|d_0\| + h \|d_0\|^2),$ (5)

$z_1 - z_0(x_1) = O(h^{q+1} + h \|d_0\| + \|d_0\|^2) \quad (p \ge q \ge 0).$ (6)
We also construct interpolation formulas of the form

$y_t = y_0 + \sum_{i=1}^{m} \sum_{j=1}^{n} p_{ijt} k_{ij} + h v_t \{ f(y_1, z_1) + L g(y_1, z_1) \} \quad (0 < t < 1),$

$z_t = z_0 + m_1 + \sum_{i=1}^{m} \sum_{j=1}^{n} p_{ijt} l_{ij} + h v_t K f(y_1, z_1) + \sum_{i=2}^{m} r_{it} m_i,$

and the formula

$e = \sum_{i=1}^{m} \sum_{j=1}^{n} \hat p_{ij} k_{ij}$

such that $y_t - y_0(x_t)$ and $z_t - z_0(x_t)$ are of the order (5) and (6) respectively
and $e = O(h^p)$. The quantity $e$ is used for stepsize control.

Let $z = k(y)$ be the branch of the implicit function defined by $g(y, z) = 0$
such that $\hat z_0 = k(y_0)$ and put $F(y) = f(y, k(y))$. Then (1) is equivalent to the
system

$y' = F(y), \quad z = k(y),$

so that the A-stability of the method (3)-(4) can be verified by the test equation
$y' = \lambda y$.
We have

Theorem 1 For m = 1 A-stable methods, interpolation formulas and the formulas
with (p, q) = (2,1) exist. For m = 2 those with (p, q) = (3,1) exist. For
m = 3 those with (p, q) = (4,2) exist. For m = 4 those with (p, q) = (4,3) exist
but methods with p = 5 do not exist.
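
To make the A-stability check on the test equation $y' = \lambda y$ concrete, the sketch below evaluates the stability function of a one-stage method of the form (3). For the scalar test equation, $T = \lambda$ and $A = 1/(1 - az)$ with $z = h\lambda$, so that $y_1 = R(z) y_0$ with $R(z) = 1 + \sum_j p_{1j} (A - 1)^{j-1} A z$. The coefficient choice $n = 1$, $p_{11} = 1$ and the sample values of $a$ are assumptions for illustration and are not the coefficients of the paper; the function names are hypothetical.

```python
def stability_function(z, a, p):
    """R(z) = 1 + sum_j p[j-1] * (A - 1)**(j-1) * A * z, with A = 1 / (1 - a*z)."""
    A = 1.0 / (1.0 - a * z)
    return 1.0 + sum(pj * (A - 1.0) ** j * A * z for j, pj in enumerate(p))


def max_modulus(a, p):
    """Largest |R(z)| over a coarse sample of the closed left half-plane."""
    reals = [0.0, 0.1, 1.0, 10.0, 100.0]
    imags = [-100.0, -10.0, -1.0, 0.0, 1.0, 10.0, 100.0]
    return max(abs(stability_function(complex(-x, y), a, p))
               for x in reals for y in imags)


# Illustrative coefficients (assumed): n = 1, p_11 = 1, giving
# R(z) = (1 + (1 - a) z) / (1 - a z).
for a in (0.25, 0.5, 1.0):
    print("a =", a, " max |R(z)| on sample =", max_modulus(a, [1.0]))
```

For this simple choice the sampled modulus stays at or below 1 when $a = 1/2$ or $a = 1$ and exceeds 1 when $a = 1/4$, which is the kind of condition on $a$ that an A-stability analysis of (3)-(4) has to establish. A sample grid is of course only a plausibility check, not a proof.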

Examples of the formulas


Case m = 1
" = g, P11 — 1, Pin = P12 — 1;

0=7. P11 = li P12 = 1, put = fc Pia = 2< - 1 , P12


2
— 1.
4
Case m = 2
1 3 3 3 11 7 11 16
—, Pll - ^f, P12 - - g , P l 3 - —g", P21 -
0 = 7 . Can = - , C212 - - , C13 = 2 -
16 16 , , 16, , 16,
3
16 , 2 3 a 3
TI = y Pn. - t--t , p, , = - t + 2 t — f , p i = t - 4 t + y ( , P21, - 2 3 1

16 32 32 32 32
~
2

ra - y i ,pn = - , Pis = E w Pi3 = ^ . P21 - gj-.


Case m = 3
1 3 3 33 1
= C 3 U = 0 3 , 2 = l f C 3 1 3 = C 3 H =
0 =
7> =
5' 0 2 1 2
~ 25' ^ ~ 125' ~3'
50 8 8 10 4 125 1
C32i = 0, e 2 = j , P11 = 0, P12 - - g . Pis = — j p P14 = g, P21 =
3 P3i - g>
3 2 3 2
r = B, r = 1, put = ^ t f O f - 49f + 27), p
2 3 l a = ^(24t - 41t + 18t - 9),

3 2 3 2 3
Put - |(12( +5t -36t+9), pi it m ^(24( -65t +54t+9), n = ^ t ( 4 - 3 t ) , P2

3 4 2 2
P3i, = | i ( 4 - 3 t ) , p, - t - i , ra - | j t ( 5 t - l ) , r 31 « i/. (5t-3), p n == y , P12 = | ,
20 5 125 _ 5
P i 3 - - T , P l 4 = g , P2i = - ^ . P3i-g.

Case m = 4
1 1 2 11 2 8 26
a = 2' 4 : 2 1 1 = C
3 ' 212 = - g . C213 = — , C 3U - - , C312 - - —, C i3 - 3 - — ,

5 1030 1331 560


C321 - CHI = 1, - "jjy, C 413 - - 3 ^ - , C421 = gg-, - 2,

27 9 621 189 13 151 463


c«i = ~ , CJ2 = 5, <42 = *43 = -fc. Pi] - ^ P12 - -go". P13 -
131 9 119 9 13 239 69
PM - P21 = Tq, P22 = - T J - P3t = 55, P4! = 720- r a
= -20' ' " 15- 3

4 3 2
r = I£
4 P U ( = -4r(378t - 1647t + 2255S - 1080t + 120),
15 120
3 2
p 121 = l->-76Uf + H685i - 6920t + 90( - 90),
yu
4 3 2
Pi3< = TT^I-27621t +52440f -23345( - 1080t + 540),
540
4 3 2
p l4[ = _ L { i o i 2 5 ( - 19140( +8105i + S72t-324},

2 2 3 2
PS1I = ^* (36t - 75f + 40), P22, = Y^( (16119t - 30810t + 14215),

P3i- - 4t3(-9(2 + 1 5 (
" 5 |
' p l
" • T^* '" * 3 5 4 2 + 1 0 S (
" 5 0 )
-
3 2 J 2
p, - jt (27t - 501 + 23), rj, = -^-( (65t + 93), r , = Jrt (85t - 39),
3
4 40 20
13 , 2 ... r 25 . 13 . 43 . 65 . 15
p ( ( 5 ( _ 3 ) P l 1 = m = P l 3 m -
* - 3o - i 2 ' y = is - = M' ™ = - y
16 . 3 . 13
P22 = - y , #M-jr;
2
We have shown the following

Theorem 2 Suppose that $d_0 = O(h^{s+1})$ $(s > 0)$, $0 < h < h_0$, $nh < C$. Then
for sufficiently small $h_0$

$y_n - y_0(x_n) = O(h^t), \quad \bar z_n - z_0(x_n) = O(h^t), \quad z_n - z_0(x_n) = O(h^u) \quad (n = 1, 2, \dots),$

$t = \min(p, q + 2, s + 3), \quad u = \min(p, q + 1, s + 2).$

3. Methods for Systems of Index 2

3.1. Construction of the Methods


Let $G(y, z) = -(g_y(y) f_z(y, z))^{-1}$, $H(y, z) = f_z(y, z) G(y, z)$, let $M_j$ $(j = 0, 1, 2, 3)$
be the constants such that

$\|G(y, z)\| \le M_0, \quad \|H(y, z)\| \le M_1, \quad \|g_{yy}(y)/2\| \le M_2, \quad \|g_y(y) f_{zz}(y, z)/2\| \le M_3 \quad \text{in } D$

and put

$D_0 = M_1 M_2, \quad D_1 = M_0 M_3, \quad c_0 = H(y_0, z_0) g(y_0), \quad \bar y_0 = y_0 + c_0, \quad y^{(0)} = y_0.$

Then we have

Proposition 1 Suppose that $r = D_0 \|c_0\| < 1$. Then the sequence $\{y^{(k)}\}$ defined
by the iteration

$y^{(k+1)} = y^{(k)} + H(y^{(k)}, z_0) g(y^{(k)}) \quad (k = 0, 1, \dots)$ (7)

converges to $\hat y_0$, which satisfies the equation $g(y) = 0$ and the estimate

$\|\bar y_0 - \hat y_0\| \le D_0 \|c_0\|^2 / (1 - r^2)$ (8)

holds.
Proof. Put c = y<* ' - s/W (fc - 0,1,...). Since g(y< )) + j „ ( # > ) c i - 0, we
k

have
fc+1) W
<7(i/ ) =jf (1-*)frf + Oc )(Ck, c )d&,
k k (9)
which yields the estimate
2
\\c i\\ < Do\\ck\\ ,
k+

so that
2 1
llcfcll < r ' - Hcoll (* = 0,1,...). (10)
For any positive integer p we have

U*-H» _3,W|| < £ h k + . _ A < 2 ' - l ,| ||


r C0 / ( 1 _ ,2^

which impUes that tip"*} ia a Cauchy sequence . Taking k — 1 and letting


p —> o o in (11), we obtain (8). From (9) and (10) it follows that
|j(» ( t + I )
)|<«2llCfc|| -*0 2
ffe-OO),

which shows g(yo) — 0 by the continuity of g. This completes the proof.
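
The iteration (7) and the quadratic decay of the increments $c_k$ can be tried out on a small example. The sketch below assumes a hypothetical index-2 problem with $g(y) = y_1^2 + y_2^2 - 1$ and $f_z(y, z) = -y$, so that $g_y f_z = -2(y_1^2 + y_2^2)$, $G = 1/(2(y_1^2 + y_2^2))$ and $H = f_z G$. These functions, the starting point and the helper names are illustrative assumptions, not taken from the paper.

```python
def g(y):
    """Hypothetical constraint: the unit circle."""
    return y[0] ** 2 + y[1] ** 2 - 1.0


def H(y):
    """H = f_z G with f_z = -y and G = -(g_y f_z)^{-1} = 1 / (2 |y|^2)."""
    G = 1.0 / (2.0 * (y[0] ** 2 + y[1] ** 2))
    return [-y[0] * G, -y[1] * G]


def project(y, steps):
    """Apply y <- y + H(y) g(y) and record the norms of the increments c_k."""
    increments = []
    for _ in range(steps):
        gy = g(y)
        Hy = H(y)
        c = [Hy[0] * gy, Hy[1] * gy]
        y = [y[0] + c[0], y[1] + c[1]]
        increments.append((c[0] ** 2 + c[1] ** 2) ** 0.5)
    return y, increments


y_hat, increments = project([0.9, 0.3], 5)
print("g(y_hat)   =", g(y_hat))
print("increments =", increments)
```

In this example each increment is bounded by the square of the previous one, which mirrors the estimate $\|c_{k+1}\| \le D_0 \|c_k\|^2$ used in the proof, and the limit satisfies the constraint to rounding accuracy.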


Let

$e_0 = G(\hat y_0, z_0) g_y(\hat y_0) f(\hat y_0, z_0), \quad z^{(0)} = z_0.$

Then by the same reasoning we have

Proposition 2 Suppose that $s = D_1 \|e_0\| < 1$. Then the sequence $\{z^{(k)}\}$ defined
by the iteration

$z^{(k+1)} = z^{(k)} + G(\hat y_0, z^{(k)}) g_y(\hat y_0) f(\hat y_0, z^{(k)}) \quad (k = 0, 1, \dots)$

converges to $\hat z_0$, which satisfies the equation $g_y(\hat y_0) f(\hat y_0, z) = 0$ and the estimate

$\|\hat z_0 - z^{(1)}\| \le D_1 \|e_0\|^2 / (1 - s^2)$

holds.

Corollary 1 Let

d = G{yo,z )g (yn)\f(y ,z )
Q 0 y 0 0 + fy(.yo, Z O ) C Q } > ZO = zo + dn.

Then
e ^d 0 n + 0{\\dof + \\cf), (12)
2 Z
||io-5o|| = 0(||(iol| + l|col| )- (13)
Proof. From

eo = {G(iiD,^H0(IMI)}{ftr(»^
3
- * + 0 ( 1 * 1 11*11+INI )
the estimate (12) follows. The inequality

W W
\\ZO-ZQ\\ < jzQ- z \\ + \\z

yields (13) and the proof is completed.


We assume that $\|g(y_0)\|$ and $\|g_y(y_0) f(y_0, z_0)\|$ are so small that we can find
$(\hat y_0, \hat z_0)$ in the neighborhood of $(y_0, z_0)$. Let $z = k(y)$ be the branch of the
implicit function defined by $g_y(y) f(y, z) = 0$ such that $\hat z_0 = k(\hat y_0)$ and put
$F(y) = f(y, k(y))$. Let $(y_0(x), z_0(x))$ be the solution of (2) satisfying $y_0(x_0) = \hat y_0$,
$z_0(x_0) = \hat z_0$. Put

$G = -(g_y f_z)^{-1}, \quad R = G g_y, \quad K = R f_y, \quad L = f_z G, \quad A = (I - ahT)^{-1} \quad (a > 0),$

$T = P f_y, \quad P = I - Q, \quad Q = -L g_y, \quad F = f + f_y c_0 + f_z d_0, \quad E = g_{yy}(F, F),$

where all function values are evaluated at $(y_0, z_0)$. Then

$y_0'(x_0) = F + O(\|c_0\|^2 + \|d_0\|^2), \quad z_0'(x_0) = KF + GE + O(\|c_0\| + \|d_0\|),$

$y_0''(x_0) = TF + LE + O(\|c_0\| + \|d_0\|), \ \dots.$

We construct semi-explicit methods for approximating (yo(xi), ZQ(X,)) of the


form
m n
yi = vo + co + ^ ^ ( p i j * i . f + rymiji, (14)
<=U=i
m n m
I +r + s + t 15
81 =zo + do + E Z ^ ( P y ' J ' j " y ' 5 I ( ' P ' ''9')' ( >
1=1j=i 1=2
where
J 1
fc = (A-7) '- ,4/iF, li^iEferjjj 0 = 1.2,...,n),
u

l
k - {A - iy- Ah{fi
tj - F), kj = Kkij (i = 2,3,...,m;j = 1,2, ...,rt),
rm = {A- iy-'ALgi,
} rty = Km,, Pi = R(fi - F),

5i = Ggt/h, fi - f(ui,vi), gi — g{wi),


i-l n
Ui = yo + c + E ^2(djkkjk
a + dij*mjfc)|
y=i*=i
i-l n i-l
c
Vi = z 0 + d + J2 E( y**jfc + <*y*«i*) + £ ( e i j p j
0
+

j=ifc-J j=2
i-l n
Wi = jo + co + + u* j*)' d m

j=ifc=i
1 2 2 2
ft = S1 - So£*i) = Of**" + A M l + 0 ||doll + Hcoll + h ||d || ), 0 (16)
+1 2 2
r , - z i - z i ( x ) = O(A« +fc||^||+ft||d ||
< 1 0 + || || + l|d || ) ( p > > 0 ) . (17)
Ca o 7

We also construct interpolation formulas of the form


in n m
, i rrtmil
yt = yo + co + EEj 'i ^'+5] + rirh,
i=lj=l i=2
m n +
TIT
7 7til
2, = ZO + do +EEPy'''J 5Z' "" + iiPi s
+'"3i)
17=1 j=l i=2
and the formula
m n m

i=l j=l i=2


where
m n m
m - ALg(w),w = yo + c + E H + 0

,=1 j=l i=2


The equation (2) is equivalent to the system $y' = F(y)$, $z = k(y)$, so that
the A-stability of the method is verified by the test equation $y' = \lambda y$. We have
Theorem 3 For m = 1 A-stable methods, interpolation formulas and the formulas
with (p, q) = (1,0) exist. For m = 2 those with (p, q) = (2,1) exist and
for m = 3 those with (p, q) = (3,1) exist. For m = 4 those with (p, q) = (3,2)
exist but methods with p = 4 do not exist.
Examples of the formulas
Case m = 1
= 1 = ( = 1
o= 2> Pn ' P"i ' Pn -
Case m = 2
1 1 1 , . 1 5
O= Cn -
2 - , Cai2 - -yjj, C2)i - 1, C212 = - , pn = 1, P12 - —j

33
p2l = 3, r = 1, s = 9, t = 2, pn, = i , p i = K - y £ + » ? , P211 = -6f+18( -9( ,
2 2 2 2 (
2 3

2 2
r , = i , s , = 9t , t , = 2(, pn = 0, pu = -5,
2 2 2 P21 = 6 , f = ^. 2

Case m = 3
1 3 3 21 99 69
a = j , O i l = 1, C 2 1 2 = - - , GJI, = 7 , C3i 2 = -jjg. <WS - - — , c 321 = — ,

9 9 3 , , , 1 . 7 . 1

0321 = y y «32 • Yg. ff3 = 2 &2U = C212 = C n = 1, C312 = - g , 3 C313 = y y =


3'

dMl = 1. C11 = = (f =3 pn = 1, p i = - - , 2 P13 = - , P21 = 0, P22 = - - ,

r
P31 - ^ y 21 = 1, T - = - 1 , r = s = 1, S3 m —, 1 — —, t = 3, 2 2 3 1 2 2 3

•> 16 , 16 , 8 o 19 16 T

2 3 3 2 J
Pill = t. PlM - 2t -t—-t , pu, = y t - - t + ( , P 2 „ - 0, p , - --t\ 22 put - —( ,

f i. = t , r
2
2
2 2 [ = -t , r 2
3 1 1 - ( , r, = 8 ( t - ( ) , p,i = 0, p , = - y P13 = y
3 2 3
2 P21 = 0,

P31 = f31 = 1.
Case m = 4
1 1 1 1 3 39
a = C2U C 2 1 2 0 2 1 3 = C 3 n = C312 =
4' " 3' = "5' "9' ?' "250'
337 37 207 33 3
cm = - £ g , * M = — , < * » = — ,C322 = j g . A u - 1.4M = J.
. 64 , 128 , 64 . 686 81
— 4 2 1= 31 = 632 = e 4 2 = 9
dill = 2 2 5 . ^ - Tl25 ST'^ 243' 25' '
50 128 1664 686 9 45 9
643 = y J32 = y-jy ?42 = — = JjJ.ejl, = ^Cm = 32,C3ii = , ?

99 . „, , . 41 . 7 . 200 ; 64
C312 = j y ^ i =o,c4ii - i , c i 2 = - y - 4 2 i = - j . « m = y f . ^ i = T_-, 4
c

, 64 - 64 - 343 8 1 257
*2l - ^.^422 - 2 ^ 3 , ^ 3 , = - — , P U = 1.P12 = - g , P , 3 = 3^,P.4 = g j ,
41 59 125 1 64 64
P2, - 0,P22 = "^,P23 = 77T.P31 = ^ P" = g, r , = g j . f ^ = - - . 2

349 50 2048 1372 3 J

r « = - ^ , r « = 1,82 = 9 = y , S = 1,(2 = . t 3 = , t 4 = " y , > S J 4 W W

3 z 3 2
PlU = *,Pia = ^ ( 2 2 5 f - 6 4 i + 36t-18),p22i = ^ ' ( 1 5 t + 64t - 72i + 18),

P3U = - i),pm = - 4),r , = g t V t t , = - ^ t ,


2I
2

a 2 2 2 2
r i i . = f | | ' ( 8 * - 9), r « , = - | t ( 7 t - 9), , = 9t , * = ^ ( , s , = ( , S 2 4

128 686 3 8
t 2 = t ( 1 3 ( + 3 ) t 3 = i(5f 3 > ( 4 ! t ( 7 ( 6 ) 2 = P u =
' 243 ' ' 243 ~ ' " "2 ~ '^ ~ 21'
2 . 784 .
P 4 1 r 3 1 = , 4l = L
-21' 729' ' -

3.2. Convergence

Let $(y_j, z_j)$ $(j = 1, 2, \dots)$ be the approximations to $(y_0(x_j), z_0(x_j))$ obtained
by one of our methods and put

$\bar y_j = y_j + c_j, \quad \bar z_j = z_j + d_j \quad (j = 0, 1, 2, \dots),$

where

$c_j = H_j g(y_j), \quad d_j = G_j g_y(y_j) \{ f(y_j, z_j) + f_y(y_j, z_j) c_j \},$

$H_j = H(y_j, z_j), \quad G_j = G(y_j, z_j).$
Then we have

Theorem 4 Suppose
r+1 1+l
c = O{k ),
0 d = O(h )
0 {r,a > 0), 0 < f t < f c , 0 nh<G.

Then for sufficiently small ho

fe-»0(*i) = O(A'), % - * o ( x ) = 0{ft') i (J = l,2,...), (18)

ft " = O(A'), z - z (x ) k 0 k = 0(A") (* = 2,3,...), (19)


m
ft - W N - O(fe'), n - a f n j = 0 ( / i ) , (20)

tuftere

( - min(p,g + 2, r + 3,2r + 2, s + 3), u = min(j), g + 1,r + 3,2r + 2, s + 3),

I = min(p + l,f- + 3,2r + 2,s + 3), m = rnin(g + l , r +2,s + 2).


Proof. Let £y be the solution of g — 0 obtained by the iteration (7) with the
starting value (j(j,«f) and let Zj = fc(jy). Let (tfj-(z),Zj(x)) be the solution of
(2) such that yj(xj) — yj, Zj(ij) — ij and put
z
Sj+1 = Vj+i - l o f o + i ) . Tj+i = i+\ ~ Zj(xj+i).

Then for some constants oo, &o, * (j = 1,2,3) and (fc = 1,2) we have by
(16) and (17)
1 2 2
| | S j i | | < aoh^
+ + UihHWcjW + \\dj\\) + 02 IICJH + ftaj ||d,|| , (21)

| | T j l | < 6oA
+1
,+1
+b h(\\c \\ + \\dj\\) + b2(\\cjf 4- \\djf).
1 j (22)

Let

Q} = -Myi>zj)Gi9(yj), Pj = i-Q,, Ri^GmfaffidViiZi)'

Aj« = life - yo(xj)\\, A«j = \\z, - ao(xj)|| 0' = 0.I. ••).


and let $L_0$ be the constant such that $\|F(u) - F(v)\| \le L_0 \|u - v\|$. Since

i?
90J ~ » ( * ) = Wte) - » ( * / ) + / { ^ i - (fo(())}d( ( I > * ; ) ,

we have

b j d ) - j/ofaOII < Ay, + £ f 0 \\y it) - y (t)\\ dt,


3 0

so that by Gronwall's lemma


\\y (xj+i) - So(sj+i)|| <
3 e^Ayj.

From
2
Sfetzj+l)) - " S i) j+ = - ft,(tt+i)S i i+ + 0 { | | S | | ) - 0,
3+l

it follows that
a
c i =
i + fff+ij^+l) = -Qi+iS i j+ +0(|%iS ), (23)

and so
a
fc'+l - ft&fcl] - Fj+iSj + i + 0(||Sj+i|| ),
because
a
-fe+l=^ + J +0(||c i|| ). i+

Hence for some constants fa and fcj


2
< e"" A Vj + ft, HSf+,11 + k 2 \\S \\ .
j+1

CL
Setting d = kie ° (i = 1,2) , we have

2
Ay. < C , £ | % ] + C £ HSjll 2 ( = 1,2,...),
n (24)
J=I j=i

because Ay = 0 , and there exist constants C, (i = 3,4,5) such that


0

2
\\y^-yo(x )\\<Ay„n + C \\S \\ 3 n ( r i - 1,2,...), (25)
2
\\yn ~ Vo(x )ll < Ay„ + C ||S || + C ||S„||
B 4 n h (n - 2,3,...), (26)
Ilia •-•••»B.(*i)ll = l i f t ! . (27)

Let $k_0$ be the constant such that $\|k(u) - k(v)\| \le k_0 \|u - v\|$. Then

Az„ = \\k(y„) - k(yo(x ))\\ < k Ay n 0 n (n = 0,l,..),

so that for some constants kj (j = 3,4,5)


2 2
P „ - io(*»)ll <fcoAl/n+ fcsllknll + I K | | ) (n = 1,2,...), (28)
2
h i - tt(xt)l < koAy + fc IIrfjt|| + k (\\c \\
k 4 5 k + \\d f) k lit = 2,3,...), (29)
l k i - ^ ( n ) l l = llr,||. (30)
Since
2
Vj(Xj+i) = Vj+i ~ Fj+iSj+i +0(||S - i|| ), zjixj+i) = J + -Tj+i,

we have

Ss(w(^+i))/(y,(^+i),z,(xj i)) +

= Ss(W+i)(/ + V j + O - S j t / j P j + i S j + i + ATj+O - ^ ( P j + i S j + i , / )
2 2
+o(||s || + ||r || }J+1 i + 1

= 0,

which yields

(31)
where function values without arguments are evaluated at j/j+i or at ( i / j i , z i ) . + J+

From (21), (22) and the assumption it follows that for some constants $A_1$
and $B_1$
m
liftII <Aih', \\T,\\ <Bih .
By (21), (22), (23) and (31) there exist constants a t (j - 1,2,3), h and b 2

such that
, 2
\\S 4< h^ + n {\\S \\
j+ ao ai j + \\T \\)+j a2 Wf-H&pyf,
,+1 2
| | T i | | < 6oft
i+ + M d l f t l l + 113)11) + W l l S i f + P J | ) (j = lj2,...).
We shall show that for sufficiently small ho inequalities
+ w+i
llftll < Ajh" \ \\Tj\\ < Bjh (32)

are valid with bounded coefficients for j = 2,3,... , where

v = min(p,g + 2,r + 3, s + 3), w — min(g, T + 2, s + 2).



Put
1 +1 21 1 -
A = a^-"
2 + aiiAih? -' + B,h? -') + a A\h -"- 2 4- ^ ^ M * * ,

Then (32) holds for j = 2. Suppose that (32) holds for j = 2,3,.... k and put
+2 v +l 2 2 +2
A k+1 = h' -»
ao 0 + a i (A hl + B ^ ~ )
k k + aiAlh° 0 + a B h r -\
3 k (33)

+1 4 1 1
Bfc+i = 6ohT" + 6i(iltfcS -" + B^fto) + f ^ f t o " " + BgftJ?* ). (34)
Then (32) is valid for j = ft + 1. Setting

j l t = 0 + i f c , B* = c+feoB (fc = 3,4,...), t

2 2w+2 v
where a = ao^"" + MNf**~" + a b h - , 3 b = bohf", c = b + Wifto +
2 +l
6 6 /to . we have
2

3
= «4 AB^ai*. + «^Ao^*H- a T ^ * ^ ^ == 4>C^^ ^fc^

where Oj (t — 4,5,6,7) , bj (j — 3,4,5) , A and B are polynomials in fto 3 3

with nonnegative coefficients. From (33) and (34) it follows that A > 0 and k

B > 0 (ft = 3,4,...) , because -4 > 0 and B > 0.


fc 3 3

Put
(A ,B f k k = U , (<p(V)MV)f
k =T{U)
and choose ha small so that

|T(10|| < r < \ 0 for 11(7 - t/ || < 2 ||(/ - t/ ||. 3 4 3

If for j = 4,5,...,fc
\\Uj-V \\<2\\Ui-U \\,
3 3 (35)
then since
\\u -u \\k+l k <T \\u -v _A,
Q k k

we have
\\u -u \\
k+l 3 < Eiall^+i-^n
s ,
< ( l + r + ... + i £ - ) | | E J - 7 i | | < 2 | l ( / - f / | | ,
0 4 l 4 3

and (35) holds for j — k 4-1. Hence j i \ and B^ are bounded and there exist
constants A and B such that

Aj < A, Bj <B (j= 1,2,...).



By (24) we have
v+l 2 2 a +2
AiM < CiA{h! + (n-l)h } + C A {h ' + (n - l ) A * }
2
i 2 2 2 +1 !
< Ci>l(A + Cft'')+C2vl (/t ' + C A " ) < D / i ( n = l , 2 , . . . ) .

Estimates (18), (19) and (20) follow from (25), (28), (26), (29), (27) and (30)
respectively.

4. References

1. E. Hairer and G. Wanner, Solving Ordinary Differential Equations II,
(Springer-Verlag, Berlin, 1991), 453-454.
2. H. Shintani, Semiexplicit Methods for Differential-Algebraic Systems of
Index 1, Bull. Fac. Sch. Educ. Hiroshima Univ., Part II 17 (1995), 23-32.

COMPUTATIONAL CHALLENGES IN THE SOLUTION
OF NONLINEAR OSCILLATORY
MULTIBODY DYNAMICS SYSTEMS"

JENG YEN
Army High Performance Computing Research Center, University of Minnesota
Minneapolis, MN 55415, USA
E-mail: yen@ahpcrc.umn.edu
and
LINDA PETZOLD
Department of Computer Science, University of Minnesota
Minneapolis, MN 55455, USA
E-mail: petzold@cs.umn.edu

ABSTRACT
One of the outstanding problems in the numerical simulation of mechanical sys-
tems is the development of efficient methods for dealing with highly oscillatory
systems. These types of systems arise for example in vehicle simulation in mod-
elling the suspension system or tires, in some models for contact and impact, in
flexible body simulation from vibrations in the structural model, and in molec-
ular dynamics. Simulations involving high frequency vibration can take a huge
number of time steps, often as a consequence of oscillations which are not phys-
ically important. On the other hand, the components causing the oscillations
cannot usually be eliminated from the model because in some situations they
are critical to the simulation. The equations of motion of a multibody mechanical
system are described by a system of differential-algebraic equations (DAEs).
In this paper, we will explore two types of methods. The first class of methods
damps out the oscillation via highly stable implicit methods. Even in this
relatively simple approach, unforeseen problems may arise for Newton iteration
convergence, due to the nonlinearities. The second class of methods involves
formulating the multibody system in such a way that the oscillations are
determined by a linear subsystem, which can potentially be solved rapidly via a
number of different methods.

1. Introduction

Much recent work has been focused on the development of numerical methods and
underlying theory for the solution of multibody dynamic systems (MBS) consisting of
fast and slow subsystems$^{28,20}$. These types of systems occur frequently as initial value
problems in the computer-aided design and modeling of constrained mechanical
systems, molecular dynamics, and in many other applications$^{1,42}$. It is well-known that
the characteristics of fast or slow solutions are determined not only by the modeling
aspects, e.g., the coefficients of stiffness and damping, but also by the initial conditions
and events that may excite stiff components in the system during the simulation.

"This work was partially supported by ARO contract numbers DAAL03-92-G-0247 and DAAH04-94-
G-0409 and by ARO contract number DAAL03-89-C-0038 with the University of Minnesota Army
High Performance Computing Center, and by the Minnesota Supercomputer Institute.
As an example, the governing equations of motion of a mechanical system of stiff
or highly oscillatory force devices may be written as differential-algebraic equations
6
(DAE) :

-£/*(«.?. 0 + G ( « ) *
r
= 0 l a
< )
<?(<?) = 0 (lb)

where $q = [q_1, \dots, q_n]^T$ is the generalized coordinate, $\dot q = dq/dt$ is the generalized velocity,
$\ddot q = d^2 q/dt^2$ is the acceleration and $\lambda = [\lambda_1, \dots, \lambda_m]^T$ is the Lagrange multiplier. There are
$n_f$ stiff or oscillatory forces $f_i$, $Q$ includes all the field forces and the external forces
which are non-stiff compared to the stiff components, $g$ is the kinematic constraints,
$G^T \lambda = (\partial g / \partial q)^T \lambda$ represents the internal constraint reaction forces, and $M$ is the
mass-inertia matrix. The stiff force components in (1a) are usually expressed by

$f_i = T_i(q_{i_1}, q_{i_2}) (K_i \eta_i + C_i \dot\eta_i)$ (2)

where the $i$-th force, $i \in 1, \dots, n_f$, is a bilateral force between the pair of bodies $i_1$ and $i_2$;
$T_i$, which may be the orientational transformation matrix of a local coordinate system,
is a nonlinear transformation of configuration spaces; $\eta_i$, which may represent relative
distance or angles between adjacent bodies, is a function of the generalized coordinates
$q_{i_1}$ and $q_{i_2}$; and finally $K_i$, $C_i$ are the associated spring and damping factors$^{33,29,7}$. For
some generalized coordinate sets, such as optimal relative coordinates of mechanical
systems, the function $\eta_i$ may be linear or even the identity, e.g., $\eta_i = q_{i_j}$ for
some $i$, $i_j$; the nonlinearity of $f_i$ with respect to the generalized coordinate $q$ is then due to
the spatial transformation $T_i$ $^{33}$. When the components of the coefficient matrices $K_i$
and $C_i$ become large, these force components may cause rapid decay or high frequency
oscillation in the solution of (1). The purpose of this article is to study these systems
and their numerical solution.
To demonstrate the problem of oscillation and the recent developments in this area,
we present two examples: a stiff pendulum and a 2D bushing problem. The former is
a very simple example of a type of system often seen in modeling molecular dynamic
systems, and the latter is a general form of modeling force devices in multibody
mechanical systems.

Stiff pendulum

In Cartesian coordinates, a simple stiff pendulum model, with unit mass and gravity,
may be expressed as
$0 = \dot x - u$ (3a)
$0 = \dot y - v$ (3b)
$0 = \dot u + x\lambda$ (3c)
$0 = \dot v + y\lambda - 1.0$ (3d)
$\epsilon^2 \lambda = \dfrac{\sqrt{x^2 + y^2} - 1.0}{\sqrt{x^2 + y^2}}$ (3e)

where the stiff spring of natural length 1.0 and stiffness $1/\epsilon^2$ is attached to the center of
mass of the pendulum. Preloading the spring by using $\epsilon = \sqrt{10^{-3}}$, the initial condition
(0.9, 0.1) of $(x, y)$ and the zero initial velocity of $(u, v)$, the results of the states
$(x, y, u, v)$ in the 0 to 10 second simulation are shown in Fig. 1. The corresponding
eigenvalues of the underlying ODE of (3), i.e., substituting (3e) into (3c, 3d), are
illustrated in Fig. 2, where the 3D figures contain all the eigenvalues on the complex
plane drawn along the time-axis. The dominant pair of eigenvalues in the example are
$\pm i/\epsilon$, as shown in Fig. 2. As $\epsilon \to 0$, the pair of eigenvalues approaches $\pm\infty$ along the
imaginary axis. The other pair of eigenvalues oscillates on the complex plane with the
amplitude and frequency approaching $\pm\infty$. Decreasing $\epsilon$ to $\sqrt{10^{-5}}$, the eigenvalues
of the underlying ODE of (3) are shown in Fig. 3. Compared to those in Fig. 2, two
pairs of eigenvalues in Fig. 3 are 10 times the magnitude of those in Fig. 2, and the
oscillating pair increases its frequency in proportion to $1/\epsilon$.
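
The behaviour described above can be explored with a small experiment. The following Python sketch integrates the underlying ODE of the stiff pendulum with a classical fourth-order Runge-Kutta scheme and monitors the total energy and the spring stretch. It assumes that (3e) reads $\epsilon^2 \lambda = (r - 1)/r$ with $r = \sqrt{x^2 + y^2}$; the integrator, the step size $h = 10^{-4}$ and the energy function are illustrative choices and are not the solver used by the authors.

```python
import math


def rhs(state, eps):
    """Underlying ODE of the stiff pendulum: (3e) substituted into (3c), (3d)."""
    x, y, u, v = state
    r = math.hypot(x, y)
    lam = (r - 1.0) / (eps ** 2 * r)
    return (u, v, -x * lam, -y * lam + 1.0)


def rk4_step(state, h, eps):
    """One classical Runge-Kutta step (illustrative choice of integrator)."""
    k1 = rhs(state, eps)
    k2 = rhs(tuple(s + 0.5 * h * k for s, k in zip(state, k1)), eps)
    k3 = rhs(tuple(s + 0.5 * h * k for s, k in zip(state, k2)), eps)
    k4 = rhs(tuple(s + h * k for s, k in zip(state, k3)), eps)
    return tuple(s + h * (a + 2.0 * b + 2.0 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def energy(state, eps):
    """Kinetic energy + spring energy - y (unit mass, unit gravity along +y)."""
    x, y, u, v = state
    r = math.hypot(x, y)
    return 0.5 * (u * u + v * v) + (r - 1.0) ** 2 / (2.0 * eps ** 2) - y


eps = 10.0 ** -1.5                 # epsilon = sqrt(10^-3)
h = 1.0e-4                         # step well below the fast period 2*pi*eps
state = (0.9, 0.1, 0.0, 0.0)       # initial (x, y, u, v) from the text
e0 = energy(state, eps)
max_stretch = 0.0
for _ in range(10000):             # integrate over 0 <= t <= 1
    state = rk4_step(state, h, eps)
    max_stretch = max(max_stretch, abs(math.hypot(state[0], state[1]) - 1.0))
e1 = energy(state, eps)

print("fast angular frequency ~ 1/eps =", 1.0 / eps)
print("energy drift                   =", abs(e1 - e0))
print("max |r - 1|                    =", max_stretch)
```

The explicit scheme only works here because the step is far smaller than the fast period; this is precisely the cost that the methods discussed in this paper aim to avoid.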

Figure 1: Stiff Pendulum in Cartesian Coordinates

Lubich$^{20}$ shows that the numerical solution of a class of Runge-Kutta methods for
stiff mechanical systems with a strong potential energy, e.g., a stiff spring force such as in
the stiff pendulum (3), converges to the slowly varying part of the solution, with the
stepsize independent of the parameter $\epsilon$ in (3). Reich$^{28}$ extends the principle of the
slow manifold$^{10,1?}$ to DAEs of MBS with highly oscillatory force terms. Algebraic
constraints corresponding to the slow motion were introduced with a relaxation
parameter to preserve the slow solution while adding flexibility to it in the slow manifold
approach.

Figure 2: Eigenvalues of Stiff Pendulum in Cartesian Coordinates, epsilon = $10^{-1.5}$

Figure 3: Eigenvalues of Stiff Pendulum in Cartesian Coordinates, epsilon = $10^{-2.5}$

It is not clear that a slow solution appears in the above example. In fact, we can
only identify the slow solution of (3) using a proper nonlinear coordinate
transformation. In polar coordinates $(r, \theta)$, we obtain the equations of motion of (3):

$0 = \dot r - z$ (4a)
$0 = \dot\theta - \omega$ (4b)
$0 = \dot z - r\omega^2 + \frac{1}{\epsilon^2}(r - 1) - \sin\theta$ (4c)
$0 = \dot\omega + \frac{1}{r}(2 z \omega - \cos\theta)$ (4d)

where $(z, \omega)$ is the velocity. In the 0 to 10 second simulation, using the same initial
conditions as in the Cartesian coordinates, we obtain the solution in Fig. 4, where
the fast solution is $(r, z)$ and the slow solution is $(\theta, \omega)$. The eigenvalues along the
solution trajectory are presented in Fig. 5. Note that the dominant eigenvalues are the
same as those in the Cartesian coordinate formulation. This is because the coordinate
transformation, $x = r\cos\theta$, $y = r\sin\theta$, is linear with respect to the fast moving $r$. The
eigenvalues of (4) with $\epsilon = \sqrt{10^{-5}}$ are presented in Fig. 6. Similar to the comparison
in the Cartesian formulation, we obtain the eigenvalues of 10 times magnification.
However, the eigenvalues corresponding to the slow motion have near zero imaginary
parts; therefore, the oscillations along the imaginary axis of eigenvalues 3, 4 in Fig. 6
remain insignificant.

Figure 4: Stiff Pendulum in Polar Coordinates

Figure 6: Eigenvalues of Stiff Pendulum in Polar Coordinates, epsilon = $10^{-2.5}$

Although there are ongoing developments to extend the results of Lubich to
multistage multistep methods$^{31}$, and impressive application of the slow manifold technique
in some molecular dynamic models, it is not clear that these results may apply
directly to all the types of oscillatory components in MBS. As indicated in Ref. 20, the
representation of stiff or oscillatory components in an appropriate coordinate system
of MBS is not always possible, i.e., the constraints associated with the stiff or
oscillatory potential force can be difficult to obtain in general. Moreover, convergence of
Newton's method may be an obstacle in obtaining an efficient numerical solution of an
oscillatory MBS in either of the above-mentioned approaches$^{20,28}$.

Bushing force

We have been studying more general MBS with nonlinear oscillatory components such
as a bushing force, which is often used in modelling vehicle suspension systems.
Unlike the linear spring, this element is usually an anisotropic force, i.e., it has
different spring coefficients along the principal axes of the bushing local coordinate
frame. The bushing force between body-$i$ and body-$j$ may be defined using the
relative displacement $d_{ij}$, its time derivative $\dot d_{ij}$, and the relative angle $\theta_{ij}$ and its time
derivative $\dot\theta_{ij}$ of two body-fixed local coordinate frames at the bushing location on two
bodies. Using the vectors $s'_i$ and $s'_j$ representing the bushing location in the body-$i$'s
and body-$j$'s centroid local coordinate systems, respectively, we have

Xj
d ij =
+ Ai4 - Ajs'j (5)

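where the orientation transformation matrices $A_i$ and $A_j$ are

$A_i = A(\theta_i) = \begin{bmatrix} \cos\theta_i & -\sin\theta_i \\ \sin\theta_i & \cos\theta_i \end{bmatrix}, \quad A_j = A(\theta_j) = \begin{bmatrix} \cos\theta_j & -\sin\theta_j \\ \sin\theta_j & \cos\theta_j \end{bmatrix},$

and $[x_i, y_i, \theta_i]$ and $[x_j, y_j, \theta_j]$ are Cartesian coordinates at body-fixed frames. The
bushing force $f_i$ can then be written as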
0 e o Ajd\j
A = 'A =Ai k
n & Ajdij+Ai (6)
. Jb J 0 C*

and the applied torque is

$n = k^{\theta} \theta_{ij} + c^{\theta} \dot\theta_{ij}$ (7)

where $k^x$, $k^y$, and $k^{\theta}$ are spring coefficients associated with the $x$, $y$, and $\theta$
coordinates, and $c^x$, $c^y$, and $c^{\theta}$ are the corresponding damping coefficients.

A simple example may be obtained from this model using unit mass-inertia and
gravity, grounding the first body, and setting the bushing location on the second body

as s' = [-|, 0]. A bushing element with no damping attached at the global position
of [ i , 0 ] yields

0 = »-*-(|-. 2»£j + (8a)

0 = if+ * » ( » - 2 ^ ) + 1 (8b)
f l
-i ,.„ sin , ,l T cost?, cosf sinf?

It is easy to see from (8) that the local eigenstructure of the system may change
rapidly, depending on the size of the stiffness coefficients.
I
Usingtheinitialvaluesof(i, ,,fl) = (l.l,0.1.0.0)with(fc ,fcv,fc'') = (lOMoUO*),
!

the solution of (8) exhibits high frequency oscillation for all coordinates, as shown in
Fig. 7. Solving the eigenvalue problem of (8) at each time step yields three pairs as
illustrated in Fig. 8,

Figure 7: Bushing Problem in Cartesian Coordinates

The bushing example represents a different type of oscillatory forcing function than
the stiff pendulum. For the bushing element, the coupling of translational and
rotational coordinates yields varying local eigenvectors in Cartesian coordinate space,
and two or more fast pairs of eigenvalues, see Fig. 8. The bushing force cannot
be represented as a simple potential force, nor as the constraint reaction force of
simple constraints, to which we could directly apply either the results of Lubich or
the slow manifold approach by Reich for the numerical solution.
Many methods for efficient solution of oscillatory dynamic systems are predicated
on a nearly linear form of the equation. For example, the method of averaging$^{5}$
requires the linear part of the oscillation equations of motion to be dominant, and
the mode-acceleration method$^{35}$ for structural dynamics, which eliminates higher
modes in the computation of the mode-displacement solution$^{40}$, is based on the
time-invariant eigenvalues of the structural dynamic equations. Our aim is to treat
the class of general nonlinear stiff and oscillatory forces represented in the MBS of (1).

Figure 8: Eigenvalues of Bushing Problem
3 9
One approach is based on the study of a class of MBS DAE solvers and the energy
a3 11
dissipative method > , which may damp out the oscillation that is not important.
The other approach is to localize the oscillatory components, and then apply fast
numerical solution techniques to approximate the oscillation.

2. Damping the oscillation

Given the possibility of a rapidly changing local eigenvalue structure, perhaps the
simplest strategy is to consider damping the oscillation whenever it is not impor-
tant via highly stable implicit numerical methods. Since the amount of damping is
controlled by the time-step, and automatic stepsize selection increases the time-step
whenever the solution is slowly-varying (i.e. if the amplitude of the oscillation is
small in comparison with the local error tolerances), the stepsize is increased when
the oscillation is no longer important.
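
The link between step size and damping can be seen on the scalar oscillation $y' = i\omega y$. The sketch below uses backward Euler as a stand-in for a highly stable implicit method (an assumption made for illustration; the text above does not single out this method). Its amplification factor per step is $1/(1 - i\omega h)$, with modulus $1/\sqrt{1 + \omega^2 h^2}$, so the oscillation is almost preserved for $\omega h \ll 1$ and strongly damped for $\omega h \gg 1$. The frequency and step sizes are hypothetical values.

```python
import math


def backward_euler_amplification(omega, h):
    """Amplification factor of backward Euler applied to y' = i*omega*y."""
    return 1.0 / (1.0 - 1j * omega * h)


def amplitude_after(omega, h, steps):
    """Amplitude of a unit oscillation after the given number of steps."""
    return abs(backward_euler_amplification(omega, h)) ** steps


omega = 1000.0                                   # hypothetical fast frequency
small = amplitude_after(omega, 1.0e-5, 100)      # omega * h = 0.01
large = amplitude_after(omega, 1.0e-2, 100)      # omega * h = 10

print("amplitude after 100 small steps:", small)
print("amplitude after 100 large steps:", large)
```

With the small step the oscillation is tracked with little loss of amplitude, whereas with the large step it is removed within a few steps. This is the mechanism by which automatic stepsize selection ends up damping oscillations that fall below the error tolerance.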
In recent work$^{39}$ we have considered the solution of mechanical systems with high
frequency vibrations via this type of technique. In our initial experiments with the
bushing problem (8) solved directly by low-order BDF methods, we found that the
methods experienced severe problems with Newton convergence. To overcome these
problems, we proposed a coordinate-split (CS) formulation of the equations of mo-
tion, and a Newton-type iteration for solving the coordinate-split equations at each
time step. The coordinate-split formulation eliminates problems due to obtaining an
accurate predictor for the Lagrange multiplier variables because these variables are
no longer present in the computation. We found that the coordinate-split formula-
tion worked well for several test problems involving mechanical systems with high
frequency oscillations. However, for problems with very high-frequency oscillations,
there are still difficulties with Newton convergence.

The Jacobian matrix for solving the nonlinear equations of the coordinate-split
formulation at each time step involves several terms which are complicated to compute
and which are small at the solution of the nonlinear system. These are terms of
second-order which correspond to the derivative of the projection operator onto the
constraints. Away from the solution, these terms are highly oscillatory. By neglecting
these terms, we found that the resulting Newton-type method converged much faster
for oscillating test problems like the bushing problem. We called the resulting method
the modified coordinate-split, or CM method.
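The effect of dropping Jacobian terms that vanish at the solution but oscillate away from it can be seen in a toy scalar analogue. The residual below is a hypothetical stand-in, not the coordinate-split equations: its linear part has slope 2, and the extra term is second order in the distance to the root x = 1 and oscillates rapidly in x. The iteration uses only the constant slope as its iteration matrix.

```python
import math

def residual(x):
    # Toy residual: linear part plus a term that is second order at the root x = 1
    # and highly oscillatory away from it (assumed model, for illustration only).
    e = x - 1.0
    return 2.0 * e + 0.5 * e * e * math.cos(30.0 * x)

def newton_type(x0, approx_jacobian=2.0, tol=1e-12, max_iter=20):
    """Newton-type iteration that ignores the derivative of the oscillatory term."""
    x = x0
    for k in range(max_iter):
        r = residual(x)
        if abs(r) < tol:
            return x, k
        x -= r / approx_jacobian
    return x, max_iter

root, iterations = newton_type(1.5)
```

Because the neglected derivative terms vanish at the root, the simplified iteration still contracts rapidly near the solution while avoiding the evaluation of the oscillatory derivative.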

The modified coordinate-split (CM) method performed extremely well in numerical
experiments described in [39] and in other experiments we performed subsequently.
The constraints $g(p) = [g_1, \ldots, g_5]^T$ of a two-body pendulum may be written as

$g_1 = x_1$   (9a)
$g_2 = y_1$   (9b)
$g_3 = \theta_1$   (9c)
$g_4 = (x_1 - x_2)^2 + (y_1 - y_2)^2 - 1$   (9d)
$g_5 = \theta_2$   (9e)

where $x_i$ and $y_i$, $i = 1, 2$, are Cartesian coordinates of the center of mass of body $i$,
and $\theta_i$ is the orientation coordinate of the body centroid reference coordinate system,
and the length of the pendulum is 1. Applying the bushing force (6) with $[k^x, k^y, k^\theta]$
= [1000,1000,1000] and $[c^x, c^y, c^\theta]$ = [10,10,10] to the pendulum, small oscillations of
the numerical solution appear. Using the initial values q = [0,0,0,9.9989e-1,-1.4852e-2,0]
and v = [0,0,0,-6.75e-5,-4.5444e-3], numerical results from the BDF code DASSL [27]
are contained in Table 1, in which error test failures (etf-s) and convergence test
failures (ctf-s) are listed. We denote by CS the coordinate-splitting formulation,
LG the stabilized index-2 formulation proposed by Gear [13], CM the coordinate-
split form using a modified iteration matrix with the second-order derivative terms
omitted, and LM the modified LG using the new predictor of the multipliers by the
CS method. Using simplified Newton iterations and the corresponding modified local
error estimate, CS, CM, LG and LM obtain consistent results.

To see the effect of more severe oscillation, we increased the spring constant of
the bushing to $10^5$. Time steps of these methods selected by DASSL are shown in
Figure 9. Clearly, CM took much larger steps than the other methods. Moreover,
if the spring constant is increased to $10^6$, we found severe convergence problems for
LG, CS and LM; the results are contained in Table 2.

Method   TOL      steps    f-s    J-s   etf-s   ctf-s
CS       10^-3       62    156     48       0      13
CM       10^-3       62    156     48       0      13
LM       10^-3       62    154     48       0      13
LG       10^-3       59    141     46       0      12
CS       10^-4       77    193     65       1      16
CM       10^-4       77    193     65       1      16
LG       10^-4       61    136     27       1       5
LM       10^-4       77    193     65       1      16
CS       10^-5       87    215     54       0      13
CM       10^-5       87    215     54       0      13
LG       10^-5      108    259     77       1      21
LM       10^-5       87    215     54       0      13
CS       10^-6      138    343     97       1      25
CM       10^-6      138    343     97       1      25
LG       10^-6      131    308     65       0      16
LM       10^-6      138    343     97       1      25

Table 1: Simple Pendulum with a Bushing Force, Spring Constant = $10^3$

Method   TOL      time     steps     f-s    J-s   etf-s   ctf-s
CS       10^-4    0-0.1     2252    4901   3361       1    1120
CM       10^-4    0-0.1       20      40      7       0       0
LG       10^-4    0-0.1     5267   10536   7899       0    2633
LM       10^-4    0-0.1     2251    4900   3360       1    1120

Table 2: Results of Bushing, Spring Constant = $10^6$, Damping = 10

The amount of damping in the CM method is controlled via automatic timestep
selection. In real-time simulation, it is not always practical to vary the timestep.
Hence we have also been investigating the possibility of damping the oscillation via
changes to the method parameters. This idea has often been considered in structural
analysis, where the Newmark β-method [23] is commonly used to selectively damp
high frequency oscillations. The β-method bears some similarity to the θ-method of
numerical ODEs, however it applies directly to second-order ODE systems such as the
equations of motion. The parameters in these methods are used to vary the damping
(stability) properties. Usually the methods are first order, however for the critically
damped parameters they are second-order. Hence when any significant amount of
damping is added into the method it becomes first-order. This deficiency has been
overcome via the α-modification [11] of the β-methods. By adding one additional
parameter, the α-methods are able to achieve second order and selectively damp high-
frequency components. Recently Cardona and Geradin [8] have considered extending
the α-methods to second-order DAE systems from multibody dynamics, however the
new methods are plagued by oscillations which we believe are largely unphysical.

Figure 9: Time Steps Used in Solving the Bushing Problem, Spring Constant = $10^5$

We can extend the α-methods to DAEs in a way which does not introduce addi-
tional oscillations. Given a second-order ODE,

$\ddot{y} = f(t, y, \dot{y})$   (10)

the α-method for this system is given by

$a_{n+1} = (1 + \alpha) f(t_{n+1}, d_{n+1}, v_{n+1}) - \alpha f(t_n, d_n, v_n)$   (11a)
$d_{n+1} = d_n + h v_n + h^2 [(\tfrac{1}{2} - \beta) a_n + \beta a_{n+1}]$   (11b)
$v_{n+1} = v_n + h [(1 - \gamma) a_n + \gamma a_{n+1}]$   (11c)

where $\gamma = 1/2 - \alpha$, $\beta = (1 - \alpha)^2/4$, for $\alpha \in [-1/3, 0]$, and d, v, and a are the position,
velocity and acceleration, respectively.
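A minimal sketch of one way to apply the α-method to a scalar linear oscillator follows. It assumes the standard Hilber-Hughes-Taylor form with gamma = 1/2 - alpha and beta = (1 - alpha)^2/4, and a right-hand side f = -omega^2 d - c v, for which the implicit stage can be solved in closed form. The function names and parameter values are illustrative assumptions.

```python
def alpha_method_oscillator(omega, damping, alpha, h, steps, d0=1.0, v0=0.0):
    """Integrate d'' = -omega^2 d - damping*d' with the alpha-method (linear case)."""
    gamma = 0.5 - alpha
    beta = (1.0 - alpha) ** 2 / 4.0

    def f(d, v):
        return -omega ** 2 * d - damping * v

    d, v = d0, v0
    a = f(d, v)  # consistent initial acceleration
    for _ in range(steps):
        # Predictors: the parts of the position and velocity updates known from step n
        d_pred = d + h * v + h * h * (0.5 - beta) * a
        v_pred = v + h * (1.0 - gamma) * a
        # f is linear, so the implicit equation for the new acceleration is scalar
        rhs = (1.0 + alpha) * f(d_pred, v_pred) - alpha * f(d, v)
        denom = 1.0 + (1.0 + alpha) * (omega ** 2 * h * h * beta + damping * h * gamma)
        a_new = rhs / denom
        d = d_pred + h * h * beta * a_new
        v = v_pred + h * gamma * a_new
        a = a_new
    return d, v

def energy(omega, d, v):
    return 0.5 * v * v + 0.5 * omega ** 2 * d * d
```

With alpha = 0 the scheme reduces to the undamped average-acceleration rule, which preserves the energy of an undamped linear oscillator; a negative alpha leaves well-resolved motion nearly untouched and strongly damps components whose period is comparable to or shorter than h.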

To extend this method to the DAEs which describe multibody systems,

$M(y)\ddot{y} = f(t, y, \dot{y}) + G^T(y)\lambda$   (12a)
$0 = g(y)$   (12b)

7
where G = dg/dy, we consider a class of methods of the following type,

$M(d_{n+1}) a_{n+1} = (1 + \alpha) f(t_{n+1}, d_{n+1}, v_{n+1}) - \alpha f(t_n, d_n, v_n)$   (13a)

1
d n+l = d + hv +
n n h' [(^-0)a +0a ] n n+l

+ ^ G r ^±i_rA
f K + i ( 1 3 b )

T dn+l d
v a+1 = v + h[(l - ) a „ +
a 7 7 <VH] + hG ( + ")A n + 1 (13c)
9(<Wi) = 0 (13d)
T
G (d )v nirl = 0
n+1 (13e)

The new method projects the solution at the internal stages onto the constraints
similarly to the Projected Implicit Runge-Kutta methods introduced in [2]. We note
that $a_{n+1}$ in (13) is not the acceleration. However, the actual accelerations can be
computed via a post-processing step if they are needed. Using the concept of
essential underlying ODE introduced in [3], we can show that for linear model problems,
this method corresponds to discretizing the essential underlying ODE directly by the
α-method, up to terms of order $O(h^3)$, and is second order for the position variables.
Further analysis of the method, testing and extension of these ideas to higher order
methods remains to be done.

3. Localizing the Oscillation

Often in multibody systems the components exhibiting high frequency oscillation
are modeled using very stiff linear springs or nonlinear springs of the dominant first
order term. In the case of multibody mechanical systems, there are usually nonlin-
ear transformations applying to the spring forces to obtain equations of motion in
33,M 37
the generalized coordinate space > . The resulting system of equations contains
nonlinear high frequency oscillatory forces.

One approach to solving the high frequency oscillation problem is to carry out
modal analysis and then eliminate the higher modes, since lower modes may preserve
the slowly varying part of the solution [9]. For example, the extreme high modes of a
structure are often rejected in modeling flexible effects of mechanisms, since the details
of the oscillating solution are not so important as the long-term solution behavior. A
similar approach has been developed in recent work on molecular dynamics simulation
[42, 43]. However, due to nonlinear oscillatory forces in the multibody formulation, the
modal analysis needs to be carried out at each time step to resolve the rapidly varying
local eigenvalue structure of the system, resulting in very costly computations.

Another approach is to resolve the oscillation efficiently via look-up tables. This
idea is frequently used in a real-time simulation environment [12, 34]. One such example
is the modeling of contact compliance in rigid body simulation, where the localized
oscillation may be of interest. Applying linear constitutive laws to modeling the
contact compliance, e.g., the elastic half space theory, Boussinesq's influence func-
tions and Hertz' contact model, leads to linear spring forces between contact bodies
[38, 14, 21]. The spring coefficients may be very large since the contact deformations are
small compared to the gross motion of the contacting bodies. The advantage of us-
ing table-look-up is efficiency; however, it is not clear how the variable stepsize and
order numerical integration should interact with the tables to maintain efficiency and
accuracy.
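As a rough illustration of the table-look-up idea, the sketch below tabulates a force law on a uniform grid and evaluates it by linear interpolation. The contact law, its stiffness of 1e6, the grid bounds and the function names are all hypothetical choices made for the example; they are not taken from the cited real-time simulation work.

```python
import bisect

def build_table(force, x_min, x_max, n):
    """Sample a force law at n uniformly spaced points."""
    xs = [x_min + (x_max - x_min) * i / (n - 1) for i in range(n)]
    return xs, [force(x) for x in xs]

def lookup(table, x):
    """Piecewise-linear interpolation, clamped to the tabulated range."""
    xs, fs = table
    x = min(max(x, xs[0]), xs[-1])
    i = min(max(bisect.bisect_right(xs, x) - 1, 0), len(xs) - 2)
    w = (x - xs[i]) / (xs[i + 1] - xs[i])
    return (1.0 - w) * fs[i] + w * fs[i + 1]

def contact_force(delta):
    # Hypothetical one-sided linear contact spring (assumed stiffness 1e6)
    return 1.0e6 * delta if delta > 0.0 else 0.0

table = build_table(contact_force, -1e-3, 1e-3, 201)
```

Each evaluation costs a binary search and one interpolation regardless of how expensive the underlying law is, which is the efficiency argument; how the table resolution should be matched to a variable stepsize integrator is exactly the open question raised above.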
In a numerical method such as multistep or Runge-Kutta, which are based on
approximating the solution locally, the stepsize must be chosen very small to resolve
the high-frequency oscillation in the system. Moreover, due to the nonlinear transfor-
mation that places oscillating components in the space of the generalized coordinates,
e.g., in the form of (1), the numerical method may become ineffective since the eigen-
values may change rapidly as shown in the previous examples. Our goal of treating
MBS with highly oscillatory components (1) is to develop numerical methods that
localize the oscillation and approximate the high frequency components properly.

Modal analysis in structural dynamics is well-developed and implemented in pro-
duction software [9, 34]. As shown in [8], combining structural dynamic subsystems with
the DAE of MBS, high frequency oscillating solutions may occur. Based on the solu-
tion of this class of nonlinear oscillations [5], we propose a new approach to localizing
the oscillating components by utilizing the theory and numerical solution of DAEs.

It is possible to rewrite the equations of motion of the bushing force in a local coor-
dinate system so that the eigenvalue structure is nearly linear during simulation. We
propose to explore this idea in modeling and solving complex multibody dynamic sys-
tems. By introducing virtual coordinates into the system equations, we may localize
the nonlinear oscillation terms. Using these new variables, i.e. the virtual coordinates,
we may approximate the oscillatory subsystems by linear differential equations of the
virtual coordinates.

Using the bushing problem as an example, we consider the local relative displace-
ment d = Ajdjj and the relative angle 0 = &ij that comprise another set of coordinates
T T T T y T
9 = [£>y.0] = [d,f?] , and denote the velocity by v = [Jl V,£)] , where V = [v',v ] .
The Newton-Euler equations corresponding to the bushing in the new coordinate
system become

0 mx + my& + ft+Q'(q,q) (14a)


0 m$-mx'(j+ft + Q>>{q, ) q (14b)
0 f& + n~ s\ft + ' fts 2 + - k») \4
s (14c)

where the bushing force and torque are

k1 ' k* 0 1 " X'


+ <* 0 ' r «• 1
A= 0 k
0 &
. h "J

Tj, = O + c"w.
The applied force excluding the bushing force can be expressed by
2
1
Q " 2myw - m i w — mgsinfi
•5(1,5) = —2ma:<D - myu + mgcos 1

T
and s' = [s'i,s' )
2 is the bushing attachment point in the body-fixed centriod frame
of body j , u is the angular velocity. It is easy to see that the bushing force / j and
the applied torque of (14c) in the new coordinates become linear functions of q and q.
In the cases of high stiffness or oscillation, these linear terms are the dominant part
of the equations of motion (14). Note that the new coordinate frame is parallel to
the body-fixed centroid coordinate frame. The transformation between the Cartesian
coordinates q and the new local coordinates q can be defined by

[si A(0fdis (15a)

-§ (15b)
(15c)

w = —w (15d)
z s r
where v = [v ,v ,ut\ is the velocity of q, and v =
1 S
[« ,U ,CJ] ' t
is the velcity of q.

Applying the above transformation to the previous 2D bushing example, we com-


T
bine the virtual coordinates q = [x, y, u)] into Eqs. (8) yielding

0 = x + coa&x — siaOy (16a)


0 = y + sin S i + cosdy + 1 (16b)
0 = ij + I J (16c)
0 = i + fc'i (16d)
0 = $+ (16e)
e
0 = ti + k 0 - —$

where the dynamics associated with the bushing force is approximated in the linear
differential equations (16d, 16e,16f), and corresponding constraints are (15).

In (1), the spring-damper forces have been described by measurement of real-


tive distance or angles between points and local coordinate frames, respectively. By
introducing additional states, the virtual coordinates q,

= 0 (17a)
l - ^ q = 0 (17b)

into the system, we may rewrite each stiff or oscillatory force component as =
Ti(Ci§j+Aigi), for i € {!,...,»»/}. Denoting the forces by the virtual coordinates q =
T
[ffi, — <9nj\ and its velocity q = j j , we introduce the notion of virtual acceleration,
q = jg for some q(t) = rhq(t) + b(t), such that (1) may be reformulated as
T A
M{q)q-rT(q)q-+G (q)\-Q (q,q,t) = 0 (18a)
q-+Cq-+Kq = 0 (18b)
g(q) = 0 (18c)
q-r,(q) = 0, (18d)

a combined system of DAEs. In general, C and K may be slowly varying functions of


o, due to plasticity of the force components. Note that (18b) may not be consistent
with the dynamics of the original system, e.g., differentiating (17b) one obtains the
consistent acceleration g , Nevertheless, substituting the virtual acceleration q into
(18a), we obtain the original differential equations. It is easy to see that the resulting
DAE has the same index and degrees of freedom as the original system (1), by writing
T
the combined system (18) of q = [q, q\ in matrix form:

T A
Miq)q + G (q)\-Q (q,q,t) = 0 (19a)
m = o (19b)

where
M = M{q) T(q)
0
A

Q = ' -Cq
A Q {q,q,*)
- Kq

9 = .' ? - if?)
m

Depending on the initial values, (19) and the new state variables $(q, \hat{q})$ may be partitioned
into a stiff and a non-stiff part.

Our objective is to develop a method that takes large time-steps relative to the
high-frequency oscillations. To begin, notice that (18b) and (18d) define for each
q a highly oscillatory subsystem which can be solved exactly. However, if we were
to solve this subsystem exactly and substitute q into (18a), this is equivalent to a
rapidly vibrating force (a high-frequency forcing function) which would require small
time steps for its resolution. Instead, we can do a local eigenanalysis of (18b) to
identify the high-frequency modes. Letting these high frequencies tend to infinity, we
will replace q in (18a) by its average, yielding a smooth approximation for q.

For rigid body mechanisms, in most cases the stiff subsystems will be small.
For flexible body mechanical systems, the stiff subsystems arise as part of the finite
element method analysis, and will be much larger. These subsystems can be dealt
with using mode superposition via a reduced system described by the Ritz vectors or
the Lanczos vectors, as described in [36, 18, 41, 32, 24].

1. S. S. Ashour and O. T. Hanna, Explicit exponential method for the integration
of stiff ordinary differential equations, J. Guidance 14 (1991), 1234-1239.
2. U. Ascher and L. Petzold, Projected implicit Runge-Kutta methods for
differential-algebraic equations, SIAM J. Numerical Analysis 28 (1991), 1097-1120.
3. U. Ascher and L. Petzold, Stability of computational methods for constrained
dynamics systems, SIAM J. SISC 14 (1993), 95-120.
4. C. Bischof, A. Carle, G. Corliss, A. Griewank and P. Hovland, ADIFOR
Generating derivative codes from Fortran programs, Scientific Programming 1
(1992), 11-29.
5. N. N. Bogoliubov and Y. A. Mitropolski, Asymptotic Methods in the Theory
of Nonlinear Oscillations, Hindustan Publishing Corp., Delhi, India, 1961.
6. K. E. Brenan, S. L. Campbell and L. R. Petzold, Numerical Solution of Initial-
Value Problems in Differential-Algebraic Equations, Elsevier Science Publish-
ers, 1989.
7. P. N. Brown, A. C. Hindmarsh and L. R. Petzold, Using Krylov methods in
the solution of large-scale differential-algebraic systems, to appear, SIAM J.
on Scientific Computing.
8. A. Cardona and M. Geradin, Time integration of the equations of motion in
mechanism analysis, Computers and Structures 33 (1989), 801-810.
9. R. R. Craig, Structural Dynamics, Wiley, 1981.
10. N . Fenichel, Geometric singular perturbation theory for ordinary differential
equations, J. Diff. Eq, (31), 53-98, 1979.
11. H. H. Hilber, T. J. R. Hughes and R. L. Taylor, Improved numerical dissipation
for time integration algorithms in structural dynamics, Earthquake Engineering
and Structural Dynamics 5 (1977), 283-292.
12. R. M. Howe and K. C. Lin, The use of function generation in the real-time
simulation of stiff systems, Proc. of AIAA Flight Simulation Technologies,
Dayton, OH, (1990), 217-224.
13. C. W. Gear, G. K. Gupta and B. J. Leimkuhler, Automatic integration of the
Euler-Lagrange equations with constraints, J. Comp. Appl. Math., vol. 12 &
13, 1985, 77-90.
14. K. L. Johnson, Contact Mechanics, Cambridge University Press, 1985.
15. T. R. Kane and D. A. Levinson, Formulation of equations of motion for com-
plex spacecraft, J. Guidance and Control 3 (1980), 99-112.
16. D. Karnopp, The energetic structure of multibody dynamic systems, J. Franklin
Inst. 306 (1978), 165-181.
17. N. Kopell, Invariant manifolds and initialization problem for some atmo-
spheric equations, Physica D., (14), 203-215, 1985.
18. P. Leger and E. L. Wilson, Generation of load dependent Ritz transformation
vectors in structural dynamics. Eng. Comput. 4 (1987), 309-318.
19. K.C. Lin and R. M. Howe, Speed and memory requirements for different meth-
ods of multivariate function generation in real-time simulation, reprint.
20. Ch. Lubich, Integration of stiff mechanical systems by Runge-Kutta methods,
ZAMP, Vol. 44, 1022-1053,1993.
21. A. I . Lure, Three-dimensional Problems of the Theory of Elasticity. Inter-
science, New York, English translation by J.R.M. Radok, 1964.
22. R. S. Maier, L. R. Petzold and W. Rath, Parallel solution of large-scale
differential-algebraic systems, Technical Report TR 94-10, University of Min-
nesota, Department of Computer Science, to appear, Concurrency: Practice
and Experience, 1994.
23. N. M. Newmark, A method of computation for structural dynamics, J. of En-
gineering Mechanics Division, Proc. of ASCE (1959), 67-94.
24. B. Nour-Omid, Applications of the Lanczos algorithm, Comp. Phy. Comm.,
53, 1989.
25. B. Nour-Omid and M. E. Regelbrugge, Lanczos method for dynamic analysis
of damped structural systems. Proceedings of the Sixth International Modal
Analysis Conference.
26. L. R. Petzold, An efficient numerical method for highly oscillatory ordinary
differential equations, SIAM J. Numer. Anal. 18 (1981), 455-479.
27. L.R. Petzold, A description of DASSL: a differential/algebraic system solver,
Proc. 10th IMACS World Congress, August 8-13 Montreal 1982.
28. S. Reich, Numerical integration of highly oscillatory Hamiltonian systems using
slow manifolds, Beckman Institute, University of Illinois, UIUC-BI-TB-94-06,
1994.
29. R. E. Roberson and R. Schwertassek, Dynamics of Multibody Systems,
Springer-Verlag, New York, NY, 1988.
30. Y. Saad and M. H. Schultz, A generalized minimal residual algorithm for
solving nonsymmetric linear systems, SIAM J. Sci. Stat. Comp. 7 (1986),
856-869.
31. S. Schneider, private communication, 1994.
32. H. C. Chen and R. L. Taylor, Using Lanczos vectors and Ritz vectors for
computing dynamic responses, Eng. Comput. 6 (1989), 151-157.
33. R. A. Wehage and M. J. Belczynski, Constrained multibody dynamics, preprint,
1994.
34. R. A. Wehage, private communication, 1994.
35. D. Williams, Dynamic loads in aeroplanes under given impulsive loads with
particular reference to loading and gust loads on a large flying boat. Great
Britain RAE Reports SME 3309-3319,1945.
36. E. L. Wilson, M. Yuan and J. M . Dickens, Dynamic analysis by direct super-
position of Ritz vectors, Earthquake Engineering and Structural Dynamics 10
(1982), 813-821.
37. J. Wittenburg, Dynamics of Systems of Rigid Bodies, B. G. Teubner, Stuttgart,
1977.
38. S. C. Wu, S. M. Yang, E. J. Haug, Dynamics of mechanical systems with
Coulomb friction, stiction, impact and constraint addition-deletion II. Planar
systems. Mech. Mach. Theory 21, 407-416, 1986.
39. J. Yen and L. Petzold, On the numerical solution of constrained multibody
dynamic systems. University of Minnesota AHPCRC 94-038, 1994.
40. W. S. Yoo and E. J. Haug, Dynamics of articulated structures. Part I. Theory,
J. Struct. Mech., 14 (1986), 105-126.
41. M. Yuan, P. Chen, S. Xiong, Y. Li and E. Wilson, The WYD method in large
eigenvalue problems. Eng. Comput. 6 (1989), 49-57.
42. G. Zhang and T. Schlick, The Langevin/implicit Euler/normal-mode scheme
for molecular dynamics at large time steps, J. Chem. Phys. 101 (1994), 4995-5012.
43. G. Zhang and T. Schlick, LIN: A new algorithm to simulate the dynamics of
biomolecules by combining implicit-integration and normal mode techniques, J.
Comp. Chemistry 14 (1993), 1212-1233.

Existence and Uniqueness of Quasiperiodic Solutions to
Quasiperiodic Nonlinear Differential Equations

Yoshitane SHINOHARA,
Atsuhito KOHDA
and
Hitoshi IMAI
Department of Mathematics, Faculty of Engineering,
Tokushima University, Tokushima 770, Japan

Abstract
This paper is concerned with the existence and the uniqueness
of quasiperiodic solutions to quasiperiodic nonlinear differential equa-
tions in the neighborhood of quasiperiodic solution to linear differen-
tial equation or in the neighborhood of Galerkin approximation to the
nonlinear differential equations. By defining the generalized exponen-
tial dichotomy, our Theorem 4 will be useful independently of whether
the nonlinearity is weak or not.
Some numerical results are shown. These results show that our
analysis is useful for mathematical investigation of quasiperiodic phe-
nomena such as the design of communication circuits.

1 Introduction
The most fundamental problem in nonlinear oscillations is to find the peri-
odic or quasiperiodic solutions to the nonlinear ordinary differential equations
such as

$\frac{d^2x}{dt^2} + \alpha\frac{dx}{dt} + \beta x + \gamma x^3 = P\cos\nu t$,   (1)

and

$\frac{d^2x}{dt^2} - 2\lambda(1 - x^2)\frac{dx}{dt} + x = a\cos\nu_1 t + b\cos\nu_2 t$,   (3)

where $\alpha$, $\beta$, $\gamma$, $P$, $\nu$, $\lambda$, $a$, $b$, $\nu_1$ and $\nu_2$ are positive constants.
But, it is very difficult in general to find the exact solutions in analytical
form. Thus, we are obliged to study the solutions by numerical methods. As
for the periodic solutions to nonlinear periodic systems and also to nonlinear
1 8 M 1 5 2 1 9 I 0 1 3 , 2 2 2 3
autonomous systems we refer to the papers - ' - - - ' - -
From a practical viewpoint, a harmonic balance analysis of nonlinear
quasiperiodic microwave circuits has been given by Maas [5] in view of qualita-
tive applications, but he is concerned with neither the existence analysis nor
the error analysis.
Chua and Ushida [2] have presented two efficient algorithms for obtaining
steady-state solutions to nonlinear quasiperiodic circuits and systems driven
by two or more distinct frequency input signals. They have calculated some
approximate solutions to Duffing type equation with two frequency inputs
and they have given error estimation, but they are not concerned with the
existence analysis of the exact solutions.
In the present paper, we will show that we can indeed verify the existence
of an exact solution and know the error bound of the approximate solution to
nonlinear quasiperiodic differential equation. By making use of the general-
ized exponential dichotomy, we will be able to strengthen the error estimation
of the approximate solutions.
Numerical examples concerned with the Duffing type equation are given.

2 Existence and uniqueness theorem


A function $f(t) \in C(\mathbf{R}; \mathbf{R}^d)$, where R denotes the real line hereafter, is
said to be quasiperiodic with periods $\omega_1, \ldots, \omega_m$ if $f(t)$ is represented as

$f(t) = f_0(t, \ldots, t)$ for all $t \in \mathbf{R}$,   (4)

for some continuous periodic function $f_0(u_1, \ldots, u_m)$ with period $\omega_i$ in each
$u_i$. Without any loss of generality, we may assume that $\omega_1, \ldots, \omega_m$ are all
positive and further that reciprocals of these periods are rationally linearly
independent (see Urabe [17]). A function $f(t)$ is said to be almost periodic if
from every sequence $\{\alpha_n\}$, one can extract a subsequence $\{\alpha'_n\}$ such that
$\{f(t + \alpha'_n)\}$ is uniformly convergent on R. We assume that all functions
considered in the present paper are continuous on R. It is known in the
paper [3] that the limit value

exists for any almost periodic function / ( ( ) and any real a and that there is a
countable set E of real numbers such that a(f,o) = 0 i f a & S. The module
of / , M o d ( / ) , is defined to be the smallest additive group of real numbers
that contains the set S for which a(f,a) ^ 0 i f a € E.
According to the results of the papers [3, 12] we have

Proposition 1 Let $\{f_n(t)\}$ $(n = 1, 2, \ldots)$ be a sequence of quasiperiodic
functions with periods $\omega_1, \ldots, \omega_m$ and let $f(t)$ be the uniform limit of $f_n(t)$
as $n \to \infty$, then $f(t)$ is also quasiperiodic with the same periods.
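The definition of a quasiperiodic function as the diagonal restriction of a multiply periodic function can be made concrete with a small sketch. The particular function f0 and the periods W1 and W2, whose ratio is irrational, are arbitrary choices for illustration.

```python
import math

W1 = 2.0 * math.pi                    # assumed first period
W2 = 2.0 * math.pi / math.sqrt(2.0)   # assumed second period, W2/W1 irrational

def f0(u1, u2):
    """Continuous function with period W1 in u1 and period W2 in u2."""
    return math.cos(2.0 * math.pi * u1 / W1) + 0.5 * math.sin(2.0 * math.pi * u2 / W2)

def f(t):
    """Quasiperiodic function obtained on the diagonal u1 = u2 = t."""
    return f0(t, t)
```

Here f(t) equals cos(t) + 0.5 sin(sqrt(2) t), which is not periodic in t although f0 is periodic in each argument separately.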


Consider a linear differential operator

$Lz = \frac{dz}{dt} - A(t)z$,   (5)

where A(t) is an almost periodic or quasiperiodic matrix. Let $\Phi(t)$ be the
fundamental matrix of the linear homogeneous equation

$Lz = 0$   (6)

satisfying the initial condition $\Phi(0) = E$ (unit matrix).
The linear homogeneous Eq. (6) is said to satisfy a generalized expo-
nential dichotomy if there exist a projection P, positive constants $\sigma_1$, $\sigma_2$ and
nonnegative functions $C_1(t,s)$, $C_2(t,s)$ such that

(i) $\|\Phi(t) P \Phi^{-1}(s)\| \le C_1(t,s)\, e^{-\sigma_1 (t-s)}$ for $t > s$,

(ii) $\|\Phi(t)(E - P)\Phi^{-1}(s)\| \le C_2(t,s)\, e^{-\sigma_2 (s-t)}$ for $t < s$,

(iii) the integral

is bounded on R by a positive number M.

Here we introduce the i ^ n o r m || -1| in Euclidean space and denote that | [ / | | =


sup H/(t)|| for any bounded function / — / ( f ) .
( e R

We have the following two propositions.


Proposition 2 (Shinohara et al. [12]) Let A(t) be an almost periodic matrix.
Suppose that the Eq. (6) satisfies the generalized exponential dichotomy and
that $f(t)$ is an almost periodic function. Then there is a unique almost peri-
odic solution $z(t)$ of the inhomogeneous equation

$Lz = f(t)$   (7)

and the modules satisfy the relation

$\mathrm{Mod}(z) \subset \mathrm{Mod}(A, f)$,   (8)

where $\mathrm{Mod}(A, f)$ is the smallest additive group of real numbers that contains


the countable set E for which a(f,o) - 0 and a(a^,jff) — 0 if a & E and

Proposition 3 (Shinohara et al. [12]) Let A(t) be a quasiperiodic square ma-
trix with periods $\omega_1, \ldots, \omega_m$. Suppose that the Eq. (6) satisfies the generalized
exponential dichotomy. Then for any quasiperiodic function $f(t)$ with peri-
ods $\omega_1, \ldots, \omega_m$ the inhomogeneous Eq. (7) has a unique quasiperiodic solution
$z(t)$ with the same periods given by

(9)

where
for t > s,
for t < s.
Moreover the solution $z(t)$ satisfies the relation

$\|z\| \le M \|f\|$.   (10)

Our numerical analysis for the quasiperiodic Duffing type equation is


based on the following existence theorem.

Theorem 4 Given a nonlinear differential equation

$\frac{dz}{dt} = X(t, z)$,   (11)

where z and X(t, z) are vectors and X(t, z) is quasiperiodic in t with periods
$\omega_1, \ldots, \omega_m$ and is continuously differentiable with respect to z belonging to a
region $\mathcal{D}$ of z-space.
Suppose that there is a quasiperiodic function $z_0(t)$ with periods $\omega_1, \ldots, \omega_m$
such that

$z_0(t) \in \mathcal{D}$,
$\left\| \frac{dz_0}{dt} - X(t, z_0(t)) \right\| \le r$

for all $t \in \mathbf{R}$. Further suppose that there are a positive number $\delta$, a nonneg-
ative number $\kappa < 1$ and a quasiperiodic matrix A(t) with periods $\omega_1, \ldots, \omega_m$
such that

(i) the linear differential Eq. (6) satisfies a generalized exponential di-
chotomy,

(ii) $\mathcal{D}_\delta = \{z \mid \|z - z_0(t)\| \le \delta$ for some $t \in \mathbf{R}\} \subset \mathcal{D}$,
$\|\Psi(t, z) - A(t)\| \le \kappa / M$ whenever $\|z - z_0(t)\| \le \delta$,
$\frac{Mr}{1 - \kappa} \le \delta$.

Here $\Psi(t, z)$ is the Jacobian matrix of X(t, z) with respect to z and the quan-
tity M is given in Eq. (10).
Then the given Eq. (11) possesses a solution $z = z(t)$ quasiperiodic in t
with periods $\omega_1, \ldots, \omega_m$ such that

$\|z_0(t) - z(t)\| \le \frac{Mr}{1 - \kappa}$   (12)

for all $t \in \mathbf{R}$. Furthermore, to the Eq. (11) there is no other quasiperiodic
solution belonging to $\mathcal{D}_\delta$ besides $z = z(t)$.

3 Quasiperiodic Solution to the Duffing Type Equation
We shall first consider the following linear differential equation

where ft, v are constants such that V > &, ft# 0, and /(<) is a quasiperiodic
dx
function with periods W i , . . . , w . Putting y = —,
m

the Eq. (13) can be written in the vector form as follows :

i = ^ + n « ) . (14)

Let £ be the differential operator given by

U = % - Az, (15)
at
then the fundamental matrix $ ( ( ) of the linear system Lz = 0 such that
$(0) - E is given by $(£} = e x p t A , which will be called matrizant of L . In
what follows, we denote by ||-]| the following ( norm of vectors and matrices:
x

\\v\\ = max|v,| for vector v with components Uj, (16)


||*|| — m a x E j \<pij\ for matrix 4> with components i>ij- (17)

The matrizant satisfies the inequality

| | * ( t ) | | < Koe-"". (18)


Here quantities K 0 and <To are specified in the following three cases :
(i) when \fi\ > v,

(ii) when — f,
K 0 = K (t)
0 = m a x { l + \pt\ + \t\ , 1 + H +
0q = fi,

(iii) and when \fi\ < u,

m=fi,
l l
where a = -fi - %Jfi — v , 0 = -ft + y/fi' — i>*.
From Eq. (18) and Proposition 2, the Eq. (6) satisfies the generalized
exponential dichotomy, because we can choose the matrix P such that

when p. < 0,
when fi> 0.

Consequently, we get the following theorem.

T h e o r e m 5 If ft j£ O, then the equation Lz — 0 satisfies the generalized


exponential dichotomy and the unique quasiperiodic solution

i
z = z (t) =
0 {x (t),y {t))
0 0 (19)

with periods w i , . . . , w m to the Eq. (14) is given by

(20)

where the Green function G(t, s) is specified in the following two cases:

(i) when fi>0,


for t>s
for t < s

(ii) when ft < 0,


for t>s,
for t < s.

Moreover, we have
\\G(t,s)\\<K e-^, 0 (2D

where
3| when \n\ > v,
a =
\fi\ when \n\ < V.
Next, consider the Duffing type equation w i t h quasiperiodic forcing term
such as
2 3
+ %j£ + v x — ex -H a cos v t + b cos v t, (22)
x 2

dV at
where p, i / , i/j and vi are all positive constants, e, a and 6 are parameters.
Further v\ — 2ir/wi, ^ = 27r/w and the ratio w / w i is irrational.
2 2

The Eq. (22) can be written in the vector form

^- = Az + m + »Jt4 (23)
dt

where

' x 0 \ / 0
2 = = V
U J' ^ ( ^ c o s ^ t + )>cos^t I ' = '

and
1
A
A
- ( ° 2
" ^ -K -2M
Let L be the differential operator defined by

dw , ,,
I t o = — - Aw, (24)

then Theorem 5 tells us that the equation Lw — 0 defined by Eq. (24) satisfies
the generalized exponential dichotomy for fi ^ 0 and that the linear operator
G defined by Gd> — w, which means

G(t,s)4>(s) ds = w{t),
CO
satisfies the inequality
IIGII < M, (25)
where
K0
= if n > v > 0,
ft- vV -v 2

M - [' Koi^e^-Ua if (1 = ?,

J—oo

^ if o<n<v,

and K Kn(t) are given in Eq. (18).


0l

The quasiperiodic solution of the linear equation


Lz = <p(t)

is given by z = zo(t) = ^ j, where

- ( y i_^2 + ^ { ( ^ - " l ) '

2 c o s + 2 2 s i n 1 / 2 ( 1
V - ^ + W ^ " # * ^

Since it is easy to see

2 2 2 2 A 2
[v - fi )cosi/it + 2/j,ViSmi/it = \J(v - v?) + n v}sin^f + a*)

for i = 1,2, we have

, . BSfnfi'if + cvi) 6sin(f2i + a ) 2


x
o{t) = - 7 = „,„ , „ „ +

where 2
f -

By differentiation, we have

ai/iCos(vi( + Qi) 6f2Cos(i/2r + a2)


2
j(v -v ) 2 2
+ ifi v 2 2
7(f -f|) +4 V '
2 2
M
2

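So we have the following inequalities for any $t \in R$:

    $|x_0(t)| \le \frac{|a|}{\sqrt{(\nu^2-\nu_1^2)^2 + 4\mu^2\nu_1^2}} + \frac{|b|}{\sqrt{(\nu^2-\nu_2^2)^2 + 4\mu^2\nu_2^2}}$

and

    $|y_0(t)| \le \frac{|a\nu_1|}{\sqrt{(\nu^2-\nu_1^2)^2 + 4\mu^2\nu_1^2}} + \frac{|b\nu_2|}{\sqrt{(\nu^2-\nu_2^2)^2 + 4\mu^2\nu_2^2}}$.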

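Therefore we have the estimate

    $|x_0(t)|,\ |y_0(t)| \le K$    (29)

for all $t \in R$, where

    $K = \max\left\{ \frac{|a|}{\sqrt{(\nu^2-\nu_1^2)^2 + 4\mu^2\nu_1^2}} + \frac{|b|}{\sqrt{(\nu^2-\nu_2^2)^2 + 4\mu^2\nu_2^2}},\ \frac{|a\nu_1|}{\sqrt{(\nu^2-\nu_1^2)^2 + 4\mu^2\nu_1^2}} + \frac{|b\nu_2|}{\sqrt{(\nu^2-\nu_2^2)^2 + 4\mu^2\nu_2^2}} \right\}$.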
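As an illustration, the bound $K$ of Eq. (29) can be evaluated directly. This is a small sketch under stated assumptions: the helper name `bound_K` is hypothetical, and the parameter values are assumed to be those of the numerical example ($\mu = 1/8$, $a = 1/8$, $b = 1/2$, $\nu = \sqrt2$, $\nu_1 = 1$, $\nu_2 = \sqrt5$).

```python
import math

def bound_K(a, b, mu, nu, nu1, nu2):
    """Bound K on |x0(t)| and |y0(t)| as in Eq. (29)."""
    d1 = math.sqrt((nu**2 - nu1**2) ** 2 + 4 * mu**2 * nu1**2)
    d2 = math.sqrt((nu**2 - nu2**2) ** 2 + 4 * mu**2 * nu2**2)
    bound_x = abs(a) / d1 + abs(b) / d2              # bound on |x0(t)|
    bound_y = abs(a * nu1) / d1 + abs(b * nu2) / d2  # bound on |y0(t)|
    return max(bound_x, bound_y)

# Assumed parameter values of the numerical example.
K = bound_K(1 / 8, 1 / 2, 1 / 8, math.sqrt(2), 1.0, math.sqrt(5))
print(K)
```

With these values the result agrees with the constant 0.4876394 quoted in the numerical section to the printed accuracy.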

Using Eq. (29), we can estimate the residual function for $z_0(t)$ as follows:

    $\left\| \frac{dz_0(t)}{dt} - Az_0(t) - \varphi(t) - \eta(z_0(t)) \right\| = \| -\eta(z_0(t)) \| = |\varepsilon|\,|x_0(t)|^3 \le |\varepsilon| K^3$.

Accordingly, we can choose

    $r = |\varepsilon| K^3$.    (30)

Let $D' = \{z;\ \|z\| \le 2K\}$, $D = \bigcup_{t \in R}\{z;\ \|z - z_0(t)\| \le K\}$. It is clear that
$z_0(t) \in D'$ for any $t \in R$ and $D \subset D'$.

Let us denote the Jacobian matrix of the right-hand side of Eq. (23) with
respect to $z$ by $\Psi(z)$. Then we have the inequality

    $\|\Psi(z) - A\| = |3\varepsilon| x^2 = 3|\varepsilon|\,|x|^2 \le 12|\varepsilon| K^2$    (31)

for all $z \in D'$.


In order to apply Theorem 4 to the present case, we have to check
the inequalities in (ii) of Theorem 4. The question is the existence of a
non-negative number $\kappa < 1$ satisfying both inequalities

    $12|\varepsilon| K^2 \le \frac{\kappa}{M}$  and  $|\varepsilon| K^3 M \le (1-\kappa)K$.

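From the inequalities $12|\varepsilon|K^2M \le \kappa < 1$ and $|\varepsilon|K^2M \le 1 - \kappa$, we have the
inequalities $12|\varepsilon|K^2M \le \kappa \le 1 - |\varepsilon|K^2M$ and $13|\varepsilon|K^2M \le 1$. Hence we
have

    $|\varepsilon| \le \frac{1}{13K^2M}$    (32)

or

    $K \le \frac{1}{\sqrt{13|\varepsilon|M}}$.    (33)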
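Condition (33) is a one-line test once $K$, $\varepsilon$ and $M$ are known. The sketch below uses hypothetical helper names, and the values $K = 0.4876394$ and $M = 19.389598$ are assumed from the numerical example in the paper; it is only meant to show how the two parameter cases $\varepsilon = 1/32$ and $\varepsilon = 1/64$ differ.

```python
import math

def max_admissible_K(eps, M):
    """Right-hand side of inequality (33): 1 / sqrt(13 |eps| M)."""
    return 1.0 / math.sqrt(13.0 * abs(eps) * M)

def theorem6_applies(K, eps, M):
    """True when K satisfies inequality (33)."""
    return K <= max_admissible_K(eps, M)

# Assumed values from the numerical example.
K, M = 0.4876394, 19.389598
limit_32 = max_admissible_K(1 / 32, M)
limit_64 = max_admissible_K(1 / 64, M)
print(limit_32, theorem6_applies(K, 1 / 32, M))
print(limit_64, theorem6_applies(K, 1 / 64, M))
```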
Consequently, we have the following existence theorem of a quasiperiodic
solution to the quasiperiodic Duffing type equation.

Theorem 6  If the parameter $\varepsilon$ and the constant number $K$ satisfy Eq. (32)
or Eq. (33), then Eq. (22) possesses a quasiperiodic solution $z = \hat z(t)$
with periods $\omega_1$, $\omega_2$ such that

    $\|\hat z(t) - z_0(t)\| \le K$    (34)

for all $t \in R$.

If the inequality (32) or (33) does not hold, or the error estimation Eq. (34)
is too crude, then we should compute a more accurate approximation than
$z_0(t)$. For this purpose, we have considered an approximate quasiperiodic
solution written in the form

    $x_m(t) = \alpha(0,0) + \sum_{r=1}^{m+1} \sum_{|p|=r} \{\alpha_p\cos(p,\nu)t + \beta_p\sin(p,\nu)t\}$,
    $y_m(t) = \frac{dx_m(t)}{dt}$,

where $(p,\nu) = p_1\nu_1 + p_2\nu_2$, $|p| = |p_1| + |p_2|$, and we have determined the
unknown coefficients $\alpha(0,0)$, $\alpha_p$, $\beta_p$ by means of the Galerkin method.
For the computed Galerkin approximation of m-th order

    $\bar x_m(t) = \bar\alpha(0,0) + \sum_{r=1}^{m+1} \sum_{|p|=r} \{\bar\alpha_p\cos(p,\nu)t + \bar\beta_p\sin(p,\nu)t\}$,

we consider the residual function

    $r(t) = \frac{d^2\bar x_m(t)}{dt^2} + 2\mu\frac{d\bar x_m(t)}{dt} + \nu^2\bar x_m(t) - \varepsilon\bar x_m^3(t) - a\cos\nu_1 t - b\cos\nu_2 t$

which can be expanded into the finite double Fourier series as

    $r(t) = f(0,0) + \sum_{r=1}^{3(m+1)} \sum_{|p|=r} \{f_p\cos(p,\nu)t + g_p\sin(p,\nu)t\}$,

where $3(m+1)$ is considered as sufficiently large when $m$ is large. Put

    $r = |f(0,0)| + \sum_{r=1}^{3(m+1)} \sum_{|p|=r} \{|f_p| + |g_p|\}$,    (35)

then we have $|r(t)| \le r$ for all $t \in R$. Define

    $\Omega = |\bar\alpha(0,0)| + \sum_{r=1}^{m+1} \sum_{|p|=r} \{|\bar\alpha_p| + |\bar\beta_p|\}$    (36)

and

    $\Omega' = \sum_{r=1}^{m+1} \sum_{|p|=r} \{|\bar\alpha_p| + |\bar\beta_p|\}\,|(p,\nu)|$,    (37)

then we have the inequalities $\Omega \ge \sup_{t \in R}|\bar x_m(t)|$ and $\Omega' \ge \sup_{t \in R}|\bar y_m(t)|$.

For $z$ which lies in the $\delta$-neighborhood of $\bar z_m(t) = (\bar x_m(t), \bar y_m(t))^T$, we have

    $\|\Psi(z) - A\| \le 3|\varepsilon|(\Omega + \delta)^2$.
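If there exist a non-negative number $\kappa < 1$ and a positive number $\delta$ satisfying
both inequalities $3|\varepsilon|(\Omega + \delta)^2 \le \frac{\kappa}{M}$ and $\frac{Mr}{1-\kappa} \le \delta$, then from Theorem 4
the exact quasiperiodic solution $\hat z(t) = (\hat x(t), \hat y(t))$ with periods $\omega_1$ and $\omega_2$
exists and an error estimation of $\bar z_m(t)$ is given by

    $\|\bar z_m(t) - \hat z(t)\| \le \frac{Mr}{1-\kappa}$,

that is,

    $|\bar x_m(t) - \hat x(t)|,\ |\bar y_m(t) - \hat y(t)| \le \frac{Mr}{1-\kappa}$

for all $t \in R$.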
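The two inequalities above lend themselves to a mechanical check. The following sketch, with a hypothetical function name, returns the a posteriori bound $Mr/(1-\kappa)$ when both conditions hold and `None` otherwise. The sample numbers ($\varepsilon = 1/64$, $\Omega = 0.3384327$, $\delta = 0.4876394$, $M = 19.389598$, $r = 0.15\times10^{-7}$, $\kappa = 0.63$) are assumed from the second numerical case reported in the paper.

```python
def galerkin_error_bound(eps, omega, delta, M, r, kappa):
    """Return M r / (1 - kappa) if both existence conditions hold, else None."""
    if not (0.0 <= kappa < 1.0 and delta > 0.0):
        return None
    contraction_ok = 3.0 * abs(eps) * (omega + delta) ** 2 <= kappa / M
    bound = M * r / (1.0 - kappa)
    radius_ok = bound <= delta
    return bound if (contraction_ok and radius_ok) else None

# Assumed data of the second numerical case.
bound = galerkin_error_bound(1 / 64, 0.3384327, 0.4876394, 19.389598, 0.15e-7, 0.63)
# A smaller kappa violates the first inequality, so no bound is returned.
rejected = galerkin_error_bound(1 / 64, 0.3384327, 0.4876394, 19.389598, 0.15e-7, 0.5)
print(bound, rejected)
```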

4 Numerical Results
We shall consider the Duffing type Eq. (22) with $\nu = \sqrt{2}$, $\nu_1 = 1$, $\nu_2 = \sqrt{5}$.
As for the case $\mu = 1/8$, $\varepsilon = 1/32$, $a = 1/8$, $b = 1/2$, we have

    $K = \max\left\{ \frac{4}{8\sqrt{17}} + \frac{4}{2\sqrt{149}},\ \frac{4}{8\sqrt{17}} + \frac{4\sqrt{5}}{2\sqrt{149}} \right\} = 0.4876394$

and

    $\frac{1}{\sqrt{13|\varepsilon|M}} = \frac{1}{\sqrt{13 \times \frac{1}{32} \times 19.389598}} = 0.3563025$.

Since the inequality (33) does not hold, we cannot know whether the exact
quasiperiodic solution exists in the neighborhood of $x_0(t)$. Thus we make
use of the Galerkin method. After 3 iterations starting with $x_0(t)$, we have
a Galerkin approximation of 8-th order as

    $\bar x_8(t) = 2\{$  0.0589905 $\cos\nu_1 t$ + 0.0147061 $\sin\nu_1 t$
                 - 0.0804707 $\cos\nu_2 t$ + 0.0150072 $\sin\nu_2 t$
                 - 0.0000016 $\cos 3\nu_1 t$ + 0.0000015 $\sin 3\nu_1 t$
                 + 0.0000069 $\cos(2\nu_1 + \nu_2)t$ - 0.0000026 $\sin(2\nu_1 + \nu_2)t$
                 - 0.0000468 $\cos(2\nu_1 - \nu_2)t$ + 0.0000376 $\sin(2\nu_1 - \nu_2)t$
                 - 0.0000054 $\cos(\nu_1 + 2\nu_2)t$ - 0.0000004 $\sin(\nu_1 + 2\nu_2)t$
                 - 0.0000116 $\cos(\nu_1 - 2\nu_2)t$ + 0.0000076 $\sin(\nu_1 - 2\nu_2)t$
                 + 0.0000007 $\cos 3\nu_2 t$ + 0.0000004 $\sin 3\nu_2 t$
                 + 0.0000001 $\cos(4\nu_1 - \nu_2)t$
                 - 0.0000003 $\cos(3\nu_1 - 2\nu_2)t$ - 0.0000005 $\sin(3\nu_1 - 2\nu_2)t$ $\}$.    (38)
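The bound $\Omega$ of Eq. (36) is simply the sum of the absolute values of the Fourier coefficients. As a sanity check, the sketch below recomputes it from the coefficients printed in Eq. (38); the list is transcribed from the printed approximation (with its common factor 2), so the last digits are subject to the rounding of the printed values.

```python
# Coefficients inside the braces of Eq. (38), in printed order (cos, sin pairs).
coeffs_38 = [
    0.0589905, 0.0147061,    # nu1
    -0.0804707, 0.0150072,   # nu2
    -0.0000016, 0.0000015,   # 3 nu1
    0.0000069, -0.0000026,   # 2 nu1 + nu2
    -0.0000468, 0.0000376,   # 2 nu1 - nu2
    -0.0000054, -0.0000004,  # nu1 + 2 nu2
    -0.0000116, 0.0000076,   # nu1 - 2 nu2
    0.0000007, 0.0000004,    # 3 nu2
    0.0000001,               # 4 nu1 - nu2 (cosine term only)
    -0.0000003, -0.0000005,  # 3 nu1 - 2 nu2
]

# Eq. (36): Omega is the sum of absolute values; the factor 2 is the common factor of (38).
omega = 2.0 * sum(abs(c) for c in coeffs_38)
print(omega)
```

The result reproduces the value $\Omega = 0.3385963$ used in the text up to the rounding of the seven-digit coefficients.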

By Eq. (35) and Eq. (36), we take $r = 2.0 \times 10^{-9}$ and $\Omega = 0.3385963$. If we
take $\delta = K = 0.4876394$, then we have

    $3|\varepsilon|(\Omega + \delta)^2 < 0.0639999 \le \frac{\kappa}{M}$,

and

    $\kappa \ge 0.0639999M = 0.1515\cdots$,

where $M = K_0/\mu = 8(2 + \sqrt{2})/\sqrt{127} = 2.4236\cdots$. Hence we can choose
$\kappa = 0.16$, then we have

    $\frac{Mr}{1-\kappa} \le \frac{4.84 \times 10^{-9}}{0.84} < 5.761905 \times 10^{-9} < 0.58 \times 10^{-8} < \delta$.

Thus, we can assure that the exact quasiperiodic solution $\hat x(t)$ exists in
the $\delta$-neighborhood of the Galerkin approximation (38) and we have an error
estimation of $\bar x_8(t)$ as

    $|\bar x_8(t) - \hat x(t)| \le 0.58 \times 10^{-8}$.    (39)

Remark that the Galerkin approximation (38) is almost the same as the
corresponding result in the paper [6], but the inequality (39) is strengthened
because of using the generalized exponential dichotomy in Theorem 5.
As for the case $\mu = 1/8$, $\varepsilon = 1/64$, $a = 1/8$, $b = 1/2$, we have

    $\frac{1}{\sqrt{13|\varepsilon|M}} = \frac{1}{\sqrt{13 \times \frac{1}{64} \times 19.389598}} = 0.5038878$.

Hence the inequality (33) holds. Accordingly, Theorem 6 is valid but the
error estimation

    $\|z_0(t) - \hat z(t)\| \le K = 0.4876394$    (40)

is very crude, where

    $x_0(t) = 2\{0.05882362\cos\nu_1 t + 0.01470593\sin\nu_1 t$
              $- 0.08053684\cos\nu_2 t + 0.01500716\sin\nu_2 t\}$.    (41)

In order to find a more accurate approximation, we have used the Galerkin
method. After 2 iterations starting with $x_0(t)$, we have a Galerkin
approximation of 8-th order as

    $\bar x_8(t) = 2\{$  0.0589069 $\cos\nu_1 t$ + 0.0147504 $\sin\nu_1 t$
                 - 0.0805038 $\cos\nu_2 t$ + 0.0149944 $\sin\nu_2 t$
                 - 0.0000008 $\cos 3\nu_1 t$ - 0.0000006 $\sin 3\nu_1 t$
                 + 0.0000034 $\cos(2\nu_1 + \nu_2)t$ + 0.0000008 $\sin(2\nu_1 + \nu_2)t$
                 - 0.0000233 $\cos(2\nu_1 - \nu_2)t$ - 0.0000175 $\sin(2\nu_1 - \nu_2)t$
                 - 0.0000027 $\cos(\nu_1 + 2\nu_2)t$ + 0.0000005 $\sin(\nu_1 + 2\nu_2)t$
                 - 0.0000058 $\cos(\nu_1 - 2\nu_2)t$ - 0.0000049 $\sin(\nu_1 - 2\nu_2)t$
                 + 0.0000003 $\cos 3\nu_2 t$ - 0.0000002 $\sin 3\nu_2 t$
                 + 0.0000001 $\cos(3\nu_1 - 2\nu_2)t$ - 0.0000001 $\sin(3\nu_1 - 2\nu_2)t$ $\}$.

By Eq. (35) and Eq. (36), we take $r = 0.15 \times 10^{-7}$ and $\Omega = 0.3384327$. If we
take $\delta = K = 0.4876394$, then we have

    $3|\varepsilon|(\Omega + \delta)^2 < 0.0319872 \le \frac{\kappa}{M}$,

and

    $\kappa \ge 0.0319872M = 0.6202189$,

where $M = K_0/\mu = 19.389598$. Hence we can choose $\kappa$ as 0.63, then we have

    $\frac{Mr}{1-\kappa} = \frac{2.9084397 \times 10^{-7}}{0.37} < 7.8606478 \times 10^{-7} < 0.8 \times 10^{-6}$.

From the above calculation, we have an error estimation

    $\|\bar z_8(t) - \hat z(t)\| \le 0.8 \times 10^{-6}$

which strengthens the inequality (40).

References
1. R. Bouc. Sur la méthode de Galerkin-Urabe pour les systèmes
   différentiels périodiques. Intern. J. Non-Linear Mech., 7:175-188, 1972.

2. L. O. Chua and A. Ushida. Algorithms for computing almost-periodic
   steady-state response of nonlinear systems to multiple input frequencies.
   Memorandum No. UCB/ERL M80/55, Electronics Research Laboratory,
   UC, Berkeley, 1980.

3. A. M. Fink. Almost Periodic Differential Equations, volume 377 of
   Lecture Notes in Mathematics. Springer-Verlag, 1974.

4. A. Kohda and Y. Shinohara. Numerical analysis of the quasiperiodic
   solutions to Duffing type equations. Japan J. Indust. Appl. Math.,
   10(3):367-378, 1993.

5. S. Maas. Nonlinear Microwave Circuits. Artech House Inc., 1988.

6. T. Mitsui. Investigation of numerical solutions of some nonlinear
   quasiperiodic differential equations. Publ. RIMS, Kyoto Univ.,
   13(3):793-820, 1977.

7. F. Nakajima. Existence of quasi-periodic solutions of quasiperiodic
   systems. Funkcial. Ekvac., 15:61-73, 1972.

8. Y. Shinohara. A geometric method of numerical solution of nonlinear
   equations and its applications to nonlinear oscillations. Publ. RIMS,
   Kyoto Univ., 8:13-42, 1972.

9. Y. Shinohara. Numerical analysis of periodic solutions and their periods
   to autonomous differential systems. J. Math. Tokushima Univ., 11:11-32,
   1977.

10. Y. Shinohara. Galerkin method for autonomous differential equations.
    J. Math. Tokushima Univ., 15:53-85, 1981.

11. Y. Shinohara, A. Kohda, and T. Mitsui. On quasiperiodic solutions to
    Van der Pol equation. J. Math. Tokushima Univ., 18:1-9, 1984.

12. Y. Shinohara, M. Kurihara, and A. Kohda. Numerical analysis
    of quasiperiodic solutions to nonlinear differential equations. Japan
    J. Appl. Math., 3:315-330, 1986.

13. Y. Shinohara and N. Yamamoto. Galerkin approximation of periodic
    solution and its period to Van der Pol equation. J. Math. Tokushima
    Univ., 12:19-42, 1978.

14. M. Urabe. Galerkin's procedure for nonlinear periodic systems. Arch.
    Rational Mech. Anal., pages 120-152, 1965.

15. M. Urabe. Numerical investigation of subharmonic solutions to
    Duffing's equation. Trudy Pjatoi Mezdunarodnoi Konferencii po Nelineinym
    Kolebanijam, pages 21-67, 1970.

16. M. Urabe. Green functions of Pseudoperiodic Differential Operators,
    volume 243 of Lecture Notes in Mathematics. Springer-Verlag, 1971.

17. M. Urabe. Existence theorem of quasiperiodic solutions to nonlinear
    differential systems. Funkcial. Ekvac., 15:75-100, 1972.

18. M. Urabe. On the existence of quasiperiodic solutions to nonlinear
    quasiperiodic differential equations. In Proc. 6th ICNO, pages 1-38,
    Warsaw, 1972. Polish Akad. Sci.

19. M. Urabe. On a modified Galerkin's procedure for nonlinear
    quasiperiodic differential systems. In Actes de la Conférence Internationale:
    Equa-Diff 73, pages 223-258. Hermann, 1973.

20. M. Urabe. On the existence of quasiperiodic solutions to nonlinear
    quasiperiodic differential equations. In Nonlinear Vibration Problems,
    pages 85-93. Zagadnienia Drgan Nieliniowych, 1974.

21. M. Urabe and A. Reiter. Numerical computation of nonlinear forced
    oscillations by Galerkin's procedure. J. Math. Anal. Appl., 14:107-140,
    1966.

22. N. Yamamoto. An error analysis of Galerkin approximations of periodic
    solution and its period to autonomous differential system. J. Math.
    Tokushima Univ., 13:53-77, 1979.

23. N. Yamamoto. Galerkin method for autonomous differential equations
    with unknown parameters. J. Math. Tokushima Univ., 16:55-93, 1982.

24. N. Yamamoto. A remark to Galerkin method for nonlinear periodic
    systems with unknown parameters. J. Math. Tokushima Univ., 16:95-126,
    1982.

ABSOLUTELY STABLE DELAY DIFFERENTIAL EQUATIONS
AND NATURAL RUNGE-KUTTA METHODS

Toshiyuki Koto
Department of Computer Science and Information Mathematics
The University of Electro-Communications
1-5-1, Chofugaoka, Chofu, Tokyo 182, Japan

Abstract
A natural Runge-Kutta (RK) method is a RK method which has a special
continuous extension. Any one-step collocation method is equivalent to one of
such methods. In this paper, we consider the application of natural RK methods
to delay differential equations (DDEs) which have a constant delay, and discuss
their numerical stability applied to several types of test equations whose zero
solution is stable for arbitrary value of the delay. As a result, we show that an
A-stable method preserves the asymptotic property of the analytical solution
of a DDE coupled with a difference equation (i.e., a delay-differential-algebraic
equation).

AMS subject classifications: 65L05, 65L20

1. Introduction
Many authors [2,3,6,10,14,16,20] have discussed stability properties of numerical
methods for delay differential equations (DDEs) based on the scalar test equation

    $u'(t) = au(t) + bu(t - \tau)$,    (1)

where $a$, $b$ are complex numbers which satisfy

    $|b| < -\Re a$    (2)

and $\tau$ is a positive constant. Because of the condition (2), Eq. (1) has a special
asymptotic property; its zero solution is asymptotically stable for any value of $\tau$. By
this property, certain stability regions (P-stability regions) of numerical methods can
be defined for (1) in the same way as the standard stability regions are defined for
the Dahlquist test equation, i.e., Eq. (1) without the term $bu(t - \tau)$. Moreover, an
analogy of A-stability can be considered regarding DDEs based on (1).
However, unlike the standard case, we cannot conclude that a method which is
stable for (1) is also suitable for a system of DDEs, even if the system is linear
with constant coefficients. It is rather exceptional that such a system is
decomposed into equations of the form (1). From this point of view, it is important
to study stability properties of numerical methods when they are applied to more
general DDEs. In this paper, we consider initial value problems of the form

    $Eu'(t) = Lu(t) + Mu(t - \tau)$,  $t \ge 0$,    (3)

    $u(t) = \varphi(t)$,  $-\tau \le t \le 0$,    (4)

where $L$, $M$ are $d \times d$ constant matrices and

    $E = \begin{pmatrix} I_{d_1} & 0 \\ 0 & 0 \end{pmatrix}$.

We assume that some conditions are satisfied for the zero solution of (3) to be stable
for any value of $\tau$, and discuss stability properties of natural Runge-Kutta (RK)
methods [20], a special type of RK methods, applied to (3). It should be noted that
(3) is a DDE coupled with a difference equation if $d_1 < d$. Such equations are called
delay-differential-algebraic equations (delay DAEs), and a systematic research on their
numerical treatment has been carried out by Ascher and Petzold [1].
A solution of a DDE with constant delays is called absolutely stable if it is stable for
arbitrary values of the delays [6]. Following this usage, we say that Eq. (3) is absolutely
stable if its zero solution is so.

2. Preliminaries
In order to apply a RK method to DDEs, we need an approximation of their
retarded parts. We use a natural continuous extension (NCE) [19] of the RK method
for this purpose. Moreover, we consider an aligned mesh, i.e., a mesh of the form

    $t_n = nh$,  $h = \tau/k$,  $n = 1, 2, \ldots$,

where $k$ denotes a positive integer. Then, a RK method applied to (3) is (at least
formally) written in the form

    $EK_{n,i} = L\left(u_n + h\sum_{j=1}^{s} a_{ij}K_{n,j}\right) + M\left(u_{n-k} + h\sum_{j=1}^{s} b_j(c_i)K_{n-k,j}\right)$,
                                                        $i = 1, 2, \ldots, s$,    (5.a)

    $u_{n+1} = u_n + h\sum_{i=1}^{s} b_iK_{n,i}$,    (5.b)

where $u_n$ denotes an approximate value to $u(t_n)$, $a_{ij}$, $b_i$, $c_i$ $(= \sum_{j=1}^{s} a_{ij})$ are the
parameters of the RK method and $b_i(\theta)$'s are polynomials which satisfy certain conditions.
We also write

    $A = (a_{ij})$ $(1 \le i, j \le s)$,  $B = (b_j(c_i))$ $(1 \le i, j \le s)$.

Even in the case $d_1 < d$, the numerical solution of (3) is computed by (5) if $A$ is
invertible and (3) satisfies some proper condition [1].
Concerning general RK methods, Zennaro [20] clarified their stability for the scalar
test equation (1). He has developed a technique to find the P-stability region of a
RK method, i.e., the set of the pairs of the complex numbers $(\alpha, \beta)$, $\alpha = ah$, $\beta = bh$,
such that the numerical solution of (1) vanishes as $n \to \infty$.
For example, the interior of the P-stability region of the Euler method is

    $\{(\alpha, \beta) \in C^2 : |1 + \alpha| + |\beta| < 1\}$,

and that of the (2-stage 2nd-order) Heun method is

    $\{(\alpha, \beta) \in C^2 : |1 + \alpha + \alpha^2/2| < 1,\ |\beta| < w_\alpha\}$,
where
2+a 2+ a
2
1 +a -1 1 + Bf P - 1
when $|1 + \alpha| \neq 1$, $w_\alpha = -\Re\alpha$ when $|1 + \alpha| = 1$. Figure 1 [15] shows the P-stability
regions of well-known RK methods when $\alpha$ and $\beta$ are real; the horizontal line is
denoted as $\alpha$-axis and the vertical line $\beta$-axis. A RK method may have several NCEs.
The Heun and the classical RK method have two NCEs, but their P-stability regions
do not depend on the choice of the NCEs. Kutta's 3rd-order method has infinitely
many NCEs; its P-stability region in Figure 1 is that for an NCE furnished by a
theorem [19] (p. 124, Theorem 7). These are obtained by Zennaro's technique, but the
derivation of the regions needs rather complicated computation although only quite
fundamental methods are considered.
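The Euler region quoted above can be probed by direct simulation. The sketch below, with hypothetical function names, applies the explicit Euler method to the scalar test equation (1) on an aligned mesh with the constant initial function $\varphi \equiv 1$ (an assumption made only for illustration) and compares a step size inside the region $|1+\alpha| + |\beta| < 1$ with one outside it.

```python
def in_euler_region(alpha, beta):
    """Membership test for the interior of the P-stability region of the Euler method."""
    return abs(1 + alpha) + abs(beta) < 1

def euler_dde(a, b, tau, k, n_steps):
    """Explicit Euler for u'(t) = a u(t) + b u(t - tau) with h = tau / k and phi(t) = 1."""
    h = tau / k
    u = [1.0] * (k + 1)  # values on the mesh points t_{-k}, ..., t_0 of the initial interval
    for _ in range(n_steps):
        # u[-1] is u_n and u[-1 - k] is u_{n-k}
        u.append(u[-1] + h * (a * u[-1] + b * u[-1 - k]))
    return u

a, b, tau = -2.0, 1.0, 1.0                     # satisfies |b| < -Re(a)
stable = euler_dde(a, b, tau, k=4, n_steps=400)    # alpha = -0.5, beta = 0.25
unstable = euler_dde(a, b, tau, k=1, n_steps=50)   # alpha = -2, beta = 1
print(in_euler_region(a * tau / 4, b * tau / 4), abs(stable[-1]))
print(in_euler_region(a * tau, b * tau), abs(unstable[-1]))
```

Although the test equation is absolutely stable, the coarse mesh ($k = 1$) lies outside the region and the Euler iterates grow, which is the behaviour the P-stability region is meant to capture.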
If $B = A$, i.e., $b_j(c_i) = a_{ij}$, the RK method is said to be natural [20]. As a
fundamental result on natural RK methods, it is known that any one-step collocation
method for DDEs is equivalent to a natural RK method determined by

    $a_{ij} = \int_0^{c_i} \ell_j(\theta)\,d\theta$,  $b_i(\theta) = \int_0^{\theta} \ell_i(s)\,ds$,  $b_i = \int_0^{1} \ell_i(\theta)\,d\theta$,    (6)

where $\ell_j(\theta)$'s are the basis polynomials of the Lagrange interpolation for the
collocation points $c_1, c_2, \ldots, c_s$. In particular, the class of natural RK methods
includes important RK methods derived from classical quadrature formulae, such as
Gauss-Legendre, Radau IIA, Lobatto IIIA methods [7]. Other examples of natural RK
methods are considered by Bellen, Jackiewicz and Zennaro [4,21].
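The integrals in (6) are easy to evaluate for given collocation points. The following sketch, in plain Python with hypothetical helper names, builds the Lagrange basis polynomials and integrates them to obtain $A$ and $b$; the two Gauss-Legendre points on $[0, 1]$ are used as an assumed example.

```python
import math

def lagrange_basis(c, j):
    """Coefficients (lowest degree first) of the j-th Lagrange basis polynomial on nodes c."""
    poly = [1.0]
    for m, cm in enumerate(c):
        if m == j:
            continue
        denom = c[j] - cm
        new = [0.0] * (len(poly) + 1)
        for i, p in enumerate(poly):
            new[i] += -cm * p / denom   # constant part of (theta - cm) / denom
            new[i + 1] += p / denom     # theta part of (theta - cm) / denom
        poly = new
    return poly

def integrate(poly, upper):
    """Integral of the polynomial from 0 to upper."""
    return sum(p * upper ** (i + 1) / (i + 1) for i, p in enumerate(poly))

def natural_rk_from_collocation(c):
    """Coefficients a_ij and b_i of Eq. (6) for collocation points c."""
    s = len(c)
    basis = [lagrange_basis(c, j) for j in range(s)]
    A = [[integrate(basis[j], c[i]) for j in range(s)] for i in range(s)]
    b = [integrate(basis[j], 1.0) for j in range(s)]
    return A, b

# Assumed example: the two Gauss-Legendre points on [0, 1].
c = [0.5 - math.sqrt(3) / 6, 0.5 + math.sqrt(3) / 6]
A, b = natural_rk_from_collocation(c)
print(A, b)
```

For these points the output is the familiar 2-stage Gauss method, with $a_{11} = a_{22} = 1/4$ and $b_1 = b_2 = 1/2$.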

Fig. 1. Real P-stability regions of the Euler, Heun, (3-stage 3rd-order) Kutta, classical Runge-Kutta
methods (in order of small-to-large).

As for natural RK methods, we can derive some stability properties for DDEs
from those for ordinary differential equations. For example, if a natural RK method
is A-stable, then it is also P-stable, i.e., its P-stability region includes the domain
$\{(\alpha, \beta) \in C^2 : |\beta| < -\Re\alpha\}$. It was also proved by Zennaro [20]. We now introduce two
symbols which will be used in the following sections. Let $r(z)$ be the stability function
of a RK method, i.e.,

    $r(z) = 1 + zb^T(I_s - zA)^{-1}e$,  $e = (1, 1, \ldots, 1)^T$.    (7)
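For concreteness, the stability function (7) can be evaluated with a few lines of plain Python. This is an illustrative sketch with hypothetical helper names; the 1-stage Gauss (implicit midpoint) method, $A = (1/2)$, $b = (1)$, is an assumed example for which $r(z) = (1 + z/2)/(1 - z/2)$.

```python
def solve(matrix, rhs):
    """Solve a small dense linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for cc in range(col, n + 1):
                aug[r][cc] -= f * aug[col][cc]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (aug[i][n] - sum(aug[i][j] * x[j] for j in range(i + 1, n))) / aug[i][i]
    return x

def stability_function(A, b, z):
    """r(z) = 1 + z b^T (I - z A)^{-1} e for a RK method with coefficients A, b."""
    s = len(b)
    lhs = [[(1.0 if i == j else 0.0) - z * A[i][j] for j in range(s)] for i in range(s)]
    y = solve(lhs, [1.0] * s)  # y = (I - zA)^{-1} e
    return 1.0 + z * sum(b[i] * y[i] for i in range(s))

# Assumed example: implicit midpoint rule.
A_mid, b_mid = [[0.5]], [1.0]
r_real = stability_function(A_mid, b_mid, -1.0)
r_imag = stability_function(A_mid, b_mid, 3j)
print(r_real, abs(r_imag))
```

On the imaginary axis the modulus is 1 and in the left half plane it is below 1, as expected for an A-stable method.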

In addition, to characterize the asymptotic behavior of the solution of (3), we define
a function of the two complex variables $z$, $\zeta$ by

    $P(z, \zeta) = \det[zE - (L + \zeta M)]$.    (8)

The characteristic equation of (3) is written as $P(z, \exp(-\tau z)) = 0$.

3. Stability for DDE Systems

We first consider the case $d_1 = d$, i.e., systems of the form

    $u'(t) = Lu(t) + Mu(t - \tau)$.    (9)

In this case, the asymptotic behavior of the solution is well known. The zero solution
of (9) is asymptotically stable if and only if

(A) $P(z, \exp(-\tau z)) \neq 0$ for any $z$ with $\Re z \ge 0$.

We assume that this condition is satisfied for any $\tau > 0$, and consider the
application of a natural RK method to (9). Then, we can show the following theorem [13].
Theorem 1 Assume that the natural RK method is A-stable and that all eigenvalues
of the matrix A have nonnegative real parts. Then, the numerical solution of (9) tends
to zero as $n \to \infty$ for any $k$ and any initial function.
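Theorem 1 can be illustrated on the scalar equation (1) with the implicit midpoint rule, which is the 1-stage Gauss collocation method and hence natural, with $a_{11} = b_1(c_1) = 1/2$. The sketch below is not taken from the paper: the function name, the constant initial function and the values $a = -2$, $b = 1$, $\tau = 1$ are assumptions chosen so that $|b| < -\Re a$.

```python
def midpoint_dde(a, b, tau, k, n_steps, phi=lambda t: 1.0):
    """Implicit midpoint rule with its natural continuous extension for
    u'(t) = a u(t) + b u(t - tau) on the aligned mesh t_n = n h, h = tau / k."""
    h = tau / k
    u = [phi(0.0)]  # u[n] approximates u(t_n)
    K = []          # K[n] is the single stage derivative of step n
    for n in range(n_steps):
        t_stage = n * h + 0.5 * h  # stage abscissa t_n + c_1 h with c_1 = 1/2
        if n < k:
            delayed = phi(t_stage - tau)  # argument still in the initial interval
        else:
            delayed = u[n - k] + 0.5 * h * K[n - k]  # u_{n-k} + h a_11 K_{n-k}
        # Solve K = a (u_n + (h/2) K) + b * delayed for K (scalar, linear).
        Kn = (a * u[n] + b * delayed) / (1.0 - 0.5 * h * a)
        K.append(Kn)
        u.append(u[n] + h * Kn)
    return u

coarse = midpoint_dde(-2.0, 1.0, 1.0, k=1, n_steps=200)
fine = midpoint_dde(-2.0, 1.0, 1.0, k=5, n_steps=500)
print(coarse[:3], abs(coarse[-1]), abs(fine[-1]))
```

Even with the coarsest aligned mesh ($k = 1$, $h = \tau$) the iterates decay, in line with the statement that no restriction on $k$ is needed.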
Theorem 1 was originally proved using a theorem by in 't Hout [9], but a more
flexible proof without the theorem is also possible. In the following, we describe
another proof, whose fundamental idea is also valid in the case of delay DAEs, or
neutral DDEs considered, e.g., by Kuang, Xiang and Tian [13].

3.1. Stability of DDEs.

If (A) is satisfied for any $\tau$, then the following two conditions are satisfied:

(A_1) $P(iy, \zeta) \neq 0$ for any $y \in R$, $y \neq 0$ and any $\zeta$ with $|\zeta| = 1$;

(A_2) $\Re z < 0$ for any $z \in \sigma[L + M]$,

where $\sigma[X]$ denotes the spectrum of the square matrix $X$. Conversely, these two
conditions imply (A) for any $\tau$. In fact, it is also easy to see that the condition (A_1')
in the following proposition, together with (A_2), implies (A).

Proposition 1 The conditions (A_1) and (A_2) imply

(A_1') $P(z, \zeta) \neq 0$ for any $z$ and $\zeta$ such that $\Re z \ge 0$, $z \neq 0$ and $|\zeta| \le 1$.

Proof. We first prove that (A_1) implies

    $P(iy, \zeta) \neq 0$ for any $y \in R$, $y \neq 0$ and any $\zeta$ with $|\zeta| \le 1$    (10)

by contradiction. Assume that $P(iy_0, \zeta_0) = 0$ for some $y_0 \in R$, $y_0 \neq 0$ and some $\zeta_0$
with $|\zeta_0| < 1$. Then, the function $\gamma(y)$ defined by

    $\gamma(y) = \min\{|\zeta| : P(iy, \zeta) = 0\}$

satisfies $\gamma(y_0) < 1$. Moreover, $\gamma(y)$ is a continuous function, and it is easily shown
that $\gamma(y) > 1$ if $|y|$ is sufficiently large since the set $\{\sigma[L + \zeta M] : \zeta \in C, |\zeta| \le 1\}$ is
bounded. Thus, $\gamma(y_1) = 1$ for some $y_1 \neq 0$. However, this implies $P(iy_1, \zeta_1) = 0$ for
some $\zeta_1$ with $|\zeta_1| = 1$, which contradicts (A_1).
Similarly, we can show that (10) and (A_2) imply (A_1'). Assume that $P(z_2, \zeta_2) = 0$
for some $z_2$, $\zeta_2$ such that $\Re z_2 > 0$ and $|\zeta_2| \le 1$; let $g(\theta)$, $0 \le \theta \le 1$, be a continuous
function which satisfies

    $g(0) = 1$,  $g(1) = \zeta_2$,
    $|g(\theta)| \le 1$,  $P(0, g(\theta)) \neq 0$,  $0 \le \theta \le 1$.    (11)

Then, the function $\chi(\theta)$ defined by

    $\chi(\theta) = \max\{\Re z : P(z, g(\theta)) = 0\}$

is continuous on $0 \le \theta \le 1$; $\chi(0) < 0$ by (A_2), and $\chi(1) > 0$ by the assumption
above. Therefore, $\chi(\theta_*) = 0$ for some $\theta_*$ with $0 < \theta_* < 1$. However, this implies
$P(z_*, g(\theta_*)) = 0$ for some $z_*$ with $\Re z_* = 0$, which contradicts (10) since $|g(\theta_*)| \le 1$
and $z_* \neq 0$ by (11). Q.E.D.
3.2. Proof of Theorem 1

We now describe a proof of Theorem 1. Since the RK method is natural, the
method applied to (9) leads to

    $K_{n,i} = L\left(u_n + h\sum_{j=1}^{s} a_{ij}K_{n,j}\right) + M\left(u_{n-k} + h\sum_{j=1}^{s} a_{ij}K_{n-k,j}\right)$,
                                                        $i = 1, 2, \ldots, s$,    (12.a)

    $u_{n+1} = u_n + h\sum_{i=1}^{s} b_iK_{n,i}$,    (12.b)

which is written in the form

    $\hat L_1U_{n+1} - \hat L_0U_n - \hat M_1U_{n-k+1} - \hat M_0U_{n-k} = 0$,    (13)

where
T
t/„ = MfW.i,i ••• u ) ,
n

fl> ( h{A®M) 0\ & / 0 fc(e:®M) \


M l
= ( t 0J' M o =
^ o 0 j -
Thus, it suffices to show that the absolute values of all roots of the equation

    $\det[\lambda^{k+1}\hat L_1 - \lambda^k\hat L_0 - \lambda\hat M_1 - \hat M_0] = 0$    (14)

are less than 1. We prove it by contradiction.
If (14) has a root, say $\lambda$, with $|\lambda| \ge 1$, it must satisfy

    $\det[\lambda I_d - r(Z_\lambda)] = 0$,  $Z_\lambda = h(L + \lambda^{-k}M)$,

i.e., $\lambda$ is an eigenvalue of the matrix $r(Z_\lambda)$. This is shown by simple computation based
on the assumption that all eigenvalues of $A$ have nonnegative real parts. However,
this leads to a contradiction, as below.

By the condition (A_1'), the set $\sigma[Z_\lambda] \setminus \{0\}$ is included in the left half complex
plane. Thus, $r(z)$ is holomorphic in a neighborhood of $\sigma[Z_\lambda]$, and

    $\sigma[r(Z_\lambda)] = r(\sigma[Z_\lambda])$    (15)

by the Spectral Mapping Theorem [18]. Since the RK method is A-stable, $|\lambda| < 1$ or
$\lambda = 1$. However, if $\lambda = 1$, then $0 \in \sigma[h(L + M)]$ by (15); this is impossible by (A_2).
Q.E.D.

4. Stability for Delay DAEs

4.1. Main Results

Let us now consider the case $d_1 < d$. Let $d_2 = d - d_1$ and write $u(t) = [x(t), y(t)]^T$,
where $x(t)$ and $y(t)$ are $d_1$- and $d_2$-dimensional, respectively. We also write the initial
function in (4) in the same way: $\varphi(t) = [\varphi_1(t), \varphi_2(t)]^T$. Eq. (3) is written in the form

    $Eu'(t) = Lu(t) + Mu(t - \tau)$,    (16)

    $E = \begin{pmatrix} I_{d_1} & 0 \\ 0 & 0 \end{pmatrix}$,  $L = \begin{pmatrix} L_{11} & L_{12} \\ L_{21} & L_{22} \end{pmatrix}$,  $M = \begin{pmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{pmatrix}$,

or

    $x'(t) = L_{11}x(t) + L_{12}y(t) + M_{11}x(t - \tau) + M_{12}y(t - \tau)$,
    $0 = L_{21}x(t) + L_{22}y(t) + M_{21}x(t - \tau) + M_{22}y(t - \tau)$,    (16)'

where $L_{ij}$, $M_{ij}$ denote $d_i \times d_j$ matrices. We assume that $L_{22}$ is invertible. Then, (16)
can be solved for any initial function which satisfies

    $0 = L_{21}\varphi_1(0) + L_{22}\varphi_2(0) + M_{21}\varphi_1(-\tau) + M_{22}\varphi_2(-\tau)$.    (17)

In fact, on each interval of the form $[(m-1)\tau, m\tau]$, $m = 1, 2, \ldots$, (16) is considered
as a DAE if $u(t - \tau)$ is known. We can solve (16) by almost the same method as the
step method for usual DDEs.
Also in this case, the condition (A) is necessary for the zero solution to be
asymptotically stable; if (A) is not satisfied, there is a solution of the form $\exp(zt)u_0$ which
does not converge to zero. However, we cannot expect that (A) is also a sufficient
condition; we obtain a neutral DDE by differentiating the second equation in (16)',
but (A) is not always a sufficient condition for asymptotic stability in neutral DDEs.
In this paper, we consider the following condition (B) as a sufficient condition for the
zero solution of (16) to be asymptotically stable. We will prove in the Appendix that
(B) is indeed a sufficient condition.

(B) there is a $\delta > 0$ such that $P(z, \exp(-\tau z)) \neq 0$ for any $z$ with $\Re z > -\delta$.

We assume that (B) is satisfied for any $\tau > 0$. Then, we have the following
theorem.

Theorem 2 Assume that the natural RK method is A-stable and that all eigenvalues
of the matrix A have nonnegative real parts. Then, the numerical solution of (16)
never diverges for any $k$ and any initial function. Moreover, if $|r(\infty)| < 1$, the
solution tends to zero as $n \to \infty$, where $r(\infty) = 1 - b^TA^{-1}e$.
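The quantity $r(\infty) = 1 - b^TA^{-1}e$ in Theorem 2 separates methods such as Radau IIA, for which $|r(\infty)| < 1$, from Gauss methods, for which $|r(\infty)| = 1$. The sketch below evaluates it exactly with rational arithmetic. The helper names are hypothetical, and the coefficient tables are the standard ones for backward Euler, the implicit midpoint rule and the 2-stage Radau IIA method.

```python
from fractions import Fraction as F

def solve_exact(matrix, rhs):
    """Gauss-Jordan elimination over the rationals (matrix assumed invertible)."""
    n = len(rhs)
    aug = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        piv = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(n):
            if r != col:
                f = aug[r][col] / aug[col][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def r_at_infinity(A, b):
    """r(inf) = 1 - b^T A^{-1} e for a RK method with invertible A."""
    y = solve_exact(A, [F(1)] * len(b))  # y = A^{-1} e
    return 1 - sum(bi * yi for bi, yi in zip(b, y))

backward_euler = r_at_infinity([[F(1)]], [F(1)])
midpoint = r_at_infinity([[F(1, 2)]], [F(1)])
radau_iia_2 = r_at_infinity([[F(5, 12), F(-1, 12)], [F(3, 4), F(1, 4)]], [F(3, 4), F(1, 4)])
print(backward_euler, midpoint, radau_iia_2)
```

Under Theorem 2, the first and third methods give numerical solutions of (16) that tend to zero, while for the midpoint rule only non-divergence is asserted.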
In the following, we will describe the proof of Theorem 2 along the same line as
the proof of Theorem 1. The same results have been obtained for higher index delay
DAEs, e.g., index 2 equations of the form

    $Eu'(t) = Lu(t) + Mu(t - \tau)$,    (18)

    $E = \begin{pmatrix} I_{d_1} & 0 \\ 0 & 0 \end{pmatrix}$,  $L = \begin{pmatrix} L_{11} & L_{12} \\ L_{21} & 0 \end{pmatrix}$,  $M = \begin{pmatrix} M_{11} & M_{12} \\ M_{21} & 0 \end{pmatrix}$,

i.e.,

    $x'(t) = L_{11}x(t) + L_{12}y(t) + M_{11}x(t - \tau) + M_{12}y(t - \tau)$,
    $0 = L_{21}x(t) + M_{21}x(t - \tau)$,    (18)'

where $L_{21}L_{12}$ is invertible. It is also proved along the same line although more
complicated computation is needed for the proof.

4.2. Proof of Theorem 2

For simplicity, we write $L_{ij}(\zeta) = L_{ij} + \zeta M_{ij}$, $1 \le i, j \le 2$. Since

    $P(z, \exp(-\tau z)) = (-1)^{d_2}z^{d_1}\det[L_{22}(\exp(-\tau z))] + O(z^{d_1 - 1})$,  $\Re z > \gamma$,

for any $\gamma$, it is proved that (B) for some $\tau > 0$ implies

(B_0) $\det[L_{22}(\zeta)] \neq 0$ for any $\zeta$ with $|\zeta| \le 1$

by Rouché's theorem. Moreover, (B) for any $\tau > 0$ implies

(B_1) $P(iy, \zeta) \neq 0$ for any $y \in R$, $y \neq 0$ and any $\zeta$ with $|\zeta| = 1$;

(B_2) $P(z, 1) \neq 0$ for any $z$ with $\Re z \ge 0$.

If (B_0) is satisfied, then

    $P(z, \zeta) = (-1)^{d_2}\det[L_{22}(\zeta)]\,\det[zI_{d_1} - Q(\zeta)]$,
    $Q(\zeta) = L_{11}(\zeta) - L_{12}(\zeta)L_{22}(\zeta)^{-1}L_{21}(\zeta)$.

Thus, (B_0), (B_1), (B_2) imply

(B_1') $\det[iyI_{d_1} - Q(\zeta)] \neq 0$ for any $y \in R$, $y \neq 0$ and any $\zeta$ with $|\zeta| = 1$;

(B_2') $\Re z < 0$ for any $z \in \sigma[Q(1)]$.

By the same argument as in the proof of Proposition 1, we obtain the following
proposition. It is also shown that (B_0), (B_1) and (B_2) imply (B) for any $\tau$. Consequently,
(B) is satisfied for any $\tau$ if and only if (B_0), (B_1), (B_2) are satisfied.

Proposition 2 The conditions (B_0), (B_1) and (B_2) imply

(B_1'') $\det[zI_{d_1} - Q(\zeta)] \neq 0$ for any $z$ and $\zeta$ s.t. $\Re z \ge 0$, $z \neq 0$ and $|\zeta| \le 1$.

In order to prove Theorem 2 we prepare a lemma. This lemma shows the
solvability of Eq. (5.a) as a special case.
Lemma 1 Let $P$ be a permutation matrix such that

    $P(X_1\ Y_1\ X_2\ Y_2\ \cdots\ X_s\ Y_s)^T = (X_1\ X_2\ \cdots\ X_s\ Y_1\ Y_2\ \cdots\ Y_s)^T$,

where $X_i$ and $Y_i$ denote $d_1$- and $d_2$-dimensional vectors, respectively. If $A$ is invertible
and all eigenvalues of $A$ have nonnegative real parts, then $I_s \otimes E - hA \otimes (L + \zeta M)$ is
invertible for any $\zeta$ with $|\zeta| \le 1$, and
1 1
(I ®E-hA®{L
s + CM))" = P" (c3(C)-'i*(C)) P, (19)

wkt
w l
' \hA®L' {Q- L M) 22 2 hA®I J' d

L {
° - ( 0 -I,®LUQ-> j '
Proof Since (Bo) is satisfied, we obtain
1
L*(C)P (/. ® E - hA ® (L + CM)) P " = 0(C). (20)

Moreover, since

det [I , -hA® sd Q(Q] = JJ det [I , - d k Q(0],


ai

i=i

where a,'s are the eigenvalues of A, det [I, , — hA® Q{()] / f°r C h | C |5 1 d
0 w i t

by ( B i ) . Thus, the matrices Q{(), I, ® E - hA®{L + (M) are invertible, and (19)
follows from (20). Q. D . E.

Proof of Theorem 2. A natural RK method applied to (16) is written in the form

    $\hat L_1U_{n+1} - \hat L_0U_n - \hat M_1U_{n-k+1} - \hat M_0U_{n-k} = 0$,    (21)

and its characteristic equation is given by

    $\det[\lambda^{k+1}\hat L_1 - \lambda^k\hat L_0 - \lambda\hat M_1 - \hat M_0] = 0$,    (22)

where
, _ ( h®E-h{A®L) 0
L l 7
~ { -b ®I d u
and the other symbols are the same that appear in the proof of Theorem 1.
If A f 0 and
h
d e t [ j , ®E-hA®(L + \- M)] / 0,
simple computation shows that
A i , - l a - ^ " W t - \-"M 0 (23)
k
X{\- ) 0 ) ( fc
X ( A - ) - ' [he®(L + k
X- M)\
T k
-b ®f d h ) \ 0 \h-R{\- )
X « ) = / , ® E - hA ® {L + (M),
T 1
R{() = h + (b ® I ) [I,®E-hA®(L d + (M)}- [he ®(L + (M)}.
Moreover, using (19), we obtain

R ( C ) _ ( r(hQ(0) 0 \ m
R{
<>-{ Y(0 r H / J ' ( 2 4 )

where Y(() denotes a dj x d, matrix.


If $|r(\infty)| < 1$, then the absolute values of all roots of (22) are less than 1 and the
numerical solution tends to zero. This is shown by the same argument as in the proof
of Theorem 1 using (B_1) and (B_2). Let $r(\infty) = 1$ or $-1$. Similarly, it is shown that
the absolute values of all roots of (22) except $\lambda = r(\infty)$ are less than 1; $r(\infty)$ is a
root whose absolute value is 1 and its multiplicity is $d_2$ by (23), (24) (and (B_2) when
$r(\infty) = 1$). However, it is easy to see that there are $d_2$ linearly independent vectors
which satisfy

    $(\lambda^{k+1}\hat L_1 - \lambda^k\hat L_0 - \lambda\hat M_1 - \hat M_0)U = 0$

for $\lambda = r(\infty)$. Thus, the solution of (21) does not include a diverging component for
any initial value. This completes the proof of Theorem 2. Q.E.D.

References

1. U. M. Ascher and L. R. Petzold, The numerical solution of delay-differential-
   algebraic equations of retarded and neutral type, to appear in SIAM J. Numer.
   Anal.
2. C. T. H. Baker and C. A. H. Paul, Computing stability regions: Runge-Kutta
   methods for delay differential equations, IMA J. Numer. Anal. 14, 347-362
   (1994).
3. V. K. Barwell, Special stability problems for functional differential equations,
   BIT 15, 130-135 (1975).
4. A. Bellen, Z. Jackiewicz and M. Zennaro, Stability analysis of one-step methods
   for neutral delay-differential equations, Numer. Math. 52, 605-619 (1988).
5. R. E. Bellman and K. L. Cooke, Differential difference equations, Academic
   Press, 1963.
6. T. A. Bickart, P-stable and P[α, β]-stable integration/interpolation methods
   in the solution of retarded differential-difference equations, BIT 22, 464-476
   (1982).
7. J. C. Butcher, The numerical analysis of ordinary differential equations, John
   Wiley & Sons, 1987.
8. L. E. El'sgol'ts and S. B. Norkin, Introduction to the theory and application of
   differential equations with deviating arguments, Academic Press, 1973.
9. K. J. in 't Hout, The stability of θ-methods for systems of delay differential
   equations, Annals of Numerical Mathematics, Vol. 1, 323-334 (1994).
10. K. J. in 't Hout and M. N. Spijker, Stability analysis of numerical methods for
    delay differential equations, Numer. Math. 59, 807-814 (1991).
11. Z. Jackiewicz, Asymptotic stability analysis of θ-methods for functional
    differential equations, Numer. Math. 43, 389-396 (1984).
12. T. Koto, A stability property of A-stable natural Runge-Kutta methods for
    systems of delay differential equations, BIT 34, 262-267 (1994).
13. J. X. Kuang, J. X. Xiang and H. J. Tian, The asymptotic stability of
    one-parameter methods for neutral differential equations, BIT 34, 400-408 (1994).
14. M. Z. Liu and M. N. Spijker, The stability of the θ-methods in the numerical
    solution of delay differential equations, IMA J. Numer. Anal. 10, 31-48 (1990).
15. Y. Obara, P-stability regions for some classical Runge-Kutta methods,
    Graduation thesis, the University of Electro-Communications, 1995 (in Japanese).
16. K. Strehmel, R. Weiner and H. Claus, Stability analysis of linearly implicit
    one-step interpolation methods for stiff retarded differential equations, SIAM
    J. Numer. Anal. 26, 1158-1174 (1989).
17. D. S. Watanabe and M. G. Roth, The stability of difference formulas for delay
    differential equations, SIAM J. Numer. Anal. 22, 132-145 (1985).
18. K. Yoshida, Functional analysis, sixth edition, Springer-Verlag, 1980.
19. M. Zennaro, Natural continuous extensions of Runge-Kutta methods, Math.
    Comput. 46, 119-133 (1986).
20. M. Zennaro, P-stability properties of Runge-Kutta methods for delay
    differential equations, Numer. Math. 49, 305-318 (1986).
21. M. Zennaro, Natural Runge-Kutta and projection methods, Numer. Math. 53,
    423-438 (1988).

Appendix A
We here show that the condition (B) implies the asymptotic stability of the zero
solution of (16). By differentiating the second equation in (16)' we obtain

~L u'(t)
0 + M u'(t -r)
0 = l u{t)
x + Mm{t - % T (25)

where
t _ ( Ii, 0 \ • _ ( o 0
i o
" [ Im 1« J' M
° ~ ( Mi, M 22

Since $\det[\tilde L_0] \neq 0$ by the condition $\det[L_{22}] \neq 0$, we can represent the solution $u(t)$
using the Laplace transform [5] in the form
1 n+" 1
«(*) = ^ exp(tz)H(z)- g(z)dz, (26)

where
/f(z) = ZLQ-U +exp(-Tz)(zM 0 - Mi),
$(z) = LovKO) + MOV>(-T) + exp(- z) y T exp(-tz)(M! - z M ) ^ ( f )dt, 0

and 7 is a sufficiently large real number.


On the other hand, since

fit ,= ( ~ +
®flP{****)*fii] —1-6*3 + e x p ( - T 3 ) M , ] 2
0 1
' \_ -[I +exp(-T )M ]
2 1 2 2 1 -[Iaa+exp(-T*)Af„)
we get
dettf(z) = ( - ^ P ( z , e x p ( - T ) ) . 3

The condition (B) implies that $H(z)^{-1}$ is holomorphic in a neighborhood of
$\{z \in C : \Re z \ge -\delta\} \setminus \{0\}$. Hence, shifting the contour in (26) and applying the residue
theorem, we obtain

U { t ) = i
Wi Lum ^M^)H(z)- g(z)dz + uo(t), (28)

«o(0 = Res (exp(rz)ff(z)-' (z), o) . s



Since the initial function ip(t) satisfies (17),

T
L V(Q)
O + M <P(-T)
0 = [VI(Q),Q] ,

and hence
H(z)-'(L p(<>)
ot +M 0 V ( - r ) ) = i/o^rV^Oj.Of
by (27). Further,

Consequently, $\exp(tz)H(z)^{-1}g(z)$ is holomorphic near $z = 0$, and hence $u_0(t) = 0$.
Moreover, we can show that the first term on the right-hand side in (28) decreases
exponentially by the standard argument [5].

An Interval Method of Proving Existence of Solutions for
Nonlinear Boundary Value Problems
Shin'ichi OISHI
Department of Information and Computer Sciences, Waseda University, Shinjuku-ku
Tokyo, 169, Japan
E-mail: oishi@oishi.info.waseda.ac.jp
ABSTRACT
A method of computer assisted existence proof is discussed for solutions of non-
linear boundary value problems. In 1966, Urabe presented a convergence
theorem for a certain simplified Newton method. Urabe's theorem is essentially
based on Banach's contraction mapping theorem. In this paper, a reformulation
of Urabe's theory using interval analysis is presented. It is shown that a
sharp error estimation can be obtained by this reformulation.

1. Numerical Existence Theorem for Nonlinear Boundary Value Problems

In this paper, we are concerned with the following nonlinear boundary value prob-
lem of a system of first order real differential equations:

$$\frac{dx}{dt} = f(x,t), \quad t \in J = [-1,1], \qquad g(x) = 0, \qquad (1)$$

where $x$ and $f(x,t)$ are n-dimensional vector valued functions and $g$ is an n-dimensional
vector valued functional. For example, let $-1 = t_0 < t_1 < t_2 < \cdots < t_{N-1} < t_N = 1$,
let $S_i$ $(i = 0,1,2,\ldots,N)$ be an $n \times n$ matrix, and let $b$ be an n-dimensional constant vector.
Then we have

$$g(x) = \sum_{i=0}^{N} S_i\,x(t_i) - b,$$

which is a multi-point boundary value problem. In particular, if $g(x) = x(-1) - b$,
the problem becomes an initial value problem. If $N = 1$, the problem becomes a two
point boundary value problem. Moreover, if

$$g(x) = x(-1) - x(1),$$

then we have a periodic boundary value problem.


Usually, this type of nonlinear problem is hard to solve analytically. Thus it may
be solved by some numerical method to obtain an approximate solution $c(t)$. However,
a numerical solution obtained in this way does not by itself guarantee the existence of an
exact solution of the problem. Therefore, it is important to give a sufficient condition
under which the problem has an exact solution in a domain containing an approximate
solution $c(t)$ and to find a sharp error bound for $c(t)$.

This problem has been studied by many authors. Among them, in 1966, Urabe [2]
established an existence theorem for eq.(1) using the so-called "Urabe's theorem" [1],
a convergence theorem for a certain simplified Newton method. This
result has been applied to estimate the error of a numerical solution of eq.(1) by
Urabe himself [11], Shinohara [4], Shinohara and N. Yamamoto [5], Fujii [6], Shintani and Hayashi [7],
and Hayashi [8]. Moreover, T. Yamamoto [9] has developed this theory using the theory of
pseudometric spaces. As a result, the usefulness of Urabe's theory has been proved.
In this article, we shall present a further extension of T. Yamamoto's theory [9] from
the modern interval analytic point of view. Namely, in this paper, it will be shown
that a point-wise error estimate $|c(t) - x^*(t)|$ is possible between a given approximate
solution $c(t)$ and a true solution $x^*(t)$. In our argument, we will show that an infinite-
dimensional extension of the Krawczyk operator can be defined in association with a
Newton-like operator defined by Urabe. Then, using Caprani-Madsen-Rall's theory
of integration of interval functions [10], we will show that the range of that Newton-like
operator can be evaluated numerically.
Features of our method can be summarized as follows:
1. We assume only that c(t) is a continuous function of t. Thus our method can be
applied to approximate solutions obtained by a wide class of numerical methods.
For example, it is applicable to finite element solutions, approximate solutions
obtained by interpolating discrete approximate solutions generated through a
discrete variable method, and so on.

2. Our method calculates directly the image of Urabe's simplified Newton operator
applied to closed ball centered at the given approximate solution. Further it
does not use overestimated imbedding constants. Therefore it provides a sharp
error bound. Moreover, if desired it also provides a rough bound with less
computation.

3. Numerical verification of existence of solutions proceeds almost automatically
by only inputting the functions f(x,t), g(x) and c(t) into verification software.

4. By choosing a suitable simplified Newton operator, a rigorous mathematical
existence proof can be obtained by our verification procedure.

5. A verification result may be obtained with a computational effort which is propor-
tional to that needed to obtain an approximate solution c(t).

6. Iterative refinement of solutions is available.

2. Theory
In the following, we assume that an approximate solution c(t) is given for the
problem (1). We also assume that it is a continuous function but not necessarily
a smooth function. This assumption reflects the fact that approximate solutions
obtained by numerical methods are usually continuous functions but not necessarily
smooth functions. For example, discrete numerical solutions extended by means of interpolation
may not be smooth functions. Under this assumption, we will present a sufficient
condition under which the problem has an exact solution in a domain containing an
approximate solution c(t). We will show that our method also provides a method of
obtaining a sharp error bound for c(t).
Let $X = C[-1,1;\mathbf{R}^n]$ be the Banach space of real n-dimensional vector valued
functions $x(t) = (x_1(t), x_2(t), \ldots, x_n(t))$ continuous on the interval $J = [-1,1]$ with
the scaled maximum norm
$$\|x\|_u = \max_{t \in J} |x(t)|_u, \qquad (2)$$

where
$$|x|_u = \max_{1 \le i \le n} \frac{|x_i|}{u_i}. \qquad (3)$$

Here, $u = (u_1, u_2, \ldots, u_n)$ is a constant n-dimensional vector with positive elements,
$u_i > 0$ for $i = 1,2,\ldots,n$. Let $Y = X \times \mathbf{R}^n$ be a Banach space with the norm

$$\|y\|_Y = \max(\|x\|_u, \|e\|) \quad \text{for } y = (x,e) \in Y.$$

Let $D = C^1[-1,1;\mathbf{R}^n]$ be the Banach space of real continuously differentiable n-
dimensional vector valued functions $x(t) = (x_1(t), x_2(t), \ldots, x_n(t))$. In the following,
vectors and matrices mean n-dimensional vectors and $n \times n$ matrices, respectively.


We assume that the given approximate solution $c(t)$ is an element of $X$. We now
define an operator $F : D \subset X \to Y$ by

$$Fx = \left(\frac{dx}{dt} - f(x,t),\ g(x)\right). \qquad (4)$$

Then we can rewrite the original problem as the following operator equation:

$$Fx = 0. \qquad (5)$$

In the following, we assume that $f : X \to X$ and $g : X \to \mathbf{R}^n$ are continuously
Fréchet differentiable with respect to $x$. The Jacobi matrix of $f$ with respect to $x$ is
denoted by $f_x(x,t)$, and the Fréchet derivative of $g$ is denoted by $g'(x)$. Then it is
easy to see that for an element $x$ of $D$, $F : X \to Y$ is Fréchet differentiable and the
Fréchet derivative $DF(x) : D \to Y$ is given by

$$DF(x)h = \left(\frac{dh}{dt} - f_x(x,t)h,\ g'(x)h\right). \qquad (6)$$

Here $h \in D$.
For a real matrix function $A(t)$ continuous on $J$ and for a vector valued continuous
linear functional $l$, which approximate $f_x(c,t)$ and $g'(c)$, respectively, we define the
following linear operator

$$Lh = \left(\frac{dh}{dt} - A(t)h,\ lh\right). \qquad (7)$$

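Let $\Phi(t)$ be a fundamental matrix of the linear homogeneous differential system

$$\frac{dz}{dt} = A(t)z \qquad (8)$$

satisfying
$$\Phi(-1) = I. \qquad (9)$$
Let $\tilde\Phi(t) \in C^1[-1,1;M]$ be an approximation of $\Phi(t)$ satisfying
$$\tilde\Phi(-1) = I. \qquad (10)$$

Assuming that $\tilde\Phi(t)$ is invertible for all $t \in J$, put

$$\tilde A(t) = \frac{d\tilde\Phi(t)}{dt}\,\tilde\Phi^{-1}(t). \qquad (11)$$
Then the following relations hold:

$$\frac{d\tilde\Phi(t)}{dt} = \tilde A(t)\tilde\Phi(t) \qquad (12)$$
and
$$\tilde\Phi(-1) = I. \qquad (13)$$
This means that $\tilde\Phi(t)$ is the exact fundamental matrix of the following linear system:

$$\frac{dz}{dt} = \tilde A(t)z. \qquad (14)$$

We now define the following operator:

$$\tilde L h = \left(\frac{dh}{dt} - \tilde A(t)h,\ lh\right) \quad \text{for } h \in D. \qquad (15)$$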

The following lemma has been proved by Urabe:

Lemma 1 Let $\Phi(t)$ be a fundamental matrix of the linear homogeneous differential
system

$$\frac{dz}{dt} = A(t)z \qquad (16)$$

satisfying
$$\Phi(-1) = I. \qquad (17)$$
Let $G = l[\Phi(t)]$ be the matrix whose column vectors are $l[\phi_i(t)]$, $i = 1,2,\ldots,n$, where
$\phi_i(t)$ are the column vectors of the matrix $\Phi(t)$. Then the matrix $G$ is nonsingular if
and only if the operator $L$ defined by eq.(7) has the linear inverse $L^{-1}$ and
$$L^{-1}(\phi, v) = H\phi + Sv, \qquad (18)$$
where $\phi \in X$, $v \in \mathbf{R}^n$, $H$ is the linear operator from $X$ into $D \subset X$ such that

$$H\phi = \Phi(t)\int_{-1}^{t}\Phi^{-1}(s)\phi(s)\,ds - \Phi(t)G^{-1}\,l\!\left[\Phi(t)\int_{-1}^{t}\Phi^{-1}(s)\phi(s)\,ds\right] \qquad (19)$$

and $S$ is the linear operator from $\mathbf{R}^n$ into $D$ such that $Sv = \Phi(t)G^{-1}v$.
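As a quick sanity check of Lemma 1, consider the initial value functional $l[h] = h(-1)$; this special case is chosen here only for illustration and is not worked out in the text. Since $\Phi(-1) = I$, we get $G = l[\Phi(t)] = I$, and the integral in (19) vanishes at $t = -1$, so the second term of $H\phi$ drops out:

$$L^{-1}(\phi, v) = \Phi(t)\,v + \Phi(t)\int_{-1}^{t}\Phi^{-1}(s)\phi(s)\,ds .$$

This is the familiar variation of constants formula for $dh/dt = A(t)h + \phi(t)$, $h(-1) = v$, which suggests reading $H$ as the particular solution with homogeneous boundary data and $S$ as the correction that matches the boundary value $v$.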
We assume now that $G$ is invertible. We consider a Newton-like operator $k : X \to X$
defined by
$$k(x) = L^{-1}(L - F)x = L^{-1}\bigl(f(x,t) - A(t)x,\ l(x) - g(x)\bigr). \qquad (20)$$

It should be noted that the second expression in this equation implies that $k$ can be defined
on $X$. It will be seen in the next section that if $x^* \in X$ is a fixed point of $k$, then it
belongs to $D$ and satisfies $Fx^* = 0$.
In order to show a sufficient condition under which the operator $k$ has a fixed
point in a domain containing an approximate solution $c(t)$, and to find a sharp error
bound for $c(t)$, we will introduce the infinite-dimensional Krawczyk operator. For this
purpose, we here review briefly the theory of interval functions. In ordinary interval
analysis, the term interval refers to closed intervals of real numbers,

$$X = [a,b] = \{x \mid a \le x \le b\}.$$

An interval function $Y(t)$ on the interval $J$ is defined by

$$Y(t) = [\underline{y}(t), \overline{y}(t)]. \qquad (21)$$

The real functions $\underline{y}(t)$ and $\overline{y}(t)$ are called the endpoint functions. In this paper, we
assume that the endpoint functions are elements of $C[-1,1;\mathbf{R}]$ and consider an inter-
val function to be the set of real functions $y$ in $C[-1,1;\mathbf{R}^n]$ such that $\underline{y}(t) \le y(t) \le \overline{y}(t)$
in the natural partial ordering of functions. The addition, subtraction, multiplication
and division between interval functions are defined point-wise. Moreover, a theory for
the interval integral of an interval function has been developed by Caprani, Madsen
and Rall [10]. It is defined by

$$\int_J Y(s)\,ds = \left[\,\underline{\int_J}\,\underline{y}(s)\,ds,\ \overline{\int_J}\,\overline{y}(s)\,ds\right], \qquad (22)$$
where $\underline{\int}$ denotes the lower Darboux integral and $\overline{\int}$ denotes the upper Darboux integral.
Here, the lower Darboux integral is the supremum of integrals

$$\int y_l(s)\,ds,$$

where $y_l$ is any step function satisfying $y_l(t) \le \underline{y}(t)$. Similarly, the upper Darboux
integral is the infimum of integrals

$$\int y_u(s)\,ds,$$

where $y_u$ is any step function satisfying $y_u(t) \ge \overline{y}(t)$. In the following, we adopt their
definition of the interval integration.
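The Darboux construction above can be sketched numerically. The following Python fragment is only an illustration under stated assumptions: both endpoint functions are taken to be nondecreasing on the integration interval (so the bounding step functions are easy to write down), the function name `interval_integral_enclosure` is hypothetical, and ordinary floating-point arithmetic is used without outward rounding, so the result is not a rigorous enclosure.

```python
def interval_integral_enclosure(lower, upper, a, b, cells=100):
    """Approximate the interval integral of Y(s) = [lower(s), upper(s)] on [a, b].

    Assumption (illustrative only): lower and upper are nondecreasing on [a, b],
    so on each cell the minimum of `lower` sits at the left end and the maximum
    of `upper` sits at the right end.
    """
    width = (b - a) / cells
    lo_sum = 0.0
    hi_sum = 0.0
    for i in range(cells):
        left = a + i * width
        right = left + width
        lo_sum += lower(left) * width    # step function lying below lower(s)
        hi_sum += upper(right) * width   # step function lying above upper(s)
    return lo_sum, hi_sum


# Example: Y(s) = [s^2, s^2 + 0.1] on [0, 1].
coarse = interval_integral_enclosure(lambda s: s * s, lambda s: s * s + 0.1, 0.0, 1.0, 100)
fine = interval_integral_enclosure(lambda s: s * s, lambda s: s * s + 0.1, 0.0, 1.0, 1000)
```

Refining the subdivision moves the lower sum up and the upper sum down, which mirrors the supremum and infimum in the definition.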
If $Y(t)$ is a vector- or matrix-valued interval function, then $|Y(t)|$ is defined as a
vector- or matrix-valued function with elements $|Y_i(t)|$ or $|Y_{ij}(t)|$, respectively. Here,
for an interval $[a,b]$, $|[a,b]|$ is defined by

$$|[a,b]| = \max(|a|, |b|).$$

Moreover, the Mid function is defined by

$$\mathrm{Mid}(Y(t)) = \frac{\underline{y}(t) + \overline{y}(t)}{2}. \qquad (23)$$

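Let $T(t)$ be an interval function with $\mathrm{Mid}(T(t)) = c(t)$. We now introduce the
following infinite-dimensional Krawczyk operator:

$$K(T) = k(c) + M(T - c), \qquad (24)$$

where
$$M = L^{-1}(L - DF(T)) \quad \text{and} \quad c = \mathrm{Mid}(T). \qquad (25)$$
More concretely, we have

$$\begin{aligned}
M(T(t) - c) ={}& \Phi(t)\int_{-1}^{t}\Phi^{-1}(s)R(s)(T(s) - c(s))\,ds \\
& - \Phi(t)G^{-1}\,l\!\left[\Phi(t)\int_{-1}^{t}\Phi^{-1}(s)R(s)(T(s) - c(s))\,ds\right] \\
& + \Phi(t)G^{-1}\bigl(l - g'(T(t))\bigr)(T(t) - c(t)),
\end{aligned} \qquad (26)$$

where
$$R(t) = f_x(T(t),t) - A(t). \qquad (27)$$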
Then we have the following theorem:

Theorem 1 Let $T(t) \subset X$ be a bounded interval function with $\mathrm{Mid}(T(t)) = c(t)$. If

$$K(T(t)) \subset T(t) \qquad (28)$$

and if
$$\|M\|_u < 1, \qquad (29)$$
there is a unique fixed point $x^*$ of $k$ in $T(t) \subset X$. Moreover, $x^*$ belongs to $D$ and
satisfies $Fx^* = 0$. Furthermore, it is isolated, i.e., $DF(x^*)$ is invertible.
Remark: The condition (28) is satisfied automatically if $K(T(t))$ is a proper subset
of $T(t)$. Usually, the condition (28) is checked by transforming it into the following
equivalent form:
$$K(T(t)) - c(t) \subset T(t) - c(t). \qquad (30)$$

We will prove this theorem in the next section.
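To give a feeling for the two conditions of Theorem 1, here is a finite-dimensional (in fact scalar) analogue, not the operator of the paper: the classical Krawczyk test for the equation $x^2 - 2 = 0$. The choice of equation, the midpoint 1.4, the radius 0.1 and all function names are assumptions made for illustration, and plain floating-point arithmetic is used, so this is a sketch of the logic rather than a rigorous verification.

```python
import math


def imul(a, b):
    """Product of two intervals given as (lo, hi) tuples."""
    products = [a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1]]
    return (min(products), max(products))


def f(x):
    return x * x - 2.0


def krawczyk(c, r):
    """Scalar Krawczyk operator K(T) = c - y f(c) + (1 - y f'(T)) (T - c), c > 0."""
    T = (c - r, c + r)
    y = 1.0 / (2.0 * c)                        # approximate inverse of f'(c) = 2c
    dlo, dhi = 2.0 * T[0], 2.0 * T[1]          # range of f'(x) = 2x over T
    m = (1.0 - y * dhi, 1.0 - y * dlo)         # interval 1 - y f'(T), valid for y > 0
    spread = imul(m, (-r, r))                  # (1 - y f'(T)) (T - c)
    centre = c - y * f(c)                      # simplified Newton step from c
    K = (centre + spread[0], centre + spread[1])
    contraction = max(abs(m[0]), abs(m[1]))    # plays the role of ||M||
    return T, K, contraction


T, K, contraction = krawczyk(1.4, 0.1)
```

With these values the image K lies strictly inside T and the contraction factor is well below 1, which is the scalar counterpart of conditions (28) and (29); the root $\sqrt{2}$ lies in K.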
In the rest of this section, we shall derive a concrete form of $M(T - c)$. The
following lemma is the key to the rewriting:
Lemma 2 Let $A$ be an $m \times n$ interval matrix and $B$ an $n \times p$ interval matrix. If $\mathrm{Mid}(B) = 0$,
the following relation holds:

$$AB = |A|B = [-|A||B|,\ |A||B|] = [-1,1]\,|A||B|.$$
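Lemma 2 is easy to check in the scalar ($1 \times 1$) case. The sketch below uses hypothetical helper names and compares the ordinary interval product with the symmetric interval $[-|A||B|, |A||B|]$ for intervals $B$ centred at zero; the matrix case follows entrywise, because a sum of zero-centred intervals is again zero-centred.

```python
def imul(a, b):
    """Ordinary interval product of (lo, hi) tuples."""
    products = [a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1]]
    return (min(products), max(products))


def mag(a):
    """|[a, b]| = max(|a|, |b|)."""
    return max(abs(a[0]), abs(a[1]))


def mid(a):
    return 0.5 * (a[0] + a[1])


def symmetric_product(a, b):
    """Right-hand side of Lemma 2 for scalar intervals with Mid(b) = 0."""
    if mid(b) != 0.0:
        raise ValueError("Lemma 2 requires Mid(B) = 0")
    bound = mag(a) * mag(b)
    return (-bound, bound)


example_1 = (imul((-1.0, 3.0), (-2.0, 2.0)), symmetric_product((-1.0, 3.0), (-2.0, 2.0)))
example_2 = (imul((2.0, 5.0), (-0.5, 0.5)), symmetric_product((2.0, 5.0), (-0.5, 0.5)))
```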
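We here note that $\mathrm{Mid}(T - c) = 0$. Then, from this lemma, we have for any interval
function $T(t)$
$$\begin{aligned}
M(T(t) - c) ={}& [-1,1]\,|\Phi(t)|\int_{-1}^{t}|\Phi^{-1}(s)||R(s)||T(s) - c(s)|\,ds \\
& - \Phi(t)G^{-1}\,l\!\left[[-1,1]\,|\Phi(t)|\int_{-1}^{t}|\Phi^{-1}(s)||R(s)||T(s) - c(s)|\,ds\right] \\
& + \Phi(t)G^{-1}\bigl(l - Dg(T)\bigr)(T(t) - c(t)).
\end{aligned} \qquad (31)$$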


Remark: It should be noted that this form of $M$ is similar to the formula
for $K$, which is used in Yamamoto's paper as a fundamental quantity.
Moreover, it is also noted that if $c(t) \in X$ and if $\Phi(t) \in C^1[-1,1;M]$, then $M$
is well-defined. Since numerical solutions such as discrete solutions with interpolation
usually fall in this class, $M$ is well-defined for a wide class of numerical schemes.
Let $\Phi_I(t)$ be an approximation of $\Phi(t)^{-1}$. Then we have
$$|\Phi^{-1}(s)||R(s)| \le \bigl(|\Phi_I(s)| + |\Phi^{-1}(s) - \Phi_I(s)|\bigr)\bigl(|f_x(T(s),s) - A(s)| + |\tilde A(s) - A(s)|\bigr). \qquad (32)$$

This expression is often useful to reduce the overestimation originating from interval cal-
culation if one chooses $A(s)$ and $\Phi_I(t)$ suitably.
We now show how to calculate the operator $M$. We assume that $c(t)$ and $\Phi(t)$
are piecewise smooth functions whose derivatives are piecewise continuous.
Then we can choose a subdivision of the interval $[-1,1]$ as

$$[-1,1] = S_1 \cup S_2 \cup \cdots \cup S_\ell \qquad (33)$$

such that $c(t)$ and $\Phi(t)$ are smooth on each subinterval $S_i$. Here, $S_i \cap S_j = \emptyset$ if $i \ne j$.
In this case, for $t \in [t_j, t_{j+1}]$, we have

$$\Phi(t)\int_{-1}^{t}\Phi^{-1}(s)R(s)(T(s) - c(s))\,ds \subset \Phi(t)\Bigl\{\sum_{i<j}\Phi^{-1}(S_i)R(S_i)(T - c)(S_i)\,w(S_i) + \Phi^{-1}(S_j)R(S_j)(T - c)(S_j)\,|t - t_j|\Bigr\}, \qquad (34)$$

where $w(S_i)$ is the width of the interval $S_i$.

3. Proof of Theorem

In this section, we will prove Theorem 1 presented in the previous section. Let
$T(t)$ be a bounded interval function with $\mathrm{Mid}(T(t)) = c(t)$. We assume that the
following conditions are satisfied:

$$K(T(t)) \subset T(t) \qquad (35)$$

and
$$\|M\|_u < 1. \qquad (36)$$
We first show that $k : X \to X$ is contractive on $T$, and $k(T) \subset T$.
In the first place, we show that $k(T) \subset T$. It is noted first that the set $M(T - c)$ is
convex and closed in $X$. Moreover, we note that the Fréchet derivative $Dk(x) : X \to X$
of $k : X \to X$ is given by

$$Dk(x) = L^{-1}(L - DF(x)) = L^{-1}(f_x(x,t) - A(t)).$$

Then we have for $x \in T$

$$k(x) = k(c) + \int_0^1 Dk(sx + (1-s)c)(x - c)\,ds \in k(c) + M(T - c) \subset T.$$

Here, we have used the following property:

$$\int_0^1 Dk(sx + (1-s)c)(x - c)\,ds \in \overline{\mathrm{co}}\,\{Dk(sx + (1-s)c)(x - c) \mid 0 \le s \le 1\}.$$

Here $\overline{\mathrm{co}}\,S$ means the closure of the convex hull of a set $S$. This means that $k(T) \subset T$.
Clearly, if $x \in T$,
$$Dk(x) \in M.$$
Thus from the condition (36), we have
$$\|Dk(x)\|_u < 1 \quad \text{for all } x \in T. \qquad (37)$$

Thus we have shown that $k : X \to X$ is contractive on $T$.


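Therefore, from the contraction mapping theorem, it follows that there exists a
fixed point $x^*$ of the operator $k$ in $T(t) \subset X$.
Since $x^*$ is a fixed point of $k$, we have

$$\begin{aligned}
x^*(t) ={}& \Phi(t)\int_{-1}^{t}\Phi^{-1}(s)\bigl(f(x^*(s),s) - A(s)x^*(s)\bigr)\,ds \\
& - \Phi(t)G^{-1}\,l\!\left[\Phi(t)\int_{-1}^{t}\Phi^{-1}(s)\bigl(f(x^*(s),s) - A(s)x^*(s)\bigr)\,ds\right] \\
& + \Phi(t)G^{-1}\bigl(l(x^*) - g(x^*)\bigr).
\end{aligned}$$

Then we have
$$\begin{aligned}
\frac{dx^*(t)}{dt} &= A(t)x^*(t) + \Phi(t)\Phi^{-1}(t)\bigl(f(x^*(t),t) - A(t)x^*(t)\bigr) \\
&= f(x^*(t),t)
\end{aligned} \qquad (38)$$
and

$$\begin{aligned}
l(x^*) ={}& l\!\left[\Phi(t)\int_{-1}^{t}\Phi^{-1}(s)\bigl(f(x^*(s),s) - A(s)x^*(s)\bigr)\,ds\right] \\
& - l[\Phi(t)]\,G^{-1}\,l\!\left[\Phi(t)\int_{-1}^{t}\Phi^{-1}(s)\bigl(f(x^*(s),s) - A(s)x^*(s)\bigr)\,ds\right] \\
& + l[\Phi(t)]\,G^{-1}\bigl(l(x^*) - g(x^*)\bigr) \\
={}& l(x^*) - g(x^*).
\end{aligned} \qquad (39)$$

The equalities (38) and (39) say that $x^*$ is a solution of (1).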
Now we shall prove the uniqueness of the solution in $T$. Let $\hat x$ be another solution
of (1). Then

$$\frac{d\hat x}{dt} = f(\hat x(t),t) = A(t)\hat x(t) + \{f(\hat x(t),t) - A(t)\hat x(t)\}.$$

Therefore $\hat x(t)$ can be expressed as follows:
$$\hat x = L^{-1}\bigl(f(\hat x(t),t) - A(t)\hat x(t),\ l(\hat x) - g(\hat x)\bigr) = k(\hat x).$$
Thus it is seen that $\hat x$ is a fixed point of $k$. Since $k$ has a unique fixed point in $T$, it
follows that $x^* = \hat x$.
Lastly, let us prove that $x^*$ is isolated. Let $\Phi^*(t)$ be the fundamental matrix of
the linear homogeneous system

$$\frac{dy}{dt} = f_x(x^*(t),t)\,y \qquad (40)$$
such that $\Phi^*(0) = I$. Put
$$G^* = l[\Phi^*(t)]. \qquad (41)$$
Suppose that $G^*$ is singular. Then there is a non-trivial constant vector $v$ such that
$G^*v = 0$. Put
$$y(t) = \Phi^*(t)v, \qquad (42)$$
then $y(t)$ satisfies

$$l[y] = 0.$$

Thus from Urabe's lemma, we have

$$y(t) = L^{-1}\bigl((f_x(x^*(t),t) - A(t))y,\ 0\bigr).$$

Since clearly the right hand side of this equation belongs to $M$, we have

$$\|y\|_u \le \kappa\|y\|_u \qquad (43)$$

for some $\kappa < 1$. This implies $y = 0$. Then, since $\Phi^*(t)$ is nonsingular on $J$, $v = 0$
follows. This is a contradiction. Hence we see that $x^*$ is isolated. Q.E.D.

4. Error Estimation

In this section, we shall discuss how to choose $T(t)$ provided that an approximate
solution $c(t)$ is given. We assume that $c(t)$ is in $C[-1,1;\mathbf{R}^n]$.
Algorithm:
1. Choose an approximation $A(t)$ of $f_x(c(t),t)$.

2. Calculate an approximate fundamental matrix $\Phi(t)$ of the linear homogeneous
equation

$$\frac{dz}{dt} = A(t)z$$
satisfying $\Phi(-1) = I$.

3. Calculate the coefficient matrix defined by (11).

4. Choose an approximation $l$ of $g'(c)$ and define the operator $L$ by (15).

5. Calculate $l[\Phi(t)]$. If this matrix is singular, then failure. If nonsingular, calculate
its inverse.

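6. Calculate an interval inclusion of $L^{-1}F(c)$ by
$$L^{-1}F(c) = -\bigl(I - \Phi(t)G^{-1}l\bigr)\Phi(t)\int_{-1}^{t}\Phi^{-1}(s)\bigl(f(c(s),s) - A(s)c(s)\bigr)\,ds - \Phi(t)G^{-1}\bigl(l(c) - g(c)\bigr) + c(t)$$

or if $c$ is continuously differentiable, by

$$L^{-1}F(c) = \bigl(I - \Phi(t)G^{-1}l\bigr)\Phi(t)\int_{-1}^{t}\Phi^{-1}(s)\left(\frac{dc(s)}{ds} - f(c(s),s)\right)ds + \Phi(t)G^{-1}g(c).$$

Let $S(t)$ be a calculated interval function which is an inclusion of $L^{-1}F(c)$.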


7. Let $\beta$ be a constant greater than 1. Put
$$u_i = \max_{-1 \le t \le 1}|S_i(t)|. \qquad (44)$$

Then put
$$T(t) - c(t) = \beta[-u,u]. \qquad (45)$$

8. Put $M = L^{-1}(f_x(T(t),t) - A(t))$ and check the conditions

$$-L^{-1}F(c) + M(T - c) \subset T - c,$$

$$\|M\|_u < 1.$$

If these conditions hold, it follows that there is an exact solution of eq.(1) in
$T(t)$.

Now we assume that a sequence of approximate solutions $c_n$ can be calculated
such that $c_n \to x^*$ as $n \to \infty$. Here, $x^*$ is assumed to be an exact and isolated
solution of eq.(1). We shall show that if the Algorithm is applied to $c_n$, then for some finite $n$ this
Algorithm will generate an interval function $T(t)$ satisfying the conditions
$$-L^{-1}F(c) + M(T - c) \subset T - c, \qquad \|M\|_u < 1.$$
Let $S_n$ be inclusions of $L_n^{-1}F(c_n)$ such that

$$\max_{t \in J}|S_n(t) - L_n^{-1}F(c_n)(t)| \to 0 \qquad (46)$$

as $n \to \infty$. Here, $L_n$ is the operator $L$ which is obtained by applying the Algorithm to
$c_n$. Let $M_n$ and $T_n$ be the $M$ and $T$ associated with $S_n$. It is noted that $\mathrm{Mid}(T_n - c_n) = 0$.
Then, to show $-L_n^{-1}F(c_n) + M_n(T_n - c_n) \subset T_n - c_n$, it is sufficient to show

$$|S_n(t)| + |M_n(T_n(t) - c_n(t))| \le \beta u_n. \qquad (47)$$

We note here that $|S_n(t)| \le u_n$ and $\|M_n\|_{u_n} = \|\,|M_n[-u_n,u_n]|\,\|_{u_n}$. Thus, the problem is reduced
to showing
$$u_n + \|M_n\|_{u_n}\,\beta u_n < \beta u_n \qquad (48)$$
or
$$1 + \|M_n\|_{u_n}\,\beta < \beta. \qquad (49)$$
From this it follows that $\beta > 1$ is necessary. In the following we assume $\beta = 2$. Thus
the problem is to show $\|M_n\|_{u_n} < 1/2$ for sufficiently large $n$.
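For the reader's convenience, the role of the inflation factor can be made explicit by solving (49) for the norm of $M_n$; this small rearrangement is added here as a worked step and uses only (49):

$$1 + \|M_n\|_{u_n}\,\beta < \beta \iff \|M_n\|_{u_n} < \frac{\beta - 1}{\beta} = 1 - \frac{1}{\beta}.$$

With $\beta = 2$ the bound is $1/2$, and as $\beta$ grows the admissible contraction factor approaches $1$, at the price of a wider candidate set $T - c = \beta[-u, u]$ on which the derivative has to be enclosed.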

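Now we assume that $\|L_n - DF(x^*)\|_{u_n} \to 0$ as $n \to \infty$. Under this condi-
tion, we shall prove $\|M_n\|_{u_n} \to 0$ as $n \to \infty$. For sufficiently large $n$ such that
$\|DF(x^*)^{-1}\|_{u_n}\|L_n - DF(x^*)\|_{u_n} < 1$, we have

$$\|L_n^{-1}\|_{u_n} \le \frac{\|DF(x^*)^{-1}\|_{u_n}}{1 - \|DF(x^*)^{-1}\|_{u_n}\|L_n - DF(x^*)\|_{u_n}} \le \gamma,$$

where $\gamma$ is a constant.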
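On the other hand, we have

$$\begin{aligned}
\|T_n - x^*\|_{u_n} &\le \|T_n - c_n\|_{u_n} + \|c_n - x^*\|_{u_n} \\
&\le \beta\bigl(\|L_n^{-1}F(c_n)\|_{u_n} + \|S_n - L_n^{-1}F(c_n)\|_{u_n}\bigr) + \|c_n - x^*\|_{u_n} \\
&\to 0.
\end{aligned}$$

From this, it follows that

$$\|DF(T_n) - DF(x^*)\|_{u_n} \to 0. \qquad (50)$$

From the above mentioned discussions, we have

$$\begin{aligned}
\|M_n\|_{u_n} &= \|I - L_n^{-1}DF(T_n)\|_{u_n} \\
&\le \gamma\|L_n - DF(T_n)\|_{u_n} \\
&\le \gamma\bigl(\|L_n - DF(x^*)\|_{u_n} + \|DF(x^*) - DF(T_n)\|_{u_n}\bigr) \\
&\to 0.
\end{aligned}$$

This is the desired result.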

4-1. Numerical Example

In this section, we shall consider the following van der Pol's equation as an exam-
ple:
cfx 1 dx 1 2

= ( 1 l ) I ( 5 1 )
^ 4 - dT-i6 -
As boundary conditions, we impose the following two point boundary conditions:

i ( - l ) = 0, = 2. (52)

This example is due to Urabe. We rewrite this equation as a system of simultaneous first order equations.

Dividing the interval $[-1,1]$ into ten subintervals and interpolating an approximate
solution and a fundamental matrix using polynomials, we have verified the existence
of a solution. Using Eq.(34), we have derived an interval inclusion of $K$. To evaluate
$R(S)$ we use the third order Taylor expansion for the subinterval $S$ of $[-1,1]$. Under
the above mentioned conditions, by applying the algorithm mentioned in the previous
section, we have

$$T(t) = c(t) + \{x \in X \mid |x_1(t)| \le 0.0447,\ |x_2(t)| \le 0.0167\}. \qquad (53)$$



Figure 1: An Approximate Solution c(t)

Fig. 2 shows that K(T) - c is a proper subset of T - c. Thus, it turns out that there
exists a unique solution of the problem in K(T).

5. References

1. M. Urabe: "Galerkin's Procedure for Nonlinear Periodic Systems", Arch. Rational
Mech. Anal., 20 (1965) pp.120-152.
2. M. Urabe: "An Existence Theorem for Multi-Point Boundary Value Problems",
Funkcialaj Ekvacioj, 9 (1966) pp.43-60.
3. M. Urabe: "The Newton Method and Its Application to Boundary Value Prob-
lems with Nonlinear Boundary Conditions", Proc. US-Japan Seminar on
Differential and Functional Equations, Benjamin, New York (1967) pp.383-410.
4. Y. Shinohara: "Numerical Analysis of Periodic Solutions and Their Periods
to Autonomous Differential Systems", Journal Math. Tokushima Univ., 11
(1977) pp.11-32.
5. Y. Shinohara and N. Yamamoto: "Galerkin Approximations of Periodic Solution
and Its Periods to van der Pol Equation", Journal Math. Tokushima Univ.,
12 (1978) pp.19-42.
6. M. Fujii: "An a posteriori error estimation of the numerical solution by step-by-
step methods for system of ordinary differential equations", Bull. Fukuoka
Univ. Ed., 23 (1973) pp.35-44.
7. H. Shintani and Y. Hayashi: "A posteriori error estimates and iterative meth-
ods in the numerical solution of systems of ordinary differential equations",
Hiroshima Math. J., 8 (1978) pp.101-121.
8. Y. Hayashi: "On a posteriori error estimation in the numerical solution of system
of ordinary differential equations", Hiroshima Math. J., 9 (1979) pp.201-243.
9. T. Yamamoto: "An Existence Theorem of Solution to Boundary Value Problems
and Its Application to Error Estimates", Math. Japonica, 27 (1982) pp.301-318.
10. O. Caprani, K. Madsen and L. B. Rall: "Integration of Interval Functions", SIAM
J. Math. Anal., 12 (1981) pp.321-341.
11. M. Urabe: "Numerical Solution of Multi-Point Boundary Value Problems in
Chebyshev Series. Theory of the Method", Numerische Mathematik, 9 (1967)
pp.341-366.
Figure 2: Interval Inclusion of K(T) - c. (a) First Component of K(T) - c. (b) Second Component of K(T) - c.

EXPERIMENTAL STUDIES ON GUARANTEED-ACCURACY
SOLUTIONS OF THE INITIAL-VALUE PROBLEM OF
NONLINEAR ORDINARY DIFFERENTIAL EQUATIONS
Masao IRI
Department of Information and System Engineering,
Faculty of Science and Engineering, Chuo University,
1-13-27 Kasuga, Bunkyo-ku, Tokyo 112, Japan
E-mail: iri@ise.chuo-u.ac.jp

and
Jiro AMEMIYA
Research Lab. II, Communication and Information Systems Research Laboratories,
Research and Development Center, Toshiba Co.
1 Komukai-toshiba-machi, Saiwai-ku, Kawasaki-city 210, Japan
E-mail: amemiya@isl.rdc.toshiba.co.jp

ABSTRACT

The development of software technology of automatic differentiation and interval
calculus has enabled us to numerically construct an interval solution for the
initial-value problem of ordinary differential equations. In this paper, we will
report how we applied this method to several problems as a practical technique
and what observations we got.

1. Introduction

The possibility, as well as concrete methods, of numerically constructing an inter-
val solution for the initial-value problem of ordinary differential equations by means
of the interval version of the Taylor-series expansion has been known since 1958 (or
1956) at the latest (see also [1, 5]), but it is only more or less recently that
many people in engineering fields have recognized that such approaches are not only
theoretically possible but also practically feasible and important. It is certainly a
progress in the world of numerical computation. Needless to say, the progress owes
much to the growth of hardware powers of computers in speed and in memory size,
but as much, or even more, to the development of the software technology of
automatic differentiation [6, 7, 12].
If a theoretical method is to become a practical one, we must assess the method
from the engineering point of view, e.g., by investigating how to practically tune a
theoretically "arbitrary" parameter or parameters in view of the total cost/efficiency
of the method and how much the tuned method will cost in time and in memory
space in practical situations. With such a viewpoint in mind, we tried preliminary
computational experiments. This paper is based on part of the results obtained by
the second author during his research conducted for the master's degree under the
first author's supervision [1].

2. Problem and Solution Algorithms


The problem we treat is to construct the intervals (or boxes) $[y_n] \subset \mathbf{R}^d$ as a
function of (some appropriately chosen sequence of discrete values $t_n$ of) the indepen-
dent real variable $t \in [t_0, T]$ which rigorously include the solution $y(t_n) \in \mathbf{R}^d$ of the
initial-value problem:

$$\frac{dy}{dt} = f(y), \qquad y(t_0) = y_0, \qquad (1)$$

where the initial value $y_0 \in \mathbf{R}^d$ and the right-hand side function $f : \mathbf{R}^d \to \mathbf{R}^d$ are
given and $f$ is assumed to be well-behaved.


The Taylor series expansion of the exact solution of the above-described problem
is as follows [1]:

$$y(t_{n+1}) = y(t_n) + \sum_{k=1}^{p_n}\frac{h_n^k}{k!}\,y^{(k)}(t_n) + z_{n+1}, \qquad (2)$$

where

$$z_{n+1} = \frac{h_n^{p_n+1}}{(p_n+1)!}\,y^{(p_n+1)}(t_n + \theta_n h_n) \qquad (3)$$
and

$$h_n = t_{n+1} - t_n \quad \text{and} \quad \theta_n \in (0,1). \qquad (4)$$
The local truncation error $z_{n+1}$ cannot be known in general, but, if an interval $Y_n$ ($\subset \mathbf{R}^d$)
in which $\{y(t) \mid t \in [t_n, t_{n+1}]\}$ is included is known in some way or other, an inter-
val $[z_{n+1}]$ in which $z_{n+1}$ lies can be determined by substituting the interval $y^{(p_n+1)}(Y_n)$,
or a little wider, practically computable interval $[y^{(p_n+1)}](Y_n)$ ($\supseteq y^{(p_n+1)}(Y_n)$), for
$y^{(p_n+1)}(t_n + \theta_n h_n)$ in (3). In practice, we set

$$s_n := \text{midpoint of } [z_n] \quad \text{and} \quad [\hat z_n] := [z_n] - s_n, \qquad (5)$$
and compute the sequence $y_n$ $(n = 0,1,\ldots)$ by

$$y_{n+1} := y_n + \sum_{k=1}^{p_n}\frac{h_n^k}{k!}\,y^{(k)}(y_n) + s_{n+1}. \qquad (6)$$
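One step of (3), (5) and (6) can be sketched for the scalar test problem $y' = -y$, $y(0) = 1$, for which $y^{(k)}(y) = (-1)^k y$. The test problem, the a priori enclosure $Y_0 = [0, 1]$ (valid because this solution is positive and decreasing), and the function name `taylor_step` are assumptions introduced for illustration; ordinary floating-point arithmetic is used, so the sketch ignores rounding.

```python
import math


def taylor_step(y_n, h, p, enclosure):
    """One step of scheme (6) for y' = -y, where y^(k)(y) = (-1)^k y."""
    # Remainder (3): z = h^(p+1)/(p+1)! * (-1)^(p+1) * y(xi), with y(xi) in `enclosure`.
    coeff = (-h) ** (p + 1) / math.factorial(p + 1)
    ends = (coeff * enclosure[0], coeff * enclosure[1])
    z = (min(ends), max(ends))
    s = 0.5 * (z[0] + z[1])                  # midpoint of [z], eq. (5)
    z_hat = (z[0] - s, z[1] - s)             # centred noise interval, eq. (5)
    poly = sum((-h) ** k / math.factorial(k) for k in range(1, p + 1))
    y_next = y_n + poly * y_n + s            # eq. (6)
    return y_next, z_hat


y1, z_hat1 = taylor_step(1.0, 0.1, 4, (0.0, 1.0))
```

The exact value $e^{-0.1}$ then lies in $y_1 + [\hat z_1]$, an interval whose half-width is about $4 \times 10^{-8}$ for these parameters.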

In the meantime, it should be noted that higher derivatives $y^{(k)}(y_n)$ or $[y^{(k)}](Y_n)$ can be
automatically computed if a program for computing the right-hand side function $f$ is
given. In fact, the differential equations (1) may be regarded as the equations that
relate the Taylor series (in $t$) of $y$ truncated up to the $(p_n + 1)$st term on the left-hand
side to that truncated up to the $p_n$th term on the right-hand side, so that we can
automatically produce from the given program another program which computes the
Taylor coefficients of higher and higher order successively. As is well known [2], every
computation represented in a program form of this kind (to discuss what kind would
require too much space [5]) can be intervalized.
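The recursive generation of Taylor coefficients can be illustrated on a small example. The sketch below is not the authors' software; it assumes the scalar equation $y' = y^2$ (chosen because the exact solution $y_0/(1 - y_0 t)$ has known coefficients) and uses exact rational arithmetic. Matching the coefficient of $t^k$ on both sides of the differential equation gives $(k+1)\,c_{k+1} = [f(y)]_k$, where $c_k = y^{(k)}(t_0)/k!$.

```python
from fractions import Fraction


def series_mul(a, b, order):
    """Coefficients up to degree `order` of the product of two power series."""
    out = [Fraction(0)] * (order + 1)
    for i, ai in enumerate(a[: order + 1]):
        for j, bj in enumerate(b[: order + 1 - i]):
            out[i + j] += ai * bj
    return out


def taylor_coefficients(y0, order):
    """Taylor coefficients c_k = y^(k)(t0)/k! of the solution of y' = y*y."""
    c = [Fraction(y0)]
    for k in range(order):
        rhs = series_mul(c, c, k)       # series of f(y) = y*y up to degree k
        c.append(rhs[k] / (k + 1))      # (k + 1) c_{k+1} = [f(y)]_k
    return c


coeffs_one = taylor_coefficients(1, 5)
coeffs_two = taylor_coefficients(2, 3)
```

Replacing the rational numbers by intervals in the same recurrence would give interval Taylor coefficients, which is the sense in which such a program "can be intervalized".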
Comparing (2) with (6), we see that the exact solution of the differential equations
(1) is within the interval obtained by executing the computational process (6) with
the "noise" represented by the interval $[\hat z_{n+1}]$ added at each "intermediate variable"
$y_{n+1}$. This situation is quite the same as in the case of the noisy computational
process with rounding errors [5, 8]. (If we want to consider also the effect of rounding
error we may add to $[\hat z_{n+1}]$ the extra term for it as a noise.)

According to the theory on noisy computational process, we can write the inte-
gration process (6) as follows:

$$\begin{aligned}
y_0 &:= y_0 + s_0, \\
y_1 &:= y_0 + \sum_{k=1}^{p_0}\frac{h_0^k}{k!}\,y^{(k)}(y_0) + s_1, \\
&\ \ \vdots \\
y_{n+1} &:= y_n + \sum_{k=1}^{p_n}\frac{h_n^k}{k!}\,y^{(k)}(y_n) + s_{n+1},
\end{aligned} \qquad (7)$$

where $s_m$ is the sum of the remainder term and the noise.


m

Actually, using the automatic differentiation, we can obtain the higher derivatives
of $y$ as a function of the value of $y(t)$. So we express the $k$th-order derivative of $y$ at the
steppoint $t_n$ as $y^{(k)}(y(t_n))$. We rewrite (2) and (6) as follows:

$$y_{n+1} = y_n + \sum_{k=1}^{p_n}\frac{h_n^k}{k!}\,y^{(k)}(y_n) + s_{n+1}, \qquad (8)$$

$$y(t_{n+1}) = y(t_n) + \sum_{k=1}^{p_n}\frac{h_n^k}{k!}\,y^{(k)}(y(t_n)) + z_{n+1}. \qquad (9)$$
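The total discretization error in the numerical solution $y_{n+1}$ is the difference between $y_{n+1}$
and the exact solution $y(t_{n+1})$, and is estimated as follows:

$$\begin{aligned}
y_{n+1} - y(t_{n+1}) &= y_n - y(t_n) + \sum_{k=1}^{p_n}\frac{h_n^k}{k!}\bigl(y^{(k)}(y_n) - y^{(k)}(y(t_n))\bigr) + s_{n+1} - z_{n+1} \\
&= y_n - y(t_n) + \sum_{k=1}^{p_n}\frac{h_n^k}{k!}\frac{\partial y^{(k)}}{\partial y}\bigl(y_n + \theta_n(y(t_n) - y_n)\bigr)\,(y_n - y(t_n)) + s_{n+1} - z_{n+1} \\
&= \sum_{m=0}^{n+1}\Bigl(\prod_{j=m}^{n}\Bigl(I + \sum_{k=1}^{p_j}\frac{h_j^k}{k!}\frac{\partial y^{(k)}}{\partial y}\bigl(y_j + \theta_j(y(t_j) - y_j)\bigr)\Bigr)\Bigr)(s_m - z_m),
\end{aligned} \qquad (10)$$

where $I$ is the identity matrix and $\partial y^{(k)}/\partial y$ is the partial derivative of the procedure to
calculate the $k$th-order derivative of $y$. Then we define

$$A_{n+1,m} = \begin{cases}
\displaystyle\prod_{j=m}^{n}\Bigl(I + \sum_{k=1}^{p_j}\frac{h_j^k}{k!}\frac{\partial y^{(k)}}{\partial y}\bigl(y_j + \theta_j(y(t_j) - y_j)\bigr)\Bigr) & (m < n+1), \\
I & (m = n+1),
\end{cases} \qquad (12)$$

with the factors of the product ordered so that larger $j$ stands to the left.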

By using notation (12) we rewrite (10) as follows:

$$y(t_{n+1}) - y_{n+1} = \sum_{m=0}^{n+1} A_{n+1,m}\,(z_m - s_m). \qquad (13)$$
Here $z_m - s_m$ is the noise generated at the steppoint $t_m$, and $A_{n+1,m}$ is the matrix which
represents the effect of the noise $z_m - s_m$. However, $z_m$ and $\theta_m$ $(0 \le m \le n+1)$
cannot be known in general. So we cannot obtain $A_{n+1,m}$ $(0 \le m \le n+1)$.
Then we shall discuss the method to calculate the confidence interval $[y_{n+1}]$ which
contains the exact solution $y(t_{n+1})$. At first we prepare to construct the method to
calculate $[y_{n+1}]$. If $[y_n]$ is obtained, the following relation is formed:

$$y_n + \theta_n(y(t_n) - y_n) \in [y_n]. \qquad (14)$$

On referring to inclusion (14), we can calculate an interval matrix which contains
$A_{n+1,n}$:

$$A_{n+1,n} = I + \sum_{k=1}^{p_n}\frac{h_n^k}{k!}\frac{\partial y^{(k)}}{\partial y}\bigl(y_n + \theta_n(y(t_n) - y_n)\bigr) \in I + \sum_{k=1}^{p_n}\frac{h_n^k}{k!}\frac{\partial y^{(k)}}{\partial y}\bigl([y_n]\bigr). \qquad (15)$$
The right-hand side of (15) is the interval matrix obtained by using the automatic
differentiation. To calculate the interval matrix, we may first compute the $p_n$th-order
derivative of $y$ by using interval arithmetic with the interval argument $[y_n]$ and then
calculate the partial derivative and the right-hand side of (15). We shall denote this
interval matrix by $[A_{n+1,n}]$.
To determine the effect of the noises on $y_n$ we need the interval partial derivatives
(interval matrices) of $y_n$ with respect to all the previous $y_m$'s, which we shall denote
by $[A_{n,m}]$, and they are also computable automatically if we have a program for the
process (6). ($[A_{n,m}]$ may be regarded as a kind of interval counterpart of fundamental
solutions to the first-order perturbation equations of the original system (1), i.e.,
interval matrizants.) The $[A_{n,m}]$'s are computed in practice by

$$[A_{n+1,m}] := [A_{n+1,n}] \cdot [A_{n,m}], \qquad [A_{n,n}] = I. \qquad (16)$$

The following relation is obtained on referring to (5):

$$z_m - s_m \in [\hat z_m]. \qquad (17)$$

Substituting $[A_{n+1,m}]$ for $A_{n+1,m}$ and $[\hat z_m]$ for $z_m - s_m$ in (13), we can obtain an interval
which contains $y(t_{n+1}) - y_{n+1}$:

$$y(t_{n+1}) - y_{n+1} \in \sum_{m=0}^{n+1}[A_{n+1,m}][\hat z_m]. \qquad (18)$$

We can calculate the numerical solution and the confidence interval with the following
steps:
1) Determine the enclosure $Y_n$ of $y$ and the stepsize $h_n$.
2) Determine the order $p_n$ and the stepsize $h_n$ which is used in the integration on
this step. Then calculate $[z_{n+1}]$, which includes the remainder term.
3) Calculate $y^{(k)}([y_n])$ and, if necessary, compute the initial enclosure $Y_n$ by the fol-
lowing equation:

K := M + £ ^S^S^ W + [*«]. (19)

where $Y_n$ may be narrower than the enclosure obtained in 1). Then calculate the
interval matrix $[A_{n+1,n}]$.

4) Calculate the numerical solution $y_{n+1}$ by using (6), the interval matrices $[A_{n+1,m}]$
which represent the propagation of accumulated error, and the confidence interval

$$[y_{n+1}] := y_{n+1} + \sum_{m=0}^{n+1}[A_{n+1,m}][\hat z_m] \qquad (20)$$

for the integration of the next step.
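The steps above can be sketched end to end for the scalar test problem $y' = -y$, $y(0) = 1$. Everything specific in this sketch is an assumption made for illustration: the test problem, the fixed stepsize and order, the a priori enclosure $Y_n = [0, \sup[y_n]]$ (valid because this solution is positive and decreasing), and the function name `verified_decay`. For this linear equation $\partial y^{(k)}/\partial y = (-1)^k$, so each $A_{n+1,n}$ is a known number rather than an interval matrix. Rounding errors are ignored, so the result illustrates the bookkeeping of (5), (6), (16) and (20) rather than a guaranteed bound.

```python
import math


def verified_decay(h=0.1, p=4, steps=20):
    """Scheme (6), (16), (20) for y' = -y, y(0) = 1, without outward rounding."""
    a_step = sum((-h) ** k / math.factorial(k) for k in range(p + 1))   # A_{n+1,n}
    coeff = (-h) ** (p + 1) / math.factorial(p + 1)
    y = 1.0
    noises = [(0.0, 0.0)]        # [z_hat_0]: the initial value is taken as exact
    factors = [1.0]              # A_{n,m} for m = 0..n, with A_{n,n} = 1
    enclosures = [(1.0, 1.0)]    # confidence intervals [y_n]
    for _ in range(steps):
        big_y = (0.0, enclosures[-1][1])                 # a priori enclosure Y_n
        ends = (coeff * big_y[0], coeff * big_y[1])
        z = (min(ends), max(ends))                       # remainder enclosure [z_{n+1}]
        s = 0.5 * (z[0] + z[1])
        y = a_step * y + s                               # eq. (6)
        factors = [a_step * a for a in factors] + [1.0]  # eq. (16)
        noises.append((z[0] - s, z[1] - s))              # eq. (5)
        # a_step > 0, so multiplying keeps the interval endpoints in order.
        lo = y + sum(a * zh[0] for a, zh in zip(factors, noises))
        hi = y + sum(a * zh[1] for a, zh in zip(factors, noises))
        enclosures.append((lo, hi))                      # eq. (20)
    return enclosures


boxes = verified_decay()
```

In this run every interval $[y_n]$ contains $e^{-t_n}$ and the widths stay below $10^{-6}$, because the noise intervals are propagated through the contracting factors $A_{n+1,m}$ instead of being re-inflated at every step.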


We have continued the discussion on the assumption that roundoff error can be
neglected. But in order to realize this method on a computer and obtain the con-
fidence interval, we must use a machine interval and a machine interval operation.
The machine interval is the interval whose endpoints are both floating-point numbers. The
machine interval operation is the operation which takes machine interval arguments
and produces the narrowest possible machine interval that contains the result of the
corresponding interval operation with the same arguments.
There is no problem in substituting the machine interval operation for the real
interval operation in this method. As a result of computing the left-hand side of
(6) by using the machine interval operation, we obtain an interval with positive
diameter. So the step 4) must be modified as follows:
4) Calculate

$$[y_{n+1}] := y_n + \sum_{k=1}^{p_n}\frac{h_n^k}{k!}\,y^{(k)}(y_n) + [z_{n+1}], \qquad (21)$$

select the floating-point number $y_{n+1}$ contained in $[y_{n+1}]$ (e.g., near the midpoint), set

$$[\hat z_{n+1}] := [y_{n+1}] - y_{n+1}, \qquad (22)$$

and calculate the interval matrices $[A_{n+1,m}]$ which represent the
propagation of accumulated error, and the confidence interval

$$[y_{n+1}] := y_{n+1} + \sum_{m=0}^{n+1}[A_{n+1,m}][\hat z_m] \qquad (23)$$

for the next step integration.


This method of computation is basically along the same line as Lohner's idea [9] and
will be "the" standard method now. The discussions about the wrapping effect and
the like are to be looked upon from this viewpoint (cf. also [10, 11, 14]). However, there
are possible variants of the method. In particular, if we stuck to the above-stated
principle, we would always have to keep all $[A_{n,m}]$'s during the integration steps,
which might be too memory-consuming in some cases. So, we may keep only
those $[A_{n+1,m}]$'s for which $n - m < N$ and substitute $[A_{n+1,m}][\hat z_m] + [\hat z_{n+1}]$ for $[\hat z_{n+1}]$
for $m = n - N$ when we compute the effect of the noises farther than $N$ steps before:

$$\begin{aligned}
[A_{n+1,m}] &:= [A_{n+1,n}][A_{n,m}], \quad m = n - N, \ldots, n, \quad [A_{n,n}] = I, \\
[\hat z_{n+1}] &:= [\hat z_{n+1}] + [A_{n+1,n-N}][\hat z_{n-N}].
\end{aligned} \qquad (24)$$

We can obtain the confidence interval as follows:

$$[y_{n+1}] = y_{n+1} + \sum_{m=n+1-N}^{n+1}[A_{n+1,m}][\hat z_m]. \qquad (25)$$

If we set $N = 0$, then we have the interval version of a simple step-by-step integration
which uses the interval matrizant (and, consequently, suffers from a conspicuous wrapping effect,
etc.).

Of course, there is another method, which executes one integration step as follows:

$$[y_{n+1}] := [y_n] + \sum_{k=1}^{p_n} \frac{h_n^k}{k!} f^{[k]}([y_n]) + [z_{n+1}]. \tag{26}$$

This is very primitive and seems to be very fast. However, it has the fatal defect that the diameter of the confidence interval at a step always becomes larger than that at the previous step.
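The growth of the diameter in the primitive method can be seen on a toy problem. The sketch below is not the computation reported in this paper: it assumes the test equation $y' = -y$, takes the order $p = 1$, drops the truncation term, and ignores rounding; `primitive_step` and `widths` are hypothetical names.

```python
def primitive_step(y, h):
    """One primitive interval step [y] + h * f([y]) for f(y) = -y."""
    lo, hi = y
    # f(y) = -y evaluated on the interval [lo, hi] gives [-hi, -lo].
    f_lo, f_hi = -hi, -lo
    return (lo + h * f_lo, hi + h * f_hi)


def widths(y0, h, steps):
    """Interval widths along the primitive integration, starting from y0."""
    y = y0
    out = [y[1] - y[0]]
    for _ in range(steps):
        y = primitive_step(y, h)
        out.append(y[1] - y[0])
    return out


w = widths((0.9, 1.1), 0.1, 20)
print(w[0], w[1], w[-1])
```

Each step multiplies the width by $1 + h$, although the exact solutions of $y' = -y$ starting in the initial interval approach each other. This is the defect that the matrizant-based propagation is meant to avoid.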

3. Important Technical Points to Review

With respect to the method (or the family of methods) enunciated in the preceding
section, the following points seem to be technically most important.
(i) How to determine the interval $Y_n$ and $h_n$ such that $Y_n \supset \{y(t) \mid t \in [t_n, t_{n+1}]\}$ for estimating $[z_{n+1}]$?
(ii) How to determine the stepsize $h_n$ and the order $p_n$ at the $n$th step of integration?
(iii) How to choose $N$, the parameter for keeping the "interval matrizants" representing the noise propagation in the computational process? How does the above-enunciated method compare with the method of constructing the confidence interval by means of interval solutions of the first-order perturbation equations directly [10]?
(iv) How much should we pay for guaranteeing the accuracy of the solution over the cheaper conventional integration methods such as the Runge-Kutta with stepsize control?
According to our computational-experimental observations (cf. next section), our
tentative answers to the above-raised questions are as follows.
(i) To follow literally the statement, in terms of the norm in the solution space, of the so-called "Cauchy-Peano" theorem on the existence and uniqueness of solution as given in ordinary textbooks is the worst strategy for determining $Y_n$ and $h_n$. We would usually have too small $h_n$. We had better employ the combined strategy of inflating $Y_n$ and/or reducing $h_n$ so as to achieve the inclusion

$$[y_n] + [0, h_n] f(Y_n) \subset Y_n \tag{27}$$

and of sharpening the interval by repeated application of

$$Y_n := [y_n] + [0, h_n] f(Y_n). \tag{28}$$

However, there is still much to investigate about how to determine automatically (more or less heuristically) the initial guess for $Y_n$ and $h_0$, and about the way of combining the interval inflation and sharpening and the stepsize reduction.

We must guess the stepsize $h_0$ at the beginning of the first integration, and we must change it when we determine the order in (ii). If we guess too large an $h_0$ initially, we may obtain too small an $h_0$ in the following process. Not only at the beginning of integration but also in the course of integration, we had better avoid drastic changes in the interval width, in the order of the Taylor-series expansion and in the stepsize, but there still remain several possible ways of changing them gradually (this point is closely connected with (ii)).
With respect to the strategy for determining $Y_n$ and $h_n$, we calculated the initial guess for $Y_n$ as follows:

$$Y_n := [y_n] + [0, h_n] f([y_n]). \tag{29}$$

If $Y_n$ and $h_n$ achieve the inclusion (27), the solution lies in $Y_n$. Otherwise, we inflate the initial interval $Y_n$ using $\varepsilon$-inflation:

$$Y_n := (1 + \varepsilon) Y_n - \varepsilon Y_n, \tag{30}$$

where we set $\varepsilon = 0.1$. We thus obtain a new interval whose width is $1 + 2\varepsilon$ times as large as that of the old interval, and check whether the interval and the stepsize satisfy the condition (27) or not. If they do not achieve (27) after repeating the $\varepsilon$-inflation several times (we repeated it 5 times), we halve the stepsize $h_n$ and resume from (29).
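The inflate-then-halve loop described above can be sketched in Python as follows. This is a minimal illustration under assumptions that are not in the paper: intervals are plain `(lo, hi)` tuples, rounding errors are ignored, the right-hand side is the scalar logistic function $f(y) = y(1 - y)$, and all function names are hypothetical.

```python
def i_add(a, b):
    return (a[0] + b[0], a[1] + b[1])


def i_sub(a, b):
    return (a[0] - b[1], a[1] - b[0])


def i_mul(a, b):
    p = [a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1]]
    return (min(p), max(p))


def contains(outer, inner):
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def logistic(y):
    """Interval evaluation of f(y) = y (1 - y) (assumed test function)."""
    return i_mul(y, i_sub((1.0, 1.0), y))


def inflate(Y, eps=0.1):
    """Epsilon-inflation Y := (1 + eps) Y - eps Y; the width grows by 1 + 2 eps."""
    return i_sub(i_mul((1.0 + eps, 1.0 + eps), Y), i_mul((eps, eps), Y))


def a_priori_enclosure(f, y, h, eps=0.1, max_inflations=5, max_halvings=20):
    """Find (Y, h) with y + [0, h] f(Y) contained in Y."""
    for _ in range(max_halvings):
        Y = i_add(y, i_mul((0.0, h), f(y)))          # initial guess
        for _ in range(max_inflations + 1):
            image = i_add(y, i_mul((0.0, h), f(Y)))
            if contains(Y, image):                   # inclusion test
                return Y, h
            Y = inflate(Y, eps)
        h /= 2.0                                     # give up and halve the stepsize
    raise RuntimeError("no enclosure found")


y0 = (0.1, 0.1)
Y, h = a_priori_enclosure(logistic, y0, 0.5)
print(Y, h)
```

The sharpening iteration (28) is not included here; it would be applied to the returned `Y` once the inclusion has been established.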
(ii) There are two meaningful strategies: (I) to make the stepsize as large as possible under the restriction that the width of $[z_n]/h_n$, the local truncation error per unit time, should not exceed some given parameter $\varepsilon_t$, and (II) to make $p_n^2/h_n$, the total operation count per unit time, as small as possible under the same restriction (see also [13]). Both work fairly well. However, in view of (iii), the former strategy will be more meaningful in practice.
(iii) At first sight it seems that the greater the $N$ the better. Indeed this is usually the case, especially for small $N$'s. But computational experiments sometimes point to the contrary. Choosing too great an $N$ sometimes deteriorates the quality of the solution, i.e., produces wider intervals than those obtained with a smaller $N$, probably because the intervals of the elements of the matrices $[A_{n,m}]$ become too wide. This phenomenon suggests that there might be an optimum value of $N$, neither too small nor too large. Since setting $N$ too large would require too large a memory space, this may be good news in practice.
We may alternatively calculate the interval matrizant by means of interval solutions of the first-order perturbation equations:

$$\frac{dA}{dt} = \frac{\partial f}{\partial y} A. \tag{31}$$

There are several methods to calculate the interval matrizant. We adopted the first-order Taylor expansion method to solve (31) numerically and tested the following two methods: one is to solve the equations on $[t_n, t_n + h]$, the other is to solve the equations on $[t_n, t_n + h/2]$ and $[t_n + h/2, t_n + h]$ and obtain the interval matrizant as the product of the two. These methods are faster than the standard method (see Table 1), but give us wider confidence intervals (see Fig. 5).
(iv) The results of our computational experiments on the problems in §4 showed that the present methods require two to five hundred times as long a computation time as the conventional 8th-order Runge-Kutta (of 10 stages) with stepsize control, where the restriction $\varepsilon_t$ on the local truncation error per unit time (estimated by comparing the numerical solution with that obtained with the stepsizes doubled) was set in such a way that the accumulated error may be nearly equal to the guaranteed interval width obtained by the present method. (The popular 4th-order Runge-Kutta (of 4 stages) was less efficient by a factor of five or six.) Considering that the Runge-Kutta does not rigorously guarantee the accuracy of the computed solution and that there is still a lot to be improved in our implementation of the present method, this comparison of speeds will be in favour of the rigorous approach to the solution of ordinary differential equations even from the practical engineering standpoint.

4. Examples

We took up the following three problems for experiment, one being a small test problem and the other two chosen from celestial mechanics.
(a) Logistic curve:

»(0) = 0. (32)
In Fig. 1 it is seen that to calculate the confidence interval with matrizant is
important. (It is observed that the most primitive interval computation will fail.)
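(b) Swing-by of an artificial satellite by Jupiter. The equations of motion are as follows:

$$
\begin{aligned}
\frac{dx}{dt} &= u, \qquad \frac{dy}{dt} = v, \\
\frac{du}{dt} &= -\frac{x}{\{x^2 + y^2\}^{3/2}} - Gm \frac{x - r\cos\omega t}{\{(x - r\cos\omega t)^2 + (y - r\sin\omega t)^2\}^{3/2}} - GM \frac{x - \cos(t + \phi)}{\{(x - \cos(t + \phi))^2 + (y - \sin(t + \phi))^2\}^{3/2}}, \\
\frac{dv}{dt} &= -\frac{y}{\{x^2 + y^2\}^{3/2}} - Gm \frac{y - r\sin\omega t}{\{(x - r\cos\omega t)^2 + (y - r\sin\omega t)^2\}^{3/2}} - GM \frac{y - \sin(t + \phi)}{\{(x - \cos(t + \phi))^2 + (y - \sin(t + \phi))^2\}^{3/2}}.
\end{aligned} \tag{33}
$$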
[Fig. 1: curves labelled "exact solution" and "using matrizant", plotted against $t$ from 0.0 to 4.0]

Fig. 1 Logistic curve
(The width of the interval is magnified by $10^6$.)

The initial conditions are as follows:

$$x(0) = 0.19004, \quad y(0) = 0.0, \quad u(0) = 1.95, \quad v(0) = 2.28. \tag{34}$$

Here, $(x, y)$ and $(u, v)$ are the position and the velocity of the satellite; $r$ is the distance between the sun and Earth; $t$ is the time; and $\omega$ is the angular velocity of Earth. We chose the inertial system whose origin is the sun. Jupiter and Earth move in the same plane in circular orbits around the sun. We normalized the distance between Jupiter and the sun as unity, the angular speed of Jupiter as unity, and the mass of the sun as unity. Under those normalizations we have

$$Gm = 3.0404 \times 10^{-6}, \quad r = 0.19, \quad GM = 9.5479 \times 10^{-4}, \quad \omega = 12.0, \tag{35}$$

where $G$ is the universal gravitational constant, $m$ is the mass of Earth, and $M$ is the mass of Jupiter. We set the initial relative position of Earth and Jupiter as $\phi = 0.4835$.
The problem is essentially a one-body problem, the gravity of the sun, Earth and Jupiter being taken into account but their motion being approximated as circular and planar and given a priori. A general view of the motions is shown in Fig. 2 and Fig. 3.
Fig. 2 Relative motion of the sun, Earth, Jupiter and the satellite (with the orbit of Earth marked)

Fig. 3 Relative motion of Jupiter and the satellite (with the closest position marked)

Figure 4 shows how the stepsize as well as the order is controlled according to
different rules.
In Fig. 5 and Fig. 6 the effect of $N$ and that of the way of computing interval matrizants are shown. The "true error" is the error (estimated by comparing with the result with stepsizes halved) of the midpoints $y_n$.
In Fig. 7 the effect of $N$ is shown. (It is seen that the widths obtained by using the perturbation equations gave poorer results.)
Computation times by the Taylor-series methods with the different stepsize-control strategies (I) and (II) (see §3) are shown in Table 1.

Table 1 Computation time (s)

        primitive   α        α'       α'       β        β
        method      N = 0    N = 0    N = ∞    N = 0    N = ∞
(I)     309.9       372.5    419.3    881.3    963.8    1452
(II)    229.7       294.1    351.2    1977     664.7    2293

α: matrizants obtained by solving perturbation equations on $[t_n, t_n + h]$,
α': product of the matrizants on $[t_n, t_n + h/2]$ and $[t_n + h/2, t_n + h]$,
β: standard method.

Computation results by the 8th-order 10-stage Runge-Kutta method are shown in Table 2.

Table 2 Computation results by the 8th-order 10-stage Runge-Kutta method
Computation time (s) Significant figures
lO"" 1.8 6
lO"' 2.3 7
B
10" 3.2 8
10-" 4.8
[Fig. 4: curves labelled "$p_n$ with max $h_n$" and "$p_n$ with min $p_n^2/h_n$"; horizontal axis from 0.0 to 2.0]

Fig. 4 Variation of the stepsize and the order during the integration ($\varepsilon_t = 10^{-10}$) ($\max p_n = 20$)

[Fig. 5: curves labelled "primitive", "α, N = 0", "α', N = 0", "α', N = ∞", "β, N = 0", "β, N = ∞" and "true error"; horizontal axis from 0.0 to 2.0]

Fig. 5 Growth of the width of the interval for different values of $N$ and different ways of computing $[A_n]$. ($\varepsilon_t = 10^{-10}$; $p_n$ and $h_n$ controlled so as to maximize $h_n$.)

Fig. 6 Growth of the width of the interval for different values of $N$ and different ways of computing $[A_n]$. ($\varepsilon_t = 10^{-10}$; $p_n$ and $h_n$ controlled so as to minimize $p_n^2/h_n$.)

Fig. 7 Growth of the width of the interval for different values of $N$ ($\varepsilon_t = 10^{-10}$; $p_n$ and $h_n$ controlled so as to maximize $h_n$.)

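(c) The Pythagorean three-body problem [16]. The equations of motion are as follows:

$$
\begin{aligned}
\ddot{x}_3 &= -4\,\frac{x_3 - x_4}{\{(x_3 - x_4)^2 + (y_3 - y_4)^2\}^{3/2}} - 5\,\frac{x_3 - x_5}{\{(x_3 - x_5)^2 + (y_3 - y_5)^2\}^{3/2}}, \\
\ddot{y}_3 &= -4\,\frac{y_3 - y_4}{\{(x_3 - x_4)^2 + (y_3 - y_4)^2\}^{3/2}} - 5\,\frac{y_3 - y_5}{\{(x_3 - x_5)^2 + (y_3 - y_5)^2\}^{3/2}}, \\
\ddot{x}_4 &= 3\,\frac{x_3 - x_4}{\{(x_3 - x_4)^2 + (y_3 - y_4)^2\}^{3/2}} - 5\,\frac{x_4 - x_5}{\{(x_4 - x_5)^2 + (y_4 - y_5)^2\}^{3/2}}, \\
\ddot{y}_4 &= 3\,\frac{y_3 - y_4}{\{(x_3 - x_4)^2 + (y_3 - y_4)^2\}^{3/2}} - 5\,\frac{y_4 - y_5}{\{(x_4 - x_5)^2 + (y_4 - y_5)^2\}^{3/2}}, \\
\ddot{x}_5 &= 3\,\frac{x_3 - x_5}{\{(x_3 - x_5)^2 + (y_3 - y_5)^2\}^{3/2}} + 4\,\frac{x_4 - x_5}{\{(x_4 - x_5)^2 + (y_4 - y_5)^2\}^{3/2}}, \\
\ddot{y}_5 &= 3\,\frac{y_3 - y_5}{\{(x_3 - x_5)^2 + (y_3 - y_5)^2\}^{3/2}} + 4\,\frac{y_4 - y_5}{\{(x_4 - x_5)^2 + (y_4 - y_5)^2\}^{3/2}}.
\end{aligned} \tag{36}
$$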
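The initial conditions are as follows:

$$
\begin{aligned}
x_3 &= 1, & x_4 &= -2, & x_5 &= 1, \\
y_3 &= 3, & y_4 &= -1, & y_5 &= -1, \\
\dot{x}_3 &= 0, & \dot{x}_4 &= 0, & \dot{x}_5 &= 0, \\
\dot{y}_3 &= 0, & \dot{y}_4 &= 0, & \dot{y}_5 &= 0.
\end{aligned} \tag{37}
$$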

A general view of the motions of the three bodies is shown in Fig. 8. This problem is notorious for near-singular (near-collision) points occurring from time to time, so that a number of regularization techniques have been devised by many authors. But we did not adopt such regularization techniques, in order to see how the present method behaves at near-singular points. In Fig. 9, we observed that $N = 300$ gave a better result than $N = \infty$.

Fig. 9 Growth of the width of the interval for different values of $N$ ($\varepsilon_t = 10^{-12}$; $p_n$ and $h_n$ controlled so as to maximize $h_n$.)

References

1. J. AMEMIYA: On Numerical Methods with Guaranteed Accuracy for Solving Ordinary Differential Equations Using Automatic Differentiation (in Japanese). Master's Thesis, Department of Mathematical Engineering and Information Physics, Faculty of Engineering, University of Tokyo, March 1991.
2. G. ALEFELD and J. HERZBERGER: Introduction to Interval Computations. Academic Press, New York, 1983.
3. A. GRIEWANK and G. F. CORLISS (eds.): Automatic Differentiation of Algorithms: Theory, Implementation, and Application. SIAM, Philadelphia, 1991.
4. P. HENRICI: Discrete Variable Methods in Ordinary Differential Equations. John Wiley & Sons, Inc., New York-London, 1962.
5. M. IRI: Simultaneous computation of functions, partial derivatives and estimates of rounding errors - Complexity and practicality. Japan Journal of Applied Mathematics, Vol. 1 (1984), pp. 223-252.
6. M. IRI: History of automatic differentiation and rounding error estimates. In [3], pp. 3-16.
7. K. KUBOTA: PADRE2, A Fortran precompiler yielding error estimates and second derivatives. In [3], pp. 251-262.
8. K. KUBOTA and M. IRI: Estimates of rounding errors with fast automatic differentiation and interval analysis. Journal of Information Processing (an official journal of Information Processing Society of Japan), Vol. 14, No. 4 (1991), pp. 508-515.
9. R. J. LOHNER: Enclosing the solutions of ordinary initial and boundary value problems. In E. KAUCHER, U. KULISCH and Ch. ULLRICH (eds.): Computer Arithmetic, Scientific Computation and Programming Languages, B. G. Teubner, Stuttgart, 1987, pp. 255-286.
10. R. E. MOORE: Automatic local coordinate transformations to reduce the growth of error bounds in interval computation of solutions of ordinary differential equations. In L. B. RALL (ed.): Error in Digital Computation, Vol. 2, John Wiley & Sons, Inc., New York, 1965, pp. 103-140.
11. R. E. MOORE: Methods and Applications of Interval Analysis. SIAM, Philadelphia, 1979.
12. L. B. RALL: Automatic Differentiation - Techniques and Applications. Lecture Notes in Computer Science 120, Springer-Verlag, 1981.
13. H. J. STETTER: Validated solution of initial value problems for ODE. In Ch. ULLRICH (ed.): Computer Arithmetic and Self-Validating Numerical Methods, Notes and Reports in Mathematics in Science and Engineering, Vol. 7, Academic Press, 1990, pp. 171-187.
14. N. F. STEWART: A heuristic to reduce the wrapping effect in the numerical solution of x' = f(t, x). Tidskrift for Informationsbehandling (BIT), Vol. 11 (1971), pp. 328-337.
15. T. SUNAGA: Theory of an interval algebra and its application to numerical analysis. RAAG Memoirs, Vol. 2 (1958), Misc. II, pp. 547-564. [Based on the Master's Thesis in 1956]
16. V. SZEBEHELY and C. F. PETERS: Complete solution of a general problem of three bodies. Astronomical Journal, Vol. 72 (1967), pp. 876-883.

Numerical Validation for Ordinary Differential Equations
using Power Series Arithmetic
Masahide Kashiwagi
Department of Information and Computer Science,
School of Science and Engineering, Waseda University,
Okubo 3-4-1, Shinjuku-ku, Tokyo 169, Japan
E-mail: kashi@oishi.info.waseda.ac.jp

ABSTRACT
In this paper a numerical validation method for normal form simultaneous first order differential equations is discussed. Based on Lohner's method and the interval functoid, a new inclusion algorithm for the initial value problem is given. For the algorithm, two types of power series arithmetic are defined.

1. Introduction

In this paper we will consider numerical validation of the normal form simultaneous first order differential equation

$$\frac{dx(t)}{dt} = f(x(t), t), \quad t \in [a, b], \tag{1}$$

where $x(t)$ is an $n$-dimensional vector valued function.


To solve this type of equation numerically, we often discretize the equation into a finite dimensional equation with unknown variables $x_i$ which approximate $x(t_i)$, where $a = t_1 < t_2 < \cdots < t_m = b$. In this process, discretization error arises and rigorous estimation of the error is very difficult. But, if we can describe the exact relation between $x_i$ and $x_{i+1}$, then we can get a finite dimensional equation having a solution $x_i$ which is exactly equal to $x(t_i)$. Namely, if we can calculate $\phi(v, t_s, t_e)$, which returns the exact $x(t_e)$ provided that $x(t_s) = v$, then we can write down the finite dimensional equation as

$$
\begin{aligned}
x_2 &= \phi(x_1, t_1, t_2) \\
x_3 &= \phi(x_2, t_2, t_3) \\
&\;\;\vdots \\
x_m &= \phi(x_{m-1}, t_{m-1}, t_m).
\end{aligned} \tag{2}
$$

This system of equations has $n \times m$ unknown variables and $n \times (m - 1)$ equations. So if $n$ boundary conditions are added, it is expected to be solvable.
Thus if we can calculate a validated solution of the finite dimensional equations by, for example, Krawczyk's interval map [3], we can obtain a validated solution of the original ordinary differential equation at $m$ discrete points.

In this paper we will show how to calculate the exact relation $\phi(v, t_s, t_e)$. It can be seen as an extension of Lohner's method [3]. Note that interval arithmetic is used wherever needed throughout this paper.

2. Power Series Arithmetic

As preparation for the discussions in section 3, we will define power series arithmetic (PSA).

2.1. Type-I Power Series Arithmetic

Type-I PSA treats order-$n$ power series and truncates terms higher than the $n$-th order. $[a_0, a_1, a_2, \ldots, a_n]$ represents the power series

$$a_0 + a_1 t + a_2 t^2 + \cdots + a_n t^n \quad (+O(t^{n+1})). \tag{3}$$

Addition, subtraction and multiplication are done as follows:

$$[a_0, \ldots, a_n] \pm [b_0, \ldots, b_n] = [a_0 \pm b_0, \ldots, a_n \pm b_n], \tag{4}$$
$$[a_0, \ldots, a_n] \times [b_0, \ldots, b_n] = [c_0, \ldots, c_n], \quad c_k = \sum_{i=0}^{k} a_i b_{k-i}. \tag{5}$$

Functions are applied as follows:

$$f([a_0, \ldots, a_n]) = f(a_0) + \sum_{i=1}^{n} \frac{1}{i!} f^{(i)}(a_0) [0, a_1, \ldots, a_n]^i. \tag{6}$$

Addition and multiplication in the above algorithm are done by (4) and (5).
Division is executed by combining the inverse function and multiplication ($x/y = x \times (1/y)$).
Type-I PSA keeps the first $(n + 1)$ terms of non-truncated arithmetic. Several mathematical software packages such as Mathematica provide such an arithmetic.
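The rules of Type-I PSA can be tried out with a few lines of Python. The sketch below is only an illustration with ordinary floating-point coefficients (no intervals); the names `psa1_add`, `psa1_mul`, `psa1_apply` and `exp_derivs` are hypothetical, and the caller is assumed to supply the derivatives $f^{(i)}(a_0)$.

```python
import math


def psa1_add(a, b):
    """Coefficient-wise addition of two order-n series."""
    return [x + y for x, y in zip(a, b)]


def psa1_mul(a, b):
    """Truncated product: c_k = sum_{i=0}^{k} a_i b_{k-i} for k = 0..n."""
    n = len(a) - 1
    return [sum(a[i] * b[k - i] for i in range(k + 1)) for k in range(n + 1)]


def psa1_apply(derivs, a):
    """Apply f to a series; derivs(x, n) returns [f(x), f'(x), ..., f^(n)(x)]."""
    n = len(a) - 1
    d = derivs(a[0], n)
    u = [0.0] + list(a[1:])            # the series [0, a_1, ..., a_n]
    result = [d[0]] + [0.0] * n
    power = [1.0] + [0.0] * n          # u^0
    for i in range(1, n + 1):
        power = psa1_mul(power, u)     # u^i, truncated at order n
        term = [d[i] / math.factorial(i) * c for c in power]
        result = psa1_add(result, term)
    return result


def exp_derivs(x, n):
    """All derivatives of exp are exp itself."""
    return [math.exp(x)] * (n + 1)


print(psa1_mul([1.0, 1.0, 0.0], [1.0, 1.0, 0.0]))        # (1 + t)^2 up to t^2
print(psa1_apply(exp_derivs, [0.0, 1.0, 0.0, 0.0]))      # exp(t) up to t^3
```

Because $[0, a_1, \ldots, a_n]^i$ has no terms below $t^i$, the sum over $i$ stops at $n$ without losing any coefficient of order $n$ or lower.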

2.2. Type-II Power Series Arithmetic

Type-II PSA also treats order-$n$ power series. In Type-II PSA we must specify its domain, such as $[0, d]$. $[a_0, \ldots, a_n]$ also denotes

$$a_0 + a_1 t + a_2 t^2 + \cdots + a_n t^n, \tag{7}$$

but the coefficient of the last term $a_n$ is generally an interval, and $[a_0, \ldots, a_n]$ represents the set of continuous functions $f$ defined on $[0, d]$ such that $f(t) \in a_0 + a_1 t + \cdots + a_n t^n$ for all $t$. This is a kind of interval functoid introduced by Kaucher and Miranker [4,5].
Addition and subtraction are the same as in Type-I PSA. Multiplication is executed by the following steps:

(1) Multiply $[a_0, \ldots, a_n]$ and $[b_0, \ldots, b_n]$ without truncation. The result $C = [c_0, \ldots, c_{2n}]$ is of order $2n$.

(2) Reduce the order of $C$ to $n$.

Order reduction is defined as follows:

Definition 1 (Order Reduction = Rounding) Let $A = [a_0, \ldots, a_m]$ be a power series and let $n < m$. Then the order reduction of $A$ to $n$ is defined as follows:

$$b_i = a_i \quad (0 \le i \le n - 1), \tag{8}$$
$$b_n = a_n + \sum_{i=n+1}^{m} a_i [0, d]^{i-n}. \tag{9}$$

It transforms $a_i t^i$ into $a_i t^{i-n} t^n$ and replaces $t^{i-n}$ by $[0, d]^{i-n}$. That is, it adds the higher-order remainder to the coefficient of $t^n$. Thus the result of multiplication $C$ contains all possible results derived by multiplication without truncation.
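Definition 1 can be sketched in Python as follows. The sketch makes several assumptions of its own: every coefficient is stored as a `(lo, hi)` tuple, rounding errors are ignored (a validated implementation would round outward), and the helper names are hypothetical.

```python
def i_add(a, b):
    return (a[0] + b[0], a[1] + b[1])


def i_mul(a, b):
    p = [a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1]]
    return (min(p), max(p))


def reduce_order(A, n, d):
    """Order reduction of A = [a_0, ..., a_m] to order n on the domain [0, d]."""
    kept = list(A[:n])                       # b_i = a_i for i < n
    last = A[n]
    for i in range(n + 1, len(A)):
        # [0, d]^(i - n) = [0, d^(i - n)] because d >= 0
        last = i_add(last, i_mul(A[i], (0.0, d ** (i - n))))
    return kept + [last]


# 1 + t + t^2 - 2 t^3 on [0, 0.5], reduced from order 3 to order 1
A = [(1, 1), (1, 1), (1, 1), (-2, -2)]
B = reduce_order(A, 1, 0.5)
print(B)
```

In this example the last coefficient becomes the interval $[0.5, 1.5]$, so that $1 + [0.5, 1.5]\,t$ contains the original cubic for every $t \in [0, 0.5]$.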
Functions are applied as follows:

$$f([a_0, \ldots, a_n]) = f(a_0) + \sum_{i=1}^{n-1} \frac{1}{i!} f^{(i)}(a_0) [0, a_1, \ldots, a_n]^i \tag{10}$$
$$\qquad + \frac{1}{n!} f^{(n)}\!\left(\sum_{i=0}^{n} a_i [0, d]^i\right) [0, a_1, \ldots, a_n]^n. \tag{11}$$

Addition and multiplication in the above algorithm are done by Type-II PSA. This algorithm uses Lagrange's remainder term.
Division is done as in Type-I PSA.

3. How to Calculate $\phi(v, t_s, t_e)$

$\phi : R^n \times R \times R \to R^n$ can be calculated by obtaining a verified solution of the following initial value problem:

$$\frac{dx(t)}{dt} = f(x(t), t) \tag{12}$$
$$x(t_s) = v \tag{13}$$
$$t \in [t_s, t_e]. \tag{14}$$

If $x(t)$ is calculated exactly, $\phi(v, t_s, t_e)$ is obtained as $x(t_e)$.

We construct an algorithm to obtain an enclosure of $x(t)$ based on Picard's iterative method. That is, we convert (12) to the equivalent fixed point form

$$x(t) = v + \int_{t_s}^{t} f(x(s), s)\, ds, \tag{15}$$


obtain an approximate solution by iteration started from the constant function $v$, and prove the existence of a true solution by Schauder's fixed point theorem.
Now we show the algorithm. Note that the independent variable $t$ is shifted ($[t_s, t_e] \to [0, t_e - t_s]$) in the following.

Algorithm 1 We assume that $t_e > t_s$. Let $\Delta = t_e - t_s$ and let the domain for Type-II PSA be $[0, \Delta]$.

Step-1 Initialize the power series vector $X$ as

$$X = \begin{pmatrix} [v_1] \\ \vdots \\ [v_n] \end{pmatrix} \tag{16}$$

and set $m = 0$.

Step-2 (generate approximate solution) Repeat Step-2(1)-Step-2(3) an appropriate number of times.

Step-2(1) Set the power series $T$ as $(t + t_s)$ truncated within order $m$:

$$T = \begin{cases} [t_s] & (m = 0) \\ [t_s, 1] & (m = 1) \\ [t_s, 1, 0, \ldots, 0] & (m \ge 2) \end{cases} \tag{17}$$

Step-2(2) Calculate $X = f(X, T)$ by Type-I PSA. Calculate $X = v + \int_0^t X\,dt$. By these operations the order of $X$ increases from $m$ to $m + 1$.

Step-2(3) $m = m + 1$.

Step-3 Set $T = [t_s, 1, 0, \ldots, 0]$ (order $m$).

Step-4 Calculate $Y = f(X, T)$ by Type-II PSA, calculate $Y = v + \int_0^t Y\,dt$, and reduce the order of $Y$ from $m + 1$ to $m$ by Definition 1.

Step-5 Calculate

$$r = \max_{1 \le i \le n} \left| Y_i^{(m)} - X_i^{(m)} \right|, \tag{18}$$

where the superscript $(k)$ means the coefficient of $t^k$.

Step-6 Let $X_i^{(m)} = X_i^{(m)} + [-2r, 2r]$ for $1 \le i \le n$.

Step-7 Calculate $Y = f(X, T)$ by Type-II PSA, calculate $Y = v + \int_0^t Y\,dt$, and reduce the order of $Y$ from $m + 1$ to $m$ by Definition 1.

Step-8 (existence test) If $Y_i^{(m)} \subset X_i^{(m)}$ for all $i$, then the existence of a true solution of (12) in $Y$ is guaranteed by Schauder's fixed point theorem.

Step-9 (refinement of solution) Repeat Step-9(1)-Step-9(2) until $\mathrm{rad}(Y_i^{(m)})$ becomes sufficiently small.

Step-9(1) Calculate $X = f(Y, T)$ by Type-II PSA, calculate $X = v + \int_0^t X\,dt$, and reduce the order of $X$ from $m + 1$ to $m$ by Definition 1.

Step-9(2) Let $Y_i^{(m)} = Y_i^{(m)} \cap X_i^{(m)}$ for all $i$.

Step-10 $\phi(v, t_s, t_e)$ is obtained by

(19)

At Step-8, $Y_i^{(k)} = X_i^{(k)}$ for $0 \le k \le m - 1$ always holds in this algorithm; therefore only the last term needs to be tested.
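Step-2 of Algorithm 1 can be illustrated in isolation. The Python sketch below covers only the generation of the approximate (Taylor) solution by Picard iteration with Type-I PSA for a scalar equation; it does not perform the Type-II enclosure or the existence test of Steps 3 to 10. Exact rational arithmetic via `fractions.Fraction` is used instead of intervals, and the names `mul_trunc` and `picard_taylor` are hypothetical.

```python
from fractions import Fraction


def mul_trunc(a, b):
    """Type-I product of two series of the same order."""
    n = len(a) - 1
    return [sum(a[i] * b[k - i] for i in range(k + 1)) for k in range(n + 1)]


def picard_taylor(f, v, t_s, steps):
    """Repeat X = f(X, T), X = v + integral of X; each pass adds one order."""
    X = [Fraction(v)]                                   # order-0 series, m = 0
    for m in range(steps):
        # T = t + t_s truncated to order m
        T = ([Fraction(t_s), Fraction(1)] + [Fraction(0)] * (m - 1))[: m + 1]
        F = f(X, T)                                     # Type-I PSA, order m
        # term-by-term integration from 0 to t, plus the initial value
        X = [Fraction(v)] + [c / (k + 1) for k, c in enumerate(F)]
    return X


# x' = x^2, x(0) = 1 has the solution 1 / (1 - t) = 1 + t + t^2 + ...
square = picard_taylor(lambda X, T: mul_trunc(X, X), 1, 0, 4)
# x' = t, x(0) = 2 has the solution 2 + t^2 / 2
linear = picard_taylor(lambda X, T: T, 2, 0, 2)
print(square, linear)
```

Each pass through the loop fixes one more Taylor coefficient, which is why the order $m$ is increased by one in Step-2(3).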

4. How to calculate $\phi_v(v, t_s, t_e)$

In order to solve (2) with guaranteed accuracy, we need not only the exact $\phi(v, t_s, t_e)$ but also the exact $\phi_v(v, t_s, t_e)$.
$\phi_v(v, t_s, t_e)$ is obtained as follows. Consider the simultaneous initial value problem:

$$\frac{dx(t)}{dt} = f(x(t), t) \tag{20}$$
$$\frac{dy(t)}{dt} = f_x(x(t), t)\, y(t) \tag{21}$$
$$x(t_s) = v \tag{22}$$
$$y(t_s) = I \quad (n \times n \text{ identity matrix}) \tag{23}$$
$$t \in [t_s, t_e]. \tag{24}$$

Solve this simultaneously by the algorithm in section 3; then $\phi_v(v, t_s, t_e)$ is obtained as $y(t_e)$.
The above-mentioned method is closely related to automatic differentiation [6]. By executing the algorithm in section 3 while simultaneously calculating the 'derivative of all numbers in the algorithm with respect to the initial value $v$', we can obtain the $(i, j)$-element of the matrix valued function $y(t)$ as the 'derivative of $x_i(t)$ with respect to the $j$-th element of $v$'.

5. Conclusion

In a future paper, we will present how to enclose the solutions of (2). We will also present how to construct software for this algorithm, together with numerical examples.

References

1. Masahide Kashiwagi and Shin'ichi Oishi : "Krawczyk-Based Numerical Validation Using Rational Arithmetic", Proc. 1993 International Symposium on Nonlinear Theory and its Applications (NOLTA '93), pp.399-402 (Dec 1993).
2. R. E. Moore : "Methods and Applications of Interval Analysis", SIAM, Philadelphia (1979).
3. R. J. Lohner : "Enclosing the Solutions of Ordinary Initial and Boundary Value Problems", In E. Kaucher, U. Kulisch and Ch. Ullrich (eds.) : "Computer Arithmetic, Scientific Computation and Programming Languages", B. G. Teubner, Stuttgart, pp.255-286 (1987).
4. E. W. Kaucher and W. L. Miranker : "Self-Validating Numerics for Function Space Problems", Academic Press, New York (1984).
5. E. W. Kaucher and W. L. Miranker : "Validating computation in a function space", Reliability in Computing (ed. R. E. Moore), Academic Press, San Diego, pp.403-425 (1988).
6. L. B. Rall : "Automatic Differentiation : Techniques and Applications", Lecture Notes in Computer Science No. 120, Springer, New York (1981).

Statistical Error Analysis in Numerical Simulation
for Stochastic Integral Processes

Yoshihiro SAITO
Shotoku Gakuen Women's Junior College, 1-38 Nakauzura
Gifu-shi 500, Japan
E-mail: g44110g@nucc.cc.nagoya-u.ac.jp
and
Taketomo MITSUI
Graduate School of Human Informatics,
Nagoya Univ., Nagoya 464-01, Japan

ABSTRACT
Simulation of some stochastic integral processes is required in numerical solutions of stochastic differential equations (SDEs). The stochastic part of the error in simulation is considered, especially for the Wiener process $W(t)$ and the Wiener integral process $\int_0^t s\,dW(s)$ as the basics. Several weak numerical schemes are applied for a good approximation of statistical quantities of the solutions. The results show that the error depends on the number of trajectories, not on the stepsize. The way to realize basic integral processes is discussed.
1991 Mathematical Subject Classification: 65U05, 60H05, 60H10, 65L99

1. Introduction
Much literature has discussed numerical schemes for stochastic differential equations (SDEs) in both the strong and weak senses. As a mathematical error analysis of strong schemes, we proposed one which separates the global error into deterministic and stochastic parts; however, we discussed only the former [10]. This paper treats the latter. To this end, some stochastic integrals are studied with respect to their stochastic error part, along with the means of realization.
Stochastic integrals are stochastic processes appearing in the Ito-Taylor series expansion for the solution of an Ito stochastic differential equation. To realize the stochastic integrals on a digital computer is significant in the simulation of SDEs. The simplest example of stochastic integral processes is the standard Wiener process. We will consider one-dimensional stochastic integrals for simplicity.
The standard Wiener process is the Gaussian process which satisfies the following three properties:
(i) $P(W(0) = 0) = 1$,
(ii) $E(W(t)) = 0$ for all $t \in [0, \infty)$,
(iii) $C_W(t, s) = E(W(t)W(s)) = \min(t, s)$.
Simulation of the Wiener process on a digital computer requires the following discretization:

$$W(nh) = \sum_{i=0}^{n-1} \Delta W_i.$$

Here, the increments $\Delta W_i$ are

$$\Delta W_i = W((i + 1)h) - W(ih),$$

and $h$ is the step-size. The increment $\Delta W_i$, which obeys the normal distribution with zero mean and variance $h$, can be simulated by

$$\Delta W_i = \xi_i h^{1/2},$$

where $\xi_i$ is a normal random number with zero mean and variance 1. The set of such normal random numbers is denoted by $N(0, 1)$.
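The discretization above can be written down directly. The following Python sketch is only an illustration: it uses the standard-library generator `random.Random.gauss` as the source of pseudo-random normal numbers, which is an assumption of this example and not the generator used in the experiments reported below; `wiener_path` is a hypothetical name.

```python
import math
import random


def wiener_path(n_steps, h, rng):
    """Return [W(0), W(h), ..., W(n_steps * h)] built from increments xi * sqrt(h)."""
    w = [0.0]
    for _ in range(n_steps):
        xi = rng.gauss(0.0, 1.0)          # pseudo-random number standing in for N(0, 1)
        w.append(w[-1] + xi * math.sqrt(h))
    return w


rng = random.Random(0)                    # fixed seed for reproducibility
h = 2.0 ** -4
paths = [wiener_path(32, h, rng) for _ in range(2000)]

# Sample mean and variance of W(T) at T = 32 h = 2; theory gives 0 and T.
finals = [p[-1] for p in paths]
mean = sum(finals) / len(finals)
var = sum((x - mean) ** 2 for x in finals) / len(finals)
print(mean, var)
```

With a finite number of trajectories the sample mean and variance only approximate 0 and $T$, which is the kind of statistical error studied in the rest of this paper.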
If we carry out the simulation of integral processes with a digital computer, we will use pseudo-random numbers instead of normal random numbers. Therefore we have to consider the error caused by pseudo-random numbers. The order of convergence in the Monte Carlo method used here is known to be as low as $O(1/\sqrt{N})$ ($N$ is the number of samples). Nobody knows in advance, however, how many samples should be chosen and how they affect the error of the solution of the SDE. In this paper we will statistically study an estimate of the number of trajectories needed to achieve a certain accuracy, and the required independence of the random numbers generated in each time step.
These results can be applied to the simulation of SDEs. For example, consider a scalar linear autonomous SDE:

$$dX = X\,dt + X\,dW(t), \quad X(0) = 1, \tag{1}$$

which has the theoretical solution

$$X(t) = \exp\{\tfrac{1}{2}t + W(t)\}. \tag{2}$$

The solution (2) depends explicitly on the Wiener process $W(t)$. In general, SDEs do not have an explicit form of solution like (2). In such cases, stochastic multiple integrals appear in the solution expanded by the Ito-Taylor formula at $t = t_0$.
In the present paper we will carry out an error analysis of two stochastic processes, namely the Wiener process $W(t)$ and the integral process $\int_0^t s\,dW(s)$, which are incorporated in the Taylor scheme with global order 3. We adopt the distribution norm which KLAUDER and PETERSEN [3] used to estimate weak schemes.

2. Stochastic integral processes

A simple stochastic integral has the form

$$I(f) = \int_0^t f(s)\,dW(s) \tag{3}$$

with a continuous integrand $f(t)$. The stochastic integral in Eq. (3) should be interpreted in the Ito sense. That is, the integral (3) is defined as

$$\int_0^t f(s)\,dW(s) = \lim_{h \to 0} \sum_{k=0}^{n-1} f(t_k) \Delta W_k.$$

7
Here, let $t_0 = 0 < t_1 < \cdots < t_k < t_{k+1} < \cdots < t_n = t$, and let $\Delta W_k$ and $h$ stand for the following increments:

$$\Delta W_k = W(t_{k+1}) - W(t_k), \quad h = \max_k (t_{k+1} - t_k),$$

respectively. The convergence should be taken in the mean-square sense. For example, when $f(s) = 1$, the stochastic integral $I(f)$ expresses the Wiener process $W(t)$.
The following proposition is well known.
Proposition 1 Assume $f$ is sufficiently smooth. Then the following identities hold:

$$E(I(f)) = 0, \quad E(I(f)^2) = \int_0^t (f(s))^2\,ds.$$

3. Error estimates for stochastic integral processes

We will give a method for estimating the error of an integral process from its realization. Let $Y$ and $\bar{Y}$ stand for a stochastic integral process and its realization, respectively. We define the distribution error $e$ by the following:

$$e = e_1 + e_2, \quad e_1 = |EY - E\bar{Y}|, \quad e_2 = |E(Y - EY)^2 - E(\bar{Y} - E\bar{Y})^2|. \tag{4}$$

For a fixed $t$, the error (4) means the sum of $e_1$, the difference between the means of the random variables $Y$ and $\bar{Y}$, and $e_2$, the difference between their variances. This estimate (4) is to be used for weak schemes [3].
The method of error estimation could be considered in the following way, too. The Wiener process is the solution of the following simple SDE:

$$dY(t) = dW(t), \quad Y(0) = 0. \tag{5}$$

That is, the distribution error of the Wiener process is interpreted as the difference between the numerical and the exact solutions of the SDE (5). Similarly, the SDE for the stochastic integral process $I^{(0,1)}(t) = \int_0^t s\,dW(s)$ is the following:

$$dY(t) = t\,dW(t), \quad Y(0) = 0. \tag{6}$$

The distribution error is equivalent to that of a numerical scheme of local order greater than 2 applied to SDE (6).
The simulation of the Wiener process $W(t)$ and of the stochastic integral process $I^{(0,1)}(t)$ will be done in the following way. The increment of $I^{(0,1)}(t)$ can be written as follows:

$$\int_{t_i}^{t_{i+1}} s\,dW(s) = \int_{t_i}^{t_{i+1}} (s - t_i)\,dW(s) + \int_{t_i}^{t_{i+1}} t_i\,dW(s).$$

Here the increment $\Delta Z_i$ stands for the stochastic integral $\int_{t_i}^{t_{i+1}} (s - t_i)\,dW(s)$. The increments $\Delta W_i$ and $\Delta Z_i$ are replaced by the following expressions:

where $\xi_{i,1}$, $\xi_{i,2}$ are mutually independent normal random numbers in $N(0, 1)$. A computer simulation employs, however, pseudo-random numbers in place of them.
0 11
Thus the realized stochastic processes W and /J; - are n

z
i=D i=a vi

respectively. At the same time, the replacement

| 0 , 1 1
is possible for AZ,. Then the simulation of the stochastic process / is carried out
with
= E l&i + ItejA*
11
.=0
6
This simulation corresponds to the asymptotically efficient scheme proposed by Newton
4,5
for the SDE (6). Also the increment At7 approximating A ^ can be used for nu-
;

merical schemes with weak order 2,


Yet another simulation for A / } can be given. Corresponding to numerical
schemes of local order 1 or 2 for the SDE (6), we put AZ; = 0. That is, we carry out
the simulation with

1=0

We will estimate the 90% confidence interval of the distribution error of the discretized processes for $W(t)$ and $I^{(0,1)}(t)$. To obtain 100 samples for $e$, $L$ trajectories are generated for each sample. We call the set of trajectories for a sample a block, and each block is simulated independently. Then $Y_{i,j}^{N}$ means the sample of the $i$-th trajectory of the $j$-th block at the $N$-th step-point.

The implemented method of calculating the error at $T = Nh$ for $Y_{i,j}^{N}$ is as follows:

^ f=l ^ 1=1
2 2
S = \EY - M f l + | E K
} T T - ( E Y j - ) - ff + $Mff\.
A sample of the error $e$ at $T = Nh$ is $S_j$. Since $S_j$ is known to obey approximately the normal distribution for a large number of blocks due to the Central Limit Theorem, we can evaluate the mean of the error $e$ by using
1 100 , 100
According to statistical theory, the 90% confidence interval under the assumption of Student's $t$-distribution is given by

$$(\bar{S} - \Delta S \times 0.166,\ \bar{S} + \Delta S \times 0.166).$$
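The block procedure can be sketched in Python for the Wiener process. Several points here are assumptions of this illustration rather than statements of the paper: $\Delta S$ is taken to be the sample standard deviation of the 100 values $S_j$, the factor 0.166 is read as the 95% quantile of Student's $t$-distribution with 99 degrees of freedom (about 1.66) divided by $\sqrt{100}$, the generator is Python's `random.gauss`, and the function names are hypothetical.

```python
import math
import random
import statistics


def block_error_wiener(L, n_steps, h, rng):
    """Distribution error S_j of one block of L simulated Wiener trajectories at T = n_steps * h."""
    T = n_steps * h
    finals = []
    for _ in range(L):
        w = 0.0
        for _ in range(n_steps):
            w += rng.gauss(0.0, 1.0) * math.sqrt(h)
        finals.append(w)
    mean = sum(finals) / L
    var = sum((y - mean) ** 2 for y in finals) / L
    # e_1 + e_2 with the exact values E W(T) = 0 and Var W(T) = T
    return abs(0.0 - mean) + abs(T - var)


def confidence_interval(samples, factor=0.166):
    """Interval (mean - factor * dS, mean + factor * dS), dS = sample standard deviation."""
    s_bar = statistics.mean(samples)
    d_s = statistics.stdev(samples)
    return (s_bar - factor * d_s, s_bar + factor * d_s)


rng = random.Random(1)
S = [block_error_wiener(100, 8, 2.0 ** -4, rng) for _ in range(100)]
low, high = confidence_interval(S)
print(statistics.mean(S), (low, high))
```

Repeating the run with a larger `L` or a different `h` gives a quick way to see which of the two parameters the error responds to.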

4. Results of simulation

The following four types of stochastic process simulation are realized:

Wiener process:          $Y(t) = W(t)$ and $Y_n = W_n$.
Integral process type 1: $Y(t) = I^{(0,1)}(t)$ and $Y_n = I_n^{(0,1)}$.
Integral process type 2: $Y(t) = I^{(0,1)}(t)$ and $Y_n = J_n^{(0,1)}$.
Integral process type 3: $Y(t) = I^{(0,1)}(t)$ and $Y_n = K_n^{(0,1)}$.

We calculate the errors at $t = 0.5$, 1.0, 1.5 and 2.0 with the stepsizes $h = 2^{-4}$, $2^{-5}$
and $2^{-6}$. The number of trajectories $L$ in each block is taken as 100, 1000 and 10000.
The computer used in the simulation is a Macintosh SE/30, and the program RNORQ
by Kahaner et al. [2] is applied as the pseudo-random number generator.
The simulation results for the Wiener process are given in Figures 1 to 3 according
to the stepsizes $2^{-4}$, $2^{-5}$ and $2^{-6}$, respectively, while those for the integral
processes of types 1, 2 and 3 are shown in Figures 4 to 6, respectively. The latter cases are,
however, shown only for the stepsize $2^{-4}$, because the results with the other stepsizes are
almost the same as with the stepsize $2^{-4}$. In these Figures the marks +, o and • stand
for the value $\bar{S}$ at $L = 100$, 1000 and 10000, respectively, while the mark - stands for
the upper and lower bounds of the confidence interval.
From these simulation results we can conclude as follows.
(i) For the Wiener process the magnitude of the error depends on the number of
trajectories, not on the stepsize.
(ii) For the stochastic integral process $I^{(0,1)}(t)$, as for the Wiener process, the error
depends on the number of trajectories. The rate of growth of the error versus
$t$ is, however, bigger than that of the Wiener process.
(iii) For $I^{(0,1)}$ the simulations $I_n^{(0,1)}$ and $J_n^{(0,1)}$ cannot be distinguished from
each other. However, the simulation $K_n^{(0,1)}$ is considerably different from the other two.
We will give a more detailed discussion on the above item (iii).
With the stepsize $h = 2^{-4}$ and the same samples of $\xi_{i,1}$, the values of $\bar{S}$ of $I$, $J$
and $K$ for $L = 100$, 1000 and 10000 are shown in Tables 1-3.

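Table 1. $\bar{S}$ for integral processes ($h = 2^{-4}$), $L = 100$

    t      I^{(0,1)}              J^{(0,1)}              K^{(0,1)}
    0.5    $1.98 \times 10^{-2}$  $1.99 \times 10^{-2}$  $2.20 \times 10^{-2}$
    1.0    $8.01 \times 10^{-2}$  $8.05 \times 10^{-2}$  $8.41 \times 10^{-2}$
    1.5    $1.87 \times 10^{-1}$  $1.86 \times 10^{-1}$  $2.01 \times 10^{-1}$
    2.0    $4.30 \times 10^{-1}$  $4.29 \times 10^{-1}$  $4.21 \times 10^{-1}$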

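Table 2. $\bar{S}$ for integral processes ($h = 2^{-4}$), $L = 1000$

    t      I^{(0,1)}              J^{(0,1)}              K^{(0,1)}
    0.5    $6.68 \times 10^{-3}$  $6.72 \times 10^{-3}$  $1.21 \times 10^{-2}$
    1.0    $2.72 \times 10^{-2}$  $2.73 \times 10^{-2}$  $4.38 \times 10^{-2}$
    1.5    $7.01 \times 10^{-2}$  $7.01 \times 10^{-2}$  $1.02 \times 10^{-1}$
    2.0    $1.27 \times 10^{-1}$  $1.27 \times 10^{-1}$  $1.85 \times 10^{-1}$
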
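Table 3. $\bar{S}$ for integral processes ($h = 2^{-4}$), $L = 10000$

    t      I^{(0,1)}              J^{(0,1)}              K^{(0,1)}
    0.5    $2.28 \times 10^{-3}$  $2.31 \times 10^{-3}$  $9.12 \times 10^{-3}$
    1.0    $8.80 \times 10^{-3}$  $8.79 \times 10^{-3}$  $3.52 \times 10^{-2}$
    1.5    $2.15 \times 10^{-2}$  $2.17 \times 10^{-2}$  $7.94 \times 10^{-2}$
    2.0    $4.31 \times 10^{-2}$  $4.30 \times 10^{-2}$  $1.29 \times 10^{-1}$
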
Differences are not observed between $I_n^{(0,1)}$ and $J_n^{(0,1)}$. Statistical theory
indicates that, as the number of trajectories increases, the distribution error $e$ of
$I_n^{(0,1)}$, $J_n^{(0,1)}$ and $K_n^{(0,1)}$ tends to 0, $nh^3/12$ and $|n^2h^3/2 - nh^3/6|$,
respectively. With $t = nh$ the latter two limits read $th^2/12$ and $|t^2h/2 - th^2/6|$. The
theoretical values of $e$ with $h = 2^{-4}$ are listed in Table 4.

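Table 4. Theoretical values of error for $J$ and $K$ ($h = 2^{-4}$)

    t      J: $th^2/12$           K: $|t^2h/2 - th^2/6|$
    0.5    $1.63 \times 10^{-4}$  $7.49 \times 10^{-3}$
    1.0    $3.26 \times 10^{-4}$  $3.06 \times 10^{-2}$
    1.5    $4.88 \times 10^{-4}$  $6.93 \times 10^{-2}$
    2.0    $6.51 \times 10^{-4}$  $1.24 \times 10^{-1}$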
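
The theoretical values can be checked by direct summation. The sketch below assumes that all three realizations have mean zero, so that the limiting distribution error is the difference between the exact variance $t^3/3$ of $\int_0^t s\,dW(s)$ and the variance of the discrete sum. It further assumes that $J_n = \sum (t_i + h/2)\Delta W_i$ and $K_n = \sum t_i\,\Delta W_i$ with $t_i = ih$. The function name `limiting_errors` is hypothetical.

```python
def limiting_errors(t, h):
    """Variance errors of J_n and K_n at time t = n*h, by direct summation."""
    n = round(t / h)
    exact_var = t ** 3 / 3.0                                   # Var of int_0^t s dW(s)
    var_j = sum((i * h + h / 2) ** 2 for i in range(n)) * h    # J_n = sum (t_i + h/2) dW_i
    var_k = sum((i * h) ** 2 for i in range(n)) * h            # K_n = sum t_i dW_i
    return abs(exact_var - var_j), abs(exact_var - var_k)


h = 2.0 ** -4
for t in (0.5, 1.0, 1.5, 2.0):
    err_j, err_k = limiting_errors(t, h)
    # closed forms: t*h^2/12 and |t^2*h/2 - t*h^2/6|
    print(f"{t:3.1f}  {err_j:.2e}  {err_k:.2e}")
```

The summed values agree with the closed forms $th^2/12$ and $|t^2h/2 - th^2/6|$ up to rounding.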

Comparing Tables 1 to 3 with Table 4, it is quite reasonable that the difference
between $I$ and $J$ did not appear, owing to the insufficient number of trajectories.
In contrast, $K$ attains an error level comparable to the theoretical one for
$h = 2^{-4}$ and $L = 10000$. This suggests that in the case of $h = 2^{-4}$ the
numerical schemes of local order 1 or 2 are generally not usable for SDEs even with
a very large number of trajectories. On the other hand, $I_n^{(0,1)}$ can be replaced with
$J_n^{(0,1)}$, an easier realization for $I^{(0,1)}$, when SDEs are solved by the weak schemes
with the number of trajectories around 10000.

5. Summary and future aspects

The stochastic processes investigated here are restricted to the most basic ones, the
Wiener process $W(t)$ and the Wiener integral process $I^{(0,1)} = \int_0^t s\,dW(s)$. However,
Section 4 suggests that the error in the Wiener process is the main part of the error
in the numerical solution of SDEs. Thus it is natural to expect that the other integral
processes share the property of the Wiener process with respect to the number of
trajectories. We will also try to analyse stochastic integral processes which have an
arbitrary function $f(s)$ as integrand.
We have shown that the error in the simulation of stochastic integral processes depends
on the number of trajectories. Thus the error can be made small by increasing the number
of trajectories. It implies, however, that we have to generate considerably many
trajectories to achieve a desired accuracy.
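
How the error shrinks with the number of trajectories can be observed with a small experiment on the Wiener process. The following Python sketch is illustrative only: it assumes the block error is the absolute error of the sample mean plus that of the sample variance, uses only 10 blocks instead of 100 to keep the run time short, and draws all blocks from one stream of Python's standard generator with an arbitrary seed. The function name `mean_block_error` is hypothetical.

```python
import math
import random


def mean_block_error(num_traj, n_steps, h, blocks, rng):
    """Average over blocks of |mean error| + |variance error| for W at T = n_steps*h."""
    t_end = n_steps * h
    total = 0.0
    for _ in range(blocks):
        s1 = s2 = 0.0
        for _ in range(num_traj):
            # W_n as a sum of n_steps independent increments xi * sqrt(h)
            w = sum(rng.gauss(0.0, 1.0) for _ in range(n_steps)) * math.sqrt(h)
            s1 += w
            s2 += w * w
        m1, m2 = s1 / num_traj, s2 / num_traj
        # exact mean is 0 and exact variance is t_end
        total += abs(m1) + abs(m2 - m1 * m1 - t_end)
    return total / blocks


rng = random.Random(2024)   # assumed seed, for reproducibility only
errors = {L: mean_block_error(L, 8, 2.0 ** -4, 10, rng) for L in (100, 1000, 10000)}
for L, s in errors.items():
    print(L, f"{s:.3e}")
```

If the error is dominated by sampling fluctuations, the Central Limit Theorem suggests a decrease roughly proportional to $1/\sqrt{L}$, so a hundredfold increase in trajectories buys about one decimal digit.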
Furthermore, we adopted only a single pair of the starting value and the seed
for the pseudo-random number generator. Actual simulations should be carried out
with, say, 100 independent blocks of samples, which require multiple pairs for the
generator. Thus, together with the requirement of considerably many trajectories,
implementation on a parallel computer is recommended. The effect of multiple
pairs of the starting value and the seed should be examined carefully.
Simulations for multi-dimensional integral processes will be treated in the same way
as the 1-dimensional cases. Extrapolating from the latter, the number of trajectories
required is predicted to be far larger.

6. References
1. Arnold, L., Stochastic Differential Equations, Wiley, New York, 1974.
2. Kahaner, D., Moler, C., and Nash, S., Numerical Methods and Software, Prentice
Hall Inc., Englewood Cliffs, 1989.
3. Klauder, J.R., and Petersen, W.P., Numerical integration of multiplicative-noise
stochastic differential equations, SIAM J. Numer. Anal., 22 (1985), 1153-1166.
4. Kloeden, P.E., and Platen, E., The Numerical Solution of Stochastic Differential
Equations, Springer, Berlin, 1992.
5. Kloeden, P.E., and Platen, E., A survey of numerical methods for stochastic
differential equations, J. Stoch. Hydrol. Hydraulics, 3 (1989), 155-178.
6. Newton, N.J., Asymptotically efficient Runge-Kutta methods for a class of Ito and
Stratonovich equations, SIAM J. Appl. Math., 51 (1991), 542-567.
7. Pardoux, E., and Talay, D., Discretization and simulation of stochastic differential
equations, Acta Appl. Math., 3 (1985), 23-47.
8. Rumelin, W., Numerical treatment of stochastic differential equations, SIAM J.
Numer. Anal., 19 (1982), 604-613.
9. Saito, Y., and Mitsui, T., Discrete approximations for stochastic differential
equations, Trans. Japan SIAM, 2 (1992), 1-16 (in Japanese).
10. Saito, Y., and Mitsui, T., Simulation of stochastic differential equations, Ann.
Inst. Statist. Math., 45 (1993), 419-432.
11. Saito, Y., and Mitsui, T., Stochastic part of error in numerical schemes for
stochastic differential equations, Trans. Japan SIAM, 4 (1994), 127-139 (in
Japanese).
12. Talay, D., Simulation and numerical analysis of stochastic differential systems,
INRIA Report 1313, 1990.

Figure 1: Confidence intervals of the Wiener process ($h = 2^{-4}$).
Figure 2: Confidence intervals of the Wiener process ($h = 2^{-5}$).
Figure 3: Confidence intervals of the Wiener process ($h = 2^{-6}$).
Figure 4: Confidence intervals of the integral process $I$ ($h = 2^{-4}$).
Figure 5: Confidence intervals of the integral process $J$ ($h = 2^{-4}$).
Figure 6: Confidence intervals of the integral process $K$ ($h = 2^{-4}$).
(In all figures the horizontal axis is $t$; marks: + $L = 100$, o $L = 1000$, • $L = 10000$.)