(Kyoto Workshop On Numerical Analysis of Odes (199
NUMERICAL ANALYSIS OF
ORDINARY DIFFERENTIAL
EQUATIONS AND ITS
APPLICATIONS
Editors
T Mitsui
Nagoya University, Japan
Y Shinohara
Tokushima University, Japan
World Scientific
Singapore • New Jersey • London • Hong Kong
Published by
World Scientific Publishing Co. Pte. Ltd.
P O Box 128, Farrer Road, Singapore 9128
USA office: Suite 1B, 1060 Main Street, River Edge, NJ 07661
UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE
For photocopying of material in this volume, please pay a copying fee through the Copyright
Clearance Center, Inc., 222 Rosewood Drive, Danvers, Massachusetts 01923, USA.
ISBN 981-02-2229-7
Preface
Numerical solutions of ordinary differential equations (ODEs) are broadly recognized as
not only interesting in theoretical study but also useful in practical applications. This
is the reason why the numerical analysis of ODEs has attracted so much research
in the scientific computation community. One might be aware that this year is the centennial
of the historical article of C. RUNGE, "Über die numerische Auflösung von
Differentialgleichungen", which appeared in Mathematische Annalen as the pioneering work on more
sophisticated and effective numerical solution of ODEs.
Hoping that this volume contributes to the progress of numerical analysis of ODEs, we are
publishing it as a collection of original research articles. The contributions in this volume are
mainly based on those which were submitted to the 1994 Kyoto Workshop on Numerical Analysis
of ODEs held in November of 1994 at the Research Institute for Mathematical Sciences,
Kyoto University. The topics of the articles range widely, although they all touch
more or less upon the numerical solution of ODEs. They reflect the state of the art of the
study in numerical analysis.
The topics treated in the volume are: discrete variable methods, Runge-Kutta methods,
linear multistep methods, stability analysis, parallel implementation, self-validating
numerical methods, analysis of nonlinear oscillation by numerical means, differential-algebraic
and delay-differential equations, stochastic initial value problems and so on. Readers will be
able to recognize the recent development of these topics.
Last, but not least, we express our sincere gratitude to the present authors of the volume
as well as to the contributors of the Workshop.
CONTENTS
Preface v
Analysis of the Milne Device for the Finite Correction Mode of the
Adams PC Methods I
M. Fuji 75
LIMITING FORMULAS OF EIGHT-STAGE EXPLICIT
RUNGE-KUTTA METHOD OF ORDER SEVEN
HARUMI ONO
Faculty of Engineering, Chiba University
1-33 Yayoicho, Inage-ku, Chiba 263, Japan
E-mail: aB9600@tansei.cc.u-tokyo.ac.jp
ABSTRACT
It is well known that eight-stage explicit Runge-Kutta formulas are of order at
most six. However, by taking the limit as the first abscissa approaches zero, the
formulas can achieve seventh order. Such formulas are called limiting formulas,
which require the evaluation of the second derivatives of the solution. In this paper,
eight-stage seventh order limiting formulas using the second derivatives are derived.
Based on these limiting formulas, new eight-stage methods that are numerically of
seventh order and use no derivatives are proposed.
1. Introduction
2. Limiting formulas
$$\frac{dy}{dt} = f(t, y), \qquad y(t_0) = y_0,$$
where $f$ and $y$ are vectors and $f$ is assumed to be differentiable sufficiently often for
the definition to be meaningful. The parameters of an $s$-stage explicit Runge-Kutta
method are represented in the following Butcher array$^{2}$:
$$\begin{array}{c|cccc}
c_2 & a_{21} \\
c_3 & a_{31} & a_{32} \\
\vdots & \vdots & & \ddots \\
c_s & a_{s1} & a_{s2} & \cdots & a_{s,s-1} \\ \hline
 & b_1 & b_2 & \cdots & b_s
\end{array}$$
And $y_i$ is used to denote the $y$ ordinate at the abscissa $c_i$, namely,
$$y_i = y_n + h \sum_{j=1}^{i-1} a_{ij} f_j,$$
where
$$f_1 = f(t_n, y_n), \qquad f_i = f(t_n + c_i h, y_i) \quad (i = 2, 3, \dots, s).$$
Using them, the method can be written as
$$y_{n+1} = y_n + h \sum_{i=1}^{s} b_i f_i.$$
Many eight-stage sixth order formulas are known and their properties are precisely
reported$^{5}$.
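The stage recursion above translates directly into code. The following is a minimal sketch of one step of a generic explicit Runge-Kutta method driven by a Butcher array; the function name `explicit_rk_step` and the tableau constants are illustrative choices, and the classical four-stage fourth order tableau is used only as a familiar stand-in, not as one of the eight-stage formulas of this paper.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def explicit_rk_step(
    f: Callable[[float, Vector], Vector],
    t_n: float,
    y_n: Vector,
    h: float,
    a: Sequence[Sequence[float]],
    b: Sequence[float],
    c: Sequence[float],
) -> Vector:
    """Advance y_n by one step of the explicit RK method (a, b, c)."""
    s = len(b)
    dim = len(y_n)
    stages: List[Vector] = []
    for i in range(s):
        # y_i = y_n + h * sum_{j<i} a_ij f_j  (row i of the array holds a_i1..a_i,i-1)
        y_i = [
            y_n[k] + h * sum(a[i][j] * stages[j][k] for j in range(i))
            for k in range(dim)
        ]
        stages.append(f(t_n + c[i] * h, y_i))
    # y_{n+1} = y_n + h * sum_i b_i f_i
    return [y_n[k] + h * sum(b[i] * stages[i][k] for i in range(s)) for k in range(dim)]


# Classical RK4 tableau, used here only as a well-known example (assumption:
# it is not one of the formulas derived in the text).
RK4_A = [[], [0.5], [0.0, 0.5], [0.0, 0.0, 1.0]]
RK4_B = [1 / 6, 1 / 3, 1 / 3, 1 / 6]
RK4_C = [0.0, 0.5, 0.5, 1.0]


def integrate(f, y0, t_end, n_steps, a=RK4_A, b=RK4_B, c=RK4_C):
    """Apply explicit_rk_step n_steps times on [0, t_end]."""
    h = t_end / n_steps
    t, y = 0.0, list(y0)
    for _ in range(n_steps):
        y = explicit_rk_step(f, t, y, h, a, b, c)
        t += h
    return y
```

Any other explicit tableau can be passed through the `a`, `b`, `c` arguments, as long as row `i` of `a` lists the coefficients of the earlier stages only.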
An eight-stage limiting formula that uses the values of the second derivatives at
the point $(t_n, y_n)$ has the form
$$\begin{aligned}
f_1 &= f(t_n, y_n), \qquad F_2 = D(f(t_n, y_n))\, v(f_1), \\
y_3 &= y_n + h(a_{31} f_1 + h\alpha_3 F_2), \\
f_3 &= f(t_n + c_3 h, y_3), \\
y_i &= y_n + h\Bigl(a_{i1} f_1 + \sum_{j=3}^{i-1} a_{ij} f_j + h\alpha_i F_2\Bigr), \\
f_i &= f(t_n + c_i h, y_i) \quad (i = 4, 5, \dots, 8), \\
y_{n+1} &= y_n + h\Bigl(b_1 f_1 + \sum_{i=3}^{8} b_i f_i + h\beta F_2\Bigr), \qquad (1)
\end{aligned}$$
where $D(f(t_n, y_n))$ and $v(f_1)$ denote the Jacobian matrix of $f$ at the point $(t_n, y_n)$ and
the vector $(1, f_1^1, \dots, f_1^r)^T$ respectively (the superscripts denote the component
numbers). The parameters of this limiting formula can be written in the following
array analogous to Butcher array:
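$$\begin{array}{c|cccccc|c}
c_3 & a_{31} & & & & & & \alpha_3 \\
c_4 & a_{41} & a_{43} & & & & & \alpha_4 \\
\vdots & \vdots & \vdots & \ddots & & & & \vdots \\
c_8 & a_{81} & a_{83} & a_{84} & \cdots & a_{87} & & \alpha_8 \\ \hline
 & b_1 & b_3 & b_4 & \cdots & b_7 & b_8 & \beta
\end{array}$$
Here $c_8 = 1$, $b_3 = 0$
and the following simplifying assumptions hold:
$$\alpha_3 = \frac{c_3^2}{2}, \qquad (2)$$
$$\sum_{j=3}^{i-1} a_{ij} c_j + \alpha_i = \frac{c_i^2}{2} \quad (i = 4, 5, \dots, 8), \qquad (3)$$
$$\sum_{j=3}^{i-1} a_{ij} c_j^2 = \frac{c_i^3}{3} \quad (i = 4, 5, \dots, 8). \qquad (4)$$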
Comparing the Taylor series expansion of Eq.(1) with that of the true value
$y(t_n + h)$ and matching the coefficients of each elementary differential, after tedious
computation, we get the following equations of condition for seventh order accuracy:
i-l
C3i-c ,
3 a„ + ^ O i j = a (t = 4,5, • - - ,8), (5)
j=3
X>,a* = 0, (7)
i=4
jZ ^ ^ = 0, (8)
1=5 2=*
E 6 t L « i E « * « * 3 = o,
J J (9)
1=6 ,=S Jr=4
X > E W j 3 = 0, ( )
10
fe + £ 6 j = l , (11)
1=4
X>? = i (13)
J
i=4
M i=4
s i-i j - I
E^E^E
E ^ / E ^ ^ , (is)
1 Z U
i=6 ,= 5 *=4
1 9
E^E"./E^E^? = i t )
I D U
,=7 ,=6 i=5 1=4 '
i-1 I - l
1
E * .y=i
.=S E a . , E «*»^ = 2io. ( 2 1 )
i=6 ;=S it =4
8 i-l J-I J:-l
E ^ E ^ E ^ E ^ N — , (22)
i=7 ,=6 t=5 1=4
m s=s *=4 l b M
2.2. Solutions
Hereafter, we assume that all abscissas are distinct and are not equal to 0. And
we use, for later convenience, the notations
E km = 6,(1-c,) = P i (j = 4,5,6,7),
i=j+l
E E f E ^#U<= E ( E ( E v W U , = % i 3
d
Using the notation p;, we can rewrite Eqs.(14), (15), (17) and (20) as
And we get
UcjC
lltjCI: -
- 7(cj
(LCj +
TU ) +T 4*
CJ;
k . C G I fuel
n - " 3 8S 4 — ,r }r
f c0&^: -—^T)T^T-: 0 ( t , j , * = 4,5,6). (25)
2^ =
360' S^' =
840
and _
( M = 4 , 5 )
- ( 2 6 )
By using p^ a r , Eqs.(6), (11), (12) and (13), the parameters of the method can
if ;
be expressed rationally in terms of c^'s, provided that all denominators of a./s do not
vanish;
bi and 0 are
3
k = - ^ - (i = 4,5,6,7), 6 = |-£fti&
8 h = \-hh> (27)
1 — c,- 3
i=4 i=4
2 8
& = ( )
' i=4
P7 (29)
Q S T =
b»'
a 76 = > ass = 7-1 (30)
Pi
1
o 65 = 1 «75 =
0-6
6
The last equation of condition given by Eq.(23) must be satisfied with the solutions
obtained above under the assumption given by Eq.(4). By trivial manipulations, we
get the following relation between $c_4$ and $c_5$:
The parameter a obtained from Eq.(23) and c can be rewritten using the relation
54 4
given by Eq.(32) as
c c
4l i ~ *l
• = - 3 (33)
And the other a 's are
it
5 6 7 8 35
«« = j | , 00 = t | ( f " S « « ^ J (' " ' ' ' > - < >
These a, 's are found to satisfy Eqs.(7), (8), (9) and (10) by a straightforward com-
3
and ^ ^
^ = f - £ > ^ (. = 4,5,--.,8). (37)
* ,=3
Now, we have obtained a set of parameters of the eight-stage seventh order limiting
formula with four free parameters, $c_3$, $c_4$, $c_6$ and $c_7$.
In the solutions obtained in the previous section, four parameters $c_3$, $c_4$, $c_6$ and $c_7$
are free to be chosen in any way. In this section we will consider how to determine
these parameters.
The stability region depends on only one free parameter $c_4$. It is desirable to
determine $c_4$ so as to maximize the stability region. At the same time it is preferable
that every parameter is a number requiring a small number of digits and small in
magnitude. We intend to derive the eight-stage formulas which achieve numerically
seventh order by replacing derivatives with numerical differentiation. The key point
in deriving these formulas is that the error caused by numerical differentiation does not
dominate the leading error term of the limiting formula.
Here, we will present two sets of free parameters. One of them gives parameters
requiring a comparatively small number of digits, and the other gives a comparatively
large stability region.
3.1. Stability
The polynomial $r$ which determines the stability of the eight-stage seventh order
limiting formula given by Eq.(1) is
$$r(z) = 1 + z + \frac{z^2}{2!} + \cdots + \frac{z^7}{7!} + \gamma z^8,$$
where
$$\gamma = \frac{c_4(3 - 7c_4)}{15120\,(14c_4^2 - 12c_4 + 3)}$$
and $z$ is a complex number. The stability region is the set of points for which
$|r(z)| < 1$. Let the simply connected interval $(-d, 0)$ be the intersection of the
stability region with the negative part of the real axis. This interval is called the stability
interval. The boundaries of the stability regions for several values of $\gamma$ are shown in
Fig. 1 with the values of $\gamma$ attached. The graph indicates that
$$\gamma \in \Bigl(\frac{1}{70000}, \frac{1}{68000}\Bigr) \approx (0.143 \times 10^{-4},\ 0.147 \times 10^{-4}) \qquad (38)$$
gives the maximum stability region. In the case where the range of $c_4$ is restricted to
3 1
0< c <- 4 or - < c < 1. 4
Fig. 3. Stability boundaries for $c_4 = 2/7$ and $11/28$.
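The dependence of the stability interval on $c_4$ can be explored numerically. The sketch below assumes the stability polynomial $r(z) = \sum_{k=0}^{7} z^k/k! + \gamma z^8$ with $\gamma = c_4(3 - 7c_4)/(15120(14c_4^2 - 12c_4 + 3))$; this expression for $\gamma$ is our reading of the printed formula, and it reproduces the values $0.1455 \times 10^{-4}$ and $0.265 \times 10^{-4}$ quoted in the text for $c_4 = 11/28$ and $2/7$. The function names are illustrative, and the scan simply looks for the first point on the negative real axis where $|r(x)| > 1$.

```python
from math import factorial


def gamma_from_c4(c4: float) -> float:
    """Coefficient of z^8 in the stability polynomial (assumed form, see lead-in)."""
    return c4 * (3 - 7 * c4) / (15120 * (14 * c4**2 - 12 * c4 + 3))


def stability_poly(z: complex, gamma: float) -> complex:
    """r(z) = 1 + z + ... + z^7/7! + gamma * z^8."""
    return sum(z**k / factorial(k) for k in range(8)) + gamma * z**8


def real_stability_interval(r, step: float = 1e-3, x_max: float = 50.0) -> float:
    """Return d such that |r(x)| <= 1 for x in [-d, 0], up to the first exit point."""
    x = 0.0
    # Coarse scan along the negative real axis.
    while x < x_max and abs(r(-(x + step))) <= 1.0:
        x += step
    lo, hi = x, x + step
    # Bisection to refine the exit point between the last good and first bad sample.
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if abs(r(-mid)) <= 1.0:
            lo = mid
        else:
            hi = mid
    return lo


def stability_interval_for_c4(c4: float) -> float:
    g = gamma_from_c4(c4)
    return real_stability_interval(lambda z: stability_poly(z, g))
```

Calling `stability_interval_for_c4(11 / 28)` and `stability_interval_for_c4(2 / 7)` gives the lengths of the two stability intervals compared in the text; the routine can be checked independently on polynomials whose intervals are well known, such as $1 + z$.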
11/28. The value of $\gamma$ is about $0.1455 \times 10^{-4}$. Another choice of the value of $c_4$ is
2/7. For this value of $c_4$, $\gamma$ is about $0.265 \times 10^{-4}$ and is outside the interval given by
Eq.(38). But we can determine all parameters which require a smaller number of digits
than those for $c_4 = 11/28$. The boundaries of the stability regions for $c_4 = 2/7$ and
4
22 , 11
* = - for c = - .
4 (40)
9
with some small value of $\varepsilon$. So, it is desirable that the error caused by numerical
differentiation is as small as possible.
The magnitudes of the optimum $\varepsilon_{opt}$ and the error $E_{opt}$ of $hF_2$ in Eq.(1) are roughly
estimated as
and
d
he written
i=*
(43)
i=4 j=3
where $G_1$ and $G_2$ are vectors which depend on the function $f$. We see from the
previous section 2.2 that the coefficients of $h^2 E_{opt} G_1$ and $h^3 E_{opt} G_2$ are
i=4
- (70c C CeC
4 s 7 - 35(050607 + QCfiCy + C4C5C7 + C^Cfi)
+21(C4C S + C4C6 + C C 4 7 + C Cb5 + C C + CgC?)
5 7
and
35c c ce - 14(c c6 + C CB + c c ) + 7(c, + c + C s ) - 4
4 s 5 4 4 s 5
£ <E b a a
ai (45)
840c c 4 sCfi
i= 4 j=3
free parameters so that $\beta$ vanishes, not only the second term but also the third term
of the right-hand side of Eq.(43) vanishes. In this case, the leading term of the error
caused by numerical differentiation becomes
3 4 2
0(h E ) opl <x h -p-"'
get
(25ce - 18)c - 18ce + 14 , 2 T
8 =
w 2
for c = - 4
240csc 7 7
and
„ (65ce + 63)c + 63^ - 56 , 11 7
A =
HMOc^ ^ *~W
We want to find the values of $c_6$ and $c_7$ so that $\beta$ vanishes, and that they give all
a n d
4 28 , 11
C 6 = C 7 = f 0 C
5 ' 575
3.3. Two sets of parameters for eight-stage seventh order limiting formulas
get
1 2
c =- tfor C 4 = - 3
a n d
7 # 11
C 3 = f r C
20 ° ' = 28'
Now, two sets of abscissas are obtained:
- , f\ 2 2 4 \ \
( ,c ,c ,c ,. ) = ( - - , _ - - j
C3 4 5 6 7 1 1 (46)
and
. , ( 7 11 22 4 28 \
(C3, C , C , 0 6 , 0 7 ) =
4 5 - — j (47)
Substituting Eq.(46) into Eqs.(27), (29), (30), (31), (33), (34), (35), (36) and (37),
we get formula 1. The parameters of this formula are shown in Table 1. The stability
region of this formula is not so large, but its parameters are numbers requiring a
comparatively small number of digits. For Eq.(47), we get formula 2. This formula
is nearly the best formula from the viewpoint of stability region. The parameters of
formula 2 are shown in Table 2.
Table 1. Formula 1
The eight-stage formulas which achieve numerically the same accuracy as the
seventh order limiting formulas are obtained by replacing derivatives with the simplest
numerical differentiations. Namely, in the formula given by Eq.(l) we compute as
follows:
/ i = /(**. if*),
m
where
t = V ? / I
= ( 33554432/, f o r 1 4 t o t h e b a s e 1 6
( w )
o r
A j 'giffi ^ 8 digits to the base 16.
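The idea of replacing the second derivative by a numerical differentiation can be illustrated as follows. The sketch assumes the simplest forward difference along the solution direction, $F_2 \approx [f(t_n + \varepsilon, y_n + \varepsilon f_1) - f_1]/\varepsilon$; this particular difference quotient and the function name `second_derivative_fd` are assumptions made for illustration, not a transcription of the formula printed above. The default $\varepsilon = 2^{-25} = 1/33554432$ is likewise only a plausible reading of the constant that appears in the text.

```python
from typing import Callable, List

Vector = List[float]


def second_derivative_fd(
    f: Callable[[float, Vector], Vector],
    t: float,
    y: Vector,
    eps: float = 2.0**-25,
) -> Vector:
    """Forward-difference approximation of y'' = D f(t, y) . (1, f)^T.

    Assumed scheme: differentiate s -> f(t + s, y + s f(t, y)) at s = 0
    with a one-sided difference of width eps.
    """
    f1 = f(t, y)
    y_shift = [yk + eps * fk for yk, fk in zip(y, f1)]
    f_shift = f(t + eps, y_shift)
    return [(fs - fk) / eps for fs, fk in zip(f_shift, f1)]
```

The truncation error of this quotient is proportional to `eps`, while the rounding error is proportional to the unit roundoff divided by `eps`, which is why a value of `eps` near the square root of the unit roundoff is the natural choice in each precision.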
Fig. 4. Largest errors in numerical solution of example 1 at the last step.
To show that formula 1 and formula 2 achieve seventh order, we give the errors
in the numerical solution of a system of equations.
Example 1 Integrate
$$\frac{dy_1}{dt} = y_2 y_3, \qquad y_1(0) = 0,$$
$$\frac{dy_2}{dt} = -y_1 y_3, \qquad y_2(0) = 1,$$
$$\frac{dy_3}{dt} = -k^2 y_1 y_2, \qquad y_3(0) = 1, \quad k^2 = 0.51$$
over the range $[0, 60]$. The largest errors of both formulas at the last step for
various values of $h$ are shown in Fig. 4. From Fig. 4 we see that both formulas are
exactly of order seven because the accumulated truncation errors of both formulas
are proportional to $h^7$.
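The way an order of accuracy is read off from errors at several step sizes can be sketched in a few lines. The example below is only an illustration of the procedure: it uses the classical fourth order Runge-Kutta method as a stand-in (the coefficient tables of formula 1 and formula 2 are not reproduced here), a shortened interval, and the system $y_1' = y_2 y_3$, $y_2' = -y_1 y_3$, $y_3' = -0.51\, y_1 y_2$ with $y(0) = (0, 1, 1)$, which is our reading of Example 1. The observed order is estimated from the differences of solutions computed with $h$, $h/2$ and $h/4$.

```python
from math import log2

K2 = 0.51  # constant of Example 1, read here as k^2 (assumption)


def euler_rhs(t, y):
    """Right-hand side of the assumed test system."""
    y1, y2, y3 = y
    return [y2 * y3, -y1 * y3, -K2 * y1 * y2]


def rk4_integrate(f, y0, t_end, n_steps):
    """Classical RK4 with constant step; stand-in for the formulas of the paper."""
    h = t_end / n_steps
    t, y = 0.0, list(y0)
    for _ in range(n_steps):
        k1 = f(t, y)
        k2 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
        k3 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
        k4 = f(t + h, [yi + h * ki for yi, ki in zip(y, k3)])
        y = [
            yi + h / 6 * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)
        ]
        t += h
    return y


def observed_order(f, y0, t_end, n_steps):
    """Estimate the order p from solutions with steps h, h/2, h/4."""
    coarse = rk4_integrate(f, y0, t_end, n_steps)
    medium = rk4_integrate(f, y0, t_end, 2 * n_steps)
    fine = rk4_integrate(f, y0, t_end, 4 * n_steps)
    e1 = max(abs(u - v) for u, v in zip(coarse, medium))
    e2 = max(abs(u - v) for u, v in zip(medium, fine))
    return log2(e1 / e2)
```

For a method of order $p$ the ratio of successive differences tends to $2^p$, so `observed_order` should return a value close to 4 for this stand-in; with a seventh order formula the same procedure would return a value close to 7, which corresponds to the slope seen in Fig. 4.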
Next, to illustrate that the formulas given in the section 4 achieve numerically
the same accuracy as the limiting formula, we present the results of an equation by
formula 1 and formula 1'.
Example 2 Integrate
dy = _ W I + 1)
J V W
dt 3j/ (te<-6) '
over the range $[0, 1]$. The errors in numerical solutions at $t = 1$ for various values
of $h$ are shown in Fig. 5. The computations were performed in double and quadruple
precision arithmetic, using $\varepsilon$ for double precision arithmetic given by Eq.(48).
Observations of Fig. 5 are as follows:
(i) Formula 1' achieves the same accuracy as the formula 1 for all values of $h$ in
double precision arithmetic.
6. References
ABSTRACT
Collocation Runge-Kutta formulae, a dominant class of implicit methods, are
considered for numerical initial value problems of ODEs. Since they are fully
characterized by their abscissae, we propose a systematic way to generate
collocation Runge-Kutta formulae of the same number of stages by a gradual
change of abscissae. Their orders increase with the change, until they finally
coincide with the Butcher-Kuntzmann formula of the specified number of stages.
Their A-stability is also investigated by representing the stability factor with the
abscissae.
1. Introduction
the class of Runge-Kutta formulae (RK formulae, in short) is most important among
many discrete variable methods. Especially the stiff problem of Eq. (1) requires
highly sophisticated methods, one of which is the implicit RK formula.
It has the form
Here, the interval of integration $[a, b]$ is divided by the step-size $h$ so that the
step-points are given by
$$x_n = a + nh \quad (n = 0, 1, \dots, N). \qquad (3)$$
The real parameters $a_{ij}$, $b_j$ and $c_i$ specify the method. Formula (2), which is called an
$s$-stage RK, is usually assumed to satisfy
$$c_i = \sum_{j=1}^{s} a_{ij} \quad (i = 1, \dots, s) \qquad (4)$$
so that the RK formula gives the same result for a non-autonomous ODE as well as
for its autonomous counterpart.
To specify the formula, i.e. to determine the parameters $a_{ij}$, $b_j$ and $c_i$, the collocation
method seems to be most simple and powerful. Although the exact definition will
be given in the next section, the sense of the word "collocation" can be explained
by considering the polynomial which interpolates the numerical solution $Y$ in (2a) (
2. A Construction of Collocation RK Method
$$D(q) \ \text{if} \ \sum_{i=1}^{s} b_i c_i^{k-1} a_{ij} = \frac{b_j (1 - c_j^{k})}{k} \quad (j = 1, \dots, s;\ k = 1, 2, \dots, q), \qquad (9)$$
$$E(p, q) \ \text{if} \ \sum_{i=1}^{s} \sum_{j=1}^{s} b_i c_i^{l-1} a_{ij} c_j^{k-1} = \frac{1}{k(l + k)} \quad (l = 1, \dots, p;\ k = 1, \dots, q). \qquad (10)$$
Definition 2 (BURRAGE$^{2}$) If an $s$-stage RK formula satisfies the conditions $B(s)$
and $C(s)$, then it is called a collocation method.
The implication of the collocation RK method can be considered as follows. For the
sake of notational convenience, we introduce a scaling for the independent variable
by $x = x_0 + th$. Then we will consider the differential equation and its approximate
solution in terms of the variable $t$. Moreover, for the RK methods, it suffices
to consider the interval $t \in [0, 1]$. Hereafter, within the present section, we will
restrict ourselves to this situation.
If the RK method (2) is regarded as a method of numerical quadrature, we may
apply the method to a differential equation depending only on $t$,
$$\frac{dy}{dt} = h f(t), \qquad (11)$$
to obtain
$$y_1 = h \sum_{j=1}^{s} b_j f(c_j), \qquad (12)$$
where the right-hand side is an approximation of the integral
$$y(1) = h \int_0^1 f(t)\,dt.$$
The condition $B(p)$ implies that Eq. (12) is exact if $f$ is merely a polynomial of $t$ of
degree at most $p - 1$. That is, the quadrature (12) is of order $p$.
From Eq. (11), we have
$$Y_i = h \sum_{j=1}^{s} a_{ij} f(c_j). \qquad (13)$$
Let $P(t)$ be a polynomial interpolant of degree at most $s$. Then the derivative of $P(t)$
is integrated exactly at every internal point $c_i$. Thus we obtain from Eq. (13) that
$$f(c_j) = P'(c_j) \quad (j = 1, \dots, s)$$
holds, which means that the condition $C(s)$ implies the interpolancy of the derivative
$P'(t)$ at every internal point $c_i$ $(i = 1, \dots, s)$.
In conclusion, the collocation RK gives the exact values at $t = 1$ for the solution, and
at $t = c_i$ for its derivative, if the solution is a polynomial of degree at most $s$.
Furthermore, we have stronger statements which can be found in 3.2 of
DEKKER-VERWER$^{4}$. They lead to the following.
Theorem 1 The collocation RK method of $s$-stage is consistent of order at least $s$ for
the general initial value problem Eq. (1) if it has distinct abscissae and nonzero weights.
Theorem 2 Let $s$ distinct abscissae $c_1, \dots, c_s$ be given. Then the simplifying
conditions $B(s)$ and $C(s)$ uniquely define an RK method.
Hence, we can focus on the determination of the abscissae for the collocation RK
method.
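Theorem 2 can be made concrete with a short computation. The sketch below builds the coefficients from given distinct abscissae by integrating the Lagrange basis polynomials, $a_{ij} = \int_0^{c_i} l_j(t)\,dt$ and $b_j = \int_0^1 l_j(t)\,dt$, which is the standard way of satisfying $B(s)$ and $C(s)$; the function names are illustrative, and exact rational arithmetic is used so that the simplifying conditions can be checked without rounding.

```python
from fractions import Fraction


def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists, lowest degree first."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


def poly_integral(p, upper):
    """Integral of the polynomial p over [0, upper]."""
    return sum(coef * upper ** (k + 1) / (k + 1) for k, coef in enumerate(p))


def lagrange_basis(c, j):
    """Coefficients of the j-th Lagrange basis polynomial on the nodes c."""
    poly = [Fraction(1)]
    for k, ck in enumerate(c):
        if k != j:
            denom = c[j] - ck
            poly = poly_mul(poly, [-ck / denom, Fraction(1) / denom])
    return poly


def collocation_tableau(abscissae):
    """Return (A, b) of the collocation RK method with the given distinct abscissae."""
    c = [Fraction(x) for x in abscissae]
    s = len(c)
    basis = [lagrange_basis(c, j) for j in range(s)]
    A = [[poly_integral(basis[j], c[i]) for j in range(s)] for i in range(s)]
    b = [poly_integral(basis[j], Fraction(1)) for j in range(s)]
    return A, b
```

For the single abscissa $c_1 = 1/2$ this returns the implicit midpoint rule, and for $c = (0, 1/2, 1)$ the weights are those of Simpson's rule. Abscissae that are irrational (for example Gauss points) would have to be passed as floating-point values, in which case the result is exact only up to their rational representation.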
Because of the interpolating property at the internal point $t = c_i$, we will consider
a polynomial on $[0, 1]$ of degree $s$ which interpolates the solution $y(t)$. Usually it can
be constructed on the fixed interpolating points $\xi_0, \xi_1, \dots, \xi_s$. Its Lagrange form is
given by
$$L_s(t) = \sum_{i=0}^{s} y(\xi_i)\, l_i(t) \quad \text{where} \quad l_i(t) = \prod_{k \ne i} \frac{t - \xi_k}{\xi_i - \xi_k}.$$
Theorem 3 For fixed distinct points $\xi_0, \dots, \xi_s$ on $[0, 1]$, the abscissae $c_1, \dots, c_s$ of
the $s$-stage collocation method satisfy the equation
$$\frac{d\omega_s}{dt}(t) = 0 \quad \text{where} \quad \omega_s(t) = \prod_{j=0}^{s} (t - \xi_j) = (t - \xi_0)(t - \xi_1) \cdots (t - \xi_s).$$
Proof. The $s$-stage collocation method yields the condition $C(s)$. The solution $y(t)$
is written as
It is obvious that Eq. (14) has $s$ distinct roots on $(0, 1)$, situated as $(i - 1)/s < c_i < i/s$
$(i = 1, \dots, s)$ if they are labelled as $c_1 < c_2 < \cdots < c_s$.
Since the Newton-Cotes type formula of s-stage is a collocation one, it is consistent
of order at least s. The following theorem gives its actual order.
Theorem 4 The s-stage Newton-Cotes type formula is consistent of order s + 1 for
odd s, of order s + 2 for even s.
The proof of the Theorem requires a Lemma concerning a kind of orthogonality of
polynomials.
Lemma 1 Let 4> _ (t) be a polynomial of degree (p + 2q — 1) defined by
p q
^ ( o = ^ - i r n3=1( ' - j )
with p,q € N. Then we have
= 0 for even p,
jJ% (t)dt{
M
/ 0 for odd p.
Furthermore, let d>' ^(t) be a polynomial of degree (p + 2q) given by 4£,,(f) = td> (t},
p fiq
then we have
j f *;,,(*)<« *o.
The Lemma can be shown through tedious but straightforward calculations on
polynomials. Hence we omit the proof.
Proof of Theorem 4. It is known that the simplifying conditions $B(p)$, $C(q)$ and
$D(r)$ imply the condition $A(p)$ provided the inequality $p \le \min(q + r + 1, 2q + 2)$.
Hence it suffices to prove that the conditions $B(s + 1)$ and $B(s + 2)$ are satisfied for
the odd and the even cases, respectively.
The condition $B(s + 1)$ means that the Newton-Cotes formula is exact for a
polynomial $f(t)$ of degree at most $s$ as the numerical quadrature rule for Eq. (11).
So, let $f(t)$ be a polynomial of degree $s$. Then the division of $f(t)$ by $\varphi_s(t)$ yields the
quotient and remainder polynomials of degree 0 and of degree less than $s$, respectively:
$$f(t) = q\,\varphi_s(t) + r_{s-1}(t).$$
Thus
$$y(1) = h \int_0^1 \{q\,\varphi_s(t) + r_{s-1}(t)\}\,dt = h \int_0^1 r_{s-1}(t)\,dt.$$
On the other hand, the identity
Next, let $f(t)$ be a polynomial of degree $s + 1$. Then we have $f(t) = q(t)\varphi_s(t) +
r_{s-1}(t)$, where $q(t) = q_1 t + q_0$ and $r_{s-1}$ is a polynomial of degree less than $s$. Similar
to the above, we can deduce as
To increase the order of consistency for the s-stage collocation method, we will
make a gradual change of the distribution of the interpolating points £ o , . . . , £ which a
1 for k = s — 1.
Moreover, define the polynomial <£>,,*(£) of degree s + 1 by
n t +
%M = J{? (t-i) V*(o} (* = o , i , . . . , 5 - i ) . (i6)
Note that the identity <p,fl{t) = tp,(t) holds for <p {t) defined in Eq. (14). Since ( = 0
t
+1 ,;+
and 1 are both (fc +l)-ple root o f ( * ( £ - l ) V * ( £ ) . *p,, (0) = <p,, (l) = 0 . Obviously
k k
method.
Theorem 5 The $s$-stage collocation formula determined by the equation
$$\varphi_{s,k}(t) = 0 \qquad (17)$$
is consistent of order $s + k + 1$ if $s - k$ is odd, and of order $s + k + 2$ if $s - k$ is even.
The proof is similar to the one for the previous Theorem. The only difference is that
repeated application of integration by parts enables us to attain the exactness of the
quadrature for polynomials of required degree. Thus we omit it.
Theorems 4 and 5 bring a table showing the increase of the order of consistency
in this series. Refer to Table 1. Since the stretching of variable by $\tau = 2t - 1$
$$\varphi_{s,s-1}(t) = 0$$
is nothing but the $s$-stage Gauss-Legendre (Butcher-Kuntzmann) method.
On closing the present section, we have established a way to generate a series of $s$-stage
collocation methods starting from the Newton-Cotes type (($s+1$)-st or ($s+2$)-nd order)
up to the Gauss-Legendre formula ($2s$-th order), increasing the order of consistency
two by two.
We employ the Butcher array for the formula parameters of RK. Let $A$, $b$ and $e$
be the matrix and the vectors given by $A = (a_{ij})$ $(1 \le i, j \le s)$, $b = (b_1, \dots, b_s)^T$ and
$e = (1, 1, \dots, 1)^T$, respectively. The stability factor $R(z)$ of the RK method by Eq. (2)
is known to be given by
$$R(z) = \frac{\det(I - zA + z e b^T)}{\det(I - zA)}. \qquad (18)$$
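The determinant formula for the stability factor can be evaluated directly for small $s$. The following sketch computes $R(z) = \det(I - zA + z e b^T)/\det(I - zA)$ at a complex point $z$ with a plain Gaussian elimination; the helper names are illustrative, and the two tableaux used in the usage note (implicit midpoint and two-stage Gauss-Legendre) are standard examples rather than formulas taken from this section.

```python
def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting (complex entries)."""
    n = len(matrix)
    a = [[complex(x) for x in row] for row in matrix]
    d = 1 + 0j
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) == 0:
            return 0j
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            d = -d  # a row swap changes the sign of the determinant
        d *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for k in range(col, n):
                a[r][k] -= factor * a[col][k]
    return d


def stability_factor(A, b, z):
    """R(z) = det(I - zA + z e b^T) / det(I - zA) for an s-stage RK method."""
    s = len(b)
    identity = [[1 if i == j else 0 for j in range(s)] for i in range(s)]
    # (e b^T)_{ij} = b_j because e is the vector of ones.
    num = [[identity[i][j] - z * A[i][j] + z * b[j] for j in range(s)] for i in range(s)]
    den = [[identity[i][j] - z * A[i][j] for j in range(s)] for i in range(s)]
    return det(num) / det(den)
```

For the implicit midpoint rule, `stability_factor([[0.5]], [1.0], z)` equals $(1 + z/2)/(1 - z/2)$, whose modulus is 1 on the imaginary axis; evaluating $|R(iy)|$ on a grid of real $y$ is a quick numerical check of A-stability for a given tableau.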
In the $s$-stage collocation method, $A$ and $b$ are uniquely determined by the set of
the abscissae $c_1, \dots, c_s$ provided that they are distinct (Th. 2). Hence we can conjecture
that the stability factor Eq. (18) should be expressed only with the abscissae. Define
the polynomial
$$\pi_s(t) = \prod_{j=1}^{s} (t - c_j) = t^s + \sum_{j=0}^{s-1} p_j t^j, \qquad (19)$$
which is the derivative of $\omega_s(t)$ in Th. 3 divided by its leading coefficient to make it
monic. The coefficients $p_j$ $(j = 0, \dots, s - 1)$ can be represented by the fundamental
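$$V = \begin{pmatrix} 1 & c_1 & \cdots & c_1^{s-1} \\ \vdots & \vdots & & \vdots \\ 1 & c_s & \cdots & c_s^{s-1} \end{pmatrix}, \quad C = \mathrm{diag}[c_1, c_2, \dots, c_s] \quad \text{and} \quad S = \mathrm{diag}\Bigl[1, \frac{1}{2}, \dots, \frac{1}{s}\Bigr]. \qquad (20)$$
Then, the simplifying conditions $B(s)$ and $C(s)$ are equivalent to the matrix identities
given by the following (DEKKER-VERWER$^{4}$):
$$B(s): \ b^T V = e^T S, \quad \text{and} \quad C(s): \ AV = CVS. \qquad (21)$$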
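Let $W$ be the diagonal matrix given by
$$W = \mathrm{diag}[1, 1, 2!, 3!, \dots, (s-1)!]. \qquad (22)$$
A direct calculation leads to the identity
$$W V^{-1} C V S W^{-1} = W V^{-1} A V W^{-1} = \begin{pmatrix} 0 & & & & -p_0/s! \\ 1 & 0 & & & \vdots \\ & \ddots & \ddots & & \\ & & 1 & 0 & -(s-2)!\,p_{s-2}/s! \\ & & & 1 & -(s-1)!\,p_{s-1}/s! \end{pmatrix}.$$
The right-hand side matrix is the companion matrix for the polynomial
$$d(t) = t^s + \sum_{j=0}^{s-1} \frac{j!\,p_j}{s!}\, t^j,$$
so that
$$\Delta(z) = \det(I - zA) = z^s \det\Bigl(\frac{1}{z} I - W V^{-1} A V W^{-1}\Bigr).$$
Lemma 2 The denominator polynomial $\Delta(z)$ of the stability factor $R(z)$ is given by
$$\Delta(z) = z^s d(1/z).$$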
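Lemma 2 makes the denominator of the stability factor computable from the abscissae alone. The sketch below assumes the reading $\Delta(z) = z^s d(1/z)$ with $d(t) = \sum_{j=0}^{s} (j!/s!)\, p_j t^j$, where $p_0, \dots, p_{s-1}, p_s = 1$ are the coefficients of $\prod_j (t - c_j)$; this form of $d$ is inferred from the companion matrix above, and the function names are illustrative.

```python
from fractions import Fraction
from math import factorial


def monic_coeffs(abscissae):
    """Coefficients p_0..p_s (p_s = 1) of prod_j (t - c_j), lowest degree first."""
    coeffs = [Fraction(1)]
    for c in abscissae:
        c = Fraction(c)
        new = [Fraction(0)] * (len(coeffs) + 1)
        for k, pk in enumerate(coeffs):
            new[k + 1] += pk      # contribution of t * p_k t^k
            new[k] -= c * pk      # contribution of -c * p_k t^k
        coeffs = new
    return coeffs


def denominator_coeffs(abscissae):
    """Coefficients of Delta(z) = sum_j (j!/s!) p_j z^(s-j), lowest degree first."""
    p = monic_coeffs(abscissae)
    s = len(abscissae)
    delta = [Fraction(0)] * (s + 1)
    for j, pj in enumerate(p):
        delta[s - j] = Fraction(factorial(j), factorial(s)) * pj
    return delta
```

For the single abscissa $1/2$ this gives $\Delta(z) = 1 - z/2$, and for the abscissae $(0, 1/2, 1)$ it gives $1 - z/2 + z^2/12$, which agrees with $\det(I - zA)$ computed directly from the corresponding collocation coefficients.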
' 1 7i '
1 72 • • 75" 1
. 1 7. " • i f *
The shift of the variable $t$ in $\pi_s(t)$ by one derives the polynomial $\rho_s(t)$ by
$$\rho_s(t) = \pi_s(t + 1) = t^s + \sum_{j=0}^{s-1} r_j t^j. \qquad (27)$$
(28)
S< 8! S\
(S2) $|N(iy)| \le |\Delta(iy)|$ for every real $y$. (The imaginary unit is written by $i$.)
$(-1)^s \pi_s(1 - t)$. This identity yields a relationship between $r_j$ and $p_j$, which implies the
equation $N(z) = \Delta(-z)$. Therefore the condition (S2) follows immediately. ∎
By the Theorem it suffices to investigate the location of the roots of $\Delta(z)$ alone to decide whether
the symmetric collocation method, which includes the class given in the previous
Section, is A-stable. For instance, a sufficient condition can be given by the following
proposition.
Theorem 7 If all the roots of the truncated Taylor series expansion of $\exp(z)$ of
order $m$
are in $C^{+}$, then any symmetric collocation method of $m$-stage is A-stable.
Proof. We note that &(z), which can be written as
in the sense
Grace's theorem (e.g. POLYA-SZEGO$^{6}$, p. 60) states that every zero $z$ of $\Delta(z)$ has the
form $z = -\theta\xi$ where $\theta$ is a certain root of $g(z)$ and $\xi$ is a suitably chosen point in
k
3=1
oo
oo
a 0
OO
9 X o
10 M o o Q 0 o 0 0 o o
11 XM 0 0 0 o o o o Q o o
12 X o 0 0 o 0 o o o 0 0 o
13 X 0 o o o o o o o o o 0
14 X 0 o o o o o 5 0 o o 0 o
15 X X o 0 0 0 0 0 0 o 0 o 0 0 0
16 X X 0 o o o 0 0 Q 0 o o 0 o o 0
17 XX X X o o o o 0 o 0 o o 0 o 0 0 o
IS X • 0 o o o 0 o Q Q o 0 o 0 0 0 0
19 X X X 0 o o o 0 Q 5 o o 0 0 0 o 0 5 0
20 X X X 0 0 0 o o o 0 o 0 o o 0 0 0 o 0 0
21 X X • • o o o o o Q o 0 o o o o o o o 0 0
which implies $\theta_j = 1/c_j$ $(j = 1, \dots, s)$. Hence every zero of $\Delta(z)$ lies in $C^{+}$ under our
assumptions. ∎
The investigation of A-stability should be, however, carried out for each case.
For the Newton-Cotes type methods, the collocation points are given by Eq. (14).
The polynomial $\pi_s(t)$ is then given by $\pi_s(t) = \frac{1}{s+1}\varphi_s(t)$ where $\varphi_s(t)$ is defined in Eq.
(14). Taking the symmetry of the abscissae of the Newton-Cotes type methods
into account, we can apply Theorem 6 and Lemma 2 to investigate the stability of
the methods. The question is whether all of the zeros of $\Delta(-z)$ are in $C^{-}$. The
symbolic and algebraic computation by computer gives the polynomials $\Delta(-z)$ and,
furthermore, algebraic computations of the Routh-Hurwitz criteria give the following.
Theorem 8 All the Newton-Cotes type collocation methods whose stage number is
less than 9, that is, the methods whose abscissae are determined by Eq. (14) for $s \le 8$,
are A-stable.
Further computations of many principal minors for various pairs of indices $(s, k)$ yield
Table 2 to discriminate the series of the collocation methods derived in Subsection 2.3
with respect to A-stability. Here the ○ mark means the formula of this pair $(s, k)$ is
A-stable while the × mark means it fails. We remark that the Butcher-Kuntzmann
method has been known to be A-stable for any $s$.
4. Acknowledgement
The authors are indebted to W. HUNDSDORFER for his suggestion of the equivalence
of HIDM to implicit RK formula. They are also grateful to T. KOTO, CH.
5. References
Xj = a + jh 0 = 0,1 m)
and j/j be the approximate solution at x . Let us denote the fractional points %, TJJ,
3
• - • i.Tm {r)j = a + Sjh), and the approximate solution and derivative at ij,- for y(x) by
Yj, respectively. Assume that Yj and Y, are given by
Yi=£c v>,
fk i ; ( A . i )
Here the coefficients {Cjt} and {D } jk are chosen so that the relations
m + l m + 1
tin) - £c*vM = o(n ), ^(m) - i f ; D (x )
jkV k = o(h ) (Ai)
hold for any sufficiently smooth function $y(x)$. The series $\{y_k\}$ is determined by
Once the coefficients $\{C_{jk}\}$, $\{D_{jk}\}$ and the points $\{\eta_j\}$ are specified, then the
process can be expressed as follows. Let $y$, $Y$, $Y'$ be the $m$-dimensional vectors given
by
$$y = (y_1, \dots, y_m)^T, \qquad Y = (Y_1, \dots, Y_m)^T, \qquad Y' = (Y'_1, \dots, Y'_m)^T.$$
d =
c = , c =
Ci
m •
Substitution of Eq. (A.4) into Eq. (A.5) yields
$$Y = (c - C D^{-1} d)\, y_0 + h C D^{-1} Y'. \qquad (A.6)$$
Let the symbol $[\,\cdot\,]_j$ denote the $j$-th component of an $m$-dimensional column vector. Then
from Eq. (A.4) $y_m$ is given by
$$y_m = [-D^{-1} d]_m\, y_0 + h [D^{-1} Y']_m. \qquad (A.7)$$
If we can establish the identities $[-D^{-1} d]_m = 1$ and $c - C D^{-1} d = e$, Eqs. (A.3)
and (A.7) are written as
$$y_m = y_0 + h [D^{-1} Y']_m, \qquad (A.8a)$$
$$Y' = f(e\, y_0 + h C D^{-1} Y'), \qquad (A.8b)$$
where $f$ is the $m$-dimensional column vector whose $j$-th component is equal to
/ [vi>Vo + y " j , ) . Thus, substituting h = H/m and denoting x = x , x = 0 n m
S
Y, = y + HY,*i fU«
n k + -±H Y ),l k
K m
k=i '
where $a_{jk}$ is the $(j, k)$-th component of $C D^{-1}/m$ and $b_j$ is the $j$-th component of the
last row vector of $D^{-1}/m$. This is nothing but an implicit RK formula (2) of $m$-stage
with step-size $H (= mh)$.
Next, considering the special case $y(x) = 1$ for the constraints (A.2), we have the
identities
$$\sum_{k=0}^{m} C_{jk} = 1 \quad \text{and} \quad \sum_{k=0}^{m} D_{jk} = 0 \quad (j = 1, \dots, m).$$
These imply
$$c + Ce = e \quad \text{and} \quad d + De = o.$$
Hence we arrive at the identities
$$e = -D^{-1} d \quad \text{and} \quad c - C D^{-1} d = e,$$
exact for $y(x_m)$ and $\frac{dy}{dx}(\eta_j)$, respectively, at the next step-point and collocation points
of the step-size $H = mh$. Hence we have a collocation RK formula. From the
derivation of HIDM, it is obvious that the equidistant points $x_j = a + (j/m)H$ stand
for the interpolating points with the step-size $H$. This implies that the formula is
the Newton-Cotes type mentioned in Section 2, and that the formula parameter is
uniquely determined. It also assures the nonsingularity of the matrix $D$.
ABSTRACT
A certain type of P-stable block method is derived for solving the second order
initial value problem $y'' = f(x, y)$. The block method considered here computes
the numerical solutions simultaneously at two points of $x$, and is easily
parallelisable. A technique to reduce the local truncation error of the method
is also developed.
1. Introduction
E%JWi=ft £:fe. a
( )
2
Xi = x„ + ih, /, = f(xi,yi).
where $h$ is the step-size and $y_i$ is an approximation to the solution $y(x_i)$. Hereafter
we assume that the function $f(x, y)$ satisfies the Lipschitz condition with respect to
$y$. The LM method of the type (2) is said to be consistent if
= AC*+&-iC*~'+ •*•+&•
The consistent LM method (2) is said to be of order $p$ if the power series expansion
of the operator
satisfies $L[y(x); h] = O(h^{p+2})$ for all sufficiently differentiable functions $y(x)$.
In the family (2) the Störmer-Cowell methods (see e.g. Hairer$^{9}$ and Henrici) are
the most commonly used ones. It is, however, well known that the methods with step
number greater than 2 exhibit an orbital instability for the test equation
$$y'' = -\omega^2 y, \qquad (4)$$
where $\omega$ is real; the numerical solutions generated by such methods do not stay on
the circular orbit but spiral inwards. On the other hand, it is also known that the
Numerov method, the 2-step Störmer-Cowell type method, is unstable for a large step-size
$h$.
The interval of $H^2 = (\omega h)^2$ is called the interval of periodicity, if the method
(2) with any $H^2$ within this interval gives a periodical solution. The method having
interval of periodicity $(0, \infty)$, which is expected to be stable for any step-size $h > 0$, is
said to be P-stable$^{13}$. For the P-stable method Dahlquist$^{7}$ and Lambert and Watson$^{13}$
independently established that the attainable order of the method is 2. However,
Cash$^{2}$ and Chawla$^{1}$ showed that higher order can be attained if one or more off-step
values are used. The multistep methods which use off-step values are often called
hybrid methods. For the certain type of the hybrid P-stable methods Qi and Mitsui$^{15}$
gave the attainable order.
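The notion of an interval of periodicity can be checked numerically for the Numerov method. Applying $y_{n+1} - 2y_n + y_{n-1} = \frac{h^2}{12}(f_{n+1} + 10 f_n + f_{n-1})$ to $y'' = -\omega^2 y$ gives the recurrence $R_0 (y_{n+1} + y_{n-1}) - 2 R_1 y_n = 0$ with $R_0 = 1 + H^2/12$ and $R_1 = 1 - 5H^2/12$, and the solution is periodic exactly when the characteristic roots lie on the unit circle, that is when $|R_1/R_0| < 1$. The sketch below, with illustrative function names, locates the end of this interval by bisection.

```python
import cmath


def numerov_coefficients(H2: float):
    """R0 and R1 of the Numerov recurrence for the test equation y'' = -w^2 y."""
    r0 = 1 + H2 / 12
    r1 = 1 - 5 * H2 / 12
    return r0, r1


def is_periodic(r0: float, r1: float) -> bool:
    """True when both roots of r0*z^2 - 2*r1*z + r0 = 0 are complex of modulus one."""
    return abs(r1) < abs(r0)


def characteristic_roots(r0: float, r1: float):
    """Roots of r0*z^2 - 2*r1*z + r0 = 0."""
    disc = cmath.sqrt(r1 * r1 - r0 * r0)
    return (r1 + disc) / r0, (r1 - disc) / r0


def periodicity_bound(coefficients, upper: float = 100.0, tol: float = 1e-10) -> float:
    """Largest H^2 below `upper` with a periodic solution.

    Assumes periodicity for small H^2 and a single exit point below `upper`.
    """
    lo, hi = 1e-6, upper
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if is_periodic(*coefficients(mid)):
            lo = mid
        else:
            hi = mid
    return lo
```

With `periodicity_bound(numerov_coefficients)` the bound comes out at $H^2 = 6$, so the Numerov method is not P-stable; a P-stable method would keep `is_periodic` true for every $H^2 > 0$.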
Many high order hybrid P-stable formulae have been derived (see e.g. Cash$^{2}$,
Chawla$^{5}$, Hairer$^{10}$, Khiyal$^{12}$, Simos$^{17}$ and Thomas$^{18}$). Among these methods Hairer's$^{10}$
one is the simplest, and is given by
/. = /(*..*).
where the off-step value y is given by a
V* = » . - J^-A'C/m+l " 2 / + / n - l ) .
n (6)
7 ^ 0 , wk je s/n.
and is of order 4. In this article we shall develop a P-stable method which computes
y „ i using the information available at T„ and can be executed simultaneously with
t
2. Block method
The block method to be considered here consists of two LM methods, one of which
is Hairer's one (5), and the other one is
$$y_{n+2} - 2y_n + y_{n-2} = h^2\{b_2(f_{n+2} + f_{n-2}) + b_1(\bar f_{n+1} + f_{n-1}) + 2b_0 f_n\}, \qquad (7)$$
$$f_i = f(x_i, y_i), \quad i = n,\ n - 1,\ n \pm 2,$$
$$\bar f_{n+1} = f(x_{n+1}, \bar y_{n+1}),$$
$$\bar y_{n+1} = a(y_{n+2} + y_{n-2}) + 2b y_n - y_{n-1} + h^2\{c(f_{n+2} + f_{n-2}) + 2d f_n\}. \qquad (8)$$
In our block method a pair of methods (5) and (7) compute $y_{n+1}$ and $y_{n+2}$ simultaneously.
Note that the second method (7) does not use the values $y_{n+1}$, $f_{n+1}$, which are the values
to be computed by the first one (5), since the use of these values in the second one
makes it difficult to execute the two methods (5) and (7) simultaneously on parallel
computers.
First of all, we must determine the coefficients $b_0$, $b_1$ and $b_2$ in the second method
(7) so that the method is of order 4, in accordance with that of the first method
(5). To do this, we associate with the method (7) the following difference operator
$L[y(x); h]$:
$$\begin{aligned} L[y(x); h] := {} & y(x + 2h) - 2y(x) + y(x - 2h) \\ & - h^2\{b_2(y''(x + 2h) + y''(x - 2h)) \\ & \quad + b_1(y''(x + h) + y''(x - h)) + 2b_0 y''(x)\}. \qquad (9) \end{aligned}$$
Assuming that $y(x)$ is sufficiently often differentiable, we expand the operator $L$ about
$x$ as the Taylor series
$$L[y(x); h] = 2(2 - b_0 - b_1 - b_2)\, y''(x) h^2 + \frac{-3b_1 - 12b_2 + 4}{3}\, y^{(4)}(x) h^4 + \frac{-15b_1 - 240b_2 + 32}{180}\, y^{(6)}(x) h^6 + O(h^8). \qquad (10)$$
The values $b_0$ and $b_1$ for which the terms $O(h^2)$ and $O(h^4)$ vanish are given by
$$b_0 = \frac{9b_2 + 2}{3}, \qquad b_1 = \frac{-12b_2 + 4}{3}. \qquad (11)$$
Therefore, under the assumptions that no previous errors have been made (localizing
assumption), and that $\bar f_{n+1}$ approximates $y''(x_{n+1})$ with the error of order at least
$h^6$, we have for the local truncation error of the method (7)
$$T_2 := y(x_{n+2}) - y_{n+2} = \frac{-15b_2 + 1}{15}\, y^{(6)}(x_n) h^6 + O(h^8). \qquad (12)$$
difference operator L:
Note that the definition of the operator is based on the practical consideration that
$y_{n+2}$ is always subject to the local truncation error when computing $\bar y_{n+1}$ by (8), even
if the localizing assumption has been made. The Taylor series expansion of (13) is
given by
$$\begin{aligned} L[y(x); h] = {} & 2(-a - b + 1)\,y(x) + (-4a - 2c - 2d + 1)\,y''(x)h^2 \\ & + \frac{-16a - 48c + 1}{12}\,y^{(4)}(x)h^4 \\ & + \frac{-360ab_2 - 40a - 480c + 1}{360}\,y^{(6)}(x)h^6 + O(h^8). \end{aligned} \tag{14}$$
The coefficients for which the terms up to order 4 in (14) vanish are given by
$$a = \frac{-48d + 23}{80}, \qquad b = \frac{48d + 57}{80}, \qquad c = \frac{8d - 3}{40}. \tag{15}$$
Substituting the coefficients into (14) we have
$$\bar T_2 := y(x_{n+1}) - \bar y_{n+1} = \frac{3(48d - 23)b_2 - 48d + 17}{240}\,y^{(6)}(x_n)h^6 + O(h^8). \tag{16}$$
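A quick consistency check of these coefficients is sketched below in exact arithmetic. It assumes that the operator associated with (8) expands with constant, $h^2$ and $h^4$ coefficients $2(1 - a - b)$, $1 - 4a - 2c - 2d$ and $(1 - 16a - 48c)/12$; the helper names are illustrative only, and $d$ is treated as the single free parameter.

```python
from fractions import Fraction


def offstep_coefficients(d):
    """Coefficients a, b, c of the off-step formula as functions of d."""
    d = Fraction(d)
    a = (23 - 48 * d) / 80
    b = (57 + 48 * d) / 80
    c = (8 * d - 3) / 40
    return a, b, c


def low_order_terms(a, b, c, d):
    """Constant, h^2 and h^4 Taylor coefficients that should all vanish."""
    t0 = 2 * (1 - a - b)
    t2 = 1 - 4 * a - 2 * c - 2 * d
    t4 = (1 - 16 * a - 48 * c) / 12
    return t0, t2, t4


d = Fraction(62, 1000)              # sample value of the free parameter
a, b, c = offstep_coefficients(d)
terms = low_order_terms(a, b, c, d)
print(a, b, c, terms)
```

Because the three conditions are linear in $a$, $b$, $c$, any rational $d$ gives exact zeros, so the check does not depend on the sample value chosen.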
Thus, we have a 4th order LM method which is suitable for the second method in our
block method.
In the next section we will investigate the conditions under which the method (7)
is P-stable .
3. P-stable analysis
If we solve the test equation (4) by the method (7) then the numerical solution $y_n$
satisfies the recurrence relation
$$R_0(H^2)\,y_{n+2} - 2R_1(H^2)\,y_n + R_0(H^2)\,y_{n-2} = 0, \tag{17}$$
where $H = \omega h$, and the coefficients $R_0(H^2)$ and $R_1(H^2)$ are the polynomials in $H^2$
given by
$$R_0(H^2) = 1 + \frac{3(3b_2 - 1)(16d - 1) + 20}{60}H^2 + \frac{(3b_2 - 1)(8d - 3)}{30}H^4, \tag{18}$$
$$R_1(H^2) = 1 + \frac{3(3b_2 - 1)(16d - 1) - 100}{60}H^2 - \frac{4(3b_2 - 1)d}{3}H^4. \tag{19}$$
The necessary and sufficient condition that the $y_n$ defined by the relation (17) has a
solution of the form $y_n = e^{in\theta}$ ($\theta$ real) for any $H^2 > 0$, that is, the condition for
P-stability, is
$$\forall H^2 > 0, \quad \left|\frac{R_1(H^2)}{R_0(H^2)}\right| < 1, \tag{20}$$
or equivalently
$$\forall H^2 > 0, \quad (R_0(H^2) + R_1(H^2))\,(R_0(H^2) - R_1(H^2)) > 0. \tag{21}$$
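The stability condition lends itself to a numerical spot check. The sketch below samples $H^2$ on a finite grid and tests $|R_1| < |R_0|$; it is only an illustration, since a grid cannot prove the condition for all $H^2 > 0$. The polynomial coefficients are written as they follow from substituting the order-4 choices of the coefficients into the recurrence for the test equation, which should be treated as an assumption, and the function names are hypothetical.

```python
def stability_polynomials(b2, d, H2):
    """Evaluate R0(H^2) and R1(H^2) for the second method of the block."""
    u = 3.0 * b2 - 1.0
    R0 = (1.0 + (3.0 * u * (16.0 * d - 1.0) + 20.0) / 60.0 * H2
          + u * (8.0 * d - 3.0) / 30.0 * H2 ** 2)
    R1 = (1.0 + (3.0 * u * (16.0 * d - 1.0) - 100.0) / 60.0 * H2
          - 4.0 * u * d / 3.0 * H2 ** 2)
    return R0, R1


def is_stable_on_grid(b2, d, H2_max=1000.0, samples=20000):
    """Return True if |R1| < |R0| at every sampled H^2 in (0, H2_max]."""
    for k in range(1, samples + 1):
        H2 = H2_max * k / samples
        R0, R1 = stability_polynomials(b2, d, H2)
        if abs(R1) >= abs(R0):
            return False
    return True


# Parameter values quoted for the first experiment, and a perturbed choice of d.
print(is_stable_on_grid(-0.112, 0.062))
print(is_stable_on_grid(-0.112, 0.0))
```

The second call uses a value of $d$ outside the admissible interval and is expected to fail the test for moderate $H^2$, which shows how narrow the admissible range of $d$ is for this $b_2$.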
$d \in (-3/32, 1/16)$,
since
$$R_0(H^2) - R_1(H^2) = 2H^2 + \frac{(3b_2 - 1)(16d - 1)}{10}H^4,$$
$$R_0(H^2) + R_1(H^2) = 2 + \frac{3(3b_2 - 1)(16d - 1) - 40}{30}H^2 - \frac{(3b_2 - 1)(32d + 3)}{30}H^4.$$
Moreover, since $R_0 + R_1 \approx 2$ for small $H^2$, we are allowed to consider only the case
that $R_0 + R_1 > 0$ and $R_0 - R_1 > 0$. When $d \in (-3/32, 1/16)$, necessary and sufficient
and
$$D(d) := [3(3b_2 - 1)(16d - 1) - 40]^2 + 240(3b_2 - 1)(32d + 3) < 0. \tag{23}$$
The discriminant $D(d)$ has the distinct real zeros $d_1$ and $d_2$ ($d_1 < d_2$)
$$d_1 = \frac{9b_2 - 43 + 20\sqrt{-3(3b_2 - 1)}}{48(3b_2 - 1)}, \tag{24}$$
$$d_2 = \frac{9b_2 - 43 - 20\sqrt{-3(3b_2 - 1)}}{48(3b_2 - 1)}. \tag{25}$$
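For a given $b_2$ the two zeros of the discriminant can be evaluated directly. The sketch below assumes $b_2 < 1/3$ and takes $D(d)$ to be the discriminant of $R_0 + R_1$ viewed as a quadratic in $H^2$, scaled by 900; the function names and the scaling are illustrative assumptions rather than the authors' code.

```python
import math


def discriminant(b2, d):
    """Discriminant D(d) of R0 + R1 as a quadratic in H^2 (scaled by 900)."""
    u = 3.0 * b2 - 1.0
    return (3.0 * u * (16.0 * d - 1.0) - 40.0) ** 2 + 240.0 * u * (32.0 * d + 3.0)


def discriminant_zeros(b2):
    """Zeros d1 < d2 of D(d); requires b2 < 1/3 so the square root is real."""
    u = 3.0 * b2 - 1.0
    if u >= 0.0:
        raise ValueError("b2 must be smaller than 1/3")
    root = 20.0 * math.sqrt(-3.0 * u)
    d1 = (9.0 * b2 - 43.0 + root) / (48.0 * u)
    d2 = (9.0 * b2 - 43.0 - root) / (48.0 * u)
    return d1, d2


d1, d2 = discriminant_zeros(-0.112)
print(d1, d2, discriminant(-0.112, 0.062))
```

For $b_2 = -0.112$ the lower zero lies just below $d = 0.062$, so this value of $d$ sits inside both $(d_1, d_2)$ and $(-3/32, 1/16)$.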
Here we must investigate the condition for the first interval defined above to be
nonempty. We have for any $b_2$
$$d_1 + \frac{3}{32} = \frac{5}{96(1 - 3b_2)}\left\{19 - 9b_2 - 8\sqrt{-3(3b_2 - 1)}\right\} \ge 0,$$
which is given by the identity
$$(19 - 9b_2)^2 - \left(8\sqrt{-3(3b_2 - 1)}\right)^2 = (9b_2 + 13)^2.$$
2 2
d 2 2 < 0
l i - ^ 1 2 ( l ^ ( - - ^ ^ ) -
Thus, we have proved that the second method (7) is P-stable, for the parameters $b_2$
$$y'' = -\omega^2 y, \quad y(0) = 1, \quad y'(0) = 0, \tag{28}$$
$$y(x) = \cos \omega x, \quad \omega = 10$$
to compare the errors of our P-stable block method with those of Hairer's P-stable
method. The errors at $x = 5\pi$, $10\pi$, $15\pi$, and $20\pi$ are shown in Tables 1 and 2. In
this experiment we set $b_2 = -0.112$ and $d = 0.062$.
Vt = - | , yi(Q) = 1, Jri(O) = 0,
(29)
In this example, we integrate the equation in the interval $x \in [0, 1000\pi]$, and in order
to compare the accuracies of the methods, compute the maximum orbital errors
max (30)
D<z„<100*
where $y_{1,n}$ and $y_{2,n}$ are the numerical solutions corresponding to the exact solutions
$y_1(x_n)$ and $y_2(x_n)$, respectively. The results are shown in Tables 3.1 and 3.2.
Table 3.2. Maximum orbital errors (30) of the P-stable block methods.
h = π/10     h = π/20     h = π/40     h = π/80     h = π/160
4.317E-02    4.655E-04    6.613E-06    2.375E-07    1.348E-08
We can see from the tables that although the results of our P-stable block method
are slightly less accurate for both problems, our P-stable method integrates the equa-
We have seen in the previous section that our P-stable block method does not
necessarily give as accurate a result as Hairer's. The reason for
this is that the second method (7) of the block method has a large error constant
$(-15b_2 + 1)/15$; the value of the constant at $b_2 = -1/9$ is greater by a factor of 64
compared with that of the first one for the test problem (4). In order to improve
the accuracy of the second method we shall develop an extrapolation technique
such as Milne's device, which is often used to enhance the order of convergence in
conventional LM methods for first order equations.
In our extrapolation technique we need another approximation to $y(x_{n+2})$ of the
same order. To get such an approximation we use the same method as (7) with
a different set of parameters. We attach the superscript $*$ to all symbols relating to
the second approximation. The method for the second approximation is
$$y^*_{n+2} - 2y_n + y_{n-2} = h^2\{b_2^*(f^*_{n+2} + f_{n-2}) + b_1^*(\bar f^*_{n+1} + f_{n-1}) + 2b_0^* f_n\}, \tag{31}$$
$$\bar y^*_{n+1} = a^*(y^*_{n+2} + y_{n-2}) + 2b^* y_n - y_{n-1} + h^2\{c^*(f^*_{n+2} + f_{n-2}) + 2d^* f_n\}. \tag{32}$$
In the method above the free parameters $b_2^*$ and $d^*$ should be chosen in the range
which guarantees P-stability. The local truncation error of the method is given by
$$T_2^* := y(x_{n+2}) - y^*_{n+2} = \frac{-15b_2^* + 1}{15}\,y^{(6)}(x_n)h^6 + O(h^8). \tag{33}$$
Using two methods (7) and (31) we can easily estimate the local truncation error of
(7). From (12) and (33) we have
$$h^6 y^{(6)}(x_n) = \frac{1}{C^* - C}(y_{n+2} - y^*_{n+2}) + O(h^8), \tag{34}$$
where
$$C = \frac{-15b_2 + 1}{15}, \qquad C^* = \frac{-15b_2^* + 1}{15},$$
and therefore we get
$$y(x_{n+2}) = y_{n+2} + \frac{C}{C^* - C}(y_{n+2} - y^*_{n+2}) + O(h^8). \tag{35}$$
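The extrapolation step amounts to a one-line correction. The sketch below assumes the error constants of the two approximations are $C = (1 - 15b_2)/15$ and $C^* = (1 - 15b_2^*)/15$ and that both approximations share the same leading error factor $h^6 y^{(6)}(x_n)$; `extrapolate` and `error_constant` are hypothetical helper names, and the synthetic data only mimic that error model.

```python
from fractions import Fraction


def error_constant(b2):
    """Leading error constant of the second method for a given b2."""
    return (1 - 15 * b2) / 15


def extrapolate(y, y_star, b2, b2_star):
    """Combine two approximations of the same order (Milne-type device)."""
    C = error_constant(b2)
    C_star = error_constant(b2_star)
    return y + C / (C_star - C) * (y - y_star)


# Synthetic check: both approximations differ from the exact value only
# through their leading error term C * e, with the same factor e.
exact = Fraction(1)
e = Fraction(1, 1000)
b2, b2_star = Fraction(-112, 1000), Fraction(-1, 5)
y = exact - error_constant(b2) * e
y_star = exact - error_constant(b2_star) * e
improved = extrapolate(y, y_star, b2, b2_star)
print(improved)
```

In this idealized setting the leading error cancels exactly; in an actual integration only the $h^6$ term is removed and the higher-order remainder stays.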
fcU = 2y - y _ + —{f^j
n n 1 + (10 - ) / „ + /„_, + / }
7 7 n
= 2y - y _ + h?{b {} +
n n 2 2 n 2 + /„_ ) + 6,(/ 2 m+1 + f„_ ) + 26 /n}
t 0
2 ( 3 6 )
li+a = fa - y,-2 + ft {6 (/; 2 +3 + A- ) + b;(/;
2 + ! + fn-i) + 2b- f„} a
where the off-step values j j ^ j / and y^ are given by (6), (8) and (32), respectively.
n + 1 +1
proceeding to the next block then we can improve the accuracy of our P-stable block
method. Note that the calculations of the three values $y_{n+1}$, $y_{n+2}$ and $y^*_{n+2}$ can be
performed in parallel using three processors, and that the calculation of the extrapolated value at $x_{n+2}$ requires
no function evaluations.
Example 3. Let us consider the same equation as that of Example 1. The results by
the modified algorithm are shown in Table 4. In this experiment we take $b_2^* = -0.2$
and $d^* = 0.0622$, and take the same values for $b_2$ and $d$ as those in Example 1.
Next we study the stability of the modified method (36). If we solve the test
equation (4) by the algorithm, then $y_{n+2}$ and $y^*_{n+2}$ must satisfy the recurrence relations
$$y_{n+2} - 2S(H^2)\,y_n + y_{n-2} = 0, \tag{37}$$
$$y^*_{n+2} - 2S^*(H^2)\,y_n + y_{n-2} = 0, \tag{38}$$
$$S(H^2) = \frac{R_1(H^2)}{R_0(H^2)}, \qquad S^*(H^2) = \frac{R_1^*(H^2)}{R_0^*(H^2)},$$
where $R_0$ and $R_1$ are the polynomials defined by (18) and (19), and $R_0^*$ and $R_1^*$ are
the polynomials defined by the same Eqs., but $b_2$ and $d$ are replaced by $b_2^*$ and $d^*$,
respectively. From Eqs. (36), (37) and (38) we can find the recurrence relation to be
satisfied by $\hat y_{n+2}$. The relation is
$$\hat y_{n+2} - 2\hat S(H^2)\,y_n + y_{n-2} = 0, \tag{39}$$
where $\hat S(H^2)$ is given by
The modified method is stable if and only if the function $\hat S(H^2)$, which is often
called the stability function, is less than unity in modulus. A simple computation
shows that the modified method (36) is P-stable, since if $b_2 \to -\infty$ and $b_2^*$ remains finite
(if $b_2^* \to -\infty$ and $b_2$ remains finite), then $\hat S(H^2) \to S^*(H^2)$ ($\hat S(H^2) \to S(H^2)$). There
may be some cases that the method is not P-stable for finite $b_2$ and $b_2^*$. However, in
these cases there exist the intervals $(0, H_0^2)$ and $(H_1^2, H_2^2)$ ($H_0^2 < H_1^2 < H_2^2$) in which
the method is stable. A further study on the stability of the modified method will be
needed.
The graphs of the stability functions $\hat S(H^2)$ of the modified algorithm for some $b_2$
6. Concluding remark
We have derived a certain type of P-stable block method for solving second
order IVPs on parallel computers, and developed a procedure to reduce the local
truncation error of the method. The modified algorithm using this procedure
produces an excellent result, and seems to be P-stable for almost all parameters. Further
consideration of the implementation of the algorithm on parallel computers and its
performance evaluation will be necessary.
7. References
ABSTRACT
$$\int_a^b P(x)\,dx = \sum_{i,j} w_{ij} P^{(j)}(x_i), \quad (x_1 = a \text{ and } x_2 = b),$$
to be exact for any polynomial $P$ with degree at most $n - 1$, where the $w_{ij}$ are
weight coefficients independent of $P$. In addition, a numerical method for solving
directly the initial value problem of $r$-th order ordinary differential equations
$$y^{(r)} = f(x, y, y', \ldots, y^{(r-1)}), \quad x > x_0,$$
$$y^{(i)}(x_0) = y_0^{(i)}, \quad i = 0, 1, \ldots, r - 1,$$
1. Introduction
Let 7 = [a, 6], (a < 6), X = {x^I \ a<0t<Xv<>• -<x <b), (rt > 1). Let n _ ! be
n n n
pi) E
i=l j = 0
E
This $E$ is often called an incidence matrix of quadrature or interpolation. Let $Y_n$
denote the set of all $n$ real numbers $y_i^j$ indexed by $(i, j)$ such that $e_{ij} = 1$ for a given
$k \times n$ incidence matrix $E = [e_{ij}]$, i.e., $Y_n = \{y_i^j \mid e_{ij} = 1\}$.
Given a $k \times n$ incidence matrix $E$ and $X_k$, we can define a quadrature formula in
the form
Jo 1
for any polynomial $P \in \Pi_{n-1}$, where the $w_{ij}$ are the weight coefficients independent of
$P$. This is called$^{1,2,3}$ the Hermite-Birkhoff quadrature formula (hereafter, merely
HB-QF) for the incidence matrix $E$.
In this paper, first we consider the question of existence and construction of HB-QF
specified by $E$ with $k = 2$ and $n > 1$. That is, this problem is a generalization
of the Euler-Maclaurin quadrature. As an application of HB-QF to ordinary differential
equations, next we develop a numerical integration method for the initial value
problem of the $r$-th order ordinary differential equations
$$y^{(r)} = f(x, y, y', \ldots, y^{(r-1)}), \qquad y^{(i)}(x_0) = y_0^{(i)}, \quad i = 0, 1, \ldots, r - 1, \tag{1}$$
where r is any fixed positive integer greater than 1. Then the numerical integration
formula in the form
tfm-H
Ym+i — and F m + i = i = 0,1,
J*)
here for each s (0 < s < T) is an approximation of yW(x i), and at, /3|, are m+
there exists a unique polynomial p€ll„^i such that for any F„ = {yl | ey = 1} and £
i=l
and
< = E < . 0<p<n-l. (2)
i=o
Then the following result of Pólya is well-known.
Theorem 1 (Pólya's theorem$^4$) A $2 \times n$ incidence matrix $E$ is unconditional-poised
if and only if $M_p^E$ determined by $E$ satisfies the inequality
$$M_p^E \ge p + 1, \quad 0 \le p \le n - 1.$$
E £ w = *•
i=l j=0
In this paper, $G$ is called a dual incidence matrix corresponding to $E$ or, merely,
a dual matrix of $E$. For example, for the incidence matrix
$$E = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 1 \end{pmatrix}$$
its dual matrix becomes
$$G = \begin{pmatrix} 1 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$
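The dual matrix and the Pólya condition are easy to compute for small cases. The sketch below assumes that the dual is obtained by reversing the column order and complementing the entries, $g_{ij} = 1 - e_{i,n-1-j}$, which reproduces the example above, and that $M_p$ counts the ones in columns $0, \ldots, p$; the function names are hypothetical.

```python
def dual(E):
    """Dual incidence matrix: reverse the column order and complement entries."""
    n = len(E[0])
    return [[1 - row[n - 1 - j] for j in range(n)] for row in E]


def satisfies_polya(E):
    """Check M_p >= p + 1 for p = 0, ..., n-1 (M_p = ones in columns 0..p)."""
    n = len(E[0])
    running = 0
    for p in range(n):
        running += sum(row[p] for row in E)
        if running < p + 1:
            return False
    return True


E = [[1, 0, 0],
     [0, 1, 1]]
G = dual(E)
print(G, satisfies_polya(E), satisfies_polya(G))
```

Applying `dual` twice returns the original matrix, and matrices whose rows are fixed by the reverse-and-complement operation are their own duals.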
The following theorem gives a relationship between a $2 \times n$ incidence matrix $E$ and
its dual matrix $G$.
Theorem 2 A $2 \times n$ incidence matrix $E = (e_{ij})$ is unconditional-poised if and
only if its dual matrix $G$ is unconditional-poised.
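Proof To prove this theorem, we use the following relations obtained immediately
from the definition of the dual matrix $G$:
$$\begin{aligned} M_p^G &= 2(p + 1) - \sum_{i=0}^{p} m^E_{n-1-i} \\ &= 2(p + 1) - \Bigl(n - \sum_{i=0}^{n-p-2} m^E_i\Bigr) \\ &= 2(p + 1) - n + M^E_{n-p-2}. \end{aligned} \tag{4}$$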
Therefore we obtain
since
M? = 2(q+l)-n + MZ_ _ , q 2
$$\int_a^b P(x)\,dx = \sum_{j=0}^{n-1} \sum_{i=1}^{2} w_{ij}\, h^{j+1} P^{(j)}(x_i), \quad (h = b - a), \tag{5}$$
where
$$w_{ij} = \frac{(-1)^{i+j}}{n!}\, q^{(n-j-1)}(t_i), \quad (t_1 = 0 \text{ and } t_2 = 1). \tag{6}$$
Proof By using the Darboux formula it can be proved that, for any monic poly-
nomial q of degree n and any (n + l)-th continuously differentiable function u on /,
it follows that
where
= ( _ 1 ) + 1 } a + s h
* >)(Oji ^ ( ^ - (»)
Now define
1
then we have $u^{(j)}(x) = P^{(j-1)}(x)$ for $j = 1, 2, \ldots, n + 1$, and substituting $P^{(j-1)}$ for
$u^{(j)}$ in Eqs. (7) and (8), we can obtain
ft?
u(b) = u(a) + £ ( - i y J ^ (q^(Cj)pU-V(a) - q^(l)P^{b)) , (9)
and
w <n,
However since P ( x ) = 0, we obtain R^-Q. Therefore by substituting g (0) = n!,
u{a) = 0, and
u(b) = £ P{t)dt,
;=0 t = l
by substituting $g$ for $P$ in Eq. (5). However, in this case, it should be noted that the
remainder term is not necessarily zero, that is, in general,
where n + 1
Proof Suppose the existence of HB-QF specified by $E = [e_{ij}]$, then we have the
expression in the form
?p{x)dx= E mf W% m
Pen..,. (ii)
On the other hand, by Lemma 1 we also have the expression in the form
f'* P{x)dx
P(x)dx =
= E +l
fc' wyP (*i)+ W)
E iPhnfltHpfi,. (12)
_ _• r*
for any $P \in \Pi_{n-1}$. Therefore, comparing Eq. (11) with Eq. (12) we get the relation
E » ^ % f % ) = o.
ti;=0
Since this relation must be satisfied for all $P \in \Pi_{n-1}$, it is necessary that $w_{ij} = 0$ if
$e_{ij} = 0$. This means by Lemma 1 that there must exist a monic polynomial $q$ of
degree $n$ such that
$$q^{(n-j-1)}(t_i) = 0 \quad \text{if } g_{i,n-j-1} = 1,$$
because $g_{i,n-j-1} = 1 - e_{ij} = 1$.
Conversely, if there exists a monic polynomial $q$ of degree $n$ such that $q^{(j)}(t_i) = 0$
if $g_{ij} = 1$, then it is trivial by Lemma 1 that the HB-QF prescribed by $E$ exists. We
have completed the proof of Lemma 2.
By virtue of Lemma 2, the following theorem is proved.
Theorem 3 A 2 x n incidence matrix E is poised if and only if there exists a
HB-QF specified by E.
Proof Let $G$ be the dual matrix of $E$ and $H = [h_1, h_2]^T$ with $h_1 = 1$ and $h_2 = 0$,
This horizontal sum is often denoted by G = (G\H), and we recall that the horizontal
sum is poised if and only if both G and H are poised. Therefore if G is poised then
the horizontal sum is also poised since H is evidently poised. Consequently since G
is poised by Theorem 2 provided E is poised, there exists uniquely a polynomial q of
degree n such that
$$q^{(j)}(t_i) = 0, \quad \text{if } g_{ij} = 1, \quad (0 \le j \le n - 1,\ i = 1, 2), \tag{13}$$
and
$$q^{(n)}(t_1) = n!, \tag{14}$$
where $t_1 = 0$ and $t_2 = 1$.
Conversely, if E is not poised then G is also not poised. Therefore any polynomial
q of degree n which satisfies the interpolation conditions in Eqs. (13) and (14) does
not exist, and any HB-QF specified by E does also not exist by Lemma 2. We have
completed the proof of Theorem 3.
By Theorem 3 and Lemma 2, if a given incidence matrix $E$ is poised then it is
guaranteed that there exists a monic polynomial $q$ of degree $n$ such that $q^{(j)}(t_i) = 0$
if $g_{ij} = 1$ for the dual matrix $G$ of $E$. In particular, it should be noted that finding such a
$q$ is equivalent to solving a homogeneous interpolation problem on $G$. Therefore if
the homogeneous interpolation problem can be solved, we can then obtain the HB-QF
specified by $E$ from the solution, by computing the coefficients $w_{ij}$ in the way shown
in Lemma 1.
4. Several Examples of HB-QF
Let $n_1$ and $n_2$ ($n_1 \le n_2$) be non-negative integers and $n = n_1 + n_2 > 0$. Define
f 1, if /=o,M,---.«i-i.
e
*' I 0, if otherwise,
a i + k
q(x) = t y r - * $ * '
without the factor of multiplier. Then the $(n - j - 1)$-th derivatives of $q$ are given by
BI \(n + t - j - I ) ! K
-SS* 0 < j < Bl - 1
f P{x)dx = " £ +1
tf »» P«>(a) li + fc'+Vw^H*).
J o
j=0 ;=0
where the weight coefficient uVy can be computed from Eq. (6) as follows
23
n\h {
' U +«i-J-l+9; J + (A + g i ) !
J+ '
here = min{tii, j } and q = max{0, j — r i i } . This quadrature formula is said to be
3
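Example 1: (Case of $n = 6$; $n_1 = n_2 = 3$)
(a) Incidence matrix $E_{GH}(3,3)$:
$$E_{GH}(3,3) = \begin{pmatrix} 1 & 1 & 1 & 0 & 0 & 0 \\ 1 & 1 & 1 & 0 & 0 & 0 \end{pmatrix}$$
(b) Dual incidence matrix $G$ corresponding to $E_{GH}(3,3)$:
$$G = \begin{pmatrix} 1 & 1 & 1 & 0 & 0 & 0 \\ 1 & 1 & 1 & 0 & 0 & 0 \end{pmatrix}$$
$$\int_a^b P(x)\,dx = \frac{h}{2}(P(a) + P(b)) + \frac{h^2}{10}(P'(a) - P'(b)) + \frac{h^3}{120}(P^{(2)}(a) + P^{(2)}(b)).$$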
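The weights of this example can be recomputed from a polynomial $q$ and checked for exactness. The sketch below works on $[0, 1]$ (so $h = 1$) and assumes $q(x) = x^3(x - 1)^3$ together with the weight rule $w_{ij} = (-1)^{i+j} q^{(n-j-1)}(t_i)/n!$ that follows from repeated integration by parts; the helper names are hypothetical and the polynomial arithmetic is deliberately minimal.

```python
from fractions import Fraction
from math import factorial


def derivative(coeffs, order=1):
    """Differentiate a polynomial given by ascending coefficients."""
    for _ in range(order):
        coeffs = [k * c for k, c in enumerate(coeffs)][1:]
    return coeffs


def evaluate(coeffs, x):
    """Evaluate a polynomial given by ascending coefficients at x."""
    return sum(c * x ** k for k, c in enumerate(coeffs))


def hb_weights(q):
    """Weights w[(i, j)] for the nodes t_1 = 0, t_2 = 1 from a monic q."""
    n = len(q) - 1
    nodes = {1: 0, 2: 1}
    weights = {}
    for j in range(n):
        dq = derivative(q, n - j - 1)
        for i, t in nodes.items():
            weights[(i, j)] = Fraction((-1) ** (i + j) * evaluate(dq, t), factorial(n))
    return weights


def quadrature(weights, p):
    """Apply the two-point formula to the polynomial p on [0, 1]."""
    nodes = {1: 0, 2: 1}
    return sum(w * evaluate(derivative(p, j), nodes[i]) for (i, j), w in weights.items())


q = [0, 0, 0, -1, 3, -3, 1]          # x^3 (x - 1)^3 = x^6 - 3x^5 + 3x^4 - x^3
w = hb_weights(q)
print(w[(1, 0)], w[(1, 1)], w[(1, 2)])
```

The same routine applied to $q(x) = x^6 - 3x^5 + \tfrac{5}{2}x^4 - \tfrac{1}{2}x^2$ (entered with `Fraction` coefficients) yields Euler-Maclaurin-type weights, which illustrates how the choice of $q$ encodes the incidence pattern.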
f
incidence matrix such that
$$e_{ij} = \begin{cases} 1, & \text{if } j = 0, \\ 1, & \text{if } j = 2k - 1 \text{ and } 1 \le k < r, \\ 0, & \text{otherwise}, \end{cases}$$
where the weight coefficient $w_{ij}$ can be computed from Eq. (6). This formula is
well-known as the Euler-Maclaurin quadrature.
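Example 2: (Case of $n = 6$; $r = 3$)
(a) Incidence matrix:
$$E_{EM} = \begin{pmatrix} 1 & 1 & 0 & 1 & 0 & 0 \\ 1 & 1 & 0 & 1 & 0 & 0 \end{pmatrix}$$
$$G = \begin{pmatrix} 1 & 1 & 0 & 1 & 0 & 0 \\ 1 & 1 & 0 & 1 & 0 & 0 \end{pmatrix}$$
$$q(x) = x^6 - 3x^5 + \frac{5}{2}x^4 - \frac{1}{2}x^2$$
$$\int_a^b P(x)\,dx = \frac{h}{2}(P(a) + P(b)) + \frac{h^2}{12}(P'(a) - P'(b)) - \frac{h^4}{720}(P^{(3)}(a) - P^{(3)}(b)).$$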
Theorem 5 Let $r$ be a positive integer and $n = 2r$. Let $E_{EML} = (e_{ij})$, $1 \le i \le 2$
and $0 \le j \le n - 1$, be the $2 \times n$ incidence matrix such that
if j = 2k and l<k<r - 1,
if otherwise.
where the weight coefficient ijjj,- can be computed as follows; Let an — 1/2, and
1
(2> + l)!2(j + l ) -E (2j - 2 * + l ) ! '
(1 < 3 < r - 1),
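Example 3: (Case of $n = 6$; $r = 3$)
(a) Incidence matrix:
$$E_{EML} = \begin{pmatrix} 1 & 0 & 1 & 0 & 1 & 0 \\ 1 & 0 & 1 & 0 & 1 & 0 \end{pmatrix}$$
$$G = \begin{pmatrix} 1 & 0 & 1 & 0 & 1 & 0 \\ 1 & 0 & 1 & 0 & 1 & 0 \end{pmatrix}$$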
+ ( / > ( 4 ( Q ) + F ( 4 l ( 6 ) )
2^ ' "
Remarks: In all the examples illustrated above, it should be noted that each
dual incidence matrix coincides with the corresponding incidence matrix.
Suppose that the solution y to the problem is sufficiently smooth, then we can obtain
the following equation from the quadrature formula by substituting y' for g in Eq.
(10) and computing the integration of the left hand side .
y(b) - y(a) = £ V +
+ E +
JLml-iyJL- l
[ (x)i (a+xk)dx,
g
n+1)
h = b — a, (16)
n! Jo
where is the error term of the quadrature. In this equation, truncating the term
Rn, we have the numerical integration scheme
y m + 1 = y m =E A W P + t (™ = o, l , . . . , M ) , (17)
,=i ;=i
($\alpha_j = w_{1,j-1}$ and $\beta_j = w_{2,j-1}$), for the initial value problem (15), where $x_{m+i} = x_0 + (m + i)h$ and
for $i = 1$ and $2$. It follows from Eq. (16) that the order of local truncation error in
this scheme is greater than or equal to $n$.
In the scheme with $n > 2$, because some derivatives of higher order are used, we
usually need analytic computations to obtain the derivatives. Unfortunately
this is, in general, not easy, in particular when the differential
equation is a system, so that the scheme is not practical. But, as an application
of the scheme, we can design a useful scheme for computing directly a numerical
solution for the initial value problem of higher order differential equations. After
some preliminaries, therefore, we will give such a useful scheme in §5.2.
In the numerical integration scheme (17), we are especially interested in the numerical
integration scheme derived from the quadrature of general Hermite type with
$n_1 \le n_2$. Then the scheme can be written as follows
(18)
where
Cti =
n, Un + k-pjY.
ni+k-j + qj) {k + q,)\ '
S(z) =
1! P.
[k + n.-j + qj) (k + qj )l j
with $z = \lambda h$. For the stability function, we can easily show that $S(t) < 1$ ($t \in (0, +\infty)$)
for any $n_1$ and $n_2$ with $n_1 \le n_2$. In addition, we can also show that the
scheme is A-stable if $n_1 = n_2$. The latter property is proved in the following theorem.
(19)
3=1 j= l
is A-stable, where
1 fp\
m = -±h-» (
p
k p
V 2 p +
* - » !
(A)=s-H w)rrO' + *
we can show that
- $ ~ t - l M (J = l , 2 , . . . , p ) .
3=0
m =
j=0
Evaluating this stability function at each point $z = i\omega$, ($-\infty < \omega < \infty$), on the
imaginary axis, we obtain
$$S(i\omega) = \frac{A + iB}{A - iB},$$
where
W , / a
A= £ ( - l y ^ w * . and S = £ (-#" 0 V-J
j: even>o 3: odd>o
j=0 w /
5(1) = < 1.
E(:)(2 -y)! P
j=o
we get
r )
= yir +i + i
j=i >=i
for p = 1 , 2 , . . . , r, and by setting t — p = s, we can moreover obtain
('+1)
, and F m + I =
Arranging j^mti i " a column and setting j , ^ = / m + i , (j = 0,1), we have the desired
scheme
, l r
/" fca[ . . . k - a' _ h'a \
f 3/L°> \ r l T
0
2 2
: •• ka\ ha
0 ... 0 ha[ J \ fm }
Table 1: Absolute Errors in y ' and y ' with M = 5/ftM M
4
0.050 7.47 x 10" 9.56 x l O - 2
4.52 x 10" 4
1.62 x 10""'
3 1
0.100 2.58 x 10~ 2.17 x 10"' 1.81 x 10~ 3.36 x 10"
3
2 3 +0
0.500 8.66 x l f r 2.60 x 10+° 9.81 x 10" 1.57 x 10
Sm + 1
0
+ J'-l)
Sm+1
(20)
0 0 Aft 1
V /m+li }
1
for m = 0,1,..., M, where f = m+i • • •, J/m+, ') for i = 0 and 1.
Since this scheme is of an implicit form, we must solve Eq. (20) as a system of
nonlinear equations with respect to the variables $y^{(0)}_{m+1}, \ldots, y^{(r-1)}_{m+1}$, provided $f$ in Eq.
(1) is nonlinear with respect to $y^{(0)}, y^{(1)}, \ldots, y^{(r-1)}$. However we can use this scheme
as a corrector in a predictor-corrector method, and then the predictor can be easily
obtained by replacing $f_{m+1}$ with $f_m$ in the right hand side of Eq. (20) as a system of
¥2
,,(1)
Sm+1 7m" IA - A * fm+1
1) + 0 + 0 m+1
Sm+1
where / = f [ x
m , a n d / + i = / ( W i , ! / ! i , £ l i ) - Then the numerical
m m t
results are shown in Table 1, together with the results obtained by the Euler method.
IMPROVED SOR-LIKE METHOD WITH ORDERINGS
FOR NON-SYMMETRIC LINEAR EQUATIONS
DERIVED FROM SINGULAR PERTURBATION PROBLEMS
EMIKO ISHIWATA
Department of Mathematics, Waseda University
Ohkubo 3-4-1 Sinjyuku-ku, Tokyo 169, JAPAN
E-mail: 63m502@cn.waseda.ac.jp
and
YOSHIAKI MUROYA
Department of Mathematics, Waseda University
Ohkubo 3-4-1 Sinjyuku-ku, Tokyo 169, JAPAN
ABSTRACT
We consider the linear system $Ax = b$ derived from singular perturbation problems.
To solve such non-symmetric linear problems, we propose a generalised
SOR method, which we have named the "improved SOR method with orderings".
We use three ideas, that is, orderings, variable relaxation parameters and a not
necessarily strictly upper triangular splitting matrix $U$ with $A = D - L - U$. The
basic theorem, the selection of the relaxation parameters, orderings and several
numerical experiments are also presented.
1. Introduction
For example, if $P = I$, then Eq.(1) is called the improved SOR method with
natural orderings. In particular, if $\omega_i = \omega$, $i = 1, \ldots, n$, then we call Eq.(1) the
usual SOR method with natural ordering for $\omega$. If
$$P = \begin{pmatrix} 0 & & 1 \\ & \iddots & \\ 1 & & 0 \end{pmatrix} \quad \text{and} \quad \omega_i = \omega,\ i = 1, \ldots, n,$$
then we call Eq.(1) the SOR method with inverse ordering for $\omega$.
Our method has the following three features compared with the usual SOR method.
1) We take orderings into account. This turns out to be very important for
non-symmetric matrices.
2) We change the relaxation parameters $\omega_i$, $i = 1, \ldots, n$ of $\Omega$ usefully.
3) U need not be a 'strictly' upper triangular matrix.
Recently, H.Han et al.$^8$ and H.C.Elman and M.P.Chernesky$^6$ studied the effect
of the partitioning and ordering of the unknowns on the convergence of the Gauss-Seidel
iterations and gave a general procedure to automate the partitioning and ordering
phase of the solution process, for not only one-dimensional problems but also
two-dimensional problems of the discrete convection-diffusion equation. K.R.James$^{11}$
expressed the iteration to vary all relaxation parameters as in Eq.(2) and derived
a range of values of the relaxation parameters of the Gauss-Seidel and Jacobi type,
together with the bounds of the spectral radii of the corresponding iteration matrices.
P.H.Brazier$^3$, D.B.Russel$^{13}$, J.C.Strikwerda$^{14}$ and L.W.Ehrlich$^5$ respectively
proposed special selections of the relaxation parameters for two-dimensional problems,
but they have no analytic results. J.J.Buoni and R.S.Varga$^4$ commented that
the splitting matrices $L$, $U$ need not be triangular matrices.
We use all three ideas at the same time to solve non-symmetric linear equations.
As a result, we have obtained an effective SOR-like method which converges more
rapidly and with fewer iterations than the usual SOR method. In this paper, we
only prove the basic theorem for the tridiagonal matrix case with constant coefficients
and apply this theorem to several numerical examples of blocked systems, using the
special relaxation parameter $\omega_b$ with proper orderings.
Further results on the improved SOR method with orderings, that is, general convergence
theorems, special selections of the relaxation parameters $\omega_i$, $i = 1, \ldots, n$ and
orderings in practical use and relationships between our method and the direct methods
such as Gaussian elimination, etc. will be published elsewhere (see E.Ishiwata
and Y.Muroya$^{9,10}$).
where $\varepsilon$ is a parameter in $(0, 1]$ and the functions $a$, $b$, $f$ lie in $C^2[0, 1]$ and are
independent of $\varepsilon$. We first assume that there exist constants $\alpha$, $\beta$ such that
$$a(x) \ge \alpha > 0, \quad b(x) \ge \beta, \quad \alpha^2 + 4\varepsilon\beta > 0. \tag{4}$$
Under these hypotheses Eq.(3) has a unique solution $u(x)$. This solution has, in
general, a boundary layer at $x = 0$ for $\varepsilon$ near 0.
Now, as examples of non-symmetric difference equations, we show only two examples
of difference equations derived from singular perturbation problems.
Let $n$ be a positive integer and $h = 1/(n + 1)$ be the uniform mesh width. The
nodes in $[0, 1]$ are $x_i = ih$, $i = 0, 1, \ldots, n + 1$.
i) Upwind difference scheme
— 2y + yf < + t y ,~yii+
£ 2
~ h a . — + 6 * = A, . = l , - , n { 5 }
Vo — 7o> JAi+1 = 7i
$$l_i = \frac{\varepsilon}{2\varepsilon + a_i h + b_i h^2}, \qquad u_i = \frac{\varepsilon + a_i h}{2\varepsilon + a_i h + b_i h^2}.$$
If $b_i = 0$, then $l_i + u_i = 1$ holds.
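The off-diagonal entries of the scaled upwind matrix can be generated directly from the mesh data. The sketch below assumes the normalized form $A = [-l_i, 1, -u_i]$ with $l_i = \varepsilon/(2\varepsilon + a_i h + b_i h^2)$ and $u_i = (\varepsilon + a_i h)/(2\varepsilon + a_i h + b_i h^2)$; the function name and the sample values are illustrative only.

```python
from fractions import Fraction


def upwind_coefficients(eps, a_i, b_i, h):
    """Return (l_i, u_i) of the row-scaled upwind scheme at one mesh node."""
    denom = 2 * eps + a_i * h + b_i * h * h
    return eps / denom, (eps + a_i * h) / denom


eps, h = Fraction(1, 100), Fraction(1, 10)
l0, u0 = upwind_coefficients(eps, Fraction(1), Fraction(0), h)   # b_i = 0
l1, u1 = upwind_coefficients(eps, Fraction(1), Fraction(1), h)   # b_i > 0
print(l0, u0, l0 + u0)
print(l1, u1, l1 + u1)
```

For small $\varepsilon$ the entry $u_i$ dominates $l_i$, which is the kind of asymmetry that makes the ordering of the unknowns matter for SOR-type iterations.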
ii) The El-Mistikawy and Werle difference scheme
The piecewise constant approximation $\bar a(x)$ of $a(x)$ on $[0, 1]$ is defined by
a u x.£ [*(-l,as), i = 1,2,••-,«+ 1
o i, n + x= 1
where $\bar a_i = (a(x_{i-1}) + a(x_i))/2$ for $i = 1, 2, \ldots, n + 1$.
Piecewise constant approximations $\bar b$ and $\bar f$ of $b(x)$ and $f(x)$ respectively are defined
analogously. The test functions $\{\psi_k\}_{k=1}^{n}$ are defined by
1
-£^;' + av5' + oV* - 0, on (Xj,x )
k ju j = 0,l,---,n
(6)
5 , j = 0, l , - - . , n + 1.
i>k(xj) = ktj
Under the assumption Eq.(4), the El-Mistikawy and Werle difference scheme is
f, 7
denoted by Aii — b where u = {^(xi),--
11
• ,u (a; )) '. This matrix A = [flyr] is an
h
n
In this section, we show the error estimates for the SOR method with the special
relaxation parameter $\omega = \omega_b$.
16
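We use an $n \times n$ tridiagonal matrix expression as
$$A = [a_i, b_i, c_i] = \begin{pmatrix} b_1 & c_1 & & 0 \\ a_2 & b_2 & c_2 & \\ & \ddots & \ddots & \ddots \\ 0 & & a_n & b_n \end{pmatrix}$$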
n+li ,1+1
? = 0,1,2,---. i / m = 9(71 + 1 ) , thenwehave ef = |A|«< >. e<°> and
) U II 1<p < n - A
u
mi
u t At !
4 = /A u
p = n - fe+ 1
u k A; JO)
I <4°' | , n - Je + 2 < p < n
ZA u
8
Proo/. Let the eigenvalues of the SOR matrix C be \ = (w — lje* ', j = 1,- -• ,n. u
17 2
Then , i can be expressed by %j and
3 = cosfij + isindj, where i = —1.
A,r+w - 1 1
- 1)5 , i. (w - 1)5 ^
H = ' .. = - ( 1 + e"-)e-i > = i ^-2cos ^
e
w(A )a 3 "j w 2
c o s
Since / i , - is already expressed by Eq.(7), we obtain cos^ = U\J-£ti ^i- Be-
cause of Cij = lit, we have u)^Jluj{us — 1) = 1 and fly = 2jw/{n+ 1), 1 < j < n.
The eigenvector u , corresponding to A can be defined by }
pjTf ,p5
e ' sinpfly
I — I ^ n + l " U
where tJj = Pj/2 = j j r / ( n + 1), j= l , - - , n . Then we can get the next relation
n
2
£ < f « , where c[ — Y" ei (sin2A:e +tcos2*fl,-)-
0,
i (8)
£ 4 % = £
—*—• £ {sin2p^ + i ( l - cos2pfl,)} (sin 2/fcff,- + t cos 2*9^)
3=1 3=0
0)
XA V'" e[ *
*=1 y=o
We use the relations £ J sin 2pt?y - cos 2kS = 0 and £ " cos 2k&j = 0 . If p = A
= 0 } = 0
holds, then 5Z™ cos 2(p —fc)fl -= n + 1. Otherwise, if p / k and 1 < p, k < n, then
=0 3
U
" F
j=l „ j=l \V /
Substitute the above expression of cf^ to this formula, then the error vector is rewrit-
ten as
-k P
v
u j=0 n + 1
where
*(2m >-2fc)Jy
e +J . a i n p 9 j . _ { c o s ( p + 2(m-fc))fl + £sin (p + 2(m - k))0j} • sinpfij
>
from which each value of p, m determines the error vector explicitly and we can finally
obtain the proof. •
For practical purposes, we denote two error estimates more explicitly, one for
8
$|l| > |u|$ and the other for $|l| < |u|$ (cf. H.C.Elman and M.P.Chernesky ).
Corollary 1 Assume $|l| > |u|$ in Theorem 1. Then for $q = 0, 1, 2, \ldots$,
n+1))
if m = q(n + 1), then we get simply e^ = • e<°>, and
if m = q{n + 1) + k, 1 < k < n, then
\M «(«+!> p = n —fc+ 1
1 + 4
wfcere d, - ^ ' " and |d,| - ~J\Xl/u\.
If $|l + u| = 1$, then $|d_1| = 1$. Otherwise, if $|l + u| < 1$, then $|\lambda| < |d_1| < 1$,
and if $|l + u| > 1$, then $|\lambda| < 1 < |d_1|$. (9)
Corollary 1 implies that if $|l + u| < 1$, then the convergence ratio per one iteration
of $\max_{1 \le p \le n} |e_p^{(q(n+1)+k)}|$ for $1 \le k \le n$ is $|\lambda/d_1|$ which is greater than $|\lambda|$, but the number
H -
Iff ]
n+1
replaced di by
v
1 + Vl —4lu
Corollary 2 implies that if $|l + u| < 1$, then the convergence ratio per one iteration
of $\max_{1 \le p \le n} |e_p^{(q(n+1)+k)}|$ for $1 \le k \le n$ is $|d_2|$ which is greater than $|\lambda|$, but in spite
of $|l| < |u|$, the number of iterations is independent of $n$. On the other hand, if
$|l + u| > 1$, then $|d_2| > 1$ holds and the number of iterations should be theoretically
almost $q(n + 1)$ where the smallest integer $q$ satisfies $|\lambda|^{q(n+1)} \max_p |e_p^{(0)}| < \delta$. But in
practical computations, if $n$ is large, then it may not be correct by the computational
errors (see Example 2).
Since (|A|/|di|)/|o2| — |A| < 1, the case of \l\ < |u| is not better than the case of
|/j > |u|. Example 1 and 2 imply those results. That is, for p = 1, • • •, n — k, k —
P o f e m ) i n m
1, • • • , n , the term d\ • efl or jA|" • 4 " • 4 ° '
k Corollary 2 remains and e< >
p
does not decrease until m becomes a multiple of (n + 1). On the other hand, we note
in both cases of Corollary 1 and 2 that
m) m
e p = ]X\ ef\ m = (n + l),
q 9 = 1,2,3,---
In this section, we mention how to determine the permutation matrices for good
orderings (cf. H.Han et al.$^8$) and the relaxation parameters. We first define the
turning points of the matrix.
Definition 1 Let us consider an $n \times n$ tridiagonal system $Ax = b$, where
$A = [-l_i, 1, -u_i]$, $i = 1, \ldots, n$. If there is an integer $k$ such that $3 \le k \le n - 2$ and
a n r f 0 e n t
- |«»-I|)<IM - H*HD < o (IM - l ) ( K I - 5) > * p°"
X | u • .-.IJ'/I a point
0 , 8
In particular, if\l \,\u \k< 5,
k < |tt*-i| and | W l | > Nw-lli * * " *»
a "stable" turning point and if \l \, |u*| > \, \l -\\ > \uk-i\ and ]l i\ < \u \, then
k k k+ k+l
only a stable turning point or an unstable turning point such that if p > 2, then
<M-|)(l^-i)<o,*=i,2,.-.,p-i.
For the $n \times n$ tridiagonal matrix $A = [-l_i, 1, -u_i]$ with turning points, we now
show how to choose the permutation matrix. We call the orderings good orderings if
the permutation
•••] oin))
cr(k - 1) < cr(k) < e(k + 1) and if \l \ < |«*|, then a(k - 1) > a{k) > cr(k + 1).
k
then a(k - l),£r(fc + 1) > a{k) and if > \u ^\ and \l \ < |u»+i|, thenk k+l
<r(jfc-l),ff(fc + l ) <a(k).
Then, by assumption, we can practically use good orderings.
We now show examples of the turning points and the permutation matrices. We
apply the n x n tridiagonal matrix A = \—lj, 1, —Vj\ such that t
lj = h (2 < 3
< r\) where ii+wi = l , ii,fii>0
= Hi (l<j<n-D
h =h (r, + l < j < n )
where l + u^=l, l ,Ui>0
Uj = u\ (n<j<n-l) 2 2
/ 0 ••• 0 1 0 ••• 0\ r 1
1 0
0 0
1 0
0 0
0
01
0 1 0
^0-010
Note that the turning points of matrices in Definition 1 are similar to the turning
points of the singular perturbation problem, Eq.(3).
Now let us consider an $n \times n$ tridiagonal matrix $A = [-l_i, 1, -u_i]$ with $l_1 = 1 - u_1$
and $u_n = 1 - l_n$ and assume $4l_i u_i < 1$, $i = 1, \ldots, n$. Assume this matrix $A$ has several
blocks as mentioned above. Then we select $\omega_{b,j}$ for the $j$-th block $[-l_j, 1, -u_j]$ of $A$
as
$$\omega_{b,j} = \frac{2}{1 + \sqrt{1 - 4l_j u_j}}.$$
Note that if $l_j + u_j = 1$, then
$$\omega_{b,j} = \frac{1}{\max(|l_j|, |u_j|)}.$$
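To make the role of this relaxation parameter concrete, the following is a minimal sketch of the ordinary SOR iteration with natural ordering on a constant-coefficient tridiagonal system $A = [-l, 1, -u]$. It assumes the parameter $\omega_b = 2/(1 + \sqrt{1 - 4lu})$; the function names, the stopping rule based on the largest update, and the test data are illustrative choices and not the authors' experimental setup.

```python
import math


def omega_b(l, u):
    """Relaxation parameter for a constant-coefficient block [-l, 1, -u]."""
    return 2.0 / (1.0 + math.sqrt(1.0 - 4.0 * l * u))


def sor_tridiagonal(l, u, rhs, omega, tol=1e-10, max_iter=10000):
    """SOR sweeps in natural ordering; returns (solution, sweeps used)."""
    n = len(rhs)
    x = [0.0] * n
    for sweep in range(1, max_iter + 1):
        change = 0.0
        for i in range(n):
            left = x[i - 1] if i > 0 else 0.0
            right = x[i + 1] if i < n - 1 else 0.0
            gauss_seidel = rhs[i] + l * left + u * right
            new = (1.0 - omega) * x[i] + omega * gauss_seidel
            change = max(change, abs(new - x[i]))
            x[i] = new
        if change < tol:
            return x, sweep
    return x, max_iter


def run(n, l=0.9, u=0.1):
    """Solve a system whose exact solution is the vector of ones."""
    rhs = [1.0 - (l if i > 0 else 0.0) - (u if i < n - 1 else 0.0) for i in range(n)]
    x, sweeps = sor_tridiagonal(l, u, rhs, omega_b(l, u))
    return max(abs(v - 1.0) for v in x), sweeps


print(run(20), run(200))
```

With $l > u$ and $l + u = 1$ the sweep count in this sketch stays small and does not grow with $n$, in line with the behaviour reported for the natural ordering; the case $l < u$ is not exercised here.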
5. Numerical Experiments
number of iterations.
We first apply the SOR method for $\omega = \omega_b$ to a simple tridiagonal system $Ax = b$,
where $A = [-l, 1, -u]$, $l + u = 1$.
Example 1. For $A = [-l, 1, -u]$, $l + u = 1$, we apply $\omega = \omega_b$.
Table 1. The number of iterations for $\omega = \omega_b$
If $|l| > |u|$, then the number of iterations is small and independent of $n$. But if
$|l| < |u|$, then it depends on $n$ and equals a multiple of $(n + 1)$.
Example 2. For $A = [-l, 1, -u]$, $l + u \ne 1$ and $lu > 0$, we apply $\omega = \omega_b$.
If $|l + u| < 1$, then the numbers of iterations in both cases $|l| > |u|$ and $|l| < |u|$ are
independent of $n$. On the other hand, if $|l + u| > 1$ and $|l| < |u|$, then the number of
iterations must theoretically be a multiple of $(n + 1)$. But in these cases, if $n$ becomes
sufficiently large, then we may need more iterations than a multiple of $(n + 1)$ because
of the computational errors. We represent such cases by the numbers of iterations
with superscript $*$ in Table 2.
Hence in practical computations, we should transform the case of $l + u \ne 1$ into
the case of $l + u = 1$, which can be done by using such $d$: $dl + u/d = 1$ as defined in
Corollaries 1 and 2.
Example 3. For $A = [-l, 1, -u]$, $l + u = 1$, we change $\omega$, $0 < \omega < 2$.
Table 3. The case of $0 < \omega < 2$ and $n = 100$
w i=0.1 (-0.9 m d1=0.25 i=0.75 m di=0.33 (=0.67 m d
Remark 1 For the $n \times n$ tridiagonal matrix $A = [-l, 1, -u]$, let $lu > 0$ and
$\omega_{opt} \le \omega_b \le \omega < 2$, then for any eigenvalue $\lambda$ of the SOR matrix the fact $|\lambda| = |\omega - 1|$ is known.
Let $m$ be the number of iterations for $\omega = \omega_b$. Let $\lambda_b = \omega_b - 1$ and $m$ be a constant
such that $|\lambda_b|^m = \delta$. Then under the assumptions we can guess $m_\omega$ as $m_\omega = m + \alpha n$,
where a parameter $\alpha$ is determined by $\sqrt{|\lambda_b/\lambda|} = |\lambda|^{\alpha}$ and does not depend on $n$.
For example in Table 3, if $l = 0.9$, $\omega = 1.3$ and $\omega_b = 1.111111111111111$, then
$\lambda = 0.3$, $\alpha = 0.4124$, $m = 15$ and $m_\omega = 56$.
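The worked numbers in this remark can be reproduced with a few lines. The sketch below reads the defining relation for $\alpha$ as $\sqrt{|\lambda_b/\lambda|} = |\lambda|^{\alpha}$ with $\lambda = \omega - 1$ and $\lambda_b = \omega_b - 1$, which is an assumption made because it matches the quoted value $\alpha = 0.4124$; the function name is hypothetical, and $n = 100$ is taken from the heading of Table 3.

```python
import math


def predicted_iterations(omega, omega_b, m_b, n):
    """Estimate the iteration count m_b + alpha * n for a relaxation parameter omega."""
    lam = abs(omega - 1.0)
    lam_b = abs(omega_b - 1.0)
    # alpha solves sqrt(lam_b / lam) = lam ** alpha
    alpha = 0.5 * math.log(lam_b / lam) / math.log(lam)
    return alpha, m_b + alpha * n


alpha, estimate = predicted_iterations(omega=1.3, omega_b=1.0 / 0.9, m_b=15, n=100)
print(round(alpha, 4), int(estimate))
```

Truncating the estimate to an integer gives the iteration count quoted in the remark, which suggests the reading of the relation is at least numerically consistent with the source.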
Next we apply our method Eq.(1) to various tridiagonal matrices with the turning
points. But before doing that, we shall explain some words used in all the following
tables to describe the numbers of iterations for each condition. The parameter $\omega_{opt}$ in
all examples is the calculated value for the $n \times n$ tridiagonal matrices $A = [-l_i, 1, -u_i]$.
For UJ — uJopi, let m^i, m „ and m ^ , be the numbers of iterations for the SOR
in
method with natural ordering, with inverse ordering and with good ordering for each
block. Let m& be the number of iterations for the improved SOR method with good
b
by Remark 1 that the number of iterations equals to that for the second block
Ai = [—0.25,1, —0.75] with good orderings because of Qhy. = maxuib,, =f Wopt.
For iii = let m ,\ be the number of iterations for the first block A\ =
b
60 1.320605854262298 71 34 34 18
120 1.329710155009154 130 53 55 18
150 1.330960720871857 161 64 64 18
210 1.332102985834606 221 84 83 18
240 1.332394148552820 251 94 95 18
300 1.332749378434252 312 115 115 18
360 1.332954574049759 371 135 132 18
l = 0.9,
3 u = 0.1, = C7J = 1.111111111111111, (1 < 3 < [ ] " 1)
C|1 3
We show another interesting example. The next table presents the results for a
tridiagonal matrix which has a stable and an unstable turning point.
We consider the numbers of iterations given by Remark 1. Now for w = Wopi = Qt,2,
let fin,!, m ,2 and m ^ be the numbers of iterations for the first block, the second
0 D
block and the third block of A. Then we guess these to be respectively mt,! =
mi + 8n/9, m ,2 =? >n& and 771^3 = rftj, + n/3. Similarly we guess m d = m n + n / 3 =
0 or
For this tridiagonal matrix, the differences of the $\omega_{b,j}$ of two adjoining blocks are
very large. In such a case, we usually get a greater number of iterations if n is not
so large. Because we guess the number of iterations under the assumption that n is
sufficiently large, we note that in Table 7 if n < 660, then m , ^ and are much
more than our guess and monotone increasing according to n. But if n > 690, then
we get the numbers of iterations which we guess by Remark 1.
The number of iterations m „ is monotone increasing according to n. But the
in
60 1.450762203852421 68 132 71 25
100 1.474511840524642 129 212 132 24
160 1.484800375838160 196 330 199 24
200 1.487445518273925 243 412 246 24
240 1.488946709089202 290 491 293 24
We guess that the number of iterations is usually coincident with the number
of iterations for the block with $\omega = \max_i \omega_{b,i}$, but in this case, $m_b$ is a little less than
this.
Finally, we consider the case of changing $\omega_i$, $i = 1, \ldots, n$ for each entry. What
happens then? We consider the coefficient matrix which is derived from the upwind
difference scheme and the lower elements $l_j$, $j = 1, \ldots, n$ are monotone decreasing
such that Ij + Uj = 1 and 0 < lj < j = 1,
Example 9. We apply the n x n tridiagonal matrix A = [—lj, 1, —u,) such that
E + Ojft 1 ., 1
=—rr>
L 2
T>i=—-—r. h
£= ai-th, u =—, i = l , - -,n.
2e + o /i'
j it + mri n +1 f
iij
Table 9. The case of changing all $\omega_i$
n value of uJopi "lord mi mj m 4Ji
80 1.221880878114550 142 63 64 18 16 6
120 1.221880565772298 221 102 103 22 17 5
160 1.221880876205672 301 142 143 23 18 5
200 1.221880875020172 382 183 183 27 21 5
240 1.221880880390542 463 224 224 30 22 5
280 1.221880879444867 544 265 264 35 24 5
320 1.221880888127118 626 307 307 40 23 5
360 1.221880877534967 708 349 349 44 25 5
400 1.221880892204558 790 391 391 48 27 5
iterations to use each ui = H> corresponding to each block which is split by the points
bii
[n/2], [3n/4], [4n/5], [9n/10], [19n/30], where we choose u\j = & • min u<
and the non-diagonal entries of i-th block are l , Uj for r i < j < ri.
}
It is clear that changing $\omega_i$, $i = 1, \ldots, n$ for each block is efficient from comparing
with $m_i$ or $m_j$. But the most efficient case is to change all $\omega_i$, $i = 1, \ldots, n$.
Since the non-symmetry of this matrix is very strong, our method performs very
efficiently for these types.
All the above examples imply that the improved SOR method with orderings is
more rapidly convergent than the usual SOR method.
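For orientation, the plain SOR sweep that these comparisons start from can be sketched as follows for the matrix $A = [-l, 1, -u]$, with the sweep run either in natural or in reverse order. This is only the usual SOR method, not the improved method of Eq.(1); the function names, the stopping rule (largest update below a tolerance) and the default tolerance are assumptions made for illustration.

```python
def sor_tridiagonal(l, u, b, omega, reverse=False, tol=1e-10, max_iter=100000):
    """Plain SOR for A x = b with A = tridiag(-l, 1, -u); returns (x, sweeps)."""
    n = len(b)
    x = [0.0] * n
    order = range(n - 1, -1, -1) if reverse else range(n)
    for sweep in range(1, max_iter + 1):
        change = 0.0
        for i in order:
            left = x[i - 1] if i > 0 else 0.0
            right = x[i + 1] if i < n - 1 else 0.0
            gauss_seidel = b[i] + l * left + u * right   # diagonal entry is 1
            new = (1.0 - omega) * x[i] + omega * gauss_seidel
            change = max(change, abs(new - x[i]))
            x[i] = new
        if change < tol:
            return x, sweep
    raise RuntimeError("SOR did not converge within max_iter sweeps")

def residual_norm(l, u, x, b):
    """Maximum norm of b - A x for A = tridiag(-l, 1, -u)."""
    n = len(b)
    worst = 0.0
    for i in range(n):
        left = x[i - 1] if i > 0 else 0.0
        right = x[i + 1] if i < n - 1 else 0.0
        worst = max(worst, abs(b[i] - (x[i] - l * left - u * right)))
    return worst

b = [1.0] * 20
for rev in (False, True):
    x, sweeps = sor_tridiagonal(0.9, 0.1, b, 1.1, reverse=rev)
    print("reverse" if rev else "natural", sweeps, residual_norm(0.9, 0.1, x, b))
```

Running both orderings on the same strongly non-symmetric matrix is a simple way to observe how the sweep direction affects the number of sweeps.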
6. References
ANALYSIS OF THE MILNE DEVICE FOR THE FINITE
CORRECTION MODE OF
THE ADAMS PC METHODS I
Masatomo FUJII
Department of Mathematics, Fukuoka University of Education
Munakata, Fukuoka 811-41, Japan
E-mail: hijiini@lukuok&-edu ac.jp
ABSTRACT
The behavior of the difference between the values of the predictor and of the
corrector for the Adams predictor-corrector method in the $P(EC)^m$ mode is
analysed. This leads not only to an accurate estimation of local truncation errors
but also to that of global truncation errors.
1. Introduction
In the previous paper [1], the author discussed an accurate method for estimating
local truncation errors, and as its application, an accurate method for estimating
global truncation errors. In that paper, he mentioned two theorems on the behavior
of the difference between the values of the predictor and of the corrector for the
Adams PC methods both in the $P(EC)^mE$ mode and in the $P(EC)^m$ mode. The
proofs, however, were not given there.
The purpose of this paper is to give the proof in the $P(EC)^m$ mode. In Section 2,
some preliminaries are given. In Section 3, the order of the error in the $i$-th correction
($i = 0, 1, \ldots, m$) is investigated. In our discussion we need the asymptotic formula of
the error in the $(m - 1)$-st correction. In Section 4, the existence of the formula is
shown. In Section 5, the behavior of the difference between the values of the predictor
and of the corrector is analysed.
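To make the $P(EC)^m$ mode concrete, here is a minimal sketch using the second-order Adams pair (two-step Adams-Bashforth predictor, trapezoidal Adams-Moulton corrector). The choice of order 2, the function name and the starting procedure (an exact second starting value supplied by the caller) are assumptions made for brevity. The factor -1/6 is the classical Milne estimate for this pair, obtained from the error constants 5/12 and -1/12; how reliably the predictor-corrector difference behaves in the finite correction mode is exactly what the paper analyses.

```python
import math

def adams_pec_m(f, x0, y0, y1, h, steps, m=2):
    """Second-order Adams pair in P(EC)^m mode (no final evaluation)."""
    xs = [x0, x0 + h]
    ys = [y0, y1]
    fs = [f(x0, y0), f(x0 + h, y1)]      # stored derivative values
    milne = []                            # Milne-type local error estimates
    for n in range(1, steps):
        x_new = x0 + (n + 1) * h
        # P: Adams-Bashforth predictor built from the stored derivatives
        y_pred = ys[n] + h * (3.0 * fs[n] - fs[n - 1]) / 2.0
        y_corr = y_pred
        for _ in range(m):
            f_new = f(x_new, y_corr)                      # E
            y_corr = ys[n] + h * (f_new + fs[n]) / 2.0    # C: Adams-Moulton
        # the stored derivative is f at the (m-1)-st correction, not at y_corr
        xs.append(x_new)
        ys.append(y_corr)
        fs.append(f_new)
        milne.append(-(y_corr - y_pred) / 6.0)
    return xs, ys, milne

h = 0.01
xs, ys, milne = adams_pec_m(lambda x, y: -y, 0.0, 1.0, math.exp(-h), h, 100)
print(ys[-1] - math.exp(-1.0), milne[-1])
```

Because the last evaluation is skipped, the derivative carried to the next step belongs to the (m-1)-st correction, which is why the error of that correction plays a central role in the analysis.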
2. Preliminaries
where we denote by y ( x ) the solution of this problem. The step points are given by
where N is the total number of the steps. Let p be the order of the Adams PC
methods. Put
$\nu = n + p - 1$.
& fjt = 0 , l , . . . , p - l )
are p starting values and let
e„ = 0(tf) (o->p+l;p. = 0 , l , . . . , p - l ) .
Let
k) {k)
9<») = / , ( * , » ( * ) ) , 9l =9 M, ff*=g{3v).
The formulae of the Adams predictor-corrector method of order $p$ in the $P(EC)^m$
mode are given as follows:
;='
and
J=0 3=0
where $y_k^{[i]}$ is the $i$-th correction of $y_k$, $f_k^{[i]} = f(x_k, y_k^{[i]})$, $\nabla$ is the backward difference
operator,
and
$\gamma_j^* = \int_0^1 (s - 1)s \cdots (s + j - 2)\,ds / j!.$
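The coefficients of the backward-difference form of the Adams formulae can be generated exactly. The sketch below evaluates the integral above, read with limits 0 and 1 as in the standard Adams-Moulton coefficients (an assumption, since the limits are illegible in the source), together with the standard Adams-Bashforth counterpart $\gamma_j = \int_0^1 s(s+1)\cdots(s+j-1)\,ds/j!$. The helper names are hypothetical.

```python
from fractions import Fraction

def poly_mul_linear(coeffs, a):
    """Multiply a polynomial (ascending coefficients) by (s + a)."""
    out = [Fraction(0)] * (len(coeffs) + 1)
    for k, c in enumerate(coeffs):
        out[k] += c * a
        out[k + 1] += c
    return out

def integral_0_1(coeffs):
    """Integrate a polynomial (ascending coefficients) over [0, 1]."""
    return sum(c / (k + 1) for k, c in enumerate(coeffs))

def adams_gamma(j, shift):
    """(1/j!) * integral over [0,1] of (s+shift)(s+shift+1)...(s+shift+j-1)."""
    coeffs = [Fraction(1)]
    factorial = 1
    for i in range(j):
        coeffs = poly_mul_linear(coeffs, Fraction(shift + i))
        factorial *= i + 1
    return integral_0_1(coeffs) / factorial

gamma = [adams_gamma(j, 0) for j in range(5)]        # Adams-Bashforth
gamma_star = [adams_gamma(j, -1) for j in range(5)]  # Adams-Moulton
print(gamma)
print(gamma_star)
```

The two families satisfy gamma_star[j] = gamma[j] - gamma[j-1] for j >= 1, which gives a convenient consistency check.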
3=1
and
$T_{p2}(x, y(x); h) = y(x) - y(x - h) - h\sum_{j=0}^{p-1} b_{pj}\, f(x - jh, y(x - jh)).$
For the formulae Eqs.(2) and (3), we define the local truncation errors at $x_n$ by
and
$T_{p2,n} = T_{p2}(x_n, y(x_n); h)$
respectively.
For preparations of the succeeding discussion, we give three lemmas.
Lemma 1 (M. Fujii [1]) For the Adams-Bashforth-Moulton pair of order $p$ in the $P(EC)^m$
mode, the identity
3=0
V - r P h = - v i W V w
holds.
The following lemma concerns the growth of solutions of the nonhomogeneous
linear difference equation
$z_{n+k} - z_{n+k-1} = h\sum_{j=0}^{k} \beta_{j,n+k-j}\, z_{n+k-j} + \lambda_n \quad (n = 0, 1, \ldots, N - k).$ (4)
3
Lemma 2 (P.Henrici ) Let B', 0 and A be the constants such that
satisfies
kL
\z \ < K'e" '
n ( = 0,l,... JV),
n 1
where
K' = I - ( J V A + 2fcz), t* = r - e * . r * = 1/(1 - ph).
Lemma 3 (M. Fujii [1]) For any polynomial $P_i(x)$ of degree $i$, the equality
holds.
(ft = 0 , l , . . . , p - l ; i = 0 , l , . .
and put
$e_n^{[i]} = y_n^{[i]} - y(x_n) \quad (n = 0, 1, \ldots, N;\ i = 0, 1, \ldots, m).$
Then we have the following theorem.
Theorem 1 In the $P(EC)^m$ mode, under the assumptions in Section 2, for a suitably
chosen $h_0$, there exists a positive constant $K$ such that
Proof. Under the assumptions in Section 2, we may suppose that there exists the
solution $y(x)$ on the interval $a \le x \le b$ and that for a small $\delta > 0$ the function $f_y(x, y)$
is continuous on
$D_\delta = \{(x, y) \mid a \le x \le b,\ |y - y(x)| \le \delta\}.$
Let us put
$0 < h \le h_0 < 1, \quad I = [a, b], \quad N = [(b - a)/h],$
where [ ] is Gaussian symbol. We may also suppose that $C_i$ ($i = 0, 1, 2, 3$) be the
constants such that
$|f_y(x, y)| \le C_0$ for $(x, y) \in D_\delta$,
$|e_\mu| \le C_1 h^{p+1}$ for $h \in (0, h_0]$ ($\mu = 0, 1, \ldots, p - 1$),
$|T_{p1}(x, y(x); h)| \le C_2 h^{p+1}$ for $x \in I$, $0 < h \le h_0$,
$|T_{p2}(x, y(x); h)| \le C_3 h^{p+1}$ for $x \in I$, $0 < h \le h_0$.
Furthermore put
J
J-IWft, « = £ W t*=o,i„4.
3=1 i=0
Let us choose ft, so that K' exp{(b — a)L'} < 6 for h satisfying 0 < ft < fti (fci < ho).
Suppose that
(XjJ^eVs 0 = 0 , 1 , . . . , N; 1 = 0 , 1 , . . . , l i t ) . (6)
The validity of Eq.(6) will be shown later. Since
01
4 = e£f, + ftE^.; ' 1
- /(^->,y(x ))} - w r p l B (7)
3=1
and
1 1
$ = el^+ftVI/r -/^,!/!^))}
we obtain
01 m| m 1| +1
|ei | < | e i 1 | - r f t C E | o | | e i 7 | +C7 h''
0 P3 2 (9)
3=1
and
i]
\e®\<s v + h3\et \ (i= 1,2,....m), (10)
where
p+1
s„ = n&i + ''Co E M k l V l + ^ f t . (ii)
J=I
Put
Then we obtain
,+i
d <a {^
v m + hC f^k d -
n i v j + Ch> ) (v=p,...,N). (14)
In order to estimate the left-hand side of Eq.(14), let us consider the equation
m-l
Zv = 2 - i + ft|{<-„Cb*i + B E W K _ i
v
i=0
and let {z„} be a solution of this equation with the starting values Zj = |e | 3
d,<Zj U= 0,l,...,N).
5, < hL
K-e" '
< K'exp{(b-a)L'} <S (n = 0 , 1 , . . . , JV).
4. Asymptotic formula
For the asymptotic formula of $e_n^{[m-1]}$, we have the following theorem.
Theorem 2 In the $P(EC)^m$ mode, under the assumptions in Section 2, the relation
$e_n^{[m-1]} = h^p e(x_n) + O(h^{p+1}) \quad (n = 0, 1, \ldots)$
holds. Here $e(x)$ is the magnified error function, which is the solution of the differential
equation
$e' = g(x)e - C y^{(p+1)}(x), \quad e(x_0) = 0,$ (15)
where $C$ is the error constant.
Proof. First, we shall consider the case m > 2. For 0 < j < p — \, put y = Bp* " ', 3
U =/j m _ 1
', Wj = ur"~ and e, = ef'
n
and also put A
1
5 W = n 7 (t,i + l ) -
J 3
1
. < 1 for fi g (O.A,]. (17)
-11
Now we make the difference equation on ej[" . Here we consider h which satisfies
Eq.(17). From Eqs.(7) and (16), we have
J=0
Since
p- I p
>v * t \m—II \nt—11 L \—» fm—11 Im— 1
j=0 j=l
—
+ 7J,i,3-p-)-i T 2,z-p+i
f
m
-hV7z(0,m)(4 l-ei?)
1
+ Wto7*(nt - l,m)(eW - e j - ' ) (20)
and
e M _ [m-H
e = »J,( - m l)(ef - 4°1), (21)
substituting Eq.(19) into Eq.(20), we have
j=0
+ r i , - p i -TpLz-p+i}
P I +
+ hbpo^im - l , m ) ( H _ l — l ) e e { 2 2 )
32
3=0
p
- ftj^ flpjwi™J ei™7 + r i^-p+i - 7 p _ i } .
1| 11
p 2iI p+
3=1
Put
A= = (bpor-rAm - l,m)S«(m - 1)
and
1
tT, - (fi )"- 5,(m - 1)(1 - /i6po7-(0. m ) ) .
p0
Then we obtain
3=0
- hj2<hi*£f^ +?P -p i - W P - H }
V + (23)
3=1
j=0
3=1
j=0 j'=l
p-1
+ A E ^3 l+W L+W -
3=0
W E
7
P2,!+p-2
X
+ W ®4{\ - frA:,)}{T ^- v p+1 - T , _p }.
p2 I +1 (24)
3=0
E lm-11 i
J=l
, n
+ tr- u /(i-h' \ )
n n
3=0 3=1
p + 2
+ /t A _ n p Cn=j»,p+i ,,.,]¥-l), 1
|A _„|<A-
n £» = p , p 4 - l , . . , , J ¥ - l ) .
Since
1 ^ 1 £<W** U* = o,i p-i),
and since
M = to - i f t ^ f f l ^ W ( J - o, i , . . . , ) ,P
Put
B" = BCb + {2(5A2r-V(l - WWHA + B)C .0
we may take
Put
< /fexpHft-aJL*}^ - 4 1
(R = 0,1,...,JV).
Second, let us consider the case m = 1. Let e(x) be the solution of Eq.(15). Put
]
ef = tfeixi) + WP+V (t' = 0,l,...,A0. (28)
Since
e{x )=jk[
3 e'(x + 9jk}d6 0 (j = 0 , 1 , . . . , p ) ,
JO
there exists a constant A"o such that
Since
P
loi 101 . , n /L
loi
w e
ep+k+i = V * E^J P+*+W p-U+i-.i
+
p—J P
+ / l 1u 0 ft a 3
E V j.+*-j4 r*-> ~ £ p;Wp+*- ep _ J rJ: J
j=0 j=l
+ Tpum - - T, (A = 0,.. -, N - p - 1), I J b + a (29)
P-i P
i w <
+ ftE PJ i'+*-j™i>+*-j - * E - w * v + * - j % * * ^
J=0 j=l
+ A* (* = 0 , l , . . . j V - p - l )
1
and there exists a constant Kg such that
\*\<Ka (i = 0,h...,N)
and we obtain
] +
ef = tVe^) + 0(h* ').
Thus the proof is completed.
5. Behavior of $y_n^{[m]} - y_n^{[0]}$
eW-efl = {l-hVr.(0,">)}
/ { l - (hb^^im - l,ra)S (m - 1)} ;
1 1
x {^V^f-'leL'"- -Tpj,,.^} (30)
/lo/ds.
Proo/. Prom Eqs.(2), (3), (16), (7), (8) and Lemma 1, it follows that
m| 01 p m ,| ,|
ei - e' = ft7p- V i - ^-
1 W e + Tfi,*w - W**,. (31)
We denote both sides of the sign o f equality in Eq.(31) by D. Then we have the
following equalitiy:
m| m_11
+ hbfnM - 1. ™)(4 - 4 )> (32)
e IH_e|o] = { l - f t 6 p 0 7 x ( 0 , m ) } D
{1 - - hm)S,(m - - e»)
= {i-hb ^(Q,m)}D.
p0
Therefore we have
/ { I - ( A V r 7 * ( ™ - l,m)5,(m - 1)}
For $y_\nu^{[m]} - y_\nu^{[0]}$, we have the following theorem.
Theorem 3 In the $P(EC)^m$ mode, under the assumptions in Section 2, the
2 +1
$ - = - T„ + e ln ppv + 0(ft " )
holds,where p = p and
P( 1
™ ' \v>t\j>-l) + l; l<t<p,m>2.
First, we shall consider the case m = 1. For v > p, k > 0, it follows that
fc-ip-i
J
«+r-j
1=0 j=o
P
— Tpi^+fc.p+i
1
-rp^-p+j + o ^ ) .
1
When we regard e„-i as ej, !,, the relation mentioned above holds for v = p. Since
^ - f f ^ + o f l ^ W=o,i,...),
for v > p, it follows that
e = —
5+t *1j8,v-p-ri — Tpi.u-^+i
+2 Ip+1
= P,(fc) + 0(/i" ) + 0 ( f t ),
s
E E *wl* + « " J'Jftffv + 0(ft )]|P,(« - j) + 0 ( ^ ) 1
*=0 J=0
r=o 3=0
+2
= E k « o + « - kooAfc + « m ] +0(ft" )]
r=o *
and
+2
E " w*^
- i * - ^ * - , = 9«"o + {* - ^)(c hg' 0 v + aig,) + 0 ( . V )
3=1 E Tpi,v+t-p+i + T i,v+t- +i r P
+ *<i.v- P + 1 +0(O.
e s
lit' expressed as follows:
p + 3 i p + 1
«£U = ft(*> 4- 0 ( n ) + 0(/i ) (p > 2).
+ , 1
e™, = h'efx,) + 0 ( h ' ) = Fo(fc) + O ^ ) -
For v > p , by Lemma 3, it is seen that
+I
= 0(A* ).
p+ +l 2 1
vS" - J/!" = T , * . - 7 > , „ + 0 ( f t - ) + O f A ^ ) (p > i ) .
Second, we shall consider the case m > 2. For v > 0, > 0, we see that
1 + 1
- A"e(x„) + O f A ^ ) = P (k) + 0 ( A ' ) . 0
H
- 4 +AE E ^ « t £ i £ # - E w
(=lj=0 (=1
- 1
- (nVP - D O - hV7«+fc(0, ™)}
/ { l - (nVJ'Vt-fcfm - l,m)S (Tn - 1)} v+t
_ 1
- («*>"-W)" (i - ftv^+*)/{i - ( / i v n ^ n
,, 11
x {/t7 -iV '/ fcei,+t + Tpi.v+t-p+i - Tp2,„ - i}
r t+ +k p+
2
-HOt^HOt/i '*™).
FVom Theorem 2, for t> > p, we see that
p + 2 2p+1
+ 0(/i ) + 0(/t )
p + 2 2 p + l
= P,(fc) ; - 0 ( h )+0(A ) (p>l).
By induction on j , for v > j(p — 1) + 1, it follows that
f l p+ +1 1
e£* = fy*) + 0(fc > ) + © ( f t ^ ) (p > j).
In a similar way as in the case m = 1, for v > i(j> — 1) + 1, we have
6. Acknowledgements
The author would like to express his gratitude to Professor Hisayoshi Shintani for
his invaluable advice.
7. References
A NEW ALGORITHM FOR DIFFERENTIAL-ALGEBRAIC
EQUATIONS BASED ON HIDM
WATANABE Tsuguhiro
National Institute for Fusion Science
Chikusa-ku, Nagoya, 464-01, Japan
E-mail: wata@lsimsun.nifs.ac.jp
Giovanni GNUDI
National Institute for Fusion Science
Chikusa-ku, Nagoya, 464-01, Japan
E-mail: gnudi@srhatori.nifs.ac.jp
ABSTRACT
A new algorithm is proposed to solve differential-algebraic equations. The
algorithm is an extension of the algorithm of general purpose HIDM (higher
order implicit difference method). A computer program named HDMTDV and
based on the new algorithm is constructed and its high performance is proved
numerically through several numerical computations, including an index-2 problem
of differential-algebraic equations and connected rigid pendulum equations.
The new algorithm is also secular error free when applied to dissipationless
dynamical systems. This nature is demonstrated numerically by computation of
the Kepler motion. The new code can solve the initial value problem
where $L$ and $\varphi$ are vectors of length $N$. The values of first or second derivatives
of $\varphi(x)$ are not always necessary in the equations.
1. Introduction
Computer analysis is playing more and more important roles for the development
of science and technology. High speed and large scale computers together with
powerful algorithms are extending the field of activity of numerical computations. Many
types of equations are waiting to be solved numerically in the course of research and
development.
There are many excellent algorithms to solve the initial value problems described
by non-stiff ordinary differential equations. We can usually get good solutions for such
problems by excellent ready-made computer programs. However, we sometimes encounter
serious numerical difficulties if the problems are reduced to stiff ordinary
differential equations, or to differential-algebraic equations. Differential-algebraic
equations frequently arise in many physical problems, such as optimal control problems,
dynamical systems with constrained conditions and so on. The present status of the
research on differential-algebraic equations is described in references [1,2,3].
In a previous paper [4] we have constructed a new computer program named HIDMDV
(HIDM with second derivative) to solve stiff ordinary differential equations or
differential-algebraic equations, based on the algorithm HIDM (higher order implicit
difference method) [5,6,7,8]. The program HIDMDV can solve the equation
$0 = L(\varphi(x), \varphi'(x), \varphi''(x), x)$, (1)
where $L$ and $\varphi$ are vectors of length $N$. To solve Eq.(1), we have introduced the
difference scheme as shown in Fig.1.
[Fig.1: difference scheme of HIDMDV between the grid points $0$ and $2h$]
The computer program HIDMDV has shown good performance and has been
extended to be able to solve boundary-value and eigenvalue problems [4]. However,
practical applications have revealed that the algorithm of HIDMDV should be
improved from the point of view of accuracy and ease of use.
The algorithm of HIDMDV is proved to be A-stable but not secular error free
for dissipationless dynamical systems. For long time tracing of dynamical systems,
symplectic integrators [9,10,11,12] have attracted considerable attention because they are
free from secular errors. Recently, Watanabe and Gnudi [13] have extended the algorithm
HIDMDV to satisfy the no secular error property by introducing the idea of a
time-reversal integrator.
The computer program HIDMDV is designed to solve for the second derivative
$\varphi''(x)$ at the grid points (see Fig.1). Additional equations are needed if Eq.(1)
contains no second derivatives $\varphi''(x)$. This requirement makes the use of HIDMDV
occasionally complicated in applications.
2. Principle of HDMTDV
The principle of HIDM is shown in detail in references [5,7]. Here we summarize
the principle of HDMTDV, which can solve differential-algebraic equations without
the trouble accompanying non adaptive initial conditions. Furthermore, HDMTDV
has a linearly symplectic nature, and guarantees absence of secular errors for recursive
motions of dissipationless dynamical systems. There are 3 types of HDMTDV
difference scheme, depending on the highest derivatives of each variable. These are
discussed in the following subsections.
Here, we consider the difference scheme for variables which have second derivatives
in Eq.(l). In this case, we use the difference scheme shown in Fig.2.
[Fig.2: difference scheme for variables with second derivatives, with the points $s_0, \ldots, s_6$ along the $t/h$ axis]
follows
k E Q W w E W + l i Q&wm + E • (2)
ip'(sh)
The difference scheme for the second derivative $\varphi''$, Eq.(2), has a total of 9 parameters
($P_j(s)$, $Q_j(s)$, $R_j(s)$). Then the truncation error for Eq.(2) becomes $O(h^7)$. To reduce
this truncation error we introduce a relation which determines the value of $s$ as follows
$1 - 9s^2 + 12s^4 = 0$, (5)
that is
$s = \pm\sqrt{(9 \pm \sqrt{33})/24}$. (6)
Then the values of ($P_j(s_i)$, $Q_j(s_i)$, $R_j(s_i)$) ($j = -1, 0, 1$; $i = 0, \ldots, 6$) are determined
uniquely, and then the truncation error of Eq.(2) becomes $O(h^8)$.
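The four embedded points defined by Eq.(5) follow from treating it as a quadratic in $s^2$. A short numerical check, with a hypothetical helper name, is:

```python
import math

def embedded_points():
    """Real roots of 1 - 9 s^2 + 12 s^4 = 0, sorted in increasing order."""
    roots = []
    for sign in (1.0, -1.0):
        s_squared = (9.0 + sign * math.sqrt(33.0)) / 24.0   # quadratic in s^2
        roots.extend([math.sqrt(s_squared), -math.sqrt(s_squared)])
    return sorted(roots)

for s in embedded_points():
    print(s, 1.0 - 9.0 * s**2 + 12.0 * s**4)
```

The roots are approximately ±0.3683 and ±0.7838, symmetric about 0 and inside the interval (-1, 1).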
The difference scheme for $\varphi'$, Eq.(3), has a total of 9 parameters ($D_j(s)$, $E_j(s)$,
$F_j(s)$). Then the truncation error for Eq.(3) becomes $O(h^8)$. This order is
compatible with the one of Eq.(2). Then, the parameters ($D_j(s)$, $E_j(s)$, $F_j(s)$) are also
determined uniquely.
The difference scheme for $\varphi$, Eq.(4), has a total of 9 parameters ($A_j(s)$, $B_j(s)$,
$C_j(s)$). Then the truncation error for Eq.(4) can be reduced to $O(h^9)$. This order
is one order higher than the one of Eq.(2) and Eq.(3). Then one parameter, for
example $C_1(s)$, becomes free if we are satisfied with the same order of accuracy of the
discretization scheme for $\varphi$, $\varphi'$, $\varphi''$. When we impose the time reversal condition for
the discretization scheme Eqs.(2-4), we obtain the conditions
U
<m ~ =- \ ^ s < , (7)
1 1 1
Ci{8 )-C { )
1 1 gt =- ~ f*8 .
1 6 (8)
Two coefficients are still left undetermined for the parameters ($A_j(s_i)$, $B_j(s_i)$, $C_j(s_i)$)
($j = -1, 0, 1$; $i = 0, \ldots, 6$), if we request the compatibility for the truncation errors
for representations Eqs.(2-3). We discuss this point in some detail in
section 4.
In the following, we determine the parameters in Eq.(4) in order to minimize the
truncation error for $\varphi(s)$. In this case the truncation error of Eq.(4) becomes $O(h^9)$,
which is one order higher than the one of $\varphi'$ and $\varphi''$. Coefficients for the discretization
scheme of HDMTDV are reduced to the form
_ 16H-17A±(444-4A)s 283-21A
C ± , ( S ) = L C d W =
73728 ' ^ 0 7 2 ~ ' <">
( m 5 + 6 4 S l
^ ) = - ^ . D 0 { S ) = J J ^ , m
^ ^ . ^ - ^ W = - « , (14)
_ 117 + 3 7 A ± ( 9 9 + 51A)s 9 - 7A
where
$\Delta = \sqrt{33}$ (for $s = s_1$ or $s_5$), $\Delta = -\sqrt{33}$ (for $s = s_2$ or $s_4$). (18)
$\cos(2h\Omega) = \dfrac{457228800 - 881118000g + 239415750g^2 - 20934585g^3 + 724410g^4 - 9792g^5 + 38g^6}{457228800 + 33339600g + 1275750g^2 + 33075g^3 + 540g^4 - 18g^5 + 2g^6}$, (24)
where $g = (h\omega)^2$.
1.570792...
is satisfied, the numerical solution given by the above discretization scheme becomes
periodic with the correct amplitude (= 1). The relation given by Eq.(24) is shown
in Fig.3 when $\omega h$ is real. This figure shows that the largest step size $h_{\max}$ which
guarantees the periodic solution is given by
$h_{\max}|\omega| = 1.57079212078280060208152\cdots = \frac{\pi}{2}\,(1 - 2.67763\cdots \times 10^{-6})$. (26)
In other words, the largest step size which guarantees the periodic solution of Eq.(22)
is 1/4 of the period of oscillation, and the relative error for the period is $2.67763\cdots \times 10^{-6}$.
The local error of the discretization scheme becomes
cos(2 a -cos(2
f t ) M = - ^ | g 2 0 + - (27)
This analysis leads to the following conclusion. When we adopt the time step $h$ as
1/20 of 1 period ($h = 0.1\pi/\omega$), the local error for function value is of the order of
$4.4 \times 10^{-14}$ and the error for the period of the solution is of the order of $1.1 \times 10^{-13}$.
Furthermore, this discretization scheme can be proved to be linearly symplectic.
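The figures quoted in this analysis can be reproduced numerically from the dispersion relation Eq.(24). The sketch below uses the polynomial coefficients as read from the printed relation (the cubic coefficient of the denominator is taken as 33075, an assumption that makes the expansion agree with $\cos(2\sqrt{g})$ term by term up to $g^5$); the function names are hypothetical.

```python
import math

# coefficients of g^0 ... g^6, as read from Eq.(24)
NUMERATOR = [457228800, -881118000, 239415750, -20934585, 724410, -9792, 38]
DENOMINATOR = [457228800, 33339600, 1275750, 33075, 540, -18, 2]

def horner(coeffs, g):
    """Evaluate a polynomial with ascending coefficients at g."""
    acc = 0.0
    for c in reversed(coeffs):
        acc = acc * g + c
    return acc

def cos_2h_Omega(h_omega):
    """Right-hand side of the dispersion relation for a given h*omega."""
    g = h_omega ** 2
    return horner(NUMERATOR, g) / horner(DENOMINATOR, g)

# local error for a step of 1/20 of a period, h = 0.1*pi/omega
local_error = cos_2h_Omega(0.1 * math.pi) - math.cos(0.2 * math.pi)
print(local_error)

# at the quoted limiting step size the right-hand side reaches -1
print(cos_2h_Omega(1.57079212078280060208152) + 1.0)
```

The first printed value is of the order of the quoted $4.4 \times 10^{-14}$, and the second is close to zero, consistent with the quoted largest step size.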
In this case, the value $\varphi'(0)$ in Fig.2 cannot be specified as initial values. Then
additional equations become needed if we use the discretization scheme in the preceding
subsection. Sometimes, this process becomes a nuisance. We introduce therefore a
separate discretization scheme for this case as shown in Fig.4.
[Fig.4: difference scheme for variables without second derivatives, with the points $s_0, \ldots, s_6$]
are given by linear combinations of function values at grid points $x = \pm h, \pm hu, 0$ as
follows
The difference scheme for the first derivative $\varphi'$, Eq.(29), has a total of 8 parameters
($d_j(s)$, $e_j(s)$, $f_\pm(s)$). Then the truncation errors for Eq.(29) become $O(h^7)$. To
reduce this truncation error one more order, we choose the value of the embedded
points $hu$ appropriately. It is slightly surprising that the value $u = 1/\sqrt{3}$ guarantees
that all truncation errors for $\varphi'(s_i h)$ ($i = 1, 2, 4, 5$) are reduced one order, and
the truncation errors of the above expressions become $O(h^8)$, which is just the same order
as for the discretization scheme of functions with second derivative shown in the previous
subsections. Coefficients for this discretization are reduced to the form
2 9 3 + 8 5 a 8 + 9 4 A ) s 6
M . ) - * g . " . W - ^ , (3D
C i i , S ) = L ( 3 6 )
3072 > *<«>- 384
where the value A is given by Eq.(18).
We study the stability of the discretization scheme given by Eq.(31-36) using the
equation
$\varphi'(x) = \lambda\varphi(x), \quad \varphi(0) = 1,$ (37)
where $\lambda$ is a constant. The difference scheme Eq.(31-36) gives the following
solution
$\varphi(2h) = \dfrac{7560 + 7560\lambda h + 3465(\lambda h)^2 + 945(\lambda h)^3 + 165(\lambda h)^4 + 18(\lambda h)^5 + (\lambda h)^6}{7560 - 7560\lambda h + 3465(\lambda h)^2 - 945(\lambda h)^3 + 165(\lambda h)^4 - 18(\lambda h)^5 + (\lambda h)^6}$. (38)
This solution (the amplification factor of the difference scheme) has the following
characteristics
$|\varphi(2h)| < 1$ when $\Re\,\lambda h < 0$, (39)
$|\varphi(2h)| = 1$ when $\lambda h$ is pure imaginary. (40)
The relation (39) shows that the difference scheme is A-stable and Eq.(40) guarantees
purely periodic numerical solutions independent of the step size $h$ when $\lambda$ is
pure imaginary. The linear symplectic nature is also verified for this discretization
scheme. The local error of the difference scheme is given by the difference between
the discretized solution Eq.(38) and the analytic solution $\exp(2\lambda h)$
« - e ^ 2 A / 1 ) = - I J ^ + .... (41)
These results show the excellent nature of the difference scheme of Eq.(31-36).
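The properties of the amplification factor can be spot-checked numerically. The sketch below evaluates the rational function of Eq.(38) with $z = \lambda h$, using the coefficients 7560, 7560, 3465, 945, 165, 18, 1 as read from the printed formula; the function names and the sample points are assumptions chosen for illustration, and a few sample points are of course not a proof of A-stability.

```python
import cmath

COEFFS = [7560, 7560, 3465, 945, 165, 18, 1]   # coefficients of z^0 ... z^6

def poly(z):
    """Numerator polynomial of Eq.(38) evaluated at z = lambda*h."""
    return sum(c * z ** k for k, c in enumerate(COEFFS))

def amplification(z):
    """phi(2h) for the test equation phi' = lambda*phi, with z = lambda*h."""
    return poly(z) / poly(-z)

for z in (-1.0, -10.0, 2.5j, -0.5j, 0.05):
    print(z, abs(amplification(z)), abs(amplification(z) - cmath.exp(2 * z)))
```

For purely imaginary z the numerator and denominator are complex conjugates, so the modulus is exactly 1 up to rounding, which is the content of Eq.(40).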
Here, we consider the difference scheme for the variables with no derivatives, which
appear in the equation like
0 = lfax% ib(x), ^(4 &x% f ( s ) , ?(?), x) (42)
In this case the difference scheme is very simple, as shown in Fig.5, and no
truncation errors are included.
[Fig.5: difference scheme for variables with no derivatives, on the points $-1, s_1, s_2, 0, s_4, s_5, 1$]
practical unknown quantities in the actual program coding. For example, the quantities
$\varphi(jh)$, $\varphi'(jh)$ are treated as unknown quantities introduced by the relations
($j = 0, 1$).
If it is possible to determine all the highest derivatives using the given system of
equation (1), we can get solutions directly by the above mentioned algorithm. There
are, however, problems in which we cannot determine the highest derivatives only by
the given system of equations. An example is given by
G(</,^>,tM)=0, (45)
/r(vMfr,j)=0. (46)
In this case, both variables $\varphi(x)$ and $\psi(x)$ have second derivatives, so the discretization
scheme given by Eq.(9-17) is applied. This discretization scheme assumes that
$\varphi(-h)$, $\varphi'(-h)$, $\psi(-h)$ and $\psi'(-h)$ are given as initial conditions. In this example,
however, we have only two true initial conditions, for example $\varphi(-h)$ and $\varphi'(-h)$.
The other quantities $\psi(-h)$ and $\psi'(-h)$ are not initial conditions, and should be
determined consistently from Eq.(46). In this case, two additional equations are necessary
to determine the values $\psi(-h)$ and $\psi'(-h)$. An example of a set of additional
equations is
^ff(,M,x)=0, (47)
£pH( ib,x)=0.
Vl (48)
Since the program should be informed of these facts, an index for each variable is
prepared in the program. An example of the index is the rank-2 array variable named JVR
as shown in the following.
iF--/fy,r,*), (49)
and high speed computation is requested, a separate program should be prepared which
does not treat the second derivatives as unknown functions because this value is given
in Eq.(49). In this case the computation speed can exceed the speed of a standard
Runge-Kutta method program.
3. Numerical Examples of HDMTDV
Next, we have calculated the numerical error for the variables with first order
derivative (without second order derivative). The discretization scheme is given by
Eqs.(31-36). In this case, we give the 'exact values' of $\varphi(nh)$ and $\varphi'(nh)$ on each grid
point and $\varphi((2n + 1 \pm 1/\sqrt{3})h)$ on embedded points ($\varphi(x) = \sin(x)$, $n$ = integer and
$h = \pi/32$), and calculate $\varphi((2n + 1 + s_i)h)$ and $\varphi'((2n + 1 + s_i)h)$ by the discretization
$\dfrac{d^2 y}{dt^2} = -\mu\,\dfrac{y}{(x^2 + y^2)^{3/2}}$, (51)
$\mu = \pi^2/16$,
where the constant $\mu$ and the initial conditions are chosen such that the analytic
solution has period $T = 64$ and a relatively large value for the eccentricity. In this
system energy and angular momentum are conserved and it is possible to check the
accuracy of the numerical computations.
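The link between the constant $\mu = \pi^2/16$ and the period $T = 64$ follows from Kepler's third law, $T = 2\pi\sqrt{a^3/\mu}$, which gives a semi-major axis $a = 4$. The sketch below checks this and evaluates the two conserved quantities at the pericentre of an orbit. The eccentricity 0.6 is a hypothetical value chosen for illustration (the source only says it is relatively large), and the function names are assumptions.

```python
import math

MU = math.pi ** 2 / 16.0

def kepler_period(a, mu=MU):
    """Orbital period from Kepler's third law."""
    return 2.0 * math.pi * math.sqrt(a ** 3 / mu)

def pericentre_state(a, e, mu=MU):
    """Position and velocity (x, y, vx, vy) at pericentre of a bound orbit."""
    r = a * (1.0 - e)
    v = math.sqrt(mu * (1.0 + e) / (a * (1.0 - e)))
    return r, 0.0, 0.0, v

def energy(x, y, vx, vy, mu=MU):
    return 0.5 * (vx * vx + vy * vy) - mu / math.hypot(x, y)

def angular_momentum(x, y, vx, vy):
    return x * vy - y * vx

state = pericentre_state(4.0, 0.6)          # e = 0.6 is a hypothetical choice
print(kepler_period(4.0), energy(*state), angular_momentum(*state))
```

Monitoring these two quantities along a computed trajectory is the kind of accuracy check the text refers to.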
For this system the index of the variables is shown in Table 1. No additional
equations are necessary.
variables   x(t)   y(t)
n           1      2
JVR(1,n)    2      2
JVR(2,n)    0      0
Fig.9 Plot of error for energy and angular momentum of the numerical
solution of Eq.(50-51) (Kepler Motion). Because the period of the
numerical solution is very close to the analytical value (= 64), the recursion
time of the numerical solution is very long. This figure shows the
secular error free computation characteristics of HDMTDV.
m> = T l y i + T i m ( 5 3 )
7t^ ~ ' * ' ~ '
m
^ ( ^ - s ) = - T i - ( x - t X l ) , (54)
m a = T 5 5
^*r - '"<»-»»>• ( )
$x_1^2 + y_1^2 = \ell_1^2$, (56)
$(x_2 - x_1)^2 + (y_2 - y_1)^2 = \ell_2^2$. (57)
where $\ell_1$ and $\ell_2$ represent the lengths of the mass-less rigid rods, $T_1$ and $T_2$ correspond
to the tensions of the rigid rods, and $m_1$, $m_2$, $g$ are constants. The unknown variables
are the position of the tip of each rod ($x_1$, $y_1$, $x_2$, $y_2$) and the tension of each rod
($T_1$ and $T_2$). The former group of variables has second derivatives, but the latter group of
variables has no derivatives. In this case, the index JVR is given in Table 2.
variables   x_1   y_1   x_2   y_2   T_1   T_2
n           1     2     3     4     5     6
JVR(1,n)    2     2     2     2     0     0
JVR(2,n)    0     2     0     2     0     0
— niigxi
= constant, (58)
This system is dissipationless, but numerical results show that the numerical error
suddenly increases at special points. This will break the time reversal nature of the
motion. The reason for this phenomenon will be discussed in the next section.
2 2
0= Lifolf'.M) =^- cos(t)z -8exp(-t)y ,
a (59)
0= t (v,()
2 = i-(i- m{t)
Qii + 3exp(-tj)-y, (60)
where a and 8 are constants. The analytical solution of this system is given by
rtt) t)B ( 6 1 )
^ l-a^W
In this case the variable y{t) contains the first order derivative, but it is determined
by the algebraic Eq.(60). Then, an additional equation is necessary to determine the
value j/(()- So, the index of the variables y{t) and z(t) becomes as shown in Table 3.
Fig.12 Plot of numerical solution and its relative error for differential-algebraic
equations of index-2, Eq.(59-60). Since the variable $z(t)$ is
solved by the algebraic equation, Eq.(60), the error is only due to roundoff
error, i.e., of the order of $10^{-15}$.
We have developed a new integration method with high accuracy and high
applicability. A new program named HDMTDV can solve dissipationless dynamical
t t
SiSH'teKS = 0 , (63)
The physical meaning of this process is the following. We treat $y_1(t)$ as a rank-2
variable when $|y_1(t)| > |x_1(t)|$. In this case, $y_1(-h)$ is treated as an initial condition
and ft) is determined by the equation of motions. The x\(-h) and x"(—h) are
determined by the additional equations Eqs.(63-64). When $|y_1(t)| > |x_1(t)|$, $x_1(t)$
is treated as a rank-2 variable. We show a numerical example in Fig.13, using this
choice of the value of the index JVR.
$x_1 = \ell_1\cos\theta_1, \quad y_1 = \ell_1\sin\theta_1$, (65)
$x_2 - x_1 = \ell_2\cos\theta_2, \quad y_2 - y_1 = \ell_2\sin\theta_2$. (66)
In this case, Eqs.(56-57) are automatically satisfied. A highly accurate numerical
solution is obtained as shown in Fig.14.
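That the angle variables satisfy the rod-length constraints identically can be confirmed with a few lines. The sketch assumes the constraints Eqs.(56-57) read $x_1^2 + y_1^2 = \ell_1^2$ and $(x_2 - x_1)^2 + (y_2 - y_1)^2 = \ell_2^2$; the function names and the sample values are hypothetical.

```python
import math

def pendulum_positions(l1, l2, theta1, theta2):
    """Tip positions of the two rods from the angle variables."""
    x1 = l1 * math.cos(theta1)
    y1 = l1 * math.sin(theta1)
    x2 = x1 + l2 * math.cos(theta2)
    y2 = y1 + l2 * math.sin(theta2)
    return x1, y1, x2, y2

def constraint_residuals(l1, l2, x1, y1, x2, y2):
    """Residuals of the two rod-length constraints (assumed form)."""
    r1 = x1 * x1 + y1 * y1 - l1 * l1
    r2 = (x2 - x1) ** 2 + (y2 - y1) ** 2 - l2 * l2
    return r1, r2

pos = pendulum_positions(1.0, 0.5, 0.3, -1.2)
print(constraint_residuals(1.0, 0.5, *pos))
```

Both residuals vanish up to rounding for any pair of angles, so no additional equations are needed to enforce the constraints in these variables.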
In subsection 2.1, we found that two of the coefficients ($A_j(s_i)$, $B_j(s_i)$, $C_j(s_i)$),
for example $C_1(s_4)$ and $C_1(s_5)$, are left undetermined if we are satisfied with the
compatibility for the truncation errors for
representations Eqs.(2-3). In this case, we can change the dispersion relation Eq.(24)
so as to satisfy
$|\cos(2h\Omega)| \le 1$, for $h|\omega| \le \frac{3\pi}{2}$, (67)
by appropriate choice of the values of $C_1(s_4)$ and $C_1(s_5)$. In this case, the largest step
size $h_{\max}$ which guarantees the purely periodic solution of Eq.(22) is given by 3/4 of
the period of oscillation.
It will be easy to extend HDMTDV to solve boundary value and eigenvalue
problems. This will be published elsewhere.
The next big task is the construction of a general purpose computer program to
solve time evolution of multi-dimension boundary value problems described by partial
differential equations. When the space dimension is 1-D, we have already constructed
such a general purpose computer program based on HIDM [14,15]. The present work
represents also an important contribution to accomplish this task.
5. Acknowledgements
G. Gnudi acknowledges the Japan Society for the Promotion of Science for the
financial support.
6. References
Semi-explicit Methods for Differential-Algebraic
Systems of Index 1 and Index 2
Hisayoshi SHINTANI
Department of Mathematics, Faculty of School Education
Hiroshima University, Higashi-Hiroshima 739, Japan
Abstract
A-stable semi-explicit methods are constructed for differential-algebraic systems
of index 1 and for those of index 2 and their convergence is shown.
1. Introduction
where $y$ and $f$, $z$ and $g$ are vectors of the same dimension respectively, $f$ and
$g$ are sufficiently smooth and for (1) $g_z(y, z)$ has a bounded inverse in the
convex closed domain $D$ and so does $g_y(y)f_z(y, z)$ for (2). The initial value
$(y_0, z_0)$ is said to be consistent if $g(y_0, z_0) = 0$ for (1) and if $g(y_0) = 0$ and
$g_y(y_0)f(y_0, z_0) = 0$ for (2). Let
$x_t = x_0 + th \quad (0 < h,\ 0 < t,\ th < C).$
We are concerned with the case where the approximations $(y_n, z_n)$ to $(y(x_n), z(x_n))$
initial value. Rosenbrock methods for (1) with inconsistent initial values are
described in the literature [1].
The object of this paper is to construct A-stable semi-explicit methods
for (1) and (2) that correct the errors of the initial values and to show their
convergence. We also construct interpolation formulas for approximating $y(x_t)$
and $z(x_t)$ ($t \ne 0, 1, \ldots$) and obtain the formulas useful for stepsize control.
Let $d_0 = -(g_z^{-1}g)(y_0, z_0)$, and assume that $\|d_0\|$ is so small that there exists a
unique $\tilde z_0$ such that $g(y_0, \tilde z_0) = 0$ and put $\bar z_0 = z_0 + d_0$. Then $\tilde z_0 = \bar z_0 + O(\|d_0\|^2)$.
-1
G = (- )-\
Sl K = Gg , L - / , G , T = f + KU
y s 4 = (I- aftT") (a > 0),
F = f-rfzdn, where all function values are evaluated at (yo.^o)- Then we have
2
yr>o) = F + O(||do|| ), 2D ( x ) = JrF + 0(||doH), yo(xo) = TF + O(||do||),
0
where
ni—l i - l TI i—1
c v i + e m
^ = W + E E ' ' ^ * j ' ' ' = - M ~ m i + E E M > * E ' > j (* = 2,3,...,m),
;=1*=1 JJ=1fc=1 j=2
+1 2 2
yi - yo(xi) = 0(ftP + A ||doll + ft Noll ), (5)
, + l 2
si - s (xi) = 0 ( f t
0 + ft lldol + lldoll ) (P > 9 > 0). (6)
We also construct interpolation formulas of the form
m n
C
yi =!W + E E P O ' ' . J i + hvt{f(yuz\) + Lgim,zi)} (0 < ( < l ) ,
i=ij=i
+ h K p , n
*,=* + m i + E E « ^ ?' f(yi .*l)+E * *
i=l i=l uc2
and the formula
771 TI
e fc
= E E Py y
such that $y_t - y_0(x_t)$ and $z_t - z_0(x_t)$ are of the order (5) and (6) respectively
and $e = O(h^p)$. The quantity $e$ is used for stepsize control.
so that the A-stability of the method (3)-(4) can be verified by the test equation
$y' = \lambda y$.
We have
16 32 32 32 32
~
2
3 2 3 2 3
Put - |(12( +5t -36t+9), pi it m ^(24( -65t +54t+9), n = ^ t ( 4 - 3 t ) , P2
3 4 2 2
P3i, = | i ( 4 - 3 t ) , p, - t - i , ra - | j t ( 5 t - l ) , r 31 « i/. (5t-3), p n == y , P12 = | ,
20 5 125 _ 5
P i 3 - - T , P l 4 = g , P2i = - ^ . P3i-g.
Case m = 4
1 1 2 11 2 8 26
a = 2' 4 : 2 1 1 = C
3 ' 212 = - g . C213 = — , C 3U - - , C312 - - —, C i3 - 3 - — ,
4 3 2
r = I£
4 P U ( = -4r(378t - 1647t + 2255S - 1080t + 120),
15 120
3 2
p 121 = l->-76Uf + H685i - 6920t + 90( - 90),
yu
4 3 2
Pi3< = TT^I-27621t +52440f -23345( - 1080t + 540),
540
4 3 2
p l4[ = _ L { i o i 2 5 ( - 19140( +8105i + S72t-324},
2 2 3 2
PS1I = ^* (36t - 75f + 40), P22, = Y^( (16119t - 30810t + 14215),
P3i- - 4t3(-9(2 + 1 5 (
" 5 |
' p l
" • T^* '" * 3 5 4 2 + 1 0 S (
" 5 0 )
-
3 2 J 2
p, - jt (27t - 501 + 23), rj, = -^-( (65t + 93), r , = Jrt (85t - 39),
3
4 40 20
13 , 2 ... r 25 . 13 . 43 . 65 . 15
p ( ( 5 ( _ 3 ) P l 1 = m = P l 3 m -
* - 3o - i 2 ' y = is - = M' ™ = - y
16 . 3 . 13
P22 = - y , #M-jr;
2
We have shown the following
,+1
Theorem 2 Suppose that do - 0(ft ) (s > 0) ,0 < h < ho , nA < C. Then
for sufficiently small ho
\\G(y,z)\\ < Mo, \\H{y,z)\\ < M,, \\g {y, z)/2\\ < M , \\g iy)f Ay,z)f2\\ vs 2 s 1 < M inD
3
and put
m
D = M\M , £>i — M M ,
0 2 0 3 CQ = H(y , z )g(y ), a 0 a So = Vo + co, y = yo-
Then we have
1
Proposition 1 Suppose that r = A>IMI < L. Then the sequence { j / " } de-
fined by the iteration
. „C*) + W H{g i (fc = o, i,...) (7)
have
fc+1) W
<7(i/ ) =jf (1-*)frf + Oc )(Ck, c )d&,
k k (9)
which yields the estimate
2
\\c i\\ < Do\\ck\\ ,
k+
so that
2 1
llcfcll < r ' - Hcoll (* = 0,1,...). (10)
For any positive integer p we have
holds.
Corollary 1 Let
d = G{yo,z )g (yn)\f(y ,z )
Q 0 y 0 0 + fy(.yo, Z O ) C Q } > ZO = zo + dn.
Then
e ^d 0 n + 0{\\dof + \\cf), (12)
2 Z
||io-5o|| = 0(||(iol| + l|col| )- (13)
Proof. From
eo = {G(iiD,^H0(IMI)}{ftr(»^
3
- * + 0 ( 1 * 1 11*11+INI )
the estimate (12) follows. The inequality
W W
\\ZO-ZQ\\ < jzQ- z \\ + \\z
(So, Jo) in the neighborhood of iyo,zo). Let z = k(y) be the branch of the
implicit function denned by g {y)f(y,z) — 0 such that ZQ — k{yn) and put v
L
P(y) - f(y,k(y))- « t iyo(x),zo(x)) he the solution of (2) satisfying yo(xo) =
yo, zo(so) = h- Put
l
k - {A - iy- Ah{fi
tj - F), kj = Kkij (i = 2,3,...,m;j = 1,2, ...,rt),
rm = {A- iy-'ALgi,
} rty = Km,, Pi = R(fi - F),
j=ifc-J j=2
i-l n
Wi = jo + co + + u* j*)' d m
j=ifc=i
1 2 2 2
ft = S1 - So£*i) = Of**" + A M l + 0 ||doll + Hcoll + h ||d || ), 0 (16)
+1 2 2
r , - z i - z i ( x ) = O(A« +fc||^||+ft||d ||
< 1 0 + || || + l|d || ) ( p > > 0 ) . (17)
Ca o 7
33
p2l = 3, r = 1, s = 9, t = 2, pn, = i , p i = K - y £ + » ? , P211 = -6f+18( -9( ,
2 2 2 2 (
2 3
2 2
r , = i , s , = 9t , t , = 2(, pn = 0, pu = -5,
2 2 2 P21 = 6 , f = ^. 2
Case m = 3
1 3 3 21 99 69
a = j , O i l = 1, C 2 1 2 = - - , GJI, = 7 , C3i 2 = -jjg. <WS - - — , c 321 = — ,
9 9 3 , , , 1 . 7 . 1
r
P31 - ^ y 21 = 1, T - = - 1 , r = s = 1, S3 m —, 1 — —, t = 3, 2 2 3 1 2 2 3
•> 16 , 16 , 8 o 19 16 T
2 3 3 2 J
Pill = t. PlM - 2t -t—-t , pu, = y t - - t + ( , P 2 „ - 0, p , - --t\ 22 put - —( ,
f i. = t , r
2
2
2 2 [ = -t , r 2
3 1 1 - ( , r, = 8 ( t - ( ) , p,i = 0, p , = - y P13 = y
3 2 3
2 P21 = 0,
P31 = f31 = 1.
Case m = 4
1 1 1 1 3 39
a = C2U C 2 1 2 0 2 1 3 = C 3 n = C312 =
4' " 3' = "5' "9' ?' "250'
337 37 207 33 3
cm = - £ g , * M = — , < * » = — ,C322 = j g . A u - 1.4M = J.
. 64 , 128 , 64 . 686 81
— 4 2 1= 31 = 632 = e 4 2 = 9
dill = 2 2 5 . ^ - Tl25 ST'^ 243' 25' '
50 128 1664 686 9 45 9
643 = y J32 = y-jy ?42 = — = JjJ.ejl, = ^Cm = 32,C3ii = , ?
99 . „, , . 41 . 7 . 200 ; 64
C312 = j y ^ i =o,c4ii - i , c i 2 = - y - 4 2 i = - j . « m = y f . ^ i = T_-, 4
c
, 64 - 64 - 343 8 1 257
*2l - ^.^422 - 2 ^ 3 , ^ 3 , = - — , P U = 1.P12 = - g , P , 3 = 3^,P.4 = g j ,
41 59 125 1 64 64
P2, - 0,P22 = "^,P23 = 77T.P31 = ^ P" = g, r , = g j . f ^ = - - . 2
3 z 3 2
PlU = *,Pia = ^ ( 2 2 5 f - 6 4 i + 36t-18),p22i = ^ ' ( 1 5 t + 64t - 72i + 18),
a 2 2 2 2
r i i . = f | | ' ( 8 * - 9), r « , = - | t ( 7 t - 9), , = 9t , * = ^ ( , s , = ( , S 2 4
128 686 3 8
t 2 = t ( 1 3 ( + 3 ) t 3 = i(5f 3 > ( 4 ! t ( 7 ( 6 ) 2 = P u =
' 243 ' ' 243 ~ ' " "2 ~ '^ ~ 21'
2 . 784 .
P 4 1 r 3 1 = , 4l = L
-21' 729' ' -
3.2. Convergence
y = y + c , z = Zj + dj (j = 0,1,2,...),
j j j 3
where
Cj = H (yj), jg dj = Gjg {yj)(f(yj,Zj) y + f (yj,v z,)cj},
H = H{y ,z ),
i i j Gj = G{yj,Zj).
Then we have
Theorem 4 Suppose
r+1 1+l
c = O{k ),
0 d = O(h )
0 {r,a > 0), 0 < f t < f c , 0 nh<G.
where
Then for some constants oo, &o, * (j = 1,2,3) and (fc = 1,2) we have by
(16) and (17)
1 2 2
| | S j i | | < aoh^
+ + UihHWcjW + \\dj\\) + 02 IICJH + ftaj ||d,|| , (21)
| | T j l | < 6oA
+1
,+1
+b h(\\c \\ + \\dj\\) + b2(\\cjf 4- \\djf).
1 j (22)
Let
i?
90J ~ » ( * ) = Wte) - » ( * / ) + / { ^ i - (fo(())}d( ( I > * ; ) ,
we have
From
2
Sfetzj+l)) - " S i) j+ = - ft,(tt+i)S i i+ + 0 { | | S | | ) - 0,
3+l
it follows that
a
c i =
i + fff+ij^+l) = -Qi+iS i j+ +0(|%iS ), (23)
and so
a
fc'+l - ft&fcl] - Fj+iSj + i + 0(||Sj+i|| ),
because
a
-fe+l=^ + J +0(||c i|| ). i+
CL
Setting d = kie ° (i = 1,2) , we have
2
Ay. < C , £ | % ] + C £ HSjll 2 ( = 1,2,...),
n (24)
J=I j=i
2
\\y^-yo(x )\\<Ay„n + C \\S \\ 3 n ( r i - 1,2,...), (25)
2
\\yn ~ Vo(x )ll < Ay„ + C ||S || + C ||S„||
B 4 n h (n - 2,3,...), (26)
Ilia •-•••»B.(*i)ll = l i f t ! . (27)
Let $k_0$ be the constant such that $\|k(u) - k(v)\| \le k_0 \|u - v\|$. Then
we have
Ss(w(^+i))/(y,(^+i),z,(xj i)) +
= Ss(W+i)(/ + V j + O - S j t / j P j + i S j + i + ATj+O - ^ ( P j + i S j + i , / )
2 2
+o(||s || + ||r || }J+1 i + 1
= 0,
which yields
(31)
where function values without arguments are evaluated at $y_{j+1}$ or at $(y_{j+1}, z_{j+1})$.
From (21), (22) and the assumption it follows that for some constants $A_1$
and $B_1$
m
liftII <Aih', \\T,\\ <Bih .
By (21), (22), (23) and (31) there exist constants a t (j - 1,2,3), h and b 2
such that
, 2
\\S 4< h^ + n {\\S \\
j+ ao ai j + \\T \\)+j a2 Wf-H&pyf,
,+1 2
| | T i | | < 6oft
i+ + M d l f t l l + 113)11) + W l l S i f + P J | ) (j = lj2,...).
We shall show that for sufficiently small ho inequalities
+ w+i
llftll < Ajh" \ \\Tj\\ < Bjh (32)
Put
1 +1 21 1 -
A = a^-"
2 + aiiAih? -' + B,h? -') + a A\h -"- 2 4- ^ ^ M * * ,
Then (32) holds for j = 2. Suppose that (32) holds for j = 2,3,.... k and put
+2 v +l 2 2 +2
A k+1 = h' -»
ao 0 + a i (A hl + B ^ ~ )
k k + aiAlh° 0 + a B h r -\
3 k (33)
+1 4 1 1
Bfc+i = 6ohT" + 6i(iltfcS -" + B^fto) + f ^ f t o " " + BgftJ?* ). (34)
Then (32) is valid for j = ft + 1. Setting
2 2w+2 v
where a = ao^"" + MNf**~" + a b h - , 3 b = bohf", c = b + Wifto +
2 +l
6 6 /to . we have
2
3
= «4 AB^ai*. + «^Ao^*H- a T ^ * ^ ^ == 4>C^^ ^fc^
with nonnegative coefficients. From (33) and (34) it follows that A > 0 and k
Put
(A ,B f k k = U , (<p(V)MV)f
k =T{U)
and choose ha small so that
If for j = 4,5,...,fc
\\Uj-V \\<2\\Ui-U \\,
3 3 (35)
then since
\\u -u \\k+l k <T \\u -v _A,
Q k k
we have
\\u -u \\
k+l 3 < Eiall^+i-^n
s ,
< ( l + r + ... + i £ - ) | | E J - 7 i | | < 2 | l ( / - f / | | ,
0 4 l 4 3
and (35) holds for j — k 4-1. Hence j i \ and B^ are bounded and there exist
constants A and B such that
By (24) we have
v+l 2 2 a +2
AiM < CiA{h! + (n-l)h } + C A {h ' + (n - l ) A * }
2
i 2 2 2 +1 !
< Ci>l(A + Cft'')+C2vl (/t ' + C A " ) < D / i ( n = l , 2 , . . . ) .
Estimates (18), (19) and (20) follow from (25), (28), (26), (29), (27) and (30)
respectively
4. References
COMPUTATIONAL CHALLENGES IN THE SOLUTION
OF NONLINEAR OSCILLATORY
MULTIBODY DYNAMICS SYSTEMS*
JENG YEN
Army High Performance Computing Research Center, University of Minnesota
Minneapolis, MN 55415, USA
E-mail: yen@ahpcrc.umn.edu
and
LINDA PETZOLD
Department of Computer Science, University of Minnesota
Minneapolis, MN 55455, USA
E-mail: petzold@cs.umn.edu
ABSTRACT
One of the outstanding problems in the numerical simulation of mechanical sys-
tems is the development of efficient methods for dealing with highly oscillatory
systems. These types of systems arise for example in vehicle simulation in mod-
elling the suspension system or tires, in some models for contact and impact, in
flexible body simulation from vibrations in the structural model, and in molec-
ular dynamics. Simulations involving high frequency vibration can take a huge
number of time steps, often as a consequence of oscillations which are not phys-
ically important. On the other hand, the components causing the oscillations
cannot usually be eliminated from the model because in some situations they
are critical to the simulation. The equations of motion of a multibody mechanical
system are described by a system of differential-algebraic equations (DAEs).
In this paper, we will explore two types of methods. The first class of methods
damps out the oscillation via highly stable implicit methods. Even in this
relatively simple approach, unforeseen problems may arise for Newton iteration
convergence, due to the nonlinearities. The second class of methods involves
formulating the multibody system in such a way that the oscillations are
determined by a linear subsystem, which can potentially be solved rapidly via a
number of different methods.
1. Introduction
Much recent work has focused on the development of numerical methods and
underlying theory for the solution of multibody dynamic systems (MBS) consisting of
fast and slow subsystems [28, 20]. These types of systems occur frequently as initial value
problems in the computer-aided design and modeling of constrained mechanical systems,
molecular dynamics, and many other applications [1, 42]. It is well known that
whether a solution is fast or slow is determined not only by the modeling aspects,
e.g., the coefficients of stiffness and damping, but also by the initial conditions
and events that may excite stiff components in the system during the simulation.
*This work was partially supported by ARO contract numbers DAAL03-92-G-0247 and DAAH04-94-
G-0409 and by ARO contract number DAAL03-89-C-0038 with the University of Minnesota Army
High Performance Computing Center, and by the Minnesota Supercomputer Institute.
As an example, the governing equations of motion of a mechanical system with stiff
or highly oscillatory force devices may be written as differential-algebraic equations
(DAE) [6]:
$$M(q)\ddot{q} - Q^A(q, \dot{q}, t) - \sum_{i=1}^{n_f} f_i(q, \dot{q}) + G^T(q)\lambda = 0 \qquad (1a)$$
$$g(q) = 0 \qquad (1b)$$
where $q = [q_1, \ldots, q_n]^T$ is the generalized coordinate, $\dot{q} = dq/dt$ is the generalized velocity,
$\ddot{q} = d^2q/dt^2$ is the acceleration and $\lambda = [\lambda_1, \ldots, \lambda_m]^T$ is the Lagrange multiplier. There are
$n_f$ stiff or oscillatory forces $f_i$, $Q^A$ includes all the field forces and the external forces
which are non-stiff compared to the stiff components, $g$ is the kinematic constraints,
$G^T\lambda = (\partial g/\partial q)^T \lambda$ represents the internal constraint reaction forces, and $M$ is the mass-
inertia matrix. The stiff force components in (1a) are usually expressed by
$$f_i = T_i(q_{i_1}, q_{i_2})\,(K_i \eta_i + C_i \dot{\eta}_i) \qquad (2)$$
where the $i$-th force, $i \in \{1, \ldots, n_f\}$, is a bilateral force between the pair of bodies $i_1$ and $i_2$;
$T_i$, which may be the orientational transformation matrix of a local coordinate system,
is a nonlinear transformation of configuration spaces; $\eta_i$, which may represent relative
distance or angles between adjacent bodies, is a function of the generalized coordinates
$q_{i_1}$ and $q_{i_2}$; and finally $K_i$, $C_i$ are the associated spring and damping factors [33, 29, 7]. For
some generalized coordinate sets, such as optimal relative coordinates of mechanical
systems, the function $\eta_i$ may be linear or even the identity, e.g., $\eta_i = q_{i_j}$ for
some $i$, $i_j$; the nonlinearity of $f_i$ with respect to the generalized coordinate $q$ is then due to
the spatial transformation $T_i$ [33]. When the components of the coefficient matrices $K_i$
and $C_i$ become large, these force components may cause rapid decay or high frequency
oscillation in the solution of (1). The purpose of this article is to study these systems
and their numerical solution.
To demonstrate the problem of oscillation and the recent developments in this area,
we present two examples: a stiff pendulum and a 2D bushing problem. The former is
a very simple example of a type of system often seen in modeling molecular dynamic
systems, and the latter is a general form of modeling force devices in multibody
mechanical systems.
Stiff pendulum
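In Cartesian coordinates, a simple stiff pendulum model, with unit mass and gravity,
may be expressed as
$$0 = \dot{x} - u \qquad (3a)$$
$$0 = \dot{y} - v \qquad (3b)$$
$$0 = \dot{u} + x\lambda \qquad (3c)$$
$$0 = \dot{v} + y\lambda - 1.0 \qquad (3d)$$
$$\epsilon^2 \lambda = \frac{\sqrt{x^2 + y^2} - 1.0}{\sqrt{x^2 + y^2}} \qquad (3e)$$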
where the stiff spring of natural length 1.0 and stiffness $1/\epsilon^2$ is attached to the center of
mass of the pendulum. Preloading the spring by using $\epsilon = \sqrt{10^{-3}}$, the initial condition
(0.9, 0.1) of $(x, y)$ and the zero initial velocity of $(u, v)$, the results of the states
$(x, y, u, v)$ in the 0 to 10 second simulation are shown in Fig. 1. The corresponding
eigenvalues of the underlying ODE of (3), i.e., substituting (3e) into (3c, 3d), are
illustrated in Fig. 2, where the 3D figures contain all the eigenvalues on the complex
plane drawn along the time-axis. The dominant pair of eigenvalues in the example is
$\pm i/\epsilon$, as shown in Fig. 2. As $\epsilon \to 0$, the pair of eigenvalues approaches $\pm\infty$ along the
imaginary axis. The other pair of eigenvalues oscillates on the complex plane with the
amplitude and frequency approaching $\pm\infty$. Decreasing $\epsilon$ to $\sqrt{10^{-5}}$, the eigenvalues
of the underlying ODE of (3) are shown in Fig. 3. Compared to those in Fig. 2, the two
pairs of eigenvalues in Fig. 3 are 10 times the magnitude of those in Fig. 2, and the
oscillating pair increases its frequency in inverse proportion to the size of $\epsilon$.
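The scaling of the eigenvalues with the stiffness parameter can be checked numerically. The following is a minimal sketch, assuming the underlying ODE is obtained by eliminating the multiplier through a spring law of the form $\lambda = (r - 1)/(\epsilon^2 r)$ with $r = \sqrt{x^2 + y^2}$; the function names `pendulum_jacobian` and `dominant_frequency` are hypothetical and introduced only for this illustration.

```python
import math

import numpy as np


def pendulum_jacobian(x, y, eps):
    """Jacobian of the first-order system (x, y, u, v)' = (u, v, -x*lam, -y*lam + 1),
    assuming lam = (r - 1) / (eps**2 * r) with r = sqrt(x**2 + y**2)."""
    r = math.hypot(x, y)
    lam = (r - 1.0) / (eps**2 * r)
    k = 1.0 / (eps**2 * r**3)      # factor of d(lam)/dx = k * x and d(lam)/dy = k * y
    J = np.zeros((4, 4))
    J[0, 2] = 1.0                  # x' = u
    J[1, 3] = 1.0                  # y' = v
    J[2, 0] = -lam - k * x * x     # d(-x*lam)/dx
    J[2, 1] = -k * x * y           # d(-x*lam)/dy
    J[3, 0] = -k * x * y           # d(-y*lam)/dx
    J[3, 1] = -lam - k * y * y     # d(-y*lam)/dy
    return J


def dominant_frequency(x, y, eps):
    """Largest imaginary part (in magnitude) among the eigenvalues of the Jacobian."""
    eig = np.linalg.eigvals(pendulum_jacobian(x, y, eps))
    return float(np.max(np.abs(eig.imag)))


eps_coarse = math.sqrt(1.0e-3)
eps_fine = math.sqrt(1.0e-5)
f_coarse = dominant_frequency(0.9, 0.1, eps_coarse)
f_fine = dominant_frequency(0.9, 0.1, eps_fine)
print("dominant frequency * eps:", f_coarse * eps_coarse, f_fine * eps_fine)
print("ratio of dominant frequencies:", f_fine / f_coarse)
```

Under this assumption the radial mode of the Jacobian has eigenvalues $\pm i/\epsilon$ at every state, so reducing $\epsilon$ by a factor of 10 multiplies the dominant frequency by 10.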
[Figure 1: Stiff pendulum in Cartesian coordinates, states (x, y, u, v) versus time]
Lubich [20] shows that the numerical solution of a class of Runge-Kutta methods for
stiff mechanical systems with a strong potential energy, e.g., a stiff spring force as in
the stiff pendulum (3), converges to the slowly varying part of the solution, with the
stepsize independent of the parameter $\epsilon$ in (3). Reich [28] extends the principle of the
slow manifold to DAEs of MBS with highly oscillatory force terms [10, 1?]. Algebraic
constraints corresponding to the slow motion were introduced with a relaxation
parameter to preserve the slow solution while adding flexibility to it in the slow manifold
approach.

Figure 3: Eigenvalues of Stiff Pendulum in Cartesian Coordinate, epsilon = 10e-2.5
It is not clear that a slow solution appears in the above example. In fact, we can
only identify the slow solution of (3) using a proper nonlinear coordinate
transformation. In polar coordinates $(r, \theta)$, we obtain the equations of motion of (3):
$$0 = \dot{r} - z \qquad (4a)$$
$$0 = \dot{\theta} - \omega \qquad (4b)$$
where $(z, \omega)$ is the velocity. In the 0 to 10 second simulation, using the same initial
conditions as in the Cartesian coordinates, we obtain the solution in Fig. 4, where
the fast solution is $(r, z)$ and the slow solution is $(\theta, \omega)$. The eigenvalues along the
solution trajectory are presented in Fig. 5. Note that the dominant eigenvalues are the
same as those in the Cartesian coordinate formulation. This is because the coordinate
transformation, $x = r\cos\theta$, $y = r\sin\theta$, is linear with respect to the fast moving $r$. The
eigenvalues of (4) with $\epsilon = \sqrt{10^{-5}}$ are presented in Fig. 6. Similar to the comparison
in the Cartesian formulation, we obtain eigenvalues of 10 times the magnitude.
However, the eigenvalues corresponding to the slow motion have near zero imaginary
parts; therefore, the oscillations along the imaginary axis of eigenvalues 3, 4 in Fig. 6
remain insignificant.
[Figure 4: Stiff pendulum in polar coordinates, solution versus time]
Although there are ongoing developments to extend the results of Lubich to multi-stage
multistep methods [31], and impressive applications of the slow manifold technique
in some molecular dynamics models, it is not clear that these results apply
directly to all the types of oscillatory components in MBS. As indicated in [20], the
representation of stiff or oscillatory components in an appropriate coordinate system
of MBS is not always possible, i.e., the constraints associated with the stiff or
oscillatory potential force can be difficult to obtain in general. Moreover, convergence of
Newton's method may be an obstacle in obtaining an efficient numerical solution of an
oscillatory MBS in either of the above-mentioned approaches [20, 28].
Bushing force
We have been studying more general MBS with nonlinear oscillatory components such
as a bushing force, which is often used in modelling vehicle suspension systems.
Unlike the linear spring, this element is usually an anisotropic force, i.e., it has
different spring coefficients along the principal axes of the bushing local coordinate
frame. The bushing force between body-i and body-j may be defined using the
relative displacement $d_{ij}$, its time derivative $\dot{d}_{ij}$, and the relative angle $\theta_{ij}$ and its time
derivative $\dot{\theta}_{ij}$ of two body-fixed local coordinate frames at the bushing location on the two
bodies. Using the vectors $s'_i$ and $s'_j$ representing the bushing location in body-i's
and body-j's centroid local coordinate systems, respectively, we have
Xj
d ij =
+ Ai4 - Ajs'j (5)
A simple example may be obtained from this model using unit mass-inertia and
gravity, grounding the first body, and setting the bushing location on the second body
as s' = [-|, 0]. A bushing element with no damping attached at the global position
of [ i , 0 ] yields
0 = if+ * » ( » - 2 ^ ) + 1 (8b)
f l
-i ,.„ sin , ,l T cost?, cosf sinf?
It is easy to see from (8) that the local eigenstructure of the system may change
rapidly, depending on the size of the stiffness coefficients.
I
Usingtheinitialvaluesof(i, ,,fl) = (l.l,0.1.0.0)with(fc ,fcv,fc'') = (lOMoUO*),
!
the solution of (8) exhibits high frequency oscillation for all coordinates, as shown in
Fig. 7. Solving the eigenvalue problem of (8) at each time step yields three pairs as
illustrated in Fig. 8,
[Figure 7: Solution of the bushing problem (8)]
The bushing example represents a different type of oscillatory forcing function than
the stiff pendulum. For the bushing element, the coupling of translational and
rotational coordinates yields varying local eigenvectors in Cartesian coordinate space,
and two or more fast pairs of eigenvalues, see Fig. 8. The bushing force cannot
be represented as a simple potential force or as the constraint reaction force of
simple constraints, to which we could directly apply either the results of Lubich or
the slow manifold approach by Reich.
Many methods for efficient solution of oscillatory dynamic systems are predicated
on a nearly linear form of the equation. For example, the method of averaging [5]
requires the linear part of the oscillation equations of motion to be dominant, and
the mode-acceleration method [35] for structural dynamics, which eliminates higher
modes in the computation of the mode-displacement solution [40], is based on the
time-invariant eigenvalues of the structural dynamic equations. Our aim is to treat
the class of general nonlinear stiff and oscillatory forces represented in the MBS of (1).
One approach is based on the study of a class of MBS DAE solvers [39] and the energy
dissipative method [?3, 11], which may damp out the oscillation that is not important.
The other approach is to localize the oscillatory components, and then apply fast
numerical solution techniques to approximate the oscillation.

[Figure 8: Eigenvalues of the bushing problem (8)]
Given the possibility of a rapidly changing local eigenvalue structure, perhaps the
simplest strategy is to consider damping the oscillation whenever it is not impor-
tant via highly stable implicit numerical methods. Since the amount of damping is
controlled by the time-step, and automatic stepsize selection increases the time-step
whenever the solution is slowly-varying (i.e. if the amplitude of the oscillation is
small in comparison with the local error tolerances), the stepsize is increased when
the oscillation is no longer important.
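To illustrate how the time-step controls the amount of damping, the sketch below evaluates the amplification factor per step of backward Euler (the first-order BDF method) on the undamped test oscillation $y' = i\omega y$. The choice of backward Euler as the representative "highly stable implicit method" and the function name `backward_euler_amplification` are assumptions made for this illustration only.

```python
def backward_euler_amplification(omega, h):
    """Modulus of R(i*omega*h), where R(z) = 1 / (1 - z) is the stability
    function of backward Euler applied to y' = i*omega*y."""
    return abs(1.0 / (1.0 - 1j * omega * h))


omega = 1000.0  # hypothetical high frequency
for h in (1.0e-5, 1.0e-3, 1.0e-1):
    factor = backward_euler_amplification(omega, h)
    print(f"h = {h:g}: amplitude is multiplied by {factor:.6f} per step")
```

When the step resolves the oscillation ($\omega h \ll 1$) the factor is close to 1 and the oscillation is followed; once the step grows ($\omega h \gg 1$) the factor behaves like $1/(\omega h)$ and the oscillation is damped out within a few steps.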
In recent work [39] we have considered the solution of mechanical systems with high
frequency vibrations via this type of technique. In our initial experiments with the
bushing problem (8) solved directly by low-order BDF methods, we found that the
methods experienced severe problems with Newton convergence. To overcome these
problems, we proposed a coordinate-split (CS) formulation of the equations of
motion, and a Newton-type iteration for solving the coordinate-split equations at each
time step. The coordinate-split formulation eliminates problems due to obtaining an
accurate predictor for the Lagrange multiplier variables because these variables are
no longer present in the computation. We found that the coordinate-split
formulation worked well for several test problems involving mechanical systems with high
frequency oscillations. However, for problems with very high-frequency oscillations,
there are still difficulties with Newton convergence.
The Jacobian matrix for solving the nonlinear equations of the coordinate-split
formulation at each time step involves several terms which are complicated to compute
and which are small at the solution of the nonlinear system. These are terms of
second order which correspond to the derivative of the projection operator onto the
constraints. Away from the solution, these terms are highly oscillatory. By neglecting
these terms, we found that the resulting Newton-type method converged much faster
for oscillating test problems like the bushing problem. We called the resulting method
the modified coordinate-split, or CM, method.
$$g_1 = x_1 \qquad (9a)$$
$$g_2 = y_1 \qquad (9b)$$
$$g_3 = \theta_1 \qquad (9c)$$
$$g_4 = (x_1 - x_2)^2 + (y_1 - y_2)^2 - 1 \qquad (9d)$$
$$g_5 = \theta_2 \qquad (9e)$$
where $x_i$ and $y_i$, $i = 1, 2$, are the Cartesian coordinates of the center of mass of body $i$,
$\theta_i$ is the orientation coordinate of the body centroid reference coordinate system,
and the length of the pendulum is 1. Applying the bushing force (6) with $[k^x, k^y, k^\theta]$
= [1000,1000,1000] and $[c^x, c^y, c^\theta]$ = [10,10,10] to the pendulum, small oscillations of
the numerical solution appear. Using the initial values q = [0,0,0,9.9989e-1,-1.4852e-2,0]
and v = [0,0,0,-6.75e-5,-4.5444e-3], numerical results from the BDF code DASSL [27]
are contained in Table 1, in which error test failures (etfs) and convergence test
failures (ctfs) are listed. We denote by CS the coordinate-split formulation,
LG the stabilized index-2 formulation proposed by Gear [13], CM the coordinate-split
form using a modified iteration matrix with the second-order derivative terms
omitted, and LM the modified LG using the new predictor of the multipliers by the
CS method. Using simplified Newton iterations and the corresponding modified local
error estimate, CS, CM, LG and LM obtain consistent results.
To see the effect of more severe oscillation, we increased the spring constant of
the bushing to $10^5$. Time steps of these methods selected by DASSL are shown in
Figure 9. Clearly, CM took much larger steps than the other methods. Moreover,
if the spring constant is increased to $10^6$, we found severe convergence problems for
LG, CS and LM; the results are contained in Table 2.

Figure 9: Time Steps Used in Solving the Bushing Problem, Spring Constant = $10^5$
frequency components. Recently Cardona and Geradin [8] have considered extending
the $\alpha$-methods to second-order DAE systems from multibody dynamics; however, the
new methods are plagued by oscillations which we believe are largely unphysical.
We can extend the $\alpha$-methods to DAEs in a way which does not introduce
additional oscillations. Given a second-order ODE,
$$\ddot{y} = f(t, y, \dot{y}) \qquad (10)$$
the $\alpha$-method for this system is given by
$$a_{n+1} = (1 + \alpha) f(t_{n+1}, d_{n+1}, v_{n+1}) - \alpha f(t_n, d_n, v_n) \qquad (11a)$$
$$d_{n+1} = d_n + h v_n + h^2 \left[ \left(\tfrac{1}{2} - \beta\right) a_n + \beta a_{n+1} \right] \qquad (11b)$$
$$v_{n+1} = v_n + h \left[ (1 - \gamma) a_n + \gamma a_{n+1} \right] \qquad (11c)$$
where $\gamma = 1/2 - \alpha$, $\beta = (1 - \alpha)^2/4$, for $\alpha \in [-1/3, 0]$, and $d$, $v$, and $a$ are the position,
velocity and acceleration, respectively.
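A minimal sketch of the $\alpha$-method (11) may help to see its damping behaviour. It is written for the linear test oscillator $\ddot{y} = -\omega^2 y$, an assumption made here so that the implicit relation (11a) can be solved for $a_{n+1}$ in closed form; the function names `alpha_method_oscillator` and `energy` and the chosen parameter values are hypothetical.

```python
def alpha_method_oscillator(omega, h, n_steps, alpha, d0=1.0, v0=0.0):
    """Integrate y'' = -omega**2 * y with the alpha-method (11a)-(11c)."""
    beta = (1.0 - alpha) ** 2 / 4.0
    gamma = 0.5 - alpha
    w2 = omega ** 2
    d, v = d0, v0
    a = -w2 * d                      # consistent initial acceleration
    for _ in range(n_steps):
        # Part of (11b) that does not depend on a_{n+1}
        d_pred = d + h * v + h * h * (0.5 - beta) * a
        # (11a) with f = -w2 * d is linear in a_{n+1}; solve it directly
        a_new = (-(1.0 + alpha) * w2 * d_pred + alpha * w2 * d) / (
            1.0 + (1.0 + alpha) * w2 * h * h * beta
        )
        d_new = d_pred + h * h * beta * a_new                    # (11b)
        v_new = v + h * ((1.0 - gamma) * a + gamma * a_new)      # (11c)
        d, v, a = d_new, v_new, a_new
    return d, v


def energy(d, v, omega):
    """Energy of the linear oscillator."""
    return 0.5 * (v * v + omega ** 2 * d * d)


omega, h, n_steps = 100.0, 0.1, 200   # omega * h = 10: the step does not resolve the oscillation
e0 = energy(1.0, 0.0, omega)
for alpha in (0.0, -0.3):
    d, v = alpha_method_oscillator(omega, h, n_steps, alpha)
    print(f"alpha = {alpha}: energy ratio after {n_steps} steps = {energy(d, v, omega) / e0:.3e}")
```

For $\alpha = 0$ the scheme reduces to the trapezoidal rule, which preserves the energy of the linear oscillator, while a negative $\alpha$ damps the unresolved high-frequency component.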
$$M(y)\ddot{y} = f(t, y, \dot{y}) + G^T(y)\lambda \qquad (12a)$$
$$0 = g(y) \qquad (12b)$$
7
where G = dg/dy, we consider a class of methods of the following type,
M(d i)a
n + n + 1 = (l+c.)/(l i,d n + n + 1 ,u„ )-a/(t ,d ,j) )
+ 1 n n n (13 )
a
1
d n+l = d + hv +
n n h' [(^-0)a +0a ] n n+l
+ ^ G r ^±i_rA
f K + i ( 1 3 b )
T dn+l d
v a+1 = v + h[(l - ) a „ +
a 7 7 <VH] + hG ( + ")A n + 1 (13c)
9(<Wi) = 0 (13d)
T
G (d )v nirl = 0
n+1 (13e)
The new method projects the solution at the internal stages onto the constraints,
similarly to the Projected Implicit Runge-Kutta methods introduced in [2]. We note
that $a_{n+1}$ in (13) is not the acceleration. However, the actual accelerations can be
computed via a post-processing step if they are needed. Using the concept of the
essential underlying ODE introduced in [3], we can show that for linear model problems,
this method corresponds to discretizing the essential underlying ODE directly by the
$\alpha$-method, up to terms of order $O(h^3)$, and is second order for the position variables.
Further analysis of the method, testing, and extension of these ideas to higher order
methods remain to be done.
One approach to solving the high frequency oscillation problem is to carry out
modal analysis and then eliminate the higher modes, since lower modes may preserve
the slowly varying part of the solution [9]. For example, the extreme high modes of a
structure are often rejected in modeling flexible effects of mechanisms, since the details
of the oscillating solution are not so important as the long-term solution behavior. A
similar approach has been developed in recent work on molecular dynamics simulation
[42, 43]. However, due to nonlinear oscillatory forces in the multibody formulation, the
modal analysis needs to be carried out at each time step to resolve the rapidly varying
local eigenvalue structure of the system, resulting in very costly computations.
Another approach is to resolve the oscillation efficiently via look-up tables. This
idea is frequently used in a real-time simulation environment [12, 34]. One such example
is the modeling of contact compliance in rigid body simulation, where the localized
oscillation may be of interest. Applying linear constitutive laws to modeling the
contact compliance, e.g., the elastic half space theory, Boussinesq's influence
functions and Hertz' contact model, leads to linear spring forces between contact bodies
[38, ?, 2?]. The spring coefficients may be very large since the contact deformations are
small compared to the gross motion of the contacting bodies. The advantage of
using table look-up is efficiency; however, it is not clear how the variable stepsize and
order numerical integration should interact with the tables to maintain efficiency and
accuracy.
In numerical methods such as multistep or Runge-Kutta methods, which are based on
approximating the solution locally, the stepsize must be chosen very small to resolve
the high-frequency oscillation in the system. Moreover, due to the nonlinear
transformation that places oscillating components in the space of the generalized coordinates,
e.g., in the form of (1), the numerical method may become ineffective since the
eigenvalues may change rapidly, as shown in the previous examples. Our goal in treating
MBS with highly oscillatory components (1) is to develop numerical methods that
localize the oscillation and approximate the high frequency components properly.
It is possible to rewrite the equations of motion of the bushing force in a local
coordinate system so that the eigenvalue structure is nearly linear during simulation. We
propose to explore this idea in modeling and solving complex multibody dynamic
systems. By introducing virtual coordinates into the system equations, we may localize
the nonlinear oscillation terms. Using these new variables, i.e., the virtual coordinates,
we may approximate the oscillatory subsystems by linear differential equations in the
virtual coordinates.
Using the bushing problem as an example, we consider the local relative displace-
ment d = Ajdjj and the relative angle 0 = &ij that comprise another set of coordinates
T T T T y T
9 = [£>y.0] = [d,f?] , and denote the velocity by v = [Jl V,£)] , where V = [v',v ] .
The Newton-Euler equations corresponding to the bushing in the new coordinate
system become
Tj, = O + c"w.
The applied force excluding the bushing force can be expressed by
2
1
Q " 2myw - m i w — mgsinfi
•5(1,5) = —2ma:<D - myu + mgcos 1
and $s' = [s'_1, s'_2]^T$ is the bushing attachment point in the body-fixed centroid frame
of body $j$, and $\omega$ is the angular velocity. It is easy to see that the bushing force $f_j$ and
the applied torque of (14c) in the new coordinates become linear functions of $\tilde{q}$ and $\dot{\tilde{q}}$.
In the cases of high stiffness or oscillation, these linear terms are the dominant part
of the equations of motion (14). Note that the new coordinate frame is parallel to
the body-fixed centroid coordinate frame. The transformation between the Cartesian
coordinates $q$ and the new local coordinates $\tilde{q}$ can be defined by
-§ (15b)
(15c)
w = —w (15d)
where $v = [v^x, v^y, \omega]^T$ is the velocity of $q$, and $\tilde{v} = [\tilde{v}^x, \tilde{v}^y, \tilde{\omega}]^T$
is the velocity of $\tilde{q}$.
where the dynamics associated with the bushing force is approximated in the linear
differential equations (16d, 16e,16f), and corresponding constraints are (15).
= 0 (17a)
l - ^ q = 0 (17b)
into the system, we may rewrite each stiff or oscillatory force component as =
Ti(Ci§j+Aigi), for i € {!,...,»»/}. Denoting the forces by the virtual coordinates q =
T
[ffi, — <9nj\ and its velocity q = j j , we introduce the notion of virtual acceleration,
q = jg for some q(t) = rhq(t) + b(t), such that (1) may be reformulated as
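$$M(q)\ddot{q} + T(q)\ddot{\tilde{q}} + G^T(q)\lambda - Q^A(q, \dot{q}, t) = 0 \qquad (18a)$$
$$\ddot{\tilde{q}} + C\dot{\tilde{q}} + K\tilde{q} = 0 \qquad (18b)$$
$$g(q) = 0 \qquad (18c)$$
$$\tilde{q} - \eta(q) = 0, \qquad (18d)$$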
Miq)q + G (q)\-Q (q,q,t) = 0 (19a)
m = o (19b)
where
M = M{q) T(q)
0
A
Q = ' -Cq
A Q {q,q,*)
- Kq
9 = .' ? - if?)
m
Depending on the initial values, (19) and the new state variables $(q, \tilde{q})$ may be partitioned
into a stiff and a non-stiff part.
Our objective is to develop a method that takes large time-steps relative to the
high-frequency oscillations. To begin, notice that (18b) and (18d) define for each
q a highly oscillatory subsystem which can be solved exactly. However, if we were
to solve this subsystem exactly and substitute q into (18a), this is equivalent to a
rapidly vibrating force (a high-frequency forcing function) which would require small
time steps for its resolution. Instead, we can do a local eigenanalysis of (18b) to
identify the high-frequency modes. Letting these high frequencies tend to infinity, we
will replace q in (18a) by its average, yielding a smooth approximation for q.
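The remark that a linear oscillatory subsystem can be solved exactly may be illustrated as follows. The sketch assumes a scalar subsystem $\ddot{x} + c\dot{x} + kx = 0$ with constant coefficients and a diagonalizable companion matrix (that is, not critically damped); the function name `exact_linear_oscillator` and the numerical values are hypothetical.

```python
import numpy as np


def exact_linear_oscillator(k, c, x0, v0, t):
    """Exact state (x(t), x'(t)) of x'' + c*x' + k*x = 0 via an eigendecomposition
    of the companion matrix; assumes the matrix is diagonalizable."""
    A = np.array([[0.0, 1.0], [-k, -c]])
    lam, V = np.linalg.eig(A)
    coeff = np.linalg.solve(V, np.array([x0, v0], dtype=complex))
    state = V @ (coeff * np.exp(lam * t))   # V * exp(Lambda * t) * V^{-1} * initial state
    return state.real


# One large step over roughly five periods of a fast oscillation (omega = 100)
k, c = 1.0e4, 0.0
x, v = exact_linear_oscillator(k, c, 1.0, 0.0, 0.3)
print("x(0.3) =", x, " reference cos(30) =", np.cos(30.0))
```

The step length here is unrelated to the period of the oscillation, which is the sense in which such a subsystem can be advanced with large time-steps; how the result is averaged before it enters (18a) is a separate matter not covered by this sketch.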
For rigid body mechanisms, in most cases the stiff subsystems will be small.
For flexible body mechanical systems, the stiff subsystems arise as part of the finite
element method analysis, and will be much larger. These subsystems can be dealt
with using mode superposition via a reduced system described by the Ritz vectors or
the Lanczos vectors, as described in [36, 18, 41, 32, 24].
Yoshitane SHINOHARA,
Atsuhito KOHDA
and
Hitoshi IMAI
Department of Mathematics, Faculty of Engineering,
Tokushima University, Tokushima 770, Japan
Abstract
This paper is concerned with the existence and the uniqueness
of quasiperiodic solutions to quasiperiodic nonlinear differential equa-
tions in the neighborhood of quasiperiodic solution to linear differen-
tial equation or in the neighborhood of Galerkin approximation to the
nonlinear differential equations. By defining the generalized exponential
dichotomy, our Theorem 4 will be useful regardless of whether
the nonlinearity is weak or not.
Some numerical results are shown. These results show that our
analysis is useful for mathematical investigation of quasiperiodic phe-
nomena such as the design of communication circuits.
1 Introduction
The most fundamental problem in nonlinear oscillations is to find the peri-
odic or quasiperiodic solutions to the nonlinear ordinary differential equations
such as
$$\frac{d^2x}{dt^2} + \alpha\frac{dx}{dt} + \beta x + \gamma x^3 = P\cos\nu t, \tag{1}$$
and
$$\frac{d^2x}{dt^2} - 2\lambda(1 - x^2)\frac{dx}{dt} + x = a\cos\nu_1 t + b\cos\nu_2 t, \tag{3}$$
where $\alpha$, $\beta$, $\gamma$, $P$, $\nu$, $\lambda$, $a$, $b$, $\nu_1$ and $\nu_2$ are positive constants.
But, it is very difficult in general to find the exact solutions in analytical
form. Thus, we are obliged to study the solutions by numerical methods. As
for the periodic solutions to nonlinear periodic systems and also to nonlinear
1 8 M 1 5 2 1 9 I 0 1 3 , 2 2 2 3
autonomous systems we refer to the papers - ' - - - ' - -
From a practical viewpoint, a harmonic balance analysis of nonlinear
quasiperiodic microwave circuits has been given by Maas [5] in view of qualitative
applications, but he is concerned with neither the existence analysis nor
the error analysis.
Chua and Ushida [2] have presented two efficient algorithms for obtaining
steady-state solutions to nonlinear quasiperiodic circuits and systems driven
by two or more distinct frequency input signals. They have calculated some
approximate solutions to the Duffing type equation with two frequency inputs
and they have given error estimation, but they are not concerned with the
existence analysis of the exact solutions.
In the present paper, we will show that we can indeed verify the existence
of an exact solution and know the error bound of the approximate solution to
nonlinear quasiperiodic differential equation. By making use of the general-
ized exponential dichotomy, we will be able to strengthen the error estimation
of the approximate solutions.
Numerical examples concerned with the Duffing type equation are given.
$$f(t) = f_0(t, \ldots, t) \quad \text{for all } t \in \mathbb{R}, \tag{4}$$
$\omega_i$. Without any loss of generality, we may assume that $\omega_1, \ldots, \omega_m$ are all
positive and further that reciprocals of these periods are rationally linearly
independent (see Urabe [17]). A function $f(t)$ is said to be almost periodic if
from every sequence $\{\alpha_n\}$, one can extract a subsequence $\{\alpha'_n\}$ such that
$\{f(t + \alpha'_n)\}$ is uniformly convergent on $\mathbb{R}$. We assume that all functions
exists for any almost periodic function $f(t)$ and any real $\lambda$ and that there is a
countable set $\Lambda$ of real numbers such that $a(f,\lambda) = 0$ if $\lambda \notin \Lambda$. The module
of $f$, $\mathrm{Mod}(f)$, is defined to be the smallest additive group of real numbers
that contains the set $\Lambda$ for which $a(f,\lambda) \neq 0$ if $\lambda \in \Lambda$.
According to the results of the papers [3, 12] we have
Proposition 1 Let $\{f_n(t)\}$ $(n = 1, 2, \ldots)$ be a sequence of quasiperiodic
functions with periods $\omega_1, \ldots, \omega_m$ and let $f(t)$ be the uniform limit of $f_n(t)$
$$Lz = \frac{dz}{dt} - A(t)z, \tag{5}$$
Lz = 0 (6)
satisfying the initial condition $\Phi(0) = E$ (unit matrix).
The linear homogeneous Eq. (6) is said to satisfy a generalized exponential
dichotomy if there exist a projection $P$, positive constants $\sigma_1$, $\sigma_2$ and
nonnegative functions $C_1(t,s)$, $C_2(t,s)$ such that
(i) $\|\Phi(t)P\Phi^{-1}(s)\| \le C_1(t,s)\,e^{-\sigma_1(t-s)}$ for $t \ge s$,
(ii) $\|\Phi(t)(E-P)\Phi^{-1}(s)\| \le C_2(t,s)\,e^{-\sigma_2(s-t)}$ for $t < s$
$$Lz = f(t) \tag{7}$$
Proposition 3 (Shinohara et al. [12]) Let $A(t)$ be a quasiperiodic square matrix
with periods $\omega_1, \ldots, \omega_m$. Suppose that the Eq. (6) satisfies the generalized
exponential dichotomy. Then for any quasiperiodic function $f(t)$ with periods
$\omega_1, \ldots, \omega_m$ the inhomogeneous Eq. (7) has a unique quasiperiodic solution
$z(t)$ with the same periods given by
(9)
where
for t > s,
for t < s.
Moreover the solution z{t) satisfies the relation
11
where $z$ and $X(t, z)$ are vectors and $X(t, z)$ is quasiperiodic in $t$ with periods
$\omega_1, \ldots, \omega_m$ and is continuously differentiable with respect to $z$ belonging to a
region $\mathcal{D}$ of $z$-space.
Suppose that there is a quasiperiodic function $z_0(t)$ with periods $\omega_1, \ldots, \omega_m$
such that
$$z_0(t) \in \mathcal{D}, \qquad \left\| \frac{dz_0(t)}{dt} - X(t, z_0(t)) \right\| \le r$$
for all $t \in \mathbb{R}$. Further suppose that there are a positive number $\delta$, a nonnegative
number $\kappa < 1$ and a quasiperiodic matrix $A(t)$ with periods $\omega_1, \ldots, \omega_m$
such that
(i) the linear differential Eq. (6) satisfies a generalized exponential dichotomy,
$$\frac{Mr}{1 - \kappa} \le \delta.$$
Here $\Psi(t, z)$ is the Jacobian matrix of $X(t, z)$ with respect to $z$ and the quantity
$M$ is given in Eq. (10).
Then the given Eq. (11) possesses a solution $z = \hat z(t)$ quasiperiodic in $t$
with periods $\omega_1, \ldots, \omega_m$ such that
$$\|z_0(t) - \hat z(t)\| \le \frac{Mr}{1 - \kappa} \tag{12}$$
for all $t \in \mathbb{R}$. Furthermore, to the Eq. (11) there is no other quasiperiodic
solution belonging to $\mathcal{D}_\delta$ besides $z = \hat z(t)$.
where $\mu$, $\nu$ are constants such that $\nu > 0$, $\mu \neq 0$, and $f(t)$ is a quasiperiodic
function with periods $\omega_1, \ldots, \omega_m$. Putting $y = \dfrac{dx}{dt}$,
i = ^ + n « ) . (14)
$$Lz = \frac{dz}{dt} - Az, \tag{15}$$
then the fundamental matrix $\Phi(t)$ of the linear system $Lz = 0$ such that
$\Phi(0) = E$ is given by $\Phi(t) = \exp tA$, which will be called the matrizant of $L$. In
what follows, we denote by $\|\cdot\|$ the following $\ell_\infty$ norm of vectors and matrices:
(ii) when — f,
K 0 = K (t)
0 = m a x { l + \pt\ + \t\ , 1 + H +
0q = fi,
m=fi,
l l
where $\alpha = -\mu - \sqrt{\mu^2 - \nu^2}$, $\beta = -\mu + \sqrt{\mu^2 - \nu^2}$.
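The exponents above can be checked numerically. The following is a minimal sketch, assuming the linear part is $x'' + 2\mu x' + \nu^2 x = 0$, so that the two exponents are the roots of $\lambda^2 + 2\mu\lambda + \nu^2 = 0$. The function names `char_roots` and `decay_rate` and the sample parameter values are illustrative choices, not taken from the paper.

```python
import cmath


def char_roots(mu, nu):
    """Return the two roots of lambda^2 + 2*mu*lambda + nu^2 = 0.

    The roots are returned as complex numbers so that the oscillatory
    case |mu| < nu is handled by the same formula.
    """
    disc = cmath.sqrt(mu * mu - nu * nu)
    return -mu - disc, -mu + disc


def decay_rate(mu, nu):
    """Smallest |Re(lambda)| over the two roots (the slowest exponential rate)."""
    alpha, beta = char_roots(mu, nu)
    return min(abs(alpha.real), abs(beta.real))


if __name__ == "__main__":
    for mu, nu in [(0.125, 2 ** 0.5), (2.0, 1.0)]:
        alpha, beta = char_roots(mu, nu)
        print(f"mu={mu}, nu={nu}: alpha={alpha}, beta={beta}, rate={decay_rate(mu, nu)}")
```

In the overdamped case $\mu > \nu > 0$ the slowest rate is $|\beta| = \mu - \sqrt{\mu^2 - \nu^2}$, and in the oscillatory case $|\mu| < \nu$ both roots have real part $-\mu$, so the rate is $|\mu|$.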
From Eq. (18) and Proposition 2, the Eq. (6) satisfies the generalized
exponential dichotomy, because we can choose the matrix P such that
when p. < 0,
when fi> 0.
i
z = z (t) =
0 {x (t),y {t))
0 0 (19)
(20)
where the Green function G(t, s) is specified in the following two cases:
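Moreover, we have
$$\|G(t,s)\| \le K_0\, e^{-\sigma|t-s|}, \tag{21}$$
where
$$\sigma = \begin{cases} |\beta| & \text{when } |\mu| > \nu, \\ |\mu| & \text{when } |\mu| < \nu. \end{cases}$$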
Next, consider the Duffing type equation with quasiperiodic forcing term
such as
$$\frac{d^2x}{dt^2} + 2\mu\frac{dx}{dt} + \nu^2 x = \varepsilon x^3 + a\cos\nu_1 t + b\cos\nu_2 t, \tag{22}$$
where $\mu$, $\nu$, $\nu_1$ and $\nu_2$ are all positive constants, $\varepsilon$, $a$ and $b$ are parameters.
Further $\nu_1 = 2\pi/\omega_1$, $\nu_2 = 2\pi/\omega_2$ and the ratio $\omega_2/\omega_1$ is irrational.
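$$\frac{dz}{dt} = Az + \varphi(t) + \varepsilon\psi(z), \tag{23}$$
where
$$z = \begin{pmatrix} x \\ y \end{pmatrix}, \quad \varphi(t) = \begin{pmatrix} 0 \\ a\cos\nu_1 t + b\cos\nu_2 t \end{pmatrix}, \quad \psi(z) = \begin{pmatrix} 0 \\ x^3 \end{pmatrix}$$
and
$$A = \begin{pmatrix} 0 & 1 \\ -\nu^2 & -2\mu \end{pmatrix}.$$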
Let $L$ be the differential operator defined by
$$Lw = \frac{dw}{dt} - Aw, \tag{24}$$
then Theorem 5 tells us that the equation $Lw = 0$ defined by Eq. (24) satisfies
the generalized exponential dichotomy for $\mu \neq 0$ and that the linear operator
$G$ defined by $G\phi = w$, which means
$$\int_{-\infty}^{\infty} G(t,s)\phi(s)\,ds = w(t),$$
satisfies the inequality
$$\|G\| \le M, \tag{25}$$
where
K0
= if n > v > 0,
ft- vV -v 2
M - [' Koi^e^-Ua if (1 = ?,
J—oo
^ if o<n<v,
2 c o s + 2 2 s i n 1 / 2 ( 1
V - ^ + W ^ " # * ^
2 2 2 2 A 2
[v - fi )cosi/it + 2/j,ViSmi/it = \J(v - v?) + n v}sin^f + a*)
where 2
f -
By differentiation, we have
and
bu
lu (t)l < l^il + I iI
2 2 / 2 2 2
^[>2 - z, ) + 4 V f M v (f - /|)
1 + 4//V ' 2
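$$|x_0(t)|,\ |y_0(t)| \le K, \tag{29}$$
where
$$K = \max\left\{ \frac{|a|}{\sqrt{(\nu^2 - \nu_1^2)^2 + 4\mu^2\nu_1^2}} + \frac{|b|}{\sqrt{(\nu^2 - \nu_2^2)^2 + 4\mu^2\nu_2^2}},\ \frac{|a\nu_1|}{\sqrt{(\nu^2 - \nu_1^2)^2 + 4\mu^2\nu_1^2}} + \frac{|b\nu_2|}{\sqrt{(\nu^2 - \nu_2^2)^2 + 4\mu^2\nu_2^2}} \right\}.$$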
Using Eq. (29), we can estimate the residual function for $z_0(t)$ as follows:
$$\left\| \frac{dz_0(t)}{dt} - Az_0(t) - \varphi(t) - \varepsilon\psi(z_0(t)) \right\| = \|\varepsilon\psi(z_0(t))\| = |\varepsilon|\,|x_0^3(t)| \le |\varepsilon|K^3.$$
Let = {z\ \\z\\ < 2K}, V = U { z ; \\z - zo{t)\\ < K}. I t is clear that
( 6 R
f T
Mt) e T^k o any t e R and V c T> . k
Let us denote the Jacobian matrix of the right-hand side of Eq. (23) with
respect to $z$ by $\Psi(z)$. Then we have the inequality
$$\|\Psi(z) - A\| = |3\varepsilon x^2| = 3|\varepsilon|\,|x|^2 \le 12|\varepsilon|K^2. \tag{31}$$
$$12|\varepsilon|K^2 \le \frac{\kappa}{M} \quad \text{and} \quad |\varepsilon|K^3 M \le (1 - \kappa)K.$$
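From the inequalities $12|\varepsilon|K^2M \le \kappa < 1$ and $|\varepsilon|K^2M \le 1 - \kappa$, we have the
inequalities $12|\varepsilon|K^2M \le \kappa \le 1 - |\varepsilon|K^2M$ and $13|\varepsilon|K^2M \le 1$. Hence we
have
$$|\varepsilon| \le \frac{1}{13K^2M} \tag{32}$$
or
$$K \le \frac{1}{\sqrt{13|\varepsilon|M}}. \tag{33}$$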
Consequently, we have the following existence theorem of a quasiperiodic
solution to the quasiperiodic Duffing type equation.
$$\|\hat z(t) - z_0(t)\| \le K \tag{34}$$
for all $t \in \mathbb{R}$.
If the inequality (32) or (33) does not hold, or the error estimation Eq. (34)
is too crude, then we should compute a more accurate approximation than
$z_0(t)$. For this purpose, we have considered an approximate quasiperiodic
solution written in the form
$$x_m(t) = \alpha(0,0) + \sum_{r=1}^{m+1} \sum_{|p|=r} \{\alpha_p\cos(p,\nu)t + \beta_p\sin(p,\nu)t\}, \qquad y_m(t) = \frac{dx_m(t)}{dt},$$
where $(p,\nu) = p_1\nu_1 + p_2\nu_2$, $|p| = |p_1| + |p_2|$, and we have determined the
unknown coefficients $\alpha(0,0)$, $\alpha_p$, $\beta_p$ by means of the Galerkin method.
$$r(t) = \frac{d^2x_m(t)}{dt^2} + 2\mu\frac{dx_m(t)}{dt} + \nu^2 x_m(t) - \varepsilon x_m^3(t) - a\cos\nu_1 t - b\cos\nu_2 t$$
3(m+l)
r (i) = /(0,0)+ E E i / p ^ P ' ^ ' + ^sinfp,!/)!},
r=l |p|=r
S(m+1)
35
r = |/(0,0)|+ E EilAI+lftU. < >
r=l |p|=r
then we have $|r(t)| \le r$ for all $t \in \mathbb{R}$. Define
$$\Omega = |\alpha(0,0)| + \sum_{r=1}^{m+1} \sum_{|p|=r} \{|\alpha_p| + |\beta_p|\} \tag{36}$$
and
m+1
t
For $z$ which lies in the $\delta$-neighborhood of $z_m(t) = (x_m(t), y_m(t))^T$ we have
$$\|\Psi(z) - A\| \le 3|\varepsilon|(\Omega + \delta)^2.$$
\ u t ) - m w < ^ - .
that is,
1 — K
Mr
\X {t)-x(t)\,
n
<
for all t € R.
4 Numerical Results
We shall consider the Duffing type Eq. (22) with $\nu = \sqrt{2}$, $\nu_1 = 1$, $\nu_2 = \sqrt{5}$.
$$K = \max\left( \frac{4}{8\sqrt{17}} + \frac{4}{2\sqrt{149}},\ \frac{4}{8\sqrt{17}} + \frac{4\sqrt{5}}{2\sqrt{149}} \right) = 0.4876394$$
and
= 2 5
/l3|e|M 1 3 x A x 19.389598 ™ -
Since the inequality (33) does not hold, we cannot know whether the exact
quasiperiodic solution exists in the neighborhood of $x_0(t)$. Thus we make
use of the Galerkin method. After 3 iterations starting with $x_0(t)$, we have
x {t)
8 = 2{0.0589905cos v t + 0.0147061 sin i ^ t
t
-0.0000016cos3i/it + 0.0000015sin 3 ^ *
+0.0000069cos(2^i + v )t - 0.0000026sin(2i' + v )t
2 1 2
-0.0000116cos(fi - 1v )i + 0.0000076sin(i/! - 2v )t
2 2
By Eq. (35) and Eq. (36), we take $r = 2.0 \times 10^{-9}$ and $\Omega = 0.3385963$. If we
take $\delta = K = 0.4876394$, then we have
and
« > 0.0639999M = 0.1515---,
Mr 4 8 4 X 1 0 9 9 B
= ' < 5.761905 x 10" < 0.58 x 10" < 6.
1 - K 0.84
Thus, we can assure that the exact quasiperiodic solution $\hat x(t)$ exists in
the $\delta$-neighborhood of the Galerkin approximation (38) and we have an error
estimation of $x_8(t)$ as
$$|x_8(t) - \hat x(t)| \le 0.58 \times 10^{-8}. \tag{39}$$
Remark that Galerkin approximation (38) is almost the same as the corresponding
result in the paper [6], but the inequality (39) is strengthened, because
of using the generalized exponential dichotomy in Theorem 5.
As for the case $\mu = 1/8$, $\varepsilon = 1/64$, $a = 1/8$, $b = 1/2$, we have
$$\frac{1}{\sqrt{13|\varepsilon|M}} = \frac{1}{\sqrt{13 \times \frac{1}{64} \times 19.389598}} = 0.5038878.$$
Hence the inequality (33) holds. Accordingly, Theorem 6 is valid but the
error estimation
$$\|z_0(t) - \hat z(t)\| \le K = 0.4876394 \tag{40}$$
is very crude, where
x {t)
s = 2{0.0589069cosi/ ( + 0.0147504sini/ f 1 1
-0.0805038COSJ/ E + 0.0149944sini^ 2
-0.0000008cos3i/it - 0.0000006sin3i'ii
+0.0000034 cos(2i/! + v )t + 0.0000008 s i n ( 2 ^ + v^i
2
-0.0000233cos(2^ - v )t - 0.0000175sin(2vi - 2
-0.0000058cos(i/i - 2v )t - 0.0000049 s i n ^ i -
2 1v )t
2
+0.0000003cos3i/ * - 0.0000002sin3^
2
By Eq. (35) and Eq. (36), we take $r = 0.15 \times 10^{-7}$ and $\Omega = 0.3384327$. If we
take $\delta = K = 0.4876394$, then we have
and
$$\kappa \ge 0.0319872\,M = 0.6202189,$$
where $M = K_0/\mu = 19.389598$. Hence we can choose $\kappa$ as 0.63, then we have
$$\frac{Mr}{1 - \kappa} = \frac{2.9084397 \times 10^{-7}}{0.37} < 7.8606478 \times 10^{-7} < 0.8 \times 10^{-6}.$$
From the above calculation, we have an error estimation
$$\|z_8(t) - \hat z(t)\| \le 0.8 \times 10^{-6}$$
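The arithmetic of this second case can be re-checked with a few lines of code. The sketch below is only a plausibility check under stated assumptions: the frequencies $\nu = \sqrt{2}$, $\nu_1 = 1$, $\nu_2 = \sqrt{5}$ are read off from the partly legible parameter line of this section, $K$ is taken to be the maximum of the amplitude bounds for $x_0$ and $y_0$, and the values of $M$, $\Omega$ and $r$ are the ones quoted in the text. All function and constant names are hypothetical.

```python
import math

# Parameters quoted in the text for the second case.
MU, EPS, A_AMP, B_AMP = 1 / 8, 1 / 64, 1 / 8, 1 / 2
# Frequencies: an assumption, inferred from the partly legible parameter line.
NU, NU1, NU2 = math.sqrt(2), 1.0, math.sqrt(5)
M = 19.389598       # M = K0 / mu, as quoted
OMEGA = 0.3384327   # coefficient bound of the Galerkin approximation, Eq. (36)
R = 0.15e-7         # residual bound, Eq. (35)


def amplitude(coeff, nu_j):
    """Amplitude of the linear response to coeff * cos(nu_j * t)."""
    return abs(coeff) / math.sqrt((NU**2 - nu_j**2) ** 2 + 4 * MU**2 * nu_j**2)


def bound_K():
    """Common bound K for |x0(t)| and |y0(t)| (assumed form of the bound)."""
    x_bound = amplitude(A_AMP, NU1) + amplitude(B_AMP, NU2)
    y_bound = NU1 * amplitude(A_AMP, NU1) + NU2 * amplitude(B_AMP, NU2)
    return max(x_bound, y_bound)


def check(kappa=0.63):
    """Return (K, threshold of (33), lower bound for kappa, error bound)."""
    K = bound_K()
    threshold = 1 / math.sqrt(13 * EPS * M)       # right-hand side of (33)
    kappa_min = 3 * EPS * (OMEGA + K) ** 2 * M    # with delta = K
    error = M * R / (1 - kappa)                   # M r / (1 - kappa)
    return K, threshold, kappa_min, error


if __name__ == "__main__":
    K, threshold, kappa_min, error = check()
    print(f"K = {K:.7f}, 1/sqrt(13|eps|M) = {threshold:.7f}")
    print(f"kappa must exceed {kappa_min:.7f}; error bound = {error:.7e}")
```

Under these assumptions the computed $K$ and threshold agree with the quoted 0.4876394 and 0.5038878 to about six digits, and the choice $\kappa = 0.63$ leaves the error bound below $0.8 \times 10^{-6}$.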
References
1. R. Bouc. Sur la méthode de Galerkin-Urabe pour les systèmes
différentiels périodiques. Intern. J. Non-Linear Mech., 7:175-188, 1972.
Toshiyuki Koto
Department of Computer Science and Information Mathematics
The University of Electro-Communications
1-5-1, Chofugaoka, Chofu, Tokyo 182, Japan
Abstract
A natural Runge-Kutta (RK) method is a RK method which has a special
continuous extension. Any one-step collocation method is equivalent to one of
such methods. In this paper, we consider the application of natural RK methods
to delay differential equations (DDEs) which have a constant delay, and discuss
their numerical stability applied to several types of test equations whose zero
solution is stable for arbitrary value of the delay. As a result, we show that an
A-stable method preserves the asymptotic property of the analytical solution
of a DDE coupled with a difference equation (i.e., a delay-differential-algebraic
equation).
1. Introduction
2,3,6 ,0 4,16 20
Many authors > '">' .". have discussed stability properties of numerical
methods for delay differential equations (DDEs) based on the scalar test equation
$$|b| < -\Re a \tag{2}$$
and $\tau$ is a positive constant. Because of the condition (2), Eq. (1) has a special
asymptotic property; its zero solution is asymptotically stable for any value of $\tau$. By
this property, certain stability regions (P-stability regions) of numerical methods can
be defined for (1) in the same way as the standard stability regions are defined for
the Dahlquist test equation, i.e., Eq. (1) without the term $bu(t - \tau)$. Moreover, an
analogy of A-stability can be considered regarding DDEs based on (1).
However, differently from the standard case, we can not reason that a stable
method for (1) is also useful to a system of DDEs even if it is linear and with constant
-(if !)•
We assume that some conditions are satisfied for the zero solution of (3) to be stable
for any value of T, and discuss stability properties of natural Runge-Kutta (RK)
30
methods , a special type of RK methods applied to (3). It should be noted that
(3) is a DDE coupled with a difference equation if d\ < d. Such equations are called
delay-differential-algebraic equation (delay DAE), and a systematic research on their
1
numerical treatment has been carried out by Ascher and Petzold
A solution of a DDE with constant delays is called absolutely stable if it is stable for
6
arbitrary values of the delays After this usage, we will say Eq. (3) to be absolutely
stable if its zero solution is so.
2. Preliminaries
In order to apply a RK method to DDEs, we need an approximation of their
19
retarded parts. We use a natural continuous extension (NCE) of the RK method
for this purpose. Moreover, we consider an aligned mesh, i.e., a mesh of the form
where $k$ denotes a positive integer. Then, a RK method applied to (3) is (at least
formally) written in the form
$$E K_{i,n} = L\Bigl(u_n + h\sum_{j=1}^{s} a_{ij}K_{j,n}\Bigr) + M\Bigl(u_{n-k} + h\sum_{j=1}^{s} b_j(c_i)K_{j,n-k}\Bigr), \quad i = 1, 2, \ldots, s, \tag{5.a}$$
$$u_{n+1} = u_n + h\sum_{i=1}^{s} b_i K_{i,n}, \tag{5.b}$$
where $u_n$ denotes an approximate value to $u(t_n)$, $a_{ij}$, $b_i$, $c_i\ (= \sum_{j=1}^{s} a_{ij})$ are the
parameters of the RK method and $b_i(\theta)$'s are polynomials which satisfy certain conditions.
We also write
Even in the case $d_1 < d$, the numerical solution of (3) is computed by (5) if $A$ is
invertible and (3) satisfies some proper condition [1].
20
Concerning general RK methods, Zennaro clarified their stability for the scalar
3
test equation (1). He has developed a technique to find the P-stability region of a
RK method, i.e., the set of the pairs of complex numbers $(\alpha, \beta)$, $\alpha = ah$, $\beta = bh$,
such that the numerical solution of (1) vanishes as $n \to \infty$.
For example, the interior of the P-stability region of the Euler method is
$$\{(\alpha, \beta) \in \mathbb{C}^2 : |1 + \alpha| + |\beta| < 1\},$$
where
2+a 2+ a
2
1 +a -1 1 + Bf P - 1
1 5
when | I + « 1, w„ = —&a when | 1 + a |= 1. Figure l shows the P-stability
regions of well-known RK methods when $\alpha$ and $\beta$ are real; the horizontal line is
denoted as $\alpha$-axis and the vertical line $\beta$-axis. A RK method may have several NCEs.
The Heun and the classical RK method have two NCEs, but their P-stability regions
do not depend on the choice of the NCEs. Kutta's 3rd-order method has infinitely
many NCEs; its P-stability region in Figure 1 is that for an NCE furnished by a
19
theorem (p. 124, Theorem 7). These are obtained by Zennaro's technique, but the
derivation of the regions needs rather complicated computation although only quite
fundamental methods are considered.
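For the Euler method the region can be explored directly. On an aligned mesh $h = \tau/k$, the Euler method applied to the scalar test equation reduces to the recursion $u_{n+1} = (1 + \alpha)u_n + \beta u_{n-k}$ with $\alpha = ah$, $\beta = bh$. The sketch below iterates this recursion from a constant initial function; the function names and the sample values of $\alpha$, $\beta$, $k$ are illustrative assumptions.

```python
def in_euler_region(alpha, beta):
    """Membership test for the set |1 + alpha| + |beta| < 1."""
    return abs(1 + alpha) + abs(beta) < 1


def euler_dde(alpha, beta, k, n_steps, history=1.0):
    """Iterate u[n+1] = (1 + alpha) * u[n] + beta * u[n - k].

    The initial function is constant (`history`) on the k + 1 mesh points
    t_{-k}, ..., t_0.  Returns the list [u_0, u_1, ..., u_{n_steps}].
    """
    u = [complex(history)] * (k + 1)          # u_{-k}, ..., u_0
    for _ in range(n_steps):
        u.append((1 + alpha) * u[-1] + beta * u[-1 - k])
    return u[k:]


if __name__ == "__main__":
    inside = euler_dde(-0.5, 0.3, k=4, n_steps=200)
    outside = euler_dde(0.1, 0.2, k=4, n_steps=200)
    print(in_euler_region(-0.5, 0.3), abs(inside[-1]))
    print(in_euler_region(0.1, 0.2), abs(outside[-1]))
```

When $|1 + \alpha| + |\beta| < 1$, each new value is bounded by a fixed fraction of the largest of the previous $k + 1$ values, so the numerical solution decays for every $k$; this is why the condition does not involve the delay.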
If $B = A$, i.e., $b_j(c_i) = a_{ij}$, the RK method is said to be natural [10]. As a
fundamental result on natural RK methods, it is known that any one-step collocation
method for DDEs is equivalent to a natural RK method determined by
$$a_{ij} = \int_0^{c_i} \ell_j(\theta)\,d\theta, \quad b_j(\theta) = \int_0^{\theta} \ell_j(\xi)\,d\xi, \quad b_j = \int_0^{1} \ell_j(\theta)\,d\theta, \tag{6}$$
where $\ell_j(\theta)$'s are the basis polynomials of the Lagrange interpolation for the
collocation points $c_1, c_2, \ldots, c_s$. In particular, the class of natural RK methods
Fig. 1. Real P-stability regions of the Euler, Heun, (3-stage 3rd-order) Kutta, classical Runge-Kutta
methods (in order of small-to-large).
As for natural RK methods, we can derive some stability properties for DDEs
from those for ordinary differential equations. For example, if a natural RK method
is A-stable, then it is also P-stable, i.e., its P-stability region includes the domain
10
$\{(\alpha, \beta) \in \mathbb{C}^2 : |\beta| < -\Re\alpha\}$. It was also proved by Zennaro. We now introduce two
symbols which will be used in the following sections. Let r(z) be the stability function
of a RK method, i.e.,
$$r(z) = 1 + z\,b^T (I_s - zA)^{-1} e, \qquad e = (1\ 1\ \cdots\ 1)^T. \tag{7}$$
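To make the stability function concrete, the following sketch evaluates $r(z) = 1 + z\,b^T(I_s - zA)^{-1}e$ for a given Butcher tableau by solving one small linear system. It is a plain-Python illustration rather than anything used in the paper: the helper names `solve` and `stability_function` are hypothetical, and the two-stage Gauss coefficients are the standard ones, chosen here as an example of a collocation method.

```python
import math


def solve(mat, rhs):
    """Solve mat x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [[complex(v) for v in row] + [complex(r)] for row, r in zip(mat, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(aug[i][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for i in range(col + 1, n):
            factor = aug[i][col] / aug[col][col]
            for j in range(col, n + 1):
                aug[i][j] -= factor * aug[col][j]
    x = [0j] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def stability_function(A, b, z):
    """Evaluate r(z) = 1 + z b^T (I_s - zA)^{-1} e for coefficients A, b."""
    s = len(b)
    mat = [[(1 if i == j else 0) - z * A[i][j] for j in range(s)] for i in range(s)]
    x = solve(mat, [1] * s)                    # x = (I_s - zA)^{-1} e
    return 1 + z * sum(bi * xi for bi, xi in zip(b, x))


# Two-stage Gauss method (standard coefficients).
SQ3 = math.sqrt(3)
GAUSS2_A = [[1 / 4, 1 / 4 - SQ3 / 6], [1 / 4 + SQ3 / 6, 1 / 4]]
GAUSS2_B = [1 / 2, 1 / 2]

if __name__ == "__main__":
    for z in (-1 + 2j, -10 + 0j, -0.1 + 50j):
        print(z, abs(stability_function(GAUSS2_A, GAUSS2_B, z)))
```

For the two-stage Gauss method the result coincides with the (2,2) Padé approximation of $e^z$, and $|r(z)| \le 1$ at the sampled points of the left half-plane, as expected for an A-stable method.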
$$u'(t) = Lu(t) + Mu(t - \tau). \tag{9}$$
In this case, the asymptotic behavior of the solution is well known. The zero solution
of (9) is asymptotically stable if and only if
We assume that this condition is satisfied for any $\tau > 0$, and consider the
application of a natural RK method to (9). Then, we can show the following theorem [13].
Theorem 1 Assume that the natural RK method is A-stable and that all eigenvalues
of the matrix A have nonnegative real parts. Then, the numerical solution of (9) tends
to zero as $n \to \infty$ for any $k$ and any initial function.
9
Theorem 1 was originally proved using a theorem by in 't Hout , but a more
flexible proof without the theorem is also possible. In the following, we describe
another proof, whose fundamental idea is also valid to the case of delay DAEs, or
13
neutral DDEs considered, e.g., by Kuang, Xiang and Tian
(A2) $\Re z < 0$ for any $z \in \sigma[L + M]$,
where $\sigma[X]$ denotes the spectrum of the square matrix $X$. To the contrary, these two
conditions imply (A) for any $\tau$. In fact, it is also easy to see that the condition (A1)
in the following proposition, together with (A2), implies (A).
(A1) $P(z, \zeta) \neq 0$ for any $z$ and $\zeta$ such that $\Re z \ge 0$, $z \neq 0$ and $|\zeta| \le 1$.
7 ( y ) = min{|C|:P(t!,,O = 0}
satisfies 7(^0) < 1- Moreover, i(y) is a continuous function, and it is easily shown
that 7(3/) > 1 if | y | is sufficiently large since the set {a[L + (M\ : f e C , | ( |< 1} is
bounded. Thus, 7(1/1) = 1 for some y% ^ 0. However, this implies P(*Jfi,Ci) — 0 f ° r
for some zi, Ci such that 9?z > 0 and ] C2 [< 1; let g(6), 0 < 8 < 1, be a continuous
2
x W = mw{S*:P(*,j(0)) = O}
above. Therefore, v(0.) = 0 for some 8, with 0 < 8. < 1. However, this implies
P(z.< g{$.)) = 0 for some z. with £ 2 , = 0, which contradicts (10) since | g(8.) | < 1
and z. / Oby (11). Q. E. D .
» = l,2,...,s, (12-a)
u« =u +l n + h-£kiK^ (12.b)
where
T
t/„ = MfW.i,i ••• u ) ,
n
fr+1 k
det [A Z.i - X L 0 - A Mi - M ] = 0 0 (14)
By the condition ( A i ) , the set ff[Z ] \ (0} is included in the Left half complex
x
<r(r(2 )J = r(<r[2 ])
A i (15)
by the Spectral Mapping Theorem [18]. Since the RK method is A-stable, $|\lambda| < 1$ or
$\lambda = 1$. However, if $\lambda = 1$, then $0 \in \sigma[h(L + M)]$ by (15); this is impossible by (A2).
Q. E. D.
or
$$\begin{aligned} x'(t) &= L_{11}x(t) + L_{12}y(t) + M_{11}x(t - \tau) + M_{12}y(t - \tau), \\ 0 &= L_{21}x(t) + L_{22}y(t) + M_{21}x(t - \tau) + M_{22}y(t - \tau), \end{aligned} \tag{16$'$}$$
where $L_{ij}$, $M_{ij}$ denote $d_i \times d_j$ matrices. We assume that $L_{22}$ is invertible. Then, (16)
can be solved for any initial function which satisfies
In fact, on each interval of the form $[(m - 1)\tau, m\tau]$, $m = 1, 2, \ldots$, (16) is considered
as a DAE if $u(t - \tau)$ is known. We can solve (16) by almost the same method as the
step method for usual DDEs.
Also in this case, the condition (A) is necessary for the zero solution to be
asymptotically stable; if (A) is not satisfied, there is a solution of the form $\exp(zt)\,u_0$ which
does not converge to zero. However, we can not expect that (A) is also a sufficient
condition; we obtain a neutral DDE by differentiating the second equation in (16)',
but (A) is not always a sufficient condition for asymptotic stability in neutral DDEs.
In this paper, we consider the following condition (B) as a sufficient condition for the
zero solution of (16) to be asymptotically stable. We will prove that (B) is indeed a
sufficient condition in Appendix.
(B) there is a $\delta > 0$ such that $P(z, \exp(-\tau z)) \neq 0$ for any $\Re z > -\delta$.
We assume that (B) is satisfied for any $\tau > 0$. Then, we have the following
theorem.
Theorem 2 Assume that the natural RK method is A-stable and that all eigenvalues
of the matrix A have nonnegative real parts. Then, the numerical solution of (16)
never diverges for any $k$ and any initial function. Moreover, if $|r(\infty)| < 1$, the
solution tends to zero as $n \to \infty$, where $r(\infty) = 1 - b^T A^{-1} e$.
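The quantity $r(\infty) = 1 - b^T A^{-1} e$ in Theorem 2 is easy to tabulate for familiar implicit methods. The sketch below does so for four standard collocation-type tableaux; it is an illustration only, and the function name `r_at_infinity` and the selection of methods are assumptions made for this example.

```python
import math


def r_at_infinity(A, b):
    """Return 1 - b^T A^{-1} e, computed by solving A x = e (A must be invertible)."""
    s = len(b)
    aug = [list(map(float, row)) + [1.0] for row in A]
    for col in range(s):
        pivot = max(range(col, s), key=lambda i: abs(aug[i][col]))
        if abs(aug[pivot][col]) < 1e-14:
            raise ValueError("A is singular; the formula for r(infinity) does not apply")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for i in range(col + 1, s):
            factor = aug[i][col] / aug[col][col]
            for j in range(col, s + 1):
                aug[i][j] -= factor * aug[col][j]
    x = [0.0] * s
    for i in reversed(range(s)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, s))
        x[i] = (aug[i][s] - tail) / aug[i][i]
    return 1.0 - sum(bi * xi for bi, xi in zip(b, x))


SQ3 = math.sqrt(3)
METHODS = {
    "backward Euler": ([[1.0]], [1.0]),
    "implicit midpoint": ([[0.5]], [1.0]),
    "Radau IIA, 2 stages": ([[5 / 12, -1 / 12], [3 / 4, 1 / 4]], [3 / 4, 1 / 4]),
    "Gauss, 2 stages": ([[1 / 4, 1 / 4 - SQ3 / 6], [1 / 4 + SQ3 / 6, 1 / 4]], [1 / 2, 1 / 2]),
}

if __name__ == "__main__":
    for name, (A, b) in METHODS.items():
        print(f"{name}: r(inf) = {r_at_infinity(A, b):+.6f}")
```

Backward Euler and Radau IIA give $r(\infty) = 0$, so the stronger conclusion of the theorem applies to them, while the implicit midpoint rule and the two-stage Gauss method have $|r(\infty)| = 1$ and fall under the weaker "never diverges" statement.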
In the following, we will describe the proof of Theorem 2 along the same line as
the proof of Theorem 1. The same results have been obtained for higher index delay
DAEs, e.g., index 2 equations of the form
F - ( h 0\ _ ( L u L 1 2 \ _ ( M u Mt2
i.e.,
*'(') = W W + L.2V(t) + Mux{t - r) + M y(t - T% l2
(18)'
0 = L x{t)
21 +M x{i-r),
21
where L L \ is invertible. It is also proved along the same line although more com-
2 I 2
(B )
0 det[i; (C)] / 0 for any C with | £ |< 1
2
(B )
2 P(z, 1) ^ 0 for any z with »z > 0.
If (B ) is satisfied
0
{ 0 -^(o- 1
) { -mi) -L- (O
2
=
imf*m) £ ) • w =
^ ( o - i ; 2 « ) ^ ( o - i
^ 1 « ) .
(Bi) det [s(84 - Q{()\ / 0 for any j f f i , j , / 0 and any < with | ( | = 1;
By the same argument as in the proof of Proposition 1, we obtain the following propo-
sition. It is also shown that ( B ) , ( B j ) and ( B ) imply ( B ) for any r . Consequently,
0 2
(Bj) det [zl di - Q(Q] / 0 for any z and C a.t. 9tz > 0, z / 0 and \C\< I .
In order to prove Theorem 2 we prepare a lemma. This lemma shows the
solvability of Eq. (5.a) as a special case.
Lemma 1 Let P be a permutation matrix such that
T
P(X 1 % X 2 Y • - • X. Y,) = ( X , X
2 2 • • • X, % Y • - • 2 Y.f,
where A', and Y< denote di- and d -dimensional vectors, respectively. If A is invertible
2
1 1
(I ®E-hA®{L
s + CM))" = P" (c3(C)-'i*(C)) P, (19)
wkt
w l
' \hA®L' {Q- L M) 22 2 hA®I J' d
L {
° - ( 0 -I,®LUQ-> j '
Proof Since (Bo) is satisfied, we obtain
1
L*(C)P (/. ® E - hA ® (L + CM)) P " = 0(C). (20)
Moreover, since
i=i
where a,'s are the eigenvalues of A, det [I, , — hA® Q{()] / f°r C h | C |5 1 d
0 w i t
by ( B i ) . Thus, the matrices Q{(), I, ® E - hA®{L + (M) are invertible, and (19)
follows from (20). Q. E. D.
where
, _ ( h®E-h{A®L) 0
L l 7
~ { -b ®I d u
and the other symbols are the same that appear in the proof of Theorem 1.
If A f 0 and
h
d e t [ j , ®E-hA®(L + \- M)] / 0,
simple computation shows that
A i , - l a - ^ " W t - \-"M 0 (23)
k
X{\- ) 0 ) ( fc
X ( A - ) - ' [he®(L + k
X- M)\
T k
-b ®f d h ) \ 0 \h-R{\- )
X « ) = / , ® E - hA ® {L + (M),
T 1
R{() = h + (b ® I ) [I,®E-hA®(L d + (M)}- [he ®(L + (M)}.
Moreover, using (19), we obtain
R ( C ) _ ( r(hQ(0) 0 \ m
R{
<>-{ Y(0 r H / J ' ( 2 4 )
the absolute values of all roots of (22) except $\lambda = r(\infty)$ are less than 1; $r(\infty)$ is a
root whose absolute value is 1 and its multiplicity is $d_2$ by (23), (24) (and (B2) when
$r(\infty) = 1$). However, it is easy to see that there are $d_2$ linearly independent vectors
which satisfy
+ 1
( A * £ , - A * I - A Mi - M ) U = 0 0 0
for $\lambda = r(\infty)$. Thus, the solution of (21) does not include a diverging component for
any initial value. This completes the proof of Theorem 2. Q. E. D.
References
423-438 (1988).
Appendix A
We here show that the condition (B) implies the asymptotic stability of the zero
solution of (16). By differentiating the second equation in (16)' we obtain
~L u'(t)
0 + M u'(t -r)
0 = l u{t)
x + Mm{t - % T (25)
where
t _ ( Ii, 0 \ • _ ( o 0
i o
" [ Im 1« J' M
° ~ ( Mi, M 22
Since det[L ] ^ 0 by the condition det[£ ] / 0, we can represent the solution u(t)
0 22
5
using the Laplace transform in the form
1 n+" 1
«(*) = ^ exp(tz)H(z)- g(z)dz, (26)
where
/f(z) = ZLQ-U +exp(-Tz)(zM 0 - Mi),
$(z) = LovKO) + MOV>(-T) + exp(- z) y T exp(-tz)(M! - z M ) ^ ( f )dt, 0
fit ,= ( ~ +
®flP{****)*fii] —1-6*3 + e x p ( - T 3 ) M , ] 2
0 1
' \_ -[I +exp(-T )M ]
2 1 2 2 1 -[Iaa+exp(-T*)Af„)
we get
dettf(z) = ( - ^ P ( z , e x p ( - T ) ) . 3
The condition (B) implies that $H(z)^{-1}$ is holomorphic in a neighborhood of
$\{z \in \mathbb{C} : \Re z > -\delta\} \setminus \{0\}$. Hence, shifting the contour in (26) and applying the residue
theorem, we obtain
U { t ) = i
Wi Lum ^M^)H(z)- g(z)dz + uo(t), (28)
T
L V(Q)
O + M <P(-T)
0 = [VI(Q),Q] ,
and hence
H(z)-'(L p(<>)
ot +M 0 V ( - r ) ) = i/o^rV^Oj.Of
by (27). Further,
Consequently, $\exp(tz)H(z)^{-1}g(z)$ is holomorphic near $z = 0$, and hence $u_0(t) = 0$.
Moreover, we can show that the first term on the right-hand side in (28) decreases
exponentially by the standard argument [5].
An Interval Method of Proving Existence of Solutions for
Nonlinear Boundary Value Problems
Shin'ichi OISHI
Department of Information and Computer Sciences, Waseda University, Shinjuku-ku
Tokyo, 169, Japan
E-mail: oishi@oishi.info.waseda.ac.jp
ABSTRACT
A method of computer assisted existence proof is discussed for solutions of
nonlinear boundary value problems. In 1966, Urabe presented a convergence
theorem for a certain simplified Newton method. Urabe's theorem is essentially
based on Banach's contraction mapping theorem. In this paper, a reformulation
of Urabe's theory using the interval analysis is presented. It is shown that a
sharp error estimation can be obtained by this reformulation.
In this paper, we are concerned with the following nonlinear boundary value prob-
lem of a system of first order real differential equations:
$$\frac{dx}{dt} = f(x, t), \quad t \in J = [-1, 1], \qquad g(x) = 0, \tag{1}$$
where $x$ and $f(x, t)$ are $n$-dimensional vector valued functions and $g$ is an $n$-dimensional
vector valued functional. For example, let $-1 = t_0 < t_1 < t_2 < \cdots < t_{N-1} < t_N = 1$,
2
g(x) = x(-l)-x(l),
2
This problem has been studied by many authors. Among them, in 1966, Urabe
has established an existence theorem of eq.(l) using the so-called "Urabe's theo-
1
rem" of the convergence theorem for a certain simplified Newton Method. This
result has been applied to estimate the error of a numerical solution of eq.(l) by
11 5 6 7
himself , Shinohara*, Shinohara and N.Yamamoto , Fujii , Shintani and Hayashi
8 9
and Hayashi Moreover, T.Yamamoto has developed his theory using the theory of
pseudometric space. As a result, the usefulness of Urabe's theory has been proved.
In this article, we shall present a further extension of T.Yamamoto's theory [9] from
the modern interval analytic point of view. Namely, in this paper, it will be shown
that point-wise error estimate $|c(t) - x^*(t)|$ is possible between a given approximate
solution $c(t)$ and a true solution $x^*(t)$. In our argument, we will show that an
infinite-dimensional extension of the Krawczyk operator can be defined associated with a
Newton-like operator defined by Urabe. Then, using Caprani-Madsen-Rall's theory
of integration of interval function [10], we will show that range of that Newton-like
operator can be evaluated numerically.
Features of our method can be summarized as follows:
1. We assume only that c(t) is a continuous function of t. Thus our method can be
applied to approximate solutions obtained by a wide class of numerical methods.
For example, it is applicable to finite element solutions, approximate solutions
obtained by interpolating discrete approximate solutions generated through a
discrete variable method, and so on.
2. Our method calculates directly the image of Urabe's simplified Newton operator
applied to closed ball centered at the given approximate solution. Further it
does not use overestimated imbedding constants. Therefore it provides a sharp
error bound. Moreover, if desired it also provides a rough bound with less
computation.
2. Theory
In the following, we assume that an approximate solution c(t) is given for the
problem (1). We also assume that it is a continuous function but not necessarily
a smooth function. This assumption reflects the fact that approximate solutions
obtained by numerical methods are usually continuous functions but not necessarily
smooth functions. For example, discrete numerical solutions by means of interpolation
may not be smooth functions. Under this assumption, we will present a sufficient
condition under which the problem has an exact solution in a domain containing an
approximate solution c(t). We will show that our method also provides a method of
obtaining sharp error bound for c(t).
Let $X = C[-1,1;\mathbb{R}^n]$ be the Banach space of real $n$-dimensional vector valued
functions $x(t) = (x_1(t), x_2(t), \ldots, x_n(t))$ continuous on the interval $J = [-1,1]$ with
the scaled maximum norm
$$\|x\|_u = \max_{t \in J} |x(t)|_u, \tag{2}$$
where
$$|x|_u = \max_{1 \le i \le n} \frac{|x_i|}{u_i}. \tag{3}$$
Here, $u = (u_1, u_2, \ldots, u_n)$ is a constant $n$-dimensional vector with positive elements,
$u_i > 0$ for $i = 1, 2, \ldots, n$. Let $Y = X \times \mathbb{R}^n$ be a Banach space with the norm
Let $D = C^1[-1,1;\mathbb{R}^n]$ be the Banach space of real continuously differentiable
$n$-dimensional vector valued functions $x(t) = (x_1(t), x_2(t), \ldots, x_n(t))$. In the following,
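$$Fx = \Bigl(\frac{dx}{dt} - f(x, t),\ g(x)\Bigr). \tag{4}$$
Then we can rewrite the original problem as the following operator equation:
$$Fx = 0. \tag{5}$$
$$DF(x)h = \Bigl(\frac{dh}{dt} - f_x(x, t)h,\ g'(x)h\Bigr). \tag{6}$$
Here $h \in D$.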
For a real matrix function $A(t)$ continuous on $J$ and for a vector valued continuous
linear functional $l$, which approximate $f_x(c, t)$ and $g'(c)$, respectively, we define the
$$Lh = \Bigl(\frac{dh}{dt} - A(t)h,\ lh\Bigr). \tag{7}$$
satisfying
*(-!) = /. (9)
l
Let *(t) € C \-1,1;M] be an approximation of *(() satisfying
•<-!)=/. (10)
40 = *£W on
Then the following relations hold:
= A(tMt) (12)
dt
and
* ( - l ) = /. (13)
This means that $(f) is the exact fundamental matrix of the following linear systems:
%=A(t)z. (14)
§ = (W)
satisfying
* ( - ! ) = /. (17)
Let $G = l[\Phi(t)]$ be the matrix whose column vectors are $l[\phi_i(t)]$, $i = 1, 2, \ldots, n$, where
$\phi_i(t)$ are the column vectors of the matrix $\Phi(t)$. Then the matrix $G$ is nonsingular if
and only if the operator $L$ defined by eq.(7) has the linear inverse $L^{-1}$ and
$$L^{-1}(\phi, u) = H\phi + Su, \tag{18}$$
where $\phi \in X$, $u \in \mathbb{R}^n$, $H$ is the linear operator from $X$ into $D \subset X$ such that
and $S$ is the linear operator from $\mathbb{R}^n$ into $D$ such that $Sv = \Phi(t)G^{-1}v$.
We assume now that G is invertible. We consider a Newton-like operator k : X —»
X
l
k(x) = L~ {L-F)x
= / ^ / ( M ) - ^ , ^ ) "!?(*))• (20)
It should be noted that the second line of this equation implies that $k$ can be defined
on $X$. It will be seen in the next section that if $x^* \in X$ is a fixed point of $k$, then it
belongs to $D$ and satisfies $Fx^* = 0$.
In order to show a sufficient condition under which the operator k has a fixed
point in a domain containing an approximate solution c(t), and to find a sharp error
bound for c(f), we will introduce the infinite-dimensional Krawczyk operator. For this
purpose, we here review briefly the theory of interval functions. In ordinary interval
analysis, the term interval refers to closed intervals of real numbers,
The real functions $\underline{y}(t)$ and $\overline{y}(t)$ are called the endpoint functions. In this paper, we
assume that the endpoint functions are elements of $C[-1,1;\mathbb{R}]$ and consider an interval
function to be the set of real functions $y$ in $C[-1,1;\mathbb{R}]$ such that $\underline{y}(t) \le y(t) \le \overline{y}(t)$
in the natural partial ordering of functions. The addition, subtraction, multiplication
and division between interval functions are defined point-wise. Moreover, a theory for
the interval integral of an interval function has been developed by Caprani, Madsen
and Rall [10]. It is defined by
$\int Y(s)\,ds = \left[\ \underline{\int}\,\underline{y}(s)\,ds,\ \ \overline{\int}\,\overline{y}(s)\,ds\ \right]$ (22)
where $\underline{\int}$ denotes the lower Darboux integral and $\overline{\int}$ denotes the upper Darboux integral.
Here, the lower Darboux integral is the supremum of integrals
where yi is any step function satisfying yi(t) < y(t). Similarly, the upper Darboux
integral is the infimum of integrals
j y*(s)ds,
$|[a,b]| = \max(|a|, |b|)$.
$\mathrm{Mid}(T(t)) = \frac{\underline{T}(t) + \overline{T}(t)}{2}$. (23)
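The magnitude and midpoint of an interval, together with the point-wise interval operations, can be illustrated by a minimal Python sketch. The class name `Interval` and its method names are assumptions made for this illustration and do not come from the paper. Ordinary floating-point arithmetic is used without outward rounding, so the results are not rigorous enclosures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Closed real interval [lo, hi] (illustrative; no outward rounding)."""
    lo: float
    hi: float

    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __sub__(self, other):
        return Interval(self.lo - other.hi, self.hi - other.lo)

    def __mul__(self, other):
        # The product range is spanned by the four endpoint products.
        p = (self.lo * other.lo, self.lo * other.hi,
             self.hi * other.lo, self.hi * other.hi)
        return Interval(min(p), max(p))

    def mag(self):
        """Magnitude |[a, b]| = max(|a|, |b|)."""
        return max(abs(self.lo), abs(self.hi))

    def mid(self):
        """Midpoint (a + b) / 2."""
        return 0.5 * (self.lo + self.hi)
```

With these definitions, multiplying an interval `A` by a symmetric interval `B` (midpoint zero) gives the symmetric interval with radius `A.mag() * B.mag()`.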
Let $T(t)$ be an interval function with $\mathrm{Mid}(T(t)) = c(t)$. We now introduce the
following infinite-dimensional Krawczyk operator:
$K(T) = k(c) + M(T - c)$, (24)
where
$M = L^{-1}(L - DF(T))$ and $c = \mathrm{Mid}(T)$. (25)
More concretely, we have
M(T(t) - c)
= $(() f$-\s)R( )(T{s)-c{s))ds
s
l
-${t)G-H[${t) £ $- (s)R{s){T{s) - c{s))ds\
+*(()&-'(/ - ff(T(t))(T(t) - c(t)), (26)
where
$R(t) = f_x(T(t), t) - A(t)$. (27)
Then we have the following theorem:
Theorem 1 Let $T(t) \subset X$ be a bounded interval function with $\mathrm{Mid}(T(t)) = c(t)$. If
$AB = |A|B$
$= [-|A||B|,\ |A||B|]$
$= [-1,1]\,|A||B|$.
We here note that $\mathrm{Mid}(T - c) = 0$. Then, from this lemma, we have for any interval
function $T(t)$
M(T(t) - c)
= [-1, £ i s - ' W H R W I i r i i * ) - cis)\ds
l
-*(()G-'/[[-l, £ |*- ( )||fl( )||T( ) -
S S S c(s)\d ]
S
This expression is often useful for reducing the overestimation that originates from interval
calculation, provided that $A(s)$ and $\Phi(t)$ are chosen suitably.
We now show how to calculate the operator $M$. We assume that $c(t)$ and $\Phi(t)$
are piecewise smooth functions whose derivatives are piecewise continuous.
Then we can choose a subdivision of the interval $[-1,1]$ as
$[-1,1] = S_1 \cup S_2 \cup \cdots \cup S_k$ (33)
such that $c(t)$ and $\Phi(t)$ are smooth on each subinterval $S_i$. Here, $S_i \cap S_j = \emptyset$ if $i \ne j$.
In this case, for $t \in [t_j, t_{j+1}]$, we have
3. Proof of Theorem
In this section, we will prove Theorem 1 presented in the previous section. Let
$T(t)$ be a bounded interval function with $\mathrm{Mid}(T(t)) = c(t)$. We assume that the
following conditions are satisfied:
$K(T(t)) \subset T(t)$ (35)
and
$\|M\|_u < 1$. (36)
We first show that $k : X \to X$ is contractive on $T$, and $k(T) \subset T$.
In the first place, we show that $k(T) \subset T$. It is noted first that the set $M(T - c)$ is
convex and closed in $X$. Moreover, we note that the Fréchet derivative $Dk(x) : X \to X$
of $k : X \to X$ is given by
$Dk(x) = L^{-1}(L - DF(x))$
$= L^{-1}(f_x(x,t) - A(t))$.
Jo
$\in k(c) + M(T - c) \subset T$.
Here $\overline{\mathrm{co}}\,S$ means the closure of the convex hull of a set $S$. This means that $k(T) \subset T$.
Clearly, if $x \in T$,
$Dk(x) \in M$.
Thus from the condition (36), we have
$\|Dk(x)\|_u < 1$ for all $x \in T$. (37)
x-(t)
-*(<)Gr'i[*(«) f l
$~ (s)(f(x-(s), s) - m
A(s)x (s))ds]
+*(t)G-\l(x-)- (x'))-
9
Then we have
m
dx (t)
dt
,
= A(t)x(t) + $ ( i ) $ - ( t ) ( / ( : * ( t ) , i ) -
3 A(t}x-(t))
= fmt),t) (38)
and
i«)
-imtHG-Him / * r ^ i r j r V j ^ «J - A(s)x-{s))ds]
1
+im)]G- (l(x-) - g(x-))
= l(x')-g{x-). (39)
The equalities (38) and (39) say that $x^*$ is a solution of (1).
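Now we shall prove the uniqueness of the solution in $T$. Let $\hat{x}$ be another solution
of (1). Then
$\frac{d\hat{x}}{dt} = f(\hat{x}(t), t)$
$= A(t)\hat{x}(t) + \{f(\hat{x}(t), t) - A(t)\hat{x}(t)\}$.
Therefore $\hat{x}(t)$ can be expressed as follows:
$\hat{x} = L^{-1}(f(\hat{x}(t), t) - A(t)\hat{x}(t),\ l(\hat{x}) - g(\hat{x}))$
$= k(\hat{x})$.
Thus it is seen that $\hat{x}$ is a fixed point of $k$. Since $k$ has a unique fixed point in $T$, it
follows that $x^* = \hat{x}$.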
Lastly, let us prove that $x^*$ is isolated. Let $\Phi^*(t)$ be the fundamental matrix of
the linear homogeneous system
| = MV(tM)y. («)
such that *"(0) = / . Put
$G^* = l[\Phi^*(t)]$. (41)
Suppose that $G^*$ is singular. Then there is a non-trivial constant vector $v$ such that
$G^* v = 0$. Put
$y(t) = \Phi^*(t)v$, (42)
then $y(t)$ satisfies
$l[y] = 0$.
Since clearly the right hand side of this equation belongs to M, we have
4. Error Estimation
In this section, we shall discuss how to choose $T(t)$ provided that an approximate
solution $c(t)$ is given. We assume that $c(t)$ is in $C[-1,1;\mathbb{R}^n]$.
Algorithm:
*-*>»
satisfying $(—1) = 0.
3. Calculate
l
6. Calculate an interval inclusion of L~ F{c) by
l
L~ F(c)
= _ ( / _ *(t)G()«(t) / ' ( / ( c t ^ . i l - ^ i W s ) ) ^
-*(*)G(.(i)-i,f»)-ctf)
or if c is continuously differentiable, by
= (j-*(t)6<)*(t) j j ^ l - f( {s),sm
c
Then put
$T(t) - c(t) = \rho[-u, u]$. (45)
l
8. Put M = L- (f (T(t),
s t) - A(t)) and check the conditions
|S (t)| + \M (T (t)
n n n - c (t))| < pu .
n n (47)
We note here |£„(t)| < H„ and | | M | | = |[|iW ]ti || . Thus, the problem is reduced
n U o n n Un
to show
$u_n + \|M_n\|_{u_n}\,\rho u_n < \rho u_n$, (48)
or
$1 + \|M_n\|_{u_n}\,\rho < \rho$. (49)
From this it follows that $\rho > 1$ is necessary. In the following we assume $\rho = 2$. Thus
the problem is to show | | M | | , < 1 for sufficiently large n.
n u
Now we assume that \\L„ - DF(x')\\ -» 0 as TI -> co. Under this condi-
an
tion, we shall prove \\M \\^ n 0 as n - oo. For sufficient large n such that
WDFix-y^UJLn-DFix-}^ < 1, we have
l
IIX-MI < \\DF(xT L„
" " 1-PF(Z-)^|UJ|L.-OF(^)|U„
< 7,
where 7 is a constant.
I P W I L
< ||T- -«.!«. +ll«»-*"lk
+||c» - i * | L
-f 0.
\\DF(T ) n - - 0- (50)
= \\I-L^DF(T )\U n n
< -r(\\L -
n DF(x-)\\ Un
- 0.
In this section, we shall consider the following van der Pol equation as an example:
cfx 1 dx 1 2
= ( 1 l ) I ( 5 1 )
^ 4 - dT-i6 -
As boundary conditions, we impose the following two point boundary conditions:
i ( - l ) = 0, = 2. (52)
Fig. 1 shows that $K(T) - c$ is a proper subset of $T - c$. Thus, it turns out that there
exists a unique solution of the problem in $K(T)$.
5. References
Figure 2: Interval inclusion of $K(T) - c$
and
Jiro AMEMIYA
Research Lab. II, Communication and Information Systems Research Laboratories,
Research and Development Center, Toshiba Co.
1 Komukai-toshiba-cho, Saiwai-ku, Kawasaki-city 210, Japan
E-mail: amemiya@isl.rdc.toshiba.co.jp
ABSTRACT
1. Introduction
1
first author's supervision .
initial-value problem:
dt
where the initial value $y_0 \in \mathbb{R}^d$ and the right-hand side function $f : \mathbb{R}^d \to \mathbb{R}^d$ are
2
y£W) = + £ + <'
where
^ . = T-^^ H*»+**»>•») +,
(3)
!
On + 1)
and
$h_n = t_{n+1} - t_n$ and $\theta \in (0,1)$. (4)
The local truncation error $z_{n+1}$ cannot be known in general, but, if an interval $Y_n$ ($\subset \mathbb{R}^d$)
in which $\{y(t) \mid t \in [t_n, t_{n+1}]\}$ is included is known in some way or other, an
interval $[z_{n+1}]$ in which $z_{n+1}$ lies can be determined by substituting the interval $y^{(p_n+1)}(Y_n)$,
or a little wider, practically computable interval $[y^{(p_n+1)}](Y_n)$ ($\supseteq y^{(p_n+1)}(Y_n)$), for
$y^{(p_n+1)}(t_n + \theta h_n)$ in (3). In practice, we set
$s_n := \mathrm{midpoint}\,[z_n]$ and $[z_n] := [z_n] - s_n$, (5)
and compute the sequence $y_n$ ($n = 0, 1, \cdots$) by
k
h
r.
ww := u„ + E - r r ^ 1
+ ««+»• (6)
side to that truncated up to the $p_n$th term on the right-hand side, so that we can
automatically produce from the given program another program which computes the
Taylor coefficients of higher and higher order successively. As is well known$^{2}$, every
computation represented in a program form of this kind (to discuss what kind would
require too much space$^{5}$) can be intervalized.
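The successive computation of Taylor coefficients can be illustrated on a scalar example. The sketch below assumes, purely for illustration, the logistic equation in the form $y' = y(1 - y)$ (the specific equation and the function names `taylor_coeffs` and `taylor_step` are assumptions, not taken from the paper). It uses the power-series recurrence $(k+1)\,c_{k+1} = c_k - \sum_{i=0}^{k} c_i c_{k-i}$ in plain floating-point arithmetic; an interval version would replace the floats by intervals.

```python
import math


def taylor_coeffs(y0, p):
    """Taylor coefficients c[0..p] of the solution of y' = y*(1 - y), y(t0) = y0.

    c[k] is the coefficient of (t - t0)^k, i.e. y^(k)(t0) / k!.
    """
    c = [float(y0)]
    for k in range(p):
        # Coefficient of (t - t0)^k in the product y*y (Cauchy product).
        yy_k = sum(c[i] * c[k - i] for i in range(k + 1))
        # Matching coefficients in y' = y - y^2 gives (k+1)*c[k+1] = c[k] - yy_k.
        c.append((c[k] - yy_k) / (k + 1))
    return c


def taylor_step(y0, h, p):
    """One step of the order-p Taylor-series method (truncation error ignored)."""
    c = taylor_coeffs(y0, p)
    return math.fsum(ck * h ** k for k, ck in enumerate(c))
```

For `y0 = 0.5` the coefficients start 0.5, 0.25, 0, -1/48, which agrees with the series of the logistic function about its inflection point.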
Comparing (2) with (6), we see that the exact solution of the differential equations
(1) is within the interval obtained by executing the computational process (6) with
the "noise" represented by the interval $[z_{n+1}]$ added at each "intermediate variable"
$y_{n+1}$. This situation is quite the same as in the case of the noisy computational
process with rounding errors$^{5,8}$. (If we want to consider also the effect of rounding
error we may add to $[z_{n+1}]$ the extra term for it as a noise.)
According to the theory on noisy computational process, we can write the inte-
gration process (6) as follows:
V Vo
» . + S u
Ift := l/o + l , -TT K
k=i -
+ + s
«* *- * £ t t * " ( 7)
Actually, using automatic differentiation, we can obtain the higher derivatives
of $y$ as a function of the value of $y(t)$. So we express the $k$th-order derivative of $y$ at the
steppoint $t_n$ as $y^{(k)}(y(t_n))$. We rewrite (2) and (6) as follows:
(k
y, i
+ = y + t-^y \yn)
n +s u n+ (8)
K
»=i -
k=l
0
- y -y(tn)
n + E & % + n(y{t ) n - - »(*-))
—
+S„-H
= ± ( n ( - r + £ ^ ^ ) ) («--*-) oo)
where $I$ is the identity matrix, $\partial y^{(i)}/\partial y$ is the partial derivative of the procedure to
calculate the $i$th-order derivative of $y$. Then we define
* * * » = f i * w
( M < N + I )
' M
U (m = n + l ) .
represents the effect of the noise $z_m - s_m$. However, $z_m$ and $\theta_m$ ($0 < m < n+1$)
Then we shall discuss the method to calculate the confidence interval $[y_{n+1}]$ which
contains the exact solution $y(t_{n+1})$. At first we prepare to construct the method to
fn + e„(il(tn)-Vn))€b„]. (14)
p k k
" h fliA*) h
+ e ( ! / ( £ n ) e / + ( W I ( i 5 )
An i,» = i + £
+ a % ' f r - " £ t 1 '
The right-hand side of (15) is the interval matrix obtained by using automatic
differentiation. To calculate the interval matrix, we may first compute the $p_n$th-order
derivative of $y$ by using interval arithmetic with the interval argument $[y_n]$ and then
calculate the partial derivative and the right-hand side of (15). We shall denote this
interval matrix by $[A_{n+1,n}]$.
To determine the effect of the noises on $y_n$ we need the interval partial derivatives
(interval matrices) of $y_n$ with respect to all the previous $y_m$'s, which we shall denote
by $[A_{n,m}]$, and they are also computable automatically if we have a program for the
$[A_{n+1,m}] := [A_{n+1,n}] \cdot [A_{n,m}]$, $[A_{n,n}] = I$, (16)
! m - S m f [&»]• (17)
Substituting $[A_{n+1,m}]$ for $A_{n+1,m}$ and $[z_m]$ for $z_m - s_m$ in (13), we can obtain an interval
which contains $y(t_{n+1}) - y_{n+1}$:
$y(t_{n+1}) - y_{n+1} \in \sum_{m} [A_{n+1,m}][z_m]$. (18)
We can calculate the numerical solution and the confidence interval with the following
steps:
1) Determine the enclosure $Y_n$ of $y$ and the stepsize $h_n$.
2) Determine the order $p_n$ and the stepsize $h_n$ which is used in the integration on
where $Y_n$ may be narrower than the enclosure obtained in 1). Then calculate the
A machine interval is an interval whose endpoints are both floating-point numbers. A
machine interval operation is an operation which takes machine interval arguments
and produces the narrowest possible machine interval that contains the result of the
corresponding interval operation with the same arguments.
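A machine interval addition can be sketched as follows. This is an illustration only: it widens each rounded endpoint by one unit in the last place using `math.nextafter` (available in Python 3.9 and later), which yields a valid enclosure but not necessarily the narrowest one described in the definition. The function names are assumptions of this sketch.

```python
import math
from fractions import Fraction


def add_outward(a, b):
    """Enclose [a0, a1] + [b0, b1] by a machine interval.

    The round-to-nearest sums are moved one floating-point number outward,
    so the exact sum is always contained (overflow is not handled).
    """
    lo = math.nextafter(a[0] + b[0], -math.inf)
    hi = math.nextafter(a[1] + b[1], math.inf)
    return (lo, hi)


def contains_exact_sum(result, a, b):
    """Check the enclosure in exact rational arithmetic."""
    exact_lo = Fraction(a[0]) + Fraction(b[0])
    exact_hi = Fraction(a[1]) + Fraction(b[1])
    return Fraction(result[0]) <= exact_lo and exact_hi <= Fraction(result[1])
```

Even for point arguments such as `(0.1, 0.1)` and `(0.2, 0.2)` the result has positive diameter, which is the effect discussed in the text.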
There is no problem in substituting the machine interval operation for the real
interval operation in this method. As a result of computing the left-hand side of
(6) by means of machine interval operations, we obtain an interval with positive
diameter. So step 4) must be modified as follows:
4) Calculate
for $m = n - N$ when we compute the effect of the noises farther than $N$ steps before:
l^n-H,™] := [Ai+l.tJPVm],
m =n — N, n, [A^] = I, (24)
(in+i] := [ V H ] + \A„ . ]%_ ].
+Un N N
;
iVn+i] = Ih-H + E PWrnHliJ. (25)
m=n+l-/V
If we set N=0, then we have the interval version of a simple step-by-step integration
which uses the interval matrizant (and, consequently, conspicuous wrapping effect,
etc.).
Of course, there is another method which executes one step of integration as follows:
$[y_{n+1}] := [y_n] + \sum_{k=1}^{p_n} \frac{h_n^k}{k!}\, y^{(k)}([y_n]) + [z_{n+1}]$. (26)
This is very primitive and seems to be very fast; however, it has the fatal defect
that the diameter of the confidence interval at each step always becomes larger than
that at the previous step.
With respect to the method (or the family of methods) enunciated in the preceding
section, the following points seem to be technically most important.
(i) How to determine the interval $Y_n$ and $h_n$ such that $Y_n \supset \{y(t) \mid t \in [t_n, t_{n+1}]\}$
tion?
(iii) How to choose $N$, the parameter for keeping the "interval matrizants" representing
the noise propagation in the computational process? How does the above-enunciated
method compare with the method of constructing the confidence interval
by means of interval solutions of the first-order perturbation equations directly$^{10}$?
(iv) How much should we pay for guaranteeing the accuracy of the solution over
the cheaper conventional integration methods such as the Runge-Kutta with stepsize
control?
According to our computational-experimental observations (cf. next section), our
tentative answers to the above-raised questions are as follows.
(i) To literally follow the statement in the ordinary textbooks, using the norm in
the solution space, of the so-called "Cauchy-Peano" theorem on the existence and
uniqueness of solution is the worst strategy for determining $Y_n$ and $h_n$. We would
usually have too small $h_n$. We had better employ the combined strategy of inflating
$Y_n$ and/or reducing $h_n$ so as to achieve the inclusion:
$y_n + [0, h_n] f(Y_n) \subset Y_n$ (27)
and of sharpening the interval by repeated application of
$Y_n := y_n + [0, h_n] f(Y_n)$. (28)
However, there is still a lot to investigate about how to automatically (more or
less heuristically) determine the initial guess for $Y_0$ and $h_0$, and about the way of
combining the interval inflation and sharpening and the stepsize reduction.
We must guess the stepsize $h_0$ at the beginning of the first integration, and we
must change it when we determine the order in (ii). If we guess too large $h_0$ initially,
we may obtain too small $h_0$ in the following process. Not only at the beginning of
integration but also in the course of integration, we had better avoid drastic change
in the interval width, in the order of the Taylor-series expansion and in the stepsize,
but still there remain several possible ways of gradually changing them (this point is
closely connected with (ii)).
With respect to the strategy for determining $Y_n$ and $h_n$, we calculated the initial
guess for $Y_n$ as follows:
If $Y_n$ and $h_n$ achieve the inclusion (27), the solution lies in $Y_n$. Otherwise, we inflate the
where we set $\varepsilon = 0.1$. We can obtain the new interval whose width is $1 + 2\varepsilon$ times
as large as the old interval and check if the initial interval and stepsize satisfy the
condition (27) or not. If they do not achieve (27) after repeating $\varepsilon$-inflation several
times (we repeated 5 times), we halve the stepsize $h_n$ and resume (29).
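The inflation and stepsize-halving strategy described above can be sketched for a scalar problem as follows. Several ingredients are assumptions of this sketch rather than the authors' implementation: the right-hand side $f(y) = y(1 - y)$, the initial guess $y_n + [0, h_n] f(y_n)$ (the original formula (29) is not reproduced in this text), the symmetric inflation about the current interval, and the use of plain floating-point arithmetic without outward rounding.

```python
def f_interval(Y):
    """Naive interval extension of the hypothetical right-hand side f(y) = y*(1 - y)."""
    lo, hi = Y
    a, b = 1.0 - hi, 1.0 - lo            # enclosure of 1 - y
    p = (lo * a, lo * b, hi * a, hi * b)
    return (min(p), max(p))


def picard_image(y_n, h, Y, f):
    """Return y_n + [0, h] * f(Y), the left-hand side of inclusion (27)."""
    flo, fhi = f(Y)
    return (y_n + min(0.0, h * flo), y_n + max(0.0, h * fhi))


def find_enclosure(y_n, h, f, eps=0.1, inflations=5, max_halvings=30):
    """Search for (Y, h) such that y_n + [0, h] f(Y) is contained in Y."""
    for _ in range(max_halvings + 1):
        # Initial guess: an assumption of this sketch.
        Y = picard_image(y_n, h, (y_n, y_n), f)
        for attempt in range(inflations + 1):
            Z = picard_image(y_n, h, Y, f)
            if Y[0] <= Z[0] and Z[1] <= Y[1]:      # inclusion (27) holds
                return Y, h
            if attempt < inflations:
                w = Y[1] - Y[0]
                Y = (Y[0] - eps * w, Y[1] + eps * w)   # width grows by the factor 1 + 2*eps
        h *= 0.5                                   # halve the stepsize and start over
    raise RuntimeError("no enclosure found")
```

For `y_n = 0.5` and `h = 0.1` a single inflation already suffices, so the stepsize is returned unchanged.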
(ii) There are two meaningful strategies: (I) to make the stepsize as large as
possible under the restriction that the width of $[z_n]/h_n$, the local truncation error per
unit time, should not exceed some given parameter $\varepsilon_t$, and (II) to make $p_n^2/h_n$,
the total operation count per unit time, as small as possible under the same restriction
(see also$^{13}$). Both work fairly well. However, in view of (iii), the former strategy will
be more meaningful in practice.
(iii) At first sight it seems that the greater the $N$ the better. Indeed this is usually
the case, and that for small $N$'s especially. But computational experiments sometimes
point to the contrary. To choose too great an $N$ sometimes deteriorates the quality
of the solution, i.e., produces wider intervals than those obtained with smaller $N$,
probably because the intervals of the elements of matrices $[A_{n,m}]$ become too wide.
c (31)
dt ~ dy '
There are several methods to calculate the interval matrizant. We adopted the
first-order Taylor expansion method to solve (31) numerically and tested the following two
methods: one is to solve the equations on $[t_n, t_n + h]$, the other is to solve the equations
on $[t_n, t_n + h/2]$ and $[t_n + h/2, t_n + h]$ and obtain the interval matrizant as the product
of the two. These methods are faster than the standard method (see Table 1), but
give us wider confidence intervals (see Fig. 5).
(iv) The results of our computational experiments on the problems in §4 showed
that the present methods require two to five hundred times as long a computation time as
the conventional 8th-order Runge-Kutta (of 10 stages) with stepsize control where
the restriction $\varepsilon_t$ on the local truncation error per unit time (estimated by comparing
the numerical solution with the stepsizes doubled) was set in such a way that the
accumulated error may be nearly equal to the guaranteed interval width obtained
by the present method. (The popular 4th-order Runge-Kutta (of 4 stages) was less
efficient by the factor of five or six.) Considering that the Runge-Kutta does not
rigorously guarantee the accuracy of the computed solution and that there are still
lots to be improved in our implementation of the present method, this comparison of
speeds will be in favour of the rigorous approach to the solution of ordinary differential
equations even from the practical engineering standpoint.
4. Examples
We took up the following three problems for experiment, one being a small test
problem and the other two chosen from celestial mechanics,
(a) Logistic curve:
»(0) = 0. (32)
Fig. 1 shows that it is important to calculate the confidence interval with the
matrizant. (It is observed that the most primitive interval computation will fail.)
(b) Swing-by of an artificial satellite by Jupiter. The equations of motion are as
follows:
=v,
(x — rcoswt)
2
-Gm- 2 2
{x + J / S
} 2 {(x — rcoswt) + {y - rsmwr) };
(x - cos(yt + d>)) (33)
-GM 2 2
{(x - cos(t 4- 4>)) + (y- sin(f 4- 4>}) }* '
dv (y — r sinwi)
-Gm-
dt'' O 2 2
+ y V* {{x — r costjt) + (y — rsinwf) } s
2 2
-GM- 2 2
{(x - cos(( 4- 4>)) + (y- sin(( 4- 4>)) }*
[Figure: plot over t from 0.0 to 4.0 with curves labeled "exact solution" and "using matrizant"]
where $G$ is the universal gravitational constant, $m$ is the mass of Earth, $M$ is the mass of Jupiter.
We set the initial relative position of Earth and Jupiter as $\phi = 0.4835$.
The problem is essentially a one-body problem, the gravity of the sun, Earth and
Jupiter being taken into account but their motion being approximated as circular and
planar and a priori given. General view of the motions is shown in Fig. 2 and Fig. 3.
Fig. 2 Relative motion of the sun, Earth, Jupiter and the satellite
Figure 4 shows how the stepsize as well as the order is controlled according to
different rules.
In Fig. 5 and Fig. 6 the effect of N and that of the way of computing interval
matrizants are shown. The "true error'' is the error (estimated by comparing with
the result with stepsizes halved) of the midpoints y„.
In Fig. 7 the effect of N is shown. (It is seen that the widths obtained by using
the perturbation equations gave poorer results.)
Computation times for the Taylor-series methods with different stepsize-control
strategies (I) and (II) (see §3) are shown in Table 1.
% + ft/2, t + A],
n
0: standard method.
[Figure residue: one plot with curves labeled "$p_n$ with max $h_n$" and "$p_n$ with min $p_n^2/h_n$"; a second plot over the range 0.0 to 2.0 with curves labeled "primitive", "N = 0", "N = ∞" and "true error"]
(c) The Pythagorean three-body problem$^{16}$. The equations of motion are as follows:
x -x 3:3-2:5
= -4 3 4
2
-5-
2
{ ( x - x,y + ( » -
3 to) }* { ( x - z ) + (y - y y}i '
3 5 3 5
_ 4 to -to 5 ya - to
2 2
{(x - xtf + (jft -
3 to) }* {(x - x ) + (y -
3 5ytf}V 3
x - x c ^4-2:5
= 3 3 4
{(*3 - n ) + (» -
s 2
to) }* {(x< - X ) + (j/4 " t o ) } =
S
2 2
(36)
= 3 S/3 - !/4 84 - 1ft
{(*s " X,y + (!/3 - t o ) } * 2
{ ( x - x ) + (w " t o ) } *
4 5
2 2
-x £3 1
X4 - x 5
= 3 T I
& A
2 2
{(*a - s ) + da - t o ) } *
s { ( x - 2 ) + (»* - t o ) } " 4
5
Z 2
to - J/S 1 4 to - Jft
= 3 2
{ ( * * - x ) + (to -
2
5 to) }* { ( X - x ) + (to
2
4 5
General view of the motions of three bodies is shown in Fig. 8. This problem is
notorious for near-singular (near-collision) points occurring from time to time, so
that a number of regularization techniques have been devised by many authors.
But we did not adopt such regularization techniques in order to see how the present
method behaves at near-singular points. In Fig. 9, we observed that $N = 300$
gave a better result than $N = \infty$.
Fig. 9 Growth of the width of the interval for different values of $N$
($\varepsilon_t = 10^{-12}$; $p_n$ and $h_n$ controlled so as to maximize $h_n$)
References
(1971), pp-328-337.
15. T. SUNAGA: Theory of an interval algebra and its application to numerical
analysis. RAAG Memoirs, Vol.2 (1958), Misc.II, pp.547-564. [Based on the
Master's Thesis in 1956]
16. V. SZEBEHELY and C. F. PETERS: Complete solution of a general problem of
three bodies. Astronomical Journal, Vol.72 (1967), pp.876-883.
Numerical Validation for Ordinary Differential Equations
using Power Series Arithmetic
Masahide Kashiwagi
Department of Information and Computer Science,
School of Science and Engineering, Waseda University,
Okubo 3-4-1, Shinjuku-ku, Tokyo 169, Japan
E-mail: kashi@oishi.info.waseda.ac.jp
ABSTRACT
In this paper a numerical validation method for normal form simultaneous first
order differential equations is discussed. Based on Lohner's method and interval
functoid, a new inclusion algorithm for initial value problems is given. For the
algorithm, two types of power series arithmetic are defined.
1. Introduction
estimation of the error is very difficult. But, if we can describe the exact relation between
$x_i$ and $x_{i+1}$, then we can get a finite dimensional equation having a solution $x_i$ which is
exactly equal to $x(t_i)$. Namely, if we can calculate $\phi(v, t_s, t_e)$ which returns the exact
$x(t_e)$ provided that $x(t_s) = v$, then we can write down the finite dimensional equation
as
$x_2 = \phi(x_1, t_1, t_2)$
$x_3 = \phi(x_2, t_2, t_3)$
$x_m = \phi(x_{m-1}, t_{m-1}, t_m)$. (2)
In this paper we will show how to calculate the exact relation $\phi(v, t_s, t_e)$. It can
be seen as an extension of Lohner's method$^3$. Note that interval arithmetic is
used wherever needed throughout this paper.
Type-I PSA treats order-$n$ power series and truncates terms higher than $n$-th order.
$[a_0, a_1, a_2, \cdots, a_n]$ represents the power series
$a_0 + a_1 t + a_2 t^2 + \cdots + a_n t^n \ (+O(t^{n+1}))$. (3)
Addition, subtraction and multiplication are done as follows:
$f([a_0, \cdots, a_n]) = f(a_0) + \sum_{i=1}^{n} \frac{1}{i!} f^{(i)}(a_0)\,[0, a_1, \cdots, a_n]^i$ (6)
Addition and multiplication in the above algorithm are done by (4) and (5).
Division is executed by combining the inverse function and multiplication ($x/y =
x \times (1/y)$).
Type-I PSA keeps the first $(n + 1)$ terms of non-truncated arithmetic. Several mathematical
software packages such as Mathematica provide such an arithmetic.
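Truncated power series addition and multiplication of the kind used in Type-I PSA can be sketched as follows. The function names `psa1_add` and `psa1_mul` are assumptions of this illustration, and the formulas are the standard coefficient-wise sum and truncated Cauchy product, which may differ in presentation from the paper's equations (4) and (5).

```python
def psa1_add(a, b):
    """Coefficient-wise sum of two order-n power series [a_0, ..., a_n]."""
    return [x + y for x, y in zip(a, b)]


def psa1_mul(a, b):
    """Cauchy product of two order-n power series, truncated after t^n."""
    n = len(a) - 1
    return [sum(a[i] * b[k - i] for i in range(k + 1)) for k in range(n + 1)]
```

For example, squaring `[1, 1]` (that is, $1 + t$ at order 1) returns `[1, 2]`: the $t^2$ term of $(1 + t)^2$ is discarded.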
Type-II PSA also treats order-$n$ power series. In Type-II PSA we must specify
its domain, such as $[0, d]$. $[a_0, \cdots, a_n]$ also denotes
$a_0 + a_1 t + a_2 t^2 + \cdots + a_n t^n$, (7)
but the coefficient of the last term $a_n$ is generally an interval, and $[a_0, \cdots, a_n]$ represents the set
of continuous functions defined on $[0, d]$ such that $f(t) \in a_0 + a_1 t + \cdots + a_n t^n$ for all $t$. This
is a kind of interval functoid introduced by Kaucher and Miranker$^{4,5}$.
Addition and subtraction are the same as in Type-I PSA. Multiplication is executed by
the following steps:
is order-2n.
ries and $n$ be $n < m$. Then order reduction of $A$ to $n$ is defined as the following
It transforms $a_i t^i$ to $a_i t^{i-n} t^n$ and replaces $t^{i-n}$ by $[0, d]^{i-n}$. This amounts to adding the higher order
remainder to the coefficient of $t^n$. Thus the result of multiplication $C$ contains all
possible results derived by multiplication without truncation.
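The order reduction and the resulting Type-II multiplication can be sketched as follows, assuming the two steps are a full (order-$2n$) product followed by order reduction to $n$. Coefficients are represented as `(lo, hi)` tuples, the helper names are assumptions of this sketch, and no outward rounding is performed, so the code illustrates the structure rather than a rigorous implementation.

```python
def iadd(x, y):
    """Interval sum of (lo, hi) tuples."""
    return (x[0] + y[0], x[1] + y[1])


def imul(x, y):
    """Interval product of (lo, hi) tuples."""
    p = (x[0] * y[0], x[0] * y[1], x[1] * y[0], x[1] * y[1])
    return (min(p), max(p))


def reduce_order(a, n, d):
    """Reduce an order-m series (m >= n) on [0, d] to order n.

    Each term a_i t^i with i > n is rewritten as a_i t^(i-n) t^n, and
    t^(i-n) is replaced by the interval [0, d]^(i-n) = [0, d^(i-n)].
    """
    head, last = list(a[:n]), a[n]
    for i in range(n + 1, len(a)):
        last = iadd(last, imul(a[i], (0.0, d ** (i - n))))
    return head + [last]


def psa2_mul(a, b, d):
    """Full product of two order-n series followed by order reduction to n."""
    n = len(a) - 1
    full = [(0.0, 0.0)] * (2 * n + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            full[i + j] = iadd(full[i + j], imul(ai, bj))
    return reduce_order(full, n, d)
```

Squaring $1 + t$ at order 1 on the domain $[0, 0.5]$ gives $1 + [2, 2.5]\,t$, which encloses $(1 + t)^2 = 1 + (2 + t)\,t$ for every $t$ in the domain.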
Functions are applied as follows:
)
+i/'" fi:^io,d]')[o,< ,---, ]". il ttn (uj
\f=0 /
Addition and multiplication in the above algorithm are done by Type-II PSA. This
algorithm uses the Lagrange remainder term.
Division is done as in Type-I PSA.
$\frac{dx(t)}{dt} = f(x(t), t)$ (12)
$x(t_s) = v$ (13)
$t \in [t_s, t_e]$. (14)
Algorithm 1 We assume that $t_e > t_s$. Let $\Delta = t_e - t_s$ and let the domain for Type-II PSA
be $[0, \Delta]$,
( M
x = (16)
and set m = 0.
(m = 0)
(m=i) (17)
[t ,l,0,--,0]
3 (m>2)
Step-5 Calculate
1
r= m a x l V - r ' - A - ' " ' ! , (18)
w k
where suffix means coefficient of t .
3 m)
Step-6 Let i f * = x\ + [-2r, 2r\ forl<i<n.
(19)
k) h)
At Step-8, Y^ = X$ for 0 < k < m - 1 always holds in this algorithm, therefore
only the last term is needed to test.
In order to solve (2) with guaranteed accuracy, we need not only exact <p(v, t„ t ) t
$\frac{dx(t)}{dt} = f(x(t), t)$ (20)
$\frac{dy(t)}{dt} = f_x(x(t), t)\,y(t)$ (21)
$x(t_s) = v$ (22)
$y(t_s) = I$ ($n \times n$ identity matrix) (23)
$t \in [t_s, t_e]$. (24)
l e [t ,t ).t t (24)
6
Above mentioned method is closely related to the automatic differentiation . By
doing algorithm in section calculating 'derivative of all numbers in the algorithm
with respect to initial value v' simultaneously, we can obtain (i, j)-element of matrix
valued function y[t) as 'derivative of Xj(t) with respect to j - t h element of v'.
5. Conclusion
In a future paper, we will present how to enclose the solutions of (2). We will also
present how to construct software for this algorithm, together with numerical examples.
Yoshihiro SAITO
Shotoku Gakuen Women's Junior College, 1-38 Nakauzura
Gifu-shi 500, Japan
E-mail: g44110g@nucc.cc.nagoya-u.ac.jp
and
Taketomo MITSUI
Graduate School of Human Informatics,
Nagoya Univ., Nagoya 464-01, Japan
ABSTRACT
Simulation for some stochastic integral processes is required in numerical solu-
tions for stochastic differential equations (SDEs). The stochastic part of the
error in simulation is considered, especially for the Wiener process $W(t)$ and the
Wiener integral process $\int_0^t s\,dW(s)$ as the basics. Several weak numerical schemes
are applied for a good approximation of statistical quantities of the solutions.
The results show that the error depends on the number of trajectories, not on
the stepsize. The way to realize basic integral processes is discussed.
1991 Mathematical Subject Classification: 65U05, 60H05, 60H10, 65L99
1. Introduction
Much literature has been discussing numerical schemes of stochastic differential
equations (SDEs) in both strong and weak senses. As a mathematical error analysis
of strong schemes, we proposed one which separates the global error into deterministic
and stochastic parts$^{10}$; however, we discussed only the former. This paper treats
the latter. To this end, some stochastic integrals are studied on their stochastic error
part along with the means of realization.
Stochastic integrals are stochastic processes appearing in the Ito-Taylor series ex-
pansion for the solution of Ito stochastic differential equation. To realize the stochastic
integral in the digital computer is significant in the simulation of SDEs. The sim-
plest example of stochastic integral processes is the standard Wiener process. We will
consider one-dimensional stochastic integrals for simplicity.
The standard Wiener process is the Gaussian process which satisfies three properties
as follows:
(i) $P(W(0) = 0) = 1$,
(ii) $E(W(t)) = 0$ for all $t \in [0, \infty)$,
(iii) $C_W(t, s) = E(W(t)W(s)) = \min(t, s)$.
Simulation of the Wiener process in the digital computer requires the following
discretization:
$W(nh) = \sum_{i=0}^{n-1} \Delta W_i$.
and $h$ is the step-size. The increment $\Delta W_i$, which obeys the normal distribution with
zero mean and variance $h$, can be simulated by
$\Delta W_i = \xi_i h^{1/2}$,
where $\xi_i$ is a normal random number with zero mean and variance 1. The set
of such normal random numbers is written as $N(0,1)$.
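The discretization can be sketched in a few lines of Python. The function name `wiener_path` is an assumption of this illustration, and Python's `random.Random.gauss` stands in for the normal pseudo-random number generator; it is not the generator used in the paper.

```python
import math
import random


def wiener_path(n, h, rng):
    """Return [W(0), W(h), ..., W(n*h)] built from increments xi_i * sqrt(h)."""
    w = [0.0]
    for _ in range(n):
        xi = rng.gauss(0.0, 1.0)          # xi_i drawn from N(0, 1)
        w.append(w[-1] + xi * math.sqrt(h))
    return w
```

Each call produces one trajectory; statistics such as the mean and variance of $W(T)$ are then estimated from many independent trajectories.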
If we carry out the simulation of integral processes with a digital computer, we will
use pseudo-random numbers instead of normal random numbers. Therefore we have
to consider the error caused by pseudo-random numbers. The order of convergence
in the Monte Carlo method used here is known to be as low as $O(1/\sqrt{N})$ ($N$ is the
number of samples). Nobody knows, however, in advance how many samples should
be chosen and how they affect the error of the solution of the SDE. In this paper we
will statistically study an estimate of the number of trajectories to achieve a certain
accuracy, and the required independency of random numbers generated in each time
step.
These results can be applied to the simulation for SDEs. For example, consider a
scalar linear autonomous SDE:
The solution (2) explicitly depends on the Wiener process $W(t)$. In general, SDEs do
not have an explicit form of solution like (2). In such cases, stochastic multiple
integrals appear in the solution expanded in the Ito-Taylor formula at $t = t_0$.
In the present paper we will carry out an error analysis of two stochastic processes,
namely the Wiener process $W(t)$ and the integral process $\int_0^t s\,dW(s)$ which are
incorporated in the Taylor scheme with global order 3. We adopt the distribution
norm which KLAUDER and PETERSEN$^3$ used to estimate weak schemes.
7
Here, let $t_0 = 0 < t_1 < \cdots < t_k < t_{k+1} < \cdots < t_n = t$, and $\Delta W_k$ and $h$ stand for the
following increments:
$\Delta W_k = W(t_{k+1}) - W(t_k)$, $h = \max_k (t_{k+1} - t_k)$,
respectively. The convergence should be taken in the mean-square sense. For example,
when $f(s) = 1$, the stochastic integral $I(f)$ expresses the Wiener process $W(t)$.
The following proposition is well known.
Proposition 1 Assume $f$ is sufficiently smooth. Then the following identities hold.
$E(I(f)) = 0$, $E\left(I(f)^2\right) = \int_0^t (f(s))^2\,ds$
We will give a method estimating the error of the integral process from its
realization. Let $Y$ and $\hat{Y}$ stand for a stochastic integral process and its realization,
respectively. We define the distribution error $e$ by the following:
$e = e_1 + e_2$,
$e_1 = |EY - E\hat{Y}|$, (4)
$e_2 = |E(Y - EY)^2 - E(\hat{Y} - E\hat{Y})^2|$.
For a fixed $t$, the error (4) means the sum of $e_1$, the difference between the means of
the random variables $Y$ and $\hat{Y}$, and $e_2$, the difference between their variances. This
estimate (4) is to be used for weak schemes$^3$.
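An empirical version of the distribution error can be sketched as follows. The function name `distribution_error` is an assumption of this illustration, and the sample variance is normalized by the number of samples, which is also an assumption since the exact normalization is not legible in this text.

```python
def distribution_error(samples, exact_mean, exact_var):
    """Empirical distribution error e = e1 + e2 of a set of realizations.

    e1 compares the exact mean with the sample mean, and e2 compares the
    exact variance with the sample variance (normalized by len(samples)).
    """
    count = len(samples)
    mean = sum(samples) / count
    var = sum((y - mean) ** 2 for y in samples) / count
    e1 = abs(exact_mean - mean)
    e2 = abs(exact_var - var)
    return e1 + e2
```

For the Wiener process at time $T$, the exact mean and variance are 0 and $T$, so realizations of $W(T)$ would be checked with `distribution_error(samples, 0.0, T)`.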
The method of error estimation could be considered in the following way, too.
The Wiener process is the solution of the following simple SDE.
dY{t) = dW{t), Y(0) = 0. (5)
That is, the distribution error of the Wiener process is interpreted as the difference
between the numerical and the exact solutions of the SDE (5). Similarly, the SDE for the
stochastic integral process $I^{(0,1)}(t) = \int_0^t s\,dW(s)$ is the following:
Here the increment AZi_ stands for the stochastic integral - U)dW(s). The
increments AWj and AZ; are replaced by the following expressions:
z
i=D i=a vi
| 0 , 1 1
is possible for AZ,. Then the simulation of the stochastic process / is carried out
with
= E l&i + ItejA*
11
.=0
6
This simulation corresponds to the asymptotically efficient scheme proposed by Newton
4,5
for the SDE (6). Also the increment At7 approximating A ^ can be used for nu-
;
1=0
We will estimate the 90% confidence interval of the distribution error of the discretized
processes for $W(t)$ and $I^{(0,1)}(t)$. To obtain 100 samples for $e$, $L$ trajectories
are generated for each sample. We call the set of trajectories for one sample a block,
and each block are simulated independently. Then Y f means the sample of j - t h t
^ f=l ^ 1=1
2 2
S = \EY - M f l + | E K
} T T - ( E Y j - ) - ff + $Mff\.
A sample of the error $e$ at $T = Nh$ is $S_j$. Since $S_j$ is known to approximately obey
the normal distribution for a large number of blocks due to the Central Limit Theorem,
we can evaluate the mean of the error $e$ by using
1 100 , 100
According to statistical theory, the 90% confidence interval under the assumption of
Student's t-distribution is given by
$(S - \Delta S \times 0.166,\ S + \Delta S \times 0.166)$.
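The factor 0.166 is consistent with the two-sided 90% Student t quantile for 99 degrees of freedom (about 1.660) divided by $\sqrt{100}$. A minimal sketch of the interval computation follows; the function name is an assumption of this illustration, and the use of the unbiased sample standard deviation (normalized by $n - 1$) for $\Delta S$ is also an assumption, since the defining formula is not legible in this text.

```python
import math


def confidence_interval_90(block_errors):
    """90% confidence interval for the mean of 100 block samples S_j.

    Uses the quantile 1.660 of Student's t-distribution with 99 degrees of
    freedom, so it is only appropriate for len(block_errors) == 100.
    """
    n = len(block_errors)
    mean = sum(block_errors) / n
    # Unbiased sample standard deviation (assumed definition of Delta S).
    std = math.sqrt(sum((s - mean) ** 2 for s in block_errors) / (n - 1))
    half_width = 1.660 * std / math.sqrt(n)   # equals 0.166 * std when n == 100
    return mean - half_width, mean + half_width
```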
4. Results of simulation
We calculate the errors at $t$ = 0.5, 1.0, 1.5 and 2.0 with the stepsizes $h = 2^{-4}$, $2^{-5}$
and $2^{-6}$. The number of trajectories $L$ in each block is taken as 100, 1000 and 10000.
The computer used in the simulation is a Macintosh SE/30, and the program RNORQ
by Kahaner et al.$^2$ is applied as the pseudo-random number generator.
The simulation results for the Wiener process are given in Figures 1 to 3 according
to the stepsizes $2^{-4}$, $2^{-5}$ and $2^{-6}$, respectively, while those for the integral process
of type 1, 2 and 3 are shown in Figures 4 to 6, respectively. The latter cases are,
however, shown for only the stepsize $2^{-4}$, because the results with other stepsizes are
almost the same as with the stepsize $2^{-4}$. In these Figures the marks + , o and o stand
for the value S at L = 100, 1000 and 10000, respectively, while the mark - stands for the
upper and lower bounds of the confidence interval.
From these simulation results we can conclude as follows.
(i) For the Wiener process the magnitude of the error depends on the number of
trajectories, not on the stepsize.
(ii) For the stochastic integral process $I^{(0,1)}(t)$, as for the Wiener process, the error
depends on the number of trajectories. The rate of growth of the error versus
$t$ is, however, bigger than that of the Wiener process.
(iii) For $I^{(0,1)}$ the simulations $\hat{I}^{(0,1)}$ and $\hat{J}^{(0,1)}$ cannot be distinguished from each other.
However, the simulation $\hat{K}^{(0,1)}$ is considerably different from the other two.
We will give a more detailed discussion on the above item (iii).
With the stepsize $h = 2^{-4}$ and the same samples of $\xi_i$, the values of $S$ of $I$, $J$
and $K$ for $L$ = 100, 1000 and 10000 are shown in Tables 1-3.
Table 1. $S$ for integral processes ($h = 2^{-4}$), $L = 100$

  t     I^{(0,1)}         J^{(0,1)}         K^{(0,1)}
  0.5   1.98 x 10^{-2}    1.99 x 10^{-2}    2.20 x 10^{-2}
  1.0   8.01 x 10^{-2}    8.05 x 10^{-2}    8.41 x 10^{-2}
  1.5   1.87 x 10^{-1}    1.86 x 10^{-1}    2.01 x 10^{-1}
  2.0   4.30 x 10^{-1}    4.29 x 10^{-1}    4.21 x 10^{-1}
Table 2. $S$ for integral processes ($h = 2^{-4}$), $L = 1000$

  t     I^{(0,1)}         J^{(0,1)}         K^{(0,1)}
  0.5   6.68 x 10^{-3}    6.72 x 10^{-3}    1.21 x 10^{-2}
  1.0   2.72 x 10^{-2}    2.73 x 10^{-2}    4.38 x 10^{-2}
  1.5   7.01 x 10^{-2}    7.01 x 10^{-2}    1.02 x 10^{-1}
  2.0   1.27 x 10^{-1}    1.27 x 10^{-1}    1.85 x 10^{-1}
Table 3. $S$ for integral processes ($h = 2^{-4}$), $L = 10000$

  t     I^{(0,1)}         J^{(0,1)}         K^{(0,1)}
  0.5   2.28 x 10^{-3}    2.31 x 10^{-3}    9.12 x 10^{-3}
  1.0   8.80 x 10^{-3}    8.79 x 10^{-3}    3.52 x 10^{-2}
  1.5   2.15 x 10^{-2}    2.17 x 10^{-2}    7.94 x 10^{-2}
  2.0   4.31 x 10^{-2}    4.30 x 10^{-2}    1.29 x 10^{-1}
Differences are not observed between $I^{(0,1)}$ and $J^{(0,1)}$. The theory in
statistics indicates that the increase of the number of trajectories yields $0$,
$nh^3/12$ and $|n^2h^3/2 - nh^3/6|$ as the limit of the distribution error $e$ of
$I^{(0,1)}$, $J^{(0,1)}$ and $K^{(0,1)}$, respectively. The the-

  t     nh^3/12           |n^2 h^3/2 - n h^3/6|
  1.5   4.88 x 10^{-4}    6.93 x 10^{-2}
  2.0   6.51 x 10^{-4}    1.24 x 10^{-1}
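The limits $nh^3/12$ and $|n^2h^3/2 - nh^3/6|$ can be evaluated directly for $h = 2^{-4}$. The sketch below does so in exact rational arithmetic. It also checks an observation that is an assumption of this example and is not stated in this section: the two expressions coincide with the second-moment defect, relative to $\int_0^T s^2\,ds = T^3/3$, of a midpoint sum $\sum ((k + 1/2)h)\,\Delta W_k$ and of a left-endpoint sum $\sum (kh)\,\Delta W_k$, respectively. The function names are hypothetical.

```python
from fractions import Fraction


def limit_errors(t, h):
    """Return (n h^3 / 12, |n^2 h^3 / 2 - n h^3 / 6|) with n = t / h."""
    n = round(t / h)
    h3 = Fraction(h) ** 3
    return n * h3 / 12, abs(n * n * h3 / 2 - n * h3 / 6)


def second_moment_defects(t, h):
    """Second-moment defect of the midpoint and left-endpoint sums (assumed schemes)."""
    n = round(t / h)
    hf = Fraction(h)
    exact = (n * hf) ** 3 / 3  # variance of the Wiener integral of s over [0, t]
    midpoint = sum(((k + Fraction(1, 2)) * hf) ** 2 * hf for k in range(n))
    left = sum((k * hf) ** 2 * hf for k in range(n))
    return abs(exact - midpoint), abs(exact - left)


for t in (0.5, 1.0, 1.5, 2.0):
    j_limit, k_limit = limit_errors(t, 2 ** -4)
    print(f"{t:3.1f}  {float(j_limit):.2e}  {float(k_limit):.2e}")
```

The stepsize must be exactly representable in binary floating point (as powers of two are) for `Fraction(h)` to be exact.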
The stochastic processes investigated here are restricted to the most basic ones, the
Wiener process $W(t)$ and the Wiener integral process $I^{(0,1)} = \int_0^t s\,dW(s)$.
However, Section 4 suggests that the error in the Wiener process is the main part of
the error in the numerical solution of SDEs. Thus it is natural to expect that the
other integral processes share the property of the Wiener process with respect to the
number of trajectories. We will also try to analyse stochastic integral processes
with an arbitrary function $f(s)$ as integrand.
We have shown that the error in the simulation of stochastic integral processes
depends on the number of trajectories. Thus the error can be made small by increasing
the number of trajectories. This implies, however, that we have to generate
considerably many trajectories to achieve a desired accuracy.
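The cost of a desired accuracy can be estimated under the usual Monte Carlo assumption that the statistical error scales like $1/\sqrt{L}$, which is an assumption of this sketch. The values below are those of $I^{(0,1)}$ as read from Tables 1 to 3 ($h = 2^{-4}$, $t = 0.5, 1.0, 1.5, 2.0$); the function names are hypothetical. Each tenfold increase of $L$ should then reduce $S$ by about $\sqrt{10} \approx 3.16$.

```python
import math

# S of I^{(0,1)} at t = 0.5, 1.0, 1.5, 2.0, as read from Tables 1 to 3.
S_I = {
    100: [1.98e-2, 8.01e-2, 1.87e-1, 4.30e-1],
    1000: [6.68e-3, 2.72e-2, 7.01e-2, 1.27e-1],
    10000: [2.28e-3, 8.80e-3, 2.15e-2, 4.31e-2],
}


def observed_ratios(coarse_L, fine_L):
    """Ratio S(coarse_L) / S(fine_L) at each tabulated time."""
    return [a / b for a, b in zip(S_I[coarse_L], S_I[fine_L])]


def trajectories_needed(ref_L, ref_error, target_error):
    """Extrapolate L for a target error, assuming error proportional to 1/sqrt(L)."""
    return math.ceil(ref_L * (ref_error / target_error) ** 2)


print(observed_ratios(100, 1000))
print(observed_ratios(1000, 10000))
print(trajectories_needed(10000, S_I[10000][0], 1.0e-3))
```

Under this assumption, halving the error requires four times as many trajectories, which is why the trajectory count grows quickly with the accuracy demanded.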
Furthermore, we adopted only a single pair of the starting value and the seed
for the pseudo-random number generator. Actual simulations should be carried out
with, say, 100 independent blocks of samples, which require multiple pairs for the
generator. Thus, together with the requirement of considerably many trajectories,
implementation on a parallel computer is recommended. The effect of multiple
pairs of the starting value and the seed should be examined carefully.
Simulations for multi-dimensional integral processes can be treated in the same way
as the 1-dimensional case. Extrapolating from the latter, the required number of
trajectories is predicted to be far larger.
6. References
1. Arnold, L., Stochastic Differential Equations, Wiley, New York, 1974.
2. Kahaner, D., Moler, C., and Nash, S., Numerical Methods and Software, Prentice
Hall Inc., Englewood Cliffs, 1989.
3. Klauder, J.R., and Petersen, W.P., Numerical integration of multiplicative-noise
stochastic differential equations, SIAM J. Numer. Anal., 22(1985), 1153-1166.
4. Kloeden, P.E., and Platen, E., The Numerical Solution of Stochastic Differential
Equations, Springer, Berlin, 1992.
5. Kloeden, P.E., and Platen, E., A survey of numerical methods for stochastic
differential equations, J. Stoch. Hydrol. Hydraulics, 3(1989), 155-178.
6. Newton, N.J., Asymptotically efficient Runge-Kutta methods for a class of Ito and
Stratonovich equations, SIAM J. Appl. Math., 51(1991), 542-567.
7. Pardoux, E., and Talay, D., Discretization and simulation of stochastic differential
equations, Acta Appl. Math., 3(1985), 23-47.
8. Rümelin, W., Numerical treatment of stochastic differential equations, SIAM J.
Numer. Anal., 19(1982), 604-613.
9. Saito, Y., and Mitsui, T., Discrete approximations for stochastic differential
equations, Trans. Japan SIAM, 2(1992), 1-16 (in Japanese).
10. Saito, Y., and Mitsui, T., Simulation of stochastic differential equations, Ann.
Inst. Statis. Math., 45(1993), 419-432.
11. Saito, Y., and Mitsui, T., Stochastic Part of Error in Numerical Schemes for
Stochastic Differential Equations, Trans. Japan SIAM, 4(1994), 127-139 (in
Japanese).
12. Talay, D., Simulation and numerical analysis of stochastic differential systems,
INRIA Report 1313, 1990.
Figure 1: Confidence intervals of the Wiener process ($h = 2^{-4}$).
Figure 2: Confidence intervals of the Wiener process ($h = 2^{-5}$).
Figure 3: Confidence intervals of the Wiener process ($h = 2^{-6}$).
Figure 4: Confidence intervals of the integral process $I$ ($h = 2^{-4}$).
Figure 5: Confidence intervals of the integral process $J$ ($h = 2^{-4}$).
Figure 6: Confidence intervals of the integral process $K$ ($h = 2^{-4}$).