

Publ. Astron. Soc. Japan (2019) 71 (1), 18 (1–13)


doi: 10.1093/pasj/psy137
Advance Access Publication Date: 2018 December 12

Hermite integrator for high-order mesh-free schemes

Downloaded from https://academic.oup.com/pasj/article/71/1/18/5250071 by Institute of Space Technology user on 26 June 2023


Satoko YAMAMOTO (1,2,3,*) and Junichiro MAKINO (1,2,3)
(1) Department of Earth & Planetary Sciences, Tokyo Institute of Technology, 2-12-1 Ookayama, Meguro, Tokyo 152-8550, Japan
(2) RIKEN Advanced Institute for Computational Science, 2-2-3 Minatojima-minamimachi, Chuo-ku, Kobe, Hyogo 650-0047, Japan
(3) Department of Planetology, Kobe University, 1-1 Rokkodaicho, Nada-ku, Kobe, Hyogo 650-0013, Japan

(*) E-mail: satoko.yamamoto@riken.jp
Received 2017 October 11; Accepted 2018 November 10

Abstract
In most mesh-free methods, the calculation of interactions between sample points or
“particles” is the most time-consuming. When we use mesh-free methods with high
spatial orders, the order of the time integration should also be high. If we use usual
Runge–Kutta schemes, we need to perform the interaction calculation multiple times per
time step. One way to reduce the number of interaction calculations is to use Hermite
schemes, which use the time derivatives of the right-hand side of differential equations,
since Hermite schemes require a smaller number of interaction calculations than Runge–
Kutta schemes do to achieve the same order. In this paper, we construct a Hermite scheme
for a mesh-free method with high spatial orders. We performed several numerical tests
with fourth-order Hermite schemes and Runge–Kutta schemes. We found that, for both
Hermite and Runge–Kutta schemes, the overall error is determined by the error of spatial
derivatives, for time steps smaller than the stability limit. The calculation cost at the time-
step size of the stability limit is smaller for Hermite schemes. Therefore, we conclude
that Hermite schemes are more efficient than Runge–Kutta schemes and thus useful for
high-order mesh-free methods for Lagrangian hydrodynamics.
Key words: hydrodynamics — galaxies: formation — methods: numerical — planets and satellites: formation

1 Introduction

Lagrangian mesh-free methods, in which particles move following the motion of fluid, have been widely used for astrophysical hydrodynamical simulations. In most mesh-free methods, the calculation of interactions between particles is the most time-consuming part. Typically, one particle interacts with ∼100 neighbor particles, and thus the cost of interaction calculations dominates the total calculation cost. One way to reduce the number of interaction calculations is to use Hermite schemes, which use the time derivatives of the right-hand side of differential equations, since Hermite schemes require a smaller number of interaction calculations than Runge–Kutta schemes do to achieve the same order.

In the field of stellar dynamics, the fourth-order Hermite scheme (Makino 1991; Makino & Aarseth 1992) is widely used for high-order integration. The basic idea of the Hermite scheme is to calculate the time derivative of gravitational acceleration directly, and use it to construct a high-order interpolation polynomial. If we calculate up to a pth-order time-derivative directly, we can achieve the

© The Author(s) 2018. Published by Oxford University Press on behalf of the Astronomical Society of Japan. All rights reserved.

For permissions, please e-mail: journals.permissions@oup.com


18-2 Publications of the Astronomical Society of Japan (2019), Vol. 71, No. 1

order of s(p + 1) when we use the s-step linear multi-step method, and in the case of s = 2, we can achieve the order of 2(p + 1). The two-step linear multi-step method can be formulated so that it requires only one force evaluation per time step. In the case of a grid-based scheme for hydrodynamics, Aoki (1997) described a method based purely on the Taylor expansion, which achieves the order p + 1.

In this paper, we combine the Hermite scheme with Consistent Particle Hydrodynamics in Strong Form (CPHSF: Yamamoto & Makino 2017), which is one of the high-order mesh-free methods. One disadvantage of the Hermite scheme is that, even though it requires a smaller number of interaction calculations, the calculation cost of one interaction is higher because we need to calculate high-order derivatives. In the case of CPHSF or other moving least squares-based interpolation, the high-order interpolation polynomial gives spatial derivatives, and we only need to convert spatial derivatives to time derivatives using the original differential equations. Thus, the increase of the calculation cost is small and independent of the number of neighbors.

We performed several numerical tests. Fourth-order Hermite schemes and second- and fourth-order Runge–Kutta schemes are used for the test with a periodic boundary, and an implicit Hermite scheme, an implicit fourth-order Runge–Kutta scheme and the backward-Euler scheme are used for the test with boundary conditions. We found that, for both the Hermite and Runge–Kutta schemes, the overall error is determined by the error of spatial derivatives, for time steps smaller than the stability limit. The calculation cost at the time-step size of the stability limit is smaller for Hermite schemes. Therefore, we conclude that Hermite schemes are more efficient than Runge–Kutta schemes and thus useful for high-order mesh-free methods for Lagrangian hydrodynamics.

In the rest of this paper, we first present the formulation of the Hermite scheme for CPHSF in section 2, and report the results of numerical tests in section 3. We summarize our study in section 4.

2 Derivation of the high-order scheme

In this section, we present the derivation of the fourth-order Hermite schemes for CPHSF.

2.1 Hermite scheme

In this section we present the formulation of the fourth-order Hermite schemes (Makino 1991; Makino & Aarseth 1992). Consider a second-order differential equation,

$$\frac{dx}{dt} = v, \tag{1}$$

$$\frac{dv}{dt} = a(x). \tag{2}$$

Here, $x$ and $v$ denote the position and velocity of one particle. The fourth-order Hermite scheme is derived as follows. The predictor at time $t_n$ is given by

$$x_p = x_n + v_n \Delta t + \frac{a_n}{2}\Delta t^2 + \frac{j_n}{6}\Delta t^3, \tag{3}$$

$$v_p = v_n + a_n \Delta t + \frac{j_n}{2}\Delta t^2, \tag{4}$$

where $x_p$ and $v_p$ are the predicted position and velocity at the new time, $t_{n+1} = t_n + \Delta t$, $x_n$ and $v_n$ are the position and velocity at time $t_n$, and $a_n$ and $j_n$ are the acceleration and jerk (first time-derivative of acceleration) at time $t_n$. Using $x_p$ and $v_p$, we can now calculate the acceleration and jerk, $a_{n+1}$ and $j_{n+1}$, at time $t_{n+1}$. Using $a_n$, $j_n$, $a_{n+1}$, and $j_{n+1}$, we can construct the third-order Hermite interpolation polynomial for $a(t)$ as

$$a(t) = a_n + j_n (t - t_n) + \frac{s_n}{2}(t - t_n)^2 + \frac{c_n}{6}(t - t_n)^3, \tag{5}$$

where $s_n$ and $c_n$ are given by

$$s_n = \frac{-6(a_n - a_{n+1}) - \Delta t\,(4 j_n + 2 j_{n+1})}{\Delta t^2}, \tag{6}$$

$$c_n = \frac{12(a_n - a_{n+1}) + 6\Delta t\,(j_n + j_{n+1})}{\Delta t^3}. \tag{7}$$

We integrate equation (5) from $t_n$ to $t_{n+1}$ and obtain correctors given by

$$x_c = x_p + \frac{s_n}{24}\Delta t^4 + \frac{c_n}{120}\Delta t^5, \tag{8}$$

$$v_c = v_p + \frac{s_n}{6}\Delta t^3 + \frac{c_n}{24}\Delta t^4. \tag{9}$$

If we set $x_{n+1} = x_c$ and $v_{n+1} = v_c$ at this point, that means we use the PEC (predict–evaluate–correct) form of the linear multi-step method. We can also use PECE or P(EC)$^2$ forms.

2.2 Derivation of high-order time-derivatives for hydrodynamical equations

In this section, we describe how we calculate high-order time-derivatives for hydrodynamics equations in the Lagrangian view. Our approach is essentially the same as that of Aoki (1997), who derived higher-order time-derivatives for the Eulerian view. Aoki (1997) considered the following equation:

$$\frac{\partial}{\partial t} f = \xi_x f, \tag{10}$$

where $\xi_x$ is some linear operator. By taking time derivatives of both sides of equation (10), they derived a series of equations;
$$\frac{\partial^2}{\partial t^2} f = \xi_x \xi_x f, \tag{11}$$

$$\frac{\partial^3}{\partial t^3} f = \xi_x \xi_x \xi_x f, \tag{12}$$

and so on. In this paper, we consider the equation

$$\frac{d}{dt} f = \xi_x f, \tag{13}$$

where $d/dt$ is the Lagrangian derivative,

$$\frac{d}{dt} = \frac{\partial}{\partial t} + v \cdot \nabla. \tag{14}$$

The original set of partial differential equations of a Lagrangian formulation of hydrodynamics is given by

$$\frac{d\rho}{dt} = -\rho \nabla \cdot v, \tag{15}$$

$$\frac{dv}{dt} = -\frac{\nabla P}{\rho}, \tag{16}$$

$$\frac{du}{dt} = -\frac{P}{\rho} \nabla \cdot v, \tag{17}$$

$$P = P(\rho, u). \tag{18}$$

Here, we rewrite $(d/dt)(\nabla)$ as

$$\frac{d}{dt}\nabla = \nabla \frac{d}{dt} - \Gamma. \tag{19}$$

The operator $\Gamma$ is defined as

$$\Gamma_\alpha = (\nabla_\alpha v_\beta)(\nabla_\beta), \tag{20}$$

where $\alpha$ and $\beta$ are indices of dimensions, and

$$\nabla_\alpha = \frac{\partial}{\partial x_\alpha}, \tag{21}$$

where $\alpha$ = 1, 2, and 3, and $x = (x_1, x_2, x_3) = (x, y, z)$. The index $\beta$ is summed over. Second time-derivatives of $\rho$, $v$, and $u$ are then expressed as

$$\frac{d^2\rho}{dt^2} = \rho(\nabla \cdot v)^2 + \rho\,\Gamma \cdot v + \Delta P - \frac{\nabla\rho \cdot \nabla P}{\rho}, \tag{22}$$

$$\frac{d^2 v}{dt^2} = \frac{1}{\rho}\nabla\left[\hat{P}(\nabla \cdot v)\right] + \frac{\Gamma P}{\rho} - \frac{(\nabla \cdot v)(\nabla P)}{\rho}, \tag{23}$$

$$\frac{d^2 u}{dt^2} = \frac{\hat{P} - P}{\rho}(\nabla \cdot v)^2 + \frac{P \Delta P}{\rho^2} - \frac{P(\nabla P)\cdot(\nabla\rho)}{\rho^3} + \frac{P\,\Gamma \cdot v}{\rho}, \tag{24}$$

where $\hat{P}$ is defined as

$$\hat{P} \equiv \frac{P}{\rho}\frac{\partial P}{\partial u} + \rho\frac{\partial P}{\partial \rho}. \tag{25}$$

For the equation of state for ideal gas used in subsection 3.1,

$$P = (\gamma - 1)\rho u, \tag{26}$$

where $\gamma$ is the ratio of specific heat, and $\hat{P}$ is given by

$$\hat{P} = \gamma P. \tag{27}$$

For the equation of state for weakly compressible fluid used in subsection 3.2,

$$P = c_0^2(\rho - \rho_{\rm air}) + P_{\rm air}, \tag{28}$$

where $\rho_{\rm air}$, $P_{\rm air}$, $g$, $H$, and $c_0$ are air density, air pressure, gravity, height of fluid, and sound velocity, respectively. We set

$$c_0 = \sqrt{gH}. \tag{29}$$

The parameter $\hat{P}$ is given by

$$\hat{P} = c_0^2 \rho. \tag{30}$$

In this paper, we apply artificial viscosity of the same form as that in Yamamoto and Makino (2017). Note that we do not calculate the contribution of the artificial viscosity to the second time-derivatives since artificial viscosity is not differentiable. Therefore, the artificial viscosity for the PEC and P(EC)$^\infty$ forms of Hermite schemes is integrated with Heun's scheme and the trapezoidal scheme, respectively. We calculate artificial viscosity as follows.

$$\frac{dv}{dt} = -\frac{\nabla q}{\rho}, \tag{31}$$

$$\frac{du}{dt} = -\frac{q}{\rho}\nabla \cdot v, \tag{32}$$

$$q = -\left(\frac{\left|\sum_m \lambda_m\right|}{\sum_m |\lambda_m|}\right)^2 \zeta \left[\alpha_{\rm AV}\,\rho c_s h_{\rm AV} + \beta_{\rm AV}\,\rho h_{\rm AV}^2 |\lambda_{m_{\rm max}}|\right] \lambda_{m_{\rm max}}\,\Theta(-\nabla \cdot v), \tag{33}$$

where $\alpha_{\rm AV}$, $\beta_{\rm AV}$, and $h_{\rm AV}$ are coefficients, and $c_s$ and $\zeta$ are the sound velocity and a parameter which controls the overall strength of AV. In this paper, we set $\alpha_{\rm AV} = 1$ and $\beta_{\rm AV} = 2$. The parameters $\lambda_m$ are the eigenvalues of the strain rate tensor $s$ defined as

$$s_{\alpha,\beta} = \frac{1}{2}\left(\frac{\partial v_\alpha}{\partial x_\beta} + \frac{\partial v_\beta}{\partial x_\alpha}\right). \tag{34}$$

The parameter $\lambda_{m_{\rm max}}$ is the negative eigenvalue with the maximum absolute value. If all eigenvalues are non-negative, $q = 0$. In this paper, we use the time-independent coefficient $\zeta$. We set $\zeta = 1$.

2.3 Calculation cost for high-order time-derivatives

For the fourth-order Hermite time-integrations, we must derive second spatial order derivatives of physical quantities to calculate jerk, snap, and crackle. However, if we use spatial high-order mesh-free methods (e.g., CPHSF), the additional number of arithmetic operations of jerk, snap, and crackle is much smaller than the original number of calculations for the spatial high-order mesh-free method.
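The predictor and corrector of subsection 2.1 can be sketched in a few lines. The following is a minimal illustration, not the code used in this paper: the function names `hermite_step`, `oscillator`, and `error_at` are hypothetical, and the test problem (a harmonic oscillator with a = −x and jerk j = −v) is an assumption chosen only because its exact solution is known.

```python
import math

def hermite_step(x, v, dt, acc_jerk, n_iter=1):
    """One fourth-order Hermite step following equations (3)-(9).

    acc_jerk(x, v) must return (acceleration, jerk).
    n_iter=1 gives the PEC form, n_iter=2 the P(EC)^2 form.
    """
    a0, j0 = acc_jerk(x, v)
    # Predictor, equations (3) and (4)
    xp = x + v * dt + a0 * dt**2 / 2 + j0 * dt**3 / 6
    vp = v + a0 * dt + j0 * dt**2 / 2
    xc, vc = xp, vp
    for _ in range(n_iter):
        # Evaluate acceleration and jerk at the latest estimate
        a1, j1 = acc_jerk(xc, vc)
        s = (-6 * (a0 - a1) - dt * (4 * j0 + 2 * j1)) / dt**2  # eq. (6)
        c = (12 * (a0 - a1) + 6 * dt * (j0 + j1)) / dt**3      # eq. (7)
        # Corrector, equations (8) and (9), always built on the predictor
        xc = xp + s * dt**4 / 24 + c * dt**5 / 120
        vc = vp + s * dt**3 / 6 + c * dt**4 / 24
    return xc, vc

def oscillator(x, v):
    # Hypothetical test problem: a = -x, hence jerk = da/dt = -v
    return -x, -v

def error_at(t_end, n_steps):
    """Position error against the exact solution x(t) = cos(t)."""
    x, v = 1.0, 0.0
    dt = t_end / n_steps
    for _ in range(n_steps):
        x, v = hermite_step(x, v, dt, oscillator)
    return abs(x - math.cos(t_end))

# Halving the time step should reduce the error by roughly 2**4
ratio = error_at(1.0, 20) / error_at(1.0, 40)
```

Only one evaluation of the acceleration and jerk at the new point is needed per step in the PEC form, which is the property the cost comparison in this subsection relies on.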
In this section, we compare the original number of arithmetic operations and the additional number of the operations necessary for the Hermite scheme. First, we show how to derive the spatial high-order derivatives of a physical quantity $f$. Secondly, the original number of arithmetic operations of CPHSF is derived. We call this value $N_{\rm op}$. Note that we assume that $N_{\rm op}$ comprises only the number of operations for the evaluation of the inverse matrix of $B_i$ in equation (35) and interaction calculation between particles since these dominate the total calculation cost of CPHSF. Thirdly, the additional number of arithmetic operations for jerk, snap, and crackle is derived. We call this value $N_{\rm add}$. Finally, we compare $N_{\rm op}$ and $N_{\rm add}$. To obtain the number of arithmetic operations, we calculate the number of floating-point operations per particle of CPHSF. If a quantity has been derived, we assume that it will not be unnecessarily recalculated. We assume that the numbers of floating-point operations required to evaluate division and square root are both 20.

First, we show how to derive the spatial high-order derivatives of $f$. In CPHSF, the $m$th spatial order derivative of $f$ is given by the following equations:

$$\delta_m f = \sum_\alpha \left[B_i^{-1}\right]_{m\alpha} \sum_j f_j\, p_{\alpha,ij} W_{ij}, \tag{35}$$

$$\delta = \left(1, \nabla_x, \nabla_y, \nabla_z, \frac{1}{2}\nabla_x^2, \nabla_x\nabla_y, \ldots, \nabla_y\nabla_z^{n_p-1}, \nabla_z^{n_p}\right)^T, \tag{36}$$

$$p_{ij} = \left(1, x_{ij}, y_{ij}, z_{ij}, x_{ij}^2, x_{ij}y_{ij}, \ldots, y_{ij}z_{ij}^{n_p-1}, z_{ij}^{n_p}\right)^T, \tag{37}$$

$$B_i = \sum_j W_{ij}\, p_{ij} \otimes p_{ij}, \tag{38}$$

where $i$ and $j$ are indices of particles, $m$ and $\alpha$ are integers, $n_p$ and $W_{ij}$ are the spatial order of the scheme and a kernel function, and $x_{ij}$, $y_{ij}$, and $z_{ij}$ are $x_j - x_i$, $y_j - y_i$, and $z_j - z_i$.

In CPHSF, the total number of floating-point operations per particle is given by

$$N_{\rm op} = N_{\rm int} N_{\rm nb} + N_{\rm inv}, \tag{39}$$

where $N_{\rm nb}$ is the number of neighbor particles, and $N_{\rm int}$ and $N_{\rm inv}$ are the numbers of floating-point operations for the interaction calculation between particles and the evaluation of the inverse matrix of $B_i$ in equation (35). The number of floating-point operations for interaction calculation is given by

$$N_{\rm int} = N_{\rm dist} + N_{\rm kernel} + N_{\rm sf}, \tag{40}$$

where $N_{\rm dist}$ and $N_{\rm kernel}$ are the numbers of floating-point operations necessary to evaluate the relative distance and the kernel function. The last term, $N_{\rm sf}$, represents the number of floating-point operations for the CPHSF fitting. In CPHSF, first of all, we evaluate only $|x_{ij}|/h_i$, where $x_{ij}$ is the displacement of particles $i$ and $j$ and $h_i$ is the kernel length of particle $i$, to search neighbor particles of particle $i$, and $N_{\rm dist}$ is 22, 45, and 48 for one, two, and three dimensions. Then, we evaluate elements of $B_i$ given by equation (38), the polynomial given by equation (37), and the kernel function $W_{ij}$, to calculate equation (35). One interaction calculation between particle $i$ and particle $j$ in $[B_i]_{\alpha\beta}$ is given by $\{[p_{ij}]_\alpha [p_{ij}]_\beta W_{ij}\}$. The number of combinations of $[p_{ij}]_\alpha [p_{ij}]_\beta$ is $n(2n_p, D)$. The parameter $n(n_p, D)$ is the number of bases of a polynomial fitting in equation (35), where $D$ is the number of dimensions, and the value of $n(n_p, D)$ is given by

$$n(n_p, D) = \frac{1}{D!}\prod_{m=0}^{D-1}(n_p + m). \tag{41}$$

For example, if we consider the one-dimensional case, $\{[p_{ij}]_\alpha [p_{ij}]_\beta W_{ij}\}$ is given by $x_{ij}^\alpha x_{ij}^\beta W_{ij}$ and thus $[B_i]_{\alpha_1\beta_1}$ is the same as $[B_i]_{\alpha_2\beta_2}$ with $(\alpha_1 + \beta_1) = (\alpha_2 + \beta_2)$. Therefore, the number of the terms of the form of $\{[p_{ij}]_\alpha [p_{ij}]_\beta W_{ij}\}$ is $n(2n_p, D)$. Since we assume that a quantity which has been derived will not be unnecessarily recalculated, the number of floating-point operations for the evaluation of $\{[p_{ij}]_\alpha [p_{ij}]_\beta W_{ij}\}$ except for $\{[p_{ij}]_0 [p_{ij}]_0 W_{ij}\}$ is 1. For example, if we consider the one-dimensional case, we can get $x_{ij}^m W_{ij}$ by multiplying $x_{ij}^{m-1} W_{ij}$ by $x_{ij}$ and thus the number of floating-point operations is only 1 for the evaluation of $x_{ij}^m W_{ij}$. In addition, the number of floating-point operations for summing each term $\{[p_{ij}]_\alpha [p_{ij}]_\beta W_{ij}\}$ with respect to $j$ is 1. Therefore, the total number of floating-point operations for one interaction calculation in $B_i$ is $2n(2n_p, D) - 1$. One interaction calculation between particle $i$ and particle $j$ in the calculation of equation (35) is given by $W_{ij} f_j p_{ij}$. The number of the terms of the form of $W_{ij} f_j [p_{ij}]_\alpha$ is $n(n_p, D)$. We have density (pressure), energy, and velocity, and thus the number of physical quantities is $(D + 2)$. Therefore, the total number of floating-point operations for one interaction calculation in the $m$th derivatives of density (pressure), energy, and velocity given by equation (35) is $2(D + 2)n(n_p, D)$. Therefore, the number of floating-point operations for the CPHSF fitting is given by

$$N_{\rm sf}(n_p, D) = 2n(2n_p, D) + 2(D + 2)n(n_p, D) - 1. \tag{42}$$

The numbers of floating-point operations necessary for the evaluation of the kernel function, $N_{\rm kernel}$, are 33, 35, and 36 for one, two, and three dimensions, respectively. From the above, the total numbers of floating-point operations for the calculation of equation (35) are $[33 + N_{\rm sf}(n_p, 1)]$, $[35 + N_{\rm sf}(n_p, 2)]$, and $[36 + N_{\rm sf}(n_p, 3)]$ for one, two, and three dimensions.

From the above, the total numbers of floating-point operations for one interaction calculation of CPHSF, $N_{\rm int}$, are

$$N_{\rm int} \simeq [55 + N_{\rm sf}(n_p, 1)], \tag{43}$$

Table 1. Numbers of floating-point operations for one interaction calculation of CPHSF.

Process    D = 1                 D = 2                 D = 3
Ndist      22                    45                    48
Nkernel    33                    35                    36
Nsf        Nsf(np, 1)            Nsf(np, 2)            Nsf(np, 3)
Total      [55 + Nsf(np, 1)]     [80 + Nsf(np, 2)]     [84 + Nsf(np, 3)]
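The operation counts of this subsection (equations 39 to 48, Table 1, and the expression for Nadd) are simple enough to tabulate with a short script. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, and the symbol DH2 in the expression for Nadd is read as the number of combinations with repetition, i.e. the D(D + 1)/2 distinct second derivatives in D dimensions.

```python
from math import comb, factorial

# Values quoted in the text and in Table 1
N_DIST = {1: 22, 2: 45, 3: 48}
N_KERNEL = {1: 33, 2: 35, 3: 36}

def n_basis(n_p, D):
    """Equation (41): (1/D!) * prod_{m=0}^{D-1} (n_p + m)."""
    prod = 1
    for m in range(D):
        prod *= n_p + m
    return prod // factorial(D)  # exact: D consecutive integers

def n_sf(n_p, D):
    """Equation (42): cost of the CPHSF fitting per interaction."""
    return 2 * n_basis(2 * n_p, D) + 2 * (D + 2) * n_basis(n_p, D) - 1

def n_int(n_p, D):
    """Equation (40): cost of one interaction calculation."""
    return N_DIST[D] + N_KERNEL[D] + n_sf(n_p, D)

def n_op(n_p, D, n_nb):
    """Equations (39) and (46)-(48), with N_inv = 2 n(n_p, D)^3 / 3."""
    return n_int(n_p, D) * n_nb + 2 * n_basis(n_p, D) ** 3 / 3

def n_add(n_p, D):
    """Additional cost of the second spatial derivatives.

    Assumption: DH2 is the combination with repetition C(D + 1, 2).
    """
    return (D + 2) * comb(D + 1, 2) * n_basis(n_p, D)

# Neighbor numbers assumed in the text for D = 1, 2, 3
N_NB = {1: 10, 2: 75, 3: 600}
ratios = {D: n_add(6, D) / n_op(6, D, N_NB[D]) for D in (1, 2, 3)}
```

The dictionary `ratios` holds Nadd/Nop for np = 6, which makes the comparison between the two costs explicit for each dimension.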

$$N_{\rm int} \simeq [80 + N_{\rm sf}(n_p, 2)], \tag{44}$$

$$N_{\rm int} \simeq [84 + N_{\rm sf}(n_p, 3)], \tag{45}$$

for one, two, and three dimensions, respectively. Table 1 shows the summary of the numbers of floating-point operations for one interaction calculation of CPHSF.

The evaluation of the inverse matrix of $B_i$ also dominates in CPHSF and the number of floating-point operations of it, $N_{\rm inv}$, is $2n(n_p, D)^3/3$. Therefore, the numbers of floating-point operations per particle of CPHSF are

$$N_{\rm op} \simeq N_{\rm nb}[55 + N_{\rm sf}(n_p, 1)] + \frac{2}{3}n(n_p, 1)^3, \tag{46}$$

$$N_{\rm op} \simeq N_{\rm nb}[80 + N_{\rm sf}(n_p, 2)] + \frac{2}{3}n(n_p, 2)^3, \tag{47}$$

$$N_{\rm op} \simeq N_{\rm nb}[84 + N_{\rm sf}(n_p, 3)] + \frac{2}{3}n(n_p, 3)^3, \tag{48}$$

for one, two and three dimensions, respectively.

In the following, we derive $N_{\rm add}$. To derive jerk, snap, and crackle in the Hermite schemes, we need to calculate the second spatial order derivatives of $f_i$ given by equation (35). Here, the values of $\sum_j f_j p_{\alpha,ij} W_{ij}$ and $[B_i^{-1}]_{m\alpha}$ have been calculated in the derivation of the spatial first-order derivative. Therefore, we must calculate only the multiplication of $[B_i^{-1}]_{m\alpha}$ by $\sum_j f_j p_{\alpha,ij} W_{ij}$, and the additional number of calculations for one physical quantity is given by ${}_D H_2\, n(n_p, D)$. We have density (pressure), energy, and velocity, and thus the number of physical quantities in a numerical calculation is $(D + 2)$. Therefore, the total additional number of calculations is $N_{\rm add} = (D + 2)\,{}_D H_2\, n(n_p, D)$.

Figure 1 shows $N_{\rm op}$ and $N_{\rm add}$ with respect to $n_p$. We assume $N_{\rm nb}$ = 10, 75, and 600 for one, two, and three dimensions. We can see that $N_{\rm add}$ is much smaller than $N_{\rm op}$. Therefore, we conclude that the additional number of the calculations of jerk, snap, and crackle is much smaller than the original number of the calculations of CPHSF.

Fig. 1. Values of $N_{\rm add}$ and $N_{\rm op}$ plotted against $n_p$. The dashed and solid lines show these values for $N_{\rm add}$ and $N_{\rm op}$. From left to right, the values of $D$ are 1, 2, and 3.

3 Numerical experiments

In this section, we present the result of the Sod shock tube test in subsection 3.1 and that for the test of the surface gravity wave in subsection 3.2. We compare the results of fourth-order Hermite schemes and second- and fourth-order Runge–Kutta schemes in the Sod shock tube test, and the results of a fully implicit Hermite scheme, the implicit fourth-order Runge–Kutta scheme, and the backward-Euler scheme in the surface gravity wave test.

3.1 Sod shock tube

In this section, we present the result of the Sod shock tube test (Sod 1978). We assume that the fluid is an ideal gas with $\gamma = 1.4$. The computational domain is $-0.5 \le x < 0.5$ with a periodic boundary, and the initial boundaries between the two fluids are at $x = -0.5$ and 0. In this test, we used equal-mass particles. The initial velocity is given by $v_x = 0$. The density is smoothed by a C5 polynomial, and is given by

$$\rho(x) = \begin{cases} \rho_h & -0.25 \le x < -x_0, \\ \dfrac{\rho_h - \rho_l}{2}\displaystyle\sum_{m=0}^{5} b_m x^{2m+1} + \dfrac{\rho_h + \rho_l}{2} & -x_0 \le x < x_0, \\ \rho_l & x_0 \le x \le 0.25, \end{cases} \tag{49}$$

where $(b_0, b_1, b_2, b_3, b_4, b_5)$ = (−693/256, 1155/256, −693/128, 495/128, −385/256, 63/256), and $\rho_h$ and $\rho_l$ are the values of initial density in the high- and low-density regions. We used $\rho_h = 1$ and $\rho_l = 0.25$. The parameter $x_0$ represents the width of the smoothing region, and we used two values of $x_0$. One is an initial condition with $x_0 = 0.006$, and the other is a smooth initial condition with $x_0 = 0.03$. We set the initial condition for $0.25 \le x < 0.5$ to mirror that of $0 < x \le 0.25$, and $-0.5 \le x \le -0.25$ as mirroring $-0.25 \le x \le 0$. The positions of particles in the smoothing region are determined so that position $x_i$ of
particle $i$ satisfies

$$\int_{x_{i-1}}^{x_i} \rho(x)\,dx = \frac{1}{2N_h}, \tag{50}$$

where $N_h$ is the number of particles in the high-density region and the right-hand side of equation (50) is the mass of a particle. The smoothed pressure is given by

$$P(x) = \begin{cases} P_h & -0.25 \le x < -x_0, \\ \dfrac{P_h - P_l}{2}\displaystyle\sum_{m=0}^{5} b_m x^{2m+1} + \dfrac{P_h + P_l}{2} & -x_0 \le x < x_0, \\ P_l & x_0 \le x \le 0.25, \end{cases} \tag{51}$$

where $P_h$ and $P_l$ are the values of initial pressure in the high- and low-density regions. We used $P_h = 1$ and $P_l = 0.1795$. We used equations (31) and (32) for the artificial viscosity with $h_{\rm AV} = 2.375 \times 10^{-3}$. We used a sixth-order interpolation with the value of the interpolation polynomial at the position of particle $x_i$ fixed to the actual value. Therefore, $\delta$ given by equation (36) and $p_{ij}$ given by equation (37) are

$$\delta = \left(1, \nabla_x, \frac{1}{2!}\nabla_x^2, \frac{1}{3!}\nabla_x^3, \frac{1}{4!}\nabla_x^4, \frac{1}{5!}\nabla_x^5\right)^T, \tag{52}$$

$$p_{ij} = \left(1, x_{ij}, x_{ij}^2, x_{ij}^3, x_{ij}^4, x_{ij}^5\right)^T. \tag{53}$$

The kernel function is the fourth-order Wendland function (Wendland 1995). The kernel length is given by

$$h_i = \eta\left(\frac{\tilde{m}_i}{\rho_i}\right)^{1/D}, \tag{54}$$

$$\tilde{m}_i = \rho_{t=0,i} V_{t=0,i}, \tag{55}$$

where $\rho_{t=0,i}$ and $V_{t=0,i}$ are the density and geometric volume, respectively, of a particle $i$ at $t = 0$. We set $\eta = 3.8$.

We calculated the L1-norm error of density at $t = 0.1$ to verify the spatial order of the schemes and to compare the accuracy of the schemes;

$$\epsilon_\rho = \sum_{n=1}^{N_x} \frac{1}{N_x}\frac{|\rho_n - \rho_n^{\rm hres}|}{\rho_n^{\rm hres}}, \tag{56}$$

where $\rho_n^{\rm hres}$ is the result of a high-resolution test in which the number of particles, $N_x$, is 8000 and $dt = 10^{-6}$. When we derived equation (56), we calculated $\rho_n$ of particles rearranged at the same positions as those of the high-resolution test. The time integrator for the high-resolution test is the Hermite scheme of the P(EC)$^2$ form. For the test of the time order of the scheme for the test with $N_x = N_0$, $\rho_{n,t}^{\rm hres}$ is the result of a high-resolution test in which $N_x$ is $N_0$ and $dt = 10^{-6}$. The time integrator for the high-resolution test is the same as that for $\rho_n$. In this case we define the error as

$$\epsilon_{\rho,t} = \sum_{n=1}^{N_x} \frac{1}{N_x}\frac{|\rho_n - \rho_{n,t}^{\rm hres}|}{\rho_{n,t}^{\rm hres}}. \tag{57}$$

We compare results with PEC, PECE, and P(EC)$^2$ forms of Hermite schemes, and Heun's scheme (hereafter RK2) and the classical fourth-order Runge–Kutta scheme (hereafter RK4). The numbers of particles, $N_x$, are 1000, 2000, and 4000. Calculation codes used in this study were developed using FDPS (Iwasawa et al. 2016).

Figure 2 shows density profiles at $t = 0.1$ for the tests with $N_x = 1000$ and $dt \simeq dt_{\rm max}/4$, where $dt_{\rm max}$ is the maximum time-step in the stability region, with the PEC form of the Hermite scheme. Note that the results for all schemes are similar to that for the PEC form of the Hermite scheme. We can see that the shock wave can be captured. However, the post-shock oscillation is strong for $x_0 = 0.006$. Figure 3 is the same as figure 2, but for $N_x = 4000$. Note that the results are independent of the time-integration scheme used and the results for $N_x = 2000$ are similar to those for $N_x = 4000$. We can see that the shock wave can be captured clearly even if the initial condition is not smooth. Therefore, if the initial condition is not smooth, the resolution of time and space should be higher.

Fig. 2. Results of the Sod shock tube tests with $N_x = 1000$. The density profiles at $t = 0.1$ are shown. The left- and right-hand panels show the results for $x_0 = 0.006$ and $x_0 = 0.03$.

Fig. 3. Same as figure 2, but for $N_x = 4000$.

Now we check the spatial order of the scheme. We used the sixth-order shape function and then the first and second derivatives are fifth and fourth orders in space. Therefore, if the result converges to an exact solution following the order of the method, the order of the scheme should be larger than or equal to 4, and thus $\epsilon_\rho$ should be given by $\epsilon_\rho \propto N_x^{-m}$ where $m$ is larger than or equal to 4. Figure 4 shows
$\epsilon_\rho$ for the P(EC)$^2$ form of the Hermite scheme for runs with $dt = 10^{-6}$ plotted against $N_x^{-1}$. The results are independent of the time-integration scheme used. The value of $\epsilon_\rho$ for runs with $x_0 = 0.006$ is proportional to $N_x^{-4}$. The value of $\epsilon_\rho$ in the large $N_x$ region for runs with $x_0 = 0.03$ is proportional to $N_x^{-1}$ since, in this region, the round-off error dominates the total error. In the other region, $\epsilon_\rho$ is proportional to $N_x^{-4}$. From these results, we can conclude that the spatial order of the scheme is consistent with theoretical expectation.
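A convergence order such as the exponent m in the relation between the density error and the particle number can be estimated from a handful of runs by fitting a straight line in log-log space. The sketch below is a generic illustration: the function name `convergence_order` is hypothetical, and the error values are synthetic numbers generated from an assumed power law, not results from this paper.

```python
import math

def convergence_order(n_values, errors):
    """Least-squares slope of log(error) against log(1/N)."""
    xs = [math.log(1.0 / n) for n in n_values]
    ys = [math.log(e) for e in errors]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    num = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    den = sum((x - x_mean) ** 2 for x in xs)
    return num / den

# Synthetic, made-up errors that follow N^-4 exactly (illustration only)
n_values = [1000, 2000, 4000]
errors = [3.0e-2 * (1000.0 / n) ** 4 for n in n_values]
m = convergence_order(n_values, errors)
```

With measured errors, a fitted slope close to 4 over the range where the spatial error dominates would correspond to the behaviour described above, while a shallower slope at large particle numbers would indicate that another error source has taken over.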

Fig. 4. $\epsilon_\rho$ at $t = 0.1$ for the tests with $x_0 = 0.006$ and $x_0 = 0.03$ plotted against $N_x^{-1}$. Filled and open circles show results for $x_0 = 0.006$ and 0.03, and the solid curve shows the theoretical models for the error.

Let us look at the time orders of the schemes. Figures 5 and 6 show $\epsilon_{\rho,t}$ for the tests with $x_0 = 0.006$ and $x_0 = 0.03$ plotted against $dt_{\rm ic}$, where $dt_{\rm ic}$ is $dt$ divided by the number of interaction calculations per time step. We can see that the errors of RK2 and RK4 are $O(dt^2)$ and $O(dt^4)$, respectively, and that of the Hermite schemes is $O(dt^2)$.

In the following we explain the reason why the order of the Hermite scheme is $O(dt^2)$ for fixed $N_x^{-1}$. In a particle-based method, the calculated spatial derivatives contain discretization errors, and therefore the time derivative contains errors. In the case of RK schemes, this error causes

Fig. 5. $\epsilon_{\rho,t}$ at $t = 0.1$ for the tests with $x_0 = 0.006$ plotted against $dt_{\rm ic}$. From left to right, panels show the results for $N_x$ = 1000, 2000, and 4000. Triangles, squares, and crosses show the results for Hermite schemes in PEC, PECE, and P(EC)$^2$ forms, and open and filled circles show the results for RK4 and RK2. Solid and dashed curves show the theoretical models for the error of second- and fourth-order schemes.

Fig. 6. Same as figure 5, but the results for x0 = 0.03.
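The apparent second-order convergence of the Hermite schemes in figures 5 and 6 can be reproduced in a toy setting. The sketch below sums a corrector of the form $v_{n+1} = v_n + \frac{1}{2}(A_n + A_{n+1})\Delta t + \frac{1}{12}(J_n - J_{n+1})\Delta t^2$ for a prescribed acceleration, once with the exact jerk and once with a jerk that carries an additional error term. Everything here is an assumption made for illustration: the acceleration A(t) = cos(t), the linear error model, and the function names are hypothetical and are not taken from the paper's calculations.

```python
import math

def integrate_velocity(t_end, n_steps, eps):
    """Accumulate the Hermite velocity corrector for dv/dt = A(t) = cos(t).

    The jerk handed to the corrector is J(t) = dA/dt + eps(t), so
    eps(t) = 0 corresponds to a jerk consistent with the acceleration.
    """
    dt = t_end / n_steps
    acc = math.cos

    def jerk(t):
        return -math.sin(t) + eps(t)

    v = 0.0
    for n in range(n_steps):
        t0, t1 = n * dt, (n + 1) * dt
        v += 0.5 * (acc(t0) + acc(t1)) * dt
        v += (jerk(t0) - jerk(t1)) * dt**2 / 12
    return v

def observed_order(eps, t_end=1.0, n_steps=50):
    """Order estimated from the error ratio when the step is halved."""
    exact = math.sin(t_end)
    e1 = abs(integrate_velocity(t_end, n_steps, eps) - exact)
    e2 = abs(integrate_velocity(t_end, 2 * n_steps, eps) - exact)
    return math.log2(e1 / e2)

# Consistent jerk: fourth-order convergence is expected
order_exact_jerk = observed_order(lambda t: 0.0)
# Jerk with a hypothetical error growing linearly in time: second order
order_noisy_jerk = observed_order(lambda t: 1.0e-2 * t)
```

Because the jerk differences telescope over the time steps, only the jerk error at the initial and final times survives, and it enters multiplied by the square of the time step, which is why the inconsistent case converges at second order.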


the solution in the limit of $dt \to 0$ to converge to a solution that is different from the exact solution, but the rate of the convergence is the order of the time-integration scheme, since we can regard the space-discretized differential equations as a set of ordinary differential equations. However, in the case of the Hermite scheme, we construct the second time-derivatives of physical quantities from the original equations and high-order spatial derivatives, and these spatial derivatives contain discretization errors. Thus, both the first and second time-derivatives contain the errors due to space discretization errors, and therefore the second time-derivatives are not exactly the time derivatives of the first time-derivatives. For simplicity, let us illustrate this behaviour for the integration of velocity in one dimension. Here, we rewrite the correctors by substituting equations (6) and (7) into equation (9). Note that we set $dt = \Delta t$ in equation (4):

$$v_c = v_n + \frac{1}{2}(a_n + a_{n+1})\Delta t + \frac{1}{12}(j_n - j_{n+1})\Delta t^2. \tag{58}$$

If we use sixth-order polynomial fitting for deriving spatial derivatives, $v_c$ containing the spatial errors is given by

$$v_c = v_n + \frac{1}{2}(A_n + A_{n+1})\Delta t + \frac{1}{12}(J_n - J_{n+1})\Delta t^2, \tag{59}$$

where $A_n$ and $A_{n+1}$ are accelerations at the $n$ and $n + 1$ steps and $J_n$ and $J_{n+1}$ are jerks at the $n$ and $n + 1$ steps, all given by sixth-order polynomial fitting for deriving spatial derivatives. Therefore, $J$ is not equal to the time derivative of $A$:

$$J = \frac{dA}{dt} + \epsilon_J, \tag{60}$$

where $\epsilon_J$ is the error. Here, we integrate equation (59) from $t = 0$ to $t = T$,

$$\begin{aligned} v_{c,(t=T)} &= v_{(t=0)} + \sum_{n=0}^{N_t}\left[\frac{1}{2}(A_n + A_{n+1})\Delta t\right] + \sum_{n=0}^{N_t}\left[\frac{1}{12}(J_n - J_{n+1})\Delta t^2\right] \\ &= v_{(t=0)} + \sum_{n=0}^{N_t}\left[\frac{1}{2}(A_n + A_{n+1})\Delta t\right] + \sum_{n=0}^{N_t}\frac{1}{12}\left(\frac{dA_n}{dt} + \epsilon_{J,n} - \frac{dA_{n+1}}{dt} - \epsilon_{J,n+1}\right)\Delta t^2, \end{aligned} \tag{61}$$

where $N_t$ is given by $N_t = T/\Delta t$. Here, we can assume that

$$\frac{dv}{dt} = \lim_{\Delta t \to 0} A. \tag{62}$$

Therefore, equation (61) becomes

$$v_{(t=T)} = v_A(T) + O(\Delta t)^4 + \frac{1}{12}\left(\epsilon_{J,(t=0)} - \epsilon_{J,(t=T)}\right)\Delta t^2, \tag{63}$$

where $v_A(T)$ is the analytical solution for the velocity which satisfies equation (62). We can see that the time order of a Hermite scheme is equal to two. From these results, we can conclude that the time orders of the schemes are consistent. The fact that the apparent error order of the Hermite scheme is 2 does not imply it is a second-order scheme, because when we simultaneously shrink the interparticle distance and time step, the error will be $O(dt^4)$ as expected. The second-order behaviour occurs only when the spatial error dominates the total error.

Figure 7 shows errors for tests with $x_0 = 0.006$ plotted against $dt_{\rm ic}$. The result shows that the accuracy of fourth-order Hermite schemes is similar to those of RK2 and RK4, since the errors of the spatial differentiation approximation determine the overall error.

Figure 8 shows the maximum $dt_{\rm ic}$ in the numerically stable region for tests with $x_0 = 0.006$ plotted against $N_x^{-1}$. We can see that the regions of stability of fourth-order Hermite schemes are larger than or equal to those of RK2 and RK4. Hence, we can use larger time-steps with the Hermite schemes. Therefore, we can conclude that Hermite schemes, especially in PEC and PECE forms, are better than Runge–Kutta schemes for simulations of fluid with shock and contact discontinuity, even when the initial condition has a sharp jump.

Figure 9 shows errors for $x_0 = 0.03$ plotted against $dt_{\rm ic}$. As in the case of $x_0 = 0.006$, the results show that the accuracy of fourth-order Hermite schemes is similar to those of RK2 and RK4, since the errors of the spatial differentiation approximation determine the overall error.

Figure 10 shows the maximum $dt_{\rm ic}$ in the numerically stable region for tests with $x_0 = 0.03$ plotted against $N_x^{-1}$. As in the case of $x_0 = 0.006$, the regions of stability of fourth-order Hermite schemes are larger than or equal to those of RK2 and RK4. Therefore, we can conclude that Hermite schemes, especially in PEC and PECE forms, are better than Runge–Kutta schemes for simulations of fluid with shock and contact discontinuity. We can conclude that Hermite schemes are more computationally efficient than Runge–Kutta schemes for calculating shocks.

3.2 Surface gravity wave test

The surface gravity wave test is useful for the investigation of the capability of numerical schemes to handle two-dimensional fluid dynamics with high accuracy and small dissipation. The initial condition is the same as those in Antuono et al. (2011) and Yamamoto and Makino (2017), but the sound velocity given by equation (29) is 10 times smaller than that of Yamamoto and Makino (2017). We assume that the fluid is weakly compressible with an equation of state given by equation (28) with $\rho_{\rm air} = 10^3$ and $P_{\rm air} = 10^5$ and sound velocity given by equation (29) with $g = -10$ and
the height of fluid $H = 1$. The computational domain is $0 \le x < 1$, $0 \le y \le 1$. As boundary conditions, we applied a periodic boundary at $x = 0$, $v_y = 0$ at $y = 0$, and $P = P_{\rm air}$ for particles initially at $y = 1$. The initial density is

$$
\rho(y) = \rho_{\rm air}\, e^{g(H-y)/c_0^2}. \tag{64}
$$

The initial velocity is

$$
v_x = A\,\frac{|g|k}{\omega}\,\frac{\cosh(ky)}{\cosh(kH)}\,\sin(kx), \tag{65}
$$

$$
v_y = -A\,\frac{|g|k}{\omega}\,\frac{\sinh(ky)}{\cosh(kH)}\,\cos(kx), \tag{66}
$$

where $A$, $k$, and $\omega$ are the amplitude, the wavenumber, and the frequency of the wave. We set $A = 0.01$, $k = 2\pi$, and $\omega = \sqrt{|g|k\tanh(kH)}$. In this test, we do not use artificial viscosity, in order to clarify the origin of the error. We used a fifth-order interpolation with the value of the interpolating polynomial at the position of particle $x_i$ fixed to the actual value.

Fig. 7. $\epsilon_\rho$ at $t = 0.1$ for the tests with $x_0 = 0.006$ plotted against $dt_{\rm ic}$. From left to right in the upper panels, the results for the PEC, PECE, and P(EC)$^2$ forms of the Hermite schemes are shown. The lower left-hand and middle panels show the results for the second- and fourth-order Runge–Kutta schemes. Crosses and open and filled circles show results for $N_x = 1000$, 2000, and 4000.

Fig. 8. Maximum $dt_{\rm ic}$ in the numerically stable region for tests with $x_0 = 0.006$ plotted against $N_x^{-1}$. Triangles, squares, and crosses show the results for Hermite schemes in PEC, PECE, and P(EC)$^2$ forms, and open and filled circles show the results for RK4 and RK2.
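As an illustration, the initial condition of equations (64)-(66) could be set up as in the following Python sketch. The function names `wave_frequency` and `initial_state` and their argument names are hypothetical, and the sound speed `c0` is left as an input because equation (29) is not reproduced here. This is a minimal sketch under those assumptions, not the code used for the tests.

```python
import math


def wave_frequency(k=2.0 * math.pi, g=-10.0, H=1.0):
    """Frequency omega = sqrt(|g| k tanh(k H)) of the surface gravity wave."""
    return math.sqrt(abs(g) * k * math.tanh(k * H))


def initial_state(x, y, c0, A=0.01, k=2.0 * math.pi, g=-10.0, H=1.0, rho_air=1.0e3):
    """Return (rho, vx, vy) at position (x, y) following equations (64)-(66).

    c0 is the sound speed; its value comes from equation (29) of the paper
    and is treated here as a free input (assumption of this sketch).
    """
    omega = wave_frequency(k, g, H)
    rho = rho_air * math.exp(g * (H - y) / c0**2)          # equation (64)
    amp = A * abs(g) * k / omega
    vx = amp * math.cosh(k * y) / math.cosh(k * H) * math.sin(k * x)   # equation (65)
    vy = -amp * math.sinh(k * y) / math.cosh(k * H) * math.cos(k * x)  # equation (66)
    return rho, vx, vy


# Period of the wave, T = 2 pi / omega, used to express output times such as 0.2 T.
period = 2.0 * math.pi / wave_frequency()
```

With these definitions, $v_y$ vanishes at $y = 0$ and $\rho = \rho_{\rm air}$ at $y = H$, in agreement with the boundary conditions stated above.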
Fig. 9. Same as figure 7, but for x0 = 0.03.

Fig. 10. Same as figure 8, but for $x_0 = 0.03$.

For this fifth-order interpolation, $\delta$ given by equation (36) and $p_{ij}$ given by equation (36) are

$$
\delta = \left(1, \nabla_x, \frac{1}{2!}\nabla_x^2, \nabla_x\nabla_y, \frac{1}{2!}\nabla_y^2, \ldots, \frac{1}{2!3!}\nabla_x^2\nabla_y^3, \frac{1}{4!}\nabla_x\nabla_y^4, \frac{1}{5!}\nabla_y^5\right)^T, \tag{67}
$$

$$
p_{ij} = \left(1, x_{ij}, y_{ij}, x_{ij}^2, x_{ij}y_{ij}, y_{ij}^2, \ldots, x_{ij}^2 y_{ij}^3, x_{ij}y_{ij}^4, y_{ij}^5\right)^T. \tag{68}
$$

The kernel function is the fourth-order Wendland function (Wendland 1995). We used equation (54) as the kernel length and set $\eta = 3.8$.

To check the spatial order of the schemes and to compare their accuracy, we calculate the absolute error of $v_x$ at $(x, y) = (0.4, 1)$ and $t = 0.2T$, where $T$ is the period given by $2\pi/\omega$:

$$
\epsilon_{v_x} = |v_x - v_x^{\rm hres}|, \tag{69}
$$

where $v_x^{\rm hres}$ is the result of the high-resolution test in which the number of particles, $N$, is $128 \times 129$ and $dt = T/1024$. The time integrator for the high-resolution test is the implicit Hermite scheme. For checking the time order of the scheme for the test with $N = N_0$, $v_{x,t}^{\rm hres}$ is the result of a high-resolution test in which $N$ is $N_0$ and $dt = T/512$. The time integrator for this high-resolution test is the same as that for $v_x^{\rm hres}$. We calculated $v_x$ and $v_{x,t}^{\rm hres}$ of the particles initially at $(x, y) = (0.3125, 1)$. In this case we define the error as

$$
\epsilon_{v_x,t} = |v_x - v_{x,t}^{\rm hres}|. \tag{70}
$$

We compare results of runs with the implicit Hermite scheme, the backward-Euler scheme (hereafter IRK1) and
the Gauss–Legendre scheme (hereafter IRK4). The numbers of particles, $N$, are $16 \times 17$, $32 \times 33$, and $64 \times 65$.

Figure 11 shows the time evolution up to $t = 0.75T$ with the implicit Hermite scheme, $N = 16 \times 17$ and $dt \simeq dt_{\max}/4$. Figure 12 shows $y$ of the particle initially at $(x, y) = (0, 1)$ with the implicit Hermite scheme, $N = 16 \times 17$ and $dt \simeq dt_{\max}/4$. Note that the results are independent of the time-integration scheme used and $N$.

Now we check the spatial order of the scheme. We used the fifth-order shape function, and thus the first and second derivatives are of fourth and third order in space. Therefore, if the result converges to an exact solution following the order of the method, the order of the scheme should be larger than or equal to 3, and thus $\epsilon_{v_x}$ should be given by $\epsilon_{v_x} \propto N_x^{-m}$, where $m$ is larger than or equal to 3 and $N_x$ is the number of particles in the $x$-direction. Figure 13 shows $\epsilon_{v_x}$ for the implicit Hermite scheme with $dt = T/512$ plotted against $N_x^{-1}$. We can see that the error $\epsilon_{v_x}$ is proportional to $N_x^{-4}$. Therefore, the error in acceleration determines the overall error. The results are independent of the time integration used. This result is consistent with the expected spatial order of the scheme.

Fig. 11. Results of the surface gravity wave tests with $N = 16 \times 17$; from top to bottom, the snapshots at $t = 0$, $0.25T$, $0.5T$, and $0.75T$ are shown.

Fig. 12. Time-evolution of the $y$-coordinate of the particle initially at $(x, y) = (0, 1)$ in the surface gravity wave test with $N = 16 \times 17$.

Fig. 13. $\epsilon_{v_x}$ at $t = 0.2T$ plotted against $N_x^{-1}$. Filled circles show numerical results and solid curves show the theoretical models for the error.

Fig. 14. $\epsilon_{v_x,t}$ plotted against $dt_{\rm ic}$. From left to right, panels show the results for $N_x = 16$, 32, and 64. Crosses and open and filled circles show the results of the implicit Hermite scheme, IRK4, and IRK1. Dashed, solid and treble-dot–dashed curves show the theoretical models for the error of second-, first-, and fourth-order schemes.

Fig. 15. $\epsilon_{v_x}$ plotted against $dt_{\rm ic}$. Left-hand, middle and right-hand panels show the results for the implicit Hermite scheme, IRK4, and IRK1. Crosses and open and filled circles show results for $N_x = 16$, 32, and 64.
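The exponent $m$ in a relation of the form error $\propto N_x^{-m}$ can be estimated by a least-squares fit in log-log space. The sketch below shows one way to do this; `observed_order` is a hypothetical helper name, and the error values are synthetic placeholders that follow $N_x^{-4}$ exactly, not values measured in this paper.

```python
import math


def observed_order(n_x, errors):
    """Estimate m in error ∝ n_x**(-m) by a least-squares fit in log-log space."""
    xs = [math.log(n) for n in n_x]
    ys = [math.log(e) for e in errors]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # The error decreases with n_x, so the fitted slope is -m.
    return -sxy / sxx


# Synthetic example (assumed values for illustration only).
n_x = [16, 32, 64]
synthetic_errors = [2.0 * n ** -4 for n in n_x]
m = observed_order(n_x, synthetic_errors)
```

For measured errors the fitted value would only approach the nominal order once the resolution is high enough for the leading error term to dominate.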
Let us now look at the time order of the scheme. Figure 14 shows $\epsilon_{v_x,t}$ plotted against $dt_{\rm ic}$. We can see that the errors of the implicit Hermite scheme, IRK4, and IRK1 are $O(dt^2)$, $O(dt^4)$, and $O(dt)$, respectively. As described in subsection 3.1, the time order of the Hermite scheme is equal to 2. From these results, we can conclude that the time orders of the schemes are consistent with the expected ones.
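The reduction from fourth to second order described by equation (63) can be reproduced on a toy problem. The sketch below integrates $dv/dt = \cos t$ with the update form of equation (61) and compares an exact time derivative $J$ with one carrying an error. The test equation, the perturbation `0.5 * t` standing in for $\epsilon_J$, and all function names are assumptions made for illustration; this is not the fluid scheme of the paper.

```python
import math


def integrate(n_steps, T=1.0, jerk_error=lambda t: 0.0):
    """Integrate dv/dt = cos(t) from 0 to T using the update of equation (61).

    jerk_error(t) is a hypothetical error added to the exact derivative
    J = dA/dt = -sin(t), playing the role of epsilon_J.
    """
    dt = T / n_steps
    v = 0.0
    for n in range(n_steps):
        t0, t1 = n * dt, (n + 1) * dt
        a0, a1 = math.cos(t0), math.cos(t1)
        j0 = -math.sin(t0) + jerk_error(t0)
        j1 = -math.sin(t1) + jerk_error(t1)
        v += 0.5 * (a0 + a1) * dt + (j0 - j1) * dt**2 / 12.0
    return v


def order(err_coarse, err_fine):
    """Observed order when the time step is halved."""
    return math.log2(err_coarse / err_fine)


exact = math.sin(1.0)
clean = [abs(integrate(n) - exact) for n in (16, 32)]
noisy = [abs(integrate(n, jerk_error=lambda t: 0.5 * t) - exact) for n in (16, 32)]
order_clean = order(*clean)  # expected to be close to 4
order_noisy = order(*noisy)  # expected to be close to 2
```

Because the $J$ terms telescope, only the difference of the error in $J$ between $t = 0$ and $t = T$ survives, which is why a single $\Delta t^2$ term appears in equation (63).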
Figure 15 shows errors plotted against $dt_{\rm ic}$. The result shows that the error of the implicit Hermite scheme is similar to that of IRK4 and smaller than that of IRK1 for large $N$.

Figure 16 shows the maximum $dt_{\rm ic}$ in the numerically stable region plotted against $N_x^{-1}$. We can see that the region of stability of the implicit Hermite scheme is wider than those of IRK1 and IRK4. Hence, we can use larger time-steps with the implicit Hermite scheme. Therefore, we can conclude that the Hermite scheme is better than Runge–Kutta schemes for simulations of fluid with surface gravity waves.

Fig. 16. $dt_{\rm ic}$ in the numerically stable region for tests with $x_0 = 0.006$ plotted against $N_x^{-1}$. Crosses and open and filled circles show the results of the implicit Hermite scheme, IRK4, and IRK1.

4 Summary
If we use multi-stage integration schemes, such as Runge–Kutta schemes, with mesh-free methods, we need to perform the interaction calculation, which is the most expensive part of the calculation, multiple times per time step. We constructed a Hermite scheme for a high-order mesh-free method. The accuracy of fourth-order Hermite schemes is at least similar to that of Runge–Kutta schemes, and the region of stability of Hermite schemes is wider than that of Runge–Kutta schemes. Therefore, we can use a larger time-step with the Hermite scheme than with the Runge–Kutta scheme for the same accuracy. We conclude that Hermite schemes are more computationally efficient than commonly used Runge–Kutta schemes for a high-order mesh-free method.

Acknowledgments
We would like to thank the referee for his or her insightful comments and suggestions. We also thank the editor for his or her assistance. We thank Masaki Iwasawa, Keigo Nitadori and Daisuke Namekata for discussions about Hermite schemes and Runge–Kutta schemes. This research was supported by RIKEN Junior Research Associate Program and MEXT as “Exploratory Challenge on Post-K computer” (Elucidation of the Birth of Exoplanets [Second Earth] and the Environmental Variations of Planets in the Solar System).
References
Antuono, M., Colagrossi, A., Marrone, S., & Lugni, C. 2011, Comput. Phys. Commun., 182, 866
Aoki, T. 1997, Comput. Phys. Commun., 102, 132
Iwasawa, M., Tanikawa, A., Hosono, N., Nitadori, K., Muranushi, T., & Makino, J. 2016, PASJ, 68, 54
Makino, J. 1991, ApJ, 369, 200
Makino, J., & Aarseth, S. J. 1992, PASJ, 44, 141
Sod, G. A. 1978, J. Comput. Phys., 27, 1
Wendland, H. 1995, Adv. Comput. Math., 4, 389
Yamamoto, S., & Makino, J. 2017, PASJ, 69, 35
