A Dispersion Minimizing Finite Difference Scheme and Precond - Chen Et Al - 2012

Journal of Computational Physics 231 (2012) 8152–8175
A dispersion minimizing finite difference scheme and preconditioned solver for the 3D Helmholtz equation ☆

Zhongying Chen a,1, Dongsheng Cheng a,*, Tingting Wu b

a Guangdong Province Key Laboratory of Computational Science, Sun Yat-sen University, Guangzhou 510275, PR China
b School of Mathematical Sciences, Shandong Normal University, Jinan 250014, PR China
Article info
Article history:
Received 29 November 2011
Received in revised form 27 July 2012
Accepted 30 July 2012
Available online 17 August 2012
Keywords:
Helmholtz equation
Perfectly matched layer
Finite difference method
Preconditioner
Bi-CGSTAB
Shifted-Laplacian
Multigrid
Prolongation operator
Abstract
In this paper, a new 27-point finite difference method is presented for solving the 3D Helmholtz equation with perfectly matched layer (PML); the scheme is of second order and pointwise consistent with the equation. An error analysis is made between the numerical wavenumber and the exact wavenumber, and a refined choice strategy based on minimizing the numerical dispersion is proposed for choosing weight parameters. A full-coarsening multigrid-based preconditioned Bi-CGSTAB method is developed for solving the linear system stemming from the Helmholtz equation with PML by the finite difference scheme. The shifted-Laplacian is extended to precondition the 3D Helmholtz equation, and a spectral analysis is given. The discrete preconditioned system is solved by the Bi-CGSTAB method, with a multigrid method used to invert the preconditioner approximately. Full-coarsening multigrid is employed, and a new matrix-based prolongation operator is constructed accordingly. Numerical results are presented to demonstrate the efficiency of both the new 27-point finite difference scheme with refined parameters, and the preconditioned Bi-CGSTAB method with the 3D full-coarsening multigrid.
© 2012 Elsevier Inc. All rights reserved.
1. Introduction
The wave equation has numerous important applications in science and engineering, for instance in geophysics, aeronautics and marine technology. Applying the Fourier transform with respect to time to the wave equation, we obtain the frequency domain wave equation, which is the well-known Helmholtz equation. The importance of the Helmholtz equation has stimulated significant research on its numerical simulation. To solve the Helmholtz equation numerically, artificial boundary conditions are often employed so that the infinite computing domain can be truncated into a finite one. The perfectly matched layer (PML, cf. [8,30,40]) proposed by Bérenger is a popular artificial absorbing boundary condition, which is used to gradually damp the outgoing waves and eliminate boundary reflections. For convenience, we call the Helmholtz equation with PML the Helmholtz-PML equation, which is considered in this paper.
To discretize the Helmholtz equation, we mainly have finite difference methods (cf. [11,19,22,30,31,33,34,43]) and finite element methods (cf. [2,3,10,13,16,20]). Finite difference methods are commonly used in engineering fields such as geophysics. In scientific computing, solving the Helmholtz equation numerically with high wavenumbers remains one of the most difficult tasks. Due to the pollution effect of high wavenumbers, the wavenumber of the numerical solution is different from the wavenumber of the exact solution, which is known as numerical dispersion (cf. [20,21]). The conventional 2D 5-point and 3D 7-point finite difference schemes lead to serious numerical dispersion, polluting the numerical accuracy. To reduce the numerical dispersion, some weighted transformed-based finite difference schemes have been constructed (cf. [19,22,24,25,28,32]), which need fewer grid points per wavelength while maintaining a comparable accuracy. For the 3D Helmholtz equation, the weighted transformed-based finite difference schemes are very complicated. For instance, in [25], seven rotated coordinate systems were employed to construct a 3D 27-point difference scheme.

☆ This research is partially supported by the Guangdong Provincial Government of China through the Computational Science Innovative Research Team program.
* Corresponding author.
E-mail addresses: lnsczy@mail.sysu.edu.cn (Z. Chen), chdsh@mail.sysu.edu.cn (D. Cheng), wttxrm@126.com (T. Wu).
1 Supported in part by the Natural Science Foundation of China under Grants 10771224 and 11071264.
http://dx.doi.org/10.1016/j.jcp.2012.07.048

In this paper, we propose an alternative finite difference scheme for the 3D Helmholtz equation, which is an extension of our previous work for the 2D case in [11]. This new scheme remains weighted, but is rotation-free. We call it a dispersion minimizing finite difference scheme, since the weight parameters are obtained by minimizing the numerical dispersion. The dispersion minimizing finite difference scheme is of second order and pointwise consistent. Because it requires no rotation of the coordinate system in 3D space, its construction is much simpler than that of the staggered-grid 27-point formulation, which was originally proposed for the wave equation in [24] and was further developed for the 3D Helmholtz equation in [25]. Interestingly, we shall show that our scheme is equivalent to the scheme in [25] under certain conditions. Moreover, the weight parameters of our scheme are chosen by refining parameter intervals, which is called the refined choice strategy. We shall give an error analysis between the numerical wavenumber and the exact wavenumber, and numerical experiments show that the new scheme with the refined strategy outperforms the staggered-grid scheme in reducing the numerical dispersion.
For the Helmholtz equation, high-order finite difference schemes (cf. [4,31,34]) have also been constructed to improve the numerical accuracy. For instance, in [34], compact finite difference schemes of sixth order are proposed for the 3D Helmholtz equation. These sixth order schemes perform well for small constant wavenumbers. Theoretically, sixth order schemes are more competitive, so long as the step size is small enough. However, the number of grid points per wavelength cannot be too large in practice, that is, the step size cannot be too small. Also, the pollution analysis of the error shows that the accuracy depends not only on the convergence order, but also on the wavenumber. Thus, although the sixth order scheme has a higher convergence order, this does not always mean a higher accuracy. For certain step sizes and large wavenumbers, the new second order scheme may compete with the sixth order scheme, since it minimizes the numerical dispersion. In this paper, we compare the new second order scheme with the 3D sixth order compact scheme in [34], and numerical examples show that the new second order scheme performs better for certain step sizes and large wavenumbers. Moreover, we point out that sixth order schemes are more demanding, since they require the solution and the source term to be continuously differentiable of sixth and fourth order, respectively. They also require the wavenumber to be constant and the step sizes to be equal in the three directions. However, these requirements may not be met in practice. For example, in geophysical applications, we have to deal with the Helmholtz equation with varying wavenumbers in heterogeneous media, and the step size in the third direction may differ from the others. In addition, high order schemes may have difficulties in dealing with boundary conditions, and a high convergence order may not be obtained if the boundary condition is not treated properly.
After discretization of the Helmholtz equation, the preconditioned Bi-CGSTAB method is employed to solve the large indefinite linear system, and the shifted-Laplacian (cf. [14,15,23]) is considered as the preconditioner. The shifted-Laplacian preconditioner is an extension of the Laplacian preconditioner, which was originally proposed in [5,6] for the 2D case. In this paper, for the 3D Helmholtz-PML equation, the corresponding preconditioner we employ is the 3D complex shifted-Laplacian-PML. We analyze the spectral distribution of the linear system from the perspective of linear fractional mappings in complex analysis. We propose a new prolongation operator for the 3D full-coarsening multigrid, which is used to invert the preconditioner approximately. With the same number of iterations, the full-coarsening multigrid is expected to consume less CPU time than the semi-coarsening case, in which the grid size decreases more gradually. Numerical results are presented to illustrate that the 3D full-coarsening multigrid with the new prolongation operator gives a better performance, reducing both the number of iterations and the total CPU time needed for convergence. In the experiments, wavenumbers range from constant in homogeneous media to greatly varying ones in heterogeneous media. For the case of constant wavenumber, the dimensionless wavenumber (cf. [21]) we compute in our experiments is as large as 220. We point out that the number of iterations scales roughly linearly with the wavenumber, which is a classical problem for iterative solutions of the Helmholtz equation. We have not solved this problem.
In this paper, we aim at solving the 3D Helmholtz-PML equation arising in geophysical applications, and focus on both the discretization of the operator equation and the iterative solution of the discrete linear system. The remainder of this paper is organized as follows. In Section 2, a new 27-point finite difference scheme is developed and analyzed. In Section 3, an error analysis is presented between the numerical wavenumber and the exact wavenumber. In Section 4, a refined choice strategy is given to choose the weight parameters of the new scheme. In Section 5, we discuss the 3D complex shifted-Laplacian preconditioning, and present a spectral analysis. In Section 6, we propose a new prolongation operator for the full-coarsening multigrid. In Section 7, some numerical experiments are presented. Finally, in Section 8, some conclusions are drawn.
2. A consistent 27-point finite difference scheme for the 3D Helmholtz-PML equation

In this section, we formulate a new 27-point finite difference scheme for the 3D Helmholtz-PML equation, based on the idea of weighted average (cf. [30]). This scheme is pointwise consistent with the equation and of second order. Compared with the methods in [24,25], the construction of this scheme is much simpler and can be easily extended to nonuniform grids, since it needs neither the first-order hyperbolic system nor the rotated Cartesian coordinate system.
We consider the 3D Helmholtz equation for wave problems

\[ \mathcal{A}u := -\Delta u - (1 - \alpha i)k^2 u = g \quad \text{in } \mathbb{R}^3, \tag{2.1} \]

where $\Delta := \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}$ is the 3D Laplacian, the unknown $u$ usually represents a pressure field in the frequency domain, $k := 2\pi f/v$ is the wavenumber with $f$ indicating the frequency in Hertz and $v$ indicating the wave velocity, $\alpha$ is the real number indicating the fraction of damping in the medium, $i = \sqrt{-1}$ is the imaginary unit, and $g$ represents the source term. The wavenumber $k$, which depends explicitly on the spatially varying velocity $v$, is a constant for a homogeneous medium and a variable for a heterogeneous medium. The medium is considered to be barely attenuative when $0 \le \alpha \ll 1$, and $\alpha$ can be set up to 5% (i.e., $\alpha = 0.05$) in geophysical applications. When a square domain of size $H$ is normalized to a unit domain, we obtain the dimensionless wavenumber, which equals $2\pi f H/v$ (cf. [21]). In the remainder, the wavenumber refers to the dimensionless wavenumber.
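Applying the PML technique to truncate the infinite domain in (2.1) into a bounded domain, we have the 3D Helmholtz-PML equation

\[ \mathcal{A}u := -\frac{\partial}{\partial x}\left(A\frac{\partial u}{\partial x}\right) - \frac{\partial}{\partial y}\left(B\frac{\partial u}{\partial y}\right) - \frac{\partial}{\partial z}\left(C\frac{\partial u}{\partial z}\right) - (1 - \alpha i)Du = g, \tag{2.2} \]

with

\[ A := \frac{\xi_y(y)\,\xi_z(z)}{\xi_x(x)}, \quad B := \frac{\xi_x(x)\,\xi_z(z)}{\xi_y(y)}, \quad C := \frac{\xi_x(x)\,\xi_y(y)}{\xi_z(z)}, \quad D := \xi_x(x)\,\xi_y(y)\,\xi_z(z)\,k^2(x,y,z). \]

Here, $\xi_x(x)$, $\xi_y(y)$ and $\xi_z(z)$ are 1D damping functions that satisfy $\xi_x(x) = 1$, $\xi_y(y) = 1$, $\xi_z(z) = 1$ in the interior area.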
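We next introduce the construction of the new 27-point finite difference scheme. Firstly, we present the 27-point finite difference stencil with numbering in Fig. 1, where $(0,0,0)$ represents the center point in the stencil, and $(l,m,n)$ with $l,m,n \in \{-1,0,1\}$ denote the points surrounding $(0,0,0)$. For convenience, $(0,0,0)$ and $(l,m,n)$ are identified with $(x_0,y_0,z_0)$ and $(x_0+lh,\, y_0+mh,\, z_0+nh)$ respectively, where $h$ is the discretization step. The discretization of a function $u$ at point $(l,m,n)$ is denoted by $u_{l,m,n} := u(x_0+lh,\, y_0+mh,\, z_0+nh)$. Then, to approximate $\frac{\partial}{\partial x}\left(A\frac{\partial u}{\partial x}\right)$, $\frac{\partial}{\partial y}\left(B\frac{\partial u}{\partial y}\right)$ and $\frac{\partial}{\partial z}\left(C\frac{\partial u}{\partial z}\right)$, we utilize

\[ L_{h,x}u := c_1 L_{h,x}u|_{0,0,0} + \frac{c_2}{4}\left[L_{h,x}u|_{0,1,0} + L_{h,x}u|_{0,-1,0} + L_{h,x}u|_{0,0,1} + L_{h,x}u|_{0,0,-1}\right] + \frac{c_3}{4}\left[L_{h,x}u|_{0,1,1} + L_{h,x}u|_{0,-1,1} + L_{h,x}u|_{0,1,-1} + L_{h,x}u|_{0,-1,-1}\right], \tag{2.3} \]

\[ L_{h,y}u := c_1 L_{h,y}u|_{0,0,0} + \frac{c_2}{4}\left[L_{h,y}u|_{0,0,1} + L_{h,y}u|_{0,0,-1} + L_{h,y}u|_{1,0,0} + L_{h,y}u|_{-1,0,0}\right] + \frac{c_3}{4}\left[L_{h,y}u|_{1,0,1} + L_{h,y}u|_{-1,0,1} + L_{h,y}u|_{1,0,-1} + L_{h,y}u|_{-1,0,-1}\right], \tag{2.4} \]

\[ L_{h,z}u := c_1 L_{h,z}u|_{0,0,0} + \frac{c_2}{4}\left[L_{h,z}u|_{1,0,0} + L_{h,z}u|_{-1,0,0} + L_{h,z}u|_{0,1,0} + L_{h,z}u|_{0,-1,0}\right] + \frac{c_3}{4}\left[L_{h,z}u|_{1,1,0} + L_{h,z}u|_{-1,1,0} + L_{h,z}u|_{1,-1,0} + L_{h,z}u|_{-1,-1,0}\right], \tag{2.5} \]

where $L_{h,x}u|_{0,m,n}$, $L_{h,y}u|_{l,0,n}$ and $L_{h,z}u|_{l,m,0}$ ($l,m,n \in \{-1,0,1\}$) are approximations of $\frac{\partial}{\partial x}\left(A\frac{\partial u}{\partial x}\right)$, $\frac{\partial}{\partial y}\left(B\frac{\partial u}{\partial y}\right)$ and $\frac{\partial}{\partial z}\left(C\frac{\partial u}{\partial z}\right)$ respectively, based on the second order centered difference, and the parameters satisfy $c_1 + c_2 + c_3 = 1$. The differential operator $\mathcal{L} := \frac{\partial}{\partial x}\left(A\frac{\partial}{\partial x}\right) + \frac{\partial}{\partial y}\left(B\frac{\partial}{\partial y}\right) + \frac{\partial}{\partial z}\left(C\frac{\partial}{\partial z}\right)$ is then approximated by $L_h$,

\[ L_h u := L_{h,x}u + L_{h,y}u + L_{h,z}u. \tag{2.6} \]

[Fig. 1. The 27-point finite difference stencil with numbering. Axes: x, y, z; labeled points: (0,0,0), (±1,0,0), (0,±1,0), (0,0,±1).]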
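Finally, the zeroth order term $Du$ is approximated by a weighted average

\[ I_h(Du) := w_1 D_{0,0,0}u_{0,0,0} + w_2 I_{h,1}(Du) + w_3 I_{h,2}(Du) + w_4 I_{h,3}(Du), \tag{2.7} \]

with parameters $w_j$ ($j = 1,2,3,4$) satisfying $\sum_{j=1}^{4} w_j = 1$. Here, $I_h$ is an average operator, and the operators $I_{h,1}$, $I_{h,2}$ and $I_{h,3}$ are defined by

\[ I_{h,1}(Du) := \frac{1}{6}\left(D_{1,0,0}u_{1,0,0} + D_{0,1,0}u_{0,1,0} + D_{0,0,1}u_{0,0,1} + D_{-1,0,0}u_{-1,0,0} + D_{0,-1,0}u_{0,-1,0} + D_{0,0,-1}u_{0,0,-1}\right), \]

\[ I_{h,2}(Du) := \frac{1}{12}\left(D_{1,1,0}u_{1,1,0} + D_{0,1,1}u_{0,1,1} + D_{1,0,1}u_{1,0,1} + D_{-1,1,0}u_{-1,1,0} + D_{0,-1,1}u_{0,-1,1} + D_{-1,0,1}u_{-1,0,1} + D_{1,-1,0}u_{1,-1,0} + D_{0,1,-1}u_{0,1,-1} + D_{1,0,-1}u_{1,0,-1} + D_{-1,-1,0}u_{-1,-1,0} + D_{0,-1,-1}u_{0,-1,-1} + D_{-1,0,-1}u_{-1,0,-1}\right), \]

and

\[ I_{h,3}(Du) := \frac{1}{8}\left(D_{1,1,1}u_{1,1,1} + D_{-1,1,1}u_{-1,1,1} + D_{1,-1,1}u_{1,-1,1} + D_{1,1,-1}u_{1,1,-1} + D_{-1,-1,1}u_{-1,-1,1} + D_{-1,1,-1}u_{-1,1,-1} + D_{1,-1,-1}u_{1,-1,-1} + D_{-1,-1,-1}u_{-1,-1,-1}\right). \]

With $L_h$ and $I_h$, we obtain a new 27-point finite difference scheme for the 3D Helmholtz-PML Eq. (2.2):

\[ -L_h u - (1 - \alpha i)\,I_h(Du) = g_{0,0,0}. \tag{2.8} \]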
To analyze the consistency of the new 27-point scheme, we recall the notion of pointwise consistency [38].

Definition 2.1. Suppose that the partial differential equation under consideration is $\mathcal{T}u = g$ and the corresponding finite difference approximation is $T_{l,m,n}U_{l,m,n} = G_{l,m,n}$, where $G_{l,m,n}$ denotes whatever approximation has been made of the source term $g$. Let $(x_l, y_m, z_n) := (x_0 + l\Delta x,\, y_0 + m\Delta y,\, z_0 + n\Delta z)$. The finite difference scheme $T_{l,m,n}U_{l,m,n} = G_{l,m,n}$ is pointwise consistent with the partial differential equation $\mathcal{T}u = g$ at $(x,y,z)$ if for any smooth function $\phi = \phi(x,y,z)$,

\[ (\mathcal{T}\phi - g)\big|_{x=x_l,\,y=y_m,\,z=z_n} - \left[T_{l,m,n}\phi(x_l,y_m,z_n) - G_{l,m,n}\right] \to 0 \tag{2.9} \]

as $\Delta x, \Delta y, \Delta z \to 0$.

For the difference approximation (2.8) of the 3D Helmholtz-PML Eq. (2.2), we have the following proposition.

Proposition 2.2. The 27-point finite difference scheme (2.8) is pointwise consistent with the 3D Helmholtz-PML Eq. (2.2).
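Proof. Assume that $x_l \le x < x_{l+1}$, $y_m \le y < y_{m+1}$ and $z_n \le z < z_{n+1}$. We recall that $\sum_{j=1}^{3} c_j = 1$ and $\sum_{j=1}^{4} w_j = 1$. Then, it follows from (2.3)–(2.5) and the Taylor theorem that

\[ L_{h,x}u = \frac{\partial}{\partial x}\left(A\frac{\partial u}{\partial x}\right) + \mu_1 h^2 + O(h^3), \tag{2.10} \]

\[ L_{h,y}u = \frac{\partial}{\partial y}\left(B\frac{\partial u}{\partial y}\right) + \mu_2 h^2 + O(h^3), \tag{2.11} \]

\[ L_{h,z}u = \frac{\partial}{\partial z}\left(C\frac{\partial u}{\partial z}\right) + \mu_3 h^2 + O(h^3), \tag{2.12} \]

in which

\[ \mu_1 := \frac{1}{24}\left[\frac{\partial^3}{\partial x^3}\left(A\frac{\partial u}{\partial x}\right) + \frac{\partial}{\partial x}\left(A\frac{\partial^3 u}{\partial x^3}\right)\right] + \frac{c_2}{4}\left[\frac{\partial^3}{\partial y^2\,\partial x}\left(A\frac{\partial u}{\partial x}\right) + \frac{\partial^3}{\partial z^2\,\partial x}\left(A\frac{\partial u}{\partial x}\right)\right] + \frac{c_3}{4}\left[\left(\frac{\partial}{\partial y} + \frac{\partial}{\partial z}\right)^2\frac{\partial}{\partial x}\left(A\frac{\partial u}{\partial x}\right) + \left(\frac{\partial}{\partial y} - \frac{\partial}{\partial z}\right)^2\frac{\partial}{\partial x}\left(A\frac{\partial u}{\partial x}\right)\right], \]

\[ \mu_2 := \frac{1}{24}\left[\frac{\partial^3}{\partial y^3}\left(B\frac{\partial u}{\partial y}\right) + \frac{\partial}{\partial y}\left(B\frac{\partial^3 u}{\partial y^3}\right)\right] + \frac{c_2}{4}\left[\frac{\partial^3}{\partial x^2\,\partial y}\left(B\frac{\partial u}{\partial y}\right) + \frac{\partial^3}{\partial z^2\,\partial y}\left(B\frac{\partial u}{\partial y}\right)\right] + \frac{c_3}{4}\left[\left(\frac{\partial}{\partial x} + \frac{\partial}{\partial z}\right)^2\frac{\partial}{\partial y}\left(B\frac{\partial u}{\partial y}\right) + \left(\frac{\partial}{\partial x} - \frac{\partial}{\partial z}\right)^2\frac{\partial}{\partial y}\left(B\frac{\partial u}{\partial y}\right)\right], \]

and

\[ \mu_3 := \frac{1}{24}\left[\frac{\partial^3}{\partial z^3}\left(C\frac{\partial u}{\partial z}\right) + \frac{\partial}{\partial z}\left(C\frac{\partial^3 u}{\partial z^3}\right)\right] + \frac{c_2}{4}\left[\frac{\partial^3}{\partial x^2\,\partial z}\left(C\frac{\partial u}{\partial z}\right) + \frac{\partial^3}{\partial y^2\,\partial z}\left(C\frac{\partial u}{\partial z}\right)\right] + \frac{c_3}{4}\left[\left(\frac{\partial}{\partial x} + \frac{\partial}{\partial y}\right)^2\frac{\partial}{\partial z}\left(C\frac{\partial u}{\partial z}\right) + \left(\frac{\partial}{\partial x} - \frac{\partial}{\partial y}\right)^2\frac{\partial}{\partial z}\left(C\frac{\partial u}{\partial z}\right)\right]. \]

Similarly, following (2.7) and the Taylor theorem, we obtain that

\[ I_h(Du) = Du + \mu_4 h^2 + O(h^3), \tag{2.13} \]

where

\[ \mu_4 := \left(\frac{w_2}{6} + \frac{w_3}{3}\right)\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}\right)(Du) + \frac{w_4}{8}\left[\left(\frac{\partial}{\partial x} + \frac{\partial}{\partial y} + \frac{\partial}{\partial z}\right)^2 + \left(\frac{\partial}{\partial x} + \frac{\partial}{\partial y} - \frac{\partial}{\partial z}\right)^2 + \left(\frac{\partial}{\partial x} - \frac{\partial}{\partial y} + \frac{\partial}{\partial z}\right)^2 + \left(-\frac{\partial}{\partial x} + \frac{\partial}{\partial y} + \frac{\partial}{\partial z}\right)^2\right](Du). \]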
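It follows from Eqs. (2.10)–(2.13) that the left hand side of the 27-point finite difference approximation (2.8) is equivalent to

\[ -\frac{\partial}{\partial x}\left(A\frac{\partial u}{\partial x}\right) - \frac{\partial}{\partial y}\left(B\frac{\partial u}{\partial y}\right) - \frac{\partial}{\partial z}\left(C\frac{\partial u}{\partial z}\right) - (1 - \alpha i)Du + \theta_{h^2} h^2 + O(h^3), \tag{2.14} \]

where $\theta_{h^2} := -(\mu_1 + \mu_2 + \mu_3) - (1 - \alpha i)\mu_4$ collects the second order terms $\mu_j$ ($j = 1, \ldots, 4$). Because of $\sum_{j=1}^{3} c_j = 1$, $\sum_{j=1}^{4} w_j = 1$, and Definition 2.1, we come to our conclusion. Hence the scheme is second-order accurate. □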
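The second-order behaviour can also be observed numerically. The following is a minimal sketch under simplifying assumptions: constant coefficients ($A = B = C = 1$, $D = k^2$), no damping ($\alpha = 0$), a uniform step $h$, and the sign convention $-L_h u - k^2 I_h(u)$. The function names `stencil27` and `truncation_error`, the test function $\sin(x + 2y + 3z)$ and the weight values are illustrative choices, not taken from the paper.

```python
import itertools
import math

import numpy as np


def stencil27(c, w, k, h):
    """Return the 3x3x3 coefficient array of -L_h u - k^2 I_h(u).

    Assumes A = B = C = 1, D = k^2, alpha = 0 and a uniform step h.
    Entry [l+1, m+1, n+1] is the weight of u_{l,m,n}.
    """
    c1, c2, c3 = c
    w1, w2, w3, w4 = w
    # Weights of h^2 * L_h, keyed by the number of nonzero offsets:
    # 0 = center, 1 = face, 2 = edge, 3 = corner.
    diff = {0: -6.0 * c1, 1: c1 - c2, 2: (c2 - c3) / 2.0, 3: 0.75 * c3}
    # Weights of the average operator I_h.
    mass = {0: w1, 1: w2 / 6.0, 2: w3 / 12.0, 3: w4 / 8.0}
    s = np.zeros((3, 3, 3))
    for l, m, n in itertools.product((-1, 0, 1), repeat=3):
        t = abs(l) + abs(m) + abs(n)
        s[l + 1, m + 1, n + 1] = -diff[t] / h**2 - k**2 * mass[t]
    return s


def truncation_error(c, w, k, h, point=(0.3, 0.2, 0.1)):
    """Scheme applied to phi = sin(x + 2y + 3z) minus (-Laplace(phi) - k^2 phi)."""
    x0, y0, z0 = point

    def phi(x, y, z):
        return math.sin(x + 2.0 * y + 3.0 * z)

    s = stencil27(c, w, k, h)
    discrete = sum(
        s[l + 1, m + 1, n + 1] * phi(x0 + l * h, y0 + m * h, z0 + n * h)
        for l, m, n in itertools.product((-1, 0, 1), repeat=3)
    )
    exact = (14.0 - k**2) * phi(x0, y0, z0)  # -Laplace(phi) = 14 * phi
    return discrete - exact


c = (0.7, 0.2, 0.1)        # illustrative weights with c1 + c2 + c3 = 1
w = (0.5, 0.2, 0.2, 0.1)   # illustrative weights with w1 + ... + w4 = 1
e1 = truncation_error(c, w, k=5.0, h=0.02)
e2 = truncation_error(c, w, k=5.0, h=0.01)
ratio = e1 / e2            # close to 4 when the error behaves like h^2
```

Halving the step should reduce the local error by a factor close to four, which is what a leading error term of order $h^2$ predicts.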
The next proposition indicates the relationship between the new 27-point finite difference scheme (2.8) and the staggered-grid 27-point finite difference scheme proposed in [24,25].

Proposition 2.3. For the case of the wavenumber being a constant, the staggered-grid 27-point finite difference scheme and the 27-point finite difference scheme (2.8) are equivalent if $w_j = w_{mj}$ ($j = 1,2,3,4$), and

\[ c_1 = c_{s1} + \frac{2}{3}c_{s2} + \frac{1}{2}c_{s3}, \quad c_2 = \frac{1}{3}c_{s2}, \quad c_3 = \frac{1}{2}c_{s3}, \]

where $w_{mj}$ ($j = 1,2,3,4$) and $c_{sj}$ ($j = 1,2,3$) are parameters in the staggered-grid 27-point finite difference scheme.
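Proof. For convenience, we first introduce the notations

\[ R_0 := u_{0,0,0}, \]

\[ R_1 := u_{1,0,0} + u_{0,1,0} + u_{0,0,1} + u_{-1,0,0} + u_{0,-1,0} + u_{0,0,-1}, \]

\[ R_2 := u_{1,1,0} + u_{0,1,1} + u_{1,0,1} + u_{-1,1,0} + u_{0,-1,1} + u_{-1,0,1} + u_{1,-1,0} + u_{0,1,-1} + u_{1,0,-1} + u_{-1,-1,0} + u_{0,-1,-1} + u_{-1,0,-1}, \]

and

\[ R_3 := u_{1,1,1} + u_{-1,1,1} + u_{1,-1,1} + u_{1,1,-1} + u_{-1,-1,1} + u_{-1,1,-1} + u_{1,-1,-1} + u_{-1,-1,-1}. \]

When the wavenumber is a constant, the staggered-grid 27-point scheme in [25] reduces to

\[ -\left[c_{s1}\frac{R_1 - 6R_0}{h^2} + c_{s2}\frac{\frac{1}{2}R_2 + R_1 - 12R_0}{3h^2} + c_{s3}\frac{\frac{3}{2}R_3 - R_2 + 2R_1 - 12R_0}{4h^2}\right] - (1 - \alpha i)k^2\left(w_{m1}R_0 + \frac{w_{m2}}{6}R_1 + \frac{w_{m3}}{12}R_2 + \frac{w_{m4}}{8}R_3\right) = g_{0,0,0}, \tag{2.15} \]

where $\sum_{j=1}^{4} w_{mj} = 1$ and $\sum_{j=1}^{3} c_{sj} = 1$. For the details of the scheme, we refer readers to the derivation of Eq. (6) in [25]. For our proposed 27-point finite difference scheme, the formula (2.8) reduces to

\[ -\left[c_1\frac{R_1 - 6R_0}{h^2} + c_2\frac{\frac{1}{2}R_2 - R_1}{h^2} + c_3\frac{\frac{3}{4}R_3 - \frac{1}{2}R_2}{h^2}\right] - (1 - \alpha i)k^2\left(w_1R_0 + \frac{w_2}{6}R_1 + \frac{w_3}{12}R_2 + \frac{w_4}{8}R_3\right) = g_{0,0,0}. \tag{2.16} \]

For Eqs. (2.15) and (2.16), comparing the coefficients of $R_0$, $R_1$, $R_2$ and $R_3$ yields the conclusion of this proposition. □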
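The coefficient comparison in the proof can be checked with exact rational arithmetic. The sketch below assumes the difference parts of (2.15) and (2.16) as written above (multiplied by $h^2$); the helper names `staggered_coeffs`, `new_coeffs` and `map_parameters` and the test values are hypothetical.

```python
from fractions import Fraction as Fr


def staggered_coeffs(cs1, cs2, cs3):
    """Weights of (R0, R1, R2, R3) in h^2 times the difference part of (2.15)."""
    r0 = -6 * cs1 - 4 * cs2 - 3 * cs3
    r1 = cs1 + cs2 / 3 + cs3 / 2
    r2 = cs2 / 6 - cs3 / 4
    r3 = 3 * cs3 / 8
    return (r0, r1, r2, r3)


def new_coeffs(c1, c2, c3):
    """Weights of (R0, R1, R2, R3) in h^2 times the difference part of (2.16)."""
    return (-6 * c1, c1 - c2, (c2 - c3) / 2, 3 * c3 / 4)


def map_parameters(cs1, cs2, cs3):
    """Parameter map stated in Proposition 2.3."""
    return (cs1 + Fr(2, 3) * cs2 + Fr(1, 2) * cs3, Fr(1, 3) * cs2, Fr(1, 2) * cs3)


cs = (Fr(1, 2), Fr(1, 5), Fr(3, 10))   # arbitrary test values with sum 1
c = map_parameters(*cs)
same = staggered_coeffs(*cs) == new_coeffs(*c)
```

Because `Fraction` arithmetic is exact, `same` compares the four coefficients without rounding, and the mapped weights still sum to one.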
3. Error analysis between the numerical wavenumber and the exact wavenumber
In this section, a classical dispersion analysis is carried out to assess the accuracy of the new scheme (2.8), and a theoretical result is given on how well the numerical wavenumber approximates the exact wavenumber when $kh$ is small enough.
To do a dispersion analysis for the new scheme (2.8), we consider an infinite homogeneous model with constant velocity $v$, and we also assume that the medium has no damping effect on the waves, that is, $\alpha = 0$ in Eq. (2.1). Thus, we obtain Eq. (2.16) with $\alpha = 0$. Let $\lambda$ be the wavelength, and $G := \lambda/h$ be the number of gridpoints per wavelength. Since $\lambda = v/f$ and the wavenumber $k := 2\pi f/v$, we have $kh = 2\pi/G$. Following the classical harmonic approach, we first insert the discrete expression of a plane wave $u_{l,m,n} = e^{ihk(l\cos\phi\cos\theta + m\cos\phi\sin\theta + n\sin\phi)}$ into (2.16), where $\frac{\pi}{2} - \phi$ is the propagation angle from the z-axis, and $\theta$ is the propagation angle from the x-axis. Let

\[ a := kh\cos\phi\cos\theta = \frac{2\pi}{G}\cos\phi\cos\theta, \quad b := kh\cos\phi\sin\theta = \frac{2\pi}{G}\cos\phi\sin\theta, \quad c := kh\sin\phi = \frac{2\pi}{G}\sin\phi. \]

By a simple computation, we have the dispersion equation

\[ k^2h^2 L = M, \tag{3.1} \]

in which

\[ L := w_1 + \frac{w_2}{3}H + \frac{w_3}{3}F + w_4E, \qquad M := 2\left[c_1(3 - H) + c_2(H - F) + c_3(F - 3E)\right], \]

with $E := \cos a\cos b\cos c$, $F := \cos a\cos b + \cos a\cos c + \cos b\cos c$, and $H := \cos a + \cos b + \cos c$. Then, replacing $k$ on the left side of Eq. (3.1) with the numerical wavenumber $k^N$ yields

\[ k^N = \frac{1}{h}\sqrt{\frac{M}{L}}. \tag{3.2} \]
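The ratio $k^N/k$ implied by (3.2) is easy to evaluate. The following sketch assumes the forms of $L$ and $M$ given in (3.1); the function names are hypothetical, and the classical 7-point scheme ($c_1 = w_1 = 1$, all other weights zero) is used only as a familiar reference case.

```python
import math


def dispersion_ML(c, w, G, phi, theta):
    """Return (M, L) of the dispersion equation (3.1) for kh = 2*pi/G."""
    c1, c2, c3 = c
    w1, w2, w3, w4 = w
    s = 2.0 * math.pi / G
    ca = math.cos(s * math.cos(phi) * math.cos(theta))
    cb = math.cos(s * math.cos(phi) * math.sin(theta))
    cc = math.cos(s * math.sin(phi))
    E = ca * cb * cc
    F = ca * cb + ca * cc + cb * cc
    H = ca + cb + cc
    M = 2.0 * (c1 * (3.0 - H) + c2 * (H - F) + c3 * (F - 3.0 * E))
    L = w1 + w2 * H / 3.0 + w3 * F / 3.0 + w4 * E
    return M, L


def wavenumber_ratio(c, w, G, phi, theta):
    """Return k^N / k = (G / (2*pi)) * sqrt(M / L), cf. (3.2)."""
    M, L = dispersion_ML(c, w, G, phi, theta)
    return G / (2.0 * math.pi) * math.sqrt(M / L)


# Reference case: the classical 7-point scheme, wave travelling along x.
seven_point = ((1.0, 0.0, 0.0), (1.0, 0.0, 0.0, 0.0))
err20 = 1.0 - wavenumber_ratio(*seven_point, G=20, phi=0.0, theta=0.0)
err40 = 1.0 - wavenumber_ratio(*seven_point, G=40, phi=0.0, theta=0.0)
```

Doubling $G$ (halving $kh$) should reduce the relative wavenumber error by roughly a factor of four for this reference case.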
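The next proposition presents the error between the numerical wavenumber $k^N$ and the exact wavenumber $k$.

Proposition 3.1. For the 27-point finite difference scheme (2.8) with $\alpha = 0$, there holds

\[ k^N - k = -\left\{\frac{1}{24}(c_1 + c_2 - 8c_3)\left(b_1^4 + b_2^4 + b_3^4\right) - \frac{1}{4}(c_3 - c_2)\left(b_1^2b_2^2 + b_1^2b_3^2 + b_2^2b_3^2\right) + \left(\frac{3}{8}c_3 - \frac{1}{12}w_2 - \frac{1}{6}w_3 - \frac{1}{4}w_4\right)\right\}k^3h^2 + O(k^4h^3), \quad kh \to 0, \tag{3.3} \]

where $b_1 := \cos\phi\cos\theta$, $b_2 := \cos\phi\sin\theta$, and $b_3 := \sin\phi$.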

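Proof. Let $s := kh$. Then, $a = b_1s$, $b = b_2s$, $c = b_3s$, $E(s) = \cos a\cos b\cos c$, $F(s) = \cos a\cos b + \cos a\cos c + \cos b\cos c$, and $H(s) = \cos a + \cos b + \cos c$. Moreover, both $M$ and $L$ depend on the variable $s$, denoted by $M(s)$ and $L(s)$ respectively. With the Taylor expansion, we have

\[ M(s) = s^2 - \frac{1}{12}\left\{(c_1 + c_2 - 8c_3)\left(b_1^4 + b_2^4 + b_3^4\right) + 6(c_2 - c_3)\left(b_1^2b_2^2 + b_1^2b_3^2 + b_2^2b_3^2\right) + 9c_3\right\}s^4 + O(s^5), \tag{3.4} \]

and

\[ \frac{1}{L(s)} = 1 + \frac{1}{6}(w_2 + 2w_3 + 3w_4)s^2 + O(s^3). \tag{3.5} \]

In addition, from Eq. (3.2), we have

\[ \left(k^Nh\right)^2 = \frac{M(s)}{L(s)}. \]

Together with Eqs. (3.4) and (3.5), we have

\[ \left(k^N\right)^2 = k^2 - \left\{\frac{1}{12}(c_1 + c_2 - 8c_3)\left(b_1^4 + b_2^4 + b_3^4\right) - \frac{1}{2}(c_3 - c_2)\left(b_1^2b_2^2 + b_1^2b_3^2 + b_2^2b_3^2\right) + \left(\frac{3}{4}c_3 - \frac{1}{6}w_2 - \frac{1}{3}w_3 - \frac{1}{2}w_4\right)\right\}k^4h^2 + O(k^5h^3), \quad kh \to 0. \]

Based on the above equation, applying the Taylor expansion of the function $\sqrt{1 + s}$ at the point $s = 0$ yields the conclusion of this proposition. □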
We remark that the above proposition indicates that $k^N$ approximates $k$ with second order accuracy. Moreover, the term associated with $k^3h^2$ represents the pollution effect, which depends on the wavenumber $k$, the parameters of the finite difference formula (2.8) and the wave's propagation angles $\phi$ and $\theta$.
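The leading term of (3.3) can be compared with the value of $k^N - k$ computed directly from (3.2). This is a sketch under the assumption that $L$, $M$ and the coefficient in (3.3) have the forms written above; the weights, angles and step are arbitrary illustrative values, and the function names are hypothetical.

```python
import math


def direction(phi, theta):
    """Return (b1, b2, b3) as defined in Proposition 3.1."""
    return (math.cos(phi) * math.cos(theta),
            math.cos(phi) * math.sin(theta),
            math.sin(phi))


def exact_difference(c, w, k, h, phi, theta):
    """k^N - k evaluated directly from (3.2)."""
    c1, c2, c3 = c
    w1, w2, w3, w4 = w
    ca, cb, cc = (math.cos(k * h * b) for b in direction(phi, theta))
    E = ca * cb * cc
    F = ca * cb + ca * cc + cb * cc
    H = ca + cb + cc
    M = 2.0 * (c1 * (3.0 - H) + c2 * (H - F) + c3 * (F - 3.0 * E))
    L = w1 + w2 * H / 3.0 + w3 * F / 3.0 + w4 * E
    return math.sqrt(M / L) / h - k


def leading_term(c, w, k, h, phi, theta):
    """Leading term of (3.3): minus the braced coefficient times k^3 h^2."""
    c1, c2, c3 = c
    w1, w2, w3, w4 = w
    b1, b2, b3 = direction(phi, theta)
    P = b1**4 + b2**4 + b3**4
    Q = b1**2 * b2**2 + b1**2 * b3**2 + b2**2 * b3**2
    brace = ((c1 + c2 - 8.0 * c3) * P / 24.0 - (c3 - c2) * Q / 4.0
             + 3.0 * c3 / 8.0 - w2 / 12.0 - w3 / 6.0 - w4 / 4.0)
    return -brace * k**3 * h**2


c = (0.7, 0.2, 0.1)        # illustrative weights, sum 1
w = (0.5, 0.2, 0.2, 0.1)   # illustrative weights, sum 1
args = dict(k=1.0, h=0.01, phi=0.3, theta=0.5)
gap = abs(exact_difference(c, w, **args) - leading_term(c, w, **args))
```

For small $kh$ the two values should agree up to higher-order terms, so `gap` is expected to be much smaller than the leading term itself.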
4. A refined parameter choice strategy for the new 27-point finite difference scheme
In this section, a refined strategy is presented to choose optimal parameters for the new scheme (2.8), based on minimizing the numerical dispersion. An optimization problem is solved with a finer setting to estimate the weight coefficients $c_1, c_2, c_3$ and $w_1, w_2, w_3, w_4$.
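As is known, the normalized numerical phase velocity and group velocity are two important tools for measuring the numerical dispersion, and the former is usually preferred in practice (see [22,25,29,39]). For Eq. (2.16) with $\alpha = 0$, the normalized numerical phase and group velocities are

\[ \frac{V^N_{ph}}{v} = \frac{G}{2\pi}\sqrt{\frac{M}{L}}, \tag{4.1} \]

and

\[ \frac{V^N_{gr}}{v} = \frac{G}{4\pi}\,\frac{v}{V^N_{ph}}\,\frac{\frac{1}{h}\frac{\partial M}{\partial k}L - M\frac{1}{h}\frac{\partial L}{\partial k}}{L^2}, \tag{4.2} \]

respectively. With $h = \frac{2\pi}{Gk}$, we can easily conclude that

\[ \frac{V^N_{ph}}{v} = \frac{k^N}{k} = \frac{G}{2\pi}\sqrt{\frac{M}{L}}, \tag{4.3} \]

which means that minimizing the error between the numerical wavenumber $k^N$ and the wavenumber $k$ is equivalent to minimizing the deviation of the normalized numerical phase velocity from 1.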
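Now, we choose optimal parameters $c_j$ ($j = 1,2,3$) and $w_j$ ($j = 1,2,3,4$) by minimizing the numerical dispersion. To this end, we set

\[ J(c_1,\ldots,c_3,w_1,\ldots,w_4;G,\phi,\theta) := \frac{G}{2\pi}\sqrt{\frac{M}{L}} - 1, \tag{4.4} \]

with $\sum_{j=1}^{3} c_j = 1$, $\sum_{j=1}^{4} w_j = 1$ and $(G,\phi,\theta) \in I_G \times I_\phi \times I_\theta$. Here, $I_G$, $I_\phi$ and $I_\theta$ are three intervals. In general, one can choose $I_\phi := [0, \frac{\pi}{2}]$, $I_\theta := [0, \frac{\pi}{4}]$ and $I_G := [G_{\min}, G_{\max}] = [4, 400]$, with $G_{\min} \ge 2$ by the Nyquist sampling limit [29]. It is observed from (4.3) that minimizing the error between the numerical wavenumber $k^N$ and the exact wavenumber $k$ is equivalent to minimizing the norm $\|J(c_1,\ldots,c_3,w_1,\ldots,w_4;\cdot,\cdot,\cdot)\|_{\infty,\,I_G \times I_\phi \times I_\theta}$. For this purpose, letting $J(c_1,\ldots,c_3,w_1,\ldots,w_4;G,\phi,\theta) = 0$ and eliminating $c_3$ and $w_4$ through the two constraints yields

\[ c_1\,2G^2(3 + 3E - F - H) + c_2\,2G^2(3E - 2F + H) + w_1\,4\pi^2(E - 1) + w_2\,4\pi^2\left(E - \frac{1}{3}H\right) + w_3\,4\pi^2\left(E - \frac{1}{3}F\right) = 4\pi^2E + 2G^2(3E - F). \tag{4.5} \]

We choose

\[ \frac{1}{G} = \frac{1}{G_{l'}} := \frac{1}{G_{\max}} + (l' - 1)\,\frac{\frac{1}{G_{\min}} - \frac{1}{G_{\max}}}{l - 1} \in \left[\frac{1}{G_{\max}}, \frac{1}{G_{\min}}\right], \quad \text{for } l' = 1, 2, \ldots, l, \]

\[ \phi = \phi_{m'} := \frac{(m' - 1)\pi}{2(m - 1)} \in I_\phi, \quad \text{for } m' = 1, 2, \ldots, m, \]

and

\[ \theta = \theta_{n'} := \frac{(n' - 1)\pi}{4(n - 1)} \in I_\theta, \quad \text{for } n' = 1, 2, \ldots, n. \]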
Let $a_{l',m',n'} := \frac{2\pi}{G_{l'}}\cos\phi_{m'}\cos\theta_{n'}$, $b_{l',m',n'} := \frac{2\pi}{G_{l'}}\cos\phi_{m'}\sin\theta_{n'}$, and $c_{l',m'} := \frac{2\pi}{G_{l'}}\sin\phi_{m'}$. Then, from (4.5), we obtain an overdetermined linear system

\[
\begin{bmatrix}
S^1_{1,1,1} & S^2_{1,1,1} & S^3_{1,1,1} & S^4_{1,1,1} & S^5_{1,1,1} \\
\vdots & \vdots & \vdots & \vdots & \vdots \\
S^1_{1,1,n} & S^2_{1,1,n} & S^3_{1,1,n} & S^4_{1,1,n} & S^5_{1,1,n} \\
S^1_{1,2,1} & S^2_{1,2,1} & S^3_{1,2,1} & S^4_{1,2,1} & S^5_{1,2,1} \\
\vdots & \vdots & \vdots & \vdots & \vdots \\
S^1_{l',m',n'} & S^2_{l',m',n'} & S^3_{l',m',n'} & S^4_{l',m',n'} & S^5_{l',m',n'} \\
\vdots & \vdots & \vdots & \vdots & \vdots \\
S^1_{l,m,n} & S^2_{l,m,n} & S^3_{l,m,n} & S^4_{l,m,n} & S^5_{l,m,n}
\end{bmatrix}
\begin{bmatrix} c_1 \\ c_2 \\ w_1 \\ w_2 \\ w_3 \end{bmatrix}
=
\begin{bmatrix}
S^6_{1,1,1} \\ \vdots \\ S^6_{1,1,n} \\ S^6_{1,2,1} \\ \vdots \\ S^6_{l',m',n'} \\ \vdots \\ S^6_{l,m,n}
\end{bmatrix},
\tag{4.6}
\]

where

\[ S^1_{l',m',n'} := 2G_{l'}^2\left(3 + 3E_{l',m',n'} - F_{l',m',n'} - H_{l',m',n'}\right), \]
\[ S^2_{l',m',n'} := 2G_{l'}^2\left(3E_{l',m',n'} - 2F_{l',m',n'} + H_{l',m',n'}\right), \]
\[ S^3_{l',m',n'} := 4\pi^2\left(E_{l',m',n'} - 1\right), \]
\[ S^4_{l',m',n'} := 4\pi^2\left(E_{l',m',n'} - \frac{1}{3}H_{l',m',n'}\right), \]
\[ S^5_{l',m',n'} := 4\pi^2\left(E_{l',m',n'} - \frac{1}{3}F_{l',m',n'}\right), \]
\[ S^6_{l',m',n'} := 4\pi^2E_{l',m',n'} + 2G_{l'}^2\left(3E_{l',m',n'} - F_{l',m',n'}\right), \]
\[ E_{l',m',n'} := \cos a_{l',m',n'}\cos b_{l',m',n'}\cos c_{l',m'}, \]
\[ F_{l',m',n'} := \cos a_{l',m',n'}\cos b_{l',m',n'} + \cos a_{l',m',n'}\cos c_{l',m'} + \cos b_{l',m',n'}\cos c_{l',m'}, \]
\[ H_{l',m',n'} := \cos a_{l',m',n'} + \cos b_{l',m',n'} + \cos c_{l',m'}. \]
The above coefficient matrix has $l \times m \times n$ rows and 5 columns, and the system can be solved by the least-squares method. In [25], optimal parameters were chosen globally, and were used in the computation for different frequencies, velocities and step sizes. This may yield significant numerical dispersion for large wavenumbers and varying $k(x,y,z)$. To reduce the numerical dispersion and improve the accuracy of the difference scheme, we propose the following rule with a finer setting.
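Rule 4.1 (Refined choice strategy).
Step 1. Estimate the interval $I_G := [G_{\min}, G_{\max}]$.
Step 2. Choose $c_i$ ($i \in \mathbb{N}_3$) and $w_j$ ($j \in \mathbb{N}_4$) such that

\[ (c_1,\ldots,c_3,w_1,\ldots,w_4) = \arg\min\left\{\|J(c_1,\ldots,c_3,w_1,\ldots,w_4;\cdot,\cdot,\cdot)\|_{\infty,\,I_G \times I_\phi \times I_\theta} : \sum_{j=1}^{3} c_j = 1,\ \sum_{j=1}^{4} w_j = 1\right\}. \tag{4.7} \]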
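The assembly and least-squares solution of the overdetermined system (4.6) can be sketched as follows. This is a minimal illustration, assuming the sampling of $1/G$, $\phi$ and $\theta$ and the entries $S^1,\ldots,S^6$ described above; `refined_parameters` is a hypothetical name, and NumPy's `numpy.linalg.lstsq` is used here as a generic least-squares solver, which is not necessarily the solver the authors used.

```python
import numpy as np


def refined_parameters(G_min, G_max, l=20, m=20, n=10):
    """Least-squares solution of the overdetermined system (4.6).

    Returns the weights c = (c1, c2, c3), w = (w1, w2, w3, w4), together
    with the assembled matrix S (l*m*n rows, 5 columns) and right-hand side.
    """
    inv_G = np.linspace(1.0 / G_max, 1.0 / G_min, l)   # uniform in 1/G
    phi = np.linspace(0.0, np.pi / 2.0, m)
    theta = np.linspace(0.0, np.pi / 4.0, n)
    iG, ph, th = (x.ravel() for x in np.meshgrid(inv_G, phi, theta, indexing="ij"))
    G = 1.0 / iG
    s = 2.0 * np.pi * iG                               # s = kh = 2*pi/G
    ca = np.cos(s * np.cos(ph) * np.cos(th))
    cb = np.cos(s * np.cos(ph) * np.sin(th))
    cc = np.cos(s * np.sin(ph))
    E = ca * cb * cc
    F = ca * cb + ca * cc + cb * cc
    H = ca + cb + cc
    S = np.column_stack([
        2.0 * G**2 * (3.0 + 3.0 * E - F - H),          # S^1, multiplies c1
        2.0 * G**2 * (3.0 * E - 2.0 * F + H),          # S^2, multiplies c2
        4.0 * np.pi**2 * (E - 1.0),                    # S^3, multiplies w1
        4.0 * np.pi**2 * (E - H / 3.0),                # S^4, multiplies w2
        4.0 * np.pi**2 * (E - F / 3.0),                # S^5, multiplies w3
    ])
    rhs = 4.0 * np.pi**2 * E + 2.0 * G**2 * (3.0 * E - F)   # S^6
    x, *_ = np.linalg.lstsq(S, rhs, rcond=None)
    c1, c2, w1, w2, w3 = x
    c = (c1, c2, 1.0 - c1 - c2)                        # c3 from sum(c) = 1
    w = (w1, w2, w3, 1.0 - w1 - w2 - w3)               # w4 from sum(w) = 1
    return c, w, S, rhs


c, w, S, rhs = refined_parameters(5.0, 6.0)
```

Note that the least-squares solution minimizes the 2-norm of the residual of (4.5) over the sample points, which is a practical surrogate for the $\infty$-norm criterion in (4.7) rather than the same problem.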
Dispersion reducing schemes have been discussed since the 1990s, and they can be found in the work of Tam and his collaborators (cf. [35–37]). We now describe the relation between Tam's work and ours. Both are based on the dispersion relation of the waves, that is, a functional relation between the angular frequency of the waves and the wavenumbers of the spatial variables. This relation is usually obtained by taking the space and time Fourier transforms of the governing equations, and it determines the dispersiveness, damping rate, isotropy or anisotropy, and group and phase velocities of the waves. Therefore, both the work of Tam and ours are based on the fact that a good finite difference scheme should have the same or almost the same dispersion relation as the original equations. However, there are also some differences. Firstly, we concentrate on solving the acoustic wave equation in the frequency domain, that is, the Helmholtz equation, which possesses some different properties compared with the acoustic wave equation in the time domain. Tam and his collaborators focus on acoustic problems in the time domain, especially the linearized Euler equations. Secondly, we take a different dispersion minimizing strategy. For our finite difference scheme, we compute the numerical dispersion Eq. (3.1) and the normalized numerical phase velocity (4.1), which have seven weight parameters to be determined. As is known, the normalized numerical phase velocity would be 1 if there were no numerical dispersion. We therefore obtain the weight parameters by minimizing the error between the normalized numerical phase velocity and 1. Tam and his collaborators instead try to formulate a finite difference scheme having nearly the same Fourier transform in space or time as the original partial derivatives. The finite difference approximation of the spatial derivatives and the treatment of the time derivative were discussed separately. For details, we refer the reader to [37].

Table 1
Refined optimal parameters.

I_G    [2, 2.5]     [3.5, 4]     [5, 6]       [7, 8]       [9, 10]      [10, 400]
c_1     0.5035127    0.7617528    0.8159342    0.8354262    0.8432810    0.8269996
c_2     0.0720630   -0.0148152   -0.0340791   -0.0394517   -0.0414069    4.097e-07
c_3     0.4244243    0.2530624    0.2181449    0.2040255    0.1981258    0.1729999
w_1     0.4058413    0.7602512    1.1330134    1.7177071    2.4693294    2.9473150
w_2     0.1966284   -0.4883334   -1.5191327   -3.2400262   -5.4811311   -6.8805122
w_3     0.5979158    1.1153920    2.1033335    3.8084643    6.0429826    7.4116566
w_4    -0.2003855   -0.3873097   -0.7172142   -1.2861453   -2.0311809   -2.4784594

We now turn our attention to computing the weight parameters with Rule 4.1. In general, we can estimate $I_G$ by using a priori information before computing parameters. For constant wavenumbers, the number of grid points per wavelength $G$ usually lies in $[2, 400]$, which can be partitioned into small intervals for practical computing, such as $I_G := [2, 2.5], [2.5, 3], \ldots, [10, 400]$. For varying wavenumbers, the velocity $v$ lies in $[v_{\min}, v_{\max}]$, where $v_{\min}$ and $v_{\max}$ denote the minimum and maximum velocity respectively. For each frequency $f$, we can obtain the interval $I_G = [G_{\min}, G_{\max}]$ with $G_{\min} = \frac{v_{\min}}{fh}$ and $G_{\max} = \frac{v_{\max}}{fh}$. Then, for each interval $I_G$, together with $I_\phi := [0, \frac{\pi}{2}]$ and $I_\theta := [0, \frac{\pi}{4}]$, we can obtain a group of parameters. In computation, we need to solve an overdetermined linear system (4.6), whose size $(l \times m \times n) \times 5$ is related to the partition of $I_G$, $I_\phi$ and $I_\theta$. Generally, we would obtain better parameters with a finer partition, at a higher cost. Fortunately, we do not need a very fine partition, since the benefit of further refinement diminishes quickly. In practice, we choose $l = 20$, $m = 20$, $n = 10$, and the resulting system can be solved inexpensively.
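The estimate of $I_G$ for a heterogeneous model follows directly from $G = v/(fh)$. A minimal sketch; the function name `grid_points_interval` and the velocity, frequency and step values are hypothetical and only illustrate the arithmetic.

```python
def grid_points_interval(v_min, v_max, f, h):
    """Return (G_min, G_max), the range of grid points per wavelength G = v / (f * h)."""
    if not (0.0 < v_min <= v_max) or f <= 0.0 or h <= 0.0:
        raise ValueError("expected 0 < v_min <= v_max, f > 0 and h > 0")
    return v_min / (f * h), v_max / (f * h)


# Hypothetical model: velocities between 1500 and 4500 m/s, f = 10 Hz, h = 50 m.
G_min, G_max = grid_points_interval(1500.0, 4500.0, f=10.0, h=50.0)
```

With these illustrative numbers the slowest part of the model is sampled with 3 points per wavelength and the fastest with 9, so the parameters would be computed for $I_G = [3, 9]$.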
Table 1 gives a group of refined parameters. In Fig. 2(a)–(c), we present the normalized phase velocity curves for the staggered-grid scheme with global choice strategy (the staggered-grid 27p), the new scheme with global choice strategy (the global 27p), and the new scheme with refined parameters (the refined 27p) in Table 1, respectively. As can be seen, both the global 27p and the refined 27p schemes improve on the staggered-grid 27p scheme, and the refined 27p scheme has the least numerical dispersion. Therefore, we would expect that the refined 27p scheme can be used to control the numerical dispersion and to suppress nonphysical oscillations to a certain degree.
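The effect of the refined parameters can be illustrated by evaluating (4.1) on a few sample points. The sketch below uses the Table 1 column for $I_G = [10, 400]$ and, purely as a familiar baseline that is not one of the schemes compared in Fig. 2, the classical 7-point scheme. The signs of $w_2$ and $w_4$ are an assumption: they are the signs for which $w_1 + w_2 + w_3 + w_4 = 1$ holds. The function names and the sample values of $G$ are hypothetical.

```python
import math

# Table 1, column I_G = [10, 400]. The minus signs of w2 and w4 are assumed:
# they are the signs for which the constraint w1 + w2 + w3 + w4 = 1 holds.
C_REFINED = (0.8269996, 4.097e-07, 0.1729999)
W_REFINED = (2.9473150, -6.8805122, 7.4116566, -2.4784594)
# Classical 7-point scheme, used only as a baseline.
C_SEVEN, W_SEVEN = (1.0, 0.0, 0.0), (1.0, 0.0, 0.0, 0.0)


def phase_velocity(c, w, G, phi, theta):
    """Normalized numerical phase velocity (4.1)."""
    c1, c2, c3 = c
    w1, w2, w3, w4 = w
    s = 2.0 * math.pi / G
    ca = math.cos(s * math.cos(phi) * math.cos(theta))
    cb = math.cos(s * math.cos(phi) * math.sin(theta))
    cc = math.cos(s * math.sin(phi))
    E = ca * cb * cc
    F = ca * cb + ca * cc + cb * cc
    H = ca + cb + cc
    M = 2.0 * (c1 * (3.0 - H) + c2 * (H - F) + c3 * (F - 3.0 * E))
    L = w1 + w2 * H / 3.0 + w3 * F / 3.0 + w4 * E
    return G / (2.0 * math.pi) * math.sqrt(M / L)


def max_dispersion(c, w, G_values, n_phi=21, n_theta=11):
    """Largest |V_ph^N / v - 1| over the sampled G, phi and theta."""
    worst = 0.0
    for G in G_values:
        for i in range(n_phi):
            for j in range(n_theta):
                phi = i * math.pi / (2 * (n_phi - 1))
                theta = j * math.pi / (4 * (n_theta - 1))
                worst = max(worst, abs(phase_velocity(c, w, G, phi, theta) - 1.0))
    return worst


G_values = [10.0, 20.0, 50.0, 100.0, 400.0]
err_refined = max_dispersion(C_REFINED, W_REFINED, G_values)
err_seven = max_dispersion(C_SEVEN, W_SEVEN, G_values)
```

On these sample points the refined weights are expected to keep the normalized phase velocity much closer to 1 than the 7-point baseline.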
Numerically, the dispersion is not sensitive to small changes of the weight parameters. That is, the dispersion would not change much if the parameters are perturbed a little. To see this, we present two dispersion pictures in Fig. 2(d) and (e), in which the parameters are taken from (c) but with perturbations of 0.01 and 0.1, respectively. As can be observed, for a small change of parameters, the normalized phase velocity curve in (d) is almost the same as that of the original. However, if the parameters are perturbed too much, as in (e), the normalized phase velocity curve changes significantly. We also point out that a small perturbation of the parameters does not greatly influence the performance of the difference scheme. We shall illustrate this by examples in Section 7.1.
5. A 3D complex shifted-Laplacian-PML preconditioner for the 3D Helmholtz-PML equation
In this section, the complex shifted-Laplacian-PML preconditioner is developed for the 3D Helmholtz-PML equation, and a spectral analysis of the discrete preconditioned system is given from the perspective of linear fractional mappings on the complex plane.
After discretization of the 3D Helmholtz-PML equation, we would like to solve the resulting linear system with iterative solvers, since direct methods are limited by storage. Among iterative methods, Krylov subspace methods, such as Bi-CGSTAB [41] and GMRES [27], are usually preferred. However, due to the indefiniteness and the bad condition number of the resulting system, a Krylov subspace method alone is not competitive, and preconditioning is required. That is, a good preconditioner should be constructed so that the preconditioned system has a favorable spectral distribution, which contributes to a lower condition number and hence a fast convergence of the iterative method (cf. [7]).
Many preconditioned iterative solvers for the Helmholtz equation have been explored in the past few decades. Bayliss
et al. [5] established a benchmark for the iterative solution of the Helmholtz equation. They proposed an iterative algorithm
that uses the Laplacian as a preconditioner, combined with the conjugate gradient (CG) iteration. Since the Helmholtz problem
is indefinite and non-self-adjoint, the preconditioned CG iteration was applied to the normal equation, which is symmetric
and positive definite. The preconditioner was inverted approximately with one SSOR sweep. In [6,18], an even greater
improvement was obtained when a multigrid sweep with red-black ordering was employed to invert the preconditioner
approximately. Later, a family of shifted-Laplacian preconditioners [14,15,23] was developed, which performed efficiently, especially the complex shifted-Laplacian preconditioner (cf. [14])
Fig. 2. Normalized phase velocity curves ($V_{ph}^N/v$ versus $1/G$) for: (a) the staggered-grid 27p scheme, (b) the global 27p scheme, (c) the refined 27p scheme, (d) the refined 27p
scheme with parameters in (c) disturbed by 0.01, (e) the refined 27p scheme with parameters in (c) disturbed by 0.1.
$$\mathcal{M} := -\Delta - (\beta_1 - \beta_2 i)k^2, \qquad (5.1)$$
where $\beta_1$ and $\beta_2$ are positive parameters. In [14,15,23], a multigrid sweep was also employed to invert the preconditioner
approximately. As can be seen, the shifted-Laplacian preconditioner is a generalization of the original Laplacian preconditioner proposed in [5]: it is obtained by adding a zeroth order term to the Laplacian. With the
shifted-Laplacian preconditioner, a preconditioned Krylov subspace method such as Bi-CGSTAB is employed and is applied directly to the discrete Helmholtz equation. With the Laplacian preconditioner [5,6,18], in contrast, the preconditioned CG iteration was applied to the normal equation instead.
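In this paper, for the 3D Helmholtz-PML Eq. (2.2), we base our preconditioner on the operator
$$\mathcal{M} := -\frac{\partial}{\partial x}\left(A\frac{\partial}{\partial x}\right) - \frac{\partial}{\partial y}\left(B\frac{\partial}{\partial y}\right) - \frac{\partial}{\partial z}\left(C\frac{\partial}{\partial z}\right) - (\beta_1 - \beta_2 i)D. \qquad (5.2)$$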
If $\beta_1, \beta_2 > 0$, we call the operator (5.2) the 3D complex shifted-Laplacian-PML preconditioner. When $A = B = C = 1$ and
$D = k^2$, (5.2) is equivalent to the operator (5.1). The 3D complex shifted-Laplacian-PML preconditioner gives a nicely clustered
spectrum, which will be presented later. We employ (5.2) to precondition the 3D Helmholtz-PML Eq. (2.2), obtaining the preconditioned equation in the operator form
$$\mathcal{A}\mathcal{M}^{-1}v = g, \qquad v = \mathcal{M}u. \qquad (5.3)$$
We now turn to the discretization of (5.3) and the spectral analysis of the discrete preconditioned system. Applying
the new 27-point finite difference scheme (2.8) to discretize (5.3), we obtain the discrete preconditioned system
$$AM^{-1}v = g, \qquad v = Mu, \qquad (5.4)$$
where $A, M \in \mathbb{C}^{N \times N}$; $u, g \in \mathbb{C}^{N}$; $\mathbb{C}$ is the set of complex numbers and $N$ is the number of unknowns. The matrices $A$ and $M$ are the discrete Helmholtz-PML operator and preconditioner respectively, and they have the forms
$$A := L - z_1 D, \qquad z_1 := 1 - \alpha i, \qquad (5.5)$$
and
$$M := L - z_2 D, \qquad z_2 := \beta_1 - \beta_2 i, \qquad (5.6)$$
where $L$ and $D$ are discretizations of the 3D Laplacian-PML operator $-\frac{\partial}{\partial x}\left(A\frac{\partial}{\partial x}\right) - \frac{\partial}{\partial y}\left(B\frac{\partial}{\partial y}\right) - \frac{\partial}{\partial z}\left(C\frac{\partial}{\partial z}\right)$ and of the operator corresponding to the
zeroth order term, respectively. The coefficient matrix $A$ is sparse, with its nonzero entries distributed along diagonals. Moreover, due to the PML,
$A$ is nearly symmetric with complex entries. As is observed, the preconditioner $M$ can be constructed easily, just by discretizing
$\mathcal{M}$ with the same difference scheme as for $\mathcal{A}$. The inverse of $M$ can be approximated by the multigrid method. The convergence
of the multigrid method is related to the parameters $\beta_1$ and $\beta_2$ (especially $\beta_2$), which also have an important influence on the
spectral distribution of the preconditioned system (5.4). It follows from the spectral analysis later that $\beta_2$ has to be chosen
to strike a balance between the convergence rate of the multigrid method and a favorable spectrum of the preconditioned
system (5.4).
We now study the spectrum of the discrete preconditioned 3D Helmholtz-PML system (5.4). For the 2D Helmholtz equation,
an analysis of the discrete preconditioned system was given in [17] for Neumann, Dirichlet, and Sommerfeld boundary
conditions. For the 2D Helmholtz-PML equation preconditioned with a 2D complex shifted-Laplacian-PML, we presented a
spectral analysis from the perspective of linear fractional mappings on the extended complex plane (see [12]). The spectral
behavior can be understood clearly in this manner. Hence, we continue with the method proposed in [12] for the spectral analysis
of (5.4). To do this, we denote the set of eigenvalues of $A$ by $\sigma(A)$ for any matrix $A \in \mathbb{C}^{N \times N}$. We first recall a lemma from [12].
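Lemma 5.1. Let $L, D \in \mathbb{C}^{N \times N}$, $A := L - z_1 D$ and $M := L - z_2 D$ with $z_1, z_2 \in \mathbb{C}$. If $D$ and $M$ are nonsingular, $\mu \in \mathbb{C}$ and $\lambda := \frac{\mu - z_1}{\mu - z_2}$
with $\mu \neq z_2$, then $\lambda \in \sigma(AM^{-1})$ if and only if $\mu \in \sigma(D^{-1}L)$, and $M^{-1}A$, $D^{-1}L$ share the same eigenvectors.
With the help of the above lemma, we can define a linear fractional mapping by
$$\lambda = \lambda(\mu) := \frac{\mu - z_1}{\mu - z_2}. \qquad (5.7)$$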
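The relation between the two spectra can be made explicit with a short calculation. The following is a sketch of the standard argument, written here for the reader's convenience rather than quoted from [12]; $x$ denotes an eigenvector, a symbol introduced only for this illustration. Suppose $D^{-1}Lx = \mu x$ with $x \neq 0$, so that $Lx = \mu Dx$. Then

$$Ax = (L - z_1 D)x = (\mu - z_1)Dx, \qquad Mx = (L - z_2 D)x = (\mu - z_2)Dx.$$

Since $M$ and $D$ are nonsingular, $\mu \neq z_2$ and $M^{-1}Dx = \frac{1}{\mu - z_2}x$, hence

$$M^{-1}Ax = (\mu - z_1)M^{-1}Dx = \frac{\mu - z_1}{\mu - z_2}\,x.$$

Finally, $AM^{-1} = M(M^{-1}A)M^{-1}$ is similar to $M^{-1}A$, so the two matrices have the same eigenvalues.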
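Denote the real and imaginary parts of $\mu$ by $\mathrm{Re}(\mu)$ and $\mathrm{Im}(\mu)$ respectively. Then, relevant to (5.7), we have the following lemma, which is a generalization of Lemma 2.2 in [12].
Lemma 5.2. Let the linear fractional mapping $\lambda : \mathbb{C} \to \mathbb{C}$ be defined by (5.7). Then the line $\mu - \bar{\mu} = \gamma i$ with $\gamma \in \mathbb{R}$ is mapped to
the circle $O(c, R)$ with $c := \frac{z_1 - \bar{z}_2 - i\gamma}{z_2 - \bar{z}_2 - i\gamma}$ and $R := \frac{|z_1 - z_2|}{|z_2 - \bar{z}_2 - i\gamma|}$, and the half-planes $\mu - \bar{\mu} > \gamma i$ and $\mu - \bar{\mu} < \gamma i$ are mapped inside and
outside $O(c, R)$, respectively.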
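Proof. In the complex plane $\mathbb{C} = \{\mu : \mu = x + yi,\ x, y \in \mathbb{R}\}$, a circle can be represented by
$$\alpha(x^2 + y^2) + \beta x + \gamma y + \delta = 0, \qquad (5.8)$$
with parameters $\alpha, \beta, \gamma, \delta \in \mathbb{R}$ satisfying $\alpha \geq 0$ and $\beta^2 + \gamma^2 > 4\alpha\delta$. The center of this circle is $\left(-\frac{\beta}{2\alpha}, -\frac{\gamma}{2\alpha}\right)$ and the radius is
$R := \frac{\sqrt{\beta^2 + \gamma^2 - 4\alpha\delta}}{2\alpha}$. Noting that
$$x = \frac{\mu + \bar{\mu}}{2}, \qquad y = \frac{\mu - \bar{\mu}}{2i}, \qquad \text{and} \qquad x^2 + y^2 = \mu\bar{\mu},$$
(5.8) can be rewritten as
$$\alpha\mu\bar{\mu} + \bar{\zeta}\mu + \zeta\bar{\mu} + \delta = 0, \qquad (5.9)$$
with $\zeta := \frac{1}{2}(\beta + \gamma i)$, $|\zeta|^2 > \alpha\delta$. The center and the radius can be represented by $c = -\frac{\zeta}{\alpha}$ and $R = \frac{\sqrt{|\zeta|^2 - \alpha\delta}}{\alpha}$
respectively. When
$\alpha = 0$, (5.9) reduces to a line, which can be considered as a circle with $R = \infty$ on the complex plane. Now, we consider
the line
$$\mu - \bar{\mu} = \gamma i, \qquad (5.10)$$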
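which is parallel to the real axis. From (5.7), we have
$$\mu = \frac{z_2\lambda - z_1}{\lambda - 1} \qquad \text{and} \qquad \bar{\mu} = \frac{\bar{z}_2\bar{\lambda} - \bar{z}_1}{\bar{\lambda} - 1}. \qquad (5.11)$$
Substitution of (5.11) into (5.10) yields
$$\alpha'\lambda\bar{\lambda} + \bar{\zeta}'\lambda + \zeta'\bar{\lambda} + \delta' = 0, \qquad (5.12)$$
where $\alpha' := (z_2 - \bar{z}_2)i + \gamma$, $\zeta' := (\bar{z}_2 - z_1)i - \gamma$ and $\delta' := (z_1 - \bar{z}_1)i + \gamma$. Then, (5.12) represents a circle, denoted by $O(c, R)$, with
the center $c := -\frac{\zeta'}{\alpha'} = \frac{z_1 - \bar{z}_2 - i\gamma}{z_2 - \bar{z}_2 - i\gamma}$ and the radius $R := \frac{\sqrt{|\zeta'|^2 - \alpha'\delta'}}{|\alpha'|} = \frac{|z_1 - z_2|}{|z_2 - \bar{z}_2 - i\gamma|}$. It can be easily obtained that $\mu - \bar{\mu} > \gamma i$ is mapped inside
$O(c, R)$, and
$\mu - \bar{\mu} < \gamma i$ is mapped outside $O(c, R)$. □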
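The statement of Lemma 5.2 can be checked numerically. The sketch below is only an illustration: the values of z1, z2 and gamma are chosen for the example (5% damping, shift 0.5), and the function names are hypothetical. It samples points on the line Im(mu) = gamma/2, maps them with (5.7), and measures how far the images lie from the predicted circle.

```python
def lam(mu, z1, z2):
    """Linear fractional mapping (5.7)."""
    return (mu - z1) / (mu - z2)

def image_circle(z1, z2, gamma):
    """Center and radius predicted by Lemma 5.2 for the line mu - conj(mu) = gamma*i."""
    denom = z2 - z2.conjugate() - 1j * gamma
    center = (z1 - z2.conjugate() - 1j * gamma) / denom
    radius = abs(z1 - z2) / abs(denom)
    return center, radius

# Illustrative values (assumed for this example only).
z1, z2, gamma = 1 - 0.05j, 1 - 0.5j, 0.3
center, radius = image_circle(z1, z2, gamma)

# Points on the line Im(mu) = gamma/2 and the deviation of their images from the circle.
xs = [-50.0, -2.0, 0.0, 0.5, 1.0, 1.5, 3.0, 40.0]
max_dev = max(abs(abs(lam(complex(x, gamma / 2), z1, z2) - center) - radius) for x in xs)

# One point above the line and one below it (both away from the pole mu = z2).
above = lam(complex(1.0, gamma / 2 + 1.0), z1, z2)
below = lam(complex(1.0, gamma / 2 - 0.2), z1, z2)

print("center:", center, "radius:", radius, "max deviation:", max_dev)
print("above maps inside:", abs(above - center) < radius)
print("below maps outside:", abs(below - center) > radius)
```

For these values the pole mu = z2 lies below the line, which is the situation in which the upper half-plane is mapped to the bounded side of the circle.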
The following proposition follows immediately from the linear fractional mapping (5.7) and Lemma 5.2.
Proposition 5.3. Let $z_1 := 1 - \alpha i$, $z_2 := 1 - \beta i$ (i.e., $\beta_1 = 1$, $\beta_2 = \beta$ in (5.6)) with $\beta > \alpha \geq 0$. Then the line $\mu - \bar{\mu} = \gamma i$ is mapped
to the circle $O(c, R)$ with $c := \frac{\alpha + \beta + \gamma}{2\beta + \gamma}$ and $R := \frac{\beta - \alpha}{|2\beta + \gamma|}$, the half-plane
$\mu - \bar{\mu} > \gamma i$ is mapped inside $O(c, R)$, and the half-plane
$\mu - \bar{\mu} < \gamma i$ is mapped outside $O(c, R)$. Moreover, for the discrete preconditioned system (5.4) and a constant $r \in (0, 1)$, there
exists a positive constant $\beta_0$ such that when $\alpha < \beta \leq \beta_0$ the spectrum of $D^{-1}L$ is mapped inside the circle $O(1, r)$, that is, for
any $\mu \in \sigma(D^{-1}L)$, $|\lambda(\mu) - 1| \leq r$.
Proof. The first result of this proposition follows from Lemma 5.2 with $z_1 := 1 - \alpha i$, $z_2 := 1 - \beta i$, namely, $\beta_1 = 1$, $\beta_2 = \beta$ in
(5.6). The second result follows from $\lim_{\beta \to \alpha} \lambda = 1$. □
We next give an example to illustrate the above proposition explicitly. Fig. 3 presents the linear fractional mapping (5.7)
with $z_1 := 1$ ($\alpha = 0$) and $z_2 := 1 - 0.5i$ ($\beta = 0.5$). As can be seen, the lines $l_0 : \mu - \bar{\mu} = 0$ (plotted in red), $l_1 : \mu - \bar{\mu} = 0.2i$ (green)
and $l_2 : \mu - \bar{\mu} = -0.2i$ (blue) are mapped into the circles $l_0' : O(\frac{1}{2}, \frac{1}{2})$ (red), $l_1' : O(\frac{9}{14}, \frac{5}{14})$ (green) and $l_2' : O(\frac{3}{8}, \frac{5}{8})$ (blue), respectively.
The line $l_0$ is the real axis and its image $l_0'$ is tangent to the imaginary axis. The upper half-plane (the shadow region) and
the lower half-plane (white region) are mapped inside and outside of the circle $l_0'$ respectively. The points $\mu_1 = \infty$,
$\mu_2 = 0.5$, $\mu_3 = 1$, $\mu_4 = 1.5$ in Fig. 3(a) are mapped to $\lambda_1 = 1$, $\lambda_2 = 0.5 + 0.5i$, $\lambda_3 = 0$, $\lambda_4 = 0.5 - 0.5i$ in Fig. 3(b), respectively.
Moreover, all the three circles are tangent at $\lambda_1 = 1$.
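The images of the sample points on the real axis can be reproduced with a few lines of complex arithmetic. This is a minimal sketch assuming the sign convention z1 = 1 and z2 = 1 - 0.5i; the variable names are introduced only for this illustration.

```python
z1, z2 = 1 + 0j, 1 - 0.5j  # alpha = 0, beta = 0.5 (assumed sign convention)

def lam(mu):
    """Linear fractional mapping (5.7) for the example parameters."""
    return (mu - z1) / (mu - z2)

# Images of the finite sample points mu = 0.5, 1, 1.5 on the real axis.
images = {mu: lam(complex(mu)) for mu in (0.5, 1.0, 1.5)}

# Every real mu should land on the circle with center 1/2 and radius 1/2.
real_axis_dev = max(abs(abs(lam(complex(x)) - 0.5) - 0.5)
                    for x in (-10.0, -1.0, 0.0, 0.5, 2.0, 25.0))

# A very large mu approximates the point at infinity, whose image is 1.
far = lam(complex(1e9))

for mu, value in images.items():
    print(mu, "->", value)
print("deviation from O(1/2, 1/2):", real_axis_dev)
print("image of a very large mu:", far)
```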
From the above results, we can see that $\sigma(AM^{-1})$ would be enclosed by a certain circle in the right half-plane if and only if
$\sigma(D^{-1}L)$ has a nonnegative imaginary part. In [12], for the 2D case, a series of numerical experiments were made to verify numerically that the sufficient condition holds, namely, that the spectrum of $D^{-1}L$ has a nonnegative imaginary part. However, for
the 3D case, we cannot obtain that all of the eigenvalues of $D^{-1}L$ have a nonnegative imaginary part. Hence, not all of the
eigenvalues of $AM^{-1}$ would be enclosed by the red circle. Fig. 4 shows the spectrum of $A$, $D^{-1}L$ and $AM^{-1}$ with
$k = 15$, $z_1 = 1$, $z_2 = 1 - 0.5i$ and $h = \frac{1}{24}$. Fig. 4(a) presents the original spectrum of $A$, which is scattered over both the left
and right half-planes. As can be seen, the real part of the spectrum includes a part of the negative real axis with large values,
which means $A$ is seriously indefinite. After preconditioning, a nicely clustered spectrum of $AM^{-1}$ is observed in Fig. 4(c),
which is far away from the origin with all of the eigenvalues in the right half-plane. It is seen that most of the spectrum
of $AM^{-1}$ is enclosed by the circle $O(\frac{1}{2}, \frac{1}{2})$ (plotted in red), with few eigenvalues lying outside of the circle. We can expect this
from the spectrum of $D^{-1}L$ in Fig. 4(b), which shows that most of the eigenvalues are located in the upper half-plane. Fig. 5
shows the corresponding spectrum with $h$, $k$, $z_2$ the same as in Fig. 4, but $z_1 = 1 - 0.05i$, namely, with 5% damping in (2.2). The
spectrum in Fig. 5(b) is more clustered due to the nonzero damping $\alpha$.
Fig. 3. The linear fractional mapping.
Fig. 4. The spectral distribution for matrices (a) $A$, (b) $D^{-1}L$, (c) $AM^{-1}$.
As is known, Krylov subspace methods are particularly efficient for systems with a Hermitian positive definite matrix or, more generally, for systems whose coefficient matrix has all eigenvalues in the right half of the complex plane. In
Fig. 4, though some eigenvalues are not enclosed by the circle $O(\frac{1}{2}, \frac{1}{2})$, the spectrum of $AM^{-1}$ is still favorable enough for
the convergence of Krylov subspace methods. Moreover, we can properly choose the parameter $\beta$ to bring the spectrum
of $AM^{-1}$ closer to one. In fact, for given $A$, $M$ with $z_1 := 1 - \alpha i$ and $z_2 := 1 - \beta i$, according to Proposition 5.3, there exists
some $\beta_0 > 0$ such that when $\alpha < \beta < \beta_0$, $|\lambda - 1|$ is smaller than a given positive number. Hence, the spectrum of $AM^{-1}$ in
Fig. 4 contributes to a fast convergence of the Krylov subspace method, which is validated by numerical experiments in Section 7.
Nevertheless, it should be pointed out that $\beta$ should be chosen properly in order to achieve a faster convergence of the Krylov
subspace method. When $\beta$ is too small, though a good spectrum of $AM^{-1}$ can be obtained, it is difficult to invert the preconditioner $M$ approximately with the multigrid method, due to its poor properties. On the other hand, since $\lim_{\beta \to \infty} \lambda = 0$, when
$\beta$ is too large, we would get an unfavorable spectrum near the origin, which leads to a poor performance of preconditioning.
Fig. 6 shows the spectrum of $AM^{-1}$ for $\beta = 2$, $\beta = 0.8$ and $\beta = 0.2$, respectively. The cases in Fig. 6 would weaken the
efficiency of the Krylov subspace method for solving the preconditioned system (5.4). Hence, the choice of $\beta$ is a compromise
between the convergence rate of the multigrid method and a favorable spectrum of the preconditioned system (5.4). In this
text, we choose $\beta = 0.5$, that is, $\beta_1 = 1$ and $\beta_2 = 0.5$ in (5.2), which is a proper choice in practice.
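The spectral side of this compromise can be illustrated with the mapping (5.7) alone. The sketch below is not a computation from the paper: it assumes a hypothetical, purely real sample of the spectrum of D^{-1}L (as one would get without PML and damping) and reports, for several shifts, how many mapped eigenvalues lie within distance 0.5 of one and how close the spectrum comes to the origin. The cost of the multigrid inner solve for small shifts is not modeled here.

```python
def lam(mu, alpha, beta):
    """Mapping (5.7) with z1 = 1 - alpha*i and z2 = 1 - beta*i."""
    return (mu - (1 - alpha * 1j)) / (mu - (1 - beta * 1j))

# Hypothetical real sample of sigma(D^{-1}L); the grid avoids mu = 1 exactly.
mus = [0.05 + 0.1 * j for j in range(60)]
alpha = 0.0
betas = [0.2, 0.5, 0.8, 2.0]

near_one = []    # number of mapped eigenvalues with |lambda - 1| <= 0.5
min_modulus = [] # smallest |lambda| over the sample
for beta in betas:
    images = [lam(mu, alpha, beta) for mu in mus]
    near_one.append(sum(1 for z in images if abs(z - 1) <= 0.5))
    min_modulus.append(min(abs(z) for z in images))

for beta, count, smallest in zip(betas, near_one, min_modulus):
    print(f"beta = {beta}: {count} of {len(mus)} near 1, min |lambda| = {smallest:.4f}")
```

In this toy setting a larger shift moves more of the mapped spectrum away from one and toward the origin, which is consistent with the limit of lambda as beta tends to infinity.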
Fig. 5. The spectral distribution for $A$ and $AM^{-1}$ with $h = 1/24$, $k = 15$, $z_1 = 1 - 0.05i$ and $z_2 = 1 - 0.5i$: (a) $A$; (b) $AM^{-1}$.
Fig. 6. The spectral distribution of $AM^{-1}$ with (a) $\beta = 2$; (b) $\beta = 0.8$; (c) $\beta = 0.2$.
6. A 3D full-coarsening multigrid method in the preconditioned Bi-CGSTAB

In this section, we develop a 3D full-coarsening multigrid method to invert the preconditioner M approximately, while
using the preconditioned Bi-CGSTAB to solve the discrete preconditioned system. For the full-coarsening strategy, we propose a new matrix-based prolongation operator.
For solving the Helmholtz problem, the Bi-CGSTAB method is usually preferred (cf. [14]). As can be seen in Section 5, the
spectrum of the discrete preconditioned system (5.4) is favorable for the convergence of the Bi-CGSTAB method. In the preconditioned Bi-CGSTAB algorithm, we cannot afford to invert the preconditioner $M$ exactly in the computation of $u := M^{-1}v$,
where $v$ is a given vector and $u$ is to be determined. Hence, we choose to solve the additional linear system $Mu = v$. In order
to obtain a better approximate solution at a lower cost, the multigrid method is employed to solve this additional linear system. Thus, the preconditioned Bi-CGSTAB algorithm combines Bi-CGSTAB and the multigrid method, which serve as
the outer iteration and the inner iteration respectively. Now, we consider solving the additional linear system with the multigrid
method. The traditional linear prolongation operator serves well for solving $Mu = v$ with constant
wavenumbers. However, for varying wavenumbers, it is not robust enough, and leads to divergence of the multigrid-based
preconditioned Bi-CGSTAB method. Here, we consider a matrix-based prolongation operator, which is constructed from
the finite difference stencil, that is, from the algebraic information of the coefficient matrix. In [44], a matrix-based prolongation operator was developed to handle the 2D convection-diffusion problem. In [14], the multigrid based on Zeeuw's prolongation operator was used to invert the preconditioner approximately for the 2D Helmholtz problem. Later, the
prolongation operator was extended to the 3D case with some modification in [26,42], where the 3D multigrid was based
on the semi-coarsening strategy, performing coarsening only in two directions while keeping the third direction unchanged.
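The outer and inner iterations described above can be sketched on a small model problem. The code below is a simplified illustration, not the solver of this paper: it assumes a 1D Helmholtz problem with Dirichlet boundary conditions and the standard 3-point stencil (no PML, no 27-point scheme), and it replaces the multigrid inner solve of Mu = v by an exact tridiagonal solve (Thomas algorithm). All function names are hypothetical.

```python
import math

def tridiag(n, k, z):
    """(off-diagonal, diagonal) of -u'' - z*k^2*u on n interior points, h = 1/(n+1)."""
    h = 1.0 / (n + 1)
    return -1.0 / h**2, 2.0 / h**2 - z * k**2

def apply(op, x):
    """Multiply the constant-coefficient tridiagonal matrix op by the vector x."""
    off, diag = op
    n = len(x)
    return [diag * x[j] + (off * x[j - 1] if j > 0 else 0)
            + (off * x[j + 1] if j < n - 1 else 0) for j in range(n)]

def solve(op, rhs):
    """Exact tridiagonal solve; stands in for the approximate multigrid solve."""
    off, diag = op
    n = len(rhs)
    c, d = [0j] * n, [0j] * n
    c[0], d[0] = off / diag, rhs[0] / diag
    for j in range(1, n):
        den = diag - off * c[j - 1]
        c[j] = off / den
        d[j] = (rhs[j] - off * d[j - 1]) / den
    x = d[:]
    for j in range(n - 2, -1, -1):
        x[j] = d[j] - c[j] * x[j + 1]
    return x

def dot(a, b):
    return sum(p.conjugate() * q for p, q in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a).real)

def bicgstab(A, M, b, tol=1e-6, maxiter=200):
    """Preconditioned Bi-CGSTAB with zero initial guess; returns (solution, iterations)."""
    n = len(b)
    x, r = [0j] * n, list(b)
    r_hat, v, p = list(b), [0j] * n, [0j] * n
    rho = alpha = omega = 1.0
    b_norm = norm(b)
    for it in range(1, maxiter + 1):
        rho_new = dot(r_hat, r)
        beta = (rho_new / rho) * (alpha / omega)
        p = [r[j] + beta * (p[j] - omega * v[j]) for j in range(n)]
        p_hat = solve(M, p)                      # inner solve: M p_hat = p
        v = apply(A, p_hat)
        alpha = rho_new / dot(r_hat, v)
        s = [r[j] - alpha * v[j] for j in range(n)]
        if norm(s) <= tol * b_norm:              # early exit on the half step
            return [x[j] + alpha * p_hat[j] for j in range(n)], it
        s_hat = solve(M, s)                      # inner solve: M s_hat = s
        t = apply(A, s_hat)
        omega = dot(t, s) / dot(t, t)
        x = [x[j] + alpha * p_hat[j] + omega * s_hat[j] for j in range(n)]
        r = [s[j] - omega * t[j] for j in range(n)]
        rho = rho_new
        if norm(r) <= tol * b_norm:
            return x, it
    return x, maxiter

n, k = 63, 10.0
A = tridiag(n, k, 1.0)          # z1 = 1: Helmholtz operator without damping
M = tridiag(n, k, 1.0 - 0.5j)   # z2 = 1 - 0.5i: complex shifted Laplacian
g = [0j] * n
g[n // 2] = float(n + 1)        # scaled point source at the midpoint
u, iters = bicgstab(A, M, g)
Au = apply(A, u)
rel_res = norm([g[j] - Au[j] for j in range(n)]) / norm(g)
print("iterations:", iters, "relative residual:", rel_res)
```

In a realistic setting the two calls to `solve` would be replaced by one or more multigrid cycles for M, so that the preconditioner is applied only approximately.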
In this paper, we intend to employ the multigrid method based on a 3D full-coarsening strategy, plus a pointwise smoother which is easy to implement. The pointwise smoother is used to avoid alternating plane relaxation, which is very
expensive because a sub-system has to be solved in every plane. For the full-coarsening prolongation operator, we have
to take into account the prolongation in the third direction, which need not be considered in the semi-coarsening case. Motivated by Zeeuw's 2D prolongation operator, we propose here a new 3D full-coarsening matrix-based prolongation operator,
which gives a good performance.
Fig. 7. One coarse and eight fine grid cells, with capital letters indicating coarse grid points and small letters indicating fine grid points.
To describe the construction of the new prolongation operator, we begin with Fig. 7, which shows one coarse and eight
fine grid cells with capital letters indicating coarse grid points and small letters indicating fine grid points. The set of fine
grid points, denoted by $\Omega_h$, is split into eight disjoint subsets, that is,
$$\Omega_h = \{\Omega_{h,(0,0,0)}, \Omega_{h,(1,0,0)}, \Omega_{h,(0,1,0)}, \Omega_{h,(1,1,0)}, \Omega_{h,(0,0,1)}, \Omega_{h,(1,0,1)}, \Omega_{h,(0,1,1)}, \Omega_{h,(1,1,1)}\}.$$

Here, $\Omega_{h,(0,0,0)}$ consists of fine grid points which are also coarse grid points. $\Omega_{h,(1,0,0)}$, $\Omega_{h,(0,1,0)}$ and $\Omega_{h,(0,0,1)}$ consist of fine grid
points which are located between two coarse grid points along the x-, y- and z-axis, respectively. $\Omega_{h,(1,1,0)}$, $\Omega_{h,(1,0,1)}$ and
$\Omega_{h,(0,1,1)}$ consist of fine grid points which are located respectively in the x-y, x-z and y-z planes but not next to any coarse grid
points. $\Omega_{h,(1,1,1)}$ consists of fine grid points which are not next to any coarse grid points. For example, in Fig. 7,
$A \in \Omega_{h,(0,0,0)}$, $p \in \Omega_{h,(0,1,0)}$, $q \in \Omega_{h,(1,0,0)}$, $r \in \Omega_{h,(1,1,0)}$, $a' \in \Omega_{h,(0,0,1)}$, $p' \in \Omega_{h,(0,1,1)}$, $q' \in \Omega_{h,(1,0,1)}$ and $r' \in \Omega_{h,(1,1,1)}$. The symbols $e_h$ and $e_H$
represent the grid functions on the fine and coarse grids respectively. Assuming $e_H$ is already known on the coarse grid, the entries of $e_h$ on the fine grid can be obtained by prolongation from the nearest coarse grid neighbors. For example, the prolongation weights at $p$ are denoted by $W_A(p)$, $W_C(p)$, and the component of $e_h$ at $p$ is obtained by prolongation from $A$ and $C$
according to $W_A(p)$, $W_C(p)$. Then, the 3D full-coarsening prolongation operator can be expressed by
$$\begin{cases}
e_h(A) := e_{2h}(A), \\
e_h(p) := W_A(p)e_{2h}(A) + W_C(p)e_{2h}(C), \\
e_h(q) := W_A(q)e_{2h}(A) + W_B(q)e_{2h}(B), \\
e_h(r) := W_A(r)e_{2h}(A) + W_B(r)e_{2h}(B) + W_C(r)e_{2h}(C) + W_D(r)e_{2h}(D), \\
e_h(a') := W_A(a')e_{2h}(A) + W_{A'}(a')e_{2h}(A'), \\
e_h(p') := W_A(p')e_{2h}(A) + W_C(p')e_{2h}(C) + W_{A'}(p')e_{2h}(A') + W_{C'}(p')e_{2h}(C'), \\
e_h(q') := W_A(q')e_{2h}(A) + W_B(q')e_{2h}(B) + W_{A'}(q')e_{2h}(A') + W_{B'}(q')e_{2h}(B'), \\
e_h(r') := W_A(r')e_{2h}(A) + W_B(r')e_{2h}(B) + W_C(r')e_{2h}(C) + W_D(r')e_{2h}(D) + W_{A'}(r')e_{2h}(A') \\
\qquad\qquad + W_{B'}(r')e_{2h}(B') + W_{C'}(r')e_{2h}(C') + W_{D'}(r')e_{2h}(D'),
\end{cases} \qquad (6.1)$$
where
$A \in \Omega_{h,(0,0,0)}$, $p \in \Omega_{h,(0,1,0)}$, $q \in \Omega_{h,(1,0,0)}$,
$r \in \Omega_{h,(1,1,0)}$, $a' \in \Omega_{h,(0,0,1)}$, $p' \in \Omega_{h,(0,1,1)}$, $q' \in \Omega_{h,(1,0,1)}$, $r' \in \Omega_{h,(1,1,1)}$,
and
$W_A, W_B, W_C, W_D, W_{A'}, W_{B'}, W_{C'}, W_{D'}$ denote the weights to be determined on each fine grid point. We can also rewrite (6.1) in the form of a matrix-vector multiplication, that is, $e_h = Pe_H$, where $P$ is the prolongation matrix (prolongation
operator) to be determined.
Now, we describe our new 3D full-coarsening matrix-based prolongation operator, which is constructed from the 27-point stencil, that is, from the algebraic information of the coefficient matrix. In one coarse and eight fine grid cells, prolongation
weights should be derived for three kinds of fine gridpoints, denoted by $K_1 = \Omega_{h,(0,1,0)} \cup \Omega_{h,(1,0,0)} \cup \Omega_{h,(0,0,1)}$, $K_2 = \Omega_{h,(1,1,0)} \cup \Omega_{h,(1,0,1)} \cup \Omega_{h,(0,1,1)}$ and $K_3 = \Omega_{h,(1,1,1)}$, respectively. Denote the 27-point difference stencil for the preconditioner $M$ by
$$[M] \triangleq
\begin{bmatrix} M_{-1,1,-1} & M_{0,1,-1} & M_{1,1,-1} \\ M_{-1,0,-1} & M_{0,0,-1} & M_{1,0,-1} \\ M_{-1,-1,-1} & M_{0,-1,-1} & M_{1,-1,-1} \end{bmatrix}_{z-1}
\begin{bmatrix} M_{-1,1,0} & M_{0,1,0} & M_{1,1,0} \\ M_{-1,0,0} & M_{0,0,0} & M_{1,0,0} \\ M_{-1,-1,0} & M_{0,-1,0} & M_{1,-1,0} \end{bmatrix}_{z}
\begin{bmatrix} M_{-1,1,1} & M_{0,1,1} & M_{1,1,1} \\ M_{-1,0,1} & M_{0,0,1} & M_{1,0,1} \\ M_{-1,-1,1} & M_{0,-1,1} & M_{1,-1,1} \end{bmatrix}_{z+1}. \qquad (6.2)$$
We first describe the determination of the prolongation weights on fine gridpoints in $K_1$. Taking the point $a'$ in Fig. 7 for
example, we have
$$e_h(a') := W_A(a')e_{2h}(A) + W_{A'}(a')e_{2h}(A'),$$
with $W_A(a')$, $W_{A'}(a')$ to be determined. Assuming $[M]$ to be the difference stencil at $a'$, then $M_{0,0,0}$ is the entry of $[M]$ at the central point $a'$. If $M_{l,m,-1}$ or $M_{l,m,1}$ ($l, m \in \{-1, 0, 1\}$, $(l, m) \neq (0, 0)$) is not zero, there is a coupling between the value of $u(x, y, z)$
at $a'$ and the value at the gridpoint corresponding to $M_{l,m,-1}$ or $M_{l,m,1}$. These couplings should be incorporated in the construction of the prolongation weights $W_A(a')$ and $W_{A'}(a')$. Experiments show that neglecting either $M_{l,m,-1}$ or $M_{l,m,1}$ weakens the efficiency of the multigrid
method. Without splitting $M$ into symmetric and anti-symmetric parts, we give a simple formula
$$r_1 := \Big|\sum_{l,m} M_{l,m,-1}\Big|, \qquad d_1 := \max\{|M_{l,m,-1}|\}, \qquad l, m \in \{-1, 0, 1\},$$
$$r_2 := \Big|\sum_{l,m} M_{l,m,1}\Big|, \qquad d_2 := \max\{|M_{l,m,1}|\}, \qquad l, m \in \{-1, 0, 1\},$$
$$\kappa := \max\{r_1, d_1\}, \qquad \iota := \max\{r_2, d_2\},$$
and
$$W_A(a') = \frac{\kappa}{\kappa + \iota}, \qquad W_{A'}(a') = \frac{\iota}{\kappa + \iota}. \qquad (6.3)$$
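Formula (6.3) is straightforward to evaluate once the stencil at a' is available. The following sketch assumes the reading of (6.3) in which the plane n = -1 faces the coarse point A and the plane n = +1 faces A'; the dictionary layout of the stencil and the function name are hypothetical choices for this illustration.

```python
def k1_weights_z(stencil):
    """Weights (W_A, W_A') at a fine point a' lying between A (below) and A' (above).

    stencil maps (l, m, n) with l, m, n in {-1, 0, 1} to a (possibly complex)
    entry; missing keys are treated as zero.
    """
    idx = (-1, 0, 1)
    lower = [stencil.get((l, m, -1), 0) for l in idx for m in idx]
    upper = [stencil.get((l, m, 1), 0) for l in idx for m in idx]
    r1, d1 = abs(sum(lower)), max(abs(e) for e in lower)
    r2, d2 = abs(sum(upper)), max(abs(e) for e in upper)
    kappa, iota = max(r1, d1), max(r2, d2)
    return kappa / (kappa + iota), iota / (kappa + iota)

# Classical 7-point Laplacian stencil (mesh size 1): symmetric couplings.
laplace7 = {(0, 0, 0): 6.0,
            (-1, 0, 0): -1.0, (1, 0, 0): -1.0,
            (0, -1, 0): -1.0, (0, 1, 0): -1.0,
            (0, 0, -1): -1.0, (0, 0, 1): -1.0}
w_sym = k1_weights_z(laplace7)

# Artificial asymmetric example: the coupling to the upper plane is three times stronger.
skewed = dict(laplace7)
skewed[(0, 0, 1)] = -3.0
w_skew = k1_weights_z(skewed)

print("7-point Laplacian:", w_sym)
print("skewed stencil:", w_skew)
```

Taking the maximum of the plane sum and the largest single entry guards against a plane whose entries nearly cancel in the sum; the two weights always add up to one.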
We can similarly compute the prolongation weights at the other points in $K_1$. The difference stencil $[M]$ here is asymmetric
due to the PML and the varying wavenumber $k$ in heterogeneous media, and we do not split $M$ (or $[M]$) into symmetric and
anti-symmetric parts, as is done in the construction of the prolongation operator of Zeeuw type [14,26,42]. The prolongation
operator of Zeeuw type was originally developed in [44] to handle the convection-diffusion problem. There, the coefficient matrix
was split into symmetric and anti-symmetric parts, which were considered to originate from the diffusion and convection
terms respectively. In this text, the multigrid method is used to invert the preconditioner $M$ approximately for the Helmholtz
problem, and it is not absolutely necessary to split $M$ into symmetric and anti-symmetric parts as is done for the convection-diffusion problem.
After obtaining the weights on gridpoints in $K_1$, we now describe the determination of the prolongation weights on fine gridpoints in $K_2$ and $K_3$. Using the multigrid to solve $Mu = v$, after several relaxation sweeps, an approximate solution denoted
by $\tilde{u}$ is obtained, and the residual is $\tilde{r} := v - M\tilde{u}$. After the coarse grid correction, we get a corrected solution $\hat{u} := \tilde{u} + e_h = \tilde{u} + Pe_H$, and its residual is
$$\hat{r} := v - M\hat{u} = \tilde{r} - MPe_H. \qquad (6.4)$$
As in the 2D case of diffusion problems [1], in order to prevent huge jumps in the norm of the residual after prolongation, we
require
$$(MPe_H)|_{K_2} = 0. \qquad (6.5)$$
Then, for a fine gridpoint in $K_2$ such as $r$, we have
$$\sum_{l=-1}^{1}\sum_{m=-1}^{1}\sum_{n=-1}^{1} M_{l,m,n}(r)\,e_{h,(l,m,n)}(r) = 0. \qquad (6.6)$$
Here, if $(l, m, n) = (0, 0, 0)$, $e_{h,(l,m,n)}(r)$ denotes the component of $e_h$ at $r$; otherwise it denotes the component at the corresponding neighbor of $r$. We rewrite (6.6) as
$$\sum_{l=-1}^{1}\sum_{m=-1}^{1} M_{l,m,0}(r)\,e_{h,(l,m,0)}(r) + \sum_{l=-1}^{1}\sum_{m=-1}^{1}\sum_{n=-1,1} M_{l,m,n}(r)\,e_{h,(l,m,n)}(r) = 0. \qquad (6.7)$$
Assume that we have obtained the prolongation
$$\begin{cases}
e_{h,(1,0,0)}(r) := W_B(t)e_{2h}(B) + W_D(t)e_{2h}(D), \\
e_{h,(0,1,0)}(r) := W_C(s)e_{2h}(C) + W_D(s)e_{2h}(D), \\
e_{h,(-1,0,0)}(r) := W_A(p)e_{2h}(A) + W_C(p)e_{2h}(C), \\
e_{h,(0,-1,0)}(r) := W_A(q)e_{2h}(A) + W_B(q)e_{2h}(B).
\end{cases} \qquad (6.8)$$
Substituting (6.8) and $e_{h,(0,0,0)}(r)$ into the first part of (6.7) and neglecting the second part, we can determine the weights
$W_A(r)$, $W_B(r)$, $W_C(r)$ and $W_D(r)$ in
$$e_{h,(0,0,0)}(r) := W_A(r)e_{2h}(A) + W_B(r)e_{2h}(B) + W_C(r)e_{2h}(C) + W_D(r)e_{2h}(D).$$

However, the second part of (6.7) should not be neglected if a good prolongation at $r$ is to be obtained. To this end, we let
$e_{h,(l,m,-1)}(r) = e_{h,(l,m,1)}(r) = e_{h,(l,m,0)}(r)$, and substitute them into the second part of (6.7), yielding
$$\sum_{l=-1}^{1}\sum_{m=-1}^{1}\sum_{n=-1}^{1} M_{l,m,n}\,e_{h,(l,m,0)}(r) = 0, \qquad (6.9)$$
which is an approximation of (6.7). From (6.9), we can compute the weights $W_A(r)$, $W_B(r)$, $W_C(r)$ and $W_D(r)$, which improves the performance of the multigrid. In the meantime, (6.9) also gives rise to a 2D lumped stencil along the z-axis.
Denoting
$$M^z_{l,m} := \sum_{n=-1}^{1} M_{l,m,n}, \qquad l, m \in \{-1, 0, 1\},$$
we have the 2D lumped stencil along the z-axis as
$$[M^z] \triangleq \begin{bmatrix} M^z_{-1,1} & M^z_{0,1} & M^z_{1,1} \\ M^z_{-1,0} & M^z_{0,0} & M^z_{1,0} \\ M^z_{-1,-1} & M^z_{0,-1} & M^z_{1,-1} \end{bmatrix}, \qquad (6.10)$$
and the 2D lumped stencil along the $\xi$-axis ($\xi = x, y$) as
$$[M^\xi] \triangleq \begin{bmatrix} M^\xi_{-1,1} & M^\xi_{0,1} & M^\xi_{1,1} \\ M^\xi_{-1,0} & M^\xi_{0,0} & M^\xi_{1,0} \\ M^\xi_{-1,-1} & M^\xi_{0,-1} & M^\xi_{1,-1} \end{bmatrix}, \qquad (6.11)$$
where $M^x_{m,n} := \sum_{l=-1}^{1} M_{l,m,n}$, $M^y_{l,n} := \sum_{m=-1}^{1} M_{l,m,n}$; $l, m, n \in \{-1, 0, 1\}$. Fig. 8 shows the 2D lumped stencil along the x-axis. In
the semi-coarsening case of [26], the lumped stencil in one direction is used similarly. Thus, we complete the construction
of the prolongation weights for $K_1 \cup K_2$. Substituting all the weights already obtained into (6.5) yields the prolongation weights on
$r' \in K_3$, without splitting (6.6) into two parts and approximating it. This completes the construction of the prolongation operator.
We remark that the construction of the 3D full-coarsening prolongation operator is based on the difference stencil $[M]$ and
the 2D lumped stencils along the x-, y- and z-axis respectively. That is, we use the algebraic information of the coefficient matrix $M$. As
can be seen, the prolongation weights are derived similarly in the three directions, and they are used to handle the variation of the
wavenumber $k(x, y, z)$ along the x-, y- and z-axis respectively. In order to prevent huge jumps in the norm of the residual after prolongation, 2D lumped stencils in three directions are used, which contributes to a more robust and efficient prolongation
operator in practice. For the 3D Laplacian, with the classical 7-point difference stencil on fine grids without PML, the 2D lumped
stencils reduce to the classical 5-point stencil, and the prolongation weights for $p$, $q$, $a'$, $p'$, $q'$, $r$ and $r'$ reduce to $\frac{1}{2}$, $\frac{1}{2}$, $\frac{1}{2}$, $\frac{1}{4}$, $\frac{1}{4}$, $\frac{1}{4}$,
and $\frac{1}{8}$, respectively, which is just the trilinear interpolation. It is well known that the trilinear prolongation operator leads
to an efficient and robust multigrid in this classical setting.
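The trilinear limit mentioned above is easy to write down explicitly. The sketch below is an illustration of standard trilinear interpolation under full coarsening (coarse point I corresponds to fine point 2I in each direction); it is not the matrix-based operator of this section, and the function name is hypothetical.

```python
from itertools import product

def trilinear_weights(i, j, k):
    """Coarse neighbors (coarse indices) and weights for the fine point (i, j, k)."""
    axes = []
    for f in (i, j, k):
        if f % 2 == 0:
            # The fine point coincides with a coarse point in this direction.
            axes.append([(f // 2, 1.0)])
        else:
            # The fine point lies midway between two coarse points.
            axes.append([(f // 2, 0.5), (f // 2 + 1, 0.5)])
    return {(a[0], b[0], c[0]): a[1] * b[1] * c[1] for a, b, c in product(*axes)}

w_coarse = trilinear_weights(0, 0, 0)  # coarse point: injected
w_edge = trilinear_weights(1, 0, 0)    # between two coarse points along one axis
w_face = trilinear_weights(1, 1, 0)    # face center: four coarse neighbors
w_cell = trilinear_weights(1, 1, 1)    # cell center: eight coarse neighbors

for name, w in (("coarse", w_coarse), ("edge", w_edge), ("face", w_face), ("cell", w_cell)):
    print(name, sorted(w.items()))
```

The three non-trivial cases correspond to the point sets K1, K2 and K3, with two, four and eight coarse neighbors respectively.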
7. Numerical experiments
In this section, the numerical convergence of the new difference scheme is tested, and some comparisons are made with
the compact sixth order difference scheme proposed in [34]. Then, numerical experiments related to geophysical applications are presented to show the efficiency and robustness of the multigrid-based preconditioned Bi-CGSTAB method for
the discrete preconditioned system (5.4). The multigrid is based on the 3D full-coarsening strategy with the new prolongation operator. We employ the full multigrid V-cycle (FMG), which possesses both the robustness of the W-cycle and the efficiency of the V-cycle [9]. One FMG iteration with two relaxation sweeps is enough. For the multigrid components, the pointwise
Jacobi relaxation with underrelaxation ($\omega$-JAC) is used as a smoother, which is easy to parallelize. We adopt the full weighting
restriction operator, the newly proposed prolongation operator, and the coarse grid operator obtained by the Galerkin principle. For the restriction operator, instead of using the transpose of the prolongation operator, we choose the weighting
operator suggested in [14], which gives a better performance here. The experiments range from constant wavenumbers to
varying wavenumbers in heterogeneous media. In the computation, a zero initial guess is used, and the preconditioned Bi-CGSTAB iteration terminates when the Euclidean norm of the relative residual is reduced to the order of
$10^{-6}$. All the experiments here are performed serially on an Intel Xeon (8-core) with 3.33 GHz and 96 GB RAM. Moreover,
Matlab 7v is also used as the testing platform.
Fig. 8. The construction of the lumped stencil along x-axis.
7.1. Numerical convergence and comparison of the new scheme
In this section, we present the numerical convergence of the new scheme, and show how it is affected by a
perturbation of the weight parameters. Then, we compare the scheme with the sixth order compact scheme in [34], which was obtained by a compact reformulation of the Helmholtz equation. For convenience, we consider the following 3D Helmholtz equation with zero Dirichlet boundary condition, which was similarly used in [34],
$$-\Delta u - k^2 u = f, \qquad \text{in } (0, 1)^3, \qquad (7.1)$$
where $f := (3n^2 - 1)\sin(nkx)\sin(nky)\sin(nkz)$, with $k = m\pi$; $m, n = 1, 2, 3, \ldots$. Then, the exact solution is
$u(x, y, z) = \sin(nkx)\sin(nky)\sin(nkz)/k^2$. Tables 2 and 3 give the numerical convergence of the new scheme with its variation
caused by the perturbation of parameters for $k = 5\pi$, $n = 1$ and $k = 12\pi$, $n = 4$ respectively. C.O. denotes the numerical
convergence order with the error measured in the norm $\|\cdot\|_\infty$. OP, PPS and PPL represent the new scheme using the original parameters,
and parameters obtained by a perturbation of the original by 0.01 and 0.1, respectively. As can be seen, the new scheme
is of second order convergence, and a small perturbation of the parameters has a small influence on the scheme, while a large
perturbation significantly influences its convergence.
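The convergence order can be recomputed from consecutive errors as the logarithm of the error ratio divided by the logarithm of the step ratio. The short sketch below assumes that C.O. is defined this way and uses the OP errors of Table 2; small differences from the tabulated orders are due to rounding of the printed errors.

```python
import math

# Step sizes and OP errors as listed in Table 2 (k = 5*pi, n = 1).
h = [1 / 4, 1 / 8, 1 / 16, 1 / 32, 1 / 64]
err_op = [1.50e-2, 3.70e-3, 8.59e-4, 2.22e-4, 5.58e-5]

# Observed order between consecutive grids: log(e_coarse / e_fine) / log(h_coarse / h_fine).
orders = [math.log(err_op[i - 1] / err_op[i]) / math.log(h[i - 1] / h[i])
          for i in range(1, len(h))]

for step, order in zip(h[1:], orders):
    print(f"h = 1/{round(1 / step)}: C.O. = {order:.2f}")
```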
In Table 4, the new scheme is compared with the sixth order compact scheme in [34]. We can observe that when the step
$h = 1/20$ is fixed, the sixth order scheme outperforms the new second order scheme for small wavenumbers, but for large wavenumbers the new second order scheme gives a better performance. In fact, the sixth order scheme would definitely outperform
the new second order scheme, so long as the step is small enough. However, in practice, the step $h$ is related to the wavenumber $k$, especially for large wavenumbers. Due to the storage limit, the number of gridpoints per wavelength $G$ cannot be
too large, that is, the step $h$ cannot be too small. In addition, the pollution analysis of the error shows that the accuracy depends not only
on the convergence order, but also on the wavenumber. In this setting, the new second order scheme may perform
better, since it can reduce the numerical dispersion. In Table 5, for $k = 30\pi$ and $n = 10$, we present the numerical error corresponding to both schemes with different steps $h$. We can see that the new second order scheme performs better for a large
wavenumber, as long as $h$ is not too small. We point out that both the sixth order scheme and the new second order scheme use 27 gridpoints, but they are developed from different motivations. After the second order approximation, there are still many degrees
of freedom left. With these degrees of freedom, the sixth order scheme aims to construct a formula of high accuracy, assuming that the wavenumber is constant, the steps in all directions are equal, and the solution and source term are smooth enough. In contrast, the new
second order scheme combines these degrees of freedom with the dispersion relation to minimize the numerical dispersion. So, the
new second order scheme is optimal in the sense of suppressing the dispersion, while the sixth order scheme is optimal in the
sense of convergence order.
We now turn our attention to the preconditioned Bi-CGSTAB method for solving the linear system $Au = g$, which is obtained by discretization of the Helmholtz Eq. (7.1). The preconditioner $M$ is based on the 3D complex shifted-Laplacian. We
shall examine the effect of three different discretizations of the Helmholtz and preconditioning operators. The first scheme,
denoted by SC62, uses the sixth order difference scheme in [34] to discretize the Helmholtz operator, and uses the common 7-point
difference scheme to discretize the preconditioning operator. Two other schemes, denoted by SN62 and NN22, respectively
use the sixth order scheme in [34] and the new second order scheme in this paper to discretize the Helmholtz operator, while both use the
new second order scheme to discretize the preconditioning operator. We do not employ the sixth order scheme to discretize the preconditioning operator, because the derivation of the sixth order scheme involves the right-hand side term. Fig. 9 shows the spectra of $AM^{-1}$ based on
SC62, SN62 and NN22, respectively, with $k = 4\pi$, $n = 5\pi$ and $h = 1/16$. As can be seen, SC62 gives a poor spectrum of $AM^{-1}$,
since some of the eigenvalues are located in the left half-plane, which is unfavorable for the convergence of the Bi-CGSTAB method. Both SN62 and NN22 contribute to a good spectral distribution of $AM^{-1}$, which is clustered in the right half-plane. In
Table 2
The numerical convergence of the new scheme with its variation caused by perturbation of the parameters, for $k = 5\pi$, $n = 1$.

h             1/4       1/8       1/16      1/32      1/64
OP   Error    1.50e-2   3.70e-3   8.59e-4   2.22e-4   5.58e-5
     C.O.     -         2.02      2.10      1.96      1.99
PPS  Error    1.51e-2   3.80e-3   1.00e-3   3.89e-4   9.92e-5
     C.O.     -         1.99      1.93      1.92      1.97
PPL  Error    3.34e-2   1.46e-2   6.10e-3   2.80e-3   1.44e-3
     C.O.     -         1.19      1.25      1.14      0.96
Table 3
The numerical convergence of the new scheme with its variation caused by perturbation of the parameters, for $k = 12\pi$, $n = 4$.

h             1/8       1/16      1/32      1/65      1/129
OP   Error    1.49e-1   3.61e-2   8.60e-3   2.20e-3   5.46e-4
     C.O.     -         2.02      2.10      1.96      1.99
PPS  Error    1.35e-1   3.75e-2   9.70e-3   2.50e-3   6.38e-4
     C.O.     -         1.85      1.95      1.98      1.97
PPL  Error    7.39e-2   4.10e-2   1.14e-2   8.00e-3   4.32e-3
     C.O.     -         0.85      1.84      0.51      0.88
Table 4
Comparison of the new scheme with the sixth order scheme, for different $k$ with $n = 5$ and $h = 1/20$.

k               2π       3π       4π       5π       6π       7π       8π
G = 2π/(kh)     22.0     14.7     11.0     8.8      7.3      6.3      5.5
Error for 6th   0.0014   0.0097   0.0389   0.1258   0.3468   1.315    20.69
Error for 2nd   0.0116   0.0126   0.0206   0.0250   0.0286   0.0559   0.4969
Table 5
Comparison of the error for the new and sixth order schemes when $k = 30\pi$ and $n = 10$.

h              1/20     1/40     1/80     1/160
6th   Error    16808    1081     21.76    0.4128
      C.O.     -        3.96     5.63     5.72
2nd   Error    0.0710   0.0442   0.0115   0.0029
      C.O.     -        0.6838   1.94     1.98
Fig. 9. The spectra of $AM^{-1}$ based on (a) SC62, (b) SN62, (c) NN22.
Table 6
Number of Bi-CGSTAB iterations and CPU time in minutes (in parentheses) for different wavenumbers $k$ with 2% and without damping.

k                   60            100           140            180            220
Grid                97^3          161^3         224^3          288^3          352^3
(I)   α = 0.00      43 (1.73)     60 (7.38)     83 (23.30)     108 (58.68)    152 (140.19)
      α = 0.02      32 (1.38)     42 (5.27)     55 (15.65)     70 (38.36)     89 (83.74)
(II)  α = 0.00      45 (1.92)     64 (8.68)     90 (27.84)     119 (70.36)    168 (169.06)
      α = 0.02      34 (1.57)     46 (6.13)     61 (18.98)     78 (46.27)     99 (100.53)
computation, since the discretization of the preconditioning operator for SC62 is based on the 7-point scheme, it indeed
saves multiplication operations. However, the effect of preconditioning is not good, that is, it costs too many preconditioned
Bi-CGSTAB iterations, which coincides with the spectrum in Fig. 9(a). For SN62 and NN22, since the discretizations of both the
Helmholtz and preconditioning operators are based on 27-point schemes, they cost the same number of multiplication operations for
Fig. 10. The number of iterations for preconditioned Bi-CGSTAB versus wavenumber $k$ (curves: methods (I) and (II), each with $\alpha = 0.00$ and $\alpha = 0.02$).
each preconditioned BI-CGSTAB iteration, and the total iterations make little difference. In Section 7.2 to Section 7.4, we
adopt the scheme NN22, that is, both the Helmholtz operator A and the preconditioning operator M are discretized by
the new second order scheme.
7.2. The 3D Helmholtz-PML equation with constant wavenumber
We consider a domain $\Omega = (0,1)^3$, with point and line sources placed at different locations of the domain. The number of gridpoints per wavelength is chosen to be G = 10, and an accuracy requirement for second order discretizations is that $kh \le 2\pi/G = \pi/5$. The thickness of the PML is set to be 20, that is, the PML possesses 20 gridpoints in each direction. Table 6 presents the number of preconditioned Bi-CGSTAB iterations and the CPU time needed in minutes (in parentheses), with and without damping, for different wavenumbers. The largest wavenumber tested is k = 220, for which the number of unknowns (without PML) is $352^3 = 43{,}614{,}208$ (about 43.6 million).
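The grid sizes in Table 6 can be related to the requirement $kh \le 2\pi/G$ with a short calculation. The sketch below assumes one particular rounding rule (the smallest number of intervals on the unit interval that satisfies the bound, plus one endpoint). The authors do not state their rule in this section, but this one happens to reproduce the tabulated grids. The function name `gridpoints_per_direction` is ours.

```python
import math

def gridpoints_per_direction(k, G=10, length=1.0):
    """Smallest number of gridpoints on [0, length] such that k*h <= 2*pi/G.

    Assumed rule: round the number of intervals up, then add the endpoint.
    """
    intervals = math.ceil(G * k * length / (2 * math.pi))
    return intervals + 1

wavenumbers = [60, 100, 140, 180, 220]
grid_sizes = [gridpoints_per_direction(k) for k in wavenumbers]
unknowns = [n ** 3 for n in grid_sizes]  # interior unknowns, PML excluded

for k, n, total in zip(wavenumbers, grid_sizes, unknowns):
    print(f"k = {k:3d}: {n}^3 = {total:,} unknowns")
```

Since the number of unknowns grows like $k^3$ at fixed G, going from k = 60 to k = 220 multiplies the problem size by roughly 48.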
In Table 6, method (I) denotes the preconditioned Bi-CGSTAB method based on the 3D full-coarsening multigrid with the newly proposed prolongation operator and a pointwise ω-JAC smoother. Method (II) denotes the preconditioned Bi-CGSTAB method based on the 3D semi-coarsening multigrid with de Zeeuw's prolongation operator and a line-wise ω-JAC smoother.
For k = 220 without damping, method (I) needs 152 iterations, and the CPU time is about 140 min. As can be seen from the table, method (I) converges faster than method (II). Moreover, the CPU time needed for each iteration of (I) is less than that of (II). This is justified by the fact that multigrid with semi-coarsening has larger coarse grid operators than its full-coarsening counterpart. It can also be observed that convergence is faster with some damping in the Helmholtz problem than without damping, which we can expect from the more clustered spectral distribution in Fig. 5(b). Fig. 10 presents the number of iterations for the preconditioned Bi-CGSTAB method versus wavenumber k. We keep $kh = \pi/5$, which means that 10 gridpoints per wavelength are used. In Fig. 10, a nearly linear increase in the number of iterations is observed as the wavenumber k increases, especially for the damped case. The numerical solution corresponding to k = 60 without damping ($\alpha = 0$) is presented in Fig. 11, with point and line sources placed at different locations of the domain.
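Two of the observations above can be checked directly from the entries of Table 6: the lower cost per iteration of method (I), and the nearly linear growth of the iteration count with k. The following is a minimal sketch using only the tabulated numbers. The helper names `minutes_per_iteration` and `linear_fit` are ours, and the least-squares fit is an illustration rather than an analysis performed by the authors.

```python
# Data copied from Table 6.
k_values = [60, 100, 140, 180, 220]
iters_I_undamped = [43, 60, 83, 108, 152]
minutes_I_undamped = [1.73, 7.38, 23.30, 58.68, 140.19]
iters_II_undamped = [45, 64, 90, 119, 168]
minutes_II_undamped = [1.92, 8.68, 27.84, 70.36, 169.06]
iters_I_damped = [32, 42, 55, 70, 89]

def minutes_per_iteration(minutes, iterations):
    """Average CPU minutes spent per Bi-CGSTAB iteration."""
    return [m / it for m, it in zip(minutes, iterations)]

def linear_fit(x, y):
    """Least-squares slope, intercept and R^2 of y against x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x, sxy * sxy / (sxx * syy)

cost_I = minutes_per_iteration(minutes_I_undamped, iters_I_undamped)
cost_II = minutes_per_iteration(minutes_II_undamped, iters_II_undamped)
slope_undamped, _, r2_undamped = linear_fit(k_values, iters_I_undamped)
slope_damped, _, r2_damped = linear_fit(k_values, iters_I_damped)

print([round(c, 3) for c in cost_I])
print([round(c, 3) for c in cost_II])
print(round(slope_damped, 3), round(r2_damped, 3))
print(round(slope_undamped, 3), round(r2_undamped, 3))
```

In this fit the damped case has the larger coefficient of determination, which matches the remark that the linear trend is clearest with damping.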
7.3. The three layers model
The three layers model is used to evaluate the preconditioned Bi-CGSTAB method for a simple heterogeneous medium, in which the wavenumber k varies mildly because the velocity v differs from layer to layer. We consider a physical domain which is defined to be a cube of dimension 760 m × 760 m × 760 m. A point source is located on the upper surface at (60 m, 60 m, 0 m), where the upper surface is assigned to be z = 0. Fig. 12 presents the domain with three layers, and the variation of velocity in the medium. In Fig. 12(a), the first, second, and third layers have velocities of 1600 m/s, 2400 m/s, and 3200 m/s, respectively. The real part of the numerical solution at f = 30 Hz without damping is plotted in Fig. 12(b). Table 7 presents the convergence of the preconditioned Bi-CGSTAB method for different frequencies (varying from 20 to 60 Hz) with 2% damping and without damping. The number of Bi-CGSTAB iterations and the CPU time in minutes are presented. As can be seen, method (I) performs robustly and still converges faster than method (II).
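To make the dependence of the wavenumber on the layer velocity concrete, the sketch below evaluates the standard relation $k = 2\pi f / v$ for the three layers at f = 30 Hz. This relation is assumed here, since it is not written out in this section, and the paper may work with a rescaled (dimensionless) wavenumber. The function name `wavenumber` is ours.

```python
import math

def wavenumber(frequency_hz, velocity_m_per_s):
    """Physical wavenumber k = 2*pi*f / v, in radians per metre (assumed relation)."""
    return 2 * math.pi * frequency_hz / velocity_m_per_s

layer_velocities = [1600.0, 2400.0, 3200.0]  # m/s, from Fig. 12(a)
frequency = 30.0                             # Hz

layer_wavenumbers = [wavenumber(frequency, v) for v in layer_velocities]
layer_wavelengths = [v / frequency for v in layer_velocities]  # metres

for v, k, lam in zip(layer_velocities, layer_wavenumbers, layer_wavelengths):
    print(f"v = {v:6.0f} m/s: k = {k:.4f} rad/m, wavelength = {lam:.1f} m")
```

The slowest layer has the shortest wavelength, so under this relation it is the layer that constrains the grid spacing for a fixed number of gridpoints per wavelength.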
Fig. 11. The real part of the numerical solution for k = 60 with point and line sources placed at different locations. (a) point source at the center of the domain; (b) 2D cross-section of (a) along the x-axis; (c) point source on the edge (z-axis); (d) point source on the corner; (e), (f), (g) point sources on the upper face and right face; (h), (i) line sources in the directions of the x-axis and y-axis.

Fig. 12. (a) The three layers model with speed profile indicated (1.6 km/s, 2.4 km/s, 3.2 km/s; axes x and z). (b) The real part of the numerical solution at f = 30 Hz.

Table 7
The number of Bi-CGSTAB iterations and CPU time in minutes (in parentheses) for the three layers model with 2% damping and without damping.

| Method | Damping | f = 20 Hz | f = 30 Hz | f = 40 Hz | f = 50 Hz | f = 60 Hz |
|---|---|---|---|---|---|---|
| Grid | | $89^3$ | $121^3$ | $185^3$ | $232^3$ | $296^3$ |
| (I) | $\alpha = 0.00$ | 48 (1.58) | 61 (3.78) | 79 (13.66) | 105 (32.63) | 141 (81.96) |
| (I) | $\alpha = 0.02$ | 43 (1.50) | 52 (3.34) | 65 (11.47) | 82 (25.68) | 110 (65.25) |
| (II) | $\alpha = 0.00$ | 50 (1.79) | 66 (4.61) | 90 (16.66) | 118 (39.64) | 159 (100.67) |
| (II) | $\alpha = 0.02$ | 45 (1.68) | 56 (3.88) | 74 (13.59) | 91 (31.56) | 122 (78.32) |

7.4. The 3D salt dome model
We evaluate a more complicated heterogeneous medium, namely, the 3D salt dome model, which mimics the subsurface geology under the sea. The physical domain considered here is defined to be a box of dimension 960 m × 960 m × 600 m. A point source is located at the point (481 m, 481 m, 257 m) under the upper surface, which is assigned to be z = 0. Fig. 13(a) presents the 3D salt dome model, and the variation of velocity in the different media. Fig. 13(b) and (c) present the cross-sections of the 3D salt dome model at x = 481 m and y = 549 m, respectively. The velocity of sound is irregularly structured throughout the domain. The lowest velocity is 1524 m/s and the highest velocity is 4480 m/s, which is in the salt dome (the red part in Fig. 13). The snapshots of the real part of the numerical solution for f = 30 Hz without damping at x = 481 m and y = 549 m are displayed in Fig. 14. We can observe the wave field propagating from the source (481 m, 257 m) through the model. Table 8 presents the preconditioned Bi-CGSTAB convergence with and without damping for different frequencies, which vary from 20 Hz to 40 Hz. The numbers of Bi-CGSTAB iterations and the CPU time in minutes (in parentheses) are shown. For f = 40 Hz, the interior domain contains 257 × 257 × 161 = 10,633,889 gridpoints (about 10.6 million). With the PML, the unknowns amount to 297 × 297 × 201 = 17,730,009 (about 17.7 million) in total. The numbers of iterations for methods (I) and (II) without damping are 158 and 180, taking about 42 and 53 min, respectively. For the complicated heterogeneous medium, method (I) still achieves a robust performance, and outperforms method (II).
Even if methods (I) and (II) required the same number of iterations, method (I) would retain an advantage, because the semi-coarsening strategy in method (II) reduces the size of the coarse grid operators only gradually, so that each iteration of the preconditioned Bi-CGSTAB method costs more time.

Fig. 13. (a) The 3D salt dome model with velocity profile indicated; (b) and (c) the cross-sections of the 3D salt dome model at x = 481 m and y = 549 m, respectively. (Axes: x (m), y (m), z (m).)

Fig. 14. Monofrequency wavefield (real part) for f = 30 Hz without damping at (a) x = 481 m; (b) y = 549 m. (Axes: x (m), y (m), z (m).)

Table 8
The number of Bi-CGSTAB iterations and CPU time in minutes (in parentheses) for the 3D salt dome model with 2% damping and without damping.

| Method | Damping | f = 20 Hz | f = 25 Hz | f = 30 Hz | f = 35 Hz | f = 40 Hz |
|---|---|---|---|---|---|---|
| Grid | | $129^2 \times 81$ | $161^2 \times 101$ | $193^2 \times 121$ | $225^2 \times 141$ | $257^2 \times 161$ |
| (I) | $\alpha = 0.00$ | 57 (3.01) | 72 (6.17) | 89 (11.95) | 120 (23.64) | 158 (42.51) |
| (I) | $\alpha = 0.02$ | 55 (2.95) | 67 (5.84) | 81 (10.90) | 106 (21.03) | 139 (37.92) |
| (II) | $\alpha = 0.00$ | 61 (3.52) | 79 (6.68) | 102 (14.95) | 137 (29.27) | 180 (53.46) |
| (II) | $\alpha = 0.02$ | 58 (3.41) | 73 (6.17) | 91 (13.42) | 118 (25.83) | 157 (47.89) |
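The gridpoint counts quoted for f = 40 Hz in the salt dome model can be verified with a few lines of arithmetic. The sketch assumes that the PML adds 20 gridpoints on each side in every direction, as in Section 7.2. This is consistent with the quoted sizes (297 = 257 + 40 and 201 = 161 + 40), although the section does not restate the PML thickness for this model. The helper `count_gridpoints` is ours.

```python
def count_gridpoints(shape):
    """Total number of gridpoints of a tensor-product grid."""
    total = 1
    for n in shape:
        total *= n
    return total

interior_shape = (257, 257, 161)  # f = 40 Hz grid from Table 8
pml_thickness = 20                # assumed: 20 gridpoints per side, as in Section 7.2

# The PML is added on both sides of each coordinate direction.
shape_with_pml = tuple(n + 2 * pml_thickness for n in interior_shape)

print(count_gridpoints(interior_shape))
print(shape_with_pml, count_gridpoints(shape_with_pml))
```

Under this assumption the absorbing layer accounts for about 40% of all unknowns at this frequency, so the PML is a substantial part of the cost of each iteration.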
8. Conclusion
In this paper, we study the numerical solution of the 3D Helmholtz equation with a PML boundary. Both the finite difference scheme and the preconditioned iterative solver are considered.
For the discretization of the 3D Helmholtz-PML equation, we have developed a new 27-point finite difference scheme, which is second order. The new scheme has been proved to be consistent with the 3D Helmholtz-PML equation, and equivalent to the staggered-grid 27-point scheme under certain conditions. A classical dispersion analysis has been carried out to obtain the approximation of the numerical wavenumber to the exact wavenumber. Based on minimizing the numerical dispersion, a refined choice strategy is adopted to optimize the parameters of the new difference scheme. Comparisons of phase velocity show the improvement of the new scheme with refined optimal parameters.
After discretization of the 3D Helmholtz-PML equation with the new difference scheme, we obtain a sparse indefinite linear system. To solve the linear system, the preconditioned Krylov subspace method Bi-CGSTAB has been employed, with a 3D complex shifted-Laplacian-PML preconditioner that generalizes the 2D shifted-Laplacian preconditioner. A spectral analysis is given from the perspective of fractional linear mappings of a complex variable. In order to invert the preconditioner approximately, a multigrid method has been used, which is based on the 3D full-coarsening strategy with a pointwise Jacobi smoother. A new matrix-based prolongation operator has also been constructed for the 3D full-coarsening multigrid. Numerical experiments have been presented, ranging from constant wavenumbers to highly varying wavenumbers in heterogeneous media. Numerical results illustrate the efficiency of the full-coarsening multigrid-based preconditioned Bi-CGSTAB method.
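The role of the fractional linear mapping in the spectral analysis can be illustrated in a much simpler setting than the one treated in the paper. The sketch below assumes a 1D problem with Dirichlet boundaries and no PML, an undamped operator $A = L - k^2 I$, and a shifted-Laplacian preconditioner $M = L - (\beta_1 - \mathrm{i}\beta_2) k^2 I$, where $L$ is the 3-point discretization of the negative Laplacian. The sign convention, the values $\beta_1 = 1$, $\beta_2 = 0.5$, and all names are our assumptions, not taken from this section. Since A and M share eigenvectors, each eigenvalue of $AM^{-1}$ is the image of a real eigenvalue $\mu$ of $L$ under $\mu \mapsto (\mu - k^2)/(\mu - (\beta_1 - \mathrm{i}\beta_2)k^2)$, and for $\beta_1 = 1$ these images lie on the circle with center 1/2 and radius 1/2.

```python
import math

def preconditioned_eigenvalue(mu, k, beta1=1.0, beta2=0.5):
    """Image of a Laplacian eigenvalue mu under the fractional linear map.

    Assumed model: A = L - k^2 I (no damping), M = L - (beta1 - 1j*beta2) k^2 I.
    """
    return (mu - k ** 2) / (mu - (beta1 - 1j * beta2) * k ** 2)

k = 20.0           # wavenumber (illustrative value)
n = 200            # interior gridpoints on (0, 1)
h = 1.0 / (n + 1)

# Eigenvalues of the 3-point discretization of -d^2/dx^2 with Dirichlet boundaries.
laplacian_eigs = [4.0 / h ** 2 * math.sin(j * math.pi * h / 2.0) ** 2
                  for j in range(1, n + 1)]

spectrum = [preconditioned_eigenvalue(mu, k) for mu in laplacian_eigs]

# Largest distance of any eigenvalue from the circle |z - 1/2| = 1/2.
max_deviation = max(abs(abs(lam - 0.5) - 0.5) for lam in spectrum)
print(max_deviation)
```

With a PML the eigenvalues of $L$ become complex, so the spectrum is no longer confined to this circle. This toy model only shows why a fractional linear map is the natural tool for locating the preconditioned spectrum.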
References
[1] R. Alcouffe, A. Brandt, J. Dendy, J. Painter, The multigrid method for the diffusion equation with strongly discontinuous coefficients, SIAM J. Sci. Statist. Comput. 2 (1981) 430-454.
[2] I. Babuška, F. Ihlenburg, E. Paik, S. Sauter, A generalized finite element method for solving the Helmholtz equation in two dimensions with minimal pollution, Comput. Methods Appl. Mech. Engrg. 128 (1995) 325-359.
[3] I. Babuška, S. Sauter, Is the pollution effect of the FEM avoidable for the Helmholtz equation considering high wave numbers, SIAM Rev. 42 (2000) 451-484.
[4] G. Baruch, G. Fibich, S. Tsynkov, E. Turkel, Fourth order schemes for time-harmonic wave equations with discontinuous coefficients, Commun. Comput. Phys. 5 (2008) 442-455.
[5] A. Bayliss, C. Goldstein, E. Turkel, An iterative method for Helmholtz equation, J. Comput. Phys. 49 (1983) 631-644.
[6] A. Bayliss, C. Goldstein, E. Turkel, The numerical solution of the Helmholtz equation for wave propagation problems in underwater acoustics, Comput. Math. Appl. 11 (1985) 655-665.
[7] M. Benzi, Preconditioning techniques for large linear systems: a survey, J. Comput. Phys. 182 (2002) 418-477.
[8] J. Bérenger, A perfectly matched layer for the absorption of electromagnetic waves, J. Comput. Phys. 114 (1994) 185-200.
[9] W. Briggs, V. Henson, S. McCormick, A Multigrid Tutorial, Second ed., SIAM, California, 2000.
[10] Z. Chen, H. Wu, An adaptive finite element method with perfectly matched absorbing layers for the wave scattering by periodic structures, SIAM J. Numer. Anal. 41 (2003) 799-826.
[11] Z. Chen, T. Wu, H. Yang, An optimal 25-point finite difference scheme for the Helmholtz equation with PML, J. Comput. Appl. Math. 236 (2011) 1240-1258.
[12] Z. Chen, D. Cheng, W. Feng, T. Wu, H. Yang, A multigrid-based preconditioned Krylov subspace method for the Helmholtz equation with PML, J. Math. Anal. Appl. 383 (2011) 522-540.
[13] J. Douglas Jr., D. Sheen, J. Santos, Approximation of scalar waves in the space-frequency domain, Math. Mod. Methods Appl. Sci. 4 (1994) 509-531.
[14] Y. Erlangga, C. Oosterlee, C. Vuik, A novel multigrid based preconditioner for heterogeneous Helmholtz problems, SIAM J. Sci. Comput. 27 (2006) 1471-1492.
[15] Y. Erlangga, C. Vuik, C. Oosterlee, On a class of preconditioners for the Helmholtz equation, Appl. Numer. Math. 50 (2004) 629-651.
[16] X. Feng, H. Wu, Discontinuous Galerkin methods for the Helmholtz equation with large wave number, SIAM J. Numer. Anal. 47 (2009) 2872-2896.
[17] M. Van Gijzen, Y. Erlangga, C. Vuik, Spectral analysis of the discrete Helmholtz operator with a shifted Laplacian, SIAM J. Sci. Comput. 29 (2007) 1942-1958.
[18] J. Gozani, A. Nachshon, E. Turkel, Conjugate gradient coupled with multigrid for an indefinite problem, in: R. Vichnevetsky, R.S. Stepleman (Eds.), Advances in Computer Methods for Partial Differential Equations, vol. V, IMACS, New Brunswick, NJ, 1984, pp. 425-427.
[19] B. Hustedt, S. Operto, J. Virieux, Mixed-grid and staggered-grid finite difference methods for frequency-domain acoustic wave modelling, Geophys. J. Int. 157 (2004) 1269-1296.
[20] F. Ihlenburg, I. Babuška, Finite element solution of the Helmholtz equation with high wave number, Part I: The h-version of the FEM, Comput. Math. Appl. 30 (1995) 9-37.
[21] F. Ihlenburg, I. Babuška, Dispersion analysis and error estimation of Galerkin finite element methods for the Helmholtz equation, Int. J. Numer. Methods Eng. 38 (1995) 4207-4235.
[22] C. Jo, C. Shin, J. Suh, An optimal 9-point, finite difference, frequency-space, 2-D scalar wave extrapolator, Geophysics 61 (1996) 529-537.
[23] A. Laird, M. Giles, Preconditioned iterative solution of the 2D Helmholtz equation, Report 02/12, Oxford Computer Laboratory, Oxford, UK, 2002.
[24] Y. Luo, G. Schuster, Parsimonious staggered grid finite differencing of the wave equation, Geophys. Res. Lett. 17 (1990) 155-158.
[25] S. Operto, J. Virieux, P. Amestoy, J. L'Excellent, L. Giraud, H. Ali, 3D finite difference frequency-domain modeling of visco-acoustic wave propagation using a massively parallel direct solver: a feasibility study, Geophysics 72 (2007) 195-211.
[26] C. Riyanti, A. Kononov, Y. Erlangga, C. Vuik, C. Oosterlee, R. Plessix, W. Mulder, A parallel multigrid-based preconditioner for the 3D heterogeneous high-frequency Helmholtz equation, J. Comput. Phys. 224 (2007) 431-448.
[27] Y. Saad, M. Schultz, GMRES: a generalized minimal residual algorithm for solving nonsymmetric linear systems, SIAM J. Sci. Statist. Comput. 7 (1986) 856-869.
[28] E. Saenger, N. Gold, A. Shapiro, Modeling the propagation of elastic waves using a modified finite difference grid, Wave Mot. 31 (2000) 77-92.
[29] C. Shin, H. Sohn, A frequency-space 2-D scalar wave extrapolator using extended 25-point finite difference operator, Geophysics 63 (1998) 289-296.
[30] I. Singer, E. Turkel, A perfectly matched layer for the Helmholtz equation in a semi-infinite strip, J. Comput. Phys. 201 (2004) 439-465.
[31] I. Singer, E. Turkel, High-order finite difference methods for the Helmholtz equation, Comput. Methods Appl. Mech. Eng. 163 (1998) 343-358.
[32] I. Štekl, C. Pain, 3D frequency domain visco-acoustic modeling using rotated finite difference operators, in: 64th Annual Conference and Exhibition, EAGE, Expanded Abstracts, 2007, p. C27.
[33] I. Štekl, R. Pratt, Accurate viscoelastic modeling by frequency-domain finite difference using rotated operators, Geophysics 63 (1998) 1779-1794.
[34] G. Sutmann, Compact finite difference schemes of sixth order for the Helmholtz equation, J. Comput. Appl. Math. 203 (2007) 15-31.
[35] C.K.W. Tam, L. Auriault, F. Cambuli, Perfectly matched layer as an absorbing boundary condition for the linearized Euler equations in open and ducted domains, J. Comput. Phys. 144 (1998) 213-234.
[36] C.K.W. Tam, H. Ju, Finite difference computation of acoustic scattering by small surface inhomogeneities and discontinuities, J. Comput. Phys. 228 (2009) 5917-5932.
[37] C.K.W. Tam, J.C. Webb, Dispersion-relation-preserving finite difference schemes for computational acoustics, J. Comput. Phys. 107 (1993) 262-281.
[38] J. Thomas, Numerical Partial Differential Equations, Finite Difference Methods, Springer, New York, 1995.
[39] L. Trefethen, Group velocity in finite difference schemes, SIAM Review 24 (1982) 113-136.
[40] S. Tsynkov, E. Turkel, A Cartesian perfectly matched layer for the Helmholtz equation, in: Absorbing Boundaries and Layers, Domain Decomposition Methods, Nova Sci. Publ., Huntington, NY, 2001, pp. 279-309.
[41] H. Van Der Vorst, Bi-CGSTAB: a fast and smoothly converging variant of Bi-CG for the solution of nonsymmetric linear systems, SIAM J. Sci. Statist. Comput. 13 (1992) 631-644.
[42] T. Washio, C. Oosterlee, Flexible multiple semicoarsening for three-dimensional singularly perturbed problems, SIAM J. Sci. Comput. 19 (1998) 1646-1666.
[43] Y. Wong, G. Li, Exact finite difference schemes for solving Helmholtz equation at any wavenumber, Int. J. Numer. Anal. Mod. 2 (2011) 91-108.
[44] P. Zeeuw, Matrix-dependent prolongations and restrictions in a blackbox multigrid solver, J. Comput. Appl. Math. 33 (1990) 1-27.
