
GEOPHYSICS, VOL. 71, NO. 4 (JULY-AUGUST 2006); P. R59–R67, 10 FIGS.

10.1190/1.2209547

CGG method for robust inversion and its application to velocity-stack inversion

Jun Ji

Hansung University, Department of Information System Engineering, 389 Samsung-dong 2-ga, Sungbuk-ku, Seoul, 136-792, Korea. E-mail: jun@hansung.ac.kr.

Manuscript received by the Editor May 25, 2004; revised manuscript received December 30, 2005; published online August 2, 2006.
© 2006 Society of Exploration Geophysicists. All rights reserved.

ABSTRACT

The modified conjugate gradient (CG) method, called the conjugate guided gradient (CGG) method, is a robust iterative inversion method producing a parsimonious model estimation. The CG method for solving least-squares (LS) (i.e., ℓ2-norm minimization) problems is modified to solve for different norms or different minimization criteria by guiding the gradient vector appropriately during the iteration steps. Guiding is achieved by iteratively reweighting either the residual vector or the gradient vector during the iteration steps, as the iteratively reweighted least-squares (IRLS) method does. Robustness is achieved by weighting the residual vector, and parsimonious model estimation is obtained by weighting the gradient vector. Unlike the IRLS method, however, the CGG method doesn't change the corresponding forward operator of the problem and is implemented in a linear inversion template. Therefore, the CGG method requires less computation than the IRLS method. Since the solution in the CGG method is found in a least-squares sense along the gradient direction guided by the weights, this solution can be interpreted as the LS solution located in the guided gradient direction. Guiding the gradient gives us more flexibility in the choice of weighting parameters than the IRLS method. I applied the CGG method to velocity-stack inversion, and the results show that the CGG method gives a far more robust and parsimonious model estimation than the standard ℓ2-norm solution, with results comparable to the ℓ1-norm IRLS solution.

INTRODUCTION

The inverse problem has received considerable attention in various geophysical applications. One of the most popular inverse solutions is the least-squares (LS) solution. The LS solution is a member of a family of generalized ℓp-norm solutions that are deduced from a maximum-likelihood formulation. This formulation allows the design of various statistical inversion solutions. Among the various ℓp-norm solutions, the ℓ1-norm solution is more robust than the ℓ2-norm solution because it is less sensitive to spiky, high-amplitude noise (Claerbout and Muir, 1973; Taylor et al., 1979; Scales and Gersztenkorn, 1987; Scales et al., 1988). To take advantage of both ℓ2- and ℓ1-norm solutions, hybrid ℓ1/ℓ2-norm solutions have also been tried (Huber, 1973; Bube and Langan, 1997; Guitton and Symes, 2003). However, the implementation of an algorithm to find ℓ1-norm solutions is not a trivial task; it uses linear programming techniques (Taylor et al., 1979) and needs a large quantity of computer memory. An iterative inversion algorithm called the iteratively reweighted least-squares (IRLS) method (Gersztenkorn et al., 1986; Scales and Gersztenkorn, 1987; Scales et al., 1988; Bube and Langan, 1997) is a good choice for solving ℓp-norm minimization problems for 1 ≤ p ≤ 2. The IRLS approach, which was originally developed for nonlinear inversion, can be adapted to solve linear inverse problems in an ℓp-norm sense by modifying an iterative inversion method such as the conjugate gradient (CG) method (Darche, 1989; Nichols, 1994; Claerbout, 2004).

The ℓp-norm-minimizing IRLS inversion can be used for any inversion problem where robustness to spiky noise and parsimony of the model are required, and velocity-stack inversion is one of those. Velocity-stack inversion is useful not only for velocity analysis but also for various data-processing applications. Applications of velocity-stack inversion include nonhyperbolic noise removal in common-midpoint (CMP) gathers (Nichols, 1994; Guitton and Symes, 2003), multiple removal (Thorson and Claerbout, 1985; Hampson, 1986; Foster and Mosher, 1992; Kostov and Nichols, 1995; Lumley et al., 1995; Kabir and Marfurt, 1999; Herrmann et al., 2000), missing-offset reconstruction (Ji, 1994; Sacchi and Ulrych, 1995), and so on. In these applications, the velocity-stack panels obtained by inversion are usually required to be as spiky and sparse as possible. Then the hyperbolic events, represented by isolated peaks in the velocity-stack panel, are more easily distinguished from the rest of the noise.


This paper introduces a modification of the conventional CG method for solving the LS problem to make it robust and produce a parsimonious model estimation. The modified CG method is called the conjugate guided gradient (CGG) method. Modification of the CG method is performed by guiding the gradient vector during the iteration steps. Guiding the gradient vector is achieved by iteratively reweighting either the residual vector or the gradient vector during the iteration steps, as the IRLS method does. The weighting of the residual vector makes the CGG method robust, and the weighting of the gradient vector makes the CGG method produce a parsimonious model estimation. In the first section, I review the conventional CG method for solving LS problems and show how the IRLS approach differs from the standard LS approach. Next, I explain the CGG method and contrast it with both the LS and IRLS methods. Finally, the proposed CGG method is tested on velocity-stack inversions with both synthetic and real data, and the results of the CGG method are compared with conventional LS and ℓ1-norm IRLS results.

CG METHOD FOR LS INVERSION

Most inversion problems start by formulating the forward problem, which describes the forward operator L that transforms the model vector m into the data vector d:

$$d = Lm. \tag{1}$$

In general, the measured data d may be inexact, and the forward operator L may be ill-conditioned. In that case, instead of solving the above equation directly, different approaches are used to find an optimum solution m for given data d. The most popular method is finding a solution that minimizes the misfit between the data d and the modeled data Lm. The misfit is often referred to as the residual vector r and is described as follows:

$$r = Lm - d. \tag{2}$$

In least-squares inversion, the solution m is the one that minimizes the square of the residual vector as follows:

$$\min_m \left(r^T r\right) = \min_m \,(Lm - d)^T (Lm - d). \tag{3}$$

Most iterative solvers for the LS problem search for the minimum solution on a line or a plane in the solution space. In the CG algorithm, not a line, but rather a plane, is searched. The plane is made from an arbitrary linear combination of two vectors: one vector is chosen to be the gradient vector, and the other is chosen to be the previous descent step vector. Following Claerbout (1992), a conjugate-gradient algorithm for the LS solution can be summarized as shown in algorithm 1:

Algorithm 1 CG method for LS solution
r ← Lm − d
while condition do
    Δm ← L^T r
    Δr ← L Δm
    (m, r) ← cgstep(m, r, Δm, Δr)
end while

In algorithm 1, condition represents a convergence check, such as a tolerance on the residual vector r, a maximum number of iterations, and so on. The subroutine cgstep() updates the model m and the residual r using the previous iteration's descent vector in the conjugate space, Δs = L(m_i − m_{i−1}), where i is the iteration step, and the conjugate gradient vector Δr. The update step size is determined by minimizing the quadratic function composed from Δr (the conjugate gradient) and Δs (the previous iteration's descent vector in the conjugate space) (Claerbout, 1992) as follows:

$$Q(\alpha, \beta) = (r - \alpha \Delta r - \beta \Delta s)^T (r - \alpha \Delta r - \beta \Delta s).$$

Notice that the gradient vector Δm in the CG method for the LS solution is the gradient of the squared residual and is determined by taking the derivative of the squared residual (i.e., the ℓ2-norm of the residual, r^T r) with respect to the model m^T:

$$\Delta m = \frac{\partial}{\partial m^T} (Lm - d)^T (Lm - d) = L^T r. \tag{4}$$
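To make the plane search concrete, here is a minimal NumPy sketch of algorithm 1. The names cg_ls and cgstep, the dense-matrix interface, and the state handling are illustrative assumptions; this simplified cgstep mimics, but is not, the routine of the same name in Claerbout (1992), and it omits safeguards for degenerate directions.

    import numpy as np

    def cgstep(m, r, dm, dr, state):
        # One plane-search update: minimize Q(a, b) = ||r - a*dr - b*ss||^2,
        # where (s, ss) hold the previous model step and its image ss = L s.
        s, ss = state
        if ss is None:  # first iteration: line search along dm only
            a = np.dot(dr, r) / np.dot(dr, dr)
            step_m, step_d = a * dm, a * dr
        else:           # 2 x 2 normal equations of the plane search
            g11, g12, g22 = np.dot(dr, dr), np.dot(dr, ss), np.dot(ss, ss)
            b1, b2 = np.dot(dr, r), np.dot(ss, r)
            det = g11 * g22 - g12 * g12
            a = (g22 * b1 - g12 * b2) / det
            b = (g11 * b2 - g12 * b1) / det
            step_m, step_d = a * dm + b * s, a * dr + b * ss
        return m - step_m, r - step_d, (step_m, step_d)

    def cg_ls(L, d, m0, niter):
        # Algorithm 1: conventional CG for min ||L m - d||_2^2.
        m = m0.copy()
        r = L @ m - d
        state = (None, None)
        for _ in range(niter):
            dm = L.T @ r   # gradient (equation 4)
            dr = L @ dm    # conjugate gradient
            m, r, state = cgstep(m, r, dm, dr, state)
        return m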
CG method for IRLS

Instead of the ℓ2-norm solution obtained by the conventional LS method, ℓp-norm minimization solutions, with 1 ≤ p ≤ 2, are often tried. Iterative inversion algorithms called IRLS algorithms have been developed to solve these problems, which lie between the least-absolute-values problem and the classical least-squares problem. The main advantage of IRLS is that it provides an easy way to compute the approximate ℓp-norm solution. Among the various ℓp-norm solutions, ℓ1-norm solutions are known to be more robust than ℓ2-norm solutions, being less sensitive to spiky, high-amplitude noise (Claerbout and Muir, 1973; Taylor et al., 1979; Scales and Gersztenkorn, 1987; Scales et al., 1988).

The problem solved by IRLS is a minimization of the weighted residual/model in the LS sense. The residual to be minimized in the weighted problem is described as

$$r = W_r (L W_m m - d), \tag{5}$$

where W_r and W_m are the weights for the residual and the model, respectively. These residual and model weights serve to enforce our preferences regarding the residual and the model. They can be applied separately or together according to a given inversion goal. In this section, for simplicity, the explanation is limited to the case of applying both weights together, but the examples given in a later section show all the cases, including the residual and model weights applied separately, for comparison. The weights can be any matrices, but diagonal matrices are often used, and this paper assumes all weights are diagonal matrices. Then the gradient for the weighted least squares becomes

$$\Delta m = \frac{\partial}{\partial m^T} (L W_m m - d)^T W_r^T W_r (L W_m m - d) = W_m^T L^T W_r^T r. \tag{6}$$

A particular choice for the residual weight W_r is the one that results in minimizing the ℓp-norm of the residual. Choosing the ith diagonal element of W_r to be a function of the ith component of the residual vector as follows,

$$\operatorname{diag}(W_r)_i = |r_i|^{(p-2)/2}, \tag{7}$$

the ℓ2-norm of the weighted residual is then

$$\|W_r r\|_2^2 = r^T W_r^T W_r r = r^T W_r^2 r = \|r\|_p^p. \tag{8}$$

Therefore, the minimization of the ℓ2-norm of the weighted residual with a weight as shown in equation 7 can be considered a minimization of the ℓp-norm of the residual. This method is valid for ℓp-norms where 1 ≤ p ≤ 2. When the ℓ1-norm is desired, the weighting is as follows:

$$\operatorname{diag}(W_r)_i = |r_i|^{-1/2}.$$

This weight will reduce the contribution of large residuals and improve the fit to the data that are already well estimated. Thus, the ℓ1-norm-based minimization is robust, i.e., less sensitive to noise bursts in the data. In practice, the weighting operator is modified slightly to avoid dividing by zero. For this purpose, a damping parameter ε is chosen, and the weighting operator is modified to be

$$\operatorname{diag}(W_r)_i = \begin{cases} |r_i|^{-1/2}, & |r_i| > \varepsilon \\ \varepsilon, & |r_i| \le \varepsilon \end{cases}.$$

The choice of this parameter is related to the distribution of the residual values. Some authors choose it as a relatively small value like ε = max|d|/100, and others choose a value that corresponds to a small percentile of the data, e.g., the 2nd percentile (the value with 98% of the values above and 2% below). In this paper, I used the percentile approach to decide the parameter ε because it reflects the distribution of the residual values and showed more stable behavior in the experiments performed for this paper.

The use of the model weight W_m is to enforce our preference regarding the model, for example, the parsimony or the smoothness of the solution. The introduction of the model weight corresponds to applying preconditioning and solving the problem

$$L W_m \hat{m} = d,$$

followed by

$$m = W_m \hat{m}.$$

The iterative solution of this system minimizes the energy of the new model parameter m̂:

$$\|\hat{m}\|_2^2 = \hat{m}^T \hat{m} = m^T W_m^{-T} W_m^{-1} m.$$

In the same vein as the residual weight, the model weight W_m can be chosen as

$$\operatorname{diag}(W_m)_i = |m_i|^{(2-p)/2}. \tag{9}$$

Then the weighted model energy that is minimized is now

$$m^T W_m^{-2} m = \|m\|_p^p,$$

which is the ℓp-norm of the model. When the minimum ℓ1-norm model is desired, the weighting is as follows:

$$\operatorname{diag}(W_m)_i = |m_i|^{1/2}.$$

The IRLS method can be incorporated easily into CG algorithms by including the weights W_r and W_m such that the operator L acquires a premultiplier W_r and a postmultiplier W_m, and the adjoint operator L^T acquires a premultiplier W_m^T and a postmultiplier W_r^T (Claerbout, 2004). However, the introduction of weights that change during the iterations leads us to implement a nonlinear CG method with two nested loops. The outer loop iterates on the changing weights, and the inner loop iterates toward the LS solution for a given weighted operator. Even though we do not know the real residual/model vector at the beginning of the iteration, we can approximate the real residual/model with the residual/model of the previous iteration step, and it will converge to a residual/model that is very close to the real one as the iteration steps continue. This method can be summarized as algorithm 2, where f(r) and f(m) represent the functions of the residual and model described in equation 7 and equation 9, respectively:

Algorithm 2 CG method for IRLS solution
r ← Lm − d
while condition do
    diag(W_r) ← f(r)
    diag(W_m) ← f(m)
    r ← W_r (L W_m m − d)
    while condition do
        Δm ← W_m^T L^T W_r^T r
        Δr ← W_r L W_m Δm
        (m, r) ← cgstep(m, r, Δm, Δr)
    end while
    m ← W_m m
end while

For efficiency, algorithm 2 often is implemented not to wait for convergence of the inner loop but instead to finish the inner loop after a certain number of iterations and then recompute the weights and the corresponding residual (Darche, 1989; Nichols, 1994). To take advantage of the plane search in CG, however, the number of iterations of the inner loop should be two or more. The experiments performed for the examples in this paper showed almost no difference between the results for different numbers of inner-loop iterations. In this paper, therefore, the IRLS algorithm is implemented to finish the inner loop after two iterations.
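For contrast, here is a sketch of the nested IRLS loop of algorithm 2, reusing cgstep from the previous sketch. Note that both Δm and Δr carry the weights, i.e., the operator itself is changed, so the weights must be re-derived and the residual recomputed in an outer loop. The percentile damping and the warm start of m̂ are one plausible reading of the text, not the paper's exact implementation.

    def irls(L, d, m0, n_outer=15, n_inner=2, p=1.0, eps_pct=2.0):
        # Algorithm 2: outer loop re-derives Wr and Wm from the current residual
        # and model (equations 7 and 9); inner loop runs CG on Wr L Wm.
        m = m0.copy()
        wr = np.ones(d.size)   # identity weights on the first pass
        wm = np.ones(m.size)
        for outer in range(n_outer):
            if outer > 0:
                r = L @ m - d
                eps = np.percentile(np.abs(r), eps_pct)  # damping parameter
                wr = np.maximum(np.abs(r), eps) ** ((p - 2.0) / 2.0)
                wm = np.abs(m) ** ((2.0 - p) / 2.0)
            mh = np.divide(m, wm, out=np.zeros_like(m), where=wm != 0)
            rw = wr * (L @ (wm * mh) - d)
            state = (None, None)
            for _ in range(n_inner):  # two inner iterations, as in the text
                dm = wm * (L.T @ (wr * rw))  # gradient of the weighted problem (equation 6)
                dr = wr * (L @ (wm * dm))    # weighted operator: a nonlinear scheme overall
                mh, rw, state = cgstep(mh, rw, dm, dr, state)
            m = wm * mh
        return m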

CGG METHOD

From the algorithmic viewpoint of the CG method, the IRLS algorithm can be considered an LS method, but with its operator L modified by the weights W_r and W_m. The only change in the problems to solve that distinguishes the IRLS algorithm from the LS one is the substitution of W_r L W_m and W_m^T L^T W_r^T for L and L^T, respectively. Since the weights W_r and W_m are functions of the residual and the model, respectively, and the residual r and the model m change during the iteration, the problem that the IRLS method solves is a nonlinear problem. Therefore, the IRLS method obtains the ℓp-norm solution at the cost of a nonlinear implementation. I propose another algorithm that obtains an ℓp-norm solution without breaking the linear inversion template. Instead of modifying the operator, which results in nonlinear inversion, we can choose to guide the search for the minimum ℓ2-norm solution into a specific model subspace to obtain a solution that meets a user's specific criteria. The specific model subspace could be guided by a specific ℓp-norm's gradient or constrained by an a priori model. Such guiding of the model vector can be realized by weighting the residual vector or the gradient vector in the CG algorithm. Because the weights essentially change the direction of the gradient vector in the CG algorithm, the proposed algorithm is called the conjugate guided gradient (CGG) method.

CGG with residual-weight guide
Suppose we apply the same residual weight W_r to the residual as the one we used in the IRLS method when we compute the gradient Δm, but we do not apply the weight when we compute the conjugate gradient Δr. This means that we do not change the operator from L to W_r L, and the weight affects only the gradient direction. This corresponds to guiding the gradient direction with a weighted residual, and the resultant gradient will be the same gradient that we used for the ℓp-norm residual solution in the IRLS method. Unlike the IRLS method, however, we don't need to recompute the residual when the weight has changed, because we did not change the operator during the iteration, and the problem is the same as it was before we changed the weight (i.e., we are solving a linear problem). This algorithm can be implemented as shown in algorithm 3:

Algorithm 3 CGG method with residual weight guide
r ← Lm − d
while condition do
    diag(W_r) ← f(r)
    Δm ← L^T W_r^T r
    Δr ← L Δm
    (m, r) ← cgstep(m, r, Δm, Δr)
end while

Notice that algorithm 3 differs from the original CG method (algorithm 1) only at the gradient (Δm) computation step. The gradient is modified by changing the residual before the gradient is computed from it. By choosing the weight as a function of the residual of the previous iteration step, as we did in the IRLS method, we can guide the gradient vector toward the gradient vector of the ℓp-norm. Thus, the result obtained by weighting the residual in the CGG method can be interpreted as a localized LS solution in the subspace composed of the ℓp-norm gradient vectors, not in the whole solution space. The minimum ℓ2-norm location is unlikely to be located along the gradient direction of the different ℓp-norm, which is guided by the applied weight. Therefore, it is more likely that the solution will be close to the minimum ℓp-norm location, which is guided by the applied weight.
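A sketch of algorithm 3 under the same assumptions. The only change from cg_ls is the weighted residual inside the gradient computation; Δr still uses the unweighted operator, so the residual never needs recomputing and the loop stays linear.

    def cgg_residual_guide(L, d, m0, niter, q=-0.5, eps_pct=2.0):
        # Algorithm 3: guide the gradient with a residual weight.  q = -0.5
        # mimics the l1-norm weight of equation 7; the percentile damping
        # follows the recipe given for IRLS and is an assumption here.
        m = m0.copy()
        r = L @ m - d
        state = (None, None)
        for _ in range(niter):
            eps = np.percentile(np.abs(r), eps_pct)
            wr = np.maximum(np.abs(r), eps) ** q
            dm = L.T @ (wr * r)  # guided gradient
            dr = L @ dm          # operator unchanged: still a linear problem
            m, r, state = cgstep(m, r, dm, dr, state)
        return m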
CGG with model-weight guide

Another way to modify the gradient direction is to modify the gradient vector after the gradient is computed from a given residual. Since the gradient vector is in the model space, any modification of the gradient vector imposes some constraints in the model space. If we know some characteristics of the solution that can be expressed in terms of weighting in the solution space, we can use that weight to redirect the gradient vector by applying the weight to it. Again, by keeping the forward operator unchanged, we don't need to recompute the residual when the weight has changed. This algorithm can be implemented as shown in algorithm 4:

Algorithm 4 CGG method with model weight guide
r ← Lm − d
while condition do
    diag(W_m) ← f(m)
    Δm ← W_m^T L^T r
    Δr ← L Δm
    (m, r) ← cgstep(m, r, Δm, Δr)
end while

Even though model weighting has a different meaning from residual weighting in the inversion result, the analyses are similar. As we redefined the contribution of each residual element by weighting it with the absolute value of itself to some power, we can do the same thing with each model element in the solution:

$$\operatorname{diag}(W_m)_i = |m_i|^{p}, \tag{10}$$

where p is a real number that depends on the problem we wish to solve. If the operator used in the inversion is close to unity, the solution obtained after the first iteration already closely approximates the real solution. Therefore, weighting the gradient with some power of the absolute value of the previous iteration's model means that we downweight the importance of small model values and improve the fit to the data by emphasizing model components that already have large values.
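Algorithm 4 in the same sketch style. Here q = 0.5 would mirror the IRLS ℓ1-norm model weight applied once, while the synthetic example below settles empirically on q = 1.5.

    def cgg_model_guide(L, d, m0, niter, q=0.5):
        # Algorithm 4: redirect the gradient with a model weight only.
        m = m0.copy()
        r = L @ m - d
        state = (None, None)
        for it in range(niter):
            wm = np.abs(m) ** q if it > 0 else np.ones_like(m)  # m may start at zero
            dm = wm * (L.T @ r)  # guided gradient in model space
            dr = L @ dm          # forward operator left untouched
            m, r, state = cgstep(m, r, dm, dr, state)
        return m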
CGG with residual- and model-weights guide

In the previous two subsections, we examined the meaning of weighting the residual vector and the gradient vector, respectively. Because applying the weighting in either residual space or model space is nothing but changing the direction of descent for the solution search, the weighting is not limited to one of the two spaces. We can weight both the residual and the gradient, as shown in algorithm 5:

Algorithm 5 CGG method with residual and model weights guide
r ← Lm − d
while condition do
    diag(W_r) ← f(r)
    diag(W_m) ← f(m)
    Δm ← W_m^T L^T W_r^T r
    Δr ← L Δm
    (m, r) ← cgstep(m, r, Δm, Δr)
end while

Again, algorithm 5 differs from the conventional CG method (algorithm 1) only in the gradient-computation step. Whether we modify the gradient in the residual sense or in the model sense, it changes only the gradient direction (i.e., the direction in which the solution is sought), and the solution is found in the LS sense in that direction. Therefore, the problem solved by the CGG method is a linear problem, and the CGG algorithm always converges to a solution, which is different from the LS solution located along the original gradient direction. Notice that the CGG algorithm (algorithm 5) is simpler than the IRLS algorithm (algorithm 2), but the CGG method gives a solution similar to that of the IRLS method, as demonstrated with the examples shown in the following section.
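A sketch of algorithm 5 combining both guides. Setting qm = 0 recovers algorithm 3, qr = 0 recovers algorithm 4, and qr = qm = 0 recovers plain CG; the exponents are tuning parameters, as discussed for the examples below.

    def cgg(L, d, m0, niter, qr=-0.5, qm=0.5, eps_pct=2.0):
        # Algorithm 5: guide the gradient with residual and model weights together.
        m = m0.copy()
        r = L @ m - d
        state = (None, None)
        for it in range(niter):
            eps = np.percentile(np.abs(r), eps_pct)
            wr = np.maximum(np.abs(r), eps) ** qr
            wm = np.abs(m) ** qm if it > 0 else np.ones_like(m)
            dm = wm * (L.T @ (wr * r))  # guided gradient only
            dr = L @ dm                 # the problem stays linear
            m, r, state = cgstep(m, r, dm, dr, state)
        return m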
APPLICATION OF THE CGG METHOD IN VELOCITY-STACK INVERSION

The CGG method described in the section above can be used to solve any inversion problem whose required properties are robustness to spiky noise and parsimony of the model. In this section, the CGG method is tested on velocity-stack inversion, which is useful not only for velocity analysis but also for various data-processing applications. The conventional velocity stack is performed by summing or estimating semblance (Taner and Koehler, 1969) along the various hyperbolas in a CMP gather, resulting in a velocity-stack panel. Ideally, a hyperbola in a CMP gather should be mapped onto a point in a velocity-stack panel. Summation along a hyperbola, or the hyperbolic Radon transform (HRT), does not give such resolution. To obtain a velocity-stack panel with better resolution, Thorson and Claerbout (1985) and Hampson (1986) formulated it as an inverse problem in which the velocity domain is the unknown space. If we find an operator H that transforms a point in a model space (velocity-stack panel) m into a hyperbola in data space (CMP gather) d,

$$d = Hm, \tag{11}$$

and also find its adjoint operator H^T, we can pose the velocity-stack problem as an inverse problem. The adjoint operator H^T corresponds to the velocity-stacking operator for a given range of velocities (or slownesses), which generates a velocity-stack panel and can be described as

$$m(s,\tau) = \sum_{h=h_{\min}}^{h_{\max}} d\left(h,\, t = \sqrt{\tau^2 + h^2 s^2}\right), \tag{12}$$

where d(h,t) denotes the CMP gather and m(s,τ) denotes the velocity-stack panel. The slowness-time pair (s,τ) are the coordinate axes of the velocity stack, and the offset-time pair (h,t) are the coordinate axes of the CMP gather. A straightforward definition of the forward operator H is the adjoint of the operator H^T defined by equation 12. Through a suitable definition of the inner product, H turns out to be simply the process of reverse NMO and stacking (Thorson and Claerbout, 1985):

$$d(h,t) = \sum_{s=s_{\min}}^{s_{\max}} m\left(s,\, \tau = \sqrt{t^2 - h^2 s^2}\right). \tag{13}$$

Inverse theory helps us find an optimal velocity-stack panel that synthesizes a given CMP gather via the operator H. The usual process is to implement the inverse as the minimization of an LS problem and calculate the solution by solving the normal equation:

$$H^T H m = H^T d. \tag{14}$$

Since the number of equations and unknowns may be large, an iterative LS solver such as CG is usually preferred to solving the normal equation directly.
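Equations 12 and 13 translate directly into a small operator pair. The sketch below uses nearest-neighbor time mapping and hypothetical grid arrays (h, s, t, tau); it is a toy version, since a production pair would use proper interpolation implemented as exact adjoints (verified with a dot-product test) before being handed to a CG-type solver, e.g., wrapped over flattened arrays in a scipy.sparse.linalg.LinearOperator.

    def hrt_adjoint(d, h, s, t, tau):
        # Velocity stacking (equation 12): m(s, tau) = sum_h d(h, sqrt(tau^2 + h^2 s^2)).
        dt = t[1] - t[0]
        m = np.zeros((s.size, tau.size))
        for ih in range(h.size):
            for js in range(s.size):
                t_map = np.sqrt(tau**2 + (h[ih] * s[js]) ** 2)
                it = np.rint((t_map - t[0]) / dt).astype(int)
                ok = (it >= 0) & (it < t.size)
                m[js, ok] += d[ih, it[ok]]
        return m

    def hrt_forward(m, h, s, t, tau):
        # Reverse NMO and stack (equation 13): d(h, t) = sum_s m(s, sqrt(t^2 - h^2 s^2)).
        dtau = tau[1] - tau[0]
        d = np.zeros((h.size, t.size))
        for ih in range(h.size):
            for js in range(s.size):
                arg = t**2 - (h[ih] * s[js]) ** 2
                live = np.flatnonzero(arg >= 0.0)
                itau = np.rint((np.sqrt(arg[live]) - tau[0]) / dtau).astype(int)
                keep = (itau >= 0) & (itau < tau.size)
                d[ih, live[keep]] += m[js, itau[keep]]
        return d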
The LS solution has some attributes that may be undesirable. If the model space is overdetermined and the data contain bursty noise, the LS solutions usually will be spread over all the possible solutions. Other methods may be more useful if we desire a parsimonious representation of the solution. To obtain a more robust solution, Nichols (1994) and Trad et al. (2003) used the IRLS method for ℓ1-norm minimization, and Guitton and Symes (2003) used a quasi-Newton method called limited-memory BFGS (Broyden, 1969; Fletcher, 1970; Goldfarb, 1970; Shanno, 1970; Nocedal, 1980) for Huber-norm minimization. Another possibility is the CGG method proposed in the preceding section. In the next subsections, the results of the CGG method for velocity-stack inversion are compared with the results of the conventional LS method and the ℓ1-norm IRLS method.

Examples on synthetic data

To examine the performance of the proposed CGG method, a synthetic CMP data set with various types of noise is used. Figure 1 shows the synthetic data with three types of noise: Gaussian noise in the background, bursty spike noises, and a trace with only Gaussian noise. Figure 1b is the same data as Figure 1a, but displayed in wiggle format to clearly show the bursty spike noises, which were not discernible because of the clipping in the raster-format display. The relative amplitudes of the three noise types, compared to the maximum amplitude of the hyperbolic data, are ten times for the bursty spikes, two times for the noisy trace, and 0.2 times for the Gaussian noise, respectively.

Figure 1. Synthetic data with various types of noise in raster format (a) and in wiggle format (b).

Figure 2 shows the inversion result (Figure 2b) obtained using the conventional CG algorithm for the LS solution and the data remodeled from it (Figure 2a). In the CG method, the iteration was performed 30 times, and the same number of iterations was also used for all the examples presented in this paper (including the number of iterations in the inner loop in the case of the IRLS CG method). From Figure 2, we can clearly see the limits of ℓ2-norm minimization. In the remodeled data, the noise with Gaussian statistics was removed quite well, but some spurious events were generated around the bursty noise spikes and the noisy trace. The inversion result obtained as a velocity-stack panel also shows many noisy values that correspond to the part of the noise that was not removed completely.

Figure 2. The remodeled synthetic data (a) from the velocity stack (b) obtained by LS inversion using the CG method for the noisy synthetic data (Figure 1).

Figure 3d–f shows the inversion results obtained using the IRLS algorithm with the ℓ1-norm residual weight only, the ℓ1-norm model weight only, and the ℓ1-norm residual and model weights together, respectively. Figure 3a–c shows the remodeled data from the corresponding inversion results. From the results of the ℓ1-norm residual weight (Figure 3a and d), we can see the robustness of ℓ1-norm residual minimization to the bursty noise and the successful removal of the Gaussian noise, too. From the results of the ℓ1-norm model weight (Figure 3b and e), we can see the improvement in the parsimony of the model compared to the result of LS inversion (Figure 2b). The ℓ1-norm model weight also seems to reduce low-amplitude noise quite well, but the result shows some limits in reducing high-amplitude noise, creating some spurious events around the bursty spike noises (Figure 3b). From the results of using the ℓ1-norm residual and model weights together (Figure 3c and f), we can see that the IRLS method certainly can achieve both goals: robustness to the bursty noises and parsimony of the model representation.

Figure 3. Remodeled data and velocity-stack inversion results obtained by the IRLS method with three different norm criteria: (a) and (d) are with the ℓ1-norm residual weight only; (b) and (e) are with the ℓ1-norm model weight only; and (c) and (f) are with the ℓ1-norm residual/model weights together.

Figure 4d–f shows the inversion results obtained using the CGG algorithm with the residual weight only, the model weight only, and the residual and model weights together, respectively. Figure 4a–c shows the remodeled data from the corresponding inversion results. In Figure 4a and d, we can also see the robustness of the residual weight to the bursty spike noises and the successful removal of the Gaussian noise. Here the residual weight used was the same as the one used in the ℓ1-norm residual-minimizing IRLS method. Thus, we can say that guiding the gradient using the ℓ1-norm-like residual weight in the CGG method seems to behave the same as the ℓ1-norm residual-minimizing IRLS method. From the results of the model weight (Figure 4b and e), we can also see the improvement in the parsimony of the model estimation compared to the result of LS inversion (Figure 2b) and behavior in reducing noise similar to that of the ℓ1-norm model-minimizing IRLS method. For the model weight, I used diag(W_m)_i = |m_i|^{1.5}, where the exponent 1.5 was decided empirically. If we wanted the same model weight as the one used in the ℓ1-norm model weight in the IRLS method, the model weight diag(W_m)_i would be |m_i|^{0.5}, but the result of it was not as successful as the IRLS method. So the appropriate value for the exponent was decided to be 1.5 after experiments with several exponent values from 0.5 to 3. From the results of the residual and model weights together (Figure 4c and f), we can see that the CGG method also successfully achieves both goals, robustness to the bursty noise and parsimony of the model representation, and the results of the CGG method are comparable to the results of the IRLS method (Figure 3c and f).

Figure 4. The remodeled data and the velocity-stack inversion results obtained by the CGG method with three different guiding weights: (a) and (d) are with the residual weight only; (b) and (e) with the model weight only; and (c) and (f) with the residual/model weights together.

Figure 5 shows the differences of the results of the IRLS method and of the CGG method from the original synthetic data, respectively. We can see that both differences contain nothing but the noise portion of the data. This demonstrates that both the IRLS method and the CGG method are very successful in removing various types of noises.

Figure 5. (a) The difference of the remodeled data obtained by the IRLS method from the original synthetic data: the original noisy synthetic data (Figure 1) were subtracted from the remodeled data using the IRLS method (Figure 3c). (b) The difference of the remodeled data obtained by the CGG method from the original synthetic data: the original noisy synthetic data (Figure 1) were subtracted from the remodeled data using the CGG method (Figure 4c).
Therefore, we can say that the CGG inversion method can be used to achieve the same goals as the IRLS method: making an inversion robust and producing a parsimonious model estimation. In addition, the CGG method requires less computation than the IRLS method because it solves a linear inversion problem, which requires one iteration loop, instead of a nonlinear inversion problem, which requires two nested iteration loops.

Examples on real data

I tested the proposed CGG method on a real data set that contains various types of noise. The data set was a shot gather from a land survey; however, the trajectories of the events in the data set looked hyperbolic enough to be tested with a hyperbolic inversion.

Figure 6 shows the real data set used for testing and the results, the velocity stack (Figure 6c) and the remodeled data (Figure 6b), when the conventional LS inversion is used. We can see that the real data (Figure 6a) originally contain various types of noise, such as strong ground roll, amplitude anomalies early at near offset and late at 0.8-km offset, and time shifts around offsets of 1.6 km and 2.0 km. The conventional LS inversion generally does a good job of removing the most dominant noise, except for noise whose character is somewhat bursty (i.e., the amplitude anomalies early at near offset and late at 0.8-km offset and the time shifts around offsets of 1.6 km and 2.0 km). The resultant velocity-stack panel (Figure 6c) is filled with various noises that require some more processing if we want to perform any velocity-stack-oriented processing such as velocity picking, multiple removal, and so on.

Figure 6. The real data set (a), the remodeled data from the inversion result (b), and the LS inversion result (c).

Figure 7 shows the remodeled data from the inversion results (Figure 8) obtained using the IRLS method with three different weighting combinations: ℓ1-norm residual only, ℓ1-norm model only, and ℓ1-norm residual and model together. We can see that all three remodeled results show successful and similar removal of most noises. The main difference among the three inversion results is the degree of parsimony of the corresponding velocity stacks, as shown in Figure 8. Even though the ℓ1-norm residual-weight approach can reduce many noisy signals in the velocity stack, the result of the ℓ1-norm model weight shows better parsimony of the velocity stack.

Figure 7. The remodeled data from the velocity-stack inversion results (Figure 8) of the real data (Figure 6a) using the IRLS method with different norm criteria: ℓ1-norm residual only (a), ℓ1-norm model only (b), and ℓ1-norm residual/model together (c).

Figure 8. The velocity-stack inversion results of the real data (Figure 6a) using the IRLS method with different norm criteria: ℓ1-norm residual only (a), ℓ1-norm model only (b), and ℓ1-norm residual/model together (c).

Figure 9 shows the remodeled data from the inversion results (Figure 10) obtained using the CGG algorithm with three different guiding types: the residual weight only, the model weight only, and the residual and model weights together.
All three remodeled data sets (Figure 9a–c) show quality quite similar to that of the ones obtained with the IRLS method (Figure 7). The differences in the parsimony of the velocity stacks among the different guiding types also are shown clearly in Figure 10, and they are similar to the results of the IRLS method. In the case of guiding with the residual weight (Figure 9a), I used the weight diag(W_r)_i = |r_i|^{-0.75} to achieve noise removal similar in quality to that of the ℓ1-norm residual-minimizing IRLS method. The exponent value −0.75 also was decided empirically after experiments with various exponent values.

Figure 9. The remodeled data from the velocity-stack inversion results (Figure 10) of the real data (Figure 6a) using the CGG method with different guiding weights: the residual weight only (a), the model weight only (b), and the residual/model weights together (c).

Figure 10. The velocity-stack inversion results of the real data (Figure 6a) using the CGG method with different guiding weights: the residual weight only (a), the model weight only (b), and the residual/model weights together (c).

In the real data example above, the values of the exponents of the weight functions in the CGG method were decided empirically and were sometimes different from the exponents of the weights used in the ℓ1-norm IRLS method. In the IRLS approach, the meaning of the exponents of the weight functions can be explained either in the ℓp-norm sense or as the relative weighting of each value of the residual/model. Even though the two explanations are closely related, the meaning of the exponent of the weight function in the CGG method is explained better by the latter because the weight only changes the gradient vector and does not minimize the ℓp-norm. So, if we increase the value of the exponent of the model-weight function, the relatively high-amplitude model values get more emphasis in fitting the model to the data. Likewise, if we decrease the value of the exponent of the residual-weight function, which is a negative value, the relatively high-amplitude residual values get less emphasis in fitting the model to the data. The optimum value of the exponent, or the relative emphasis on the model and the residual, depends on the distribution of the model/residual values and can be found empirically, as described in this paper. The experiments described here demonstrate that the exponent values used for the ℓ1-norm residual/model weights in the IRLS approach are good starting choices but can be increased or decreased appropriately for each case if further improvement is required.
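As a compact summary of the tuning just described, the guide functions used for the examples in this paper could look like the following sketch. The function names are illustrative, and the percentile damping follows the recipe given earlier for the parameter ε.

    def residual_guide(r, q=-0.75, eps_pct=2.0):
        # Real-data residual guide: diag(Wr)_i = |r_i|^q with q = -0.75,
        # damped below a small percentile of |r| (percentile choice is an assumption).
        eps = np.percentile(np.abs(r), eps_pct)
        return np.maximum(np.abs(r), eps) ** q

    def model_guide(m, q=1.5):
        # Synthetic-example model guide: diag(Wm)_i = |m_i|^q with q = 1.5;
        # q = 0.5 would mirror the IRLS l1 model weight but proved less effective here.
        return np.abs(m) ** q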

CONCLUSIONS

The proposed CGG inversion method is a modified CG inversion method that guides the gradient vector during the iteration and allows the user to impose various constraints on the residual, the model, or both. Guiding is implemented by weighting the residual vector and the gradient vector, either separately or together. Weighting the residual vector with the residual itself corresponds to guiding the search for a solution toward ℓp-norm minimization; weighting the gradient vector with the model itself corresponds to guiding the solution search toward imposed a priori information. Testing the CGG algorithm on the velocity-stack inversion of synthetic and real data demonstrates that guiding with residual weighting gives a robust model estimation comparable to the IRLS method, and guiding with model weighting produces a parsimonious velocity spectrum also comparable to the IRLS method. So we can say that the CGG method can be used to achieve the same goals as the IRLS method, but with less computation, by solving the linear problem instead of the nonlinear problem, and with more flexibility in the choice of weighting parameters. Therefore, the CGG method seems to be a good alternative to the IRLS method for robust and parsimonious model-estimation inversion of seismic data.

ACKNOWLEDGMENT

This research was financially supported by Hansung University in 2005. I thank two anonymous reviewers and the associate editor for their helpful and constructive comments.

REFERENCES
Broyden, C. G., 1969, A new double-rank minimization algorithm: Notices of the American Mathematical Society, 16, 670.
Bube, K. P., and R. T. Langan, 1997, Hybrid ℓ1/ℓ2 minimization with applications to tomography: Geophysics, 62, 1183–1195.
Claerbout, J. F., 1992, Earth soundings analysis: Processing versus inversion: Blackwell Scientific Publications, Inc.
Claerbout, J. F., 2004, Image estimation by example: http://sepwww.stanford.edu/sep/prof/index.html.
Claerbout, J. F., and F. Muir, 1973, Robust modeling with erratic data: Geophysics, 38, 826–844.
Darche, G., 1989, Iterative l1 deconvolution: Stanford Exploration Project Report, 61, 281–302.
Fletcher, R., 1970, A new approach to variable metric methods: The Computer Journal, 13, 317–322.
Foster, D. J., and C. C. Mosher, 1992, Suppression of multiple reflections using the Radon transform: Geophysics, 57, 386–395.
Gersztenkorn, A., J. B. Bednar, and L. R. Lines, 1986, Robust iterative inversion for the one-dimensional acoustic wave equation: Geophysics, 51, 357–368.
Goldfarb, D., 1970, A family of variable metric methods derived by variational means: Mathematics of Computation, 24, 23–26.
Guitton, A., and W. Symes, 2003, Robust inversion of seismic data using the Huber norm: Geophysics, 68, 1310–1319.
Hampson, D., 1986, Inverse velocity stacking for multiple elimination: Journal of the Canadian Society of Exploration Geophysicists, 22, 44–55.
Herrmann, P., T. Mojesky, M. Magesan, and P. Hugonnet, 2000, De-aliased, high-resolution Radon transforms: 70th Annual International Meeting, SEG, Expanded Abstracts, 1953–1956.
Huber, P. J., 1973, Robust regression: Asymptotics, conjectures, and Monte Carlo: The Annals of Statistics, 1, 799–821.
Ji, J., 1994, Near-offset interpolation in wavefront synthesis imaging: Stanford Exploration Project Report, 82, 195–208.
Kabir, M. M. N., and K. J. Marfurt, 1999, Toward true amplitude multiple removal: The Leading Edge, 18, 66–73.
Kostov, C., and D. Nichols, 1995, Moveout-discriminating adaptive subtraction of multiples: 65th Annual International Meeting, SEG, Expanded Abstracts, 1464–1467.
Lumley, D. E., D. Nichols, and T. Rekdal, 1995, Amplitude-preserved multiple suppression: 65th Annual International Meeting, SEG, Expanded Abstracts, 1460–1463.
Nichols, D., 1994, Velocity-stack inversion using Lp norms: Stanford Exploration Project Report, 82, 1–16.
Nocedal, J., 1980, Updating quasi-Newton matrices with limited storage: Mathematics of Computation, 35, 339–353.
Sacchi, M., and T. Ulrych, 1995, High-resolution velocity gathers and offset space reconstruction: Geophysics, 60, 1169–1177.
Scales, J. A., and A. Gersztenkorn, 1987, Robust methods in inverse theory, in J. A. Scales, ed., Geophysical imaging, Symposium of Geophysical Society of Tulsa: SEG, 25–50.
Scales, J. A., A. Gersztenkorn, S. Treitel, and L. R. Lines, 1988, Robust optimization methods in geophysical inverse theory: 58th Annual International Meeting, SEG, Expanded Abstracts, 827–830.
Shanno, D. F., 1970, Conditioning of quasi-Newton methods for function minimization: Mathematics of Computation, 24, 647–657.
Taner, M. T., and F. Koehler, 1969, Velocity spectra - digital computer derivation and applications of velocity functions: Geophysics, 34, 859–881.
Taylor, H. L., S. C. Banks, and J. F. McCoy, 1979, Deconvolution with the L-one norm: Geophysics, 44, 39–52.
Thorson, J. R., and J. F. Claerbout, 1985, Velocity-stack and slant-stack stochastic inversion: Geophysics, 50, 2727–2741.
Trad, D., T. Ulrych, and M. Sacchi, 2003, Latest views of the sparse Radon transform: Geophysics, 68, 386–399.
