See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/228078158

Magnetotelluric inversion for minimum structure

Article in Geophysics, December 1988. DOI: 10.1190/1.1442438

All content following this page was uploaded by John Booker on 01 April 2015.


GEOPHYSICS, VOL. 53, NO. 12 (DECEMBER 1988); P. 1565-1576, 13 FIGS., 3 TABLES.

Magnetotelluric inversion for minimum structure

J. Torquil Smith* and John R. Booker*

ABSTRACT

Structure can be measured in terms of a norm of the derivative of a model with respect to a function of depth f(z), where the model m(z) is either the conductivity σ or log σ. An iterative linearized algorithm can find models that minimize norms of this form for chosen levels of chi-squared misfit. The models found may very well be global minima of these norms, since they are not observed to depend on the starting model. Overfitting data causes extraneous structure. Some choices of the depth function result in systematic overfitting of high frequencies, a "blue" fit, and extraneous shallow structure. Others result in systematic overfitting of low frequencies, a "red" fit, and extraneous deep structure. A robust statistic is used to test for whiteness; the fit can be made acceptably white by varying the depth function f(z) which defines the norm. An optimum norm produces an inversion which does not introduce false structure and which approaches the true structure in a reasonable way as data errors decrease. Linearization errors are often so small that models of σ (but not log σ) may be reasonably interpreted as the true conductivity averaged through known resolution functions.

INTRODUCTION

One-dimensional (1-D) inversion remains an important tool for interpreting magnetotelluric (MT) data. There are many instances, particularly at very low frequencies, when multidimensional effects may be approximated by a frequency-independent static distortion and only a 1-D interpretation is necessary (Weidelt, 1972; Larson, 1977). Also, 1-D inversions are routinely performed to constrain starting models for 2-D or 3-D modeling or inversion.

In solving any inverse problem, one seeks not merely a model which fits a given set of data, but also knowledge of what features in that model are required by the data and are not merely incidental to the manner in which the model was obtained. This is particularly important in 1-D models intended as starting points for 2-D or 3-D models, since unconstrained details may persist in later iterations and be mistakenly interpreted as significant structure.

Evaluating what features are resolved has been well studied for the linear inverse problem. Backus and Gilbert (1968) show how to construct averages of models that are uniquely determined by the data. These averages are the truth viewed through peaked resolution functions, whose locations may be varied. Knowledge of the resolution functions and the variances of the averages allows critical evaluation of details in the structure.

The same methods have been applied to nonlinear problems, like the inversion of MT data, by linearization about models fitting the data (e.g., Parker, 1970; Oldenburg, 1979). Unfortunately, the averages are unique only for models close to the models about which the linearization was made. Oldenburg reports quite different averages for different models of log σ (where σ is the conductivity in S/m) fitting the same MT data when the averaging functions are centered in low-conductivity areas. This result casts doubt on the uniqueness of his averages in the better resolved high-conductivity zones. A later study by Oldenburg (1981), using averages of log σ determined by his linearized log σ model-construction algorithm, reached conclusions regarding the conductivity beneath the Pacific plate which later had to be recanted (Oldenburg et al., 1984).

Given the uncertainties surrounding nonlinear effects in MT inversion, we argue that one should seek models that have the minimum structure possible for some tolerable level of misfit to the data. If a minimum-structure model exhibits a particular feature, we have confidence that that feature is required. Conversely, if a minimum-structure model does not exhibit a particular feature, then that feature certainly is not required by the data.

We also show that the nonlinear errors (i.e., those resulting from linearization) made in interpreting minimum-structure models of σ as averages of the true conductivity through reso-

Manuscript received by the Editor December 1, 1986; revised manuscript received June 9, 1988.
*Geophysics Program, University of Washington, Seattle, WA 98195.
© 1988 Society of Exploration Geophysicists. All rights reserved.

1565

Downloaded 15 May 2010 to 95.176.68.210. Redistribution subject to SEG license or copyright; see Terms of Use at http://segdl.org/

1566    Smith and Booker

lution functions can be quite small, allowing the investigator the use of resolution functions as a means of quantifying the resolution of a data set. In contrast we will show that errors are not small for models of log σ (nor of resistivity ρ), which may explain some of the erroneous conclusions of Oldenburg (1981).

All real data have measurement errors, so that it is generally neither possible nor desirable to fit the data exactly. The chi-squared statistic

    χ² = Σ_{i=1}^{2N} (Δyᵢ/εᵢ)²,    (1)

where Δyᵢ are the data residuals and εᵢ are the data standard errors, is a common measure of the misfit between a model and the data. For 1-D data with independent Gaussian errors, the χ² misfit of the data to the truth is distributed as the standard χ², for which probabilities are given in most books on statistics. The expected value of χ² for the misfit of the data to the truth is 2N for 2N data points. Parker (1980) shows that when no model fits MT data exactly, the model which minimizes χ² (which he calls D+) consists of delta functions with finite conductance but locally infinite conductivity. Other types of models that approach the same level of misfit develop oscillations. As χ² decreases, the oscillations increase as they try to mimic the delta functions of D+. Thus, if one seeks models with minimum structure, it is a bad idea to demand that χ² be close to its minimum possible value or be much less than the expected value 2N. In fact, minimum-structure models with greater amounts of misfit (such as the 90 percent or 95 percent confidence limit values of χ²) may be desired to place more conservative bounds on the amount of structure required.

The χ² statistic does not give a complete picture of the misfit. We call a fit which distributes the normalized residuals uniformly across the frequency spectrum a white fit, one that overfits low-frequency data a red fit, and one that overfits high-frequency data a blue fit. It is important that an inversion not systematically overfit some frequency ranges and underfit others. We show that a red fit results in more structure than required at depth for a given χ² and less structure than required in the shallow part of the model. We use a robust statistic to test for whiteness and show how to make the fit acceptably white by tailoring the norm that defines the minimum-structure model. Using artificial data, we show that the optimum norm produces an inversion which does not introduce false structure and which approaches the true structure in a reasonable way as the data errors decrease.

We restrict our examples to inversions of artificial 1-D data with Gaussian, zero-mean independent errors of known scale, so that we can compare to the truth and test statistically the residuals to compare different inversions. Considerable caution must be used in interpreting statistical tests made on the residuals left upon inverting real data, since the distributions and scales of the errors may be poorly known and the 1-D assumption is at best an approximation.

CHOOSING THE MODEL VARIABLE AND RESPONSE FUNCTION

Three possible model variables are conductivity σ, resistivity ρ = 1/σ, and log conductivity log σ = −log ρ. To interpret a model as a linearly filtered version of the truth, it is essential that errors associated with linearization be small. This cannot be the case for ρ, because adding a thin layer of infinite resistance (zero conductance) has no effect on the response but can produce a vastly different filtered model. The same is true for log ρ and log σ, but not for σ. A physical argument in favor of σ is that it is large in conductors, where MT gives the most information, and small in resistors, where MT gives the least information. Thus, a filtered σ is dominated by regions where we know the most, while a filtered ρ is dominated by regions where we know the least. However, despite being more nonlinear than σ models, log σ models reduce the masking of structure in resistive zones through side-band leakage from conductive zones, because they are less variable. Using log σ is somewhat akin to prewhitening in time series analysis. Modeling log σ also ensures that σ will be positive.

Having chosen σ as the model variable for which the inverse problem is most linear, we select an appropriate response to measure based on a heuristic argument. In the 1-D MT problem, assuming a time dependence exp(−iωt) and a piecewise-continuous conductivity σ(z), the governing equation for the horizontal electric field E is

    E″ = −iωμ₀σE,    (2)

where μ₀ is the permeability and the left side is differentiated twice with respect to the vertical coordinate z. The boundary conditions at the surface and at great depth are E(0) = E₀ and E′(Y) = 0, respectively. Integrating once and normalizing by the surface field, we get

    E′(0)/E(0) = ∫₀^Y iωμ₀ σ(z) [E(z)/E(0)] dz.    (3)

We define the complex response

    γ(ω) = E′(0, ω)/E(0, ω) = iωB_y(0, ω)/E_x(0, ω),    (4)

where B_y and E_x are the magnetic and electric fields in orthogonal horizontal directions. (Note that in all other sections of this paper γ has been normalized by dividing by the standard errors of the measurements of γ.) Since γ would be linear in σ if E(z) were independent of σ, γ may be more linear in σ than the response c = −1/γ used by Weidelt (1972) and Parker (1970). This motivates our choice. Other choices could be made, but we are doubtful they would give linearization errors as small as those we have obtained.

FINDING MINIMUM-STRUCTURE MODELS

A convenient way to minimize structure is to minimize a norm of a derivative of the model. Models minimizing the first derivative are commonly called "flattest." We define the flattest model as the one that minimizes

    F(m, f) = ∫₀^Y [dm/df(z)]² df(z)    (5)

for a given value of χ², where m is either σ or log σ and the function f controls the norm. The choice of f has effects somewhat similar to the choice of layer thicknesses in the fitting of layered models.

MT Inversion for Minimum Structure    1567

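The inversions below repeatedly forward model the response γ(ω) of equation (4). As a concrete illustration (our sketch, not the authors' code; the layer values are arbitrary), the standard layered-model recursion propagates γ = E′/E upward from the decaying solution in a terminating half-space:

```python
import numpy as np

MU0 = 4e-7 * np.pi  # magnetic permeability of free space, H/m


def gamma_response(freqs_hz, sigma, thick):
    """Complex response gamma(w) = E'(0)/E(0) for a stack of uniform layers.

    sigma : layer conductivities in S/m; the last entry is the terminating
            half-space.  thick : thicknesses in m of all layers but the last.
    Uses the exp(-i w t) convention of equation (2), so each layer has
    wavenumber k = sqrt(-i w mu0 sigma), taken with Re(k) > 0.
    """
    freqs_hz = np.atleast_1d(np.asarray(freqs_hz, dtype=float))
    sigma = np.asarray(sigma, dtype=complex)
    out = np.empty(freqs_hz.size, dtype=complex)
    for n, f in enumerate(freqs_hz):
        w = 2.0 * np.pi * f
        k = np.sqrt(-1j * w * MU0 * sigma)
        k = np.where(k.real < 0, -k, k)   # keep the decaying branch
        g = -k[-1]                        # half-space: E ~ exp(-k z), gamma = -k
        for j in range(len(thick) - 1, -1, -1):
            t = np.tanh(k[j] * thick[j])
            g = k[j] * (g - k[j] * t) / (k[j] - g * t)  # propagate to layer top
        out[n] = g
    return out


# a conductive layer over a resistive half-space, at one frequency
g = gamma_response([1.0], sigma=[0.1, 0.01], thick=[1000.0])
```

For a uniform half-space this reduces to γ(0) = −k, which is what equation (3) gives when integrated analytically with E(z)/E(0) = exp(−kz).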
The simplest choice is f = z. However, this choice is likely to lead to a red fit with unnecessary structure at depth, because the resolution of MT data generally decreases with depth. The deeper structure required to fit low-frequency data typically has a longer length scale and contributes less to F(m, z). Thus, low-frequency data will be easier to fit and will end up with smaller residuals. To compensate for this effect, one can contract the effective scale of the derivative at depth by choosing f(z) such that

    df(z)/dz = (z + z₀)^η    (6)

for some η and z₀ > 0. Equation (6) is a useful parameterization for f, since it includes the obvious choices of f = z and f = log (z + z₀). Below, we compare models using f = z, f = log (z + z₀), and f = −1/(z + z₀), corresponding to η = 0, −1, and −2. The constant z₀ in the definitions of f ensures that the integration of dm/df to recover m is not singular. Physically, z₀ is required because the resolution length approaches a constant at the Earth's surface rather than approaching zero. We somewhat arbitrarily choose z₀ equal to half the penetration depth Re(c) (Weidelt, 1972) for the highest frequency in the data, since we cannot hope to resolve structure much shallower than this. Conceivably, one could adjust the fit of middle frequencies, as compared to high frequencies, by varying z₀.

Marchisio (1985) (see also Marchisio and Parker, 1984) presents a fully nonlinear inversion which minimizes a quantity that is a bound on F(log σ, z) when the model is close to a uniform slab over an infinitely conducting half-space. While a significant advance in nonlinear inverse theory, Marchisio's solution is not necessarily the flattest and is likely to produce a red fit with structure at depth that is not required by the data. Whittall and Oldenburg (1986) also present several nonlinear inversions which minimize various norms of the impulse response of the model rather than norms of the model itself. This is another step in the right direction, but still falls short of finding truly minimum-structure models.

Constable et al. (1987) present a many-layered, linearized inversion that minimizes the sum of the squared first differences (or second differences) of adjacent layers of their models, for a given misfit. Their inversion minimizing the first differences should give very similar results to one minimizing F, in the limit of vanishing layer thicknesses and a sufficiently deep final layer. Since Constable et al. weight all differences equally, their choice of layer thicknesses (as a function of depth) plays the role of the function f(z) in controlling the "color" of the fit to the data.

We minimize structure directly by minimizing F(m, f) in a stable linearized scheme. Let m₀(z) be the starting model of σ or log σ for the current step and m₁ = m₀ + Δm be the model considered for the next step. Let γᵢ for i = 1 to N be the real part and for i = N + 1 to 2N be the imaginary part of the measured data normalized by their standard errors εᵢ. Similarly, let γ₀ᵢ and γ₁ᵢ be the data predicted by m₀ and m₁, normalized by the standard errors. The normalized misfits e₀ᵢ = γᵢ − γ₀ᵢ and e₁ᵢ = γᵢ − γ₁ᵢ have total squared misfits χ₀² and χ₁², respectively.

If Δm is small, perturbing equation (3) and neglecting second-order terms in Δm gives

    γ₁ᵢ − γ₀ᵢ = ∫₀^Y gᵢ(z) Δm(z) dz.    (7)

For m = σ and i = 1 to N,

    gᵢ(z) = Re{iωᵢμ₀ [E(z, ωᵢ)/E(0, ωᵢ)]²}/εᵢ    (8)

(see Oldenburg, 1979). When i = N + 1 to 2N, one takes the imaginary part; and when m = log σ, gᵢ is replaced by σ₀(z)gᵢ(z). Letting

    Γᵢ = ∫₀^Y gᵢ(z)m₀(z) dz,    (9)

we can write

    γ₁ᵢ − γ₀ᵢ + Γᵢ = ∫₀^Y gᵢ(z)m₁(z) dz.    (10)

Integrating by parts,

    γ₁ᵢ − γ₀ᵢ + Γᵢ − Gᵢ(0)m₁(0) = ∫₀^Y Gᵢm′ dz,    (11)

where

    Gᵢ(z) = ∫_z^Y gᵢ(x) dx    (12)

and

    m′ = dm/dz.    (13)

If our goal were to fit the γ₁ᵢ to the γᵢ exactly and m₁(0) were known, replacing γ₁ᵢ with γᵢ in equation (11) would provide 2N constraints for the minimization of F. However, since our goal is to fit the γ₁ᵢ to the γᵢ only to some prescribed χ₁², we replace γ₁ᵢ by γᵢ − e₁ᵢ, rewrite equation (11) as

    γᵢ − γ₀ᵢ + Γᵢ − Gᵢ(0)m₁(0) = ∫₀^Y Gᵢm′ dz + e₁ᵢ,    (14)

and minimize

    W(m₁, χ₁², βₜ) = F(m₁, f) + βₜχ₁²    (15)

with the linearized constraints (14). In the Appendix we show how to choose βₜ so that minimizing W(m₁, χ₁², βₜ) results in the smallest F for a specified value of χₜ² when the linearization inherent in equation (7) is valid. If m₁(0) is also unknown, we solve simultaneously for the m₁(0) which minimizes W(m₁, χₜ², βₜ).

Our algorithm is a method of keeping the change to the model small enough at each iteration so that the linearization is valid, yet large enough so that the flattest model with the desired χ² is arrived at quickly, without an excessive number of forward calculations. The process involves choosing the target χₜ² < χ₀², calculating m₁ by minimizing W(m₁, χₜ², βₜ) using the linearization, and then forward modeling to compute χₐ², the actual χ² attained by the model m₀ + αΔm, and W(m₀ + αΔm, χₐ², βₜ), where 0 < α ≤ 1. If αΔm is small enough, the linearization will hold, and W(m₀ + αΔm, χₐ², βₜ) will be smaller than W(m₀, χ₀², βₜ), its value for the previous model. We then begin another iteration, further reducing the target, until χₜ² reaches our ultimate goal.
1568    Smith and Booker

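After discretization, one linearized step of equations (14) and (15) is a damped least-squares problem for p = dm/df. The sketch below is our illustration, not the authors' implementation: the matrix A (rows built from the kernels Gᵢ), the increments df, and the constant beta are stand-ins, and the m₁(0) term and the Appendix's choice of βₜ are omitted. It minimizes pᵀ diag(df) p + (1/beta)·||d − A p||², i.e., flatness plus a weighted misfit:

```python
import numpy as np


def flattest_step(A, d, df, beta):
    """Minimize  p^T diag(df) p + (1/beta) * ||d - A @ p||^2  over p.

    A    : (2N, M) discretized constraint matrix (rows play the role of
           the integrated kernels G_i in equation 14)
    d    : (2N,) normalized residual vector (left side of equation 14)
    df   : (M,) positive increments of the depth function f, so that
           p^T diag(df) p approximates F in equation (5) with p = dm/df
    beta : trade-off constant; larger beta weights flatness over fit
    """
    dinv = 1.0 / df
    # Stationarity gives (A D^-1 A^T + beta I) lam = d, p = D^-1 A^T lam.
    S = (A * dinv) @ A.T + beta * np.eye(A.shape[0])
    lam = np.linalg.solve(S, d)
    return dinv * (A.T @ lam)


# trivial check: with A = I and unit df, the minimizer is d / (1 + beta)
p = flattest_step(np.eye(3), np.array([3.0, 6.0, 9.0]), np.ones(3), 0.5)
```

Repeating such a step while gradually lowering the target misfit mirrors the iteration described above, with the model recovered as m(0) plus the cumulative sum of p times the df increments.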
However, the value of W may increase at any step because αΔm is too large for the linearization to hold. The remedy depends on whether the large αΔm is due to trying to flatten the model too much in a single step or attempting to decrease the misfit too much. To determine which is the case, we use the linearization to find the model m_f which minimizes W(m_f, χ_f², β_f) with β_f selected so that χ_f² = χ₀². This produces a Δm_f which flattens the model without reducing the misfit. We then compare the size of Δm_f to Δm − Δm_f. For simplicity, we compare using the maximum of the absolute value of the functions (ℓ∞ norm). If Δm_f is similar in size to Δm − Δm_f, too much flattening is to blame, and we reduce α by a factor of 2. If Δm_f is much smaller than Δm − Δm_f, too large an attempted decrease in χ² is to blame. We must then repeat the minimization with a smaller decrease in the target χₜ². The change αΔm can always be made small enough for the linearization to hold, and W will decrease. Then m₀ + αΔm is used as the starting model for the next iteration. To avoid unnecessary failed steps, we never choose χₜ² less than 0.1χ₀², nor do we change α by more than a factor of 2 between tries. To be certain that we reach a minimum of W and F, we must iterate until Δm_f is negligible. All the necessary decisions in the process can be made automatically, and our algorithm typically converges from a half-space to a model with the expected χ² in about eight iterations. These iterations are quite rapid, since only one or two forward calculations are generally needed to find an m₁ which reduces W. (When modeling log σ, starting models with unnecessary structure often increase the number of iterations required.)

It is difficult to be sure that we have found the global minimum of F. We have tried starting models ranging from half-spaces to smoothed versions of D+ and have never found a case where the final model depended on the starting model. Thus it seems likely that the log σ models found are globally the flattest. In a previous version of the algorithm, the decision to accept a model m₀ + αΔm for use as the starting point for the next iteration was made solely on the basis of improved χ² rather than W. With that criterion, the algorithm was occasionally trapped in local minima when the model variable was σ and the starting model was very far from one fitting the data. These

RESULTS AND DISCUSSION

Level of misfit

Requiring too small a misfit requires large oscillations mimicking the best fitting model (D+). Ideally, we should aim for the misfit that our data have with respect to the true Earth response. However, since we do not know the true Earth's structure, the best we can do is to aim for the expected value, E(χ²) = 2N. Even this level is not always desirable or possible, because the D+ misfit may approach or exceed it, particularly when the frequencies are very closely spaced and the misfit of the truth itself is larger than E(χ²). Instead one may want to find the flattest model with some higher level of misfit, such as the 95 percent confidence level. Then one can be more confident that the structures which remain in the model are required to fit the data. To illustrate the dangers of overfitting data and other points, we generated 11 frequencies of synthetic data from 3.2 × 10⁻³ Hz to 1.6 × 10³ Hz. Since the disastrous effects of overfitting the data are evident only if there are errors in the data, we added 1 percent Gaussian noise to the synthetic data (Table 1).

In Figure 1, we plot models of log σ which are flattest with respect to log (z + z₀) (a) with the expected misfit χ² = 22 (model 1a), (b) with a much smaller misfit χ² = 4.73 (model 1b), and (c) with the 95 percent confidence limit χ² = 33.9 (model 1c). (We refer to models by the number of the figure in which they are shown and the letter of their trace in the figure; e.g., model 1a is shown in trace (a) of Figure 1.) For comparison, in Figure 2 we have plotted the true model and the locations of the conductances of the best fitting D+ model (χ² = 3.75). The values of the D+ conductances have been scaled into conductivities by dividing them by the distance between the midpoints to the adjacent spikes. These are the conductivities that would result from redistributing the conductances into uniform layers extending between the midpoints. These scaled conductances have values very close to the true model; this is consistent with the fact that inversion for conductance is well posed (cf. Weidelt, 1985). The true model makes no attempt to fit the errors in the data and has a
These minima were were easily
easily recognized;
recognized; X x22 was
was very largelarge misfit
mistit of 25.6. case, fitting to the expected
25.6. In this case, expected x2 X2 (model la)
and
and the the model
model had large large negative
negative valuesvalues of cr. o. Since
Since we we recovers
recovers essentially structure of the true model with the
essentially all the structure
changed
changed the the criterion. this this trapping has has not recurred.
recurred. Other exception
exception of the resistive resistive zone between between 1 km and 1.6 1.6 km.
convergence
convergence problems may occur occur when when the magnitude of a cr cr Fitting
Fitting only to the 95 percent percent confidence
confidence limit limit x2 (model lc)
X2 (mode1 Ie)
model
model approaches
approaches zero with increasing increasing depth. depth. In this case, case,
the magnitude of I(aAm
. the aAII1( Jj) I( is is typically found found to be be of thethe Table 1. Synthetic
Table I. data expressed in terms
Synthetic data of c =
terms of -t/y, gener-
= -l/y, gener-
order of I) cr( CJ(cc)/I or smaller.
Jj) smaller. As As Ij cr( Jj) I1decreases
o(m) decreasesin successive
successive ated from model
model (a) of
of Figure
Figure 2, with
with 1 percent
percent Gaussian
Gaussian noise
iterations, the the algorithm successively decreases a to keep
successively decreases keep added.
I1aAm(
uAm(s) -c I1crt
C0 JI( < cc)JI( and Am,
o(ex) Am J may never become become negligible.
negligible.
Frequency Re (c)
(c) Imag (c)
(c) 1 std error
This casecase occurs
occurs when the the lowest frequency data are overfit
lowest frequency (HL) (m) (m) (m)
(m) (m) (m)
and
and the best best fitting D Dtf model ends ends in a resistor.
resistor. [Two[Two of the
models presented in
models presented in the
the results
results (Figures
(Figures 7a 7a and 7b) 7b) suffer thisthis 0.003I835
0.0031835 39577.
39577. --62879.
62879. 526.
526.
problem. In these cases we
these cases we have
have let let the
the inversion
inversion continue 0.0159155
0.0159155 10587.
10587. --22403.
22403. 175.
175.
until IIo(&~l)
cr( ex» I1< S/m.] Fortunately, choosing
IO-”8 Sjm.]
-C 10- choosing a norm, such 0.0954930
0.0954930 2726.
2726. -5866.
-5866. 46.
46.
0.3183099
0.3183099 1378.
1378. -2359.
-2359. 19.
19.
as F(o, --l/zl/z +
as F(a, z,), which
+ zo), which does does not overfit the lowest lowest fre- 0.9549297
0.9549297 722.4
722.4 -1132.3
-1 132.3 9.5
9.5
quency data circumvents
quency circumvents the problem. Except for these these cases,
cases, 3.183099
3.183099 427.4 --470.8
470.8 4.5
4.5
we have
have nevernever found final cr o models
models which depended depended on the the 6.366198
6.366 198 350.1
350.1 --303.1
303.1 3.3
3.3
starting model. 15.91549
IS.9 IS49 246.2
246.2 --203.2
203.2 2.3
2.3
63.66198
63.66198 138.82
138.82 --110.77
110.77 1.27
1.27
159.1549
159.1549 93.94
93.94 --76.43
76.43 0.86
0.86
I 59 1.549
1591.549 28.34
28.34 --28.40
28.40 0.28
0.28

Downloaded 15 May 2010 to 95.176.68.210. Redistribution subject to SEG license or copyright; see Terms of Use at http://segdl.org/
MT Inversion
Inversion for
for Minimum
Minimum Structure
Structure 1569
1569

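The step-acceptance rule described above (take m₀ + αΔm only when it decreases W, never changing α by more than a factor of 2 between tries) can be sketched as follows. This is our schematic illustration, not the authors' code; W here is any scalar objective standing in for the flatness functional.

```python
import numpy as np

def damped_update(m, dm, W, alpha=1.0, max_tries=20):
    """Accept the trial model m + alpha*dm only if the objective W
    decreases; otherwise halve alpha (so it never changes by more than
    a factor of 2 between tries) and retry."""
    W0 = W(m)
    for _ in range(max_tries):
        trial = m + alpha * dm
        if W(trial) < W0:
            return trial, alpha      # step accepted
        alpha *= 0.5                 # shrink the step and try again
    return m, 0.0                    # step is negligible

# toy check: a quadratic bowl with a deliberately overshooting step
W = lambda m: float(np.sum((m - 1.0) ** 2))
m_new, alpha = damped_update(np.zeros(3), 4.0 * np.ones(3), W)
# the full and half steps overshoot; alpha = 0.25 lands on the minimum
```

In practice each trial evaluation of W costs one forward calculation, which is why limiting the number of failed steps matters.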
Requiring a misfit close to the minimum possible χ² requires false structures in model 1b to fit the noise in the data. Note that most of the extraneous peaks in model 1b correspond to spikes in the D+ model 2b. This correspondence increases as the misfit of a minimum-structure log σ model approaches the minimum possible.

In comparing misfit statistics, it should be noted that we measure normalized misfit in terms of y_i = -1/c_i. Parker's D+ minimizes the squared misfit in terms of c_i/ε_i, where ε_i is the estimated standard error in c_i, so D+ does not necessarily obtain the smallest squared misfit expressed in terms of y_i. When the relative misfit at each frequency is small (e.g., ≤ 5 percent), the squared misfit is very nearly identical expressed in y or c. When the relative misfit is larger, the squared misfit of D+ expressed in terms of y may be somewhat larger than the minimum possible and may be different from the squared misfit of D+ expressed in terms of c.

FIG. 1. Models minimizing F[log σ, log (z + z₀)] fit to data of Table 1, with misfits (a) χ² = E(χ²) = 22.0, (b) χ² = 4.73, and (c) χ² = 33.9.

FIG. 2. (a) Model from which data of Table 1 were generated; χ² = 25.6. (b) Conductances of best-fitting D+ model scaled into conductivities by dividing by the midpoint distances between the conductance spikes; χ² = 3.75.

Whiteness of fit

The choice of what norm is minimized can affect the color of the fit significantly. Changing the norm by decreasing η in equation (6) penalizes structure at depth and typically increases the size of the low-frequency residuals relative to the high-frequency residuals, making for a bluer fit of the model to the data. In Figure 3, we compare the truth (model 3d) to flattest models of log σ with respect to z, log (z + z₀), and -1/(z + z₀) (models 3a, 3b, 3c), corresponding to η = 0, -1, -2, respectively. All these flattest models have χ² misfits equal to the expected value of 22. We plot the normalized residuals associated with the three models and the truth in Figure 4. The model flattest with respect to log (z + z₀), model 3b, shows the true structure most clearly of the three inversions. The model flattest with respect to z, model 3a, shows fluctuations at depth which are not present in the true model, and the structure is less clearly defined near the surface. Model 3a clearly fits the low frequencies systematically better than the high frequencies (Figure 4). The model flattest with respect to -1/(z + z₀) shows less detail at depth, more fluctuations near the surface, and systematically overfits the high frequencies (Figure 4). Overfitting the low frequencies demands the oscillations at depth, whereas underfitting the high frequencies loses resolution near the surface. The model flattest with respect to log (z + z₀), model 3b, achieves a fairly even fit, resulting in more accurate detail and fewer extraneous oscillations.

FIG. 3. Models all with misfit χ² = 22.0 fit to the data of Table 1, minimizing (a) F(log σ, z), (b) F(log σ, log (z + z₀)), and (c) F(log σ, -1/(z + z₀)). (d) Model from which data were generated; χ² = 25.6.

To quantify the color of the fit, we use Spearman's statistic D. This robust statistic is used to test the significance of a trend (cf. Bickel and Doksum, 1977, p. 365-369) and is based on the ranks R_i and S_i of two variables. The samples of a variable are arranged by size and the ordered samples are numbered (e.g., 1, 2, 3, ...). Then the rank of each sample is simply the number of its place in the ordered set. In our case, we let R_i be the ranks of the sums of the squares of the real and imaginary parts of the residuals (normalized by their standard errors) and let S_i be the ranks of the corresponding frequencies. Spearman's statistic D is

    D = Σ_{i=1}^{N} (S_i - R_i)²    (16)

and is equivalent to a correlation coefficient between the two sets of ranks.
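Equation (16) is simple to evaluate. The sketch below is ours, not the authors'; it ranks squared residual magnitudes against frequency, and also evaluates the null moments E(D) = (N³ - N)/6 and var (D) = N²(N + 1)²(N - 1)/36 quoted in the text.

```python
import numpy as np

def ranks(x):
    """Rank the samples 1..N by size (assumes no exact ties)."""
    return np.argsort(np.argsort(x)) + 1

def spearman_D(freqs, residuals):
    """Equation (16): D = sum over i of (S_i - R_i)^2, where R_i ranks
    the squared residual magnitudes and S_i ranks the frequencies."""
    R = ranks(np.abs(np.asarray(residuals)) ** 2)
    S = ranks(np.asarray(freqs))
    return int(np.sum((S - R) ** 2))

def null_moments(N):
    """E(D) and var (D) for uncorrelated ranks (no trend)."""
    return (N**3 - N) / 6.0, N**2 * (N + 1) ** 2 * (N - 1) / 36.0

# toy check, 11 frequencies: residual power growing with frequency
# (low frequencies overfit, a "red" fit) gives the minimum D = 0;
# residual power shrinking with frequency gives the maximum (N^3 - N)/3
f = np.arange(1.0, 12.0)
d_red = spearman_D(f, 0.1 * f * (1 + 1j))
d_blue = spearman_D(f, 0.1 * f[::-1] * (1 + 1j))
ED, varD = null_moments(11)   # 220.0 and 4840.0, as in the text
```

The rank transform makes the statistic insensitive to the error distribution, which is the robustness property exploited in the text.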
Low values of D correspond to positive correlations and high values to negative correlations. A statistic based on the ranks is more robust than one based on the actual values, because it does not depend on distributional assumptions (such as the errors being Gaussian) and is invariant to transformations of either variable as long as the transformations conserve order. (Our use of D does, however, require that the errors in the data be independent.)

Standard tables exist for the distribution of D (Lehman, 1975, p. 433). For no correlation between R and S, the distribution of D is symmetric about its expected value E(D), which is (N³ - N)/6, and has variance var (D) = N²(N + 1)²(N - 1)/36. For large N, the distribution of [D - E(D)]/[var (D)]^(1/2) is approximately normal for no correlation between R and S. In the case of 11 frequencies, P(D ≤ 102 or 338 ≤ D) = 0.094, since P(D ≤ 102) = 0.047 and E(D) = 220. Also P(D ≤ 84 or 356 ≤ D) = 0.048. We may conclude with a 90 percent confidence level that there is a trend when D ≤ 102 or D ≥ 338, and with a 95 percent confidence level when D ≤ 84 or D ≥ 356.

FIG. 4. Residuals of models of Figure 3, from top to bottom (3a)-(3d), normalized by the standard errors of the data. (——, real part; ---, imaginary part.)

After we have normalized the observations to have unit variance, the actual squared data errors should be randomly distributed and uncorrelated with frequency. The presence of a trend in the residuals may indicate a frequency-dependent misestimation of the errors in the data, a failure of the 1-D assumption, or a failure to model the data adequately due to a systematic bias in an inversion routine. (By bias we mean the tendency to fit some frequency ranges better than others.) One certainly should rule out the last possibility before invoking either of the first two as probable causes of an observed trend.

Since D may vary by approximately [var (D)]^(1/2) from its expected value when no trend exists, as it does for the true model (3d), we do not require that D = E(D) exactly for a model to be acceptable. Values of D for the residuals of the models of Figure 3 are listed in Table 2. Either model 3b or 3c is acceptable at a 90 percent confidence level on the basis of D. With synthetic data, we can compare the obtained values of D from the residuals to the value from the truth, to check for biases of an inversion algorithm. The value of D for the truth lies about 0.4 of the way between those given by using f(z) = log (z + z₀) and f(z) = -1/(z + z₀). Thus an η of about -1.4 should give the least biased fit to this data set. With real data one should check that the values obtained are within a range that has a reasonable chance of occurring (e.g., P ≥ 0.1). If D indicates a trend, one then can evaluate the bias of the inversion algorithm by inverting synthetic data at the same frequencies, with the same scale errors as the real data, generated from a model that at least roughly fits the data.

Table 2. Spearman's D for the residuals left by the models shown in Figures 3 and 7, showing the effect of choice of depth function f(z). For 11 frequencies, E(D) = 220 ± 118 (90 percent confidence limits), if no trend is present.

f(z):        z     log (z + z₀)   -1/(z + z₀)   Truth   Variable
Figure 3     86    256            322           284     log σ
Figure 7     54    110            300           284     σ

Our experience is that models minimizing F[log σ, log (z + z₀)] tend to have values of D close to the value given by the truth, so this is a very good choice of F. Our experiments used logarithmically spaced data and fairly uniform error estimates; for less uniformly distributed data, this may not be as good a choice. For models minimizing F(σ, f), such as those shown in Figure 7, there does not appear to be a single best choice of f(z) independent of the true conductivity. For models which are resistive at depth, using F[σ, log (z + z₀)] tends to overfit the low frequencies, since it does not penalize structure in resistive regions as much as using F[log σ, log (z + z₀)] does. Minimizing F[σ, log (z + z₀)] fits the data more uniformly when the true conductivity is more uniform.

FIG. 5. Models minimizing F[log σ, log (z + z₀)] for data sets generated from model 2a (Figure 2), each with χ² = E(χ²) = 22.0, for three different levels of error: (a) 20 percent error, (b) 5 percent error, and (c) 1 percent error. (d) Model from which data were generated; χ² = 25.6.

To avoid unnecessary structure at some depths and insufficient structure at other depths, we must reject any models for which Spearman's statistic indicates a red or blue fit, regardless of how the models are obtained. At the very least, we must exercise caution in interpreting the deeper portions of models for which D takes on small values, since they may contain large oscillations due to fitting the errors in the data. One must also realize that these models may not have the necessary shallow structure to fit the high-frequency data adequately.
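The rejection rule above can be applied mechanically. The sketch below is our illustration, not the authors' code; it uses the exact limits quoted in the text for 11 frequencies (reject when D ≤ 102 or D ≥ 338 at the 90 percent level, D ≤ 84 or D ≥ 356 at 95 percent) and falls back on the large-N normal approximation otherwise.

```python
import math

def acceptable(D, N=11, level=0.90):
    """White-fit test on Spearman's D.  For N = 11 use the exact limits
    quoted in the text; otherwise use the large-N normal approximation
    for [D - E(D)] / sqrt(var D)."""
    if N == 11:
        lo, hi = (102, 338) if level == 0.90 else (84, 356)
        return lo < D < hi
    ED = (N**3 - N) / 6.0
    sd = math.sqrt(N**2 * (N + 1) ** 2 * (N - 1) / 36.0)
    zcrit = 1.645 if level == 0.90 else 1.960
    return abs(D - ED) / sd < zcrit

# Table 2, Figure 3 models: D = 86 (3a), 256 (3b), 322 (3c), 284 (truth)
results = [acceptable(D) for D in (86, 256, 322, 284)]
# only model 3a is rejected (a red fit), matching the text
```

A two-sided test is appropriate here because both red and blue fits indicate misplaced structure.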
Similarly, for models with large values of D, the shallow portions may contain large oscillations, due to overfitting the errors in the high-frequency data, and deep structure, which may not adequately fit the lower frequency data.

Effect of error level on resultant models

As the errors in MT data decrease, flattest inversions reproduce the true structure with increasing fidelity. We have generated three sets of synthetic data with 20 percent, 5 percent, and 1 percent errors added. The frequencies and the true model are the same as for our first set of data. In Figure 5, we plot models minimizing F[log σ, log (z + z₀)] for the three data sets, fitting each model to the expected χ². As expected, we resolve more details of the true conductivity as the level of errors in the data decreases.

In inverting data, it is essential that the estimates of the errors in the data be accurate. If the estimates of the errors are too large, then the estimated χ² misfit [equation (1)] will be too small; fitting to the expected χ² may underfit the data, losing resolution. Worse yet, if the estimated variances are unrealistically small, even fitting only to the 95 percent confidence limit χ² may be overfitting the data, and may require false structures to fit the noise in the data. Egbert and Booker (1986) have shown that GDS transfer function estimates found by conventional nonrobust methods often have unrealistically small error estimates due to violations of the assumptions of uncorrelated Gaussian errors implicit in the standard methods. Similar results are likely to hold for MT impedance error estimates, so robust transfer function estimation methods such as those used by Egbert and Booker (1986) or Chave et al. (1987) should be used.

Nonlinear error in flattest models considered as averages of the truth

The flattest model m(z) can be shown to be the truth M(z) smoothed through a resolution function, plus a nonlinear error and a stochastic error. The term neglected in writing equation (7) for the change between the flattest model and the truth is

    e_L = y_M - y_m - ∫₀^∞ g(z, m)[M - m] dz,    (17)

where y_M and y_m are the normalized data predicted by the truth and the flattest model. We call e_L the linearization error. In vector form, if M is the true earth, then our measured data y are the sum of the true data y_M and the data errors e_d,

    y = y_M + e_d.    (18)

Letting Δy be the residual from fitting the model m, we also have that

    y = y_m + Δy.    (19)

Assume that our iterative inversion process has converged so that Δm is negligibly small. Then the starting model for a step, m₀, and the resulting model, m, are identical. Using equations (18) and (19) with equations (7), (9), and (A-12), we get

    m(z) = ∫₀^∞ B(z)′ g(z₀, m) M(z₀) dz₀ + B′e_L + B′e_d,    (20)

where

    B′ = A′ + α′ - A′G(0)α′.    (21)

(A′ and α′ are defined in the Appendix.) Equation (20) characterizes the flattest models as the true earth M smoothed through the resolution function B(z)′g(z₀) plus the nonlinear error B′e_L and the stochastic error B′e_d. The nonlinear error made in interpreting the model as a filtered version of the truth is given by propagating e_L in our model estimates in exactly the same way that random errors e_d propagate; i.e., B′e_L. This procedure was used effectively in the seismic traveltime problem by Pavlis and Booker (1983). The nonlinear error is just the difference between the flattest model and the truth smoothed through the resolution kernels of the flattest model, with a correction for the differences that the linear theory predicts should be due to the differing responses of the two models [cf. equation (17)]. We emphasize that nonlinear error and linearization error are only relevant to interpreting flattest models as averages, and that minimization of W does not depend on having small linearization errors.

The resolution kernels may be used to display the inherent resolution limitations of a data set. In addition to this, given a flattest model and a set of resolution kernels, one might be tempted to try to deconvolve the flattest model to obtain the truth. This is not possible even assuming that both error terms are negligible (e_d = e_L = 0). In this case, both the truth and the flattest model averaged through the resolution kernels yield the same flattest model, so a unique deconvolution is impossible. Since resolution kernels have been presented for MT inversions previously (see for example Parker, 1970, or Oldenburg, 1979), we will consider only the nonlinear errors inherent in their use.

FIG. 6. Nonlinear error versus depth for model 3b (Figure 3) for which the model variable is log σ, plotted with an envelope of ±2 linear standard errors.

When the truth contains large variations not resolved by the data, such as the resistive zone between 1.0 km and 1.6 km included in our test case, the magnitude of the linearization error inherent in interpreting models of log σ to be averages of the true log σ is large. The nonlinear errors B′e_L for log σ models 3a, 3b, and 3c are extremely similar, so we plot only the one (3b) in Figure 6. For comparison we have plotted an envelope of ±2 standard errors (linear stochastic error) of the model interpreted as averages through resolution functions.
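Equation (20) is linear in both error terms, so the linearization error e_L and the data errors e_d propagate into the model through the same operator B′. A toy discretized illustration of this bookkeeping (all matrices and values here are made up; B and G merely stand in for B′ and the kernels g):

```python
import numpy as np

# Toy, discretized stand-in for equation (20).  Every value is made up.
rng = np.random.default_rng(0)
G = rng.normal(size=(6, 4))        # 6 data, 4 model cells
B = rng.normal(size=(4, 6))        # stand-in for the operator B'
M = rng.normal(size=4)             # "true" model
e_L = 0.01 * rng.normal(size=6)    # linearization error
e_d = 0.05 * rng.normal(size=6)    # stochastic data error

# recovered model: resolution-smoothed truth plus both error terms
m = B @ (G @ M) + B @ e_L + B @ e_d
# the same operator acts on e_L and e_d, so they can be lumped together
m_alt = (B @ G) @ M + B @ (e_L + e_d)
```

This is why the nonlinear error can be displayed on the same footing as the ±2 standard-error envelope: both are images of an error vector under B′.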

Table 3.
3. Squared
Squared linearization error
error for
for various
various models,
models, comparing
comparing the effects of the
the effects the choice
choice of model
model variable.
variable. (Paten-
(Paren-
thetic
thetic values
valuesomit the
the lowest
lowest frequency.)
frequency.)

Model
Model 3a 3b
3b 3c 7a
7a 7b 7c
7C lOa
10a

Variable log (J
cr log (J
0 log (J
0 (J (J a cr
6 8
Ie,.I’ 3746.
3746. 2488.
2488. 1724.
1724. > 10 (3.02)
10bG(3.02) > 10
lOs”((1.52)
I .52) 0.467
O&7 1.72
1p72

model interpreted as averages through resolution functions. The nonlinear error is greatest near the depth of the unresolved layer, where it is much larger than the uncertainties in the averages due to random noise in the data. The squared magnitude of the linearization errors |e_L|², listed in Table 3, is large for all the log σ models. Each has large squared errors (>40) at each of the seven lowest frequencies. These errors are due principally to the unresolved resistive layer.

The nonlinear error made in interpreting models of σ as averages of the true conductivity need not be so large. Figure 7 shows models minimizing F(σ, f) for comparison with the log σ models in Figure 3. In Figure 8 we plot the nonlinear error as a function of depth with envelopes of ±2 standard (stochastic) errors for σ models 7a and 7b. For model 7c, the nonlinear errors (not shown) are less than 0.5 standard errors at all depths. For the models 7a and 7b the nonlinear errors are smaller than 1 standard error at all depths above 500 km. The magnitude of the nonlinear error increases below 500 km for models 7a and 7b, reflecting the fact that the data no longer constrain the model enough for the Frechet kernels g_i to be similar for different models fitting the data. The increase in nonlinear error at depth reflects huge linearization errors at the lowest frequency (>10⁶ for models 7a and 7b). The sums of the squared linearization errors of all frequencies except the lowest are only 3.02 and 1.52 for models 7a and 7b, respectively. Even with very large linearization errors at the lowest frequency in these two examples, the nonlinear errors in interpreting the σ models as averages are insignificant at all depths of interest.

We suspect that nonlinear errors should be largest when the integrated conductivity of a flattest model (as a function of depth) is furthest in some sense (or norm) from that of the truth. The integrated conductivities of two admissible profiles may easily differ at great depth, where the data no longer constrain the conductivity in any way. Parker (1981) has shown that one can often find models terminating in an infinite conductance which still fit the data within an acceptable χ² misfit. For these models, the conductivity below the infinite conductance has no effect on the data. Parker calls the shallowest level where one can place the infinite conductance, while still having χ² less than or equal to the 95 percent confidence limit of χ², the "maximum depth of inference." Models with an infinite conductance can never be linearly close to any model lacking the infinite conductance, since the

FIG. 7. (a, b, c) Models all with misfit χ² = 22.0 fit to data of Table 1: (a) minimizing F(σ, z), (b) F[σ, log (z + z₀)], and (c) F[σ, -1/(z + z₀)]. (d) Model from which data were generated; χ² = 25.6.

FIG. 8. (a, b) Nonlinear error versus depth for models 7a and 7b (Figure 7), for which the model variable is σ, plotted within an envelope of ±2 linear standard errors.


latter have Frechet kernels g_i that are nonzero at all finite depths. As finite conductivities at the bottom of a model are increased, the electric fields (from which the Frechet kernels are calculated) are excluded from the high-conductivity region, and the data are less affected by the change in conductivity than the linearized theory would predict. An example of increased nonlinear errors due to this effect follows at the end of the next section. The practical meaning is that the kernels obtained by linearization may tend to overestimate the effects of a large increase in conductivity at great depth. In light of this, we cannot use the averages through the resolution kernels to exclude the possibility of large increases in conductivity near or below the maximum depth of inference.

A more fundamental concern remains: Within the depth range for which a data set contains information, how large an effect can variations of integrated conductivity have on nonlinear errors? This concern would be best addressed by considering the nonlinear errors that would be indicated if the true conductivity were one of the D+ models minimizing or maximizing conductance for a given level of χ² (Weidelt, 1987). Since the code to compute these models is not widely distributed, we consider instead the nonlinear errors that would be indicated if the best-fitting D+ model were actually the true conductivity. The results of this exercise for model 7b are shown in Figure 9. The nonlinear error is substantially larger than two standard errors in many parts of the model. Despite this, D+ smoothed through the resolution kernels of the model still bears a strong resemblance to the model, indicating that in this case the data may constrain the integrated conductivity well enough for resolution kernels to remain a worthwhile means of expressing the resolution properties of the data. Although in each case this exercise considers the nonlinear errors indicated for only one of the infinite number of possible candidates for the true conductivity, it provides an indication of how poor the linearization may be if the conductance of the truth is distributed in as uneven a manner as that of D+. We have not experimented much with this exercise, but it seems probable that more reasonable candidates for the truth may be expected to yield smaller linearization errors.

FIG. 9. (a) Model 7b. (b) D+ model 2b smoothed through resolution kernels of model 7b. (c) Nonlinear error for model 7b with D+ model 2b considered as the true model. (d) ±2 linear standard errors of model 7b.

FIG. 10. (a) Logarithm of model minimizing F₂[σ, 1/σ₀, log (z + z₀)] with χ² = 22.0, and where σ₀ is the conductivity at the previous iteration. (b) True model.
Log σ models recast as σ models

Models minimizing a derivative of log σ are formulated as averages of log σ, not as averages of σ, so interpreting them as averages of σ is subject to large nonlinear errors. By reformulating the problem slightly, it is possible to find averages of σ that share the desirable characteristics of nonnegativity and reduced variability in resistive zones that averages of log σ have, but which avoid the nonlinearity of log σ. Instead of minimizing F[log σ, f(z)], we minimize

    F₂[σ, 1/σ₀, f(z)] = ∫₀^∞ (1/σ₀) [dσ/df(z)]² df(z),        (22)

where, at any iteration, σ₀(z) is the conductivity profile of the previous iteration or alternatively the conductivity profile found by directly minimizing the derivative of log σ. These minimizations would be equivalent to minimizing the derivative of log σ, except that the weight function is held constant with respect to variations δσ in any single iteration of the inversion. Treating 1/σ₀(z) as a weight function in equation

FIG. 11. Nonlinear error of model 10a, plotted with an envelope of ±2 linear standard errors.
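For a discretized profile, the weighted norm of equation (22) can be approximated by finite differences in f(z) = log (z + z₀). A rough sketch, with an invented test profile; the weight 1/σ₀ is taken from a "previous iteration" profile as in the text:

```python
import numpy as np

def F2(sigma, sigma0, z, z0=1.0):
    """Discrete approximation of equation (22):
    F2 = integral of (1/sigma0) (d sigma / d f)^2 df,  f = log(z + z0)."""
    f = np.log(z + z0)
    dsig_df = np.diff(sigma) / np.diff(f)             # derivative at midpoints
    w = 0.5 * (1.0 / sigma0[:-1] + 1.0 / sigma0[1:])  # midpoint weight 1/sigma0
    return float(np.sum(w * dsig_df**2 * np.diff(f)))

z = np.linspace(0.0, 1.0e5, 200)                      # depth, m (hypothetical)
sigma = 0.01 + 0.04 * np.exp(-(((z - 4.0e4) / 1.0e4) ** 2))  # S/m
print(F2(sigma, sigma, z))                  # weighting by the profile itself
print(F2(np.full_like(z, 0.01), sigma, z))  # a flat profile: the norm is 0
```

Because the weight enters only as a fixed function of the previous iterate, the quantity being minimized stays quadratic in σ within each iteration, which is the point of the reformulation.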


FIG. 12. (a) Logarithm of model minimizing F₂[σ, 1/σ₀, log (z + z₀)] with χ² = 43.8, and where σ₀ is the conductivity at the previous iteration. (b) True model.

FIG. 13. Nonlinear error of model 12a (Figure 12), plotted with an envelope of ±2 standard errors.

(5), equation (20) lets us interpret models minimizing this weighted norm as averages of the true conductivity through resolution functions. In Figure 10 we plot the logarithm of the conductivity model minimizing this norm with f = log (z + z₀) and σ₀ as the model of the previous iteration. The log of this model is almost identical to model 3b, so we may be certain that differences in |e_L|² are due to our choice of model variable (σ or log σ), not due to differences in the models themselves. For the σ model 10a, |e_L|² is only 1.72, negligible compared to the data errors and three orders of magnitude smaller than |e_L|² for the comparable log σ model 3b. In Figure 11 we plot the nonlinear error inside an envelope of ±2 standard (linear) errors of the averages. As expected, the nonlinear error is negligible compared to the stochastic uncertainties.

A final example (Figures 12 and 13) demonstrates the increase in linearization error for the case when the true conductivity increases greatly below the maximum depth of inference. We have generated 15 frequencies of artificial data from model 12b, a slightly smoothed version of a uniform slab model provided by Parker (1983). The frequencies are the same as for the COPROD data set (cf. Parker, 1983), ranging between 5.099 × 10⁻⁴ Hz and 3.509 × 10⁻² Hz. We have added 15 percent Gaussian errors to the data at the lowest three frequencies and the highest frequency, and 5 percent Gaussian errors to the other data, approximating the error levels estimated in the COPROD data. The maximum depth of inference for this data set is 336 km, which is shallower than the rise in conductivity centered at 396 km. In Figure 12, we also plot the logarithm of the σ model that minimizes F₂[σ, 1/σ₀, log (z + z₀)], fit to the 95 percent confidence limit of χ² = 43.8. (The minimum possible misfit for this data is 30.2, which is greater than E(χ²) = 30, so we only fit to the 95 percent confidence level.) The linearization error is 2010, which is much larger than the random errors in the data. The nonlinear errors (Figure 13) are largest at the depths of the final good conductor and still exceed the stochastic errors at shallower depths. When the same test is made by inverting synthetic data from a model (not shown) similar to model 12b but with a less conducting final layer of 0.01 S/m, the linearization errors of the resultant model (not shown) are reduced to 14.6, and the nonlinear error attains a value of just twice the stochastic standard error in the final layer and is less than the stochastic error elsewhere. The much larger nonlinear errors in the first of these examples are evidently due to the large discrepancies in conductance at great depth. An explanation is clear: side bands of the resolution kernels for the flattest model extend into the conducting region, and the large conductivities there have a large effect on the averages through the resolution kernels, even when the amplitudes of the kernels are very small at depth. Despite large nonlinear errors, both of these models appear to give reasonable averages of the conductivity at depths above the maximum depth of inference, the caveat being that the averages are less affected by conductance near or below the maximum depth of inference than the averaging kernels indicate.

CONCLUSIONS

Tests with synthetic data show that norm minimization may be highly successful in recovering the large-scale features of the true conductivity, even in cases where nonlinear effects may be very large, such as in modeling log σ. Features which are not resolved by the flattest models fitting a data set are not necessary, and their existence cannot be determined from the data. Flattest models of conductivity have the further advantage that nonlinear effects are often so small that model values may be interpreted reasonably as the true conductivity averaged through known resolution functions.

No matter how a model is obtained, it is essential that the model fit high-frequency and low-frequency data equally well (a white fit). Failure to assure whiteness results in models with unnecessary structure in some depth ranges and possibly inadequate structure in other depth ranges. We have proposed use of Spearman's statistic D to test against selective overfitting or underfitting of data from either end of the spectrum, while making minimum assumptions about the functional form of any relationship between frequency and residual size. We find that minimizing some norms results in systematic overfitting of low-frequency data, whereas minimizing others does not.
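The whiteness test requires nothing more than ranks: correlate the frequency ordering of the data with the ordering of the (say, squared) residuals. The paper works with D, the sum of squared rank differences; the equivalent correlation ρ = 1 - 6D/[n(n² - 1)] is returned in this standard-library sketch, in which the residuals are invented for illustration:

```python
def ranks(x):
    """Average ranks (1-based), with tied values sharing their mean rank."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    r = [0.0] * len(x)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and x[order[j + 1]] == x[order[i]]:
            j += 1                         # extend over a run of ties
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(freqs, sq_residuals):
    """Spearman rank correlation between frequency and squared residual.
    Values near 0 indicate a white fit; a strong correlation of either
    sign flags selective over- or underfitting of one end of the band."""
    n = len(freqs)
    rf, rr = ranks(freqs), ranks(sq_residuals)
    d2 = sum((a - b) ** 2 for a, b in zip(rf, rr))   # Spearman's D
    return 1.0 - 6.0 * d2 / (n * (n**2 - 1))

# hypothetical residuals that grow steadily with frequency:
# the low frequencies are overfit relative to the high ones (a red fit)
freqs = [10.0 ** (-4 + 0.2 * i) for i in range(15)]
sq_res = [0.1 * (i + 1) for i in range(15)]
print(spearman(freqs, sq_res))      # → 1.0 for these monotone residuals
```

In practice one would compare the statistic against its null distribution (tabulated in, e.g., Lehman, 1975) before declaring a fit off-white.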

Which norms result in white fits may be somewhat data-dependent (particularly for σ models), so the test of Spearman's D should be made for every inversion to protect against off-white fits.

ACKNOWLEDGMENTS

We wish to thank Gary Egbert for the suggestion of Spearman's statistic and for many useful discussions, and S. Constable and an anonymous reviewer for thoughtful reviews of an earlier version of this paper. Parts of the research were supported by the Department of Energy under Grant DE-FG06-86ER13472 and the National Science Foundation under Grant EAR-8500248.

REFERENCES

Backus, G. E., and Gilbert, J. F., 1968, The resolving power of gross earth data: Geophys. J. Roy. Astr. Soc., 16, 169-205.
Bickel, P. J., and Doksum, K. A., 1977, Mathematical statistics: basic ideas and selected topics: Holden-Day, Inc.
Chave, A. D., Thomson, D. J., and Ander, M. E., 1987, On the robust estimation of power spectra, coherences, and transfer functions: J. Geophys. Res., 92, 633-648.
Constable, S. C., Parker, R. L., and Constable, C. G., 1987, Occam's inversion: a practical algorithm for generating smooth models from EM sounding data: Geophysics, 52, 289-300.
Egbert, G. D., and Booker, J. R., 1986, Robust estimation of geomagnetic transfer functions: Geophys. J. Roy. Astr. Soc., 87, 173-194.
Larsen, J. C., 1977, Removal of local surface conductivity effects from low frequency mantle response curves: Acta Geodaet., Geophys. et Montanist. Acad. Sci. Hung., 43, 897-907.
Lehman, E., 1975, Nonparametrics: statistical methods based on ranks: Holden-Day, Inc.
Marchisio, G. B., 1985, Exact non-linear inversion of electromagnetic induction soundings: Ph.D. thesis, Univ. of Calif. at San Diego.
Marchisio, G. B., and Parker, R. L., 1984, Exact nonlinear inversion of electromagnetic induction soundings (abstract): EOS, 64, 692.
Oldenburg, D. W., 1979, One-dimensional inversion of natural source magnetotelluric observations: Geophysics, 44, 1218-1244.
——— 1981, Conductivity structure of the oceanic upper mantle beneath the Pacific plate: Geophys. J. Roy. Astr. Soc., 65, 359-394.
Oldenburg, D. W., Whittall, K. P., and Parker, R. L., 1984, Inversion of ocean bottom magnetotelluric data revisited: J. Geophys. Res., 89, 1829-1833.
Parker, R. L., 1970, The inverse problem of electrical conductivity in the mantle: Geophys. J. Roy. Astr. Soc., 22, 121-138.
——— 1980, The inverse problem of electromagnetic induction: Existence and construction of solutions based upon incomplete data: J. Geophys. Res., 85, 4421-4428.
——— 1981, The existence of a region inaccessible to magnetotelluric sounding: Geophys. J. Roy. Astr. Soc., 68, 165-170.
——— 1983, The magnetotelluric inverse problem: Geophys. Surveys, 6, 5-25.
Pavlis, G. L., and Booker, J. R., 1983, A study of the importance of nonlinearity in the inversion of earthquake arrival time data for velocity structure: J. Geophys. Res., 88, 5047-5055.
Weidelt, P., 1972, The inverse problem of geomagnetic induction: Z. Geophys., 38, 257-289.
——— 1985, Construction of conductance bounds from magnetotelluric impedances: J. Geophys., 57, 191-206.
——— 1987, Bounds on spatial conductivity averages from MT impedances: Presented at the XIX General Assembly, Internat. Union Geod. Geophys., Vancouver.
Whittall, K. P., and Oldenburg, D. W., 1986, Inversion of magnetotelluric data using a practical inverse scattering formulation: Geophysics, 51, 383-395.

APPENDIX

SOLUTION OF THE LINEARIZED EQUATIONS

The side conditions (14) are most easily applied to the minimization of W when rewritten (in vector form) as

    r + Δγ' = ∫₀^∞ K (f')^{-1/2} [m₀' + Δm'] dz + e,        (A-1)

where we have dropped the subscript from e and where

    Δγ' ≡ Δγ - [m₀(0) + Δm(0)] G(0),        (A-2)

    Δγ ≡ γ - γ₀,        (A-3)

and

    K = (f')^{1/2} G.        (A-4)

For now we treat the surface value m₀(0) + Δm(0) as a fixed parameter. Define

    H = ∫₀^∞ K K' dz,        (A-5)

where the superscript ' denotes transpose. H is symmetric (and positive semidefinite) and may be diagonalized by an orthogonal transformation Q. Let

    H = Q Λ Q'.        (A-6)

Then, by minimizing W(m, |e|², β) with respect to perturbations δe and δ[m'/(f')^{1/2}], one finds that

    m'/(f')^{1/2} = K' Q [Λ + βI]^{-1} Q' [r + Δγ'],        (A-7)

which may be integrated for m, and

    e = β Q [Λ + βI]^{-1} Q' [r + Δγ'].        (A-8)

"Squaring" the misfit vector, one gets the squared misfit |e|²:

    e'e = β² [r + Δγ']' Q [Λ + βI]^{-2} Q' [r + Δγ'],        (A-9)

where e depends on β through equation (A-8) and Δγ' depends on Δm(0). Since |e|² is a monotonic function of β for β > 0, this may be solved for β numerically using Newton's method.

The above holds for any choice of Δm(0). We use equation (A-7) to form an expression for W and minimize W with respect to changes in Δm(0), yielding

    m₀(0) + Δm(0) = α'[r + Δγ],        (A-10)

where

    α' = G(0)' Q [Λ + βI]^{-1} Q' / {G(0)' Q [Λ + βI]^{-1} Q' G(0)}.        (A-11)

Then combining equations (A-2), (A-7), and (A-10), we have the model which minimizes W for a given choice of β:

    m(z) = [A' + α' - A'G(0)α'][r + Δγ],        (A-12)

where

    A = Q [Λ + βI]^{-1} Q' ∫₀^z (f')^{1/2} K(x) dx.        (A-13)

Equation (A-9) gives the value of β necessary to obtain a specific squared misfit, given a choice of Δm(0). To obtain the pair β and Δm(0) that give the flattest model with a specific squared misfit, we solve for β with Δm(0) = 0 initially, obtain a new Δm(0) from equation (A-10), and reiterate, solving for β with the improved Δm(0) each time. In practice Δm(0) and β converge rapidly, generally in less than five iterations. In the few cases where more than five iterations are needed, we continue iterating using weighted averages of the last two estimates to avoid cyclic repetition. Iterative solution of equations (A-9) and (A-10) is rapid, since it is not necessary to recompute H, Λ, or Q.

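The core of the appendix recipe — diagonalize H once, then search for the β that brings the squared misfit of equation (A-9) to its target — can be sketched as follows. The kernels and data here are random stand-ins, not MT responses; only the one-time diagonalization and the safeguarded Newton search for β follow the text.

```python
import numpy as np

rng = np.random.default_rng(0)
n_data, n_z = 8, 200
K = rng.standard_normal((n_data, n_z)) / np.sqrt(n_z)  # stand-in kernels
d = rng.standard_normal(n_data)           # stand-in vector r + dgamma'

# H = integral K K' dz, diagonalized once (A-5, A-6): H = Q Lam Q'
lam, Q = np.linalg.eigh(K @ K.T)
c = Q.T @ d                               # rotate the data once

def misfit2(beta):
    """Squared misfit of (A-9): beta^2 [r+dg']' Q (Lam+beta I)^-2 Q' [r+dg']."""
    return float(np.sum((beta * c / (lam + beta)) ** 2))

def solve_beta(target, lo=1e-12, hi=1e12, tol=1e-10):
    """misfit2 is monotonically increasing in beta, so a Newton step
    (as in the text), safeguarded by bisection whenever the step would
    leave the current bracket, converges to the target misfit."""
    beta = 1.0
    for _ in range(200):
        g = misfit2(beta) - target
        if abs(g) < tol * target:
            break
        if g < 0:
            lo = beta                     # misfit too small: beta must grow
        else:
            hi = beta                     # misfit too large: beta must shrink
        h = 1e-6 * beta                   # numerical derivative step
        dg = (misfit2(beta + h) - misfit2(beta - h)) / (2.0 * h)
        step = beta - g / dg if dg > 0 else -1.0
        beta = step if lo < step < hi else 0.5 * (lo + hi)
    return beta

target = 0.5 * float(d @ d)   # misfit2 spans (0, d'd) as beta goes 0 -> inf
beta = solve_beta(target)
print(beta, abs(misfit2(beta) - target))
```

Because H, Λ, and Q never change during the β search, each trial β costs only a handful of vector operations, which is the efficiency the appendix's closing sentence points to.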