Characterizing Uncertain Site-Specific Trend Function by Sparse Bayesian Learning


Jianye Ching, M.ASCE 1; and Kok-Kwang Phoon, F.ASCE 2
Abstract: This paper addresses the statistical uncertainties associated with the estimation of a depth-dependent trend function and spatial variation about the trend function using limited site-specific geotechnical data. Specifically, the statistical uncertainties associated with the following elements are considered: (1) the functional form (shape) of the trend function; (2) the parameters of the trend function (e.g., intercept and gradient); and (3) the random field parameters describing spatial variation about the trend function, namely standard deviation (σ) and scale of fluctuation (δ). The problem is resolved with a two-step Bayesian framework. In Step 1, a set of suitable basis functions that parameterize the trend function is selected using sparse Bayesian learning. In Step 2, an advanced Markov chain Monte Carlo method is adopted for the Bayesian analysis. The two-step approach is shown to be consistent in the well-defined sense that the resulting 95% Bayesian confidence interval (or region) contains the actual trend (or actual σ and δ) with a chance that is close to 0.95. Inconsistency can occur when the spatial variability has a large σ or a large δ relative to data record length. DOI: 10.1061/(ASCE)EM.1943-7889.0001240. © 2017 American Society of Civil Engineers.
Author keywords: Geotechnical engineering; Statistical uncertainty; Trend; Site characterization; Spatial variability.
1 Professor, Dept. of Civil Engineering, National Taiwan Univ., No. 1, Section 4, Roosevelt Rd., Da’an District, Taipei 10617, Taiwan (corresponding author). E-mail: jyching@gmail.com
2 Professor, Dept. of Civil and Environmental Engineering, National Univ. of Singapore, 21 Lower Kent Ridge Rd., Singapore 119077.
Note. This manuscript was submitted on July 20, 2016; approved on November 23, 2016; published online on February 21, 2017. Discussion period open until July 21, 2017; separate discussions must be submitted for individual papers. This paper is part of the Journal of Engineering Mechanics, © ASCE, ISSN 0733-9399.
© ASCE 04017028-1 J. Eng. Mech.

Introduction

One of the purposes of site investigation is to obtain information on the vertical spatial distribution of geotechnical parameters. Typically, the vertical spatial distribution is expressed as a depth-dependent trend function and a zero-mean spatial variation. The most popular model for the spatial variation is the random field. The amount of information collected in a site investigation is limited compared with the volume of soil. Hence statistical uncertainty is particularly important in the geotechnical context (Phoon and Kulhawy 1999). This means that the parameters used to characterize the vertical spatial distribution, such as the trend function (t), standard deviation (σ), and scale of fluctuation (SOF) (δ), cannot be estimated with complete certainty. From the Bayesian perspective, statistical uncertainty can also be interpreted as model uncertainty, and the prior/posterior distributions are used to quantify the model uncertainty before/after the site investigation. However, the term statistical uncertainty is adopted here because it is a recent focus in geotechnical risk and reliability (Phoon and Retief 2015; Phoon et al. 2016).

The statistical uncertainty in some of these parameters has been addressed in Ching et al. (2016c). They considered a vertical spatial distribution with a constant trend μ. Ching et al. (“On the identification of geotechnical site-specific trend function,” submitted, J. Risk Uncertainty Eng. Syst. Part A: Civ. Eng., ASCE, Reston, Virginia) further considered a vertical spatial distribution with a linear trend a + bz (z is the depth; a and b are the intercept and gradient, respectively). They further considered a common scenario where the trend function is obtained by regression and the detrended data is adopted to characterize the statistical uncertainty in (σ, δ): the statistical uncertainty in the trend function is ignored. Detrending is a standard preprocessing procedure for spatial variability data (e.g., Fenton 1999a; Phoon and Kulhawy 1999; Uzielli et al. 2005). Ching et al. (“On the identification of geotechnical site-specific trend function,” submitted, J. Risk Uncertainty Eng. Syst. Part A: Civ. Eng., ASCE, Reston, Virginia) showed that ignoring the statistical uncertainty in the trend has the undesirable consequence of overestimating the reliability. Past studies have recognized that detrending deserves more rigorous attention (e.g., Kulatilake 1991; Li 1991; Jaksa et al. 1997; Fenton 1999b). Phoon et al. (2003) tried to couple detrending with the identification of statistically homogeneous soil layers using modified Bartlett statistics. They argued that detrending by regression assumes that the remaining zero-mean oscillating component is white noise, which is inconsistent with the empirical observation that geotechnical parameters are spatially correlated. All previous studies did not consider the impact of statistical uncertainty on detrending.

This paper addresses the statistical uncertainties associated with the characterization of spatially variable site data in full and within a consistent theoretical framework. Ching et al. (2016c) only considered a constant trend t = μ, whereas Ching et al. (“On the identification of geotechnical site-specific trend function,” submitted, J. Risk Uncertainty Eng. Syst. Part A: Civ. Eng., ASCE, Reston, Virginia) only considered a linear trend t(z) = a + bz. Another source of statistical uncertainty is in the form of the trend function. This is more fundamental than the statistical uncertainty in the trend parameters such as μ or (a, b). The actual trend function t(z) is a consequence of natural deposition processes and ensuing environmental factors that continue to modify the deposits. There is limited physical basis to believe that the actual trend can be represented by a constant or linear function, although normally consolidated soils tend to exhibit strength profiles that increase linearly with depth. Overconsolidated soils may exhibit strength profiles that remain constant with depth up to some limit, because the overconsolidation ratio is known to decrease with the depth and this trend may be compensated by the effective vertical stress that increases with

depth. If the actual trend is neither a constant nor a linear function, force-fitting a constant or a linear trend model will introduce bias that cannot be eliminated by increasing the amount of site investigation data. The bias in the form (or shape) of the trend function will ultimately propagate to the design outcomes.

The purpose of the current paper is to characterize the statistical uncertainty in the trend function without assuming the form of the trend function a priori. Hence the statistical uncertainty is not limited to the trend parameters associated with a prescribed trend functional form, as was considered in previous studies; this paper also considers the uncertainty in the trend functional form. An analysis framework that can potentially handle both sources of uncertainties (parameters and functional form) is proposed. The trend function is represented as the linear combination of a collection of basis functions (BFs) with uncertain weights. The framework consists of two steps. In Step 1 (BF selection), the collection of BFs that give the maximum conditional evidence is identified using an iterative optimization procedure. This step is similar to the Bayesian model class selection (Beck and Yuen 2004; Yuen 2010a, b; Cao and Wang 2014; Wang and Aladejare 2015), but it is not necessary to compute the evidence for every possible collection of BFs. Instead, a few optimization runs over the hyperparameters will suffice. This greatly reduces the computational cost. Tipping (2001) called this framework sparse Bayesian learning (SBL) because the optimization result is typically sparse in the sense that only a few BFs are necessary to effectively represent the trend function. However, Tipping (2001) only considered uncorrelated residuals. In the context of spatial variability at a site, this means that the detrended residuals from soil data records are uncorrelated, which is rarely encountered in actual field measurements. It will be shown that the assumption of uncorrelated residuals underestimates the uncertainty in the trend function.

A key contribution of this paper is the modification of the SBL framework to incorporate correlated residuals. It will be made clear that the critical sparsity feature still holds with the incorporation of spatial correlation when an appropriate thresholding scheme proposed in this paper is implemented. Recently, Wang and Zhao (2016) showed that the compressive sampling method (Candès and Wakin 2008) can also lead to sparse BFs. However, SBL is adopted in this study to be consistent with the Bayesian analysis in Step 2.

In Step 2, the BFs selected in Step 1 are adopted for further Bayesian analysis. Another key contribution of this paper is the proposing of an efficient Markov chain Monte Carlo (MCMC) method to draw samples from the high-dimensional posterior probability density function (PDF) of the trend function and (σ, δ). With the two-step approach, the statistical uncertainties in the trend functional form and in the trend parameters are addressed holistically and consistently. The statistical uncertainties in (σ, δ) are also considered within this consistent two-step Bayesian framework.

Bayesian Probabilistic Site Characterization

Let Y = (y_1, y_2, ..., y_n) denote the site investigation data observed at depths (z_1, z_2, ..., z_n). For example, y_i can be the corrected cone tip resistance (q_t) in the cone penetration test (CPT) at depth z_i. It can also be a transformed observation, e.g., y_i = ln[q_t(z_i)] or y_i = ln[Q_tn(z_i)] = ln[q_t(z_i) C_N], where C_N is the overburden correction factor (e.g., Liao and Whitman 1986). It is customary to hypothesize that the observation y_i is the summation of the trend t(z_i) and the spatial variability ε(z_i)

$$y_i = t(z_i) + \varepsilon(z_i) \tag{1}$$

where ε(z) is modeled as a zero-mean stationary random field with standard deviation = σ and SOF = δ. The normalized parameter n_D = D/δ is used to characterize the total depth D = z_n − z_1; n_D is the equivalent number of SOFs within the total depth D.

Probabilistic Site Characterization

Given the site investigation data Y = (y_1, y_2, ..., y_n), the purpose of probabilistic site characterization is to estimate the trend function t(z) as well as (σ, δ). Note that not only are the parameters (σ, δ) unknown, but the functional form for the trend function t(z) is also unknown. It will be explained in the next section that the unknown function t(z) is parameterized into a set of unknown weights. The only available information is the Y data. It is impossible to determine the unknown parameters (t, σ, δ) exactly given the limited information Y. It is more realistic to present the unknown parameters (t, σ, δ) in the form of a range of possibilities, all of which are consistent with the observation data Y, rather than a single definite number. The range should logically broaden/narrow when there is less/more information. To characterize this range, say in the form of a 95% confidence interval or region, the statistical uncertainties in (t, σ, δ) must be quantified. This is a practical and critical problem in probabilistic site characterization. This paper shows that it is possible to do this entirely within a consistent two-step Bayesian framework. The following sections introduce the probabilistic models for the trend function t(z) and for the spatial variability ε(z). Without loss of generality, the trend function t(z), spatial variability ε(z), and the observations Y = (y_1, y_2, ..., y_n) are all defined on the depth interval [0, 1]. For real site investigation data not defined on the depth interval [0, 1], the transformation z = (original depth z − z_1)/D can be adopted to convert to the [0, 1] interval. The SOF needs to be converted as well; SOF = (original SOF)/D. Note that n_D = D/δ is invariant under this transformation.

Probabilistic Model

The trend function t(z) is modeled as the linear combination of a collection of basis functions (BFs)

$$t(z) = \sum_{k=0}^{m} w_k \phi_k(z) \tag{2}$$

where ϕ_k(z) = kth BF and w_k = unknown weight. This expression is fairly general and can encompass the polynomial basis (z^k: k = 0, 1, 2, ...), the wavelet basis (wavelets with various scales and locations), and the Fourier cosine basis, among others. In this study, the shifted Legendre polynomial basis is first demonstrated, although other bases can be adopted. The shifted Legendre polynomials are a collection of polynomials defined on the interval [0, 1]. They are the shifted versions of the Legendre polynomials on [−1, 1]. The first five such polynomials are

$$\begin{aligned}
\phi_0(z) &= 1 \\
\phi_1(z) &= \sqrt{3}\,(2z - 1) \\
\phi_2(z) &= \sqrt{5}\,(6z^2 - 6z + 1) \\
\phi_3(z) &= \sqrt{7}\,(20z^3 - 30z^2 + 12z - 1) \\
\phi_4(z) &= \sqrt{9}\,(70z^4 - 140z^3 + 90z^2 - 20z + 1)
\end{aligned} \tag{3}$$

Note that they are scaled by a constant (2k + 1)^0.5 such that the norms of the polynomials are 1 [Eq. (4)].
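As an illustration of Eqs. (2) and (3), the following minimal sketch evaluates the scaled shifted Legendre polynomials with the standard three-term (Bonnet) recurrence, checks the unit norm numerically, and evaluates a trend from a weight vector. The function names (`shifted_legendre`, `trend`, `squared_norm`) and the midpoint-rule integration are assumptions made for this sketch and are not taken from the paper. The weights used for the linear example, w_0 = 200 and w_1 = 100/√3, follow from rewriting 100 + 200z as 200 ϕ_0(z) + (100/√3) ϕ_1(z).

```python
import math


def shifted_legendre(k, z):
    """Scaled shifted Legendre polynomial phi_k(z) on [0, 1] (hypothetical helper)."""
    x = 2.0 * z - 1.0                      # map [0, 1] onto [-1, 1]
    if k == 0:
        return 1.0
    p_prev, p = 1.0, x                     # P_0(x) and P_1(x)
    for j in range(1, k):                  # Bonnet recurrence for P_{j+1}(x)
        p_prev, p = p, ((2 * j + 1) * x * p - j * p_prev) / (j + 1)
    return math.sqrt(2 * k + 1) * p        # scaling constant (2k + 1)^0.5


def trend(weights, z):
    """Trend t(z) as the weighted sum of BFs, as in Eq. (2)."""
    return sum(w * shifted_legendre(k, z) for k, w in enumerate(weights))


def squared_norm(k, n=2000):
    """Midpoint-rule estimate of the integral of phi_k(z)^2 over [0, 1]."""
    h = 1.0 / n
    return h * sum(shifted_legendre(k, (i + 0.5) * h) ** 2 for i in range(n))


if __name__ == "__main__":
    for k in range(5):
        print(k, round(squared_norm(k), 4))
    # Linear trend 100 + 200 z expressed with the first two BFs
    print(trend([200.0, 100.0 / math.sqrt(3.0)], 1.0))
```

The recurrence avoids hard-coding the polynomial coefficients, so the same helper covers the m = 20 basis used later in the paper.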

The unit-norm condition on [0, 1] is

$$\int_0^1 \phi_k(z)^2 \, dz = 1 \tag{4}$$

Fig. 1 shows the first five shifted Legendre polynomials. The collection of all BFs up to order m is denoted by M, i.e., M = {ϕ_0, ϕ_1, ..., ϕ_m}. It is desirable that m is large (e.g., m = 20) so that the trend model can cover a wide variety of trend functions. The following analysis requires the BFs to have a unity norm but it does not require the BFs to be orthogonal, nor does it require that the data (y_1, y_2, ..., y_n) be equally spaced in the depth direction.

Fig. 1. First five BFs for the shifted Legendre polynomials

From the parameterized trend model in Eq. (2), it is clear that

$$y_i = \sum_{k=0}^{m} w_k \phi_k(z_i) + \varepsilon(z_i) \tag{5}$$

In other words

$$Y = \Phi \times W + \varepsilon \tag{6}$$

where Y = (y_1, y_2, ..., y_n)^T, W = (w_0, w_1, ..., w_m)^T, ε = [ε(z_1), ε(z_2), ..., ε(z_n)]^T, and

$$\Phi = \begin{bmatrix}
\phi_0(z_1) & \phi_1(z_1) & \cdots & \phi_m(z_1) \\
\phi_0(z_2) & \phi_1(z_2) & \cdots & \phi_m(z_2) \\
\vdots & \vdots & \ddots & \vdots \\
\phi_0(z_n) & \phi_1(z_n) & \cdots & \phi_m(z_n)
\end{bmatrix} \tag{7}$$

The spatial variability ε(z) is modeled as a zero-mean stationary normal random field with standard deviation = σ and the following auto-correlation function ρ(Δz) (Vanmarcke 1977):

$$\rho(\Delta z) = \rho[\varepsilon(z), \varepsilon(z + \Delta z)] = \exp(-2|\Delta z|/\delta) \tag{8}$$

where δ = SOF.

The purpose of probabilistic site characterization is to estimate (W, σ, δ), where t(z) is parameterized by W. For Bayesian analysis, prior PDFs for (W, σ, δ) need to be specified. For (σ, δ), flat non-informative prior PDFs are assigned to ln(σ) and ln(δ). For each weight w_k, its prior PDF is assumed to be a zero-mean normal variable with standard deviation s_k

$$f(w_k \mid s_k) \sim N(0, s_k^2) \tag{9}$$

By setting s_k to be a fixed large number, the prior PDF for w_k is also flat. The members of set S = (s_0, s_1, ..., s_m) are called the hyperparameters. Note that although the prior mean for w_k is zero, the posterior mean of w_k can become non-zero after the Bayesian updating.

Likelihood and Posterior PDF

From Eq. (6), the likelihood of observing Y given the uncertain variables (W, σ, δ) can be written as

$$f(Y \mid W, \sigma, \delta) = \frac{1}{(\sqrt{2\pi})^{n}} \times \frac{1}{\sqrt{|\Sigma|}} \times \exp\left[-\frac{1}{2}(Y - \Phi \times W)^{T} \Sigma^{-1} (Y - \Phi \times W)\right] \tag{10}$$

where Σ is the covariance matrix of ε

$$\Sigma = \sigma^2 \times \begin{bmatrix}
1 & e^{-2|z_1 - z_2|/\delta} & e^{-2|z_1 - z_3|/\delta} & \cdots & e^{-2|z_1 - z_n|/\delta} \\
 & 1 & e^{-2|z_2 - z_3|/\delta} & \cdots & e^{-2|z_2 - z_n|/\delta} \\
 & & 1 & & \vdots \\
 & \text{SYM.} & & \ddots & e^{-2|z_{n-1} - z_n|/\delta} \\
 & & & & 1
\end{bmatrix} \tag{11}$$

Step 1: Selection of Basis Functions

An important question to ask is whether all BFs in M = {ϕ_0, ϕ_1, ..., ϕ_m} are necessary to characterize the trend function t(z). Using the shifted Legendre polynomials as an example, it is expected that a regular trend function (e.g., a linear trend) can be expressed as the linear combination of a small subset of the BFs that contains only few low-order polynomials. Adopting high-order polynomials may overfit the data, i.e., they may fit the spatial variability ε(z) in addition to the trend function t(z). This is the fundamental difficulty pointed out in past studies. However, abandoning high-order polynomials indiscriminately will create another difficulty. If the actual trend function happens to be irregular, the low-order polynomials cannot fit the trend function well and parts of the trend function are incorporated into the spatial variability ε(z). It is often difficult to determine by visual inspection whether the trend function is regular or not because the trend function is buried in the spatial variability.

One method of selecting a suitable collection of BFs is the Bayesian model class selection. A collection of BFs is referred to as a model class in this paper. Given a model class M_j (i.e., a subset of M), the model evidence f(Y|M_j) can be estimated as

$$f(Y \mid M_j) = \int f(Y \mid W, \sigma, \delta, M_j)\, f(W \mid M_j)\, f(\ln\sigma \mid M_j)\, f(\ln\delta \mid M_j) \cdot dW \cdot d(\ln\sigma) \cdot d(\ln\delta) \tag{12}$$

where f(ln σ|M_j) and f(ln δ|M_j) are taken to be flat priors; and f(W|M_j) is also flat by setting s_k to be a fixed large number. The model evidence f(Y|M_j) quantifies the plausibility of M_j given the observed data Y, so the model class that maximizes the model evidence is the most plausible. A more complicated model class tends to have a smaller model evidence. This penalization mechanism against complicated model classes is well known (e.g., MacKay 1992; Yuen 2010a). This means that M_j containing many unnecessary BFs is penalized. As a result, maximizing the model evidence f(Y|M_j)

can effectively mitigate overfitting. There are numerous possible model classes, e.g., M_1 = {ϕ_0}, M_2 = {ϕ_0, ϕ_2}, and M_3 = {ϕ_0, ϕ_3, ϕ_6}. There are 2^(m+1) possible model classes. If m = 20, there are 2^21 possible model classes. For each model class M_j ⊆ M, the model evidence f(Y|M_j) can be evaluated and the model class with the highest model evidence is the most plausible model class, denoted by M*. The analytical evaluation of f(Y|M_j) is challenging because the integral in Eq. (12) is high dimensional. Some methods are able to approximately estimate f(Y|M_j), e.g., Laplace approximation (Beck and Yuen 2004; Yuen 2010a, b) and transitional MCMC (Ching and Chen 2007), but the computation cost for each model evidence estimation is high, not to mention that the computation cost will be multiplied by 2^(m+1).

Maximizing the Conditional Evidence f(Y|S, σ, δ, M)

This study adopts an alternative strategy for performing the BF selection that was first proposed by MacKay (1992). Whether or not the kth BF ϕ_k(z) is active is governed by the standard deviation s_k of its weight w_k; s_k = 0 implies that ϕ_k(z) is deactivated because w_k is equal to 0 with prior probability 1. In MacKay’s framework, S = (s_0, s_1, ..., s_m) are considered as variables rather than fixed at large values as in Eq. (12). Furthermore, numerous model classes (M_1, M_2, M_3, ...) are not considered. Instead, the entire set M is considered and the optimal (S, σ, δ) that maximizes the conditional evidence f(Y|S, σ, δ, M) is sought:

$$f(Y \mid S, \sigma, \delta, M) = \int f(Y, W \mid S, \sigma, \delta, M) \cdot dW = \int f(Y \mid W, \sigma, \delta, M)\, f(W \mid S, M) \cdot dW \tag{13}$$

Maximizing the conditional evidence f(Y|S, σ, δ, M) is similar to maximizing the likelihood function f(Y|W, σ, δ, M) in Eq. (10), but now the objective is to fine-tune the standard deviations S = (s_0, s_1, ..., s_m) rather than the weights W = (w_0, w_1, ..., w_m). Maximizing the likelihood f(Y|W, σ, δ, M) (fine-tuning W directly) can lead to a great deal of overfitting. On the other hand, maximizing the model evidence f(Y|M_j) in Eq. (12) can mitigate the overfitting because all random variables are integrated out, but it is computationally costly. Maximizing the conditional evidence f(Y|S, σ, δ, M) is an intermediate solution. The question of whether or not maximizing the conditional evidence can mitigate overfitting will be addressed later.

Tipping (2001) showed that the optimal solution S* = (s_0*, s_1*, ..., s_m*) that optimizes the conditional evidence f(Y|S, σ, δ, M) is typically sparse; most optimal hyperparameters s_k* go to zero. Tipping (2001) called this framework sparse Bayesian learning (SBL) because the optimization result is typically sparse. This SBL has recently attracted some attention in civil engineering (Huang et al. 2014; Huang and Beck 2015; Mu and Yuen 2016). Because s_k* = 0 implies that the kth basis ϕ_k(z) is deactivated, the marginal likelihood f(Y|S, σ, δ, M) is maximized by deactivating all BFs with s_k* = 0. Therefore, the problem of selecting an optimal model class M* among (M_1, M_2, M_3, ...) that maximizes the model evidence f(Y|M_j) is simplified into the problem that finds the optimal S* that maximizes the conditional evidence f(Y|S, σ, δ, M). The computation cost is greatly reduced; the former problem requires the estimation of a combinatorial number of model evidences f(Y|M_1), f(Y|M_2), f(Y|M_3), and so on, whereas the latter only requires a single optimization run.

The conditional evidence f(Y|S, σ, δ, M) has the following analytical expression:

$$f(Y \mid S, \sigma, \delta, M) = \frac{1}{(\sqrt{2\pi})^{n}} \times \frac{1}{\sqrt{|\Phi \times \Omega \times \Phi^{T} + \Sigma|}} \times \exp\left[-\frac{1}{2} Y^{T} (\Phi \times \Omega \times \Phi^{T} + \Sigma)^{-1} Y\right] \tag{14}$$

where Ω = covariance matrix for W

$$\Omega = \begin{bmatrix}
s_0^2 & 0 & \cdots & 0 \\
0 & s_1^2 & & 0 \\
\vdots & & \ddots & \vdots \\
0 & 0 & \cdots & s_m^2
\end{bmatrix} \tag{15}$$

The optimization problem for f(Y|S, σ, δ, M) is high dimensional because there are (m + 1) + 2 = m + 3 parameters in (S, σ, δ). The Appendix shows that the following steps can be adopted to decompose the high-dimensional optimization problem into (m + 3) one-dimensional (1D) optimization problems, and (m + 2) of them have analytical solutions. The optimal solution (S*, σ*, δ*) can be obtained by iterating among these (m + 3) 1D optimization problems until convergence is achieved. Tipping (2001) derived the solutions for the scenario with uncorrelated residuals. When applied to spatial variability at a site, this means the derivation only applies to a soil data record that is spatially uncorrelated. This is clearly unrealistic in the geotechnical context. The derivations are extended to accommodate the spatial correlation structure in the Appendix. Throughout the steps, the following two items always need to be updated to their most recent states:

$$C = (\Omega^{-1} + \Phi^{T} \times \Sigma^{-1} \times \Phi)^{-1}, \qquad \mu = C \times \Phi^{T} \times \Sigma^{-1} \times Y \tag{16}$$

Note that Ω^(−1) can be computed easily because Ω is diagonal. The algorithm for optimizing f(Y|S, σ, δ, M) is as follows:
1. Initialize (S, σ, δ). It is found that a small initial δ will help the convergence.
2. Update (new s_k^2) = (old s_k^2) × μ_k^2/[(old s_k^2) − C_kk]. Do this for k = 0, 1, 2, ..., m. To prevent the round-off error when evaluating Ω^(−1) in Eq. (16), a lower bound of 10^(−10) is adopted for (new s_k^2) in this paper. Specifically, if (new s_k^2) < 10^(−10), (new s_k^2) is set to be 10^(−10).
3. Update (new σ^2) = (old σ^2) × [(Y − Φ × μ)^T Σ^(−1) (Y − Φ × μ)]/[n − trace(C × Φ^T × Σ^(−1) × Φ)], where trace(·) is the summation of the diagonals of a square matrix.
4. Update (new δ) by optimizing f(Y|S, σ, δ, M), where S and σ are fixed at their most updated values. There is no analytical solution for this 1D optimization problem. Any optimization algorithm (e.g., steepest descent) can be adopted.
5. Iterate Steps 2–4 until convergence.

Examples for Basis Function Selection

To showcase the BF selection, consider the following three examples:
1. EX1: t(z) = 100 + 200z;
2. EX2: t(z) = 100 + 100z − 1000z^2 + 1000z^3; and
3. EX3: t(z) = 100 + 100 × cos(4πz).
For all examples, the spatial variability ε(z) is with standard deviation σ = 20 and SOF δ = 0.1 (n_D = 1/0.1 = 10). Given (σ, δ), ε = [ε(z_1), ε(z_2), ..., ε(z_n)]^T can be simulated by

$$\varepsilon = L \cdot U \tag{17}$$
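The simulation in Eq. (17) and one optimization run of Steps 1 to 5 can be sketched as follows for EX1. This is a minimal sketch under stated assumptions, not the authors' implementation. All function names are hypothetical. U is a vector of independent standard normal samples and L is the Cholesky factor of Σ. The quantities C and μ of Eq. (16) are evaluated through the algebraically equivalent Woodbury form, so that (old s_k^2) − C_kk = s_k^4 ϕ_k^T A^(−1) ϕ_k and μ_k = s_k^2 ϕ_k^T A^(−1) Y with A = Φ × Ω × Φ^T + Σ, which avoids forming Ω^(−1). Other assumptions of the sketch are that all s_k are updated simultaneously within a sweep, that δ is updated by a grid search on Eq. (14) rather than a gradient method, that the initial values are chosen ad hoc, and that a fixed number of sweeps replaces a convergence check.

```python
import numpy as np


def shifted_legendre_matrix(z, m):
    """Design matrix Phi of Eq. (7) with unit-norm shifted Legendre BFs."""
    x = 2.0 * np.asarray(z, dtype=float) - 1.0           # map [0, 1] to [-1, 1]
    P = np.ones((x.size, m + 1))
    if m >= 1:
        P[:, 1] = x
    for k in range(1, m):                                 # Bonnet recurrence
        P[:, k + 1] = ((2 * k + 1) * x * P[:, k] - k * P[:, k - 1]) / (k + 1)
    return P * np.sqrt(2.0 * np.arange(m + 1) + 1.0)      # scale by (2k + 1)^0.5


def corr_matrix(z, delta):
    """Correlation matrix R with entries exp(-2|zi - zj|/delta); Sigma = sigma^2 R."""
    z = np.asarray(z, dtype=float)
    return np.exp(-2.0 * np.abs(z[:, None] - z[None, :]) / delta)


def simulate(z, trend, sigma, delta, rng):
    """Simulate Y = t(z) + eps with eps = L U, Eqs. (1) and (17)."""
    L = np.linalg.cholesky(sigma**2 * corr_matrix(z, delta))   # Sigma = L L^T
    U = rng.standard_normal(len(z))                             # standard normals
    return trend(z) + L @ U


def log_evidence(Y, Phi, s2, sigma2, R):
    """Logarithm of the conditional evidence in Eq. (14)."""
    A = (Phi * s2) @ Phi.T + sigma2 * R                   # Phi Omega Phi^T + Sigma
    _, logdet = np.linalg.slogdet(A)
    return -0.5 * (Y.size * np.log(2.0 * np.pi) + logdet + Y @ np.linalg.solve(A, Y))


def sbl_single_run(Y, z, m=20, n_sweeps=50, floor=1e-10):
    """One optimization run over (S, sigma, delta); thresholding is not included."""
    Phi = shifted_legendre_matrix(z, m)
    n = Y.size
    s2 = np.full(m + 1, np.mean(Y**2))                    # Step 1: assumed initial S
    sigma2 = np.var(Y)                                    # assumed initial sigma^2
    delta_grid = np.geomspace(0.02, 1.0, 30)              # assumed search range
    R_grid = [corr_matrix(z, d) for d in delta_grid]
    j = 0                                                 # start from the smallest delta
    for _ in range(n_sweeps):
        R = R_grid[j]
        # Step 2: new s_k^2 = old s_k^2 * mu_k^2 / (old s_k^2 - C_kk)
        G = np.linalg.solve((Phi * s2) @ Phi.T + sigma2 * R, Phi)   # A^-1 Phi
        q = G.T @ Y                                       # phi_k^T A^-1 Y
        h = np.sum(Phi * G, axis=0)                       # phi_k^T A^-1 phi_k > 0
        s2 = np.maximum(s2 * q**2 / h, floor)             # lower bound of 1e-10
        # Step 3: sigma^2 update with C and mu refreshed
        G = np.linalg.solve((Phi * s2) @ Phi.T + sigma2 * R, Phi)
        mu = s2 * (G.T @ Y)                               # posterior mean of W
        gamma = s2 * np.sum(Phi * G, axis=0)              # sums to trace(C Phi^T Sigma^-1 Phi)
        r = Y - Phi @ mu
        sigma2 = (r @ np.linalg.solve(R, r)) / (n - gamma.sum())
        # Step 4: 1D search over delta with S and sigma fixed
        j = int(np.argmax([log_evidence(Y, Phi, s2, sigma2, Rd) for Rd in R_grid]))
    return s2, sigma2, delta_grid[j]


rng = np.random.default_rng(0)
z = np.linspace(0.0, 1.0, 101)
Y = simulate(z, lambda d: 100.0 + 200.0 * d, sigma=20.0, delta=0.1, rng=rng)
s2, sigma2, delta = sbl_single_run(Y, z)
print(np.round(np.sqrt(s2), 3), np.sqrt(sigma2), delta)
```

The printed values depend on the random realization and on the assumed initial values and grid, so they should not be expected to reproduce the figures of the paper.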

In Eq. (17), U = [U_1, U_2, ..., U_n]^T contains independent standard normal samples, and L is the Cholesky decomposition of the covariance matrix Σ (Σ = L × L^T) in Eq. (11). Further given the trend function t(z), Eq. (1) can be adopted to simulate the Y = (y_1, y_2, ..., y_n) data. Figs. 2(a), 3(a), and 4(a) show the simulated Y data (dashed lines). The Y data are obtained at equally spaced depths (z_1, z_2, ..., z_n) = (0, 0.01, 0.02, ..., 1.0). Although there are 101 data points in Y, those data points are correlated. Effectively, there are only approximately n_D = 10 equivalent independent data points in Y. Ching et al. (2016c) defined the normalized sampling interval to be Δ = (sampling interval)/δ. They showed that if Δ is larger than 0.5, δ may become unidentifiable. For the three examples above, Δ is equal to 0.1, so δ is identifiable. Only one realization of Y is used to select the BFs. The shifted Legendre polynomials with order up to 20 (m = 20) are adopted as the BFs: M = {ϕ_0, ϕ_1, ..., ϕ_20}. Figs. 2(b), 3(b), and 4(b) show the optimal solutions for S* = (s_0*, s_1*, ..., s_20*) after one optimization run. For all examples, many s_k* values are effectively zero: they attain the lower limit 10^(−10). These BFs should be deactivated in Step 2 (Bayesian analysis). However, some unnecessary high-order BFs are still active. For instance, the actual trend for EX2 is a third-order polynomial. This means that ϕ_k with k > 3 is unnecessary. However, ϕ_4, ϕ_7, ϕ_8, etc. in Fig. 3(b) are still active. If all these unnecessary BFs are incorporated, they may fit the spatial variability ε(z), i.e., there may be overfitting. Therefore a further thresholding procedure is proposed in this study as follows.

Fig. 2. (a) One simulated realization for EX1; (b) optimal solutions for (s_0, s_1, ..., s_20) after the first optimization iteration; (c) final solutions for (s_0, s_1, ..., s_20)

Some remaining BFs are insignificant in the sense that their s_k* values are substantially less than the optimal solution for σ indicated by the horizontal dashed lines in Figs. 2(b), 3(b), and 4(b). Because all BFs have unit norm, it is meaningful to compare between s_k* and σ*. These BFs have insignificant contributions to the trend and are likely to fit the spatial variability, hence they are further deactivated for the reason of robustness. In this study, 50% of σ* is taken to be the threshold of whether or not a BF has significant contribution. This 50% ratio threshold is adopted based on numerical experience: BFs below this ratio tend to fit the spatial variability, not the trend. The optimization algorithm should be executed again without these further deactivated BFs to update the optimal solution (S*, σ*, δ*). The updated σ* typically will be higher than its previous estimate because the overfitting is suppressed by removing insignificant BFs so that the residual becomes more significant. It is then possible that the updated s_k* values for some BFs become less than 50% of the

updated σ*. If this happens, these BFs are further deactivated and the optimization is executed again. The iterative process stops when there are no more BFs with an updated s_k* that is less than 50% of the updated σ*. Step 1 usually takes less than five optimization iterations. Figs. 2(c), 3(c), and 4(c) show the final optimal solutions for S* = (s_0*, s_1*, ..., s_20*). The remaining BFs with non-zero s_k* are fairly sparse. Two BFs {ϕ_0, ϕ_1} are finally selected for EX1, three {ϕ_0, ϕ_2, ϕ_3} for EX2, and three {ϕ_0, ϕ_4, ϕ_6} for EX3. Step 1 only selects the BFs. The trend function will be estimated in Step 2.

To illustrate whether or not the iterative optimization and thresholding in Step 1 can mitigate overfitting, consider EX2 in Fig. 3. Suppose that Step 1 only consists of a single optimization without the thresholding with respect to 50% of σ*. Fig. 3(b) shows the optimal solutions for S* = (s_0*, s_1*, ..., s_20*) for this single optimization. This single optimization still suffers from overfitting, as seen from the t(z) samples conditioning on (S*, σ*, δ*) and the Y data [see the grey dotted lines in Fig. 5(a)]. Such t(z) samples can be readily drawn because W is multivariate normal conditioning on (S*, σ*, δ*) and Y (multivariate normality will be discussed in a later section). The overfitting in Fig. 5(a) is evident because the t(z) samples follow the Y data closely.

Now consider the full Step 1 that consists of a series of iterative optimization and thresholding operations. Fig. 3(c) shows the final optimal solutions for S* = (s_0*, s_1*, ..., s_20*). Many of them are zero (the sparsity). The t(z) samples conditioning on this (S*, σ*, δ*) and Y are plotted in Fig. 5(b) as the grey dotted lines. It is evident that the overfitting is effectively suppressed. These results suggest that simply maximizing the conditional evidence f(Y|S, σ, δ, M) using a single optimization cannot effectively mitigate overfitting. However, overfitting can be effectively mitigated by adopting the iterative optimization and thresholding procedure in Step 1.

Fig. 5. t(z) samples conditioning on (S*, σ*, δ*) and Y: (a) (S*, σ*, δ*) obtained by a single optimization; (b) (S*, σ*, δ*) obtained at the end of Step 1

To further illustrate whether or not Step 1 can mitigate overfitting, let M′ denote the model class that contains the BFs selected in Step 1. Let M* denote the model class maximizing the model evidence f(Y|M_j) in Eq. (12). It is demonstrated that M′ is close to M* in the sense that f(Y|M′) is close to f(Y|M*), i.e., M′ is close to the optimal solution for the model evidence. Recall that maximizing the model evidence f(Y|M_j) can effectively mitigate overfitting. Because Step 1 is close to optimality for the model evidence, it is expected that Step 1 can also effectively mitigate overfitting. To illustrate that M′ is close to M*, consider a reduced M which contains the first ten shifted Legendre polynomials (m = 9). A smaller m is taken herein to make it feasible to search over the 2^(m+1) = 1,024 possible model classes. For each model class M_j, the model evidence f(Y|M_j) in Eq. (12) is estimated using the transitional Markov chain Monte Carlo (TMCMC) method (Ching and Chen 2007; Ching and Wang 2016a), producing 1,024 model evidence estimates. Fig. 6(a) shows the histogram

Fig. 6. Histograms of the 1,024 ln½fðYjMj Þ estimates: (a) EX1; (b) EX2; (c) EX3; the first 10 shifted Legendre polynomials (m ¼ 9) are adopted
for the 1,024 ln[f(Y|M_j)] estimates for the EX1 data. The vertical solid line indicates the location of the optimal log model evidence, −316.24, which corresponds to the optimal model class for EX1 from the viewpoint of the Bayesian model class selection. The vertical dashed line indicates the location of the log model evidence for M′ = {ϕ_0, ϕ_1} selected by Step 1. The log model evidence for M′ is fairly close to the optimal log model evidence. Figs. 6(b and c) show the histograms for the 1,024 ln[f(Y|M_j)] estimates for the EX2 and EX3 data. For EX2 and EX3, M′ happens to be identical to the optimal model class.

Step 2: Bayesian Analysis

After the model class M′ is selected in Step 1, Bayesian analysis can be performed in Step 2 to draw posterior samples of (W, ln σ, ln δ) conditioning on the Y data. Because the full model class M is reduced to M′ after Step 1, the posterior PDF of interest is f(W′, ln σ, ln δ|Y, M′), where W′ contains the weights for the selected BFs only. The matrices Φ and Ω reduce to Φ′ and Ω′ as well. For EX2 in Fig. 3, {ϕ_0, ϕ_2, ϕ_3} are selected in Step 1. Therefore, W′ = (w_0, w_2, w_3)^T, S′ = (s_0, s_2, s_3),

$$
\Phi' = \begin{bmatrix} \phi_0(z_1) & \phi_2(z_1) & \phi_3(z_1) \\ \phi_0(z_2) & \phi_2(z_2) & \phi_3(z_2) \\ \vdots & \vdots & \vdots \\ \phi_0(z_n) & \phi_2(z_n) & \phi_3(z_n) \end{bmatrix}, \qquad \Omega' = \begin{bmatrix} s_0^2 & & \\ & s_2^2 & \\ & & s_3^2 \end{bmatrix} \tag{18}
$$

and Eq. (6) now becomes

$$
Y = \Phi' \times W' + \varepsilon \tag{19}
$$

In Step 2, s_k is not treated as a variable but is set to be a fixed large number (hence the diagonals of Ω′ are fixed large numbers as well). This means that a flat prior PDF is assigned to each weight w_k. If s_k is set to its optimal solution from Step 1, the trend estimates obtained in Step 2 will not be significantly affected. However, the estimated model evidence f(Y|M′), a by-product of Step 2 that is to be introduced later, will be misleading. This model evidence estimate is essential for the selection of basis type (e.g., selection among polynomial, shifted Legendre, and Fourier cosine bases) and for the selection of autocorrelation model. Recall that f(Y|M′) in Eq. (12) is evaluated under a flat prior f(W|M_j) for which s_k is set to be a fixed large number. By adopting the flat prior, the model evidences for different model classes are evaluated based on the same W prior PDF. This puts the comparison among different model classes on the same ground. If s_k is set to its optimal value from Step 1 rather than to a fixed large number, the resulting model evidence estimates for various model classes are not based on the same W priors. In fact, numerical evidence shows that the model class with overfitting s_k tends to have a higher model evidence.

Drawing posterior samples from the high-dimensional posterior PDF f(W′, ln σ, ln δ|Y, M′) is challenging. This study proposes an efficient method to draw the posterior samples of (W′, ln σ, ln δ) to circumvent the difficulty for high-dimensional sampling: (1) (ln σ, ln δ) samples are first drawn from f(ln σ, ln δ|Y, M′); many sampling methods can become inefficient for high-dimensional PDFs, but f(ln σ, ln δ|Y, M′) is always two-dimensional; and (2) W′ samples are then drawn from the multivariate normal PDF f(W′|σ, δ, Y, M′), where σ and δ are fixed at their sampled values. This PDF is potentially high dimensional, but samples from a multivariate normal PDF can be easily obtained.

Drawing (ln σ, ln δ) Samples from f(ln σ, ln δ|Y, M′)

The posterior PDF f(ln σ, ln δ|Y, M′) can be written as

$$
f(\ln\sigma, \ln\delta \mid Y, M') = \frac{f(Y \mid \sigma, \delta, M')\, f(\ln\sigma \mid M')\, f(\ln\delta \mid M')}{f(Y \mid M')} \tag{20}
$$

where f(Y|σ, δ, M′) has the following analytical expression [similar to Eq. (14)]:

$$
f(Y \mid \sigma, \delta, M') = \frac{1}{\left(\sqrt{2\pi}\right)^{n}} \times \frac{1}{\sqrt{\left|\Phi' \Omega' \Phi'^{T} + \Sigma\right|}} \times \exp\left[-\frac{1}{2}\, Y^{T} \left(\Phi' \Omega' \Phi'^{T} + \Sigma\right)^{-1} Y\right] \tag{21}
$$

The denominator f(Y|M′) in Eq. (20) is the model evidence of M′. Recall that ln(σ) and ln(δ) have non-informative flat prior PDFs, hence f(ln σ|M′) and f(ln δ|M′) are constant functions. Therefore

$$
f(\ln\sigma, \ln\delta \mid Y, M') \propto \frac{1}{\sqrt{\left|\Phi' \Omega' \Phi'^{T} + \Sigma\right|}} \times \exp\left[-\frac{1}{2}\, Y^{T} \left(\Phi' \Omega' \Phi'^{T} + \Sigma\right)^{-1} Y\right] \tag{22}
$$

With Eq. (22), the MCMC method (Metropolis et al. 1953; Hastings 1970; Wang and Cao 2013; Wang et al. 2016) can be adopted to obtain samples from f(ln σ, ln δ|Y, M′). In this paper, the TMCMC method (Ching and Chen 2007; Ching and Wang 2016a) is adopted to obtain the samples. An important benefit of TMCMC is that it can estimate the model evidence f(Y|M′) as a by-product without extra computation cost. Recently, Betz et al. (2016) proposed a useful modification to the original TMCMC. They found that if the TMCMC weights are adjusted after each MCMC move, the bias in the model evidence estimate will be significantly reduced (see also the discussion by Ching and Wang 2016b). This modification is adopted in the TMCMC. With the modified TMCMC method, the posterior samples for (ln σ, ln δ) can be readily drawn from f(ln σ, ln δ|Y, M′). Moreover, the model evidence f(Y|M′) can be estimated. This model evidence will be adopted for the Bayesian model class selection over different types of basis functions (e.g., polynomials and Fourier cosine functions). In order to obtain a meaningful f(Y|M′) estimate, all s_k should be set to fixed large numbers, and the diagonals of Ω′ are the squares of these large s_k.

Drawing W′ Samples from f(W′|σ, δ, Y, M′)

Let (ln σ̂, ln δ̂) denote a set of samples drawn from f(ln σ, ln δ|Y, M′). They are first converted to exponentials to obtain (σ̂, δ̂). Next, the W′ sample can be readily drawn from f(W′|σ̂, δ̂, Y, M′), a multivariate normal PDF with the following mean vector and covariance matrix:

$$
\hat{C}' = \left(\Omega'^{-1} + \Phi'^{T} \times \hat{\Sigma}^{-1} \times \Phi'\right)^{-1}, \qquad \hat{\mu}' = \hat{C}' \times \Phi'^{T} \times \hat{\Sigma}^{-1} \times Y \tag{23}
$$

where Σ̂ is associated with (σ̂, δ̂):

$$
\hat{\Sigma} = \hat{\sigma}^{2} \times \begin{bmatrix} 1 & e^{-2|z_1 - z_2|/\hat{\delta}} & e^{-2|z_1 - z_3|/\hat{\delta}} & \cdots & e^{-2|z_1 - z_n|/\hat{\delta}} \\ & 1 & e^{-2|z_2 - z_3|/\hat{\delta}} & \cdots & e^{-2|z_2 - z_n|/\hat{\delta}} \\ & & 1 & & \vdots \\ & \text{SYM.} & & \ddots & e^{-2|z_{n-1} - z_n|/\hat{\delta}} \\ & & & & 1 \end{bmatrix} \tag{24}
$$

Consider again EX2 in Fig. 3 as the example. Recall that {ϕ_0, ϕ_2, ϕ_3} are finally selected. Suppose (σ̂, δ̂) has been drawn using TMCMC. The companion sample Ŵ′ = (ŵ_0, ŵ_2, ŵ_3) can then be drawn from a multivariate normal PDF with the mean vector and covariance matrix defined in Eq. (23). Finally, the sample for the trend function, denoted by t̂(z), can be obtained:

$$
\hat{t}(z) = \Phi' \times \hat{W}' = \hat{w}_0 \phi_0(z) + \hat{w}_2 \phi_2(z) + \hat{w}_3 \phi_3(z) \tag{25}
$$

To illustrate Step 2, the Y data points for EX2 in Fig. 3(a) (dashed line) are analyzed. Because {ϕ_0, ϕ_2, ϕ_3} are selected in Step 1, the TMCMC method is adopted to obtain samples from f(ln σ, ln δ|Y, M′) with these three BFs only in Step 2. The TMCMC stage sample size is taken to be 5,000. In Step 2, the standard deviation s_k is set to be a fixed large number [exp(10) ≈ 22,000]. The prior PDFs for ln(σ) and ln(δ) are uniform over the intervals [ln(5), ln(50)] and [ln(0.01), ln(1)], respectively. For each set of (σ̂, δ̂) samples, a companion set of Ŵ′ samples is drawn and a trend function sample t̂(z) is obtained using Eq. (25). Therefore there are 5,000 trend function samples as well as 5,000 (σ̂, δ̂) samples. These samples reflect the statistical uncertainties in the trend function t(z) and in (σ, δ).

Fig. 7(b) shows the median and the 95% confidence interval for the 5,000 trend function samples. The actual trend function t(z) is shown for comparison. Note that the results presented here are based on one realization of the Y data. This corresponds to the most extreme site-investigation scenario where only a single CPT sounding is available. The effect of multiple CPT soundings on the identification of t(z) and (σ, δ) is presented in Ching et al. (2016d). The 95% confidence interval for the trend function in Fig. 7(b) is a depthwise confidence interval. For instance, Fig. 7(b) shows the 95% confidence interval for z = 0.25 as the left-hand-side error bar. The upper and lower bounds of the error bar are the 97.5th and 2.5th percentiles, respectively, of the 5,000 trend function samples evaluated at z = 0.25. The 95% confidence interval for another depth can also be plotted; for example, the right-hand-side error bar is the 95% confidence interval at z = 0.75. The dashed lines in Fig. 7(b) are the locus of the upper and lower bounds of the 95% confidence intervals at various depths. For this particular random realization of the Y data, the actual t(z = 0.25) lies within the 95% confidence interval at z = 0.25 [Fig. 7(b)]. For another random realization of Y, this does not necessarily hold, because the upper and lower bounds of the 95% confidence interval will change. The confidence interval in this paper is always constructed based on a single realization of the Y data. For each realization, the outcome of such an interval capturing the actual trend at a depth z is either yes or no. To get a sense of the chance of capturing the actual trend using this confidence-interval construction procedure, it is necessary to repeat the procedure for a large number of different Y realizations; a sample size of 100 is adopted in this paper. One hundred random realizations of the Y data show that the chance for such a confidence interval at z = 0.25 to contain the actual trend t(z = 0.25) is 0.97 and that the chance for such a confidence interval at z = 0.75 to contain the actual trend t(z = 0.75) is 0.95. Both are reasonably close to 95%. The two-step approach is deemed to be consistent in this well-defined sense that the resulting 95% Bayesian confidence interval (or region) contains the actual trend (or actual σ and δ) with a chance that is close to 0.95. It is not true that the chance for the dashed lines in Fig. 7(b) to completely enclose the entire t(z) is also about 95%; the criterion of complete enclosure is more strict, hence the chance for complete enclosure is less.

Fig. 7(c) shows the (σ̂, δ̂) samples. The actual (σ, δ) values are shown for comparison. There seems to be a strong tradeoff between (σ̂, δ̂). Ching et al. (2016c) found that (σ, δ) are unidentifiable for data with insufficient n_D. When n_D is not sufficiently large, there are many combinations of (σ, δ) that are all consistent with the Y data. For the current case, n_D = 10. For Y data with a larger n_D, the tradeoff will be less significant. The unidentifiability issue of [t(z), σ, δ] is explored in detail in Ching et al. (2016d). Ching et al. (2016c) proposed a method of constructing a 95% confidence ellipsoid for the cloud of the (σ̂, δ̂) samples. This 95% confidence ellipsoid is shown in Fig. 7(c). For this particular random realization of the Y data, the actual (σ, δ) is within the 95% confidence ellipsoid. For another random realization of Y this does not necessarily hold, because the location of the 95% confidence ellipsoid will change. One hundred random realizations of the Y data reveal that the chance for the 95% confidence ellipsoid to contain the actual (σ, δ) is 0.94. This is also reasonably close to 95%. Although not shown, the (σ̂, δ̂) samples seem to be mutually uncorrelated to the W′ samples. The (ŵ_0, ŵ_1, ŵ_2) samples seem mutually uncorrelated

Fig. 7. (a) One simulated realization for EX2: the dashed line is the simulated Y data; (b) median estimate and the 95% confidence interval for the
trend function based on the one simulated realization in (a); (c) (σ̂, δ̂) samples based on the one simulated realization in (a)
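The Step 2 sampling described above [Eqs. (22) to (25)] and the depthwise interval of Fig. 7(b) can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: a plain random-walk Metropolis sampler stands in for the TMCMC method used in the paper (so no model evidence is produced), all function names (`exp_cov`, `log_post`, `metropolis`, `draw_weights`, `depthwise_interval`) and the step size are hypothetical, and every s_k is fixed at exp(10) as in the text.

```python
import numpy as np

def exp_cov(z, sigma, delta):
    """Single exponential covariance of Eq. (24): sigma^2 * exp(-2|zi - zj| / delta)."""
    dist = np.abs(z[:, None] - z[None, :])
    return sigma**2 * np.exp(-2.0 * dist / delta)

def log_post(ln_sigma, ln_delta, Y, Phi, z, s_big=np.exp(10.0)):
    """Unnormalized log of Eq. (22), with flat priors on ln(sigma) and ln(delta)."""
    Sigma = exp_cov(z, np.exp(ln_sigma), np.exp(ln_delta))
    K = s_big**2 * (Phi @ Phi.T) + Sigma          # Phi' Omega' Phi'^T + Sigma
    _, logdet = np.linalg.slogdet(K)
    return -0.5 * logdet - 0.5 * (Y @ np.linalg.solve(K, Y))

def metropolis(Y, Phi, z, bounds, n_samples, rng, step=0.1):
    """Random-walk Metropolis on (ln sigma, ln delta); an assumed stand-in for TMCMC."""
    lo, hi = np.array(bounds).T                   # prior support of the flat priors
    x = 0.5 * (lo + hi)
    lp = log_post(x[0], x[1], Y, Phi, z)
    out = np.empty((n_samples, 2))
    for i in range(n_samples):
        cand = x + step * rng.standard_normal(2)
        if np.all(cand >= lo) and np.all(cand <= hi):   # reject moves outside the prior
            lp_c = log_post(cand[0], cand[1], Y, Phi, z)
            if np.log(rng.uniform()) < lp_c - lp:
                x, lp = cand, lp_c
        out[i] = x
    return out

def draw_weights(sigma, delta, Y, Phi, z, rng, s_big=np.exp(10.0)):
    """One draw of W' from the multivariate normal PDF of Eq. (23)."""
    Sigma = exp_cov(z, sigma, delta)
    Si_Phi = np.linalg.solve(Sigma, Phi)          # Sigma^-1 Phi'
    C = np.linalg.inv(np.eye(Phi.shape[1]) / s_big**2 + Phi.T @ Si_Phi)
    C = 0.5 * (C + C.T)                           # remove round-off asymmetry
    mu = C @ (Si_Phi.T @ Y)
    return rng.multivariate_normal(mu, C)

def depthwise_interval(trend_samples, level=0.95):
    """Lower bound, median, and upper bound at each depth (rows are samples)."""
    a = 100.0 * (1.0 - level) / 2.0
    return np.percentile(trend_samples, [a, 50.0, 100.0 - a], axis=0)
```

Given a data vector `Y`, a basis matrix `Phi` (n rows, one column per selected BF), and depths `z`, each row of the chain returned by `metropolis` is exponentiated and passed to `draw_weights`; multiplying the drawn weights by `Phi` gives one trend sample as in Eq. (25), and `depthwise_interval` then returns the bounds and median at every depth.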
as well, because the shifted Legendre polynomials are orthogonal. Furthermore, the posterior mean values for (w_0, w_1, w_2) are significantly greater than zero, although their prior mean values are zero.

Effect of Spatial Variability Amplitude

The Y data for EX2 are originally simulated with standard deviation σ = 20. To investigate the effect of a different spatial variability amplitude, the Y data are simulated with two different standard deviations σ = 10 and 40, whereas the scale of fluctuation remains δ = 0.1. For σ = 10, the chance for the confidence interval/ellipsoid to contain the actual trend/parameters is still close to 95%. For σ = 40, however, it becomes fairly challenging to estimate the trend, and the chance for the confidence interval to contain the actual trend is only about 75%. Numerical evidence shows that the discrepancy from 95% occurs because Step 1 fails to identify a suitable set of BFs in the presence of significant spatial variability. If a suitable set of BFs is selected in Step 1, the chance can be close to 95% even in the presence of significant spatial variability.

Effect of the Scale of Fluctuation

The Y data for EX2 are originally simulated with scale of fluctuation δ = 0.1 (n_D = 10). To investigate the effect of a different SOF, the Y data are simulated with two different SOFs δ = 0.02 (n_D = 50) and 0.5 (n_D = 2), whereas the standard deviation remains σ = 20. The former case, δ = 0.02, corresponds to Δ = (sampling interval)/δ = 0.01/0.02 = 0.5, hence δ is still identifiable. For δ = 0.02 (or n_D = 50), the chance for the confidence interval/ellipsoid to contain the actual trend/parameters is still close to 95%. For δ = 0.5 (or n_D = 2), it becomes fairly challenging to estimate the trend because it can be difficult to distinguish between trend and spatial variability when the spatial variability has a large SOF. The chance for the confidence interval to contain the actual trend is approximately 75–80%. Numerical evidence shows that the discrepancy from 95% occurs because Step 1 fails to identify a suitable set of BFs in the presence of the large-SOF spatial variability. If a suitable set of BFs has been selected in Step 1, the chance can be close to 95% even in the presence of the large-SOF spatial variability.

Selection for Type of Basis Functions

Shifted Legendre BFs have been considered thus far. There are other types of BFs, such as polynomial basis, Fourier cosine basis, radial basis, wavelet basis, and b-spline basis. The effectiveness of a certain basis type depends on the functional form of the actual trend function. The polynomial and shifted Legendre bases may be more effective for polynomial trends such as EX1 and EX2, whereas the Fourier cosine basis may be more effective for the cosine trend such as EX3. Recall that the TMCMC method can estimate the model evidence f(Y|M′) as a by-product. The resulting model evidence estimate will be adopted to select the basis type in this section. The selection among the polynomial basis (M′_P), shifted Legendre basis (M′_L), and Fourier cosine basis (M′_F) according to their model evidence estimates, denoted by f(Y|M′_P), f(Y|M′_L), and f(Y|M′_F), respectively, will be demonstrated. To do this, the above two-step approach is executed for each basis type for the same Y data. During the TMCMC in Step 2, the evidences f(Y|M′_P), f(Y|M′_L), and f(Y|M′_F) are estimated. The basis type with the highest evidence is the best basis type. The posterior probability of each basis type can be estimated as the normalized evidence, e.g., P(M′_L|Y) = f(Y|M′_L)/[f(Y|M′_P) + f(Y|M′_L) + f(Y|M′_F)].

The data for EX1, EX2, and EX3 are analyzed using the three types of basis: polynomial, shifted Legendre, and Fourier cosine bases. The logarithms of the resulting model evidence estimates are shown in Table 1, with the posterior probability shown in brackets. Note that EX1 and EX2 have polynomial trends and EX3 has a cosine trend. It is reasonable that the polynomial basis performs the best for EX1 and EX2 and that the Fourier cosine basis performs the best for EX3. To illustrate that the selection result is sensible, consider EX3. The polynomial basis performs poorly for EX3 with a very low evidence. Fig. 8(a) further shows the median estimate and the 95% confidence intervals for the trend function of EX3 based on the polynomial basis. It is clear that the Y data trend is not even captured. The chance that the actual trend t(z) is within the confidence interval at a fixed z location becomes significantly lower than 95%. In contrast, the shifted Legendre and Fourier cosine bases can capture the Y data trend [Figs. 8(b and c)]. The Fourier cosine basis not only captures the Y data trend, but the resulting 95% confidence interval is fairly narrow. The above observations are consistent with the fact that the polynomial basis has the lowest evidence whereas the Fourier cosine basis has the highest evidence. For all examples, the shifted Legendre basis never performs poorly in the sense that the median estimate fails to capture the general shape of the Y data [such as in Fig. 8(a)]. In fact, the evidence for the shifted Legendre basis is usually not the lowest among the three basis types.

The model evidence estimate can also be adopted to select the autocorrelation model. Although only the single exponential

Table 1. Logarithms of the Model Evidence Estimates
Entries are the ln[f(Y|M′)] estimate, with the P(M′|Y) estimate in brackets.

Trend function | Polynomial M′_P | Shifted Legendre M′_L | Fourier cosine M′_F
EX1: t(z) = 100 + 200z | −322.14 [0.52] | −322.86 [0.25] | −322.96 [0.23]
EX2: t(z) = 100 + 100z − 1,000z^2 + 1,000z^3 | −328.15 [0.79] | −329.50 [0.21] | −335.74 [0.00]
EX3: t(z) = 100 + 100 × cos(4πz) | −336.03 [0.00] | −331.45 [0.00] | −326.08 [1.00]
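The bracketed posterior probabilities in Table 1 follow from the log evidences by the normalized-evidence formula given in the text. A minimal Python sketch is shown below; the function name `posterior_probabilities` is hypothetical, equal prior probabilities over the three basis types are assumed (as the normalized-evidence formula implies), and the largest log evidence is subtracted before exponentiating, a standard precaution against underflow that is an addition here rather than something stated in the paper.

```python
import math

def posterior_probabilities(log_evidences):
    """Normalize evidences given as logarithms, assuming equal prior probabilities."""
    shift = max(log_evidences)                      # avoids underflow in exp()
    weights = [math.exp(v - shift) for v in log_evidences]
    total = sum(weights)
    return [w / total for w in weights]

# Log evidences from Table 1: polynomial, shifted Legendre, Fourier cosine
table1 = {
    "EX1": [-322.14, -322.86, -322.96],
    "EX2": [-328.15, -329.50, -335.74],
    "EX3": [-336.03, -331.45, -326.08],
}

for name, logs in table1.items():
    print(name, [round(p, 2) for p in posterior_probabilities(logs)])
# EX1 [0.52, 0.25, 0.23]
# EX2 [0.79, 0.21, 0.0]
# EX3 [0.0, 0.0, 1.0]
```

Because only differences between log evidences enter the result, a gap of about 1.35 (EX2, polynomial versus shifted Legendre) already translates into a probability ratio of roughly 4 to 1.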
Fig. 8. Median estimates and the 95% confidence intervals for the trend function in EX3: (a) polynomial basis; (b) shifted Legendre basis;
(c) Fourier cosine basis
model in Eq. (8) is adopted thus far, other autocorrelation models such as squared exponential, cosine exponential, and second order Markov models (Uzielli et al. 2005) can also be adopted. By comparing the model evidence estimates produced by different autocorrelation models, a suitable autocorrelation model can be selected.

Real Case History

Piezocone test (CPTU) data from the nearshore of the eastern part of Singapore is analyzed in this section. The water depth is approximately 2 m. The first 2 m of the seabed is very soft surface sedimentation, followed by upper marine clay, fluvial clay, and lower marine clay. Fig. 9 shows the normalized cone resistance (Q_tn) data. The vertical data interval is 0.02 m. The normalized cone resistance is defined as (Robertson 2009)

$$
Q_{tn} = \left[(q_t - \sigma_{v0})/P_a\right] \times \left(P_a/\sigma'_{v0}\right)^{n} \tag{26}
$$

where q_t = (corrected) cone resistance; P_a = 101.3 kN/m^2 (one atmospheric pressure); σ′_v0 and σ_v0 = effective and total overburden stresses, respectively; and the exponent n varies from 0.5 to 1 depending on the soil behavior type (n = 1 for claylike soils and n = 0.5 for sandlike soils) (Robertson 2009). Fig. 9(a) also shows the stratification results based on the wavelet transform modulus maxima method (Ching et al. 2015). There are four soil layers, indexed 1 to 4 from top to bottom, with soil behavior type (Robertson 2009) annotated in the figure. The purpose is to demonstrate the use of the proposed two-step approach for the characterization of the statistical uncertainties in the trend function t(z) for the ln(Q_tn) profile as well as in its (σ, δ). It is noteworthy that for this real-world example, discrepancy between the reality and the model (such as the adopted BFs and autocorrelation function) may exist. This discrepancy cannot be reduced by increasing amount of data; i.e., it is not due to statistical uncertainty. From the Bayesian perspective, this discrepancy can be interpreted as model uncertainty. The model uncertainties in the BFs and the autocorrelation function for this real-world case study are addressed in the following analysis. The details for the analysis of Layer 2 are presented below.

Fig. 9(b) shows the ln(Q_tn) data points for Layer 2. Originally, the data points were defined on the interval z = [2.34 m, 11.92 m]. However, it is well known that CPTU data has a transition zone, and the data within the transition zone do not truly represent the current layer. For the current case study the transition zone was taken to be 0.3 m on each side of the layer boundary based on visual inspection. The Y = ln(Q_tn) data points excluding the transition zones were analyzed, so the depth range for the Y data became z = [2.64 m, 11.62 m]. This depth range was translated to [0, 1] [Fig. 9(c)] by z = [(original z) − 2.64]/(11.62 − 2.64). The translated Y data was analyzed using the two-step approach. The database collected by Phoon and Kulhawy (1999) indicates that the coefficient of variation (COV) for the inherent variability of soil properties roughly ranges from 0.05 to 0.5. This implies that the standard deviation (σ) of the logarithm for the inherent variability also ranges approximately from 0.05 to 0.5. Phoon and Kulhawy (1999) indicated that the vertical SOF ranges approximately from 0.1 m to 10 m. The Y data was then translated into the [0, 1] interval (depth was scaled by 11.62 − 2.64 = 8.98 m), hence the SOF for the translated Y data should range from (0.1 m)/(8.98 m) = 0.011 to (10 m)/(8.98 m) = 1.11. As a result, the prior PDFs for ln(σ) and ln(δ) were uniform over the intervals [ln(0.05), ln(0.5)] and [ln(0.011), ln(1.11)], respectively. The two-step approach was executed three times with the polynomial basis, shifted Legendre basis, and Fourier cosine basis, producing three model evidence estimates. For Layer 2, the polynomial basis had the highest model evidence. The polynomial basis was therefore selected to obtain the median estimate and 95% confidence interval for the trend function as well as the posterior samples of (σ, δ). Note that the δ samples

Fig. 9. (a) ln(Q_tn) data for the eastern Singapore site; (b) ln(Q_tn) data for Layer 2; (c) translated ln(Q_tn) data for Layer 2 (excluding the transition zone)
need to be multiplied by 8.98 m to convert them back to the original physical depth scale. Different autocorrelation models, including the single exponential, squared exponential, cosine exponential, second order Markov, and triangular models, were also investigated. The single exponential model in Eq. (8) produced the maximum model evidence for all layers, hence it was selected for all the analyses.

Fig. 10 shows the resulting median estimate and 95% confidence interval for the trend function as well as the posterior samples of (σ, δ). The ln(Q_tn) data points in each soil layer were analyzed independently with respect to the other soil layers. The polynomial basis was optimal for Layers 2 and 4, and the shifted Legendre basis was optimal for Layers 1 and 3. The (σ, δ) samples for most layers seemed to have a strong tradeoff except for Layer 2. Layer 2 was considered to be relatively thick because its n_D = thickness/SOF was large. Its thickness was 8.98 m (excluding the transition zone) and its δ samples ranged from 0.1 to 0.2 m. Therefore its n_D ranged from 45 to 90. The ln(Q_tn) data were also reanalyzed by the two-step approach but with the assumption of uncorrelated residuals [the original setting in Tipping (2001)]. The resulting trend estimates were fairly different from those seen in the left plot in Fig. 10. The resulting median estimate for the trend closely followed the Y data and the resulting 95% confidence intervals were very narrow; for Layer 2, for example, the 95% confidence interval was about seven times narrower than the ones seen in Fig. 10. The uncertainty in the trend was greatly underestimated with the uncorrelated residual assumption.
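The depth normalization and the resulting quantities quoted for Layer 2 (scaled SOF prior bounds and the n_D range) are simple arithmetic, reproduced below as a small Python check. The helper names `scale_depth` and `n_d` are hypothetical and introduced only for illustration; the numbers are those stated in the text.

```python
def scale_depth(z, z_top, z_bottom):
    """Map a physical depth (m) onto the [0, 1] interval used in the analysis."""
    return (z - z_top) / (z_bottom - z_top)

def n_d(thickness, sof):
    """Number of scales of fluctuation contained in the layer thickness."""
    return thickness / sof

z_top, z_bottom = 2.64, 11.62          # Layer 2 limits, transition zones excluded (m)
thickness = z_bottom - z_top           # 8.98 m

# Vertical SOF range of 0.1 m to 10 m expressed on the normalized depth scale
sof_prior_scaled = (0.1 / thickness, 10.0 / thickness)      # about (0.011, 1.11)

# Posterior delta samples of 0.1 m to 0.2 m give the n_D range quoted for Layer 2
n_d_range = (n_d(thickness, 0.2), n_d(thickness, 0.1))      # about (45, 90)

# A delta sample on the normalized scale converts back to metres by multiplying by thickness
delta_physical = 0.0167 * thickness    # roughly 0.15 m for a normalized sample of 0.0167
```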
Fig. 10. Analysis results for the eastern Singapore site: (a) median and 95% confidence interval for the trend function; (b–e) the posterior samples of (σ, δ) for the four soil layers

Conclusions

This study tries to fully characterize the statistical uncertainty in the unknown trend function for the vertical spatial distribution of geotechnical site investigation data. Previous studies have shown that ignoring the statistical uncertainty in the trend function may lead to unconservative reliability estimation. Although the characterization of uncertain trend parameters (e.g., gradient and intercept) has been addressed in the literature, the characterization of the uncertain functional form for the trend function has not been addressed in the literature. The trend function is usually buried in spatial variability, and it may not be realistic to determine its functional form by visual inspection. To the authors' best knowledge, methods that

characterize both uncertainties (uncertainties in parameters and in functional form) do not yet exist.

The current paper proposes a two-step approach to characterize the uncertainties in all parameters, including the functional form of the trend, within a consistent Bayesian framework. In Step 1, an iterative optimization procedure is proposed to select the optimal set of the BFs that can effectively represent the actual trend function. The solution for Step 1 is sparse, meaning that typically only a few BFs are necessary to effectively represent the actual trend. To retain the critical sparsity feature in the presence of correlated residuals, an appropriate thresholding scheme for the BFs is proposed. In Step 2, an efficient method based on the TMCMC method is proposed to obtain the high-dimensional posterior samples for the uncertain parameters, including the uncertain trend and uncertain standard deviation and SOF for the spatial variability. Simulation studies show that the two-step approach is generally effective for characterizing the statistical uncertainties in trend parameters and in functional form, in the well-defined sense that the resulting 95% Bayesian confidence intervals enclose the actual parameters/trend with a chance that is reasonably close to 0.95. An exception can happen when the BFs are poorly selected in Step 1. This can be partly mitigated by the Bayesian model class selection over different types of basis, such as polynomial basis, Fourier cosine basis, radial basis, wavelet basis, and b-spline basis.

Appendix. Analytical Solutions for 1D Optimization Problems

The optimization problem for f(Y|S, σ, δ, M) can be decomposed into (m + 3) 1D optimization problems. This appendix derives the analytical solutions for (m + 2) of them: the optimization with respect to s_k (k = 0, 1, ..., m) and the one with respect to σ. The 1D optimization with respect to δ does not have an analytical solution.

For the 1D optimization with respect to s_k, it is convenient to optimize ln[f(Y|S, σ, δ, M)] with respect to s_k^(−2):

$$
\ln f(Y \mid S, \sigma, \delta, M) = -0.5 \times \ln\left(\left|\Phi \Omega \Phi^{T} + \Sigma\right|\right) - \frac{1}{2}\, Y^{T} \left(\Phi \Omega \Phi^{T} + \Sigma\right)^{-1} Y \tag{27}
$$

The optimal s_k satisfies ∂ ln[f(Y|S, σ, δ, M)]/∂s_k^(−2) = 0. The partial derivative for the first term yields

$$
\begin{aligned}
\frac{\partial \ln\left(\left|\Phi \Omega \Phi^{T} + \Sigma\right|\right)}{\partial s_k^{-2}}
&= \frac{\partial\left[\ln\left(\left|\Phi^{T} \Sigma^{-1} \Phi + \Omega^{-1}\right|\right) + \ln(|\Omega|) + \ln(|\Sigma|)\right]}{\partial s_k^{-2}} \\
&= \operatorname{trace}\left[\left(\Phi^{T} \Sigma^{-1} \Phi + \Omega^{-1}\right)^{-1} \times \frac{\partial\left(\Phi^{T} \Sigma^{-1} \Phi + \Omega^{-1}\right)}{\partial s_k^{-2}}\right] + \frac{\partial \ln(|\Omega|)}{\partial s_k^{-2}} \\
&= \operatorname{trace}\left[C \times \frac{\partial\left(\Phi^{T} \Sigma^{-1} \Phi + \Omega^{-1}\right)}{\partial s_k^{-2}}\right] + \frac{\partial \ln(|\Omega|)}{\partial s_k^{-2}} \\
&= \operatorname{trace}\left[C \times \frac{\partial \Omega^{-1}}{\partial s_k^{-2}}\right] + \frac{\partial \ln(|\Omega|)}{\partial s_k^{-2}} = C_{kk} - s_k^{2}
\end{aligned} \tag{28}
$$

where C is defined in Eq. (16). Here, the matrix determinant identity |A + B × D| = |A| × |I + D × A^(−1) × B| and the partial derivative identity ∂ ln(|A|)/∂x = trace[A^(−1) × (∂A/∂x)] have been implemented. The partial derivative for the second term yields

$$
\begin{aligned}
\frac{\partial\left[Y^{T} \left(\Phi \Omega \Phi^{T} + \Sigma\right)^{-1} Y\right]}{\partial s_k^{-2}}
&= \frac{\partial\left[Y^{T} \Sigma^{-1} Y - Y^{T} \Sigma^{-1} \Phi \left(\Phi^{T} \Sigma^{-1} \Phi + \Omega^{-1}\right)^{-1} \Phi^{T} \Sigma^{-1} Y\right]}{\partial s_k^{-2}} \\
&= Y^{T} \Sigma^{-1} \Phi \left(\Phi^{T} \Sigma^{-1} \Phi + \Omega^{-1}\right)^{-1} \times \frac{\partial\left(\Phi^{T} \Sigma^{-1} \Phi + \Omega^{-1}\right)}{\partial s_k^{-2}} \times \left(\Phi^{T} \Sigma^{-1} \Phi + \Omega^{-1}\right)^{-1} \Phi^{T} \Sigma^{-1} Y \\
&= \mu^{T} \times \frac{\partial\left(\Omega^{-1}\right)}{\partial s_k^{-2}} \times \mu = \mu_k^{2}
\end{aligned} \tag{29}
$$

where μ is defined in Eq. (16). Here, the matrix inversion lemma (A × B × A^T + D)^(−1) = D^(−1) − D^(−1) × A × (A^T × D^(−1) × A + B^(−1))^(−1) × A^T × D^(−1) and the partial derivative identity ∂(A^(−1))/∂x = −A^(−1) × ∂(A)/∂x × A^(−1) have been implemented. Combining Eqs. (27)–(29), it is clear that

$$
\frac{\partial \ln f(Y \mid S, \sigma, \delta, M)}{\partial s_k^{-2}} = -\frac{1}{2} \times \left(C_{kk} - s_k^{2}\right) - \frac{1}{2} \times \mu_k^{2} \tag{30}
$$

Solving Eq. (30) for zero gives (new s_k^2) = μ_k^2 + C_kk. Note that both C and μ are based on (old s_k). Tipping (2001) indicated that if Eq. (30) is written as

$$
\frac{\partial \ln f(Y \mid S, \sigma, \delta, M)}{\partial s_k^{-2}} = -\frac{1}{2} \times \left(\frac{C_{kk}}{s_k^{2}} - 1\right) \times s_k^{2} - \frac{1}{2} \times \mu_k^{2} \tag{31}
$$

with the first s_k being old and the second being new, the convergence is much faster. This gives

$$
\left(\text{new } s_k^{2}\right) = \frac{\mu_k^{2}}{1 - C_{kk}/\left(\text{old } s_k^{2}\right)} = \frac{\left(\text{old } s_k^{2}\right) \times \mu_k^{2}}{\left(\text{old } s_k^{2}\right) - C_{kk}} \tag{32}
$$

For the 1D optimization with respect to σ, it is also convenient to optimize ln[f(Y|S, σ, δ, M)] with respect to σ^(−2). The partial derivative for the first term yields

$$
\begin{aligned}
\frac{\partial \ln\left(\left|\Phi \Omega \Phi^{T} + \Sigma\right|\right)}{\partial \sigma^{-2}}
&= \frac{\partial\left[\ln\left(\left|\Phi^{T} \Sigma^{-1} \Phi + \Omega^{-1}\right|\right) + \ln(|\Omega|) + \ln(|\Sigma|)\right]}{\partial \sigma^{-2}} \\
&= \operatorname{trace}\left[\left(\Phi^{T} \Sigma^{-1} \Phi + \Omega^{-1}\right)^{-1} \times \frac{\partial\left(\Phi^{T} \Sigma^{-1} \Phi + \Omega^{-1}\right)}{\partial \sigma^{-2}}\right] + \frac{\partial \ln(|\Sigma|)}{\partial \sigma^{-2}} \\
&= \operatorname{trace}\left(C \times \Phi^{T} \times R^{-1} \times \Phi\right) - n \times \sigma^{2}
\end{aligned} \tag{33}
$$

where R = Σ/σ^2 is the correlation matrix. The partial derivative for the second term yields

$$
\begin{aligned}
\frac{\partial\left[Y^{T} \left(\Phi \Omega \Phi^{T} + \Sigma\right)^{-1} Y\right]}{\partial \sigma^{-2}}
&= Y^{T}\, \frac{\partial\left[\left(\Phi \Omega \Phi^{T} + \Sigma\right)^{-1}\right]}{\partial \sigma^{-2}}\, Y \\
&= -Y^{T} \left(\Phi \Omega \Phi^{T} + \Sigma\right)^{-1} \frac{\partial(\Sigma)}{\partial \sigma^{-2}} \left(\Phi \Omega \Phi^{T} + \Sigma\right)^{-1} Y \\
&= Y^{T} \left[\Sigma^{-1} - \Sigma^{-1} \Phi \left(\Phi^{T} \Sigma^{-1} \Phi + \Omega^{-1}\right)^{-1} \Phi^{T} \Sigma^{-1}\right] \times \left(\sigma^{2} \Sigma\right) \left[\Sigma^{-1} - \Sigma^{-1} \Phi \left(\Phi^{T} \Sigma^{-1} \Phi + \Omega^{-1}\right)^{-1} \Phi^{T} \Sigma^{-1}\right] Y \\
&= \left[Y^{T} - Y^{T} \Sigma^{-1} \Phi \left(\Phi^{T} \Sigma^{-1} \Phi + \Omega^{-1}\right)^{-1} \Phi^{T}\right] \left(\sigma^{2} \Sigma^{-1}\right) \times \left[Y - \Phi \left(\Phi^{T} \Sigma^{-1} \Phi + \Omega^{-1}\right)^{-1} \Phi^{T} \Sigma^{-1} Y\right] \\
&= (Y - \Phi \times \mu)^{T} R^{-1} (Y - \Phi \times \mu)
\end{aligned} \tag{34}
$$

Combining Eqs. (33) and (34), it is clear that

$$
\frac{\partial \ln f(Y \mid S, \sigma, \delta, M)}{\partial \sigma^{-2}} = -\frac{1}{2} \times \left[\operatorname{trace}\left(C \times \Phi^{T} \times R^{-1} \times \Phi\right) - n \times \sigma^{2}\right] - \frac{1}{2} \times (Y - \Phi \times \mu)^{T} R^{-1} (Y - \Phi \times \mu) \tag{35}
$$

Solving Eq. (35) for zero gives

$$
\left(\text{new } \sigma^{2}\right) = \frac{(Y - \Phi \times \mu)^{T} R^{-1} (Y - \Phi \times \mu) + \operatorname{trace}\left(C \times \Phi^{T} \times R^{-1} \times \Phi\right)}{n} \tag{36}
$$

If the first R in Eq. (35) is replaced with Σ/σ^2, a more meaningful result will be obtained:

$$
\frac{\partial \ln f(Y \mid S, \sigma, \delta, M)}{\partial \sigma^{-2}} = -\frac{1}{2} \times \left[\sigma^{2} \times \operatorname{trace}\left(C \times \Phi^{T} \times \Sigma^{-1} \times \Phi\right) - n \times \sigma^{2}\right] - \frac{1}{2} \times (Y - \Phi \times \mu)^{T} R^{-1} (Y - \Phi \times \mu) \tag{37}
$$

Solving Eq. (37) for zero gives

$$
\left(\text{new } \sigma^{2}\right) = \frac{(Y - \Phi \times \mu)^{T} R^{-1} (Y - \Phi \times \mu)}{n - \operatorname{trace}\left(C \times \Phi^{T} \times \Sigma^{-1} \times \Phi\right)} = \frac{\left(\text{old } \sigma^{2}\right) \times (Y - \Phi \times \mu)^{T} \Sigma^{-1} (Y - \Phi \times \mu)}{n - \operatorname{trace}\left(C \times \Phi^{T} \times \Sigma^{-1} \times \Phi\right)} \tag{38}
$$

The denominator n − trace(C × Φ^T × Σ^(−1) × Φ) can be interpreted as the (effective) degrees of freedom.

Acknowledgments

The authors gratefully acknowledge Kiso Jiban Consultant Co. Ltd. for providing the piezocone sounding at the eastern part of Singapore as a test example. The authors are also grateful for the valuable constructive review comments from the reviewers.

References

Beck, J., and Yuen, K. (2004). “Model selection using response measurements: Bayesian probabilistic approach.” ASCE J. Eng. Mech., 10.1061/(ASCE)0733-9399(2004)130:2(192), 192–203.
Betz, W., Papaioannou, I., and Straub, D. (2016). “Transitional Markov chain Monte Carlo: Observations and improvements.” J. Eng. Mech., 10.1061/(ASCE)EM.1943-7889.0001066, 04016016.
Candès, E. J., and Wakin, M. B. (2008). “An introduction to compressive sampling.” IEEE Signal Process. Mag., 25(2), 21–30.
Cao, Z., and Wang, Y. (2014). “Bayesian model comparison and characterization of undrained shear strength.” J. Geotech. Geoenviron. Eng., 10.1061/(ASCE)GT.1943-5606.0001108, 04014018.
Ching, J., and Chen, Y. C. (2007). “Transitional Markov chain Monte Carlo method for Bayesian model updating, model class selection and model averaging.” J. Eng. Mech., 10.1061/(ASCE)0733-9399(2007)133:7(816), 816–832.
Ching, J., Phoon, K. K., and Wu, S. H. (2016a). “Impact of statistical uncertainty on geotechnical reliability estimation.” J. Eng. Mech., 10.1061/(ASCE)EM.1943-7889.0001075, 04016027.
Ching, J., and Wang, J. S. (2016b). “Application of the transitional Markov chain Monte Carlo to probabilistic site characterization.” Eng. Geol., 203, 151–167.
Ching, J., and Wang, J. S. (2016c). “Discussion: Transitional Markov chain Monte Carlo: Observations and improvements.” J. Eng. Mech., 142(5), 04016016.
Ching, J., Wang, J. S., Juang, C. H., and Ku, C. S. (2015). “CPT-based stratigraphic profiling using the wavelet transform modulus maxima.” Can. Geotech. J., 52(12), 1993–2007.
Ching, J., Wu, S. H., and Phoon, K. K. (2016d). “Statistical characterization of random field parameters using frequentist and Bayesian approaches.” Can. Geotech. J., 53(2), 285–298.
Fenton, G. A. (1999a). “Estimation for stochastic soil models.” J. Geotech. Geoenviron. Eng., 10.1061/(ASCE)1090-0241(1999)125:6(470), 470–485.
Fenton, G. A. (1999b). “Random field modeling of CPT data.” J. Geotech. Geoenviron. Eng., 10.1061/(ASCE)1090-0241(1999)125:6(486), 486–498.
Hastings, W. K. (1970). “Monte Carlo sampling methods using Markov chains and their applications.” Biometrika, 57(1), 97–109.
Huang, Y., and Beck, J. L. (2015). “Hierarchical sparse Bayesian learning for structural health monitoring with incomplete modal data.” Int. J. Uncertainty Quantif., 5(2), 139–169.
Huang, Y., Beck, J. L., Wu, S., and Li, H. (2014). “Robust Bayesian compressive sensing for signals in structural health monitoring.” Comput.-Aided Civ. Infrastruct. Eng., 29(3), 160–179.
Jaksa, M. B., Brooker, P. I., and Kaggwa, W. S. (1997). “Inaccuracies associated with estimating random measurement errors.” J. Geotech. Geoenviron. Eng., 10.1061/(ASCE)1090-0241(1997)123:5(393), 393–401.
Kulatilake, P. H. S. (1991). “Discussion on ‘Probabilistic potentiometric surface mapping’ by P. H. S. Kulatilake.” J. Geotech. Eng., 10.1061/(ASCE)0733-9410(1991)117:9(1458.2), 1458–1459.
Li, K. S. (1991). “Discussion on ‘Probabilistic potentiometric surface mapping’ by P. H. S. Kulatilake.” J. Geotech. Eng., 10.1061/(ASCE)0733-9410(1991)117:9(1457), 1457–1458.
Liao, S. C., and Whitman, R. V. (1986). “Overburden correction factors for SPT in sand.” J. Geotech. Eng., 10.1061/(ASCE)0733-9410(1986)112:3(373), 373–377.
MacKay, D. J. C. (1992). “Bayesian interpolation.” Neural Comput., 4(3), 415–447.
Metropolis, N., Rosenbluth, A. W., Rosenbluth, M. N., Teller, A. H., and Teller, E. (1953). “Equation of state calculations by fast computing machines.” J. Chem. Phys., 21(6), 1087–1092.
Mu, H. Q., and Yuen, K. V. (2016). “Ground motion prediction equation development by heterogeneous Bayesian learning.” Comput.-Aided Civ. Infrastruct. Eng., 31(10), 761–776.
Phoon, K. K., et al. (2016). “Some observations on ISO2394:2015 Annex D (reliability of geotechnical structures).” Struct. Saf., 62, 24–33.
Phoon, K. K., and Kulhawy, F. H. (1999). “Characterization of geotechnical variability.” Can. Geotech. J., 36(4), 612–624.
Phoon, K. K., Quek, S. T., and An, P. (2003). “Identification of statistically homogeneous soil layers using modified Bartlett statistics.” J. Geotech. Geoenviron. Eng., 10.1061/(ASCE)1090-0241(2003)129:7(649), 649–659.
Phoon, K. K., and Retief, J. V. (2015). “ISO2394:2015 Annex D (reliability of geotechnical structures).” Georisk: Assess. Manage. Risk Eng. Syst. Geohazards, 9(3), 125–127.
Robertson, P. K. (2009). “Interpretation of cone penetration tests: A unified approach.” Can. Geotech. J., 46(11), 1337–1355.
Tipping, M. E. (2001). “Sparse Bayesian learning and the relevance vector machine.” J. Mach. Learn. Res., 1, 211–244.
Uzielli, M., Vannucchi, G., and Phoon, K. K. (2005). “Random field characterisation of stress-normalised cone penetration testing parameters.” Geotechnique, 55(1), 3–20.
Vanmarcke, E. H. (1977). “Probabilistic modeling of soil profiles.” J. Geotech. Eng., 103(11), 1227–1246.
Wang, Y., and Aladejare, A. E. (2015). “Selection of site-specific regression model for characterization of uniaxial compressive strength of rock.” Int. J. Rock Mech. Mining Sci., 75, 73–81.
Wang, Y., and Cao, Z. (2013). “Probabilistic characterization of Young’s modulus of soil using equivalent samples.” Eng. Geol., 159, 106–118.
Wang, Y., Cao, Z., and Li, D. Q. (2016). “Bayesian perspective on geotechnical variability and site characterization.” Eng. Geol., 203, 117–125.
Wang, Y., and Zhao, T. (2016). “Interpretation of soil property profile from limited measurement data: A compressive sampling perspective.” Can. Geotech. J., 53(9), 1547–1559.
Yuen, K.-V. (2010a). Bayesian methods for structural dynamics and civil engineering, Wiley, NJ.
Yuen, K.-V. (2010b). “Recent developments of Bayesian model class selection and applications in civil engineering.” Struct. Saf., 32(5), 338–346.
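
The σ² update of Eq. (38) in the Appendix can be illustrated numerically. The following Python sketch is not the authors' implementation; it is a minimal example under several assumptions that are not stated in this part of the paper: Φ is taken as an n × m matrix of basis functions, the posterior covariance C and mean μ of the weights follow the standard sparse Bayesian learning formulas (Tipping 2001) with Σ = σ²R, and the trace is evaluated as trace(C Φᵀ Σ⁻¹ Φ) so that the matrix dimensions conform. The function names `correlation_matrix` and `update_sigma2`, the prior variances `s2`, and the synthetic profile are hypothetical.

```python
import numpy as np


def correlation_matrix(z, delta):
    """Single exponential autocorrelation, rho = exp(-2|dz|/delta), as in Eq. (8)."""
    dz = np.abs(z[:, None] - z[None, :])
    return np.exp(-2.0 * dz / delta)


def update_sigma2(Y, Phi, s2, sigma2_old, R):
    """One fixed-point update of sigma^2 following the first form of Eq. (38).

    Assumed (standard SBL) posterior of the weights, with Sigma = sigma2_old * R:
        C  = (Phi^T Sigma^-1 Phi + diag(1/s2))^-1
        mu = C Phi^T Sigma^-1 Y
    Returns the new sigma^2 and the effective degrees of freedom.
    """
    n = Y.shape[0]
    Sigma = sigma2_old * R
    Sigma_inv_Phi = np.linalg.solve(Sigma, Phi)            # Sigma^-1 Phi, n x m
    C = np.linalg.inv(Phi.T @ Sigma_inv_Phi + np.diag(1.0 / s2))
    mu = C @ (Sigma_inv_Phi.T @ Y)                         # Sigma is symmetric
    resid = Y - Phi @ mu
    quad = resid @ np.linalg.solve(R, resid)               # resid^T R^-1 resid
    dof = n - np.trace(C @ Phi.T @ Sigma_inv_Phi)          # effective degrees of freedom
    return quad / dof, dof


# Synthetic example (hypothetical values): linear trend plus correlated noise.
rng = np.random.default_rng(0)
z = np.linspace(0.0, 10.0, 50)
Phi = np.column_stack([np.ones_like(z), z])                # intercept and gradient
R = correlation_matrix(z, delta=1.0)
noise = 0.3 * (np.linalg.cholesky(R) @ rng.standard_normal(z.size))
Y = Phi @ np.array([1.0, 0.5]) + noise

sigma2_new, dof = update_sigma2(Y, Phi, s2=np.array([10.0, 10.0]),
                                sigma2_old=1.0, R=R)
print(f"new sigma^2 = {sigma2_new:.4f}, effective dof = {dof:.3f}")
```

With two basis functions, the effective degrees of freedom lie between n − 2 and n. They approach n − 2 as the prior variances `s2` grow large, in which case the update reduces to the familiar generalized least-squares residual variance.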