
Metamodeling with Gaussian processes

Bertrand Iooss
EDF R&D

09/06/2014
Uncertainties in energy production systems

Many uncertainties for energy production and safety, due to:
• hazards (demand, weather, …),
• incomplete system knowledge (ageing, physics, …),
• internal aggressions (failures, …),
• external aggressions (earthquakes, …)

In order to better understand its industrial processes, prove their safety and optimize them, EDF R&D develops physical numerical simulation codes

=> Problems of uncertainty management in computer experiments
Example: simulation of a thermal-hydraulic accident
Pressurized water nuclear reactor

Scenario: loss-of-primary-coolant accident due to a large break in the cold leg [ De Crecy et al. 2008 ]

Computer code Y = f(X):
• K ~ 10-50 input random variables X: geometry, material properties, environmental conditions, …
• time cost ~ 1-10 h per run => n ~ 100-500 simulations
• interest output variable Y: peak of cladding temperature

Goals: numerical model exploration, sensitivity analysis, uncertainty analysis
Source: CEA
Main problems in global sensitivity analysis

P0a) f(.) is complex (interactions, non-monotonic, discontinuous, …)
  => Sobol' (variance), entropy, distribution-based distances, …

P0b) f(.) is costly (number of simulations: n << 1000)
  => screening, metamodels

P1) The dimension of Y is large, or Y has a functional (temporal/spatial) nature
  => functional decompositions

P2) The dimension of X is large (K >> 10), or X has a functional nature
  => screening, groups of inputs, using the derivatives [ see Sergeï's lecture ]

P3) The quantity of interest is linked to rare events (failure probability, high quantile) [ see Lemaître et al., 2014 and Paul Lemaître PhD thesis (2014) ]

P4) The inputs are dependent [ see Thierry's lecture ]

P5) f(.) is stochastic [ see Marrel et al., 2012 ]


Global sensitivity analysis: identifying influential inputs

The model: Y = f(X1, …, XK)
Calculation of all types of indices (Sobol', distribution-based, …) + main effects E(Y | Xi)

[Chart: choice of method according to the complexity/regularity of the model f (linear; monotonic without interaction; monotonic with interactions; non-monotonic) versus the affordable number of model evaluations (from K to 1000K): linear regression (1st degree), rank regression, super screening, Morris screening, design of experiments, metamodels, Sobol' indices by Monte Carlo sampling (variance decomposition)]
Metamodeling steps

1. Design of experiments: choice of the points where simulations will be performed
2. Simulation: running the computer code
3. Metamodeling: approximation of the computer code

Different types of metamodels: [ Simpson et al. 2001 ] [ Storlie & Helton 2008 ]
Linear regression - Polynomials - Splines - Additive models/GAM
Regression trees - Neural networks - Support Vector Machines
Polynomial chaos - Kriging/Gaussian process
Using kriging metamodels to estimate Sobol' indices

Main effects: $S_i = \frac{\mathrm{Var}[E(Y \mid X_i)]}{\mathrm{Var}(Y)}$ ; Total effects: $S_{T_i} = 1 - \frac{\mathrm{Var}[E(Y \mid X_{-i})]}{\mathrm{Var}(Y)}$

Many methods to estimate Sobol' indices: sampling (Monte Carlo, quasi-MC, spectral approaches), smoothing methods, metamodels, …

To reduce the cost (number of model evaluations), the kriging metamodel is efficient and makes it possible to propagate the metamodel error onto the $S_i$ and $S_{T_i}$ estimates


Gaussian process (kriging) metamodel


Kriging metamodel

Kriging [ Matheron 63 ] for computer codes relies on the idea of interpolating the code outputs in dimension K [ Sacks et al. 89 ], as in spatial cartography

Kriging (or Gaussian process) modeling is interesting because:
• it interpolates the outputs,
• it gives a predictor with associated confidence bands

Example in 1D:
Theoretical function (K = 1): Y = f(X) = sin(X)
Simulation of n = 7 computation points


[ Chevalier, 2011 ]
Spatial statistics: kriging interpolation

Linear combination of N data: $Y^*(u) = \sum_{i=1}^{N} \lambda_i\, Y(u_i)$

Kriging can take into account the data configuration, the distance between data and target, the spatial correlations and potential external information

[Figure: target point u estimated from scattered data points u1, …, u8]

Probabilistic framework:
• Estimation without bias: $E[Y^*(u) - Y(u)] = 0$
  => the mean of the errors is zero
• Estimation Y*(u) is optimal: $\mathrm{Var}[Y^*(u) - Y(u)]$ is minimal
  => the dispersion of the errors is reduced
Stochastic model for Y(x)

The random field Y(x), with Y ∈ ℝ and x ∈ ℝ^p, is characterized by its mean and its covariance

Y(x) is stationary of second order:
1. $E[Y(x)] = m$ does not depend on x
2. Covariance: $\mathrm{Cov}[Y(x), Y(x+h)] = E[Y(x+h)\,Y(x)] - E[Y(x+h)]\,E[Y(x)] = C(h)$ does not depend on x

[Figure: covariance function C(h), decreasing from the variance C(0) as the distance h increases]
In practice

Empirical estimator from the N(h) pairs of observations separated by the lag h (see the R sketch below):
$\hat C(h) = \frac{1}{N(h)} \sum_{i=1}^{N(h)} Y(x_i + h)\, Y(x_i) - m^2$, with $\mathrm{Var}[Y(x)] = C(0)$

[Figure: data points paired at lags h and 2h along a transect; covariance function C(h) versus h, showing the variance C(0), a possible nugget effect at the origin (measurement error or microstructure), and the range]

Range = maximal distance of correlation = correlation length
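As a quick illustration, here is a minimal R sketch of this empirical estimator on a regular 1D grid; the observation vector y (a toy autoregressive field) and the lags are illustrative assumptions, not data from the slides:

# Empirical covariance C(h) from observations y on a regular 1D grid
C_hat <- function(y, lag) {
  n <- length(y); m <- mean(y)
  mean(y[1:(n - lag)] * y[(1 + lag):n]) - m^2   # (1/N(h)) sum Y(x)Y(x+h) - m^2
}
y <- as.numeric(arima.sim(list(ar = 0.8), n = 500))  # toy correlated field
sapply(0:10, function(h) C_hat(y, h))                # C(h) at lags 0, ..., 10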
Examples of (Gaussian) stochastic processes

[Figures: realizations of Gaussian random fields in 1D [ from: Marcotte ], 2D and 3D [ from: Baig, 2003 ]]


Simple kriging (known mean)

$Y^*(u) = \sum_{i=1}^{N} \lambda_i(u)\, [Y(u_i) - m] + m$   (m = known constant)

$\min\; E\{[Y^*(u) - Y(u)]^2\}$ => the $\lambda_i$ solve a multiple linear regression by least squares

Best Linear Unbiased Predictor (BLUP)

The kriging weights $\lambda_i(u)$ for $Y(u_i)$ are obtained by solving (see the R sketch below):
$\sum_{j=1}^{N} \lambda_j(u)\, C(u_i - u_j) = C(u_i - u) \quad \forall\, i = 1 \dots N$
a system of N linear equations with N unknowns, which has a unique solution (for a non-singular covariance matrix)

Kriging variance (estimation error): $\sigma_K^2(u) = C(0) - \sum_{i=1}^{N} \lambda_i(u)\, C(u_i - u)$
It does not depend on the Y values
=> visualisation of regions with imprecise estimations
=> put new observation points in these regions
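To make these equations concrete, here is a minimal R sketch of simple kriging on the 1D sine example above; the Gaussian covariance and its parameters (sigma2, theta) are illustrative assumptions:

# Simple kriging with known mean m, 1D, Gaussian covariance (illustrative)
simple_kriging <- function(u_new, u, y, m = 0, sigma2 = 1, theta = 2) {
  C <- function(h) sigma2 * exp(-theta * h^2)     # covariance function C(h)
  K <- outer(u, u, function(a, b) C(a - b))       # N x N covariance matrix
  k <- outer(u, u_new, function(a, b) C(a - b))   # covariances data/targets
  lambda <- solve(K, k)                           # kriging weights lambda_i(u)
  list(mean = m + drop(crossprod(lambda, y - m)),    # predictor Y*(u)
       var  = pmax(C(0) - colSums(lambda * k), 0))   # kriging variance
}
u <- seq(0, 2 * pi, length.out = 7)               # n = 7 code runs of sin(x)
out <- simple_kriging(seq(0, 2 * pi, 0.05), u, sin(u))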
Variogram

Example: cartography of air pollution, 73 measurements of benzene concentration (Rouen, France) [ from: Bobbia, Mietlicki & Roth, 2000 ]

$\gamma(h) = C(0) - C(h)$

[Figures: kriging mean map and kriging standard deviation map]


Gaussian process metamodel (1/2)

Idea: the computer code results are interpolated with the kriging technique
Necessary hypothesis: Gaussian process

Definition: $Y(x) = \beta F(x) + Z(x)$ (a regression part plus a stochastic part), where Z is a stochastic process with $E[Z(x)] = 0$ and $\mathrm{Cov}(Z(x), Z(u)) = \sigma^2 R(x, u)$, i.e. $Z \sim N(0, \sigma^2 R)$, with $\sigma^2$ the variance and R the correlation function

Parametric choices:
– F: polynomial of degree 1: $\beta F(x) = \beta_0 + \sum_{i=1}^{K} \beta_i x_i$
– R: stationary => covariance function

Example: Gaussian covariance $R(x, u) = R(x - u) = \exp\left(-\sum_{i=1}^{K} \theta_i\, |x_i - u_i|^2\right)$
Anisotropy: the $\theta_i$ are not equal (one correlation length per input variable)
Gaussian process metamodel (2/2)

Joint distribution:
– Gaussian process (GP) model: $Y(x) = \beta F(x) + Z(x)$, $x \in \mathbb{R}^p$
– Learning sample (LS) of N simulations: $(X_{LS}, Y_{LS})$ with
  $X_{LS} = (x^{(1)}, \dots, x^{(N)})$, $F_{LS} = F(X_{LS})$, $R_{LS} = \left(R(x^{(i)}, x^{(k)})\right)_{i,k}$
  $Y_{LS} \sim N(\beta F_{LS}, \sigma^2 R_{LS})$
– Conditional GP metamodel: $Y(x) \mid X_{LS}, Y_{LS} \sim GP$

Mean: $\hat Y(x) = E[Y(x) \mid X_{LS}, Y_{LS}] = \beta F(x) + {}^t r(x)\, R_{LS}^{-1}\, [Y_{LS} - \beta F_{LS}]$
with $r(x) = \left[R(x^{(1)}, x), \dots, R(x^{(N)}, x)\right]$

Covariance: $\mathrm{Cov}\left(Y(u) \mid X_{LS}, Y_{LS},\; Y(v) \mid X_{LS}, Y_{LS}\right) = \sigma^2 \left(R(u, v) - {}^t r(u)\, R_{LS}^{-1}\, r(v)\right)$

=> Variance => Mean Squared Error (MSE)
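In practice these conditional formulas are returned directly by the DiceKriging package used later in the R demo; a minimal sketch on the 1D sine example (design size and settings are illustrative assumptions):

# Conditional GP mean and 95% confidence bands with DiceKriging (sketch)
library(DiceKriging)
x <- data.frame(x = seq(0, 2 * pi, length.out = 7))    # learning sample
y <- sin(x$x)
PG <- km(formula = ~1, design = x, response = y, covtype = "gauss")
xnew <- data.frame(x = seq(0, 2 * pi, 0.05))
p <- predict(PG, newdata = xnew, type = "UK")
# p$mean = GP predictor; p$lower95 / p$upper95 = bands from the MSE p$sd^2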
Illustration
Y Y

2 code runs 3 code runs

X x* X
gaussian
mean 95% confidence Y law
intervals (from MSE)

5 code runs

Conclusion: Given a sufficient number of points, we obtain an accurate metamodel

Summer school SAMO – B. Iooss - 26/06/14 18


Hyperparameter estimation

Maximum likelihood method
– Likelihood maximisation on the learning sample $(X_{LS}, Y_{LS})$:
  $(\beta^*, \theta^*, \sigma^*) = \arg\max_{\beta, \theta, \sigma}\; \ln L(Y_{LS}, \beta, \theta, \sigma)$
– with
  $\ln L(Y_{LS}, \beta, \theta, \sigma) = -\frac{N}{2} \ln(2\pi\sigma^2) - \frac{1}{2} \ln(\det R_{LS}) - \frac{1}{2\sigma^2}\, {}^t[Y_{LS} - \beta F_{LS}]\, R_{LS}^{-1}\, [Y_{LS} - \beta F_{LS}]$
– Joint estimation of β and σ (closed form, given θ):
  $\beta^* = \left({}^t F_{LS}\, R_{LS}^{-1} F_{LS}\right)^{-1} {}^t F_{LS}\, R_{LS}^{-1} Y_{LS}$
  $\sigma^{2*} = \frac{1}{N}\, {}^t[Y_{LS} - \beta^* F_{LS}]\, R_{LS}^{-1}\, [Y_{LS} - \beta^* F_{LS}]$
– Estimation of the correlation parameters θ:
  $\theta^* = \arg\min_\theta\; \psi(\theta)$ with $\psi(\theta) = \left|R_{LS}\right|^{1/N}\, \sigma^{2*}$
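A minimal R sketch of this concentrated-likelihood scheme, assuming a Gaussian covariance, a degree-1 trend, a design matrix X and response y (all names illustrative; robust implementations such as DiceKriging add further numerical safeguards):

# psi(theta): closed-form beta* and sigma^2* given theta, then the
# concentrated criterion |R|^(1/N) * sigma^2* to be minimized over theta
psi <- function(theta, X, y) {
  N <- length(y)
  D2 <- as.matrix(dist(sweep(X, 2, sqrt(theta), "*")))^2  # sum_i theta_i (x_i - u_i)^2
  R <- exp(-D2) + diag(1e-10, N)         # Gaussian correlation + small jitter
  F <- cbind(1, X)                       # degree-1 regression functions
  RiF <- solve(R, F); Riy <- solve(R, y)
  beta <- solve(crossprod(F, RiF), crossprod(F, Riy))     # beta* (GLS)
  res <- y - F %*% beta
  s2 <- drop(crossprod(res, solve(R, res))) / N           # sigma^2*
  exp(as.numeric(determinant(R)$modulus) / N) * s2        # |R|^(1/N) * sigma^2*
}
# theta_star <- optim(rep(1, ncol(X)), psi, X = X, y = y,
#                     method = "L-BFGS-B", lower = 1e-6)$par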


Estimation and validation

$R(u, v) = R(u - v) = \exp\left(-\sum_{i=1}^{K} \theta_i\, |u_i - v_i|^2\right)$

Hyperparameters $(\theta_i)_{i=1 \dots K}$ estimated by likelihood maximization
– simplex method, stochastic algorithms
– problems in a high-dimensional context (K > 10) can be solved by sequential fitting algorithms [ Marrel et al. 2008 ]

Predictor validation: predictivity coefficient (see the sketch below)
$Q_2(Y, \hat Y) = 1 - \frac{\sum_{i=1}^{n} (Y_i - \hat Y_i)^2}{\sum_{i=1}^{n} (\bar Y - Y_i)^2}$
computed on a test sample, or by leave-one-out, or by k-fold cross-validation

MSE validation: percentage of predicted values inside the confidence bounds
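A minimal R sketch of the Q2 coefficient, assuming a fitted DiceKriging model PG and a hypothetical test sample (xtest, ytest):

# Predictivity coefficient Q2 on a test sample (sketch)
pred <- predict(PG, newdata = xtest, type = "UK")$mean
Q2 <- 1 - sum((ytest - pred)^2) / sum((ytest - mean(ytest))^2)
# leave-one-out variant:
loo <- leaveOneOut.km(PG, type = "UK", trend.reestim = TRUE)
Q2_loo <- 1 - sum((PG@y - loo$mean)^2) / sum((PG@y - mean(PG@y))^2)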


Effects of the hyperparameters θ and σ²

$f(x) = \sin(4\pi x)$

[Figures: GP fits with $\sigma^2 = 1, \theta = 0.2$; $\sigma^2 = 1, \theta = 10^{-4}$; $\sigma^2 = 4, \theta = 0.2$; $\sigma^2 = 4, \theta = 10^{-4}$] [ Le Gratiet, 2011 ]
Effects of the covariance structure

[Figures: GP fits with different covariance functions] [ Chevalier, 2011 ]


[ Le Gratiet, 2014 ]
GP metamodel in summary

Main hypothesis: z(x) is the realization of a Gaussian process (defined by its mean and covariance)

[Figure: the unknown function, its simulations, the conditional simulations, the mean, the observations and the 95%-confidence interval]

The GP metamodel is given by the probability law of Z(x) conditionally to the observations:
– its mean gives the GP predictor
– its variance gives the error
The best way to build a GP: model-based adaptive designs

Example: criterion based on the Gaussian process MSE (Mean Squared Error), see the R sketch below:
$MSE(x) = \sigma^2 \left(1 - {}^t r(x)\, R_{LS}^{-1}\, r(x) + u(x) \left({}^t F_{LS}\, R_{LS}^{-1} F_{LS}\right)^{-1} {}^t u(x)\right)$
with $u(x) = F(x) - {}^t r(x)\, R_{LS}^{-1}\, F_{LS}$
$x_{new} = \arg\max_{x \in D} MSE(x)$

[Figures: designs with 2 and 3 code runs, each showing the next point x_new at the MSE maximum]

Remark: other criteria are possible (e.g. focusing on the active variables)

Conclusion: model-based adaptive designs are the most efficient ones, but are not always applicable
In practice, we need to initiate the process with a space-filling design
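A minimal R sketch of one step of this criterion with DiceKriging, where the argmax over D is approximated on a candidate grid; the grid, the code f (here sin) and the model PG are carried over from the earlier sketches:

# One adaptive-design step: add the point maximizing the kriging MSE
cand <- data.frame(x = seq(0, 2 * pi, length.out = 200))   # candidate grid D
mse <- predict(PG, newdata = cand, type = "UK")$sd^2       # MSE(x) on the grid
xnew <- cand[which.max(mse), , drop = FALSE]               # x_new = argmax MSE
PG <- update(PG, newX = xnew, newy = sin(xnew$x))          # run the code, re-fit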
Estimation of rare-event probabilities using GP [ Bect et al. 2012 ]

Industrial problems: safety analysis with computer codes (nuclear, transport, …)

Problem: find $P_f = \mathrm{Prob}[f(X) > T]$ with X = random inputs; T = threshold

A space-filling design gives a reasonable variance everywhere, hence large errors in the target region near T
=> New adaptive design: $X^* = \arg\min_X (\mathrm{IMSE}_T)$
with $\mathrm{IMSE}_T = \int MSE(x)\, \mathbf{1}_{X_T}(x)\, dx$ and $X_T$ a small tube around T [ from: Picheny et al. 2010 ]

The resulting design tolerates a large variance in the non-target region and achieves good accuracy in the target region
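As a rough, discretized stand-in for the IMSE_T criterion (not the exact method of Picheny et al.), one can favour candidate points whose prediction falls in the tube around T and whose kriging variance is large; a hedged R sketch, where T0, eps and the grid cand are illustrative assumptions:

# Crude tube-targeting step in the spirit of IMSE_T (illustrative only)
p <- predict(PG, newdata = cand, type = "UK")
in_tube <- abs(p$mean - T0) < eps        # indicator of the tube X_T around T
crit <- p$sd^2 * in_tube                 # MSE(x) * 1_{X_T}(x) on the grid
xnew <- cand[which.max(crit), , drop = FALSE]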


Sobol’indices
estimation using
Gaussian process
metamodel

Summer school SAMO – B. Iooss - 26/06/14 26


Sobol' indices

Definitions for a deterministic function $Y = f(X_1, \dots, X_K)$:

First order: $S_i = \frac{V_i}{V} = \frac{\mathrm{Var}_{X_i}[E(f(X_1, \dots, X_K) \mid X_i)]}{V}$

Total: $S_{T_i} = 1 - \frac{V_{-i}}{V} = 1 - \frac{\mathrm{Var}[E(f(X_1, \dots, X_K) \mid X_{-i})]}{V}$

Notation: $V = \mathrm{Var}(Y)$; $X_{-i} = [X_1, \dots, X_{i-1}, X_{i+1}, \dots, X_K]$
Hypothesis: independent inputs

Classical estimation approach: replace f(X) by Ŷ (the GP mean)
(+) computationally cheap (analytical calculations are possible)
(-) does not account for the metamodel error


Sensitivity analysis with the GP model: two analytical approaches

GP model conditionally to the LS points: $Y(x, \omega) \mid X_{LS}, Y_{LS} \sim GP$, with predictor $\hat Y(x) = E_\omega[Y(x, \omega) \mid X_{LS}, Y_{LS}]$ and covariance $\mathrm{Cov}_\omega\left(Y(u, \omega) \mid X_{LS}, Y_{LS},\; Y(v, \omega) \mid X_{LS}, Y_{LS}\right)$

Computation of Sobol' indices:

1. From the predictor formula [ Chen et al. 2005 ]:
$S_i = \frac{\mathrm{Var}_{X_i}[E_{X_{-i}}(\hat Y(X) \mid X_i)]}{\mathrm{Var}_X(\hat Y)}$
(-) does not account for the metamodel error

2. From the full GP model $Y(X, \omega) \mid X_{LS}, Y_{LS}$:
$a(X_i, \omega) = E_{X_{-i}}[Y(X, \omega) \mid X_{LS}, Y_{LS}, X_i]$ is a stochastic process of $X_i$
$V_i(\omega) = \mathrm{Var}_{X_i}[a(X_i, \omega)]$ are random variables
$\tilde S_i = \frac{V_i(\omega)}{E_\omega\, \mathrm{Var}_X[Y(X, \omega) \mid X_{LS}, Y_{LS}]} \;\Rightarrow\; \mu_i = E_\omega(\tilde S_i)$


Analytical sensitivity indices from the full GP model

Sobol' indices: [ Oakley & O'Hagan 2004; Marrel et al. 2009 ]

$\mu_i = \frac{E_\omega\left(\mathrm{Var}_{X_i}\left[E_{X_{-i}}\left(Y(X, \omega) \mid X_{LS}, Y_{LS},\, X_i\right)\right]\right)}{E_\omega\, \mathrm{Var}_X\left[Y(X, \omega) \mid X_{LS}, Y_{LS}\right]}$

$\sigma_i^2 = \frac{\mathrm{Var}_\omega\left(\mathrm{Var}_{X_i}\left[E_{X_{-i}}\left(Y(X, \omega) \mid X_{LS}, Y_{LS},\, X_i\right)\right]\right)}{E_\omega\left[\mathrm{Var}_X\left(Y(X, \omega) \mid X_{LS}, Y_{LS}\right)\right]^2}$
(+) accounts for the metamodel error

Computation:
– analytical calculations
– simple and double numerical integrals => computationally expensive (-)

Other limits:
– additional hypothesis: the GP covariance must be a product of one-dimensional covariances
– this estimator is not the "true" expectation of the Sobol' index
– it is not possible to estimate the total Sobol' indices


GP-simulation based Sobol’indices (1/2)
[ Le Gratiet et al. 2014 ]

w
Si =
[ (
VarX i E X -i Y X LS ,YLS ( X , w ) X i )] is a random variable
VarX Y ( X , w ) X LS ,YLS

We want to build an unbiased Monte Carlo estimator Sˆi , m where m represents


the number of Monte Carlo particles

and an estimator of the variance of Sˆi , m

Algorithm:
1. Generate 2 m -size ^ matrices (X1, X2) and compute the pick-freeze matrix X
(the one which used in Sobol’ indices estimation formula)

2. For k=1, …, nsim


• Perform a conditional simulation y*k of the GP at points in X

• Compute Sˆi , m , k

• Bootstrap procedure (b = 1, …, B replicas) gives Sˆi , m , k , b


Summer school SAMO – B. Iooss - 26/06/14 30
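A minimal R sketch of steps 1-2 (without the bootstrap) for the first-order index of X1, using conditional simulations from DiceKriging; the sample sizes and the [0,1]^K input domain are illustrative assumptions:

# GP-simulation based estimate of S_1 via pick-freeze (sketch, no bootstrap)
library(DiceKriging)
m <- 1000; nsim <- 100; K <- PG@d                 # PG: fitted km model
X1 <- matrix(runif(m * K), m); X2 <- matrix(runif(m * K), m)
X2[, 1] <- X1[, 1]                                # freeze input 1: pick-freeze pair
Xpf <- as.data.frame(rbind(X1, X2))               # 2m simulation points
colnames(Xpf) <- colnames(PG@X)                   # names must match the design
ysim <- simulate(PG, nsim = nsim, newdata = Xpf, cond = TRUE)  # nsim x 2m paths
Shat <- apply(ysim, 1, function(y) {              # one estimate per GP realization
  ya <- y[1:m]; yb <- y[(m + 1):(2 * m)]
  (mean(ya * yb) - mean(ya) * mean(yb)) / var(ya) # pick-freeze formula
})
c(estimate = mean(Shat), metamodel_sd = sd(Shat))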
GP-simulation based Sobol’indices (2/2)
[ Le Gratiet et al. 2014 ]

nsim B
Sˆi , m = å å Sˆ
1
The estimator of Sobol’ index: i , m , k ,b
nsim ´ B k =1 b =1

å å (Sˆ )
nsim B
- Sˆi , m
1
The variance of this estimator: sˆ =
i
2
i ,m , k ,b
nsim ´ ( B - 1) k =1 b =1

It integrates the metamodeling error + the Monte Carlo integration error

é
( ù
)
B nsim
1 1 ˆ
sˆ ( PG) =
i
2

B
å ê å i ,m , k ,b i , m ,b úû
b =1 ë nsim - 1 k =1
S - S

é 1
( ù
)
nsim B
1 ˆ
sˆ ( MC) =
i
2

nsim
å ê å i , m ,k ,b i ,m , k úû
k =1 ë B - 1 b =1
S - S

Remark: Same algorithm for total Sobol’ indices

Summer school SAMO – B. Iooss - 26/06/14 31
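Given a matrix S of estimates with one row per conditional GP simulation k and one column per bootstrap replica b (hypothetical names, continuing the sketch above), the decomposition reads:

# Split the estimator variance into metamodel (PG) and Monte Carlo (MC) parts
S_hat <- mean(S)                                   # overall Sobol' estimate
var_tot <- sum((S - S_hat)^2) / (nsim * (B - 1))   # total variance
var_PG <- mean(apply(S, 2, var))  # spread across GP simulations, per replica b
var_MC <- mean(apply(S, 1, var))  # bootstrap spread, per GP simulation k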


Example: the Ishigami function (K = 3)

Learning sample: space-filling design (n = 40 to 200)
Gaussian process with Matérn 5/2 covariance
Settings: m = 10000, n_sim = 500, B = 300

[Figure: Sobol' index estimates for X1, X2, X3 versus n, together with the GP predictivity coefficient]

In practice, n is fixed and m can be calibrated in order to balance the two error sources


Spatial output


Sensitivity analysis when the model outputs are functions

Examples: a thermal-hydraulic case, a hydrogeological application

Two elementary cases, with a sensitivity analysis on each scalar output (q pixels):
– model with a very small CPU time
– linear model => use of standardized regression coefficients (SRC)

Difficult case:
– complex/non-linear model => need for Sobol' indices (for example)
– and CPU-time expensive model => need for a metamodel
Sensitivity analysis for spatial outputs: methodology [ Marrel et al. 2011 ]

• Computer code f(.):
  input: X = (X1, …, XK) random vector
  output for input x*: $y = f(x^*, z)$, $z \in D_z \subset \mathbb{R}^2$
  In practice, $D_z$ is discretized in q points (here: 64 x 64 = 4096 points)
  (X, Y(X, z)) = input-output sample of size N

• Decomposition of Y(z) on a functional basis, for example:
  – principal component analysis (PCA) gives basis functions fitted on the data,
  – a wavelet basis is well-suited if there are discontinuities

• Modeling of the decomposition coefficients by GP metamodels, with a selection procedure for the most important coefficients

• Prediction: x* => prediction of the coefficients => spatial output map reconstruction

=> Sensitivity analysis: the functional metamodel gives spatial maps of sensitivity indices


Metamodel fitting: methodology for a spatial output (map)

Step 1: decomposition of each map on basis functions

Step 2: kriging metamodeling of the main coefficients (the most variable ones) as functions of X; the other coefficients are kept constant
Computational challenge: we have to fit as many GPs as there are main coefficients

Step 3: prediction for a new input x*
x* => prediction of the coefficients => spatial output map reconstruction

Sensitivity analysis (spatial map of Sobol' indices):
The estimation of Sobol' indices requires a large number of simulations
=> computation of the Sobol' indices through the metamodel
Computational challenge: managing thousands of calls to the functional metamodel


Description of the CERES-MITHRA test case

Complex spatio-temporal dynamics analysis by model reduction and sensitivity analysis (2010-2014)

Assessment of the consequences on human health of accidental or routine radionuclide releases (atmospheric releases)

Modelling of radionuclide atmospheric dispersion based on a Gaussian puff model

Development of an atmospheric dispersion code: CERES-MITHRA (C-M)
=> impact calculations relative to CEA facilities

[ CEA application: Marrel & Perot, 2012 ]
Description of the scenario

C-M inputs:
• Release parameters:
  – 2 release locations (uncertain heights)
  – radionuclide quantity (uncertain)
  – deposition velocity (uncertain)
  – release duration (1 h)
  – radionuclide: Cesium 137
• Meteorological parameters:
  – wind speed & direction (uncertain)
  – atmospheric stability, rain, temperature

C-M outputs: 2D maps of time-integrated activity concentrations (Bq.s.m-3) at different instants (up to 2 hours)
[Figures: Cs137 concentration maps after 20 minutes and at the end of the release]

C-M simulator: CPU time from 30 s to 30 min per run

Objective: identify the influence of each uncertain parameter

Problems:
– CPU time required for each simulation => limited number of available simulations
– spatio-temporal outputs (very large dimension of the model outputs)
Uncertainty quantification

Uncertain parameters (K = 6):

Parameter                           | Reference value | Variation interval  | Probability distribution
Release height A (m)                | 15              | [7.5 ; 22.5]        | Uniform
Release height B (m)                | 45              | [22.5 ; 67.5]       | Uniform
Cs137 deposition velocity (m.s-1)   | 5.10-3          | [5.10-4 ; 5.10-2]   | Log-uniform
Cs137 quantity released (Bq)        | 10^9            | [10^8 ; 10^10]      | Log-uniform
Wind direction WD (degree azimuth)  | 291             | [249 ; 333]         | Truncated AR(1) process
Wind speed WS (m.s-1)               | 5.3             | [0 ; 12.5]          | Truncated AR(1) process


Experimental design and learning sample

Numerical experimental design: a space-filling design, in order to have a good coverage of the input space => optimized Latin hypercube sampling (LHS)

Number of C-M simulations: n = 200, a compromise between CPU time and the exploration of the uncertain parameter domain

Simulations with C-M: creation of the learning sample
Notations: learning sample $X_s = (x^{(i)})_{i=1,\dots,n}$; C-M outputs $Y^{(i)}(z) = y(x^{(i)}, z)$, $i = 1, \dots, n$ (q = 4000 pixels)

[Figure: examples of Cs137 integrated activity maps (logarithmic scale)]


Approximation by a metamodel

Spatial metamodel: Proper Orthogonal Decomposition (POD) + Gaussian process

Step A: spatial decomposition on a functional basis (Chatterjee [2000]), with selection of the h main coefficients of the decomposition:
$Y_h(X, z) = m(z) + \sum_{j=1}^{h} a_j(X)\, \phi_j(z)$ with $a_j(X) = \int_D \left(Y(X, z) - m(z)\right) \phi_j(z)\, dz$

Step B: modeling of the h main PCA coefficients as functions of X by Gaussian process metamodels
[computer code => simulated maps => POD decomposition, selection of the main coefficients => GP modeling of the coefficients in function of X]

Step C: prediction for any new input x*
[new parameters X* = (X*1, …, X*K) => GP prediction of the h main coefficients => POD reconstruction of the predicted map]

=> Spatial surrogate model


Basic principle of principal component analysis (PCA)

Transformation of dependent numerical variables into independent variables => information reduction

PCA: sequential search of new variables = linear combinations of the Yi with maximal dispersion (variance or inertia) on each axis (see the sketch below)

$V = {}^t Y_c\, Y_c$ is the variance-covariance matrix of $Y_c$ (the matrix of centered functions)

Diagonalisation of the variance-covariance matrix gives ordered eigenvectors (inertia axes)

The eigenvalues are the principal component variances and give the total inertia (percentage of explained variance)

The (N x q) matrix of principal components is $H = Y_c L$, with L the eigenvector matrix

Remark: this basis is data-adaptive, while wavelet bases are fixed
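A minimal R sketch of this step on the learning maps, assuming a hypothetical matrix Y with one row per simulated map flattened into q pixels:

# PCA of the centered maps and truncation to h components (sketch)
pca <- prcomp(Y, center = TRUE, scale. = FALSE)  # diagonalizes t(Yc) %*% Yc
expl <- cumsum(pca$sdev^2) / sum(pca$sdev^2)     # cumulated explained inertia
h <- which(expl >= 0.99)[1]                      # keep 99% of the information
A <- pca$x[, 1:h, drop = FALSE]                  # coefficients a_j(x^(i)) to model by GPs
Yrec <- sweep(tcrossprod(A, pca$rotation[, 1:h, drop = FALSE]),
              2, pca$center, "+")                # reconstructed maps Y_h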


Selection of PCA coefficients

Application to the C-M data and results:
• h = 15 selected PCA components = 99% of the information explained
=> 15 coefficients to be modelled by GP metamodels


Validation of the spatial metamodel accuracy

Predictivity coefficient, Q² estimated by cross-validation:
$Q^2(z) = 1 - \frac{\sum_{i=1}^{n} \left[\hat Y(x^{(i)}, z) - Y^{(i)}(z)\right]^2}{\sum_{i=1}^{n} \left[Y^{(i)}(z) - \frac{1}{n} \sum_{i=1}^{n} Y^{(i)}(z)\right]^2}$

Average value of Q² on the map ≈ 92%

[Figure: Q² map with 15 PCA components modelled with GPs, for t = 40 min]

=> Global and local analysis of the metamodel predictivity


Metamodel quality at the various time instants


Sensitivity analysis

Use of the POD+GP metamodel to perform the 10000 simulations required for the Sobol' index estimation (RBD-FAST method)

First-order Sobol' indices for t = 40 min

Input influence:
• predominant:
  – wind direction
  – release quantity: influence located in the center of the plume
  – wind speed: influence located at the edges of the plume
• negligible: deposition velocity, release heights

Sum of the first-order indices ≈ 75%
=> interactions: ≈ 25% of the model variability


Total Sobol' indices for t = 40 min

Differences between first-order and total indices:
– an average difference of 25% for wind direction and wind speed
=> strong interaction wind direction x wind speed?
=> 2nd-order indices (located on the edges of the dispersion plume)

CONCLUSION: an average of 97% of the C-M output variance is explained by:
Wind direction (34%) - Wind direction x Wind speed (23%) - Cs137 released quantity (20%) - Wind speed (20%)
Time evolution of the Sobol' indices

Significant increase of the influence of the wind direction x wind speed interaction:
less than 10% at the beginning of the release and more than 40% at the end


Conclusion: proposed methodology

[Flowchart: uncertainty quantification => sampling design (N simulations) => simulator (C-M code) => learning sample => statistical modelling (metamodel or surrogate model) with metamodel validation => sensitivity analysis (uncertainty propagation) => interpretation]

• Uncertainty quantification: input parameters + associated probability distributions; construction of a realistic simulator of the weather conditions
• Sampling design: Latin hypercube sample (LHS) with optimal covering properties
• Metamodel: spatial decomposition with PCA + metamodelling of the PCA coefficients with Gaussian processes
• Sensitivity analysis: uncertainty propagation and computation of the Sobol' indices
Sensitivity analysis in R


Uncertainty studies: several R packages

R = software environment for statistical computing (open source)
http://www.r-project.org/

Sampling: randtoolbox, DiceDesign

Uncertainty propagation: mistral

Sensitivity analysis: sensitivity, CompModSA, planor

Gaussian process metamodeling: DiceKriging

Optimization using metamodel: DiceOptim, KrigInv

Functional outputs: multisensi, modelcf



R package « sensitivity » - Demo on the Ishigami function (K = 3)

library(sensitivity)

Sampling method (n = 1000 - 100 bootstrap rep. - N = n(K+2) = 5e4 model runs):
X1 = data.frame(matrix(runif(3*n,-pi,pi),nrow=n)) ; X2 = …
sa1 = sobol2002(model = ishigami.fun, X1, X2, nboot = 100) ; plot(sa1)

GP metamodel from a Monte Carlo sample (x,y) (n = 100 model runs):
library(DiceKriging)
x = data.frame(matrix(runif(3*n,-pi,pi),nrow=n)) ; y = ishigami.fun(x)
PG = km(formula=~1, design=x, response=y) ; print(PG) ; plot(PG)

GP validation using the metamodel predictivity (Q2 coefficient with ntest = 1000):
xtest = … ; ytest = ishigami.fun(xtest)
Q2 = 1 - mean((ytest - predict(PG,xtest,type="UK")$mean)^2)/var(ytest)

Sensitivity indices computed via the GP predictor:
sa2 = sobol2002(model = NULL, X1, X2, nboot = 100)
Y = predict(PG,sa2$X,type="UK")$mean ; tell(sa2,Y) ; plot(sa2)

Sensitivity indices computed via the full GP:
sa3 = sobolGP(model=PG, type="UK", MCmethod="sobol2002", X1, X2, candidate=x) ; plot(sa3)


Conclusion


Stakes of uncertainty management in computer experiments

• Modeling phase
– improve the model
– explore as well as possible the different input combinations
– identify the predominant inputs and phenomena in order to prioritize R&D

• Validation phase
– reduce prediction uncertainties
– calibrate the model parameters

• Practical use of a model
– safety studies: assess a risk of failure (rare events)
– design studies: optimize system performance and robustness

Advantages of a probabilistic approach:
• to propagate the uncertainties and perform global sensitivity analysis
• to design/optimize the system taking uncertainties into account
• to give rigorous safety margins, …


Uncertainty management - the generic methodology

Step A: problem specification - model (or measurement process) Y = f(x, u), with uncertain input variables x and fixed variables u; quantity of interest (e.g. variance, probability, …)

Step B: quantification of the uncertainty sources - modelisation of x with probability distributions (direct methods, statistics, expertise)

Step B': quantification of the sources from observed variables Y_obs - inverse methods, calibration, assimilation

Step C: propagation of the uncertainty sources to the quantity of interest

Step C': sensitivity analysis, prioritization

Feedback process, with a decision criterion (e.g. probability < 10^-b)
OpenTURNS - the software implementation of the uncertainty methodology

• TURNS: Treatment of Uncertainties, Risk'N Statistics
• Since 2005
• Open: open source, LGPL (code), FDL (doc.)
– http://www.openturns.org/
– Environment: Linux, Windows
– Languages: C++ (libraries), Python (command scripts)
– GUI "Eficas"


Main features in OpenTURNS

• Use of generic, non-intrusive techniques
• Wrapping of external codes, considered as mathematical functions in OpenTURNS

• Step B: modeling the uncertainty of the inputs
– with data: parametric & non-parametric statistics
– assessing experts' advice
– dependence: definition by marginals + copula

• Step C: uncertainty propagation
– standard and advanced Monte Carlo techniques (importance sampling, directional sampling, …)
– FORM-SORM methods

• Metamodels: polynomials, polynomial chaos expansion, Gaussian process

• Step C': sensitivity analysis - linear & rank regression, Sobol' indices, polynomial chaos, reliability importance factors
On the Gaussian process metamodel

• A valuable tool when the computer code is CPU-time expensive (n ~ hundreds of runs)

• GP model construction is possible for moderate-dimensional cases (K < 50)

• Main advantage of GP: a probabilistic metamodel which gives confidence bands in addition to a predictor
=> full interest in sensitivity analysis

• The fitting quality depends on the initial design
=> the GP model is well adapted to sequential and adaptive designs

• Caveats: it can require a large amount of effort during the fitting process, and cases with more than 1000 points begin to be difficult (matrix inversion)

• Designs for specific objectives (optimization, quantile, probability, etc.)

• Other hot topics: calibration and validation of computer codes


Bibliography

Books:
– Chilès & Delfiner, Geostatistics, Wiley, 1999
– Fang, Li & Sudjianto, Design and modeling for computer experiments, Chapman, 2006

Metamodels:
– Simpson, Peplinski, Koch & Allen, Metamodels for computer-based engineering design: Survey and recommendations, Engineering with Computers, 17, 2001
– Storlie, Swiler, Helton & Sallaberry, Implementation and evaluation of nonparametric regression procedures for sensitivity analysis of computationally …, RESS, 94, 2009

Gaussian process:
– Koehler & Owen, Computer experiments, In Handbook of Statistics (Ghosh & Rao), 13, 1996
– Marrel, Iooss, Van Dorpe & Volkova, An efficient methodology for modeling complex computer codes with Gaussian processes, Computational Statistics and Data Analysis, 52, 2008
– Le Gratiet, Cannamela & Iooss, A Bayesian approach for global sensitivity analysis of (multifidelity) computer codes, SIAM/ASA Journal on Uncertainty Quantification, in press

Functional input and output:
– Marrel et al., Global sensitivity analysis for models with spatially dependent outputs, Environmetrics, 22, 2011
– Marrel, Iooss, da Veiga & Ribatet, Global sensitivity analysis of stochastic computer models with joint metamodels, Statistics and Computing, 22, 2012

Applications:
– Auder, de Crecy, Iooss & Marquès, Screening and metamodeling of computer experiments with functional outputs. Application to thermal-hydraulic computations, RESS, 107, 2012
– Marrel & Perot, Development of a surrogate model and sensitivity analysis for an atmospheric dispersion computer code, Proceedings of the ESREL 2012 Conference, Helsinki, 2012
