Professional Documents
Culture Documents
Sssamo14 Iooss PDF
Sssamo14 Iooss PDF
processes
Bertrand Iooss
EDF R&D
2014
09/06/2014
Uncertainties in energy production systems
Many uncertainties for the energy production and the safety due to:
• hazards (demand, weather, …),
• incomplete system knowledge (ageing, physics, …),
• internal agressions (failures, …)
• external agressions (earthquake, …)
In order to better understand, prove the safety and optimize its industrial
processes, EDF R&D develops some physical
numerical simulation codes
K ~ 10-
10-50 input random variables X:
geometry, material properties,
environmental conditions, …
P3) The quantity of interest is linked to rare events (failure probability, high
quantile) [ see Lemaître et al., 2014 and Paul Lemaître PhD thesis (2014) ]
The model :
Y = f (X1, …, XK)
Calculations of all types of indices
(Sobol, distribution-based, …)
+ main effects E(Y | Xi )
Complexity/regularity
of model f Screening Variance
decomposition
Non
monotonic
Morris Metamodel Sobol
indices
Monotonic +
interactions
Design of Monte-Carlo sampling
Monotonic experiment
without
interaction Rank regression
Super screening
Linear 1st Linear regression
degree
Number of model
0 K 2K 10K 1000K evaluations
Summer school SAMO – B. Iooss - 26/06/14 5
Metamodeling steps
Design of Metamodeling :
Simulation :
experiments: Approximation of
Performing
Points to perform the computer code
simulations
simulations
X1
Computer code
X2
Different type of metamodels: [ Simpson et al. 2001 ] [ Storlie & Helton 2008 ]
Linear regression - Polynomials - Splines – Additive models/GAM
Regression trees - Neural networks – Support Vector Machines
Chaos polynomials – Kriging/Gaussian process
Summer school SAMO – B. Iooss - 26/06/14 6
Using kriging metamodels to estimate Sobol’indices
Var[E(Y X i )] Var[E(Y X -i )]
Main effects : Si = ; Total effects : ST i = 1 -
Var(Y ) Var(Y )
Many methods to estimate Sobol’ indices: Sampling (Monte Carlo, quasi-MC,
spectral approaches), smoothing methods, metamodels, …
Example in 1D :
Y = f ( X ) = sin( X )
Simulation of n = 7 computation
points
Kriging can take into account the data configuration, the distance between
data and target, the spatial correlations and potential external information
u1
Y *(u)
u5
u6
u4
u
u2
u7 u3 u8
Probabilistic framework
variance Covariance
function
h
Summer school SAMO – B. Iooss - 26/06/14 11
In practice N ( h)
C(h) = å[Y ( xi + h)Y ( xi )] - m 2
i =1
+ + + + +
Var[Y(x)] = C(0)
+ + + + + + + +
Y(x) Y(x+h)
+ + + + + + + + +
Y(x) Y(x+2h)
+ + + + + + + + + +
varianceC(h)
h
Summer school SAMO – B. Iooss - 26/06/14 12
Examples of stochastic processes (Gaussian)
1D [ from: Marcotte ]
2D
3D
73 measures of benzene
concentration (Rouen, France)
[ from: Bobbia, Mietlicki & Roth, 2000 ]
Kriging mean
Kriging standard deviation
Parametric choices: K
– F : polynomial of degree 1 bF (x) = b 0 + å b i xi
i =1
– R : stationary => covariance function
æ K 2ö
Example: Gaussian covariance R(x, u) = R (x - u ) = exp ç - å q i xi - u i ÷
è i =1 ø
Anisotropy: qi s are not equal (correlation length of each input variable)
Summer school SAMO – B. Iooss - 26/06/14 16
Gaussian process metamodel (2/2)
Joint distribution :
– Gaussian process (GP) model : Y(x) = bF(x) + Z(x), xÎÂp
( ) ((
X LS = x (1) ,..., x ( N ) , FLS = F ( X LS ) , RLS = R x ( i ) , x ( k ) ))
i ,k
YLS ~ N ( bFLS , s ² RLS )
– Conditional GP metamodel :
Y ( x ) X LS ,YLS ~ GP
[ ]
Mean : Yˆ( x ) = E Y ( x ) X LS ,YLS = b F ( x ) + r ( x ) RLS
-1
[YLS - bFLS ]
[( ) (
with r ( x) = R x (1) , x ,..., R x ( N ) , x )]
( ) (
Covariance : Cov Y (u ) X LS ,YLS , Y (v) X LS ,YLS = s ² R (u , v) + t r (u ) RLS
-1
r (v ) )
Þ Variance Þ Mean Square Error (MSE)
Summer school SAMO – B. Iooss - 26/06/14 17
Illustration
Y Y
X x* X
gaussian
mean 95% confidence Y law
intervals (from MSE)
5 code runs
(β ,θ,σ ) = Argmax
* * *
( )
ln L(Y
β,θ
,σ
LS , β,θ, σ )
– with
N 1 1 t
ln L(YLS , β,θ
,σ ) = - ln(2πσ² ) - ln(det RLS ) - σ ² [YLS - bFLS ]RLS
-1
[YLS - bFLS ]
2 2 2
Predictor validation:
Predictivity coefficient
å (Y )
n
- Yˆi
2
i 1 code run (test point)
Q 2 ( Y , Yˆ) = 1 - i =1
n
å (Y - Yi )
2
i =1
- Test sample
- or leave-one-out
- or k-fold cross validation 3 code runs
s = 1;q = 0.2
2 s 2 = 1;q = 10 -4
[ Chevalier, 2011 ]
simulations
observations 95%-confidence
interval
MSE ( x) = s ² + t r ( x) RLS
-1
r ( x) + u ( x)(t ( bFLS ) RLS
-1
bFLS )t u ( x)
u ( x) = bF ( x)- t k ( x) RLS
-1
bFLS
xnew = arg max MSE ( x)
xÎD
Y Y
xnew X xnew X
Remark: other criteria are possible (e.g. focusing to active variables)
Vi Var X i [ E ( f ( X 1 , K , X K ) X i )]
First order : S i = =
V V
V-i Var [ E ( f ( X 1 , K , X K ) X - i )]
Total : S Ti = =
V V
[
Notation : V = Var (Y ) ; X -i = X 1 , K , X i -1 , X i +1 , K , X K ]
Hypothesis: independent inputs
Y ( x, w) X LS ,YLS ~ GP
[
Yˆ( x ) = EW Y ( x, w ) X LS ,YLS ]
(
CovW Y (u , w ) X LS ,YLS , Y (v, w ) X LS ,YLS )
Computation of Sobol indices :
mi =
( [ (
E W VarX i E X -i Y X LS ,YLS ( X , w ) X i )])
E W VarX Y ( X , w ) X LS ,YLS
si =
2
( [ (
Var W Var X i E X - i Y X LS ,Y LS ( X , w ) X i )]) J Infer from the
[
E W Var X Y ( X , w ) X LS ,Y LS ]
2 metamodel error
Computation:
Analytical calculations
Simple and double numerical integrals => computationally expensive N
Other limits:
Additional hypothesis : GP covariance is product of one-dimensional covariance
This estimator is not the “true”expectation of the Sobol’index
Not possible to estimate the total Sobol’indices
w
Si =
[ (
VarX i E X -i Y X LS ,YLS ( X , w ) X i )] is a random variable
VarX Y ( X , w ) X LS ,YLS
Algorithm:
1. Generate 2 m -size ^ matrices (X1, X2) and compute the pick-freeze matrix X
(the one which used in Sobol’ indices estimation formula)
• Compute Sˆi , m , k
nsim B
Sˆi , m = å å Sˆ
1
The estimator of Sobol’ index: i , m , k ,b
nsim ´ B k =1 b =1
å å (Sˆ )
nsim B
- Sˆi , m
1
The variance of this estimator: sˆ =
i
2
i ,m , k ,b
nsim ´ ( B - 1) k =1 b =1
é
( ù
)
B nsim
1 1 ˆ
sˆ ( PG) =
i
2
B
å ê å i ,m , k ,b i , m ,b úû
b =1 ë nsim - 1 k =1
S - S
é 1
( ù
)
nsim B
1 ˆ
sˆ ( MC) =
i
2
nsim
å ê å i , m ,k ,b i ,m , k úû
k =1 ë B - 1 b =1
S - S
X1 X2 X3
Thermal-hydraulic example
Hydrogeological application
Difficult case:
– Complex/Non linear model need of Sobol indices (for example)
– and CPU time expensive model need of metamodel
Summer school SAMO – B. Iooss - 26/06/14 34
Sensitivity analysis for spatial outputs: Methodology
• Computer code f (.) : [ Marrel et al. 2011 ]
Input: X = ( X 1 , … , XK ) random vector
Output for input x* : y = f (x* , z ) , z Î Dz Ì R2
In practice, Dz is discretized in q points (here: 64 x 64 = 4096 points)
(X,Y (X,z) ) = input-output sample of size N
Modelling of radionuclide
atmospheric dispersion based on
Gaussian puff model
Meteorological parameters
Ø Wind speed & direction
(uncertain)
C-M
Ø Atmospheric stability Simulator
Ø Rain
Ø Temperature CPU time from 30s
to 30min Cs137+ concentration map after 20 minutes
Objectives
Identify the influence of each uncertain parameters
Problems
Ø CPU time required for each simulation Þ limited number of
available simulations
Ø Spatio-temporal outputs (very large dimension of model outputs)
Summer school SAMO – B. Iooss - 26/06/14 38 of the release
Cs137+ concentration map at the end
Uncertainty quantification
Ø Uncertain parameters (K = 6)
Reference Probability
Parameters Variation Interval
Value distribution
Quantity released
Cs137+ 10 9 [10 8 ; 10 10] Log-Uniform
(Bq)
Ø Notations
learning sample
X s = ( x(i ) )i=1,L,n
Y (i ) ( z) = y( x(i ) , z)i=1,Ln C-M outputs
(q = 4000 pixels)
Eigen values are principal component variances and give the total inertia
(percentage of explained variance)
Remark: data adaptive basis while the wavelet bases are fixed
Þ 15 coefficients to be modelled
by GP metamodels
v Predictivity coefficient :
-Q² estimated by cross validation
å [Yˆ(x , z ) - Y ]
n
(i ) (i ) 2
(z)
Q ²( z ) = 1 - i =1
2
n
é (i ) 1 n (i ) ù
å êY ( z ) - n å Y ( z ) ú
i =1 ë i =1 û
Predominant :
- Wind direction
- Release quantity : located in the
center of the plume
- Wind speed : located at the edges
of the plume
Negligible : deposition velocity, release
heights
Ø Uncertainty quantification :
Uncertainty quantification : input parameters Construction of a realistic simulator
+ associated probability distributions of weather conditions
Ø Metamodel :
Learning sample Spatial decomposition with PCA
+
metamodelling of PCA coefficients
Metamodel Statistical modelling : with Gaussian processes
validation Metamodel or Surrogate model
Ø Sensitivity analysis :
Sensitivity analysis Uncertainty propagation
Computation of Sobol indices
Interpretation
Summer school SAMO – B. Iooss - 26/06/14 49
Sensitivity analysis
in R
Sampling method (n = 1000 - 100 boostrap rep. - N = n(K+2) = 5e4 model runs)
X1 = data.frame(matrix(runif(3*n,-pi,pi),nrow=n)) ; X2 = …
sa1 = sobol2002(model = ishigami.fun, X1, X2, nboot = 100); plot(sa1)
• Validation phase
– Reduce prediction uncertainties
– Calibrate the model parameters
• Caveats: it can require a large amount of effort during the fitting process and
cases with more than 1000 points begin to be difficult (matrix inversion)