Professional Documents
Culture Documents
1.introBayesSetac MLaure
1.introBayesSetac MLaure
1.introBayesSetac MLaure
Methods
Applications
What is a probability ?
One word,
at least two definitions.
What is a model ?
A model
relates observed data Y
to a set of unknown parameters θ,
with the possible inclusion of fixed, known covariates X .
It is classically divided in two parts,
the deterministic one M(X , θ),
and the stochastic one (error model)
Example : the gaussian regression model
Y = M(X , θ) + with ∼ N(0, σ)
A model may also be seen as a data generating process.
What is inference ?
Inference generally implies the fit of the model to observed data.
4.2
4.0
3.8
size (mm)
3.6
3.4
3.2
0 1 2 3 4 5 6 7
3.6
3.4
3.2
0 1 2 3 4 5 6 7
Generalization to population
Inference also implies generalization of a result from a sample to
population, and the calculation of uncertainty in the estimated
parameters, especially uncertainty due to sampling error.
A sample
Sampling
Statistical inference
Point estimate of θ : θb
frequentist framework
θ is assumed fixed but unknown
Interval estimate of θ
− θ0 | > |θ\
p-value = P(|θ\ − θ0 |obs |H0 )
p-value definition
Assuming the null hypothesis true, probability (in the frequentist
meaning, imagining the repeated sampling) to obtain an estimated
difference greater that the one observed on the sample.
if p-value < α, H0 is rejected
(generally α = 5%)
and the difference is said to be significant.
M.L. Delignette-Muller Bayesian inference
Concepts Definitions
Methods Frequentist framework
Applications Bayesian framework
SO BE CAREFUL!
−0.5
−1.0
NOAEC
LOAEC
−1.5
0 1 2 3 4 5 6 7
Bayesian estimation of θ
Bayesian framework
θ is supposed uncertain, and its uncertainty characterized by a
probability distribution (subjective meaning of a probability, degree
of belief)
Bayesian inference
A sample from the
A prior
population
distribution
Conclusion on on
the population Bayesian inference
Point estimate :
mean, median or mode of the posterior distribution
Interval estimate :
definition of a credible interval (or bayesian confidence
interval)
from posterior distribution quantiles (2.5% and 97.5%
quantiles for a 95% credible interval).
Such an interval is easier to interpret than a frequentist
confidence interval : the probability that the parameter lies in
a 95% credible interval is 95%.
Hypothesis test :
It is no more necessary to calculate any p-value : one can
make decisions from posterior distributions.
8
data : y = 24 survivals
out of n =100
6
prior distribution :
4
uniform(0, 1) = beta(1, 1)
non informative
2
posterior distribution
(analytically known in
0
P(Y |θ)×P(θ)
P(θ|Y ) = P(Y ) ∝ P(Y |θ) × P(θ)
Markov Chain (A. Markov) Monte Carlo (S.Ulam and J. Von Neumann)
MCMC algorithms
MCMC simulations
MCMC simulations
http://www.mrc-bsu.cam.ac.uk/bugs/welcome.shtml
1500
1000
Year
0.5
0.0
1.5
uniform distribution for log10(k)
1.0
f
0.5
0.0
k in log scale
0.4
mean ± 2 × SD
0.2
0.0
comparison priors/posteriors
numerical and/or graphical comparison of priors and posteriors
in order to check that the priors are not too informative, (do
not constraint too much the posteriors).
sensitivity analysis to prior choice
repeating the inference by modifying priors in order to check
the robustness of results to choices concerning priors
20
6
5
15
Density
4
10
3
2
5
1
0
0
3.0 3.1 3.2 3.3 3.4 4.05 4.10 4.15 4.20
c d
15
1.5
10
Density
1.0
5
0.5
0.0
0.5 1.0 1.5 2.0 0.26 0.28 0.30 0.32 0.34 0.36 0.38 0.40
log10b log10e
3.2
3.0
4.16
d
0.14
4.08
log10b
1.5
0.37 0.16
0.5
log10e
0.34
sigma
0.13
0.17 0.05 0.077 0.15
0.08
4.5
4.0
Daphnid size (mm)
3.5
3.0
2.5
0 1 2 3 4 5 6 7
concentration
4.5
4.0
Daphnid size (mm)
3.5
3.0
2.5
0 1 2 3 4 5 6 7
concentration
4.0
3.5
3.0
2.5
Comparison of models
Comparison of two models of different complexities:
example of log-logistic models with 4 or 3 parameters
4.5
4.0
Daphnid size (mm)
3.5
3.0
0 1 2 3 4 5 6 7
concentration
An example in ecotoxicology