Professional Documents
Culture Documents
Stancon2018helsinki Intro-Slides PDF
Stancon2018helsinki Intro-Slides PDF
Stancon2018helsinki Intro-Slides PDF
github.com/jgabry/stancon2018helsinki_intro
Why “Stan”?
suboptimal SEO
Stanislaw Ulam
(1909–1984)
Stanislaw Ulam Monte Carlo
(1909–1984) Method
Stanislaw Ulam Monte Carlo
H-Bomb
(1909–1984) Method
What is Stan?
What is Stan?
• Stan inference
- MCMC for full Bayes
- VB for approximate Bayes
- Optimization for (penalized) MLE
What is Stan?
• Stan inference
- MCMC for full Bayes
- VB for approximate Bayes
- Optimization for (penalized) MLE
• Stan ecosystem
- lang, math library (C++)
- interfaces and tools (R, Python, many more)
- documentation (example model repo, user guide &
reference manual, case studies, R package vignettes)
- online community (Stan Forums on Discourse)
Visualization in Bayesian workflow
Visualization in Bayesian workflow
Gabry, J., Simpson, D., Vehtari, A., Betancourt, M., and Gelman, A. (2018).
Code: github.com/jgabry/bayes-vis-paper
Workflow
Bayesian data analysis
Workflow
Bayesian data analysis
black points
indicate ground
monitor locations
For measurements n = 1, . . . , N
and regions j = 1, . . . , J
Model 1
For measurements n = 1, . . . , N
and regions j = 1, . . . , J
Model 1
For measurements n = 1, . . . , N
and regions j = 1, . . . , J
Model 1
For measurements n = 1, . . . , N
and regions j = 1, . . . , J
Models 2 and 3
Exploratory data analysis
building a network of models
For measurements n = 1, . . . , N
and regions j = 1, . . . , J
Models 2 and 3
log (PM2.5,nj ) ⇠ N (µnj , )
Exploratory data analysis
building a network of models
For measurements n = 1, . . . , N
and regions j = 1, . . . , J
Models 2 and 3
log (PM2.5,nj ) ⇠ N (µnj , )
µnj = ↵0 + ↵j + ( 0 + j ) log (satnj )
Exploratory data analysis
building a network of models
For measurements n = 1, . . . , N
and regions j = 1, . . . , J
Models 2 and 3
log (PM2.5,nj ) ⇠ N (µnj , )
µnj = ↵0 + ↵j + ( 0 + j ) log (satnj )
Exploratory data analysis
building a network of models
For measurements n = 1, . . . , N
and regions j = 1, . . . , J
Models 2 and 3
log (PM2.5,nj ) ⇠ N (µnj , )
µnj = ↵0 + ↵j + ( 0 + j ) log (satnj )
↵j ⇠ N (0, ⌧↵ ) j ⇠ N (0, ⌧ )
Exploratory data analysis
building a network of models
For measurements n = 1, . . . , N
and regions j = 1, . . . , J
Models 2 and 3
log (PM2.5,nj ) ⇠ N (µnj , )
µnj = ↵0 + ↵j + ( 0 + j ) log (satnj )
↵j ⇠ N (0, ⌧↵ ) j ⇠ N (0, ⌧ )
Exploratory data analysis
building a network of models
For measurements n = 1, . . . , N
and regions j = 1, . . . , J
Models 2 and 3
log (PM2.5,nj ) ⇠ N (µnj , )
µnj = ↵0 + ↵j + ( 0 + j ) log (satnj )
↵j ⇠ N (0, ⌧↵ ) j ⇠ N (0, ⌧ )
Workflow
Bayesian data analysis
The prior can often only be understood in the context of the likelihood.
arXiv preprint: arxiv.org/abs/1708.07487
Likelihood x Prior
Data Parameters
(observed) (unobserved)
What is the problem with “vague” priors?
What is the problem with “vague” priors?
?
✓ ⇠ p(✓)
Generative models
?
✓ ⇠ p(✓)
? ?
y ⇠ p(y|✓ )
Generative models
?
✓ ⇠ p(✓)
?
y ⇠ p(y)
? ?
y ⇠ p(y|✓ )
Prior predictive checking:
fake data is almost as useful as real data
↵0 ⇠ N (0, 100)
0 ⇠ N (0, 100)
⌧↵2 ⇠ InvGamma(1, 100)
2
⌧ ⇠ InvGamma(1, 100)
Prior predictive checking:
fake data is almost as useful as real data
↵0 ⇠ N (0, 100)
0 ⇠ N (0, 100)
⌧↵2 ⇠ InvGamma(1, 100)
2
⌧ ⇠ InvGamma(1, 100)
Prior predictive checking:
fake data is almost as useful as real data
↵0 ⇠ N (0, 100)
0 ⇠ N (0, 100)
⌧↵2 ⇠ InvGamma(1, 100)
2
⌧ ⇠ InvGamma(1, 100)
Prior predictive checking:
fake data is almost as useful as real data
What are better priors for the global intercept and slope
and the hierarchical scale parameters?
Prior predictive checking:
fake data is almost as useful as real data
What are better priors for the global intercept and slope
and the hierarchical scale parameters?
↵0 ⇠ N (0, 1)
0 ⇠ N (1, 1)
⌧↵ ⇠ N+ (0, 1)
⌧ ⇠ N+ (0, 1)
Prior predictive checking:
fake data is almost as useful as real data
What are better priors for the global intercept and slope
and the hierarchical scale parameters?
↵0 ⇠ N (0, 1)
0 ⇠ N (1, 1)
⌧↵ ⇠ N+ (0, 1)
⌧ ⇠ N+ (0, 1)
Prior predictive checking:
fake data is almost as useful as real data
Prior predictive checking:
fake data is almost as useful as real data
Non-informative
Prior predictive checking:
fake data is almost as useful as real data
Non-informative
Weakly informative
Markov Chain Monte Carlo
Betancourt, M. (2017).
arxiv.org/abs/1701.02434
Gabry, J., Simpson, D., Vehtari, A., Betancourt, M., and Gelman, A. (2018).
arxiv.org/abs/1709.01449 | github.com/jgabry/bayes-vis-paper
Workflow
Bayesian data analysis
Z
p(ỹ|y) = p(ỹ|✓) p(✓|y) d✓
Posterior predictive checking
visual model evaluation
Posterior predictive checking
visual model evaluation
?
✓ ⇠ p(✓|y)
Posterior predictive checking
visual model evaluation
?
✓ ⇠ p(✓|y)
?
ỹ ⇠ p(y|✓ )
Posterior predictive checking
visual model evaluation
?
✓ ⇠ p(✓|y)
ỹ ⇠ p(ỹ|y)
?
ỹ ⇠ p(y|✓ )
Posterior predictive checking
visual model evaluation
T (y) = skew(y)
Posterior predictive checking:
visual model evaluation
T (y) = med(y|region)
Model 2 (multilevel)
Workflow
Bayesian data analysis