ENGG404
Risk and Uncertainty: Probability Theory

Prof Scott Ferson, University of Liverpool
10 October 2023
Syllabus
26 September Exceedance risks, random variables and convolutions
3 October Transformations, convolutions, and dependence with Monte Carlo
10 October Events, probabilistic logic
17 October Fault trees, fitting distributions
24 October Conditional probability and Bayes’ rule
31 October WORKSHOP
7 November NO CLASS
14 November Decision making under risk and uncertainty
21 November Extreme value theory, exceedances and disasters
28 November Kinds of uncertainty, probability bounds, validation, predictions, compliance
5 December Risk and safety assessments with bad data and incomplete knowledge
12 December Probability is controversial: rational choice and non-Laplacian uncertainty
What you should know from last time
• How to wrangle named distributions in sra.r
  • Construct them
  • Transform them
  • Do arithmetic with them
• Know what dependence is
  • How it affects arithmetic (but not transformations)
  • How to simulate perfect and opposite dependence in Monte Carlo
  • How to specify it in modelling
• Understand how tail risks are influenced by the number of Monte Carlo replicates
• What the standard normal distribution table is (AKA z-values, normal scores, etc.)
• How to use pnorm, punif, etc.
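As a reminder of that last point, the base-R cumulative distribution functions return P(X ≤ q) directly; for example:

pnorm(1.96)          # ~0.975 for the standard normal
pnorm(120, 100, 15)  # ~0.91: P(X <= 120) when X ~ N(100, 15)
punif(3, 2, 4)       # 0.5: P(X <= 3) when X ~ U(2, 4)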


Homework 02

• Replacing XXXXXXXXX with your student number, what values are output when you enter id = XXXXXXXXX; set.seed(id); rev(range(trunc(runif(2)*10)))?

id = 123456789
set.seed(id); rev(range(trunc(runif(2)*10))) # 6 6

• Let the two numbers you got in the previous question be the mean and standard deviation respectively of a normal distribution that you assign to the variable a. What do you get when you then enter mean(a)?
• What do you get when you enter sd(a)?

a = normal(6,6)
mean(a) # [1] 6.007374
sd(a)   # [1] 5.991342

• Enter b = a^2 in the R Console. What is the mean of b?
• What is the standard deviation of b?

b = a^2
mean(b) # [1] 71.98469
sd(b)   # [1] 88.08636

• Enter c = abs(a) in the R Console. What is the mean of c?
• What is the standard deviation of c?

c = abs(a)
mean(c) # [1] 7.000366
sd(c)   # [1] 4.793704

• What is the interpretation of sd(a), where a is a random variable?

the root of the average squared difference between the mean of a and deviates from a
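That interpretation can be checked directly in base R (a sketch; rnorm deviates stand in for the sra.r object):

x = rnorm(100000, 6, 6)        # deviates from a ~ N(6,6)
sqrt(mean((x - mean(x))^2))    # root average squared deviation from the mean, ~6
sd(x)                          # essentially the same (R uses an n - 1 denominator)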
Standard deviation

[Figure: normal densities with standard deviations σ = 1, 2, 5, 10, and 20, and the corresponding cumulative probability curves, plotted over 40–160]
Not all distributions are normal, but every distribution’s spread has a standard deviation
(which might be infinite)
Homework 02

• Assuming a is a normal distribution, can the standard deviation of abs(a) ever be larger than that of a?

a = N(m,1)
sd(a) < sd(abs(a))

no: abs( ) leaves the mean square unchanged but can only increase the magnitude of the mean, so sd(abs(a)) ≤ sd(a)

[Figure: distributions of a and abs(a)]
Homework 02

• What is the value of max(a)?

30

• Theoretically, what is the largest possible value of a random variable that is normally distributed with mean 10 and standard deviation 1?

positive infinity

• Can the value of max(abs(a)) ever be larger than max(a)?

yes, e.g., when a = N(-5,1)
Homework 02

• Plot the distributions for b=a^2 and c=abs(a) on the same graph. Which of the following is true? [Hint: You can set a scale for the plot with the command pl(L, R) where the arguments are the smallest and largest values to be plotted. Use the red( ), blue( ) functions to distinguish the distributions.]

The distributions don't overlap, the b distribution is to the left of the c distribution
The distributions don't overlap, the b distribution is to the right of the c distribution
The distributions overlap, but the b distribution is always above the c distribution
The distributions overlap, but the b distribution is always below the c distribution
The distributions cross one another
Other:

a = N(m, s); c = abs(a); b = a^2

The distributions cross one another: b = c^2, so b < c wherever abs(a) < 1 but b > c wherever abs(a) > 1, and the two curves cross at 1

[Figure: poll of responses, most choosing "overlap; b to right", with some "cross", "don't overlap; b to right", and "other"]
Homework 02

• What is the standard deviation of the random variable d = a + a^2? [Hint: use the sd( ) function.]

sd(a + a^2)

d = a + a^2
sd(d)

sd(a + b)

(any of these equivalent forms works, since b = a^2 and sra.r tracks the dependence between a and b)
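A base-R check (a sketch; the point is that a and a^2 are dependent, so the answer is not what independent addends would give):

a = rnorm(1e6, 6, 6)          # deviates standing in for a ~ N(6,6)
sd(a + a^2)                   # ~93
sqrt(sd(a)^2 + sd(a^2)^2)     # ~88: what independence between the addends would imply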
Homework 02

• What happens when you enter runif(1000,2,4); U(2,4) ?

[910] 2.469481 3.537798 3.386743 3.405579 2.665324 2.608697 2.144858 2.169205 2.176538
…
[1000] 2.087093
MC (min=2.0000289, median=2.98896430, mean=2.9937513, max=3.99998002)

[Figure: cumulative probability of the uniform distribution over 2.0–4.0]
Homework 02

• What happens when you enter runif(1000,2,4); U(2,4) ?

it prints out 1000 numbers between 2 and 4, and then it prints a summary for, and graphs, a uniform distribution twixt 2 and 4

[Figure: cumulative probability of U(2,4) over 2.0–4.0]

• What is the difference between the object created by runif(1000,2,4) and the object created by U(2,4)?

runif( ) gives a list of random values from a uniform distribution between 2 and 4; U( ) creates an "MC" object characterising the uniform distribution from which such values are drawn

also acceptable: the difference is sample size

Homework 02
• What other techniques, besides Monte Carlo simulation, are there to
solve functions of random variables?
Laplace/Mellin transforms;
discrete distribution approximations;
analytical (exact) solutions;
FOSM (first-order second-moment);
other Taylor series approximations;
appeals to Central Limit Theorem
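As one illustration, a minimal FOSM sketch (my example, not from the lecture) for Y = g(X) with X ~ N(μ, σ²), choosing g(x) = x² so the results can be checked against the homework values above:

g = function(x) x^2
mu = 6; sigma = 6
g(mu)                       # first-order mean estimate: 36
g(mu) + 0.5 * 2 * sigma^2   # adding g''(mu)/2 * sigma^2 gives 72, exact for a quadratic
sqrt((2*mu)^2 * sigma^2)    # first-order sd estimate |g'(mu)| * sigma = 72, vs ~88 by Monte Carlo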
Homework 02

• How can you simulate opposite dependence among Monte Carlo variables?

when using a uniform deviate u to get a realisation from one variable with the inverse transform method, use 1 − u to get a realisation from the other distribution

[Figure: a single uniform deviate read through two cumulative distribution functions; using u in both gives perfect dependence, using u and 1 − u gives opposite dependence]
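A base-R sketch of that recipe (the two distributions here are arbitrary illustrative choices):

u = runif(5000)                  # one set of uniform deviates
x = qnorm(u, 100, 15)            # realisations of X via the inverse transform
yp = qexp(u, rate = 1/800)       # same u: perfectly (comonotonically) dependent on x
yo = qexp(1 - u, rate = 1/800)   # 1 - u: oppositely (countermonotonically) dependent on x
cor(x, yp); cor(x, yo)           # strongly positive; strongly negative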
Homework 02
• Let G=c(6, 10, 11, 8, 7, 3, 10, 11, 7, 8, 3, 11, 2, 4) and let H=c(5, 11, 11, 10,
4, 8, 2, 11, 3, 7, 4, 5, 3, 8). What is the mean of G+H?
G=c(6, 10, 11, 8, 7, 3, 10, 11, 7, 8, 3, 11, 2, 4)
H=c(5, 11, 11, 10, 4, 8, 2, 11, 3, 7, 4, 5, 3, 8)
mean(G+H)
13.78571
• Using G and H from above, plot G versus H in R using the plot() function. What does the plot look like?

plot(G,H)

a scattergram with positive correlation

same answer if you used plot(H,G)

[Figure: scatterplot of G versus H]
Homework 02

• Using R's cor( ) function, what is the correlation between G and H?

cor(G,H)
[1] 0.3820118

• Enter sG = sort(G); sH = sort(H); rH = rev(sH) in the R Console. What is the mean of the sums sG+sH?

mean(sG+sH) # [1] 13.78571
mean(sG+rH) # [1] 13.78571

(reordering the addends cannot change the mean of the sum)

• Enter plot(sG,sH). What does the plot look like?

plot(sG,sH): points monotonically increasing
plot(sG,rH): points monotonically decreasing

[Figure: the two scatterplots]
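Running the comparison end to end in plain R (a sketch) shows the point of the exercise: reordering changes the dependence, and hence the spread of the sum, but never its mean:

G = c(6, 10, 11, 8, 7, 3, 10, 11, 7, 8, 3, 11, 2, 4)
H = c(5, 11, 11, 10, 4, 8, 2, 11, 3, 7, 4, 5, 3, 8)
sG = sort(G); sH = sort(H); rH = rev(sH)
mean(G + H); mean(sG + sH); mean(sG + rH)   # all 13.78571
sd(G + H); sd(sG + sH); sd(sG + rH)         # widest for perfect, narrowest for opposite pairing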
Homework 02
• What does it mean to say that random variables X and Y are independent
of one another?
knowing something about X doesn't tell us anything about Y, and vice versa

• Which pairs of variables in the groundwater travel time problem should probably be modelled as independent?

Not K and n; not BD and n; not i and L
Possibly not K and i; possibly not Koc and foc

Hydraulic conductivity is a function of soil porosity
As soil bulk density increases, soil porosity decreases
Hydraulic gradient is the head difference divided by the distance (which implies L² in the numerator and so dependence)
Hydraulic conductivity is the ratio of volume flux to hydraulic gradient (so K*i is volume flux)
Partition coefficient Koc is the ratio of Kd to foc, the fraction of organic carbon (so foc*Koc is Kd, the ratio of concentrations in two phases)
Homework 02
• What is the probability that a is less than 1?
a<1
0.201844
• What is the probability that a + 2 * sqrt(abs(a)) / uniform(0.2,1.3) is greater
than or equal to 2?
2 <= a + 2 * sqrt(abs(a)) / uniform(0.2,1.3)
0.8872
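These can be checked in base R (a sketch; rnorm stands in for the sra.r variable a ~ N(6,6) from earlier):

pnorm(1, 6, 6)                            # 0.2023, analytic P(a < 1)
a = rnorm(1e6, 6, 6); u = runif(1e6, 0.2, 1.3)
mean(2 <= a + 2*sqrt(abs(a))/u)           # ~0.89, close to the 0.8872 above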
• The Office for National Statistics said the total fertility rate (TFR) for England
and Wales was 1.61 children per woman in 2021. What is the associated
ensemble for this statistic?
women in England and Wales as censused in 2021
Homework 02
• Total fertility rate is a complicated random number
• Obviously not the number of births per woman in that year
• Average number of live babies (of either gender) that those women will have over the entirety of their childbearing lifespans
• Woman seems to be defined as a female having attained the age of 16 as a surrogate for reproductive maturity (irrespective of reproductive capacity)
• ONS data on gender identity suggests about 0.5% of self-identified women were not registered as females at birth (but 6% of respondents declined to answer the question)
From last time

• Read Section 5.3 (pp 74–87) of Moss
• Read Chapter 5 (pp 274–332) of Ang & Tang
Today
• Dependence, perfect, opposite and independence
• Vocabulary of probability
• Events
• Probabilities of events
• Combining events and event logic
Dependence
Specifying dependence in sra.r

• Distributions are independent by default: A = normal(5,1)
• Perfect dependence (comonotone): B = normal(5,1, r = A)
• Opposite dependence (countermonotone): C = normal(5,1, r = -A)
• A copy or increasing function is perfect: D = A; E = log(A)
• Same shape but independent: F = samedistribution(A)
Dependence arising in calculations

• Distributions are independent by default

a = U(-2,5)
b = N(1,2)
c = a + b
cor(a,b) # 0.005

• Distributions depend on their parents

cor(a,c) # 0.711
cor(b,c) # 0.707

• Perfect (comonotone) dependence

# perfect dependence
A = N(5,1)
B = A
C = sqrt(A)
D = log(A)
E = exp(A)

      A     B     C     D     E
A 1.000 1.000 0.997 0.987 0.768
B 1.000 1.000 0.997 0.987 0.768
C 0.997 0.997 1.000 0.996 0.730
D 0.987 0.987 0.996 1.000 0.687
E 0.768 0.768 0.730 0.687 1.000

The Pearson correlation may not be one; Spearman and Kendall correlations are

• Opposite (countermonotone) dependence

# opposite dependence
X = N(5,1)
Y = -X
Z = 1/X
cor(X,Y); cor(X,Z); cor(Y,Z) # -1 -0.936 0.936

• Same shape but independent

a = N(1,2)
b = samedistribution(a)
cor(a,b) # 0.000
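A quick base-R illustration of the remark about correlation measures (my sketch, not from the slides):

x = rnorm(2000, 5, 1)
cor(x, exp(x))                        # < 1: Pearson measures only linear association
cor(x, exp(x), method = "spearman")   # 1: ranks reveal the perfect monotone dependence
cor(x, exp(x), method = "kendall")    # 1 as well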
Dependence can make a big difference
a = N(1,2)
b = U(-2,5)
black(a + b) # independent

c = U(-2,5, r=a)
purple(a + c) # perfect

d = U(-2,5, r = -a)
green(a + d) # opposite
Other correlations in sra.r

• You can specify other intermediate correlations
  • Between opposite and perfect, which are the extreme cases
• The method is approximate
  • So check that the realised correlations are close to what you had planned
• Specify a matrix of Pearson correlation coefficients
  • It should be positive semidefinite
• Call the correlations function to make special "r" vectors
  • You need the MASS package for R
• Construct each variable using its corresponding "r" vector
Example
# planned correlation matrix
s = c( 1.00, 0.60, -0.30,
0.60, 1.00, -0.40,
-0.30, -0.40, 1.00)
correlations(s)
w = uniform(2,5, r=MC$r[,1])
x = poisson(5, r=MC$r[,2])
y = gumbel(2,3, r=MC$r[,3])
# pairwise bivariate plots
plotcorrs(c(w,x,y))
corrs(c(w,x,y))
w x y
w 1.000 0.577 -0.29
x 0.577 1.000 -0.38
y -0.290 -0.380 1.00
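The correlations( ) function is plausibly doing something like a normal-copula (NORTA-style) construction; here is a base-R sketch of that idea (my reconstruction, with qexp standing in for the gumbel quantile function, which base R lacks):

library(MASS)                                  # for mvrnorm
s = matrix(c( 1.00,  0.60, -0.30,
              0.60,  1.00, -0.40,
             -0.30, -0.40,  1.00), 3, 3)
z = mvrnorm(5000, mu = rep(0, 3), Sigma = s)   # correlated standard normals
u = pnorm(z)                                   # correlated uniform deviates
w = qunif(u[,1], 2, 5)                         # inverse transforms to the target shapes
x = qpois(u[,2], 5)
y = qexp(u[,3])
round(cor(cbind(w, x, y)), 2)                  # roughly matches the planned matrix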
Installing libraries we’ll need

• Click Packages/Install package(s)…, select a CRAN mirror, and click OK

• After a pause, a list of packages will pop up

• Select the package MASS, and click OK

• You may wish to install other packages


Event probabilities

Classical probability

Probability of event A

P(A) = (number of outcomes in the event) / (total number of outcomes in the sample space)

You will find this denoted in different ways, e.g.,

Prob(A)   Pr(A)
P(A)      p(A)
ℙ(A)      in sra.r: P(expression)
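For instance, the classical computation for a fair die is just counting (a base-R sketch):

S = 1:6                 # sample space of a die roll
A = c(2, 4, 6)          # the event "an even face"
length(A) / length(S)   # P(A) = 0.5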
Empirical or frequentist probability

Probability of event A

P(A) = (number of times A occurs in repeated trials) / (total number of trials in a random experiment)

Note that, here, the phrase 'random experiment' means a collection of independent trials
Subjectivist or Bayesian probability
Probability of event A

P(A) = an individual’s belief that A occurs (scaled to [0,1])

Different people will have different probabilities for the same event,
depending on what they each know about the world and the event
Rules of probability (Kolmogorov's axioms defining probability theory)

• Probability for any event is a value between zero and one
  0 ≤ P(A) ≤ 1
• Probability of the sample space is one
  P(Ω) = 1
• The probability of a union of disjoint sets is the sum of their probabilities
  P(A ∪ B) = P(A ∨ B) = P(A) + P(B), when A ∩ B = Ø
Consequences

P(∅) = 0
If A ⊆ B, then P(A) ≤ P(B)
P(A ∩ B) ≤ P(A)             P(A & B) ≤ P(A)
P(A ∩ B) ≤ P(B)             P(A & B) ≤ P(B)
P(A) ≤ P(A ∪ B)             P(A) ≤ P(A v B)
P(B) ≤ P(A ∪ B)             P(B) ≤ P(A v B)
P((A ∪ B)C) = P(AC ∩ BC)    P(not(A v B)) = P(not A & not B)
P((A ∩ B)C) = P(AC ∪ BC)    P(not(A & B)) = P(not A v not B)
Derived rules

• Probability of a complement of A is one minus the probability of A
  P(not A) = P(AC) = 1 − P(A)
• Probability of event A or event B occurring is the probabilistic sum
  P(A ∪ B) = P(A or B) = P(A) + P(B) − P(A ∩ B)
• Probability of both events A and B occurring is the conditional rule
  P(A ∩ B) = P(A and B) = P(A) P(B|A) = P(B) P(A|B)
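These rules can be verified by counting; a small base-R sketch with two events on a fair die:

S = 1:6; A = c(2, 4, 6); B = c(1, 2, 3, 4)
P = function(E) length(E) / length(S)   # classical probability of an event
1 - P(A)                                # complement rule: P(not A) = 1/2
P(union(A, B))                          # 5/6, and the probabilistic sum agrees:
P(A) + P(B) - P(intersect(A, B))        # 3/6 + 4/6 - 2/6 = 5/6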
Vocabulary

• Random experiment
  a repeatable procedure or process that has a well-defined set of possible outcomes and each time produces an observable outcome that could not be perfectly predicted in advance
  (also: trial, experiment, random trial, random process)
• Outcome
  a result from a single execution of a random experiment that is unique and mutually exclusive of other possible outcomes
  (also: elementary event, singleton event, basic outcome, atomic event, sample point)
• Sample space Ω
  the set of possible outcomes of an experiment
  (also: universal set, possibility space)
• Event
  a subset of the sample space
Two-sided coin
Experiment: toss a two-sided coin
Outcomes: heads (H) and tails (T)
Sample space: {H, T}
Event space: {∅, H, T, {H,T}}

• The empty set is a subset of the sample space, but the empty set is not
an element of it, so the empty set is an event but it is not an outcome
Fair die

Experiment: roll a fair, six-sided die
Outcome: a single face of the die
Sample space: {1, 2, 3, 4, 5, 6}
Event space: {∅, 1, 2, 3, 4, 5, 6, {1,2}, {1,3}, …, {1,6}, {2,3}, …, {1,2,3}, …, {1,2,3,4,5,6}}

• The event space is often the power set of the sample space if it’s finite

• The sample and event spaces for unfair dice are the same so long as all sides
of the die are possible outcomes
The probability of each face is 1/6 for a fair die
A = {2,4,6} B = {1,5}
P(A) = 3/6 P(B) = 2/6

A and B have no elements in common, so they are disjoint, A∩B = {} = Ø

A ∪ B = {2,4,6} ∪ {1,5} = {1,2,4,5,6}

P(A ∪ B) = P(A or B) = P(A) + P(B) = (3+2)/6 = 5/6 = P({1,2,4,5,6})


The probability of complements and unions

A = {2,4,6}                   C = {1,2,3,4}
P(A) = 3/6                    P(C) = 4/6

P(AC) = P(not A) = 1 − P(A) = 1 − 3/6 = 3/6 = 1/2 = P({1,3,5})

A ∩ C = {2,4}                 A ∪ C = {1,2,3,4,6}
P(A ∩ C) = P({2,4}) = 2/6     P(A ∪ C) = P(A) + P(C) − P(A ∩ C)
                              P({1,2,3,4,6}) = 3/6 + 4/6 − 2/6 = 5/6
Two distinguished dice
Experiment: toss two dice

Outcome: pairs of faces, written (red face, blue face)

Sample space: { (1,1) (1,2) (1,3) (1,4) (1,5) (1,6)


(2,1) (2,2) (2,3) (2,4) (2,5) (2,6)
(3,1) (3,2) (3,3) (3,4) (3,5) (3,6)
(4,1) (4,2) (4,3) (4,4) (4,5) (4,6)
(5,1) (5,2) (5,3) (5,4) (5,5) (5,6)
(6,1) (6,2) (6,3) (6,4) (6,5) (6,6) }
What is the event "snake eyes"?
{(1,1)}

What is the event where the sum of the dice is 4?
{(1,3), (2,2), (3,1)}

What is the event where the sum is even?
{(1,1), (1,3), (1,5), (2,2), (2,4), (2,6), (3,1), (3,3), (3,5), (4,2), (4,4), (4,6), (5,1), (5,3), (5,5), (6,2), (6,4), (6,6)}
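Events like these can be enumerated mechanically; a base-R sketch:

S = expand.grid(red = 1:6, blue = 1:6)    # the 36-outcome sample space
subset(S, red == 1 & blue == 1)           # snake eyes: {(1,1)}
subset(S, red + blue == 4)                # sum is 4: {(1,3),(2,2),(3,1)}
nrow(subset(S, (red + blue) %% 2 == 0))   # 18 outcomes have an even sum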
Fair and independent distinguished dice
• Fairness means the probabilities of each die face are equal

• Independence means the probability of faces A and B is just P(A)P(B)

• Probability measure is uniform over the 36 elements of the sample space

• Probability of any pair is 1/36 (1,1) (1,2) (1,3) (1,4) (1,5) (1,6)
(2,1) (2,2) (2,3) (2,4) (2,5) (2,6)
(3,1) (3,2) (3,3) (3,4) (3,5) (3,6)
(4,1) (4,2) (4,3) (4,4) (4,5) (4,6)
(5,1) (5,2) (5,3) (5,4) (5,5) (5,6)
(6,1) (6,2) (6,3) (6,4) (6,5) (6,6)
Playing dice

• You can figure out the chance of any event A just by summing the probability masses of each elementary event in the event A
• In this case, it's easy because each elementary event has mass 1/36
• For example, what is the chance (i.e., probability) of snake eyes?
• Chance of getting a sum ≤ 3?
• Say you win the product in £

(1,1) 1  (1,2) 2   (1,3) 3   (1,4) 4   (1,5) 5   (1,6) 6
(2,1) 2  (2,2) 4   (2,3) 6   (2,4) 8   (2,5) 10  (2,6) 12
(3,1) 3  (3,2) 6   (3,3) 9   (3,4) 12  (3,5) 15  (3,6) 18
(4,1) 4  (4,2) 8   (4,3) 12  (4,4) 16  (4,5) 20  (4,6) 24
(5,1) 5  (5,2) 10  (5,3) 15  (5,4) 20  (5,5) 25  (5,6) 30
(6,1) 6  (6,2) 12  (6,3) 18  (6,4) 24  (6,5) 30  (6,6) 36
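Those questions can be answered by enumeration (a base-R sketch; each row of S is equally likely, so probabilities are just proportions):

S = expand.grid(red = 1:6, blue = 1:6)
mean(S$red == 1 & S$blue == 1)   # snake eyes: 1/36
mean(S$red + S$blue <= 3)        # sum <= 3: {(1,1),(1,2),(2,1)}, so 3/36
mean(S$red * S$blue)             # expected winnings: 3.5^2 = 12.25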
Playing dice

• Just make an EDF of the products

 1   2   3   4   5   6
 2   4   6   8  10  12
 3   6   9  12  15  18
 4   8  12  16  20  24
 5  10  15  20  25  30
 6  12  18  24  30  36

[Figures: cumulative probability of a fair die over 1–6, and of the product of two independent fair dice over 0–40]
You can also just use sra.r
sides = 6
many = 3000
fair = 1:sides # 1 2 3 4 5 6
R = sample(fair, many, replace=TRUE) # 3 6 2 4 4 4 4 3 6 6 4 5 4 4 5 5 5 6 5 4 4 6 4 5 4 …
B = sample(fair, many, replace=TRUE) # 6 2 6 6 4 4 2 5 6 2 5 6 4 5 6 4 6 5 4 3 2 6 3 3 5 …
purple(R * B)

dice = function(sides=6) mc(sample(1:sides,MC$many,replace=TRUE))


R = dice()
B = dice()
purple(R * B)
What if they’re not fair, or not independent?
• Each outcome {(red,blue)} has a probability in [0,1]
• The sum of those probabilities, over all 36 outcomes, equals one
• But the probabilities are not all equal
• They must be estimated by experiment or inferred some other way

• The probability of an event E is the sum of the probabilities of the


singleton events {(red,blue)} that make up E
How is a die not fair?

fair: each face is equally likely
unfair: unlikely to roll a 1 or 2, more likely to get 5 or 6

[Figure: cumulative probability of the fair and unfair dice over faces 1–6]
Computing with unfair dice (using outer product)

options(digits=2)
p = 1:6/sum(1:6) # 0.048 0.095 0.143 0.190 0.238 0.286
m = outer(p,p)

       [,1]   [,2]   [,3]   [,4]  [,5]  [,6]
[1,] 0.0023 0.0045 0.0068 0.0091 0.011 0.014
[2,] 0.0045 0.0091 0.0136 0.0181 0.023 0.027
[3,] 0.0068 0.0136 0.0204 0.0272 0.034 0.041
[4,] 0.0091 0.0181 0.0272 0.0363 0.045 0.054
[5,] 0.0113 0.0227 0.0340 0.0454 0.057 0.068
[6,] 0.0136 0.0272 0.0408 0.0544 0.068 0.082

(the corresponding winnings are the products

 1  2  3  4  5  6
 2  4  6  8 10 12
 3  6  9 12 15 18
 4  8 12 16 20 24
 5 10 15 20 25 30
 6 12 18 24 30 36)

What is the probability of getting at least £20?
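That question can be answered directly from m by summing the masses of cells whose winnings reach £20 (my addition; it agrees with the 0.47 found by simulation on the next slides):

sum(m[outer(1:6, 1:6) >= 20])   # 0.474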
Computing with unfair dice (another approach)

unfair = rep(1:6, 1:6) # 1 2 2 3 3 3 4 4 4 4 5 5 5 5 5 6 6 6 6 6 6
many = 5000
R = sample(unfair, many, replace=TRUE)
B = sample(unfair, many, replace=TRUE)
m = R*B
purple(m)
sum(20 <= m)/many # 0.47
edf(R*B, col=rgb(112/255,48/255,160/255), lwd=3)

[Figure: cumulative probability of R*B over 0–35]

Note that 0.47 is one minus the intercept on the probability axis
Computing with unfair dice (using sra.r)

p = 1:6
R = dice(6, p)
B = dice(6, p)
purple(R * B)
20 <= R * B # 0.47
edf(R*B, col=rgb(112/255,48/255,160/255), lwd=3)

[Figure: cumulative probability of R*B over 0–35]
These calculations all still assume the dice are independent


What if the dice are not independent?

[Image: knucklebones]

• Dice were invented (from sheep knucklebones, around 5,000 years BCE) so that they would be independent
• But we can pose the question, just to see how different the answers are
• It's easy to explore the effects of perfect or opposite dependence
• Perfect: sort both sets of random deviates
  1:6 * 1:6        # [1]  1  4  9 16 25 36
• Opposite: sort one and sort the other in reverse order
  1:6 * rev(1:6)   # [1]  6 10 12 12 10  6
many = 3000
sides = 6
fair = 1:sides # 1 2 3 4 5 6
R = sample(fair, many, replace=TRUE)
B = sample(fair, many, replace=TRUE)
black(fair, new=TRUE)

# assume independence
i = R*B
pl(0,40); purple(i)

# assume perfect dependence
R = sort(R)
B = sort(B)
p = R*B
pl(0,40); orange(p)

# assume opposite dependence
B = rev(B)
o = R*B
pl(0,40); gray(o)

[Figures: cumulative probability of the fair die over 1–6, and of the product under independence, perfect dependence, and opposite dependence, each over 0–40]
General rules for probability logic (aka probability calculus)

• Probability of a complement of A is one minus the probability of A
  P(not A) = 1 − P(A) = P(AC) = P(¬A)
• Probability of event A or event B occurring is the probabilistic sum
  P(A or B) = P(A) + P(B) − P(A and B) = P(A) + P(B) − P(A ∩ B) = P(A ∪ B)
• Probability of both events A and B occurring is the conditional rule
  P(A and B) = P(A) P(B|A) = P(B) P(A|B) = P(A ∩ B)

These are always true as they make no dependence assumptions
Assuming independence among events (sra.r functions at right)

• Probability of a complement of A is one minus the probability of A
  P(not A) = 1 − P(A) = P(AC)                    not(A)
• Probability of event A or event B occurring is the probabilistic sum
  P(A or B) = P(A) + P(B) − P(A) P(B)            or(A, B)
• Probability of both events A and B occurring is the conditional rule, which under independence reduces to the product
  P(A and B) = P(A) P(B)                         and(A, B)

In sra.r, we use one variable name to denote both the event and its probability
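In plain R the same arithmetic on scalar probabilities is direct (a sketch with hypothetical values):

pA = 0.2; pB = 0.3     # hypothetical event probabilities
1 - pA                 # P(not A) = 0.8
pA + pB - pA * pB      # P(A or B) = 0.44, probabilistic sum under independence
pA * pB                # P(A and B) = 0.06, product under independence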
End
Wrinkles that make random variables different
• Point values replaced with distributions
• Ensembles
• Distribution shapes and tails
• Dependencies
• Backcalculations
• Number of replications
• Multiple instantiations
• Repeated variables
How many replicates
How many replications
• More is always better
• Tails are especially hard to nail down
• Curse of dimensionality
• Latin hypercube sampling can help
• Repeat the simulation as a check ← best
• Confidence intervals on fractiles
• Kolmogorov-Smirnov limits on distributions
100 versus 10,000 replicates

[Figure: close-ups of the left-hand tail (cumulative probability up to about 0.02) of the distribution of time to contamination (yr) over 0–500, estimated from 100 trials and from 10,000 trials]
[Figure: eight panels showing the left tail (cumulative probability up to 0.020) of the same distribution over 0–500, estimated with 100, 1000, 2000, 5000, 10000, 20000, 1e+05, and 1e+06 Monte Carlo replicates]
But these graphs are themselves random, so they’ll vary when we estimate them again…
[Figure: the same eight panels, from 100 to 1e+06 Monte Carlo replicates, re-estimated from fresh simulations]
…although those made from large numbers of replicates are the most stable
Confidence interval for a fractile

The α·100% confidence interval for the pth fractile is estimated by [Yi, Yj], where Yk is the (n − k + 1)th largest value from the Monte Carlo simulation, i = floor(np − b), j = ceiling(np + b), and b = z_(1−(1−α)/2) √(np(1 − p))

• Vary n to get the precision you desire
• You can use the sra.r function qci(mc, p=0.5, conf=0.95)
• Remember, this represents only sampling error, not measurement error
• What's the opposite, the confidence interval for the probability at a point?
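A base-R sketch of the same recipe (my implementation of the formula above, not sra.r's qci; it uses the kth smallest value, which is the (n − k + 1)th largest):

fractile_ci = function(x, p = 0.5, conf = 0.95) {
  n = length(x)
  y = sort(x)                                   # y[k] = kth smallest value
  b = qnorm(1 - (1 - conf)/2) * sqrt(n * p * (1 - p))
  i = max(1, floor(n * p - b))
  j = min(n, ceiling(n * p + b))
  c(y[i], y[j])                                 # interval for the pth fractile
}
x = rnorm(10000, 10, 2)
fractile_ci(x)                                  # brackets the true median, 10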
Kolmogorov-Smirnov bounds

Bounds on the distribution as a whole

[Figure: empirical distribution functions with Kolmogorov-Smirnov bounds for 100, 200, 1000, and 2000 replications; the bounds tighten as replications increase]
95% of the time, the entire distribution will lie within the bounds
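A base-R sketch of such a band via the closely related Dvoretzky-Kiefer-Wolfowitz inequality (my choice of construction; the slides don't say how their bounds were computed):

x = sort(rnorm(1000))                  # a Monte Carlo sample, sorted
n = length(x)
eps = sqrt(log(2 / 0.05) / (2 * n))    # half-width of a 95% band around the EDF
Fhat = (1:n) / n                       # empirical CDF at the sample points
lower = pmax(Fhat - eps, 0)
upper = pmin(Fhat + eps, 1)
plot(x, Fhat, type = "s"); lines(x, lower, lty = 2); lines(x, upper, lty = 2)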
Register your attendance for today

1. Timetables app or timetables.liverpool.ac.uk

2. Click on today’s session for ENGG404

3. Enter 213653
Do it now
The code’s valid only until 17:30 today
What you should know from today

• How to create variables in sra.r with no, perfect, or opposite dependence
• But that dependence is much more general, and can be nonlinear
• That dependence can affect convolutions, especially in the tails
• How to detect intervariable dependence with plot() and cor()
• Probability has multiple interpretations, but a single calculus
• What an 'event' is in probability theory
• What an 'outcome', 'sample space', 'trial', and 'random experiment' are
• Many notational conventions (which sometimes conflict)

For next time

• Do homework at https://forms.gle/F1KW93Acp9MQr2hk6

• Take a look at Fault Tree Handbook
