Section 7a Reliability Notes
Risk Analysis
Introduction and Overview
Thomas A. Mazzuchi
Professor and Chairman
Department of Engineering
Management and Systems Engineering
George Washington University
Risk
-A measure of potential loss due to natural or
human activities
-A combination of the probability or frequency of
the hazard and its consequence; e.g.,
Loss
-Adverse consequences of such activities that
affect
Human life or health
Economics or property
The natural environment
Information, etc.
Risk Analysis
-Is the process of characterizing, managing, and
informing others about the existence, nature,
magnitude, prevalence, contributing factors, and
uncertainties that pertain to the potential losses
-Other names for risk analysis
Probabilistic Risk Analysis (PRA)
Quantitative Risk Analysis (QRA)
Probabilistic Safety Analysis (PSA)
[Figure: the three components of risk analysis: Risk Assessment, Risk Management, and Risk Communication]
Risk Assessment
-The process by which the probability or frequency of
loss by or to an engineering system is assessed,
and the magnitude of the loss (consequences)
estimated
Risk Management
-The process by which the potential (probability or
frequency) for loss and/or the magnitude of loss is
minimized and controlled
Risk Communication
-The process by which information about the nature
and consequences of risk, as well as the risk
assessment approach and the risk management
options, are shared and discussed among decision
makers and other stakeholders
Risk Assessment
Modifications
-$S_i$ may occur with a given probability $P_i$ or frequency $f_i$
-Its occurrence may be static or dynamic over time
-$P_i$ and $C_i$ may be uncertain and have probability distributions
-These distributions may be a function of time or $S_i$ or a combination of the two
-These quantities may be jointly distributed
Overview
Societies of Interest
American Society of Mechanical Engineers
Safety Engineering and Risk Analysis Division
American Society of Safety Engineers
American Statistical Association, Section on Risk Analysis
IEEE Reliability Society
International Association for Probabilistic Safety Assessment and Management
Risk Assessment and Policy Association
Risk Theory Society
Society for Maintenance Reliability Professionals
Society for Reliability Engineers
Society for Risk Analysis
System Safety Society
The Safety and Reliability Society
Introduction
Risk Matrix
A table that has several categories of probability, likelihood, or frequency on its rows (or columns) and several categories of severity, impact, or consequence on its columns (or rows).
It associates a recommended level of risk, urgency, priority, or management action with each row-column pair (i.e., cell).
Qualitative Risk Assessment
NASA Risk Management Reporting
Problems with Risk Matrices and Matrix Design: Cox (2008)
If Risk = probability * consequence
[Figure: risk plotted over consequence and probability, and curves p*c = constant in the probability-consequence plane]
Subjective Interpretations and Input Bias: Smith et al. (2009)
[Figure: subjective value plotted against objective likelihood]
Extension of Cox for Opt. 5x5 Matrix Design: Hong and Mazzuchi (2013)
Uncertainty Distribution for Portfolios of Risks: Mazzuchi and Scolese (2014)
[Figure: plot with axes c and p]
Quantitative Risk Analysis
Scenario Analysis
Fault Trees
The Basics of Fault Trees
-A fault tree develops a deterministic description of
the occurrence of the top event, in terms of the
occurrence or not of intermediate events
Top events represent system-level failure
-Describes intermediate events further until, at a
finer level of detail, basic events are obtained
Basic events represent component-level failure
-By itself, a fault tree is only a visual model of how a
system failure can occur
1. Identify undesirable TOP event
2. Identify first level contributors
3. Link contributors to TOP event by logic gates
4. Identify second level contributors
5. Link second level contributors to TOP event by logic gates
[Figure: fault tree symbols for Basic Event and Intermediate Event]
[Figure: pipe with isolation valve (VAL), leak, and permanent ignition source (I2)]
Gas is flowing through a pipe and there is a leak after the isolation valve. This valve should close, but then the pressure relief valve must open to relieve local pressure.
Fault Tree Example 2
Example with Success Event
[Figure: fault tree with events "VAL Performs Correctly", "PRV Fails", "VAL Fails", and "I1 Present"]
Fault Tree Example 3
Large Example
[Figure: water delivery system with tank T1, valves V1 to V5, pumps P1 and P2, and C]
[Figure: fault tree with events including: No Water Delivered from V1; V2, V3, V4, and V5 Fails to Remain Open; No Water from P1; No Water from P2; S Fails to Send Signal; AC Fails; P1 Fails to Function; P2 Fails to Function; T1 Ruptures; V1 Fails to Remain Open; transfer symbols a and b]
[Figure: Alternative Construction with inputs T1 Ruptures, V1 Fails to Remain Open, P1 Branch Fails, and P2 Branch Fails]
Fault Tree Example 4
Circuit Block Diagram Example
[Figure: circuit block diagram with points A to F and units 1 to 7]
[Figure: fault tree with events No Current at Point E, No Current at Point D, No Current at Point C, Units 5 & 6 Fail, No Current at Point B, Units 3 & 4 Fail, No Current at Pnt A, and Unit 1 to Unit 6 Fails]
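[Figure: event tree over subsystems A, B, C, D, and E with six sequences, two ending in Success (S) and four in Failure (F); the sequences are mutually exclusive events and the outcome depends on the sequence of events]
Let $A$ denote that subsystem A fails and $\bar{A}$ denote that it does not fail.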
[Figure: flood protection system with float switch S, pump P, klaxon K, and bailing B]
A subgrade compartment containing important control equipment is protected from flooding using the above system. If the water rises, it should close the float switch, which operates a pump with a separate power supply. A klaxon should also sound and alert operators to perform bailing.
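Event tree headings: Water Rises (I), Float Switch (S), Pump (P), Klaxon (K), Bailing (B), System Logic, System Results
(a letter denotes failure of that element, a bar denotes that it does not fail)
$I\bar{S}\bar{P}$: S
$I\bar{S}P\bar{K}\bar{B}$: S
$I\bar{S}P\bar{K}B$: F
$I\bar{S}PK$: F
$IS$: F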
[Figure: event tree for an attempted illegal access by a hacker (I), with detection of the abnormal signal by the firewall (F), operator response (O), and backup system (B), plus System Logic and System Results columns]
$I\bar{F}$: S
$IF\bar{O}\bar{B}$: S
$IF\bar{O}B$: F
$IFO$: F
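Important Laws
Commutative: $XY = YX$, $X+Y = Y+X$
Associative: $X(YZ) = (XY)Z$, $X+(Y+Z) = (X+Y)+Z$
Distributive: $X(Y+Z) = XY+XZ$
Idempotent: $XX = X$, $X+X = X$
Absorption: $X+XY = X$
Complementation: $X+\bar{X} = \Omega$, $\overline{(\bar{X})} = X$
De Morgan's: $\overline{XY} = \bar{X}+\bar{Y}$, $\overline{X+Y} = \bar{X}\,\bar{Y}$
Empty/Universal Set: $\bar{\emptyset} = \Omega$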
Reducing a Fault Tree Using Boolean Algebra
$T = E_1E_2 = (A+E_3)(C+E_4)$
$= AC + AE_4 + CE_3 + E_3E_4$
$= AC + A(AB) + C(B+C) + (B+C)(AB)$
$= AC + AAB + CB + CC + BAB + CAB$
$= AC + AB + BC + C + AB + ABC$
$= AC + AB + BC + C + ABC$
$= AC + AB + C + ABC$
$= AB + C + ABC$
$= AB + C$
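The reduction above can be checked by brute force over all eight truth assignments. The sketch below is only an illustration: it assumes E3 = B + C and E4 = AB, as implied by the substitution step in the derivation, and the function names are hypothetical.

```python
from itertools import product


def top_event_original(a, b, c):
    """T = E1 E2 with E1 = A + E3 and E2 = C + E4 (unreduced form)."""
    e3 = b or c        # assumed from the substitution step: E3 = B + C
    e4 = a and b       # assumed from the substitution step: E4 = AB
    e1 = a or e3
    e2 = c or e4
    return e1 and e2


def top_event_reduced(a, b, c):
    """T = AB + C (reduced form)."""
    return (a and b) or c


def reductions_agree():
    """Compare both forms on every combination of A, B, C."""
    return all(
        top_event_original(a, b, c) == top_event_reduced(a, b, c)
        for a, b, c in product([False, True], repeat=3)
    )
```

Calling `reductions_agree()` returns `True` when the two expressions describe the same top event.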
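Truth table for T = 1-(1-A*B)*(1-C):
A B C  formula            T
0 0 0  =1-(1-0*0)(1-0)    0
1 0 0  =1-(1-1*0)(1-0)    0
0 1 0  =1-(1-0*1)(1-0)    0
1 1 0  =1-(1-1*1)(1-0)    1
0 0 1  =1-(1-0*0)(1-1)    1
1 0 1  =1-(1-1*0)(1-1)    1
0 1 1  =1-(1-0*1)(1-1)    1
1 1 1  =1-(1-1*1)(1-1)    1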
Note that if all elements of {A,B} occur or all elements of
{C} occur then the top event occurs
These are called Cut Sets
Representing Systems in
Terms of Their Components
Truth Tables in Excel
T=AB + C = 1-(1-A*B)*(1-C)
$T = 1 - (1-C_1)(1-C_2)\cdots(1-C_m)$
$= 1 - \left(1-\prod_{j=1}^{n_1} X_{1j}\right)\left(1-\prod_{j=1}^{n_2} X_{2j}\right)\cdots\left(1-\prod_{j=1}^{n_m} X_{mj}\right)$
Example
Consider the following Fault Tree
[Figure: fault tree with top event [(D+E)B][BC+A] and intermediate events (D+E)B, BC+A, D+E, and BC]
T = [(D + E) B] [(B C) + A]
T = (BD + BE) [(BC) + A]
T = (BDBC) + (BEBC) + (BDA) + (BEA)
T = BCD + BCE + ABD + ABE
The minimal cut sets of the top event are thus
C1 = {B, C, D}
C2 = {B, C, E}
C3 = {A, B, D}
C4 = {A, B, E}
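The same cut sets can be obtained mechanically. The following is a minimal sketch, not the method used in the slides: it represents each product term as a set of basic events, so the idempotent law is applied automatically by set union, and the absorption law is applied as a final filtering step. All function names are hypothetical.

```python
from itertools import product


def basic(name):
    """A basic event as a one-term sum of products."""
    return {frozenset([name])}


def or_gate(left, right):
    """OR of two sums of products: pool the product terms."""
    return set(left) | set(right)


def and_gate(left, right):
    """AND of two sums of products: multiply out term by term."""
    return {l | r for l, r in product(left, right)}


def minimize(cut_sets):
    """Absorption: drop any cut set that strictly contains another one."""
    return {c for c in cut_sets if not any(other < c for other in cut_sets)}


A, B, C, D, E = (basic(n) for n in "ABCDE")
left = and_gate(or_gate(D, E), B)      # (D + E) B
right = or_gate(and_gate(B, C), A)     # BC + A
minimal_cut_sets = minimize(and_gate(left, right))
```

For this tree `minimal_cut_sets` contains the four sets {B, C, D}, {B, C, E}, {A, B, D}, and {A, B, E}.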
Example
Thus, with A = 1 if component A fails and 0 otherwise, and likewise for B, C, D, and E, we can write
$T = 1-(1-BCD)(1-BCE)(1-ABD)(1-ABE)$
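[Figure: step-by-step reduction of a block diagram with components X1 to X8]
Step 1: X2*X3 and X4 combine to 1-(1-X2*X3)*(1-X4); X6, X7, and X8 combine to X6*X7*X8
Step 2: the result combines with X5 to [1-(1-X2*X3)*(1-X4)]X5
Step 3: the three remaining blocks combine to 1-(1-X1)*(1-[1-(1-X2*X3)*(1-X4)]X5)*(1-X6*X7*X8)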
System Indicator
$= 1-(1-X_1)\left(1-\left(1-(1-X_2X_3)(1-X_4)\right)X_5\right)(1-X_6X_7X_8)$
$= 1-(1-X_1)(1-X_2X_3X_5-X_4X_5+X_2X_3X_4X_5)(1-X_6X_7X_8)$
$= 1-(1-X_1)(1-X_2X_3X_5)(1-X_4X_5)(1-X_6X_7X_8)$
since for binary variables $(X_5)^2 = X_5$
This is called the min cut representation (no $X_i^n$ terms)
Boolean Representation for General Systems
Non series-parallel structures
[Figure: bridge structure with components 1 to 5]
$Z = 1-(1-X_1X_2)(1-X_1X_3X_5)(1-X_4X_5)(1-X_2X_3X_4)$
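One way to turn the indicator Z into a system failure probability is to sum the probability of every component state in which Z = 1. The sketch below assumes independent components and reads each X_i as 1 when component i has failed; the failure probabilities passed in are hypothetical inputs, not values from the slides.

```python
from itertools import product

# Min cut sets read off the factors of Z for the bridge structure.
MIN_CUT_SETS = [(1, 2), (1, 3, 5), (4, 5), (2, 3, 4)]


def system_indicator(x):
    """Z = 1 - prod over cut sets of (1 - prod of X_i), x maps component -> 0 or 1."""
    z_complement = 1
    for cut in MIN_CUT_SETS:
        all_failed = 1
        for i in cut:
            all_failed *= x[i]
        z_complement *= 1 - all_failed
    return 1 - z_complement


def failure_probability(p):
    """Exact Pr{Z = 1} by enumerating all 2^5 states (independence assumed)."""
    total = 0.0
    for states in product([0, 1], repeat=5):
        x = dict(zip(range(1, 6), states))
        weight = 1.0
        for i, failed in x.items():
            weight *= p[i] if failed else 1 - p[i]
        total += weight * system_indicator(x)
    return total
```

Enumeration avoids having to expand the product and remove powers of X_i by hand, at the cost of 2^n evaluations, so it only suits small systems.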
As structures get more complex this becomes difficult and we may have to resort to a Fault Tree
[Figure: flow network from in to out through components A, B, C, D, E, F, G, and H]
[Figure: fault tree for the top event No Flow to Out, with intermediate events No Flow to H, No Flow to F, No Flow to G, No Flow to D, No Flow to E, and No Flow From in; the note "We will discount this in our analysis" appears beside the No Flow From in events]
[Figure: the same tree labeled with Boolean expressions, from B+D and B+E through A(B+D)+F and C(B+E)+G up to the top event [A(B+D)+F][C(B+E)+G]+H]
$\text{Failure} = [A(B+D)+F][C(B+E)+G]+H$
$= [AB + AD + F][CB + CE + G] + H$
$= ABBC + ABCE + ABG + ADBC + ADCE + ADG + FBC + FCE + FG + H$
$= ABC + ABCE + ABG + ABCD + ACDE + ADG + BCF + CEF + FG + H$
$= ABC + ABG + ACDE + ADG + BCF + CEF + FG + H$
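[Figure: event tree over A, B, and C with outcomes labeled Scenario 2, Scenario 3, and Scenario 4]
Assume split fractions are calculated using fault trees
A=b+cd B=c+e C=bd
[Figure: fault trees for A, B, and C over basic events b, c, d, and e, with intermediate gate G1]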
[Figure: binary decision trees branching on B and then C, with 0 and 1 branches and leaf values 0 or 1; in the last tree the C branches carry probabilities .9 and .1]
Calculate probability of top event by replacing the
states with their probabilities, and folding back the tree
For example, 0.19 = 0.1 * 0.9 + 1 * 0.1
Putting it All Together
Example
Consider the event tree and fault trees below:
[Figure: event tree with headings I, B, and A]
Solution
a) The Boolean equations representing each of the event tree scenarios in terms of the fault tree basic events (C1, C2, C3) are:
Scenario 1:
Scenario 2:
Scenario 3:
$7.95\times10^{-6} + 6.00\times10^{-6}$
Advanced Probability
Analysis
Probability of System Failure:
Law of Total Probability
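[Figure: block diagram with components 1, 2, and 3]
Z=1-(1-X1*X2)*(1-X3)
Notation
We use the event $C_i$ ($S$) to denote that component i (the system) fails and $\bar{C}_i$ ($\bar{S}$) that it does not.
[Figure: bridge structure with components 1 to 5]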
=Pr(CS12Z}/Pr{Z}
= Pr(CS12}/Pr{Z}
Importance Measures
Motivation
A key challenge in a PRA is to identify the elements
in the system that contribute most to the risk
The method used to accomplish this is importance ranking
The many importance measures used for this
process can be categorized as either
Absolute
Defines each risk element in terms of an
absolute risk metric, such as the conditional
frequency of a hazard exposure given the
state of the element; or
Relative
Compares risk contribution of each element to
that of another
Formulation
Risk is usually composed of a collection of
scenarios that occur with a certain frequency or
probability
A series of cut sets can represent these scenarios
Wall, et al. (2001), represent total risk by a linear
function of any single risk element:
R = aP + b
R = aP + b
where
R: total System Risk
a: total contribution from cut sets that involve a
particular element
P: total risk contribution from a particular element
b: total contribution from cut sets that do not
involve a particular element
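$I_B = a$, $R_{P=1} - R_{P=0}$
$I_{FV} = aP/(aP+b)$, $(R_{base} - R_{P=0})/R_{base}$
$I_C = aP/(aP+b)$, $(R_{base} - R_{P=0})/R_{base}$
$I_I = aP$, $R_{base} - R_{P=0}$
$I_{RRW} = aP$, $R_{base} - R_{P=0}$ (differential method)
$I_{RRW} = (aP+b)/b$, $R_{base}/R_{P=0}$ (fraction method)
$I_{RAW} = a(1-P)$, $R_{P=1} - R_{base}$ (differential method)
$I_{RAW} = (a+b)/(aP+b)$, $R_{P=1}/R_{base}$ (fraction method)
$DIM_1$, $(\partial R/\partial P_i)/\sum_{j=1}^{n} \partial R/\partial P_j$
$DIM_2$, $a_iP_i/\sum_{j=1}^{n} a_jP_j$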
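Because every measure above is a simple function of the linear form R = aP + b, they can all be computed from three risk values: the base risk, the risk with P = 1, and the risk with P = 0. The sketch below is illustrative only; the dictionary keys are descriptive names chosen here (Birnbaum, Fussell-Vesely, risk reduction worth, risk achievement worth), and the numeric inputs in the usage note are hypothetical.

```python
def importance_measures(a, P, b):
    """Importance measures for one element under the linear model R = aP + b."""
    r_base = a * P + b   # base-case risk
    r_p1 = a + b         # risk when the element is certain to fail (P = 1)
    r_p0 = b             # risk when the element never fails (P = 0)
    return {
        "birnbaum": r_p1 - r_p0,                     # equals a
        "fussell_vesely": (r_base - r_p0) / r_base,  # equals aP / (aP + b)
        "rrw_difference": r_base - r_p0,             # equals aP
        "rrw_ratio": r_base / r_p0,                  # equals (aP + b) / b
        "raw_difference": r_p1 - r_base,             # equals a(1 - P)
        "raw_ratio": r_p1 / r_base,                  # equals (a + b) / (aP + b)
    }
```

For example, `importance_measures(a=0.02, P=0.1, b=0.003)` gives a Birnbaum importance of about 0.02 and a Fussell-Vesely importance of about 0.4. The ratio forms require b > 0 and aP + b > 0.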
Safety Systems:
k-out-of-n Systems
Consider a system that functions if at least k of its n components function, or equivalently fails if n-k+1 or more components fail
Usually the components are identical (and assumed independent), each with probability of failure p; then the probability of system failure is
$\Pr\{\text{system fails}\} = \sum_{j=n-k+1}^{n} \binom{n}{j} p^j (1-p)^{n-j}$
Why?
2-out-of-3 System
[Figure: 2-out-of-3 system with components 1, 2, and 3]
Min Cut Sets {1,2}, {1,3}, {2,3}
Prob of Failure: $3p^2(1-p) + p^3$
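The failure probability of a k-out-of-n system with identical components is a binomial tail sum. The sketch below assumes the components fail independently with the same probability p; the function name is hypothetical.

```python
from math import comb


def k_out_of_n_failure_probability(k, n, p):
    """Pr{system fails} when the system needs at least k of n components working.

    The system fails when n - k + 1 or more components fail; components are
    assumed independent and identical with failure probability p.
    """
    min_failures = n - k + 1
    return sum(
        comb(n, j) * p**j * (1 - p) ** (n - j)
        for j in range(min_failures, n + 1)
    )
```

For the 2-out-of-3 system with p = 0.1 this evaluates 3(0.1)^2(0.9) + (0.1)^3, that is 0.028, which matches the three two-component min cut sets plus the case where all three components fail.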
Random Variables
Probability Distribution
$f(x) = \Pr\{X = x\}$ for X discrete (called pmf)
$f(x)\,dx \approx \Pr\{x < X < x+dx\}$ for X continuous (called pdf)
Cumulative Distribution Function:
$F(x) = P(X \le x) = \sum_{i \le x} f(i)$ for X discrete
$= \int_0^x f(u)\,du$ for X continuous
Reliability (Survival) Function
$R(x) = P(X > x) = 1 - F(x)$
[$\bar{F}(x)$ or $S(x)$ is often used in place of $R(x)$]
Important Functions for Random Variables
Failure Rate Function (Continuous rv Only)
$h(x) = \lim_{dx \to 0} P(x < X \le x+dx \mid X > x)/dx$
$h(x)\,dx \approx P(x < X \le x+dx \mid X > x)$
Denotes the instantaneous probability of failure
Note: i. life lengths are said to follow a bathtub failure rate with three phases: infant mortality, chance failure, and wear out
ii. if h(x) is nondecreasing, constant, or nonincreasing, we say that X is IFR, CFR, or DFR, for Increasing, Constant, or Decreasing Failure Rate
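From the definition, the failure rate can be computed as h(x) = f(x)/R(x). The sketch below illustrates the IFR, CFR, and DFR cases using a Weibull lifetime, which is an assumption made here for illustration and is not introduced in the slides; the shape and scale arguments and the evaluation grid are hypothetical.

```python
import math


def weibull_pdf(x, shape, scale):
    """Weibull density f(x)."""
    z = x / scale
    return (shape / scale) * z ** (shape - 1) * math.exp(-(z**shape))


def weibull_reliability(x, shape, scale):
    """Weibull reliability R(x) = P(X > x)."""
    return math.exp(-((x / scale) ** shape))


def hazard(x, shape, scale):
    """Failure rate h(x) = f(x) / R(x)."""
    return weibull_pdf(x, shape, scale) / weibull_reliability(x, shape, scale)


def classify(shape, scale, grid=(0.5, 1.0, 1.5, 2.0)):
    """Label the failure rate on a few grid points (a spot check, not a proof)."""
    rates = [hazard(x, shape, scale) for x in grid]
    if all(math.isclose(r, rates[0]) for r in rates):
        return "CFR"
    if all(later >= earlier for earlier, later in zip(rates, rates[1:])):
        return "IFR"
    if all(later <= earlier for earlier, later in zip(rates, rates[1:])):
        return "DFR"
    return "neither"
```

With shape 1 the Weibull reduces to the exponential and the failure rate is constant; shapes above 1 give an increasing rate and shapes below 1 a decreasing rate.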
Distributions
When a distribution f(x) can be indexed by a set of parameters, say $\theta$, whose specification completely determines the distribution, we say that $f(x|\theta)$ is a parametric family.
Important Properties
Failure Rate Behavior
Distribution of Minimums (for series systems)
$T_S = \min\{T_1, \ldots, T_n\}$
Distribution of Sums (for cold backup or switching systems)
$T_S = T_1 + \cdots + T_n$
Which Parametric Family to Use?
Look at the data histogram
Use of Parametric Families:
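$T_S = \min\{T_1,\ldots,T_n\}$: if $T_i \sim \text{Exp}(\lambda_i)$ then $T_S \sim \text{Exp}(\sum_{i=1}^{n}\lambda_i)$
$T_S = \max\{T_1,\ldots,T_n\}$: no distribution for $T_i$ leads to a known-form distribution for $T_S$
$T_S = T_1+\cdots+T_n$: if $T_i \sim \text{gamma}(\alpha_i,\beta)$ then $T_S \sim \text{gamma}(\sum_{i=1}^{n}\alpha_i,\beta)$; if $T_i \sim \text{normal}(\mu_i,\sigma_i^2)$ then $T_S \sim \text{normal}(\sum_{i=1}^{n}\mu_i,\sum_{i=1}^{n}\sigma_i^2)$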
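The result for the minimum of exponential lifetimes follows because a series system survives past t only if every component does, so the component reliabilities multiply. A minimal numerical check, assuming independent components and hypothetical rate values:

```python
import math


def series_reliability(t, rates):
    """Pr{min(T_1, ..., T_n) > t} as the product of component reliabilities."""
    result = 1.0
    for rate in rates:
        result *= math.exp(-rate * t)   # exponential reliability of one component
    return result


def exponential_reliability(t, rate):
    """Reliability of a single exponential lifetime with the given rate."""
    return math.exp(-rate * t)
```

For rates 0.1, 0.2, and 0.05, `series_reliability(3.0, [0.1, 0.2, 0.05])` agrees with `exponential_reliability(3.0, 0.35)` up to floating-point rounding, which is the statement that the minimum is exponential with the summed rate.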
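[Figure: series system of components 1 to n, $T_S = \min\{T_1,\ldots,T_n\}$]
[Figure: parallel system of components 1 to n, $T_S = \max\{T_1,\ldots,T_n\}$]
[Figure: block diagram with components 1, 2, and 3]
Z=1-(1-X1*X2)*(1-X3)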
Classical Estimation
What is an estimator?
Given an unknown parameter $\theta$ and a random sample $X_1, \ldots, X_n$ from $f(X|\theta)$, what are some estimators for $\theta$?
They are functions of the random sample:
$\hat{\theta}(X) = (1/n)\sum_{i=1}^{n} X_i$,
$\hat{\theta}(X) = \max(X_1, \ldots, X_n)$,
$\hat{\theta}(X) = 3$, ...
An estimator is a random variable with a probability distribution, and an estimate is a realization of that random variable.
What is a good estimator?
Look at its pdf
What is a good estimator?
Unbiasedness: $E[\hat{\theta}] = \theta$
Minimum Variance: $VAR(\hat{\theta})$ as small as possible (there is a Cramér-Rao Lower Bound)
Consistency: $\hat{\theta}_n \to \theta$ as $n \to \infty$
Main Parametric Estimators:
Given a random sample $X_1, \ldots, X_n$ from $f(X|\theta)$, with unknown parameter(s) $\theta$,
Method of Moments (ok properties but easy to use)
$\hat{\theta}(X)$ is obtained as the solution to
1. $E[X|\theta] = (1/n)\sum_{i=1}^{n} X_i$ ($\theta$ has dimension one)
2. $E[X|\theta] = (1/n)\sum_{i=1}^{n} X_i$, $VAR[X|\theta] = S^2$ ($\theta$ has dimension two)
more equations for higher dimensions
Example
Exponential: $E[X] = 1/\lambda$, so $\hat{\lambda} = 1/\bar{x}$
Gamma: $E[X] = \alpha/\beta$, $VAR[X] = \alpha/\beta^2$, so $\hat{\alpha} = \bar{x}^2/S^2$, $\hat{\beta} = \bar{x}/S^2$
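The two method-of-moments examples translate directly into code. The sketch below is a minimal illustration: it assumes S^2 is the sample variance with divisor n - 1 (the slides do not say which divisor is meant), and the sample in the usage note is made up.

```python
from statistics import mean, variance


def exponential_method_of_moments(sample):
    """Solve E[X] = 1/lambda = x-bar for lambda."""
    return 1 / mean(sample)


def gamma_method_of_moments(sample):
    """Solve E[X] = alpha/beta and VAR[X] = alpha/beta^2 for (alpha, beta)."""
    x_bar = mean(sample)
    s2 = variance(sample)     # assumption: sample variance with n - 1 divisor
    alpha_hat = x_bar**2 / s2
    beta_hat = x_bar / s2
    return alpha_hat, beta_hat
```

For the made-up sample `[2.0, 4.0, 6.0]` the sample mean and variance are both 4, so the gamma estimates are alpha = 4 and beta = 1, and the exponential estimate is 0.25.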
Method of Least Squares
$\hat{\theta}(X)$ is obtained as the solution to
$\min_{\theta} \sum_{i=1}^{n} \{F(X_{(i)}|\theta) - i/n\}^2$, where $X_{(i)}$ is the ith smallest $X_i$ value and F is a particular parametric family
[Figure: selected $F(x|\theta)$]
Method of Maximum Likelihood (Best Properties)
$\hat{\theta}(X)$ is obtained as that which maximizes the likelihood function, a function essentially describing the probability of observing what was observed
Formulating the Likelihood Function - Censoring
Right Censored Samples: A life test with n items that stops after time t*. If r failures are observed, let the observed failure times be denoted $X^{(r)} = X_{(1)}, \ldots, X_{(r)}$; in addition we know $X_{(i)} > t^*$ for $i > r$.
$L(\theta|X^{(r)},t^*) = \{\prod_{i=1}^{r} f(X_{(i)}|\theta)\}\,R(t^*|\theta)^{n-r}$
[Figure: timeline from 0 to t* showing observed failures (X) and right-censored items]
Left Censored Samples: A life test with n items that begins at t = 0, but we do not get to observe the condition of the items until after time t*. Let r items be observed to be failed at t* and let the observed failure times be denoted $X^{(n-r)} = X_{(r+1)}, \ldots, X_{(n)}$; in addition, we know $X_{(i)} \le t^*$ for $i \le r$.
$L(\theta|X) = \{\prod_{i=r+1}^{n} f(X_{(i)}|\theta)\}\,F(t^*|\theta)^{r}$
[Figure: timeline from 0 to t* showing left-censored items and observed failures (X)]
Interval Censored Samples: A life test with n items begins at time t = 0, but observation of the state of the items (failed or surviving) is only at fixed time points $0 = t_0 < t_1 < \cdots < t_k < t_{k+1} = \infty$. The test is stopped at $t_k$. Let $X_i$, $i = 1, \ldots, k$ denote the number of items observed failed in $[t_{i-1}, t_i]$; $X_{k+1}$ is the number still surviving at $t_k$.
$L(\theta|X) \propto \prod_{i=1}^{k+1} [F(t_i|\theta) - F(t_{i-1}|\theta)]^{X_i}$
[Figure: timeline with inspection times 0, t1, t2, t3, t4 and interval-censored observations]
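Or any mixture
[Figure: timeline with inspection times 0, t1, t2, t3, t4 showing a mixture of exact and censored observations]
$L(\theta|\text{Data}) \propto [F(t_2|\theta)]\,[F(t_2|\theta)-F(t_1|\theta)]\,R(t_3|\theta)\,R(t_4|\theta)\,f(t_1|\theta)$
Usually for numeric reasons we take the natural log and maximize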
Example: Consider the following failure time data from an exponential distribution
$t_1=5$, $t_2=12$, $t_3=26$, $t_4>10$, $t_5>17$, $t_6<4$, $t_7\in[5,10]$, $t_8\in[5,10]$, $t_9\in[11,16]$, $t_{10}\in[20,30]$
$L = f(5)\,f(12)\,f(26)\,R(10)\,R(17)\,F(4)\,[F(10)-F(5)]^2\,[F(16)-F(11)]\,[F(30)-F(20)]$
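The likelihood for this data set can be maximized numerically on the log scale. The sketch below is one possible approach, not the one shown in the slides: it assumes the exponential is parameterized by a rate lam, so that f(t) = lam*exp(-lam*t), and it uses a hand-written golden-section search with a hypothetical search interval of [1e-4, 1].

```python
import math


def log_likelihood(lam):
    """Log of L for the mixed exact and censored data, exponential rate lam."""
    f = lambda t: lam * math.exp(-lam * t)   # density, exact failure times
    R = lambda t: math.exp(-lam * t)         # reliability, right-censored times
    F = lambda t: 1.0 - R(t)                 # cdf, left- and interval-censored times
    terms = [
        f(5), f(12), f(26),                  # t1, t2, t3 observed exactly
        R(10), R(17),                        # t4 > 10, t5 > 17
        F(4),                                # t6 < 4
        F(10) - F(5), F(10) - F(5),          # t7, t8 in [5, 10]
        F(16) - F(11),                       # t9 in [11, 16]
        F(30) - F(20),                       # t10 in [20, 30]
    ]
    return sum(math.log(term) for term in terms)


def maximize(func, lo, hi, tol=1e-8):
    """Golden-section search for the maximizer of a unimodal function on [lo, hi]."""
    ratio = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if func(c) > func(d):
            b = d    # maximum lies in [a, d]
        else:
            a = c    # maximum lies in [c, b]
    return (a + b) / 2


lam_hat = maximize(log_likelihood, 1e-4, 1.0)
```

Each term of this log-likelihood is concave in lam, so the function has a single maximum and the bracketing search is adequate; a library optimizer could be substituted without changing `log_likelihood`.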
Bayes Law
Given an event B and a collection of events $A_1, \ldots, A_n$ which are mutually exclusive ($A_i \cap A_j = \emptyset$ for $i \ne j$) and collectively exhaustive ($\cup_j A_j = \Omega$),
then $P(A_i|B) = P(B|A_i)P(A_i) / \sum_{j=1}^{n} P(B|A_j)P(A_j)$
$\Pr\{X=0\}$
$= \Pr\{X=0|p=.10\}\Pr\{p=.10\} + \Pr\{X=0|p=.05\}\Pr\{p=.05\} + \Pr\{X=0|p=.02\}\Pr\{p=.02\}$
$= (.90)^{10}(.10) + (.95)^{10}(.40) + (.98)^{10}(.50) = .6829$
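The calculation above is the denominator of Bayes Law for a discrete prior on p, so the same quantities also give the posterior. The sketch below assumes X counts failures in 10 independent trials, which is inferred from the exponent in the calculation rather than stated in the text; the names are hypothetical.

```python
# Discrete prior on the failure probability p: value -> prior probability.
PRIOR = {0.10: 0.10, 0.05: 0.40, 0.02: 0.50}
N_TRIALS = 10   # assumption: X is the number of failures in 10 trials


def likelihood_zero_failures(p, n=N_TRIALS):
    """Pr{X = 0 | p} for n independent trials."""
    return (1 - p) ** n


def marginal_zero_failures(prior=PRIOR):
    """Pr{X = 0} by the law of total probability."""
    return sum(likelihood_zero_failures(p) * w for p, w in prior.items())


def posterior_given_zero_failures(prior=PRIOR):
    """Pr{p | X = 0} by Bayes Law."""
    marginal = marginal_zero_failures(prior)
    return {p: likelihood_zero_failures(p) * w / marginal for p, w in prior.items()}
```

`marginal_zero_failures()` reproduces the value .6829 to four decimal places, and the posterior shifts weight toward the smaller values of p, as expected after observing no failures.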
Pr{p=.1|X1=1,X2>3}
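$g(\theta|\tilde{x}) = L(\theta|\tilde{x})\,g(\theta) / \int L(\theta|\tilde{x})\,g(\theta)\,d\theta$
$f(x|\tilde{x}) = \int f(x|\theta)\,g(\theta|\tilde{x})\,d\theta$
This is called the predictive distribution for X after observing $\tilde{x}$.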
[Figure: prior $g(\theta)$ and posterior $g(\theta|\tilde{x})$ plotted against the parameter]
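Prior Selection
[Figure: prior assessment flowchart. Access to Experts? If yes: Access to Computer? If yes, Any Prior; if no, Conjugate Prior. If no access to experts: Access to Data? If yes, Empirical Bayes Prior; a No branch also leaves Access to Data.]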