

Restricted

Report

Reliability Prediction Method for Safety Instrumented Systems
PDS Method Handbook – 2013 Edition

Authors
Stein Hauge
Tony Kråkenes
Per Hokstad
Solfrid Håbrekke
Hui Jin

SINTEF Technology and Society


Safety Research
May 2013
Report
SINTEF Teknologi og samfunn
SINTEF Technology and Society
Address:
Postboks 4760 Sluppen
NO-7465 Trondheim
NORWAY

Telephone: +47 73593000
Telefax: +47 73592896
ts@sintef.no
www.sintef.no
Enterprise /VAT No: NO 948007029 MVA

Reliability Prediction Method for Safety Instrumented Systems
PDS Method Handbook – 2013 Edition

KEYWORDS: Safety Instrumented Systems (SIS); Reliability analysis; SIL calculations; IEC 61508

VERSION: 1.0    DATE: 2013-05-23

AUTHORS: Stein Hauge, Tony Kråkenes, Per Hokstad, Solfrid Håbrekke, Hui Jin

CLIENT(S): Multiclient – PDS Forum
CLIENT'S REF.: Håkon S. Mathisen
PROJECT NO.: 60S051
NUMBER OF PAGES: 93 incl. appendices

ABSTRACT

PDS is a method used to quantify the safety unavailability and loss of production for safety
instrumented systems (SISs). This report presents an updated version of the PDS method.
Among new and updated topics are:

• Calculations for multiple safety systems.


• Slightly updated model for common cause calculations.
• More thorough discussion of different demand mode situations.
• How to incorporate the effect from reduced proof test coverage (PTC) in the
reliability calculations.
• General update of the calculation formulas and examples.

PREPARED BY: Stein Hauge
CHECKED BY: Jørn Vatn
APPROVED BY: Frode Rømo, Research Director

REPORT NO.: SINTEF A24442
ISBN: 978-82-14-05601-3
CLASSIFICATION: Restricted
CLASSIFICATION THIS PAGE: Unrestricted

Document history
VERSION    DATE        VERSION DESCRIPTION
DRAFT 0.1  2013-02-04  Draft version to PDS members for comment.
1.0        2013-05-23  Final version.


PREFACE

The present report is an update of the 2010 edition of the PDS method handbook /6/ and is mainly a result
of work carried out in the research project “Barriers to prevent and limit acute releases to sea”. The authors
would like to thank everyone who has provided us with input and comments to this PDS method
handbook. The work has been funded by the Research Council of Norway and the PDS participants.

Trondheim, May 2013

PDS forum participants in the project period 2010 – 2012:

Oil Companies/Operators/Drilling Companies


• A/S Norske Shell
• BP Norge AS
• ConocoPhillips Norge
• Eni Norge AS
• GDF SUEZ E&P
• Odfjell Drilling & Technology
• Marathon Petroleum Company (Norway) LLC
• Talisman Energy Norge
• Teekay Petrojarl ASA
• Statoil ASA
• TOTAL E&P NORGE AS

Control and Safety System Vendors


• ABB AS
• FMC Kongsberg Subsea AS
• Honeywell AS
• Kongsberg Maritime AS
• Bjørge Safety Systems AS
• Siemens AS
• Simtronics AS

Engineering Companies and Consultants


• Aker Engineering & Technology AS
• Det Norske Veritas AS
• Lilleaker Consulting AS
• Safetec Nordic AS
• Scandpower AS

Governmental bodies
• The Norwegian Maritime Directorate (Observer)
• The Petroleum Safety Authority Norway (Observer)
• The Research Council of Norway (funding)

3 of 93
Reliability Prediction Method for Safety Instrumented Systems
PDS Method Handbook, 2013 Edition

ABSTRACT

PDS is a method used to quantify the safety unavailability and loss of production for safety instrumented
systems (SISs). The method accounts for all types of failure categories; technical, software, human, etc.
This report presents an updated version of the PDS method.

IEC 61508 and IEC 61511 have become important standards for specification, design and operation of
safety instrumented systems in the process industry. The PDS method is in line with the main principles
advocated in these standards, focusing mainly – but not only – on the quantitative aspects of the standards.

4 of 93
Reliability Prediction Method for Safety Instrumented Systems
PDS Method Handbook, 2013 Edition

Table of contents

PREFACE

ABSTRACT

1 INTRODUCTION
1.1 Purpose of the Handbook
1.2 Organisation of the Handbook
1.3 Abbreviations

2 THE NEED FOR RELIABILITY CALCULATIONS
2.1 Why do we Need Reliability Analysis of Safety Instrumented Systems?
2.2 Why PDS?
2.3 Uncertainty in Reliability Analysis

3 RELIABILITY CONCEPTS
3.1 Introduction
3.2 Failure Classification by Cause of Failure
3.3 How to make the reliability calculations more realistic
3.4 Testing and Failure Detection
3.5 Failure Mode Classification and Taxonomy
3.6 Dangerous Undetected Failures - λDU
3.7 Performance Measures for Loss of Safety – Low Demand Systems
3.8 Loss of Production

4 MODELLING OF COMMON CAUSE FAILURES
4.1 The PDS Extension of the Beta-Factor Model – CMooN
4.2 Proposed Values for the CMooN Factors
4.3 Standard β-factor Model Versus PDS Approach
4.4 Modelling of CCF for Components with Non-Identical Characteristics

5 PDS CALCULATION FORMULAS – LOW DEMAND SYSTEMS
5.1 Assumptions and Limitations
5.2 PDS Formulas for Loss of Safety
5.3 How to Model the Quantitative Effect of Imperfect Functional Testing
5.4 Quantification of Spurious Trip Rate (STR)

6 PDS CALCULATION FORMULAS – HIGH DEMAND SYSTEMS
6.1 High and Low Demand Systems
6.2 Loss-of-safety Measures: PFD and PFH examples
6.3 Using PFD or PFH?
6.4 PFH Formulas; Including both Common Cause and Independent Failures

7 CALCULATIONS FOR MULTIPLE SYSTEMS
7.1 Background
7.2 Motivation for Using Correction Factors (CF) in Multiple SIS Calculations
7.3 Correction Factors for Simultaneous Testing
7.4 Correction Factors for Non-Simultaneous Testing
7.5 Concluding remarks

8 QUANTIFICATION EXAMPLE
8.1 System Description – Topside HIPPS function
8.2 Reliability Input Data
8.3 Loss of Safety Assessment – CSU
8.4 Spurious Trip Assessment

9 REFERENCES

APPENDIX A: NOTATION AND ABBREVIATIONS

APPENDIX B: THE CONFIGURATION FACTOR CMooN
B.1 Determining the Configuration Factor CMooN
B.2 Formulas for the Configuration Factor CMooN

APPENDIX C: DETAILED FORMULAS FOR PFD AND DTU

APPENDIX D: MULTIPLE SIS – BACKGROUND AND CALCULATIONS
D.1 Approaches to determining CF in case of simultaneous testing
D.2 The effect of differences in testing

APPENDIX E: PFD VERSUS PFH AND THE EFFECT OF DEMANDS


1 INTRODUCTION

1.1 Purpose of the Handbook


The PDS 1 method is used to quantify the safety unavailability and loss of production for safety
instrumented systems (SISs). The method has been widely used in the Norwegian petroleum industry, but
is also applicable to other business sectors. This handbook provides an updated version of the PDS method.
The objective has been to incorporate development work done in the PDS project during the last years and,
based on input from practitioners and PDS participants, to provide some more in-depth discussion of
selected areas.

The increased use of computer-based safety systems has resulted in functional safety standards like IEC
61508, /1/ and IEC 61511, /2/. IEC 61508 provides a basis for specification, design and operation of SISs
with emphasis on safety activities in each lifecycle phase of the system. For estimating the reliability of a
SIS, the IEC standards describe a number of possible calculation approaches, including analytical formulas,
Boolean approaches such as reliability block diagrams (RBD) and fault tree analysis (FTA), Markov
modelling and Petri Nets (see IEC 61508-6, Annex B). It should be noted that the IEC standards do not
mandate one particular approach or a particular set of formulas, but leave it to the user to choose the most
appropriate approach for quantifying the reliability of a given system or function. For further reading and
details about the different reliability modelling approaches, reference is also made to the new ISO TR
12489, /24/.

The PDS method represents an example of how to implement analytical formulas, and together with the
PDS data handbook, it offers an effective and practical approach towards implementing the quantitative
aspects of the IEC standards. Efforts have been made to give the reader an understanding of how the
formulas are derived, including their applicability and their limitations.

The report is aimed at reliability and safety engineers, as well as management, designers and technical
personnel working with safety instrumented systems.

1.2 Organisation of the Handbook


The report is organised as follows:

• Chapter 2 includes a general discussion on the need for reliability calculations, and why the PDS
calculation method is recommended.
• Chapter 3 discusses the failure classification and the reliability parameters of the updated PDS
method.
• In chapter 4 the modelling of common cause failures is discussed.
• Chapter 5 presents calculation formulas for low demand mode systems.
• In chapter 6 a discussion of high demand versus low demand mode systems is given and formulas
for high demand mode / continuously operating systems are presented.
• Chapter 7 provides a discussion of and formulas for calculating the reliability of multiple layer
safety systems.
• Chapter 8 presents a worked example of quantification.

Appendix A presents a complete list of notation and abbreviations used in the report. In Appendix B the
modelling of common cause failures is discussed in some more detail and in Appendix C slightly more
detailed formulas than those given in chapter 5 are presented. Appendix D provides a description of the
various alternative approaches to determining an appropriate correction factor (CF) for multiple SIS
calculations in the case of simultaneous testing. This appendix also contains a discussion of the effects of
non-simultaneous testing, both regarding different phasing and different length of test intervals. Finally,
Appendix E gives a discussion of the use of PFH versus PFD.

1 PDS is a Norwegian acronym for reliability of safety instrumented systems.


The present report focuses on the safety and reliability aspects of the PDS method, including performance
measures for loss of safety and for production availability. It does not consider maintenance performance
and lifecycle cost calculations.

1.3 Abbreviations
avg - Average
CCF - Common cause failures
CF - Correction factor (for multiple SISs)
CMF - Common mode failures
Crit - Critical (failures)
CSU - Critical safety unavailability
D - Dangerous
DC - Diagnostic coverage
DD - Dangerous detected
DU - Dangerous undetected
DTU - Downtime unavailability
ESD - Emergency shutdown
HIPPS - High integrity pressure protection system
HR - Hazard rate
IEC - International Electrotechnical Commission
LOPA - Layer of protection analysis
MCS - Minimal cut set
MooN - M-out-of-N; 𝑀, 𝑁 = 1,2,3, …
m-oo-n - Representative m-out-of-n structure of a SIS; 𝑚, 𝑛 not necessarily integers
MTTF - Mean time to failure
MTTR - Mean time to restoration
N/A - Not applicable
NOG - Norwegian Oil and Gas Association (former OLF)
NONC - Non-critical (failures)
OREDA - Offshore reliability data
PDS - Norwegian acronym for “reliability of computer based safety systems”
PFD - Probability of failure on demand
PFH - Probability of failure per hour
PSA - Petroleum Safety Authority (Norway)
PSD - Process shutdown
PT - Pressure transmitter
PTC - Proof test coverage
PWV - Production wing valve
RH - Random hardware
RNNP - Project: Risk level in Norwegian petroleum production (www.ptil.no)
S - Safe
SAR - Safety analysis report
SD - Safe detected
SU - Safe undetected
SFF - Safe failure fraction
SIF - Safety instrumented function
SIL - Safety integrity level
SIS - Safety instrumented system
STR - Spurious trip rate
SYST - Systematic (failures)
TIF - Test independent failure


2 THE NEED FOR RELIABILITY CALCULATIONS

2.1 Why do we Need Reliability Analysis of Safety Instrumented Systems?


There is an increasing reliance on safety instrumented systems (SISs) to achieve satisfactory risk levels in
the process industry. Also, in other business sectors such as the public transport industry (air and rail) and
the manufacturing industry, there is a major increase in the use of computer based safety systems.

Fire and gas detection systems, process shutdown systems and emergency shutdown systems are examples
of SISs used to prevent abnormal operating conditions from developing into an accident. Such systems are
thus installed to reduce the process risk associated with health and safety effects, environmental impacts,
loss of property, and business interruption costs, /5/. In the PDS method failure of such systems is referred
to as “loss of safety” or “safety unavailability”.

Addressing safety and reliability in all relevant phases of the safety system life cycle therefore becomes
paramount both with respect to safe as well as commercial operation. It must be verified that all safety
requirements for the SIS are satisfied, and that the risk reduction actually obtained from the SIS is in line
with what is required. Here, the PDS method plays an important role in predicting the risk reduction
obtained from the safety instrumented functions (SIF) that are performed by the SIS.

IEC 61508 and IEC 61511 have become the main standards for design, construction, and operation of SISs
in the process industry. The Norwegian Oil and Gas Association (NOG) has developed a guideline (former
OLF guideline no. 070, /3/) to support the implementation of the two IEC standards. In the regulations
from the Norwegian Petroleum Safety Authority (PSA), /4/, specific references are given to the IEC
standards and the NOG guideline. IEC 61508 allows using different approaches for quantifying loss of
safety. In the NOG guideline, it is recommended to use the PDS method for this purpose.

Although most reliability analyses have been used to gain confidence in the system by assessing the
reliability attributes, it may be even more interesting to use reliability analysis as a means to achieve
reliability, e.g., by design optimisation. It would usually be efficient to employ these techniques in the
design phase of the system, when less costly changes can be made. Proper analytic tools available during
the design process may ensure that an optimal system configuration is installed from the very beginning,
thereby reducing overall system cost.

The operational phase has been given more attention in recent years, and the need for barrier control is
stressed in the PSA regulations, ref. /4/. Further, both the IEC standards and the PSA regulations focus on
the entire life cycle of the safety systems. In the PDS project, guidelines for follow-up of SISs in the
operating phase have been developed (downloadable from the web) and procedures for updating failure
rates and test intervals in the operating phase have been suggested, ref. /7/ and /8/.

2.2 Why PDS?


Uncritical use of quantitative analyses may weaken confidence in the value of performing reliability
analyses, as extremely ‘good’ but highly unrealistic figures can be obtained, depending on the assumptions
and the input data used.

The PDS method is considered to be realistic as it accounts for all major factors affecting reliability during
system operation, such as:

• All major failure categories/causes


• Common cause failures
• Automatic self-tests
• Functional (manual) testing
• Systematic failures


• Complete safety function


• Redundancies and voting logic

The PDS method has been developed in close cooperation with the industry and attempts have been made
to keep the formulas and associated explanations as simple and intuitive as possible without losing required
accuracy. The method may therefore contribute to enhance the use of reliability analysis in the engineering
disciplines, thereby bridging the gap between reliability theory and application.

As stressed in IEC 61508, it is important to be function oriented, and take into account the performance of
the total signal path from the sensors via the control logic and to the actuators. This is a core issue in PDS.

2.3 Uncertainty in Reliability Analysis


It is important to realize that quantification of loss of safety is associated with uncertainty. This means that
the results that we obtain from such analyses are not the true value, but rather a basis for comparing the
reliability of different system designs and for trending reliability performance in the operational phase. An
important objective of quantitative (and qualitative) reliability analyses is to increase the awareness among
system designers, operators, and maintenance personnel on how the system may fail and what the main
contributors to such failures are.

We may relate the uncertainty to:

• The model: To what extent is the model able to capture the most important phenomena of the
system, including its operating conditions? In practice, we often need to balance the two
conflicting interests:
o The model should be sufficiently simple to be handled by available mathematical and
statistical methods, and
o The model should be sufficiently realistic such that the results are of practical relevance.

• Data used in the analysis: To what extent are the data relevant and able to capture the future
performance?
o The use of reliability data is usually based on some assumed statistical model. E.g., the
standard assumption of a constant failure rate may be a simplification for some equipment
types.
o Historical performance is not the same as future performance, even for the same
component. The historical performance is often based on various samples with various
operating conditions and in some cases different properties (such as size, design principle
and so on).
o Data may be incomplete due to few samples, lack of censoring, and not including all types
of failures, for example software related failures.
o There is also uncertainty related to data collection, failure reporting, classification and
interpretation of data.

Sensitivity analyses may be performed to investigate how changes in the model and the assumed data can
influence the calculated loss of safety. The use of sensitivity analyses is common practice in sectors like
the nuclear industry and the aerospace industry, but has so far been given limited attention in the process
industry.


3 RELIABILITY CONCEPTS

3.1 Introduction
This chapter presents the failure classification and the reliability parameters used in the PDS method. The
objective is to give an introduction to the model taxonomy and to explain the relation between the PDS and
the IEC 61508 approach for quantification of loss of safety.

IEC 61508 and IEC 61511 distinguish between four levels of risk reduction, called safety integrity levels
(SIL). To each SIL, the IEC standards assign a target range for loss of safety. To measure loss of safety,
the standards use Probability of Failure on Demand (PFD) for low demand SISs and Probability of Failure
per Hour (PFH) for high demand /continuous operating SISs. This chapter describes some of the main
concepts and principles underlying the formulas for PFD and PFH, and outlines the slight differences
between the PDS approach and the approaches in IEC 61508 and IEC 61511.

3.2 Failure Classification by Cause of Failure


Failures can be categorised according to failure cause and the IEC standards differentiate between random
hardware failure and systematic failure. PDS uses the same classification and gives a somewhat more
detailed breakdown of the systematic failures, as indicated in Figure 1.

[Figure 1 shows a failure classification tree: failures are first split into random hardware failures and
systematic failures. Random hardware failures are aging failures, i.e., random failures due to natural (and
foreseen) stressors. Systematic failures are split into software faults (e.g., programming error, compilation
error, error during software update), hardware related failures (e.g., inadequate specification, inadequate
implementation, design not suited to operational conditions), installation failures (e.g., gas detector cover
left on after commissioning, valve installed in wrong direction, incorrect sensor location), excessive stress
failures (e.g., excessive vibration, unforeseen sand production, too high temperature) and operational
failures (e.g., valve left in wrong position, sensor calibration failure, detector in override mode).]

Figure 1: Possible failure classification by cause of failure.

The following failure categories (causes) are defined:

Random hardware failures are failures resulting from the natural degradation mechanisms of the
component. For these failures it is assumed that the operating conditions are within the design envelope of
the system.


Systematic failures are in PDS defined as failures that can be related to a particular cause other than
natural degradation. Systematic failures are due to errors made during specification, design, operation and
maintenance phases of the lifecycle. Such failures can therefore normally be eliminated by a modification,
either of the design or manufacturing process, the testing and operating procedures, the training of
personnel or changes to procedures and/or work practices.

There are different schemes for splitting between random hardware and systematic failures and for
classifying the systematic failures. Here, a further split into five categories of systematic failures has been
suggested:

• Software faults may be due to programming errors, compilation errors, inadequate testing, unforeseen
application conditions, change of system parameters, etc. Such faults are present from the point where
the incorrect code is developed until the fault is detected either through testing or through improper
operation of the safety function. Software faults can also be introduced during modification to existing
process facilities, e.g., inadequate update of the application software to reflect the revised shutdown
sequences or erroneous setting of a high alarm outside its operational limits.

• Hardware related systematic failures are failures (other than software faults) introduced mainly
during the design phase of the equipment but also during modifications/repairs. It may be a failure
arising from incorrect, incomplete or ambiguous system specification, a failure in the manufacturing
process and/or in the quality assurance of the component. Examples are a valve failing to close due to
insufficient actuator force or a sensor failing to discriminate between true and false demands.

• Installation failures are failures introduced during the last phases prior to operation, i.e., during
installation or commissioning. If detected, such failures are typically removed during the first months
of operation and such failures are therefore often excluded from databases. These failures may
however remain inherent in the system for a long period and can materialise during an actual demand.
Examples are erroneous location of e.g., fire/gas detectors, a valve installed in the wrong direction or
a sensor that has been erroneously calibrated during commissioning.

• Excessive stress failures occur when stresses or conditions beyond the design specification are placed
upon the component. The excessive stresses may be caused either by external causes or by internal
influences from the medium. Examples may be damage to process sensors as a result of excessive
vibration, internal valve erosion caused by unforeseen sand production or plugging of instrument taps
caused by unforeseen low temperatures.

• Operational failures are initiated by human errors during operation/intervention or
testing/maintenance/repair. In the operational phase the variability of tasks and work practices increases,
thereby increasing the possibility of human errors during interaction with the SIS. Such errors are
therefore believed to be an important contributor towards SIS unavailability, which is further
supported by data from sources such as OREDA /13/ and operational reviews performed by SINTEF.
Examples of such interaction failures2 are loops left in override position after completion of
maintenance, a shutdown function set in bypass during start-up due to dynamic process conditions,
erroneous calibration of a level sensor or a process sensor isolation valve left in closed position so that
the instrument does not sense the medium.

Systems and equipment that are designed and operated in accordance with IEC 61508 undergo a formal
work process specifically aimed at minimising systematic failures. The standard also provides a number of
checklists with measures and techniques to avoid and control such failures during the different life cycle
phases. Hence, systematic failures shall be minimised to the extent possible, given that functional safety
management according to IEC 61508 and IEC 61511 is properly implemented.

2 It should be mentioned that in other classification schemes (e.g. in /24/) some of these operational human errors are
defined as random failures rather than systematic ones. However, in PDS we have chosen to classify all human errors
as systematic failures.


In general, systematic failures can give rise to failure of multiple components, i.e., common cause failures.
Random hardware failures, on the other hand, can be denoted independent failures and are assumed not to
result in common cause failures.

It should be noted that some failures may not fit perfectly into the above scheme. E.g., it may sometimes be
difficult to discriminate between an aging failure and a stress failure. Similarly it may be argued that there
is overlap between some of the failure categories. However, for the purpose of illustrating that SIS failures
may have a variety of causes without introducing an overly complex classification scheme, the above
categories are considered sufficiently detailed.

Random hardware failures are sometimes referred to as physical failures whereas systematic failures are
referred to as non-physical failures. A physical failure occurs when a component has degraded to a point of
failure where it is not able to operate and thus needs to be changed or repaired. An example can be a relay
which due to wear out is not able to change position.

A non-physical failure on the other hand, occurs when the component is still able to operate but does not
perform its specified function. An example may be a gas detector not functioning because it is still covered
by plastic due to sand blasting in the area. It should, however, be noted that systematic failures caused by
excessive stresses may result in a physical failure of the component. E.g., unforeseen vibration of a pump
can cause a physical failure of a flow transmitter located on the connected piping. Hence, given the
classification scheme in figure 1, it is not correct to state that all systematic failures are non-physical
failures.

In line with the IEC standards, the PDS method has a strict focus on the entire safety function and
therefore intends to account for all failures that could compromise this function. Some of these failures
may be related to the interface/environment, such as “vibration of nearby pump causing transmitter to
fail”. However, it is part of the PDS philosophy to include or at least to consider the possibility of having
such events since they may contribute towards the unavailability of the safety system.

3.3 How to make the reliability calculations more realistic


Following the introduction of IEC 61508 and the accompanying SIL verification process, it has become an
increasing problem that exaggerated performance claims are made by equipment manufacturers (see e.g.,
/9/ and /10/). Predictive analyses based on seemingly perfect operating conditions often claim failure rates
an order of magnitude or more below what has historically been observed during operation. There may be
several causes for such exaggerated claims of performance, including imprecise definition of equipment- and
analysis boundaries, incorrect failure classification or too optimistic predictions of the diagnostic coverage
factor, /9/. Another important reason seems to be that figures from such predictive analyses frequently
exclude any possible contributions from systematic failures, e.g., failures that in one way or another can be
attributed to operation rather than the equipment itself. From a manufacturer's point of view this is
understandable – why include failures that are not “his responsibility”? On the other hand, the SIS is
installed for the purpose of providing a specified risk reduction, and unrealistic failure rates can
result in far too optimistic predictions.

An important idea behind the PDS method is that the predicted risk reduction, calculated for a safety
instrumented function (SIF) in the design phase, should reflect the actual risk reduction that may be
experienced in the operational phase. In the PDS method we have therefore argued that both the
contribution from random hardware failures as well as systematic failures should, to the degree possible, be
quantified. This approach may appear somewhat different from that of the IEC 61508 standard, which says that only
the contribution from random hardware failures shall be quantified and that reduction and avoidance of
systematic failures shall be treated qualitatively. It should, however, be noted that IEC 61508 actually
quantifies part of the systematic failures through the proposed method for quantifying hardware related
common cause failures (ref. IEC 61508-6, Annex D). The IEC standard also repeatedly states that the
contribution from human errors should be included, although not explicitly saying how this shall be done.


Some main arguments why the contribution from systematic failures and in particular those introduced in
the operational phase, should be included in the reliability estimates are:

• We want our risk reduction predictions (and SIL calculations) to be as realistic as possible;
• Failure to adequately address potential systematic failures can lead to overly optimistic results and
a possible misallocation of resources intended to reduce risk;
• Too optimistic failure rates may result in an inadequate number of protection layers since too high
risk reduction is assumed for each layer;
• Systematic failures are often the dominant contributor towards the overall failure probability (ref.
e.g., failure data dossiers in /11/);
• Failure rates as given in e.g. /11/ and /13/ are based on historic (operational) data and therefore
often include some systematic failures.

In PDS the systematic failures have, in addition to failure cause, been classified in two main categories:

1. Systematic failures detectable during testing. Examples may be a detector left in override mode at the
last test, a miscalibrated transmitter or a valve that will not close due to hydrate formation;

2. Systematic failures not detected during testing but occurring upon a true demand. One example may be
a software error introduced during update of the program logic. Another example can be a valve that
closes during regular testing but due to insufficient actuator force does not close upon a process
demand situation (with high process pressure).

It should finally be pointed out that a thorough understanding of the system, including an analysis of
relevant failure modes and how to detect them, is crucial in order to avoid these failures in the first place.

3.4 Testing and Failure Detection


Testing and subsequent failure detection is vital in order to reveal and remove hidden failures in the safety
system. Mainly, we have three possibilities for failure detection:

• Failure detection by automatic self-tests (including operator observation);


• Failure detection by functional testing (i.e., manual testing);
• Failure detection upon process demands / shutdowns.

3.4.1 Automatic Self-tests


Modules often have built-in automatic (diagnostic) self-test to detect failures. Typical failure modes that
can be detected by diagnostics are signal loss, drifted analogue signal / signal out of range or final element
in wrong position, /5/. Further, upon discrepancy between redundant modules in the safety system, the
system may determine which of the modules has failed. This is considered part of the self-test. But it is
never the case that all failures are detected automatically. The fraction of failures being detected by the
automatic self-test is called the diagnostic (fault) coverage and quantifies the effect of the self-test. Note
that the actual effect on system performance from a failure that is detected by the automatic self-test will
depend on system configuration and what action is taken when the equipment fault is detected. In
particular it is important to consider whether the fault initiates an automatic shutdown action or
alternatively only generates a system alarm which requires an active operator response.

In addition to the automatic self-test, an operator or maintenance crew may detect dangerous failures
incidentally in between tests. For instance, the panel operator may detect a transmitter that has frozen or a
detector that has been left in by-pass. Similarly, when a process segment is isolated for maintenance, the
operator may detect that one of the valves will not close. In previous editions of the handbook, the PDS
method has allowed for incorporating this effect into the diagnostic coverage factor. However, since there
is an increasing trend towards low- or unmanned (or subsea) installations and also to be in line with the


IEC definitions, we now define diagnostic coverage for dangerous failures to only include the effect of
self-test (ref. section 3.6.1 for definition of coverage).

3.4.2 Functional Testing


Functional testing is performed manually at predefined time intervals and aims at testing the components
involved in the execution of a safety instrumented function. In reliability analyses it is often assumed that
functional testing is “perfect” in the sense that it replicates a true demand and thereby detects 100% of the
failures. However, in reality the testing may be imperfect and/or the test conditions may deviate from the
true demand conditions, leaving some parts of the function untested. Some typical examples of test
conditions that may not reveal all failures are 3:

• Partial stroke testing (PST) 4;


• Test buttons on switches, e.g. built-in test facilities - these may or may not reveal all faults;
• Transmitters put into test mode and signals injected (usually with smart / fieldbus transmitters);
• Pressure transmitters tested from manifold, i.e. impulse lines not tested;
• Equipment not tested in normal position.

To cater for the effect of incomplete testing, the probability of so called test independent failures (TIF) can
be added to the PFD. This is further discussed in section 3.7.4.

The fraction of failures detected during functional testing is often referred to as the proof test coverage
(PTC). Partial stroke testing of valves is maybe the best known example where only part of the valve
functionality is tested but not the full stroke. Typically, for such a partial test, the test coverage may be
estimated and applied in the reliability calculations. This is further discussed in section 3.7.5.

3.4.3 Process Demands Serving as Testing


Generally, it has not been standard practice in reliability analyses to model demands as a means for failure
detection. One obvious reason is that a real demand on the safety function cannot be predicted, and
detection of a failure at this point may in any case be too late (especially in single configurations).

There will however be several planned (and unplanned) shutdown events where data related to SIS
performance can be recorded – either manually or automatically in the plant information management
system. Such information may typically include a listing of activated equipment, result of activation and
possible failure modes including response/travel times. Hence, it may be possible to utilise this shutdown
information for testing purposes, thereby potentially reducing the need for manual testing.

Utilising shutdown reports as a means of testing should however be done with great care. It is required that
the data recorded during the shutdown provide information equivalent to that obtained during a functional
test. Further, it is important to identify which functions, or parts of functions, are not activated during the
shutdown and therefore need to be tested separately.

3.5 Failure Mode Classification and Taxonomy


In IEC 61508-4 /1/ the following definitions are given of a dangerous and a safe failure respectively:

Dangerous failure; “failure of an element and/or subsystem and/or system that plays a part in
implementing the safety function that: a) prevents a safety function from operating when required (demand
mode) or causes a safety function to fail (continuous mode) such that the EUC is put into a hazardous or
potentially hazardous state; or b) decreases the probability that the safety function operates correctly
when required”.

Safe failure; “failure of an element and/or subsystem and/or system that plays a part in implementing the
safety function that: a) results in the spurious operation of the safety function to put the EUC (or part
thereof) into a safe state or maintain a safe state; or b) increases the probability of the spurious operation
of the safety function to put the EUC (or part thereof) into a safe state or maintain a safe state”.

3 For more information see: http://www.hse.gov.uk/foi/internalops/hid_circs/technical_general/spc_tech_gen_48.htm
4 Some practitioners have argued that PST is a means to increase the diagnostic coverage. However, since the interval
between partial stroke testing is usually 1-3 months, whereas a diagnostic test should take place more frequently than
once a week (ref. IEC 61508-6, Table D.3), we here consider partial stroke testing as a (partial) functional test.

Furthermore, IEC 61508-4 defines “no effect failure” as a failure of an element that plays a part in
implementing the safety function but has no direct effect on the safety function.

In line with IEC 61508, the PDS method also considers three failure modes: dangerous, safe and non-critical
failures. These failure modes are given the following interpretations (on a component level):

• Dangerous (D). The component does not operate upon a demand (e.g., sensor stuck upon demand
or valve does not close on demand). The Dangerous failures are further split into:
o Dangerous Undetected (DU). Dangerous failures not detected by automatic self-test (i.e.,
revealed only by a functional test or upon a demand);
o Dangerous Detected (DD). Dangerous failures detected by automatic self-test.

• Safe (S). The component may operate without any demand (e.g., sensor provides a shutdown
signal without a true demand - 'false alarm'). The safe failures are further split into:
o Safe Undetected (SU). Safe failures not detected by automatic self-test (or incidentally by
personnel), therefore resulting in spurious operation of the component 5;
o Safe Detected (SD). Potential spurious operation failures detected by automatic self-test
(or incidentally by personnel). Hence, actual trips of the component are avoided.

• Non-critical (NONC). The main functions of the component are not affected. Examples may be
sensor imperfection or a minor leakage of hydraulic oil from an actuator, which has no immediate
impact on the specified safety function. These failures correspond to the no-effect failures as
defined by IEC 61508-4.

The Dangerous and Safe (spurious operation) failures are considered “critical” in the sense that they affect
either of the two main functions, i.e., (1) the ability to shut down on demand or (2) the ability to maintain
production when safe. The Safe failures are usually revealed instantly upon occurrence, whilst the
Dangerous failures are “dormant” and can be detected by testing or upon a true demand.

Note that although a safe failure typically results in the system going to its predefined safe state, such
failures are by no means without consequences. There may be associated production losses, environmental
emissions caused by flaring and also the required process start-up with all of its potential hazards.

It should further be noted that a given failure may be classified as either dangerous or safe depending on
the intended application. E.g., loss of hydraulic supply to a valve actuator operating on-demand will be
dangerous in an energise-to-trip application and safe in a de-energise-to-trip application. Hence, when
performing reliability calculations, the assumptions underlying the applied failure data as well as the
context in which the data shall be used must be carefully considered.

Based on the classification discussed above, the failure rate λ can be split into the following elements:

• 𝜆DD = Rate of dangerous detected failures


• 𝜆DU = Rate of dangerous undetected failures
• 𝜆SD = Rate of safe detected failures

• 𝜆SU = Rate of safe undetected failures
• 𝜆NONC = Rate of non-critical failures (comparable to “no effect failure” in IEC)

5 Depending on system configuration a spurious trip of the SIS may be avoided; e.g., by using a 2oo2 voting.

We also introduce 𝜆crit = 𝜆D + 𝜆S , which is the rate of critical failures; i.e. failures which unless detected
can cause a failure on demand or a spurious trip of the safety function.

In addition we have the total failure rate 𝜆 = 𝜆crit + 𝜆NONC. Table 1 and Figure 2 further illustrate how
𝜆crit and 𝜆 can be split into their various elements.

Table 1: Rate of critical failures, λcrit, split into various elements

             Safe failures   Dangerous failures   Sum
Undetected   𝜆SU             𝜆DU                  -
Detected     𝜆SD             𝜆DD                  -
Sum          𝜆S              𝜆D                   𝜆crit

[Figure 2 shows the failure rate λ split into its elements: λDU (dangerous failures undetected by automatic
self-test), λDD (dangerous failures detected by automatic self-test), λSU (safe failures undetected by
automatic self-test or personnel), λSD (safe failures detected by automatic self-test or personnel) and
λNONC (non-critical failures). The first four elements together constitute λcrit; the elements λDD, λSU and
λSD contribute to the SFF (Safe Failure Fraction).]

Figure 2: Failure rate λ split into various elements
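To make the taxonomy concrete, the split can be written out in a few lines of Python. This is an illustrative
sketch only: the numerical rates are hypothetical placeholders, not values from the PDS data handbook,
whereas the relations between the elements follow the definitions above.

    # Hypothetical failure-rate elements (per hour) - placeholders, not PDS data
    lam_DU = 2.0e-6    # dangerous undetected
    lam_DD = 3.0e-6    # dangerous detected by automatic self-test
    lam_SU = 1.0e-6    # safe undetected (causes spurious operation)
    lam_SD = 1.5e-6    # safe detected (spurious operation avoided)
    lam_NONC = 0.5e-6  # non-critical ("no effect" failures in IEC terms)

    lam_D = lam_DU + lam_DD          # rate of dangerous failures
    lam_S = lam_SU + lam_SD          # rate of safe failures
    lam_crit = lam_D + lam_S         # rate of critical failures
    lam_total = lam_crit + lam_NONC  # total failure rate

    print(f"lambda_crit = {lam_crit:.2e} per hour")   # 7.50e-06
    print(f"lambda      = {lam_total:.2e} per hour")  # 8.00e-06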

3.6 Dangerous Undetected Failures - λDU


As discussed above, the critical failure rate, 𝜆crit is split into dangerous and safe failures which are further
split into detected and undetected failures. When performing safety unavailability calculations, the rate of
dangerous undetected failures, 𝜆DU , is of special importance, since this parameter – together with the
functional test interval – to a large degree governs the prediction of how often a safety function is likely to
fail on demand. As discussed in section 3.2, 𝜆DU will include both random hardware failures as well as
systematic failures. Consequently, it is relevant to think of 𝜆DU as comprising two elements; 𝜆DU−RH which
is the rate of DU random hardware failures (i.e., the strict IEC 61508 definition of 𝜆DU ), and 𝜆DU−SYST ,
being the rate of DU systematic failures, detectable by functional testing. Hence we can write:

𝜆DU = 𝜆DU−RH +𝜆DU−SYST.


Further, in PDS the parameter 𝑟 is defined as being the fraction of 𝜆DU originating from random hardware
failures, i.e., 𝑟 = 𝜆DU−RH /𝜆DU. Then, 1 − 𝑟 will be the fraction of 𝜆DU originating from systematic
failures, i.e., 1 − 𝑟 = 𝜆DU−SYST /𝜆DU .

It must be pointed out that splitting 𝜆DU is not necessary when performing standard reliability calculations.
This is further discussed when the calculation formulas are presented in the next sections. However, when
considering risk reducing measures to reduce the failure rate, it is advantageous to know how the
different failure contributions are distributed.

3.6.1 Coverage Factors and Safe Failure Fraction


IEC 61508 defines the diagnostic coverage (DC) as:

• DC = λDD /λD, i.e., fraction of dangerous failures detected by automatic on-line diagnostic tests.
The fraction of dangerous failures is computed by using the dangerous failure rates associated with
the detected dangerous failures divided by the total rate of dangerous failures

In the IEC definition (of DC) given above, the coverage only includes failures “detected by automatic on-
line diagnostic tests”. As discussed in section 3.4.1, it will, for some equipment and installations where
detected failures can be rectified quickly, also be relevant to include random observation (by control room
operator, field operator or maintenance crew). However, to be in line with the IEC definition we will adopt
the same definition as above for dangerous failures, whereas we keep the PDS definition (from previous
PDS handbooks) for safe failures. We therefore define the coverage, c, for dangerous and safe (spurious
operation) failures as:

• 𝑐D = λDD /λD, i.e., the fraction of dangerous failures detected by automatic self-tests

• 𝑐S = λSD /λS, i.e., the fraction of safe (spurious operation) failures that are detected by automatic
self-tests (or by personnel) so that a spurious operation of the component is avoided

Thus, we see that the coverage for dangerous failures, 𝑐D now is directly comparable to the DC as defined
in IEC 61508. Concerning the coverage factor for safe failures, 𝑐S , this factor is not explicitly defined in
IEC 61508 and its physical interpretation seems to vary among users of the IEC standards. In PDS a safe
detected (SD) failure is interpreted as a failure which is detected prior to a spurious operation of the
component (or spurious trip of the system), whereas a safe undetected failure actually causes a component
trip (but a system trip may be avoided due to the configuration of the system, e.g., 2oo2 voting).

Finally, observe that IEC also introduces the safe failure fraction (SFF) in relation to the requirements for
hardware fault tolerance. This is the fraction of failures that are not critical with respect to safety
unavailability of the safety function. SFF is defined as the ratio of safe failures plus dangerous detected
failures to the total failure rate and can be estimated as:

• SFF = 1 − (𝜆DU /𝜆crit ); or rather in percentage: SFF = [1 − (𝜆DU /𝜆crit )] ∙ 100 %.
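A minimal sketch of these three ratios, reusing the hypothetical rate elements from the sketch following
Figure 2 (the numbers remain illustrative, not handbook data):

    def coverage_and_sff(lam_DU, lam_DD, lam_SU, lam_SD):
        """Return (c_D, c_S, SFF) computed from the critical failure rate elements."""
        lam_D = lam_DU + lam_DD
        lam_S = lam_SU + lam_SD
        lam_crit = lam_D + lam_S
        c_D = lam_DD / lam_D           # coverage for dangerous failures (= DC in IEC 61508)
        c_S = lam_SD / lam_S           # coverage for safe (spurious operation) failures
        sff = 1.0 - lam_DU / lam_crit  # safe failure fraction
        return c_D, c_S, sff

    c_D, c_S, sff = coverage_and_sff(2.0e-6, 3.0e-6, 1.0e-6, 1.5e-6)
    print(f"c_D = {c_D:.0%}, c_S = {c_S:.0%}, SFF = {sff:.0%}")  # c_D = 60%, c_S = 60%, SFF = 73%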

3.7 Performance Measures for Loss of Safety – Low Demand Systems


The measures for loss of safety used in IEC are the average PFD (Probability of Failure on Demand) for
low demand systems and PFH (Probability of Failure per Hour) for high demand systems. This section
presents the various measures for loss of safety used in PDS. All these reflect safety unavailability of the
function, i.e., the probability of a failure on demand. Probability of failure per hour (PFH) is discussed
separately in chapter 6.

3.7.1 Contributions to Loss of Safety


The potential contributors to loss of safety (safety unavailability) have in PDS been split into three main
categories:


• PFD: Unavailability due to dangerous undetected (DU) failures.


• DTU: Unavailability due to known or planned downtime
• PTIF : Unavailability due to TIF failures (test independent failures)

1) Unavailability due to dangerous undetected (DU) failures, i.e., unavailability caused by dangerous
failures that are detectable only during functional testing or upon a demand (not revealed by automatic
self-test). This unavailability, which is often referred to as “unknown”, may be thought of as
comprising two elements:
a) The unavailability due to dangerous undetected random hardware failures (occurring with rate
𝜆DU−RH ).
b) The unavailability due to dangerous undetected systematic failures (occurring with rate
𝜆DU−SYST ).

2) Unavailability due to known or planned downtime. This unavailability is caused by components either
taken out for repair or for testing/maintenance. The downtime unavailability can be split in two main
contributors:
a) The known unavailability due to dangerous (D) failures where the failed component must be
repaired. The average period of unavailability due to these events equals the mean time to
restoration, MTTR, i.e., the time elapsing from when the failure is detected until the situation is restored.
b) The planned (and known) unavailability due to the downtime/inhibition time during functional
testing and/or preventive maintenance.

3) Unavailability due to test independent failures, i.e., unavailability caused by hidden dangerous failures
that are not revealed during functional testing but only upon a true demand. These failures are denoted
Test Independent Failures (TIF), as they are not detected through the functional test or by automatic
self-test, only during a real demand.

Figure 3 illustrates the three categories of contributors to loss of safety.

Downtime Test Independent


Dangerous Undetected unavailability Failure (TIF)

1b) 2b) Out for 3) Failures not


1a) Systematic
failures 2a) testing covered by
Random hardware Out for functional
failures repair testing

PFD DTU PTIF

Figure 3: Loss of safety contributors in PDS

It should be noted that the actual contribution to loss of safety from failures in category 2) will depend
heavily on the operating philosophy, on the configuration of the process plant as well as the configuration
of the SIS itself. Therefore, the downtime unavailability should be treated separately and not together with
category 1) and 3). Often, temporary compensating measures will be introduced while a component is
down for maintenance or repair. Other times, when the component is considered too critical to continue
production (e.g., a critical shutdown valve in single configuration), the production may simply be shut
down during the restoration and testing period. On the other hand there may be test- or repair-situations
where parts of or the whole safety system is bypassed while production is being maintained. An example
may be that selected fire and gas detectors are being inhibited while reconfiguring a node in the fire and
gas system.


Often the downtime unavailability is small compared to the contribution from failures in category 1). That
is, usually MTTR << τ, where τ is the functional test interval. This is, however, not always the case; e.g.,
for subsea equipment in offshore production, the MTTR can be rather long. Category 2b) can often be
considered the least critical, as it represents a truly planned unavailability of the safety system and since
testing and maintenance are often performed during planned shutdown periods.

Below, we discuss separately the loss of safety measures for the three failure categories, and finally an
overall measure for loss of safety is given.

3.7.2 Probability of Failure on Demand (PFD)


In order to quantify the loss of safety due to random hardware failures, IEC uses the term:

PFD = (Average) Probability of Failure on Demand

The PFD is therefore the average probability that the SIS is unable to perform its safety function upon a
demand.

According to the formulas given in IEC 61508, it appears that the PFD includes the failure contributions
from category 1a) as well as from 2a). However, as argued above, it is natural to give the known downtime
unavailability a separate notation. Therefore, in the PDS method the PFD quantifies the loss of safety due
to dangerous undetected failures (with rate λDU ), during the period when it is unknown that the function is
unavailable. The average duration of this period is 𝜏/2 for a single component. If the downtime
unavailability (i.e., category 2 above) is added, this is explicitly stated.

3.7.3 Downtime Unavailability – DTU


This represents the downtime part of the safety unavailability as described in categories 2a) and 2b) above.
The DTU comprises two elements:

• DTUR; i.e., downtime unavailability due to repair of dangerous failures of rate λD, resulting in a
period when it is known that the function is unavailable (i.e., category 2a above). The average
duration of this period is the mean time to restoration (MTTR), i.e., the time from when the failure is
detected until the safety function is restored;

• DTUT; i.e., planned downtime (or inhibition time) resulting from activities such as testing and
planned maintenance (i.e., category 2b above).

Depending on the operational philosophy and the configuration of the process plant and the SIS, it must be
decided whether it is relevant to include only the DTUR , only the DTUT or the entire DTU = DTUR +DTUT
in the overall measure for loss of safety. This is further discussed in chapter 5.

3.7.4 Test Independent Failures - TIF


As discussed in section 3.4.2, it is often assumed in reliability analyses that functional testing is “perfect”
and as such detects 100 % of the failures. In real life this is seldom the case; the test conditions may differ
from the real demand conditions, and some dangerous failures can therefore remain in the SIS after the
functional test. In PDS this is catered for by adding the probability of so-called test independent failures
(TIF) to the PFD.

PTIF = The Probability that the component/system will fail to carry out its intended function due to
a (latent) failure not detectable by functional testing (therefore the name “test independent
failure”)

It should be noted that if an imperfect testing principle is adopted for the functional testing, this will lead to
an increase of the TIF probability. For instance, if a gas detector is tested by introducing a dedicated test


gas to the housing via a special port, the test will not reveal a blockage of the main ports. Another example
is that a pressure transmitter is tested by applying a test pressure directly to the diaphragm, rather than by
raising the pressure in the pipeline or vessel in which the pressure transmitter is installed.

Test independent failures will often be systematic by nature, e.g., a programming error in the application
software that is not revealed since all Cause & Effects are not tested. Some test independent failures may
however be classified as random hardware failures, e.g., wear and tear of a valve stem causing internal
leakage that is not revealed during regular stroke testing.

3.7.5 Proof Test Coverage - PTC


As discussed above, the contribution from non-perfect testing can be modelled by adding the probability of
test independent failures, i.e., the PTIF . Alternatively, the coverage of the proof test can be considered. We
then define the proof test coverage (PTC) as:

PTC = Proof Test Coverage: Fraction of failures detected during functional proof testing (see e.g.,
IEC 61508-6, section B.3.2.5)

The PTC here expresses the fraction of dangerous failures that are actually detected during a functional
test. If the proof test coverage (PTC) is 100 %, the test is “perfect”. If the PTC is less than 100 %, this
reflects that some failure modes are assumed not to be detected during the proof test.

Formulas that model the quantitative effect of reduced test coverage on the PFD as well as a discussion of
when to use PTC versus PTIF are given in section 5.3.

3.7.6 Critical Safety Unavailability (CSU)


In PDS the measure Critical Safety Unavailability (CSU) is used to quantify the loss of safety:

CSU = The probability that the component/system will fail to automatically carry out a successful
safety action on the occurrence of a hazardous/accidental event.

Thus, we have the relation:


CSU = PFD + DTU + PTIF.

As discussed in section 3.7.3, IEC 61508 quantifies the downtime unavailability due to
component restoration time resulting from a dangerous failure (i.e., the DTUR). No equivalent formula for
quantifying the unavailability caused by component downtime during testing and inspection is given in
IEC 61508 (i.e., the DTUT). Therefore, the average probability of failure on demand, as used in the IEC
standards for SIL calculations, is equivalent to the sum of the above defined PFD and the DTUR.

The relationship between the different loss of safety measures used in PDS and IEC is presented in
Figure 4.

[Figure: the CSU comprises PFD, DTUR, DTUT and PTIF; the IEC measure PFDIEC covers PFD + DTUR]

Figure 4: Loss of safety measures used in PDS versus IEC 61508


The contribution from dangerous undetected failures at time 𝑡 since the previous functional test, PFD(𝑡),
together with the contribution from test independent failures (PTIF), is illustrated in Figure 5. The figure
applies for a single component with test interval 𝜏 and shows the variation in failure probability through
the test period. For a single component the time dependent PFD equals
PFD(𝑡) = 𝜆DU ⋅ 𝑡.

In the figure it is assumed that the DTU contribution is zero, i.e., CSU = PFD + PTIF.

[Figure: time dependent CSU(t) = PFD(t) + PTIF for a single component over successive functional test intervals τ, 2τ, 3τ, …; the sawtooth PFD(t) = λDU ⋅ t resets at each test, giving an average PFD of λDU ⋅ τ/2 on top of the constant PTIF contribution; the maximum and average CSU are indicated]

Figure 5: Contributions from PFD and PTIF as a function of time for a single component

Observe that the PFD and the CSU are at their maximum right before a functional test and at their minimum
right after a test. However, when calculating the CSU, we actually calculate the average PFD value as
illustrated in Figure 5. As a consequence, the CSU may on average fulfil a given PFD criterion even though
the CSU at its maximum exceeds the specified criterion.
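To make the distinction concrete, here is a minimal Python sketch (our own illustration; the parameter values are assumptions, not recommended failure data) that computes the average and maximum CSU for a single component, assuming DTU = 0 as in Figure 5:

    # Average vs. maximum CSU for a single (1oo1) component, assuming DTU = 0.
    # Parameter values are illustrative assumptions only.
    lambda_du = 2.0e-6   # dangerous undetected failure rate (per hour)
    tau = 8760.0         # functional test interval (hours, i.e. one year)
    p_tif = 1.0e-4       # probability of test independent failure

    pfd_avg = lambda_du * tau / 2   # average PFD over the test interval
    pfd_max = lambda_du * tau       # PFD(t) just before a functional test

    csu_avg = pfd_avg + p_tif       # CSU = PFD + DTU + PTIF, with DTU = 0
    csu_max = pfd_max + p_tif

    print(f"average CSU = {csu_avg:.2e}, maximum CSU = {csu_max:.2e}")

With these assumed values the average CSU (approximately 8.9 ⋅ 10⁻³) would meet a criterion of PFD < 10⁻², whereas the maximum (approximately 1.8 ⋅ 10⁻²) would not.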

3.8 Loss of Production


The IEC 61508 and related standards focus on loss of safety and PFD calculations. However, there is also a
possibility that the safety systems cause a shutdown of the process when there is no actual demand (i.e.,
a spurious trip). Examples are a gas detector giving an alarm when there is no gas in the area, or a level
transmitter giving a high alarm although the level is actually within the normal range. As discussed in previous
sections, such failures are classified as safe failures, and depending on the system configuration and whether the
failures are detected or not, they may cause a spurious trip of the system and resulting system downtime.
Since loss of production and subsequent start-up situations are unwanted events, it is important to balance
the loss of safety against the rate of spurious trips (loss of production) 6. In the PDS method the measure for
quantifying loss of production is the spurious trip rate:

STR = The expected number of spurious activations of the SIS per time unit

For this measure, the applied time unit is usually per year or per 10⁶ hours.

6) See also the new ISO TR 12489 /24/ for a more thorough discussion of this topic.


In addition there may be loss of production due to repair of dangerous (and safe) failures and also during
testing. Whether this contributes to the downtime unavailability (DTU, which is safety related) or to loss
of production will depend on the operational philosophy during repair and testing. This is further
discussed in section 5.2.3.


4 MODELLING OF COMMON CAUSE FAILURES


When we quantify the reliability of redundant safety systems, it is essential to distinguish between
independent and dependent failures. In general, random hardware failures caused by natural stressors (ref.
Figure 1) are independent failures, i.e., a failure of one component/module is not assumed to influence the
failure frequency of other identical modules in the safety system. On the other hand, most systematic
failures, such as excessive stress related failures, installation failures and operational failures are by nature
potentially dependent failures. Such failures can lead to common cause failures (CCF), i.e., simultaneous
failure of more than one module in the safety system due to a shared cause. Common cause failures may
therefore reduce the effect of redundancy.

4.1 The PDS Extension of the Beta-Factor Model – CMooN


The traditional way of accounting for common cause failures (CCF) has been the 𝛽-factor model. In this
model it is assumed that a certain fraction of the failures (equal to 𝛽) are common cause, i.e., failures that
will cause all the redundant components to fail simultaneously or within a short time period. One problem
with this approach is that for any M-out-of-N (𝑀oo𝑁) voting 7 the rate of common cause failures is the
same, regardless of the configuration. If 𝜆DU is the component failure rate, the MooN voted system has a
common cause failure contribution equal to 𝛽 ∙ 𝜆DU. Hence, this approach does not explicitly distinguish
between different voting logics, and the same result is obtained e.g., for 1oo2, 1oo3 and 2oo3 voted
systems.

In the PDS method, we use an extended version of the 𝛽-factor model that distinguishes between different
types of voting. Here, the rate of common cause failures explicitly depends on the configuration, and the
beta-factor of an MooN voting logic may be expressed as:

𝛽(𝑀oo𝑁) = 𝛽 ∙ C𝑀oo𝑁 ; (𝑀 < 𝑁).

C𝑀oo𝑁 is then a modification factor for various voting configurations, and 𝛽 is the factor which applies for
a 1oo2 voting. This means that if each of the 𝑁 redundant modules has a failure rate 𝜆DU , then the 𝑀oo𝑁
configuration will have a system failure rate due to CCF that equals: CMooN ∙ 𝛽 ∙ 𝜆DU.

By using this model, the parameter 𝛽 is maintained as an essential parameter whose interpretation is now
entirely related to a duplicated system. Further, note that the effect of voting is introduced as a separate
factor, C𝑀oo𝑁 , independent of 𝛽. This makes the model easy to use in practice.

4.2 Proposed Values for the CMooN Factors


Determining values for the 𝛽-factor is not a straightforward issue, one problem being the limited access to
relevant data. Checklists like the one in IEC 61508-6 have therefore been developed to support the
estimation of this parameter. However, since there are little or no data available for calibrating the resulting
common cause failure rates, the checklist methods are mainly based on engineering judgement.

Similarly, when determining the C𝑀oo𝑁 factors, the same problem with lack of relevant data is faced. A
procedure for estimating these factors has therefore been proposed, based on expert judgements supported
by some data related to the effect of adding redundancy to a system. This procedure is further described in
Appendix B, which both explains how the specific C𝑀oo𝑁 values have been arrived at and gives general
formulas for the C𝑀oo𝑁 factor.

As compared to the 2010 version of the PDS method handbook /6/, the suggested values for C𝑀oo𝑁 have
been slightly modified. This is based on an updated model which combines the C𝑀oo𝑁 model with the
concept of a “lethal shock”. It is here assumed that the “standard” C𝑀oo𝑁 model applies for 95 % of the
common cause failures and that the remaining 5 % are lethal; i.e., cause all redundant components to fail
simultaneously as in the standard 𝛽-factor model. For a more detailed background and description of this
modification of the model, reference is made to Appendix B. Table 2 summarises the resulting updated
C𝑀oo𝑁 values for some typical voting configurations.

7) An 𝑀oo𝑁 voting (𝑀 < 𝑁) means that at least 𝑀 of the 𝑁 redundant modules/components must be activated for the
safety function to be activated.

Table 2: C𝑀oo𝑁 values for different voting logics

𝑴 \ 𝑵    𝑵 = 2          𝑵 = 3          𝑵 = 4          𝑵 = 5          𝑵 = 6
𝑴 = 1    C1oo2 = 1.0    C1oo3 = 0.5    C1oo4 = 0.3    C1oo5 = 0.2    C1oo6 = 0.15
𝑴 = 2    -              C2oo3 = 2.0    C2oo4 = 1.1    C2oo5 = 0.8    C2oo6 = 0.6
𝑴 = 3    -              -              C3oo4 = 2.8    C3oo5 = 1.6    C3oo6 = 1.2
𝑴 = 4    -              -              -              C4oo5 = 3.6    C4oo6 = 1.9
𝑴 = 5    -              -              -              -              C5oo6 = 4.5

It should be stressed that the above figures are suggested values. Based on the general formulas for
calculating C𝑀oo𝑁 values as described in Appendix B, the user of the PDS model can modify these factors
based on personal experience and knowledge. For instance, there may be arguments for using more or less
conservative values for special equipment types.
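As an illustration of how Table 2 is applied in practice, the following minimal Python sketch (our own illustration; the function name is not part of the PDS method) looks up C𝑀oo𝑁 and returns the effective beta-factor 𝛽(𝑀oo𝑁) = 𝛽 ⋅ C𝑀oo𝑁:

    # C_MooN modification factors from Table 2 (valid for M < N).
    C_MOON = {
        (1, 2): 1.0, (1, 3): 0.5, (1, 4): 0.3, (1, 5): 0.2, (1, 6): 0.15,
        (2, 3): 2.0, (2, 4): 1.1, (2, 5): 0.8, (2, 6): 0.6,
        (3, 4): 2.8, (3, 5): 1.6, (3, 6): 1.2,
        (4, 5): 3.6, (4, 6): 1.9,
        (5, 6): 4.5,
    }

    def beta_moon(beta: float, m: int, n: int) -> float:
        """Effective beta-factor beta(MooN) = beta * C_MooN, with beta for 1oo2."""
        return beta * C_MOON[(m, n)]

    # Example: with beta = 0.03 (assumed), a 2oo3 voting gets beta(2oo3) = 0.06,
    # while a 1oo3 voting gets beta(1oo3) = 0.015.
    print(beta_moon(0.03, 2, 3), beta_moon(0.03, 1, 3))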

Note that in IEC 61508-6, correction factors for modifying the 𝛽-factor for different 𝑀oo𝑁 voting
configurations are proposed (ref. Table D.5 in 61508-6, /1/ ). This approach is similar to the PDS approach,
even if some of the modification factors proposed by IEC deviate slightly from the suggested values in
PDS. The main point, however, is that the CCF model provides a ranking of the various configurations
with respect to safety and the ranking in IEC 61508 is comparable to that of PDS.

4.3 Standard β-factor Model Versus PDS Approach


The differences between the standard 𝛽-factor model and the PDS approach are further illustrated in Figure
6 below. A circle (say A) represents the event “component A has failed”. For a duplicated set of redundant
components A and B (𝑁 = 2), the standard 𝛽-factor and the PDS approaches are identical; here, β
represents the fraction of failures affecting both A and B, so that they fail simultaneously.

For a triplicated set of components (𝑁 = 3), the standard 𝛽-factor model assumes that whenever there is a
failure affecting two components (say A and B), the third component (C) will also fail. Thus, it will never
happen that just two of the three components fail due to a CCF. Using the PDS method and the updated
C𝑀oo𝑁 factors, it is assumed that if A and B have failed due to a CCF, component C will also fail, but only
in 50 % of the cases. As discussed above, it is of course somewhat arbitrary to postulate that this fraction
equals 50 %, but it is considered more realistic than assuming 100 % as in the standard β-factor model.


[Figure: Venn diagrams of the CCF model. For N = 2, a fraction β of the failures of A and B is common to both. For N = 3, the standard β-factor model lets the whole fraction β affect all of A, B and C, whereas the PDS model assigns a fraction 0.5 ⋅ β to each pairwise overlap (A&B, A&C, B&C) and 0.5 ⋅ β to the overlap of all three components]

Figure 6: Illustration of the CCF model for N = 2 and N = 3

From this figure it is also seen that the C2oo3 factor in the PDS model becomes 2.0, since the fraction of
failures affecting two or three components is

0.5 ∙ 𝛽 (A&B) + 0.5 ∙ 𝛽 (A&C) + 0.5 ∙ 𝛽 (B&C) + 0.5 ∙ 𝛽 (all three) = 2 ∙ 𝛽.

To highlight and summarise the main differences between the standard β-factor approach and the model
suggested in PDS for modelling of CCF, the following should be noted:

• The standard 𝛽-factor model does not distinguish between different multiple voting configurations
such as e.g., 1oo2, 1oo3 and 2oo3;
• In order to reflect the effect of voting, the PDS model introduces the configuration factor C𝑀oo𝑁 , i.e.,
𝛽(𝑀oo𝑁) = 𝛽⋅ C𝑀oo𝑁 ;
• The PDS method does not distinguish between β and 𝛽D (i.e., beta for detected failures in IEC 61508).
The most applicable β should always be used, but the notation 𝛽D is not used in PDS.

4.4 Modelling of CCF for Components with Non-Identical Characteristics


Often, multiple protection layers are implemented in order to reduce the risk to a sufficiently low level. For
example a combined PSD and HIPPS solution to protect against overpressure of a downstream process
segment, or a PSD function combined with an ESD function to protect against high level in a pressure
vessel (see section 7.1).

In some types of analyses, like LOPA and event tree analysis, these different protection layers are often
considered independent of each other, and their probabilities of failure are simply multiplied together to
obtain a total failure estimate. This approach may be sufficiently exact for some applications, but
represents a simplification for several reasons. First, the redundant components may be tested at about the
same time, and second, the safety functions are often implemented with similar components that are
subject to the same common influences. We will here restrict ourselves to briefly discussing the latter case,


i.e., common cause modelling between similar (but non-identical) components, whereas the effect of
simultaneous testing is discussed separately in chapter 7.

When modelling dependencies between components with non-identical characteristics, there are basically
three different cases (or combinations of these) that may apply:

• components with different failure rates


• components with different 𝛽-factors
• components with different test intervals

For the example mentioned above with a PSD function and a HIPPS function in parallel there may for
example be differences between the failure rates for the HIPPS and the PSD valves. Furthermore, the
HIPPS valves may typically be tested every third month whereas the PSD valves are tested annually.

When having to select a “representative” value both for the failure rate, the β-factor and the test interval,
there are several alternatives. It should be noted that there are no definite answers to what is the best
practice for such cases, since none of the methods to our knowledge have been calibrated against real life
experience.

One practical compromise, which is often chosen, is to select the geometric mean of the 𝜆𝑖 's for the 𝑁
redundant components, the minimum of the 𝛽𝑖 's (or an even lower value, depending on the degree of diversity
between the components) and the arithmetic mean of the test intervals. Generally, for 𝑁 different
components voted 𝑀oo𝑁, the CCF contribution to the PFD then becomes:

PFD𝑀oo𝑁(CCF) = C𝑀oo𝑁 ∙ 𝛽min ∙ (𝜆1 ⋅ 𝜆2 ⋅ … ⋅ 𝜆𝑁)^(1/𝑁) ⋅ 𝜏̅/2

where 𝛽min = min𝑖=1,2,…,𝑁 {𝛽𝑖} and 𝜏̅ is the arithmetic mean of the test intervals. If a high degree of
diversity can be claimed, one may select an even lower beta based on expert judgement.

Alternatively, the CCF contribution can be calculated as:

PFD𝑀oo𝑁(CCF) = C𝑀oo𝑁 ∙ min𝑖 (𝛽𝑖 ∙ 𝜆𝑖) ⋅ 𝜏̅/2.

This approach is slightly less conservative than the first model, but may be argued to be
somewhat more realistic: the probability of all components failing due to a common cause will not exceed
the probability that the component with the lowest failure rate fails due to a common cause.

Selection of representative values both for the failure rate, the β-factor and the test interval is further
discussed in the PDS example collection /12/.
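The first compromise above can also be written out as a small Python sketch (our own illustration under the stated assumptions; parameter values are invented for the example):

    import math

    def pfd_ccf_non_identical(c_moon, betas, lambdas_du, taus):
        """CCF contribution to PFD for N non-identical components voted MooN:
        geometric mean of the failure rates, minimum beta and arithmetic mean
        of the test intervals (the first compromise described above)."""
        lam_geo = math.prod(lambdas_du) ** (1.0 / len(lambdas_du))
        beta_min = min(betas)
        tau_mean = sum(taus) / len(taus)
        return c_moon * beta_min * lam_geo * tau_mean / 2

    # Example: a PSD valve tested annually in parallel with a HIPPS valve
    # tested every third month, voted 1oo2 (C_1oo2 = 1.0). Values assumed.
    print(pfd_ccf_non_identical(1.0, betas=[0.03, 0.02],
                                lambdas_du=[2.0e-6, 1.0e-6],
                                taus=[8760.0, 2190.0]))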


5 PDS CALCULATION FORMULAS – LOW DEMAND SYSTEMS


This chapter presents quantification formulas for loss of safety, based on the discussions in chapter 3.7 and
3.8. The main focus of this chapter is on low demand mode of operation and the corresponding formulas
for PFD (ref. IEC 61508-1, Table 2). High demand mode systems are treated in chapter 6.

5.1 Assumptions and Limitations


When using the PDS formulas, or any other reliability formulas, it is important that the analyst is aware of
their application area as well as any limitations on the use of the formulas. The following assumptions
underlying the PDS formulas should be noted.

• All failure rates are considered constant with respect to time; i.e., an exponential failure model is
assumed.
• Both PFD and CSU are calculated as average values. As briefly discussed in section 3.7.6, the
failure probabilities will actually vary during the test interval. However, in line with IEC 61508
and IEC 61511, average values are used as the unavailability measures.
• When calculating the PFD, it is assumed that the component can be considered as good as new
after a repair or a functional test. However, including the PTIF or PTC will (to some extent) cater
for the possibility of having imperfect testing/repair (ref. discussion in section 3.7.4 and 3.7.5
respectively).
• When calculating the PFD, the contribution from unavailability due to repair and testing of
components is not included. For this purpose, the downtime unavailability (DTU) must be added.
• Upon detection of a dangerous failure (of rate λDD ), it is assumed that the system is either brought
to a safe state and/or equivalent compensating measures are introduced. In cases when degraded
operation occurs, the contribution from DTU should be added (ref. section 5.2.3).
• The CSU of the function (safety system) is obtained by summing the CSU of each (set of)
redundant module(s). That is, we assume that CSUA and CSUB are small enough to let:
1 − (1 − CSUA ) ⋅ (1 − CSUB ) ≈ CSUA + CSUB.
• The same applies for PFD, PTIF and DTU calculations, i.e., the contributions from each module are
just summed (rather than using the more accurate formula illustrated for CSU above).
• For instance the rate of independent DU failures is throughout (conservatively) approximated with
λDU (rather than e.g., using (1 − 𝛽) ⋅ λDU for 1oo2).
• The term λDU ⋅ 𝜏 should be small enough to allow 𝑒 −λDU⋅𝜏 ≈ 1 − λDU ⋅ 𝜏 , i.e., λDU ⋅ 𝜏 ≤ 0.2.
• For 𝑁 ≥ 3 we ignore the contribution of a combination of single and double failures. For instance,
when considering a triple system voted 1oo3, we will only include the probability for a common
cause failure taking all three components out or three separate (independent) failures.
Consequently, we will disregard the possibility that within the same test interval one common
cause failure takes out two components whereas the third component fails independently.
• The self-test period is “small” compared to the interval between functional tests, i.e., at least a
factor 100 lower (e.g., for a modern gas detector, the self-test period may be less than one minute
while the functional test interval will typically be once every year or half-year, corresponding to a
factor of 100,000 or more).
• The formulas given here do not account for demands as a means of testing to detect dangerous
failures (ref. discussion in section 3.4.3).


5.2 PDS Formulas for Loss of Safety


In this section the approximate formulas for loss of safety are presented. The section covers loss of
safety due to dangerous undetected failures that are revealed in functional tests (PFD), downtime
unavailability (DTU) and test independent failures (TIF). Recall that the CSU is the sum of the three
contributions:

CSU = PFD + DTU + PTIF.

Here the DTU includes the mean time to restoration (MTTR) unavailability caused by dangerous failures,
as well as the average downtime or inhibition time due to functional testing and/or preventive maintenance.

5.2.1 PFD Formulas


For a single component, PFD is calculated from the formula:

PFD1oo1 ≈ 𝜆DU ⋅ 𝜏/2.

Here τ is the period between functional testing, and we include the unavailability due to undetected failures
only (i.e. the known downtime unavailability due to detected failures is treated separately). Intuitively the
above formula can be interpreted as follows: 𝜆DU is the constant failure rate and τ/2 is the average period
of time that the component is unavailable given that the failure may occur at a random point in time within
the test interval τ.

Further, for a duplicated module, voted 1oo2, we have, when including only the CCF contribution:

PFD1oo2(CCF) ≈ 𝛽 ⋅ 𝜆DU ⋅ 𝜏/2.

The above formula does not include the contribution from independent failures of the two components.
Often, when having field equipment with relatively high failure rates, the contribution from independent
failures cannot be neglected and should therefore always be calculated. As further discussed in Appendix
C, the PFD contribution from independent failures of two components voted 1oo2 can be approximated by
(the superscript ‘ind.’ abbreviates ‘independent’) 8:

PFD1oo2(ind.) ≈ (𝜆DU ⋅ 𝜏)²/3.

Hence, when including both the common cause and the independent failure contributions, we get the
following formula for a 1oo2 voted system:

PFD1oo2 ≈ 𝛽 ⋅ 𝜆DU ⋅ 𝜏/2 + (𝜆DU ⋅ 𝜏)²/3.

The above formulas consider common cause failures between two components. In general, the fraction of
common cause failures will depend on the voting logic. For components voted 𝑀oo𝑁 the PFD is calculated
from (ignoring independent failures):

PFD𝑀oo𝑁(CCF) ≈ C𝑀oo𝑁 ⋅ 𝛽 ⋅ 𝜆DU ⋅ 𝜏/2;  (𝑀 < 𝑁).

As discussed in chapter 4.1, C𝑀oo𝑁 is a configuration factor that allows us to distinguish between various
voting logics like 1oo2, 1oo3 and 2oo3. Finally, for an 𝑁oo𝑁 voting, we may apply the formula:

PFD𝑁oo𝑁 ≈ 𝑁 ⋅ 𝜆DU ⋅ 𝜏/2.

The latter formula for an 𝑁oo𝑁 voting is suggested in the IEC standard, but is somewhat inaccurate when β
is large, say β > 0.05, or when 𝑁 is large (i.e., 𝑁 > 3, see Appendix C).

8) Note that simply multiplying the average PFD1oo1 for two single components gives a result of (𝜆DU ⋅ 𝜏)²/4, which is
incorrect since the probability of both components being failed actually increases nonlinearly throughout the test interval
(ref. Figure 5). Also note that when estimating the contribution from independent failures, 𝜆DU has conservatively been
applied instead of the more correct (1 − 𝛽) ⋅ 𝜆DU, see discussion in Appendix C.

Simplified PFD formulas for some different voting logics are summarised in Table 3 below. The table
includes the following:

i. In the first column, the voting logic (𝑀oo𝑁) is given;


ii. In the second column the PFD contribution from common cause failures is included. For voted
configurations like 1oo2, 1oo3, 2oo3, etc. this will often be the main contributor towards the total
PFD;
iii. In the third column, the contribution to PFD from independent failures is given. For an 𝑀oo𝑁 voting
we get a contribution if at least 𝑁 − 𝑀 + 1 of the components fail within the same test interval. Note
that for the 1oo1, 2oo2, … and 𝑁oo𝑁 votings, the PFD will (conservatively) equal the sum of the
independent failure contributions.


Table 3: Summary of simplified formulas for PFD

Voting                          Common cause contribution     Contribution from independent failures
1oo1                            -                             𝜆DU ⋅ 𝜏/2
1oo2                            𝛽 ⋅ 𝜆DU ⋅ 𝜏/2                 + (𝜆DU ⋅ 𝜏)²/3
2oo2                            -                             2 ⋅ 𝜆DU ⋅ 𝜏/2
1oo3                            C1oo3 ⋅ 𝛽 ⋅ 𝜆DU ⋅ 𝜏/2         + (𝜆DU ⋅ 𝜏)³/4
2oo3                            C2oo3 ⋅ 𝛽 ⋅ 𝜆DU ⋅ 𝜏/2         + (𝜆DU ⋅ 𝜏)²
3oo3                            -                             3 ⋅ 𝜆DU ⋅ 𝜏/2
1oo𝑁; 𝑁 = 2, 3, …               C1oo𝑁 ⋅ 𝛽 ⋅ 𝜆DU ⋅ 𝜏/2         + (𝜆DU ⋅ 𝜏)^𝑁 / (𝑁 + 1)
𝑀oo𝑁; 𝑀 < 𝑁; 𝑁 = 2, 3, …        C𝑀oo𝑁 ⋅ 𝛽 ⋅ 𝜆DU ⋅ 𝜏/2         + [𝑁! / ((𝑁 − 𝑀 + 2)! ⋅ (𝑀 − 1)!)] ⋅ (𝜆DU ⋅ 𝜏)^(𝑁−𝑀+1)
𝑁oo𝑁; 𝑁 = 1, 2, 3, …            -                             𝑁 ⋅ 𝜆DU ⋅ 𝜏/2

For slightly more accurate formulas, reference is made to Appendix C.
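The simplified formulas of Table 3 translate directly into code; the sketch below (our own illustration, not part of the PDS method itself) sums the common cause and independent contributions for a general MooN voting:

    from math import factorial

    def pfd_moon(m, n, lambda_du, tau, beta=None, c_moon=None):
        """Simplified PFD for an MooN voting according to Table 3."""
        if m == n:
            # NooN: (conservatively) the sum of the independent contributions.
            return n * lambda_du * tau / 2
        ccf = c_moon * beta * lambda_du * tau / 2
        k = n - m + 1  # independent failures needed within one test interval
        ind = factorial(n) / (factorial(n - m + 2) * factorial(m - 1)) * (lambda_du * tau) ** k
        return ccf + ind

    # Example: 2oo3 voting with lambda_DU = 2e-6 per hour, annual testing,
    # beta = 0.03 and C_2oo3 = 2.0 (values assumed for illustration).
    print(pfd_moon(2, 3, lambda_du=2.0e-6, tau=8760.0, beta=0.03, c_moon=2.0))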

5.2.2 Comparison with other Methods


To illustrate the quantitative effect of using the above simplified formulas as compared to performing more
exact calculations – here represented by using Markov chain modelling with the standard 𝛽-factor model –
some typical PFD calculations have been performed. Also, for further comparison, the results from using
the IEC 61508 formulas have been included (ref. table B.2–B.4 of IEC 61508-6).


Table 4: Calculated PFD using alternative calculation methodologies

          System / parameter specification                  Calculated PFD 1)
Voting    𝜷       𝝀D            DC      𝝉            PDS              Markov           IEC 61508
1oo1      -       5 ⋅ 10⁻⁶      60 %    12 months    8.8 ⋅ 10⁻³       8.7 ⋅ 10⁻³       8.8 ⋅ 10⁻³
1oo2      0.02    2.5 ⋅ 10⁻⁶    60 %    12 months    1.1 ⋅ 10⁻⁴       1.1 ⋅ 10⁻⁴       1.1 ⋅ 10⁻⁴
2oo2      -       0.5 ⋅ 10⁻⁶    60 %    6 months     8.8 ⋅ 10⁻⁴       8.7 ⋅ 10⁻⁴ 2)    8.8 ⋅ 10⁻⁴
1oo3      0.10    5.0 ⋅ 10⁻⁶    90 %    24 months    2.2 ⋅ 10⁻⁴ 3)    4.4 ⋅ 10⁻⁴       4.4 ⋅ 10⁻⁴ 4)
2oo3      0.02    2.5 ⋅ 10⁻⁵    90 %    12 months    9.2 ⋅ 10⁻⁴ 3)    6.7 ⋅ 10⁻⁴       7.1 ⋅ 10⁻⁴ 4)

1) For the IEC 61508 results, a mean restoration time of 8 hours has been assumed for dangerous failures. For the
PDS calculations the contribution from repair of dangerous failures is not included (it is treated separately as DTU, ref.
section 5.2.3). However, as seen from the results, this difference is negligible, which is also confirmed by the Markov
modelling, where the contribution from repair has been omitted.
2) For the 2oo2 voting, a 𝛽-factor of 0.02 has been applied in the Markov modelling.
3) For the 1oo3 and 2oo3 votings, the PDS and IEC/Markov results differ as expected. IEC (and here also
the Markov modelling) apply the standard 𝛽-factor model for CCF, which treats a 1oo3 and a 2oo3 voting similarly to a
1oo2 voting, whereas the PDS formulas include the C𝑀oo𝑁 factors (ref. discussion in sections 4.1-4.2).
4) If the correction factors for β given in IEC 61508-6, Table D.5, are used, the same results are obtained with the IEC
61508 and the PDS formulas.

5.2.3 Formulas for Downtime Unavailability (DTU)


The downtime unavailability includes two elements:

1) The downtime related to repair of (dangerous) failures. The average duration of this period is the
mean time to restoration (MTTR);
2) The downtime (or inhibition time) resulting from planned activities such as testing and preventive
maintenance.

As discussed in previous sections, the contribution from downtime unavailability will depend on the
operational philosophy and the configuration of the process as well as of the SIS itself. Further, statutory
requirements stating that compensating measures shall be introduced upon degradation of a critical safety
function will also affect the operational philosophy. Hence, which formulas to apply will depend on
several factors. Below, some approximate formulas and the corresponding underlying assumptions
are given for each of the two downtime contributors listed above.

1) Downtime Unavailability due to Repair of Dangerous Failures – DTUR


The approximate formulas for the downtime unavailability due to repair, here referred to as DTUR , are
comparable to the PFD formulas presented above. However, given that a dangerous failure has occurred,
the average “known” unavailability period is MTTR rather than 𝜏/2.

If we follow IEC 61508, we include the MTTR of all dangerous failures, and for a single component the
related downtime unavailability is then approximately 𝜆D ⋅ MTTR. Whether it is correct to treat DD failures
detected during normal operation in the same way as DU failures revealed during a functional test or upon a true
demand is a matter of discussion. However, in order to be in line with the IEC standard, the following
discussion handles dangerous failures in general.


When establishing formulas for DTUR, the operational philosophy must be specified. Here, three possible
operational/repair philosophies are considered:

1. Always shut down. This (extreme) philosophy may apply for the most critical safety functions, and
means that production is shut down (even for redundant systems) whenever at least one component of
the safety function experiences a dangerous failure. In such a case there will be no contribution to the
DTUR, but there will be a contribution to loss of production.

2. Degraded operation if possible; otherwise shutdown. This may be the most common philosophy. If all
redundant components have a dangerous failure there will be a shutdown, otherwise there will be
degraded operation. If there is a single D failure in a 2oo3 voting, it must be specified whether the
degraded operation is a 1oo2 or a 2oo2 voting. Note that if a 2oo3 voting degrades to a 1oo2
configuration, the safety performance actually improves, and no degradation term should be added,
ref. Appendix C.

3. Always continue production, even with no protection. This is another extreme philosophy: even when
all the (redundant) components have experienced a dangerous failure, production is continued during
the repair/restoration period, with no protection available.

Observe that the above list is not complete since alternative operational philosophies can be foreseen
(“combinations” of the above). Also note that the possibility of introducing compensating measures has not
been included in this discussion.

Table 5 presents DTUR formulas for three common configurations for the two operational philosophies that
may give DTUR contributions.

Table 5: Formulas for DTUR for some voting logics and operational philosophies

Initial          Failure                      Contribution to DTUR for different operational/repair philosophies 1), 2)
voting logic     type                         Degraded operation                                                      Operation with no protection
1oo1             Single failure               N/A                                                                     𝜆D ⋅ MTTR
1oo2             Single failure               Degraded operation with 1oo1: 2 ⋅ 𝜆D ⋅ MTTR ⋅ 𝜆DU ⋅ 𝜏/2                 N/A
1oo2             Both components fail         N/A                                                                     𝛽 ⋅ 𝜆D ⋅ MTTR
2oo3             Single failure               Degraded operation with 2oo2: 3) 3 ⋅ 𝜆D ⋅ MTTR ⋅ 2 ⋅ 𝜆DU ⋅ 𝜏/2          N/A
2oo3             Two components fail          Degraded operation with 1oo1: (C2oo3 − C1oo3) ⋅ 𝛽 ⋅ 𝜆D ⋅ MTTR ⋅ 𝜆DU ⋅ 𝜏/2   N/A
2oo3             All three components fail    N/A                                                                     C1oo3 ⋅ 𝛽 ⋅ 𝜆D ⋅ MTTR

1) Note that 𝜆D has been used in the formulas to ensure consistency with IEC 61508. In Appendix C the more correct
𝜆DD has been applied, since degraded operation will mainly take place upon a detected failure.
2) Also note that the formulas provided here do not distinguish between the MTTR for one or for two (three) components.
3) Degradation to a 1oo2 voting gives no contribution to the DTU, since a 1oo2 voting actually gives increased safety
as compared to a 2oo3 voting.

When the operational/repair philosophy for safe (detected and undetected) failures is specified, similar
DTUR formulas to those shown in Table 5 can also be established for these failure types. If the same repair


philosophy applies for all critical failures, the approximate DTUR formulas as given in Table 5 can be
applied, simply replacing 𝜆D with 𝜆crit .

2) Downtime Unavailability due to Functional Testing/Preventive Maintenance – DTUT


The downtime unavailability due to planned testing (or other preventive maintenance) activities is here
referred to as DTUT. For a single component the DTUT can be given as 𝑡/τ, where 𝑡 is the duration of the
function being bypassed during functional testing, and τ is the time between tests. Hence, this is simply the
fraction of time during which the function is bypassed.

Similarly, if redundant (voted) components are taken out for testing simultaneously while production is
maintained (as is often the case for e.g., gas detectors), the DTUT can also be given as 𝑡/τ. Here, 𝑡 is still
the duration of the function being bypassed during the test (it may differ from the 𝑡 above, where a single
component is tested).

For some redundant systems, e.g., two process sensors voted 1oo2, it may be more relevant that one sensor
is taken out for testing, while the other is still operating. Hence, in this period, the function is actually
degraded to a 1oo1 system, and we therefore need to calculate the unavailability contribution during the
time of degraded operation:

While testing the first component (component 1), a time period τ has elapsed since the last test. The
corresponding unavailability contribution while testing component 1 then becomes:

DTUT(1) ≈ (𝑡/𝜏) ⋅ 𝜆DU ⋅ 𝜏 = 𝑡 ⋅ 𝜆DU.

Here, 𝑡/τ is the fraction of time during which component 1 is tested, and 𝜆DU ⋅ 𝜏 is the probability that a dangerous
undetected failure has been introduced in component 2 during the period τ since its last test.

While testing component 2, component 1 will be active and has just been tested (“as good as new”). Hence,
the unavailability contribution from testing of component 2 becomes:

DTUT(2) ≈ (𝑡/𝜏) ⋅ 𝜆DU ⋅ 𝑡/2.

Assuming that 𝑡 << 𝜏, the contribution from testing of component 2 will be negligible compared to that from
testing of component 1, and therefore

DTUT = DTUT(1) + DTUT(2) ≈ DTUT(1) ≈ 𝑡 ⋅ 𝜆DU.

This expression is based on the assumption that 𝑡 << 𝜏; (𝑡/𝜏 < 0.01) and that no compensating measures
(such as e.g., manual safeguarding) are introduced during the functional testing. Similar expressions can be
obtained for a 2oo3 voting, with degradation either to a 1oo2 or a 2oo2.
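As a numeric illustration of the result DTUT ≈ 𝑡 ⋅ 𝜆DU (a sketch with values assumed for the example only):

    # DTU_T for a 1oo2 voting where the components are tested one at a time.
    # Parameter values are illustrative assumptions only.
    lambda_du = 2.0e-6   # dangerous undetected failure rate (per hour)
    tau = 8760.0         # functional test interval (hours)
    t = 4.0              # bypass duration while one component is tested (hours)

    dtu_t1 = (t / tau) * lambda_du * tau      # = t * lambda_du (peer untested for ~tau)
    dtu_t2 = (t / tau) * lambda_du * t / 2    # peer just tested, hence negligible
    print(dtu_t1, dtu_t2, dtu_t1 + dtu_t2)    # total is approximately t * lambda_du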

These results are summarised in Table 6. Here we again consider the three operational philosophies:

1. Always shut down during testing, i.e., regardless of configuration the production is shut down
during testing of either component. No contribution towards DTUT.
2. Degraded operation during testing if possible; otherwise shutdown, i.e., subsequent testing of one
and one component, and degraded operation if possible (not possible for 1oo1, and here shutdown
is assumed).
3. Continue production during testing, even with no protection, i.e., simultaneous testing of all
components and production without protection during testing.


Table 6: Formulas for DTUT for some voting logics and operational philosophies

Initial          Number of components               Contribution to DTUT for different operational/testing philosophies
voting logic     tested simultaneously              Degraded operation                     Operation with no protection 1)
1oo1             One at a time                      N/A                                    𝑡/𝜏
1oo2             One at a time                      𝑡 ⋅ 𝜆DU                                N/A
1oo2             Both tested simultaneously         N/A                                    𝑡/𝜏
2oo3             One at a time                      Degradation to 2oo2: 2) 𝑡 ⋅ 2 ⋅ 𝜆DU    N/A
2oo3             All three tested simultaneously    N/A                                    𝑡/𝜏

1) Note that the formulas provided here do not distinguish between the testing time 𝑡 for one component and for
simultaneous testing of two (or three) components. The total testing time without protection should therefore be used.
2) Degradation to a 1oo2 voting gives no contribution to the DTU, since a 1oo2 voting actually gives increased safety
as compared to a 2oo3 voting.

Again there may be alternative testing philosophies to those specified above. As for the DTUR it is seen
that the operational philosophy of the installation largely impacts the unavailability contribution from
testing and maintenance.

5.2.4 Comparison with the PFD Formulas of IEC 61508


As discussed above (see Figure 4), the correspondence between the PFD of IEC 61508 and the PDS
measures is

PFDIEC = PFD + DTUR.

Simplified expressions for PFD and DTUR are given in Table 3 and Table 5, respectively, and are further
elaborated in Appendix C. Note that regarding DTUR it is the operational philosophy "Operation with no
protection" (Table 5) that applies when comparison with the IEC formulas is carried out below.

Taking 1oo2 as an example, the PDS formula becomes:

PFD + DTUR = (𝜆DU ⋅ 𝜏)²/3 + 𝛽 ⋅ 𝜆DU ⋅ 𝜏/2 + 𝛽 ⋅ 𝜆D ⋅ MTTR

which may also be written

PFD + DTUR = (𝜆DU ⋅ 𝜏)²/3 + 𝛽 ⋅ 𝜆DU ⋅ (𝜏/2 + MTTR) + 𝛽 ⋅ 𝜆DD ⋅ MTTR

Here we recognize the main terms of the IEC formula for PFD, see Section B.3.2.2.2 of Appendix B in
IEC 61508-6. The differences are:

• IEC 61508 applies different 𝛽 values for DU and DD failures (𝛽 and 𝛽D, respectively).
• IEC 61508 applies different “repair times” for DU and DD failures (MRT and MTTR).


• Another difference – not appearing in the 1oo2 formula above – is that IEC61508 in the formulas
of Part 6, Appendix B applies the standard beta factor model for all configurations. However, in
Part 6, Appendix D (Table D.5) of the standard, factors similar to 𝐶𝑀oo𝑁 are introduced.
• Finally, (not relevant for the above formula) the handling of proof test coverage (PTC) of the
functional test might be slightly different in the two approaches; see next section.

To conclude, the formulas of PDS and IEC 61508 are rather similar. First, the PDS formulas might easily
be modified to account for different 𝛽 and MTTR for DU and DD failures, respectively, if that is required
(and data are available). Secondly, the formula for combinations of independent failures (in the same test
interval) is somewhat different, but will usually give comparable results (see Table 4). Both approaches
present approximations for this contribution, and it might be difficult to claim that one is better than the
other. However, these independency terms are rarely dominating. Finally, the CCF modelling of the two
approaches is rather similar, especially if comparison is made with Table D.5 of IEC 61508-6.

5.3 How to Model the Quantitative Effect of Imperfect Functional Testing


It is widely accepted that functional testing is not always 100 % perfect, i.e., not all DU failures will
necessarily be detected during a functional test. This can be modelled by introducing the proof test
coverage (PTC) factor, as done in IEC 61508. In the PDS method imperfect testing has traditionally been
treated by introducing a test independent failure (TIF) probability. A general discussion of both alternative
approaches is given below.

5.3.1 TIF Formulas


As discussed in previous sections, a test independent failure (TIF) is a failure not detected during
functional testing but revealed upon a true demand. This factor is not included in the IEC formulas and it is
therefore up to the analyst (in co-operation with the contractor company) to decide whether the PTIF shall
be included in the calculations. Alternatively, and as discussed in the next section, the effect of imperfect
testing can be quantified in terms of the proof test coverage (PTC) factor.

For a single component, PTIF expresses the likelihood that a component which has just been functionally
tested will fail on demand (irrespective of the interval of functional testing). For redundant components voted
𝑀oo𝑁 (𝑀 < 𝑁), the TIF contribution to loss of safety is given by the general formula C𝑀oo𝑁 ⋅ β ⋅ PTIF,
where the numerical values of C𝑀oo𝑁 are assumed identical to those used for calculating the PFD, ref. Table 2.
The formulas for calculation of the PTIF contribution for different voting logics are summarised in Table 7.

Table 7: Formulas for PTIF for various voting logics

Voting                       TIF contribution to CSU for 𝑀oo𝑁 voting
1oo1                         PTIF
1oo2                         𝛽 ⋅ PTIF
𝑀oo𝑁; 𝑀 < 𝑁                  C𝑀oo𝑁 ⋅ 𝛽 ⋅ PTIF
𝑁oo𝑁; 𝑁 = 1, 2, …            𝑁 ⋅ PTIF

Note that in Table 7 it is assumed that the contribution from two or more test independent failures
occurring independently of each other is negligible.
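In quantitative terms the TIF contribution combines with the rest of the CSU in the same way as the CCF terms; a one-line sketch (our own illustration, with assumed values):

    # TIF contribution to the CSU for an MooN voting (M < N): C_MooN * beta * P_TIF.
    def tif_contribution(c_moon: float, beta: float, p_tif: float) -> float:
        return c_moon * beta * p_tif

    # Example: 2oo3 voting with C_2oo3 = 2.0, beta = 0.03, P_TIF = 1e-4 (assumed).
    print(tif_contribution(2.0, 0.03, 1.0e-4))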


5.3.2 Incorporating the Effect of Reduced PTC into the PFD Formulas

As an alternative to the PTIF, the PFD contribution from non-perfect testing can also be modelled by
considering the Proof Test Coverage (PTC), i.e., the fraction of failures detected during functional proof
testing.

When incorporating the PTC the rate of dangerous undetected failures can be regarded as having two
constituent parts:

1. Failures detected during proof testing: with rate PTC ∙ 𝜆DU and proof test interval 𝜏, and
2. Failures not detected during proof testing: with rate (1 − PTC) ∙ 𝜆DU and “test interval” 𝑇.

Here 𝜏 is the proof test interval and 𝑇 is the assumed interval of complete testing. 𝑇 may for example be
the interval of a complete component overhaul, when it is assumed that the residual failure modes will
be detected. If some failure modes are never tested for, then 𝑇 should be taken as the lifetime of the
equipment. For a 1oo1 voting the PFD is then given as:

PFD1oo1 = PTC ⋅ 𝜆DU ⋅ 𝜏/2 + (1 − PTC) ⋅ 𝜆DU ⋅ 𝑇/2

We see that the above expression becomes identical to the simplified formula given for PFD1oo1 in section
5.2.1 when the proof test coverage, PTC = 1 (= 100 %), i.e., when the functional test is perfect. This was
also illustrated in Figure 5 where the average PFD (or CSU) is the same in all test intervals. However, if
PTC < 1, the average PFD for a test interval will increase in subsequent test intervals, as illustrated in
Figure 7.

[Figure: time dependent PFD(t) with PTC < 100 %; at each proof test τ, 2τ, 3τ, … only the detected part resets, so the sawtooth minimum and the average PFD grow from one test interval to the next until the complete test at T = Nτ]

Figure 7: Time dependent PFD with PTC < 100 %

The above formula and Figure 7 apply for a single component voted 1oo1. This PFD formula, which
incorporates the PTC, is quite easily generalised to any voting configuration.

The presentation in Figure 7 is valid whenever the time dependent PFD is stepwise linear in 𝑡 (“linear
case”). In particular, for an 𝑁oo𝑁 configuration we write (using the standard approximation):

PFD𝑁oo𝑁 ≈ PTC ⋅ 𝑁 ⋅ 𝜆DU ⋅ 𝜏/2 + (1 − PTC) ⋅ 𝑁 ⋅ 𝜆DU ⋅ 𝑇/2

Further, if we restrict attention to the CCF contribution for the 𝑀oo𝑁 configuration, we get the approximation

PFD𝑀oo𝑁 ≈ PTC ⋅ C𝑀oo𝑁 ⋅ 𝛽 ⋅ 𝜆DU ⋅ 𝜏/2 + (1 − PTC) ⋅ C𝑀oo𝑁 ⋅ 𝛽 ⋅ 𝜆DU ⋅ 𝑇/2


So in the “linear case”, where a single failure or a CCF is sufficient for the system to fail, the approach is
very straightforward. However, if we also want to account for combinations of independent failures, e.g., in a
1oo2 or 2oo3 configuration, the argument becomes slightly more complex.

For a 1oo2 configuration we add the contribution from a combination of two independent failures
detectable in the functional test to the first term of the above PFD1oo2, and the combination of two
independent failures detectable after time 𝑇 to the second term. But there is also a contribution from the
combination of one failure detectable in the functional test and one failure detectable only after time 𝑇: when a
single failure occurs which is not detected before time 𝑇, the system will for a period operate as a 1oo1
system. We then get the following improved approximation for 1oo2 9:

PFD1oo2 = 𝛽 ⋅ PTC ⋅ 𝜆DU ⋅ 𝜏/2 + (PTC ⋅ 𝜆DU ⋅ 𝜏)²/3 + 𝛽 ⋅ (1 − PTC) ⋅ 𝜆DU ⋅ 𝑇/2 + ((1 − PTC) ⋅ 𝜆DU ⋅ 𝑇)²/3
          + 2 ⋅ ((1 − PTC) ⋅ 𝜆DU ⋅ 𝑇/2) ⋅ (PTC ⋅ 𝜆DU ⋅ 𝜏/2).

Here the second and fourth terms represent the “obvious” extensions of two independent failures of “same
type”, while the fifth (i.e. last) term corresponds to a combination of one failure undetectable in functional
testing, and one detectable.

A similar argument can be carried out e.g., for the 2oo3 configuration. Using C2oo3 = 2 we get

PFD2oo3 = 2 ⋅ 𝛽 ⋅ PTC ⋅ 𝜆DU ⋅ 𝜏/2 + (PTC ⋅ 𝜆DU ⋅ 𝜏)² + 2 ⋅ 𝛽 ⋅ (1 − PTC) ⋅ 𝜆DU ⋅ 𝑇/2 + ((1 − PTC) ⋅ 𝜆DU ⋅ 𝑇)²
          + 3 ⋅ ((1 − PTC) ⋅ 𝜆DU ⋅ 𝑇/2) ⋅ (𝛽 ⋅ PTC ⋅ 𝜆DU ⋅ 𝜏/2).

Here the last term corresponds to one failure – not being detectable in functional test – occurring before
time 𝑇, and a CCF of the two remaining components, occurring within the same test interval 𝜏 (in case this
last term should be of any significance, there are also other combinations to consider, but the present
formula is in line with the general assumptions of this handbook).
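The 1oo2 expression above is straightforward to implement; the following Python sketch (our own illustration) splits 𝜆DU into a part detected at each proof test (coverage PTC, interval 𝜏) and a residual part only revealed at the complete test interval 𝑇:

    def pfd_1oo2_ptc(lambda_du, tau, t_complete, beta, ptc):
        """PFD for a 1oo2 voting with proof test coverage PTC < 1,
        per the improved approximation above."""
        lam_p = ptc * lambda_du          # detected at proof tests (interval tau)
        lam_r = (1 - ptc) * lambda_du    # residual, revealed at interval T
        return (beta * lam_p * tau / 2
                + (lam_p * tau) ** 2 / 3
                + beta * lam_r * t_complete / 2
                + (lam_r * t_complete) ** 2 / 3
                + 2 * (lam_r * t_complete / 2) * (lam_p * tau / 2))

    # Example: annual proof tests with PTC = 0.9 and a complete overhaul
    # every 12 years (values assumed for illustration).
    print(pfd_1oo2_ptc(2.0e-6, tau=8760.0, t_complete=12 * 8760.0,
                       beta=0.03, ptc=0.9))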

Comparison with the IEC 61508 expressions


First note that contributions from DD failures and MTTR are not included in the formulas above. So, in
order to make a comparison with the expressions of IEC, we should add DTUR (cf. section 5.2.4).
However, this term is not affected when PTC < 1. Taking the 1oo2 voting as an example, we have
DTUR = 𝛽 ⋅ 𝜆D ⋅ MTTR (operation with no protection). Thus, we can write (when ignoring the
contribution from two independent failures)

PFD + DTUR ≈ PTC ⋅ 𝛽 ⋅ 𝜆DU ⋅ (𝜏/2 + MTTR) + (1 − PTC) ⋅ 𝛽 ⋅ 𝜆DU ⋅ (𝑇/2 + MTTR) + 𝛽 ⋅ 𝜆DD ⋅ MTTR

which corresponds to the "main" (i.e. CCFs) terms of the PFD expression given in Section B.3.2.5 of
IEC61508-6, (when it is assumed that both 𝛽 and MTTR are the same for DU and DD failures). The
difference is that the IEC formula adds a contribution of two independent failures as is also done in the
more detailed PFD1oo2 formula above. Thus, the conclusions of Section 5.2.4 can be maintained.

9) Also in these formulas we conservatively use 𝜆DU instead of the more correct (1 − 𝛽) ⋅ 𝜆DU when estimating the
contribution from independent failures.


5.3.3 Should we use PTC or PTIF?


Given the above PTC formulas and the TIF formulas, an obvious question arises: Should we use the PTIF or
the PTC to model imperfect testing?

The arguments can be divided into “mathematical” ones and more pragmatic ones. Starting with the
first category, the main question becomes what type of failures we are considering and how these develop
over time (keeping in mind the different characteristics of PFD(𝑡) in Figure 5 and Figure 7):

i. When considering failures introduced during operation and/or developing over time that the
standard functional test (of interval 𝜏) is not designed to reveal, such failures are probably best
modelled by using PTC since their likelihood will increase until they are revealed either by a
demand or a complete functional test (of interval 𝑇). An example can be valve body erosion not
revealed unless testing for internal valve leakage. Another example can be errors introduced during
modification of the control logic that are not revealed during standard functional testing.

ii. If considering systematic design related failures that are present from day one and have a constant
probability, then these are best modelled by using the PTIF . For example a valve actuator not
dimensioned to close during a “real” process situation or a failure in the Cause & Effect matrix
(not being revealed during commissioning).

iii. Failures that are caused by erroneous execution of the functional test – e.g., incorrect calibration,
applying the wrong test pressure during hydrostatic valve testing, or forgetting to test all relevant
I/O channels during logic testing, which are all failures that may be revealed at the next test. These
errors are examples of systematic failures that are often included in generic estimates of the 𝜆DU ,
and as such neither PTIF nor PTC are directly applicable here.

iv. Failures that are introduced during operation due to human error in between tests, for example
caused by safety systems not being reset or an operator inadvertently putting a valve in the wrong
position. Such errors will, similar to those in category iii, normally be detected at the next test
and are therefore often included as part of the 𝜆DU (neither PTIF nor PTC is therefore directly
applicable here).

Considering the more pragmatic arguments:

i. For which parameter do we have data? The PDS data handbook /11/ includes PTIF data for most
topside components but the values are general values based on expert judgment. For some valve
types the PTC relevant for partial stroke testing may be suggested by the vendor, whereas in most
cases the PTC must be estimated separately.

ii. If data are not available, which parameter is the easiest and most intuitive to estimate?

iii. IEC 61508 suggests using PTC for cases with imperfect testing.

The above considerations may form a basis for the choice of how to treat imperfect testing. Regardless of
which approach is chosen, the main point is to acknowledge that testing will often be imperfect and to
include this effect in the quantitative estimates.

5.4 Quantification of Spurious Trip Rate (STR)


In this section we present the approach for quantification of the spurious trip rate of a SIS. A spurious trip
may occur due to undetected safe (SU) failure(s) resulting in spurious operation(s) of SIS component(s). A
SIS with a single component (i.e., voted 1oo1) will have the spurious trip rate

STR = λSU.


Similarly, the approximate STR formula for a SIS with redundant components voted 1ooN is

STR ≈ 𝑁⋅ λSU ; (1oo𝑁 voting)

The reason is that for a 1ooN voting it is sufficient that one component has a spurious operation failure in
order for the system to trip. If the components are voted 𝑀oo𝑁, where 𝑀 > 1, a multiple failure must
occur in order to give a trip and the (approximate) formula becomes:

STR ≈ C(𝑁−𝑀+1)oo𝑁 ⋅ 𝛽⋅ λSU ; (MooN voting; 2 ≤ 𝑀 ≤ 𝑁; 𝑁 = 2, 3, 4, … )

Here, C(𝑁−𝑀+1)oo𝑁 is the configuration factor given previously, see Table 2. Note that in the above
equation we apply the same β notation as for dangerous failures. It should, however, be noted that the common
cause failure rate for spurious trip failures is not necessarily the same as for dangerous failures. Two safety
valves may, for example, fail to close (FTC) on demand due to scaling, while scaling will never lead to
spurious operation of the same valves. Hence, it may be necessary to reconsider the standard 𝛽 values
when performing spurious trip calculations. This is further discussed in /21/. However, for simplicity, we
have maintained the standard 𝛽 notation in this section.

Approximate STR formulas for different voting logics are given in Table 8, where the first entries present
examples of the general formulas.

Table 8: Formulas for STR 1)

Voting                               STR
1oo1                                 λSU
1oo2                                 2 ⋅ λSU
2oo2                                 𝛽 ⋅ λSU
1oo3                                 3 ⋅ λSU
2oo3                                 C2oo3 ⋅ 𝛽 ⋅ λSU
3oo3                                 C1oo3 ⋅ 𝛽 ⋅ λSU
1oo𝑁; 𝑁 = 1, 2, 3, …                 𝑁 ⋅ λSU
𝑀oo𝑁; 2 ≤ 𝑀 ≤ 𝑁; 𝑁 = 2, 3, …         C(𝑁−𝑀+1)oo𝑁 ⋅ 𝛽 ⋅ λSU

1) These formulas account for CCF only (except for the 1oo𝑁 configurations). Note that shutdowns can also be initiated
as a result of dangerous failures, ref. discussion in section 5.2.3.
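A compact sketch of the STR formulas in Table 8 (our own illustration, reusing the C𝑀oo𝑁 factors from Table 2; values in the example are assumed):

    def spurious_trip_rate(m, n, lambda_su, beta=None, c_factor=None):
        """STR for an MooN voting per Table 8. For M >= 2, c_factor is
        C_(N-M+1)ooN from Table 2; for M = 1 it is not needed."""
        if m == 1:
            # Any single spurious operation trips a 1ooN voted system.
            return n * lambda_su
        # For M >= 2, a CCF of at least N-M+1 components is needed.
        return c_factor * beta * lambda_su

    # Example: a 2oo3 voting, where C_(2)oo3 = 2.0 from Table 2.
    print(spurious_trip_rate(2, 3, lambda_su=1.0e-6, beta=0.03, c_factor=2.0))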


6 PDS CALCULATION FORMULAS – HIGH DEMAND SYSTEMS


This chapter discusses an alternative to the use of PFD as a measure for loss of safety. This has in the
literature (in particular IEC 61508 and IEC 61511) been closely related to the discussion of low versus
high demand mode.

6.1 High and Low Demand Systems


The IEC 61508 standard makes a distinction between so called low demand systems and high demand (or
continuously operating) systems. The formulas described in chapter 5 focused on systems working in the
low demand mode. There will however be systems that are subject to frequent or even more or less
continuous demands. Roughly speaking the two modes of operation can be described as:

• A low demand safety system operates only upon a demand, can often be seen as an add-on to the
basic control system, and shall only be called upon when something goes wrong or starts to go
wrong. Typical examples are a process shutdown system (PSD), a high integrity pressure
protection system (HIPPS) or an emergency shutdown system (ESD).

• A high demand or continuous mode system may be a system that experiences frequent demands or
operates more or less continuously. If operating continuously it can be seen more as a control
system which shall prevent the process or equipment it controls from exceeding certain bounds.
Typical examples of such systems will be a dynamic positioning system, a ballast system, or if
considering other industries than the petroleum sector; a railway signalling system, /18/.

The previous version of IEC 61508, /22/ (see part 4, sect. 3.5.12), defined the split between these two
operating modes by saying that systems where the frequency of demands exceeds one per year, or exceeds
twice the functional test frequency, shall be defined as high demand or continuous mode systems.
However, in the current version of IEC 61508, /1/, the split between low and high demand is related only
to the criterion of the demand frequency being higher or lower than one per year, i.e., it is independent of the
length of the functional test interval. This split between the two modes is not further substantiated in IEC
61508 and may leave an impression of two types of systems that shall be treated completely differently
when calculating the loss of safety.

However, in both versions of the standard it is recommended that PFD is used for measuring loss of safety in the low demand mode, and that otherwise a "frequency of system failure", denoted PFH (probability of failure per hour), is applied. This recommendation and the related specification of low/high demand mode need some further consideration:

• The specified distinction between high and low demand mode (frequency of demands being one per year) seems somewhat arbitrary. Moreover, it is not obvious that the SIS failure rate, PFH, is the most suitable measure in every case where the number of demands is higher than once a year.
• A distinction between a "high" demand mode of say 1–2 demands per year and a continuous demand mode seems more appropriate. As stated above, a continuous system is similar to a control system, which differs significantly from an on demand system, where the "dormant" failures are in focus. So the essential "split" should be between on-demand systems and continuous systems.
• For a (more or less) continuous system, where failures are detected "immediately", the occurrence of "dormant" failures is not that essential for the performance, and the use of PFH as a loss-of-safety measure seems sensible.

So we make the following initial claims:

• We should distinguish between the following three modes: low, high and continuous.


• In the continuous mode, where failure detection is more or less immediate, the system "average failure frequency", PFH, could be a reasonable loss-of-safety measure.
• In all on demand cases where "dormant" failures are an essential issue, i.e., including both the low and high demand mode (but excluding continuous demand mode), the choice of loss-of-safety measure should be based on a more detailed consideration.

We start by focusing on the distinction between low and high demand (disregarding continuous mode) and
discuss the respective measures for expressing loss-of-safety, and the relation between these.

6.2 Loss-of-safety Measures: PFD and PFH examples


For low demand systems, IEC 61508 applies the probability of failure on demand (PFD) as the measure for
loss of safety; cf. chapter 5. This is the average probability that a component or system upon a demand fails
to perform its intended safety function.

For high demand systems the IEC 61508 standard applies the frequency of failure per hour (PFH). This is
the expected number of dangerous component or system failure(s) per hour.

In order to illustrate the relationship between these two loss-of-safety measures, consider as an example a
single component where the system is immediately put in a safe state upon detection of a dangerous failure,
i.e., as usual we assume that the time between self-tests is significantly shorter than the time between
demands. So mainly the DU failures will result in safety unavailability, and by assuming a constant failure
rate, the two safety measures can be calculated as follows:

PFH1oo1 = 𝜆DU

PFD1oo1 ≈ 𝜆DU ⋅ 𝜏/2.

Hence, in order to obtain the PFD we just multiply the rate of dangerous undetected failures by the average period τ/2 during which the component will be unavailable due to a DU failure.

To take a more general example, consider a 1ooN configuration of N identical components failing independently (i.e., not considering common cause failures). Considering only the DU failures, with rate λDU, it can then be shown that:

PFH1ooN ≈ (λDU ⋅ τ)^N/τ ; for N = 1, 2, 3, … ; independent failures only

PFD1ooN ≈ (λDU ⋅ τ)^N/(N + 1) ; for N = 1, 2, 3, … ; independent failures only

The above formula for PFD1ooN is a standard approximation for a safety system with 1ooN voting, ref. Table 3. It is, however, quite easy to explain: for a 1ooN voting all N components have to fail within the same interval, τ, in order to have a system failure. The term (λDU ⋅ τ)^N is the approximate probability that all N components fail in the same test interval of length τ. When we divide this probability by τ, we obtain the rate of this event, see the above formula for PFH1ooN. We note that for a redundant system (N > 1) there must be more than one failure in a test interval in order to have a system failure, and so the length of the test interval, τ, necessarily enters the PFH formula (an alternative explanation of the above formula is given below for a 1oo2 voting).

Further observe that, given a system failure, this will on average occur in the last part of the interval when all components have failed, and 1/(N + 1) is actually the average fraction of the interval where all N components are failed. So the probability that the system has a dangerous undetected failure is the probability that all units fail in an interval, multiplied by this fraction of the interval having a system failure; see the above formula for PFD1ooN.


Observe that the two safety measures – PFH and PFD – are just two different ways of expressing loss of safety, and in the present example we have (here the factor τ/(N + 1) can be interpreted as the average duration of system unavailability when all N components have failed):

PFD1ooN ≈ PFH1ooN ⋅ τ/(N + 1) ; (for N = 1, 2, 3, … ; independent failures only).

Also note that neither of these measures is actually restricted to use in low demand or high demand mode only. However, there is a difference in interpretation: PFD is the relative fraction of time that the system is unavailable, whereas PFH expresses the frequency at which DU failures occur (irrespective of the duration of the resulting unavailability).
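
As a small numerical illustration of these relations, the following sketch evaluates the 1ooN formulas above (independent failures only); the values of λDU and τ are assumed examples.

```python
# Sketch: the 1ooN relations between PFH and PFD (independent DU failures only).
lam_du = 2.0e-6  # dangerous undetected failure rate per hour (assumed example)
tau = 4380.0     # functional test interval in hours (6 months, assumed example)

for n in (1, 2, 3):
    pfh = (lam_du * tau) ** n / tau       # PFH_1ooN
    pfd = (lam_du * tau) ** n / (n + 1)   # PFD_1ooN
    # The relation PFD_1ooN = PFH_1ooN * tau / (N + 1):
    print(f"1oo{n}: PFH = {pfh:.2e} per hour, PFD = {pfd:.2e}, "
          f"PFH*tau/(N+1) = {pfh * tau / (n + 1):.2e}")
```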

6.3 Using PFD or PFH?


When choosing between the measures PFD and PFH it is important to consider the foreseen frequency of process demands. If this is higher than once per year the system should – according to IEC 61508 – be treated as high demand, implying the use of PFH. This criterion for deciding which measure to apply is considered in more depth in Appendix E; the main conclusions are summarised below.

First, it is claimed that an important criterion for choosing a loss-of-safety measure should be its ability to
capture how the safety depends on the following parameters:

• Failure rates (in particular λDU) and β
• Configuration (i.e., CMooN)
• Length of the interval of functional testing (τ)
• Demand rate (δ).

The effect of the demand rate, δ, is in most analyses not explicitly accounted for, although it certainly affects safety. In the present context, where the distinction between low and high demand is introduced, the demand rate becomes an important parameter.

Secondly, we will include the hazard rate (HR), which is the rate of demands where the SIS has failed. This
is an important safety parameter as it expresses the actual hazard frequency. Thus, in addition to
calculating the PFD or PFH, we also want to estimate the HR. In standard reliability theory where the effect
on the PFD from the number of demands is not accounted for, HR = PFD ∙ 𝛿. Since the demands may
actually reduce PFD by serving as a functional test, this relationship can be somewhat more complex.
However, for a fixed average PFD the hazard rate and the demand rate will be proportional.

The investigations in Appendix E mainly consider the standard approximation regarding PFH and PFD,
i.e., only the contribution from CCF is considered for 𝑀oo𝑁 (𝑀 < 𝑁). It is concluded as follows
regarding measures for loss-of-safety due to DU failures:

• PFH seems a sensible measure for systems operating in continuous mode, where failure detection is more or less immediate.

• PFH alone is not suited as a measure for loss of safety for on demand systems (neither low demand nor high demand), irrespective of τ and δ:
o The main contributor to PFH does not depend on the length of the test interval, τ. So the decrease in safety experienced by increasing τ is essentially not captured by PFH.
o No argument is found for using PFH instead of PFD if the number of demands is above one (1) per year. PFH is essentially constant, independent of the demand rate, δ, and does not reveal how the risk depends on the demand rate.
However, by starting from PFH, we can easily calculate both PFD and HR.


• The relation HR = 𝛿 ∙ PFD can be shown to be valid also in the generalized situation where PFD
depends on 𝛿 (see Appendix E). So when the demands actually serve as functional tests, it is
recommended that the above generalized expressions for HR and PFD (depending on both 𝜏 and 𝛿,
see Appendix E), should be used to determine whether the SIS has an acceptable safety
unavailability.

Summing up the above, it can be concluded that for systems working on demand, such as emergency shutdown systems, process shutdown systems and fire and gas detection systems, PFD should generally be preferred over PFH as the safety unavailability measure. However, using the hazard rate (HR) may be an even better alternative for estimating the associated risk.

6.4 PFH Formulas; Including both Common Cause and Independent Failures
In this section we, for completeness, present simplified formulas for calculating PFH for different voting configurations, now also taking into account the effect of combinations of independent failures. The focus is still on DU failures, but it is also discussed how to include the effect of DD failures. Note that the effect of demands is not accounted for in this section.

6.4.1 Assumptions for Simplified PFH Formulas


In section 5.1 some assumptions underlying the PFD formulas were given. Similarly, when establishing
simplified formulas for PFH certain assumptions must be made, i.e.:

• All failure rates are considered constant with respect to time; i.e., an exponential failure model is
assumed.

• PFH is calculated as an average value.

• A component is considered “as good as new” after a repair or a functional test (standard
assumption).

• The time between diagnostic self-tests is assumed to be significantly shorter than the time between demands.

• The self-test period is "small" compared to the interval between functional testing, i.e., at least a factor of 100 shorter.

• When giving the "simple" formulas for PFH (section 6.4.2), the contribution from unavailability due to repair and testing of components is not included (cf. discussion in section 6.4.2), i.e., short MTTRs are assumed.

• For single (1oo1) component systems, the system is immediately put in a safe state upon detection
and repair of a dangerous detected failure. Similarly, a DD failure affecting all 𝑁 redundant
components of a system will upon detection immediately result in the system going to a safe state.
So, in these simplified formulas we actually ignore DD failures, and PFH equals the rate of DU
failures.

• The PFH of the function (safety system) is obtained by summing the PFH of each (set of redundant) module(s). That is, we assume that PFHA and PFHB are small enough to let:

1 − (1 − PFHA) ⋅ (1 − PFHB) ≈ PFHA + PFHB.

• The term λDU ⋅ τ should be small enough to allow e−λDU⋅τ ≈ 1 − λDU ⋅ τ, i.e., λDU ⋅ τ ≤ 0.2.

• The rate of independent DU failures is throughout approximated with λDU (rather than e.g., using
(1 − 𝛽) ⋅ λDU for 1oo2).


• For 𝑁 ≥ 3 we ignore the contribution of a combination of single and double failures. For instance,
when considering a triple system voted 1oo3, we will only include, in the system failure
frequency, common cause failures taking all three components out or three separate (independent)
failures. Consequently, we will disregard the possibility that within the same test interval one
common cause failure takes out two components whereas the third component fails independently.

• The formulas given here do not account for demands as a means of testing to detect dangerous failures (ref. discussion in section 3.4.3).

6.4.2 Simplified PFH Formulas


In the following simplified formulas the above list of assumptions applies, implying that the contribution
from dangerous detected failures is negligible. For inclusion of DD failures, relaxing some of the
assumptions, see next section.

1oo1 voting
As discussed above, when considering a single 1oo1 system which goes to a safe state upon detection of a
dangerous failure, the PFH will equal the dangerous undetected failure rate:

PFH1oo1 = λDU .

Observe that for a single component system the test interval τ does not enter the PFH formula. A constant failure rate is assumed, and functional testing (of a single component system) will not influence this failure rate. Additionally, frequent self-tests prevent DD failures from contributing significantly.

1oo2 voting
For a duplicated module, voted 1oo2, we get the following contribution when first considering only the common cause failures:

PFH1oo2^(CCF) ≈ β ⋅ λDU.

In addition we need to add the contribution from independent failures. This contribution can be approximated by:

PFH1oo2^(ind.) ≈ (λDU ⋅ τ)^2/τ = λDU^2 ⋅ τ.

Observe that for a duplicated – and generally for a redundant – system, the test interval τ does enter the PFH formula. This can be explained as follows for a system voted 1oo2: upon failure of either of the components (with constant rate λDU) the likelihood of the other unit also being down (and thus giving a system failure upon a demand) will inevitably depend on how long it is since the components were tested. Or said in other words: for a single system it is assumed that a critical situation will occur "at once" upon the introduction of a DU failure, whereas for a redundant system there will be "back-up", and the availability of this back-up will depend on the time since the last functional test.

Now, the contribution from independent failures in a 1oo2 voting can be intuitively interpreted as follows: Consider a system with two redundant components A and B, each failing with the constant rate λDU. If component A fails this will "place a demand" on component B, which must have failed for the redundant system to fail. The rate of this event can be expressed as the rate of failure of component A, i.e., λDU, times the likelihood of component B having failed upon the demand, i.e., (λDU ⋅ τ)/2. Hence the rate of the above event becomes λDU ⋅ (λDU ⋅ τ)/2. Similarly, component B can fail first with rate λDU, which is multiplied by the likelihood of component A failing on demand, i.e., λDU ⋅ (λDU ⋅ τ)/2. By adding these two equal contributions, we obtain the above expression for PFH1oo2^(ind.).


Hence, when including both the common cause and the contribution from independent failures, we get the
following PFH formula for a 1oo2 voted system:

PFH1oo2 ≈ β ⋅ λDU + (λDU ⋅ τ)^2/τ.

2oo3 voting
For components voted 2oo3, the common cause contribution to the PFH is given by:

PFH2oo3^(CCF) ≈ C2oo3 ⋅ β ⋅ λDU.

The contribution from independent failures can be approximated by:

PFH2oo3^(ind.) ≈ 3 ⋅ (λDU ⋅ τ)^2/τ = 3 ⋅ λDU^2 ⋅ τ.

We observe that this contribution is three times the independent-failure contribution for a 1oo2 voting. This can be intuitively explained as follows: Consider a system with three components A, B and C voted 2oo3, each failing with a constant rate λDU. For a 2oo3 voting two components must fail for the system to fail (we here disregard the possibility of triple failures, which will also give a system failure for the 2oo3 system). Now consider the case when component A fails. This will place a demand on components B and C, and upon failure of either of these components the system fails. The rate of this event can be expressed as the rate of failure of component A, i.e., λDU, times the likelihood of either component B or C having failed upon the demand, i.e., 2 ⋅ (λDU ⋅ τ)/2. Hence the rate of the above event becomes λDU ⋅ (λDU ⋅ τ). Similarly, component B or C can fail first, and by adding these three equal contributions, we obtain the above expression for PFH2oo3^(ind.).

When including both the common cause and the contribution from independent failures, we then get the
following PFH formula for a 2oo3 voted system:

PFH2oo3 ≈ C2oo3 ⋅ β ⋅ λDU + 3 ⋅ (λDU ⋅ τ)^2/τ.

MooN voting
Generally, for components voted 𝑀oo𝑁, the common cause contribution to the PFH is given by:

PFHMooN^(CCF) ≈ CMooN ⋅ β ⋅ λDU ; (M < N).

When also including the contribution from independent failures, we obtain the following approximate formula for an MooN voted system, ref. /18/:

PFHMooN ≈ CMooN ⋅ β ⋅ λDU + [N!/((N − M + 1)! ⋅ (M − 1)!)] ⋅ (λDU ⋅ τ)^(N−M+1)/τ

Finally, for an NooN voting, we may apply the following approximation formula:

PFHNooN ≈ N ⋅ λDU.


Summary of simplified formulas for PFH


The simplified/approximate PFH formulas for some different voting logics are summarised in Table 9. The table includes the following:

i. In the first column, the voting logic (MooN) is given;
ii. In the second column, the PFH contribution from common cause failures is included;
iii. In the third column, the contribution to PFH from independent failures is given.

Table 9: Simplified formulas for PFH (DU failures only; not accounting for demands as a test)

                              PFH calculation formulas
Voting                        Contribution from CCF    Contribution from independent failures
1oo1                          –                        λDU
1oo2                          β ⋅ λDU                  + (λDU ⋅ τ)^2/τ
2oo2                          –                        2 ⋅ λDU
1oo3                          C1oo3 ⋅ β ⋅ λDU          + (λDU ⋅ τ)^3/τ
2oo3                          C2oo3 ⋅ β ⋅ λDU          + 3 ⋅ (λDU ⋅ τ)^2/τ
3oo3                          –                        3 ⋅ λDU
MooN; M < N; N = 2, 3, …      CMooN ⋅ β ⋅ λDU          + [N!/((N − M + 1)! ⋅ (M − 1)!)] ⋅ (λDU ⋅ τ)^(N−M+1)/τ
NooN; N = 1, 2, 3, …          –                        N ⋅ λDU

Note that there is a close connection between the PFH formulas given above and the ones for PFD in Table 3. The results can be summarised as follows.

For MooN voting (M < N):

PFDMooN^(CCF) ≈ PFHMooN^(CCF) ⋅ τ/2

PFDMooN^(ind.) ≈ PFHMooN^(ind.) ⋅ τ/(N − M + 2)

And for NooN:

PFDNooN ≈ PFHNooN ⋅ τ/2

So except for combinations of independent failures we have (see section 6.3) that PFD ≈ PFH ⋅ τ/2 ("linear case").
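
The Table 9 formulas are straightforward to script. The following Python sketch is one possible implementation; the input values are assumed examples, and the C_MooN factors must be taken from Table 2 (C2oo3 = 2.0 as in the worked example in chapter 8; C_1oo2 = 1.0 reflects the β ⋅ λDU entry for 1oo2).

```python
from math import factorial

def pfh_moon(lam_du, tau, beta, m, n, c_moon=None):
    """Simplified PFH for an MooN voting per Table 9 (DU failures only;
    demands not credited as tests). c_moon is the C_MooN factor from Table 2."""
    if m == n:  # NooN: any single DU failure fails the system
        return n * lam_du
    ccf = c_moon * beta * lam_du
    coeff = factorial(n) // (factorial(n - m + 1) * factorial(m - 1))
    ind = coeff * (lam_du * tau) ** (n - m + 1) / tau
    return ccf + ind

# Assumed example values: lambda_DU = 2e-6 per hour, tau = 6 months, beta = 5 %
lam_du, tau, beta = 2.0e-6, 4380.0, 0.05
print("PFH 1oo2:", pfh_moon(lam_du, tau, beta, m=1, n=2, c_moon=1.0))
print("PFH 2oo3:", pfh_moon(lam_du, tau, beta, m=2, n=3, c_moon=2.0))
print("PFH 2oo2:", pfh_moon(lam_du, tau, beta, m=2, n=2))
```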

6.4.3 Including MTTR and DD Failures in the PFH Formulas


As stated in the list of assumptions, the above formulas are valid when:


1. The time between diagnostic self-tests is "very short", so that DD failures can be assumed to be detected "immediately";
2. The MTTR is also very short (e.g., compared to τ), and in practice we let MTTR = 0;
3. A DD failure causing a system failure will upon detection result in the system immediately going to the safe state. This assumption corresponds to the philosophy (see section 5.2.3) "Degraded operation if possible; otherwise shutdown". Note that this is the same operational philosophy specified in IEC 61508 for calculating PFH (see section B.3.3 in part 6 of the standard).

When combining assumptions 2 and 3, the result is – as shown in the above simplified formulas – that the DD failures are ignored and (for a 1oo1 voting) the PFH equals the rate of DU failures (a DD failure can never cause a system failure due to assumption 3, and degradation periods due to DD failures are ignored due to assumption 2).

In this section we will relax some of these assumptions. Assumption 1 and initially also assumption 3 are
maintained. Assumption 2 is however removed, as MTTR is assumed sufficiently long to affect
unavailability (of channels) due to independent failures. As compared to the PFD/CSU (low demand)
calculations, this corresponds to including the DTUR contribution in the unavailability measure.

Consider first a 1oo2 voting configuration. The PFH still consists of two terms (Table 9): the CCF contribution resulting from DU failures and the contribution from independent DU failures in both channels. The second contribution is now changed:

• First one of the two channels fails, causing one channel to be unavailable. The mean duration of this unavailability is τ/2 for a DU failure and MTTR for a DD failure.
• During this unavailability of one channel, there is a DU failure of the second channel.

This results in the following PFH:

PFH1oo2 = β ⋅ λDU + 2 ⋅ (λDU ⋅ τ/2 + λDD ⋅ MTTR) ⋅ λDU

Observe that for MTTR = 0 this reduces to the expression given in Table 9. When we compare with the corresponding expression given in section B.3.3.2 of IEC 61508-6, the results are very similar. The standard introduces different β values for DU and DD failures: (1 − βD) ⋅ λDD is used for the rate of independent DD failures, and (1 − β) ⋅ λDU is used rather than λDU for the rate of independent DU failures (which is similar to the more accurate PFD formulas in Appendix C). Further, in the standard the mean unavailability period of a DU failure is given as τ/2 + repair time, rather than τ/2 as used above.

Following the same line of argument for 2oo3 and 1oo3 configurations we get:

PFH2oo3 = C2oo3 ⋅ β ⋅ λDU + 6 ⋅ (λDU ⋅ τ/2 + λDD ⋅ MTTR) ⋅ λDU

and

PFH1oo3 = C1oo3 ⋅ β ⋅ λDU + 6 ⋅ (λDU ⋅ τ/2 + λDD ⋅ MTTR) ⋅ (λDU ⋅ τ/3 + λDD ⋅ MTTR) ⋅ λDU.

Again, these formulas are comparable to the results of the IEC standard. It is worth observing that 𝜆DU is a
common factor in all the PFH expressions given above, corresponding to the rate of the "final" DU failure
that causes the system to fail.
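
A possible scripting of these extended formulas is sketched below. The λDU, λDD, β and MTTR values are assumed examples, and C2oo3 = 2.0 is the value used in the worked example in chapter 8; with mttr = 0 the Table 9 expressions are recovered.

```python
def pfh_1oo2(lam_du, lam_dd, tau, beta, mttr):
    """PFH for a 1oo2 voting with assumption 2 relaxed: a channel is also
    unavailable for MTTR hours after a DD failure (the degraded-operation
    philosophy of assumption 3 is retained)."""
    return beta * lam_du + 2 * (lam_du * tau / 2 + lam_dd * mttr) * lam_du

def pfh_2oo3(lam_du, lam_dd, tau, beta, c_2oo3, mttr):
    """Corresponding PFH for a 2oo3 voting."""
    return c_2oo3 * beta * lam_du + 6 * (lam_du * tau / 2 + lam_dd * mttr) * lam_du

# Assumed example values:
print(pfh_1oo2(lam_du=0.5e-6, lam_dd=1.0e-6, tau=4380.0, beta=0.05, mttr=8.0))
print(pfh_2oo3(lam_du=0.5e-6, lam_dd=1.0e-6, tau=4380.0, beta=0.05,
               c_2oo3=2.0, mttr=8.0))
```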

The general expression for an MooN voting is relatively complex. However, for an NooN voting we have, as before:


PFH𝑁oo𝑁 = 𝑁 ⋅ 𝜆DU.

Finally, consider a change to assumption 3 given above and assume that, for instance, a DD failure of all channels will give a system failure (i.e., the system is not brought to a safe state). This means that the "last failure", causing the system to fail, has rate λD = λDU + λDD rather than λDU. So the result for this operational philosophy, "No protection", is given as:

PFH_MooN^(No protection) = (λD/λDU) ⋅ PFH*_MooN

where PFH*_MooN represents the expressions for the philosophy used above: "Degraded operation if possible; otherwise shutdown".

Recently a formula including both DD and DU failures, but still having MTTR = 0, has been obtained. The result is, see /23/ (which actually obtains a formula including even further terms; the present formula gives the main terms):

PFH_MooN^(No protection) ≈ C_MooN ⋅ β ⋅ (λDU + λDD) + [N!/((N − M + 1)! ⋅ (M − 1)!)] ⋅ (λDU ⋅ τ)^(N−M) ⋅ (λDU + λDD).

So if referring to PFH_MooN as the formula obtained from Table 9, it is easily seen that by inserting λD = λDU + λDD we again have the following simple relation:

PFH_MooN^(No protection) = (λD/λDU) ⋅ PFH_MooN

Often λDD ≫ λDU, and for the operational philosophy "No protection" one should apply this PFH formula, i.e., include DD failures in the PFH calculations, irrespective of the value of τ.
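
This scaling relation can be sketched as follows (assumed example values):

```python
def pfh_no_protection(pfh_degraded, lam_du, lam_dd):
    """'No protection' philosophy (MTTR = 0): scale the degraded-operation
    PFH by lambda_D / lambda_DU, since the 'last' failure causing the
    system to fail may now also be a DD failure."""
    return (lam_du + lam_dd) / lam_du * pfh_degraded

# Assumed example: lambda_DD twice lambda_DU gives a scaling factor of 3.
print(pfh_no_protection(pfh_degraded=2.7e-5, lam_du=0.5e-6, lam_dd=1.0e-6))
```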

6.4.4 Failure Data for High Demand Systems

The use of failure data for high demand systems is a difficult issue. Most of the SIS data from the process industry are based on low demand systems. Applying these data for systems with higher demand rates is not necessarily correct, since other failure mechanisms may be introduced. Further, failure data for continuously operated components (pneumatic valves, relays, contactors, position switches, etc.) may originally be derived from cyclic fatigue testing and the subsequent procedure described in section C.4 of EN-ISO 13849-1 ("Safety of machinery – Safety related parts of control systems – Part 1: General principles for design") for deriving mean time to failure (MTTF) figures. Using such data for on demand systems may be a dubious exercise, since systems which are basically stand-by will experience different failure mechanisms than systems operating more or less continuously.


7 CALCULATIONS FOR MULTIPLE SYSTEMS

7.1 Background
Often more than one SIS – or a combination of a SIS and other safety systems – is necessary in order to
obtain a desired risk reduction. The safety systems may then work in sequence, where the additional
systems are back-up systems that are activated once the preceding system fails. Such redundant safety
systems are commonly referred to as safety layers (e.g., in the LOPA terminology) and the total protection
system may be referred to as a multiple safety system or a multiple SIS.

Normally when having multiple safety systems, it is sufficient that one of the systems works successfully
in order to have a successful safety action. In reliability terminology this means that the systems are voted
1oo𝑁 where 𝑁 is the number of redundant systems.

In many industries, such as the petroleum industry, two or possibly three safety layers are common. A
simple example is given in Figure 8 where a possible solution for high level protection of a pressure vessel
is indicated. Here, level protection of the vessel is implemented both through the PSD system and the ESD
system.

Figure 8: Example of a multiple SIS – combining PSD and ESD level protection. (The figure shows level transmitters LT1 and LT2 connected to the ESD and PSD logic, which close the valves ESV and XV, respectively, via pilots on the line into the pressure vessel.)

When addressing the total reliability of several SISs combined, one often calculates the average PFD
(PFDavg) of each SIS independently, and combines the results to find the total PFD of the multiple SIS. A
standard approach, adopted by many tools and practitioners, is to simply take the product of the individual
PFDavg to find the total (average) PFD. This is appropriate as long as the PFDs of the systems are
independent, but independence is rarely the case. Dependencies exist between the SISs, as well as between
components within a SIS, due to e.g., simultaneous functional testing, close location or common utility
sources (hydraulic, electricity, etc.). Dependent failures may be split into three categories; common cause
failures (CCF), cascading failures and negative dependencies, ref. /15/. CCFs are discussed separately in
chapter 4, whereas cascading failures and negative dependencies are not explicitly addressed in this
handbook.

In this chapter we primarily focus on the impact of the systemic dependencies introduced by periodic testing. We mainly consider the PFD arising from independent failures of components, but CCFs are also treated to some extent.

The calculation error made when simply multiplying average PFD values may be negligible or significant,
depending on the case at hand. It may also be either conservative or non-conservative. In order to minimize the error and ensure robustness of the calculations, a correction factor (CF) is introduced to the product of
average PFD values. For example, for a multiple SIS comprising two layers, the average PFD of the
multiple SIS can be calculated as:

PFDavg = CF ∙ PFDavg (SIS1 ) ∙ PFDavg (SIS2 )

The main purpose of this chapter is to discuss the appropriate use of correction factors (CFs) in various
cases, and to suggest CFs for multiple SIS.

It should be pointed out that as the complexity of the modelling increases – due to e.g., the number of components, dependencies between components, specific repair strategies and alternative probability distributions – simplified formulas may reach their limit of application. It may then be appropriate to explore alternative modelling approaches, like Markov driven fault trees and Petri net modelling. For further reading on this, reference is made to the new ISO TR 12489, /24/.

7.1.1 Time-Dependent PFD and Average PFD


In reliability calculations and in this handbook components are assumed to have a constant failure rate,
denoted 𝜆. The time to failure then follows an exponential probability distribution with parameter 𝜆. The
component unavailability at time 𝑡 is the probability of having failed at time 𝑡, and is called the probability
of failure on demand (PFD) at time 𝑡. The PFD is an increasing function of time, with PFD(𝑡) = 1 − e−𝜆𝑡 .
For small values (<< 1) of the product 𝜆𝜏, we can make the approximation PFD(𝑡) ≈ 𝜆𝑡. The validity of
this approximation is ensured by introducing periodic testing, limiting the value of the parameter 𝑡 to the
length 𝜏 of the test interval. Just after a functional test, the PFD of a component is assumed to be 0 (perfect
testing), and then it increases (approximately) linearly throughout the test interval to its maximal value 𝜆𝜏,
until it drops to 0 again at the next test. This gives the characteristic saw-tooth PFD curve for periodically
tested components.

The unavailability varies over time, but one is often interested in quantifying the average unavailability or average PFD over some period of time. Since PFD(t) is a repeating function, it suffices to find the average over one test interval, and since PFD(t) is approximately linear, the average is PFDavg ≈ λτ/2. The situation is illustrated in Figure 9.

Figure 9: PFD(t), with a linear approximation for small values of λt. (The figure shows the saw-tooth curve of PFD(t) = 1 − e^(−λt) ≈ λt over consecutive test intervals τ, 2τ, 3τ, 4τ, with the average level PFDavg ≈ λτ/2 indicated.)

In the remainder of this chapter we apply the following notation:

• PFD(t) denotes the instantaneous, time-dependent PFD.
• As in previous chapters, PFD denotes the average, time-independent PFD. The subscript avg is hence omitted for increased readability.


7.1.2 The Product of Averages versus the Average of a Product

Simultaneous testing of components introduces dependencies between the failure events of the components. When tests are performed simultaneously – as is often the case – the PFD(t) of the components are low at the same time and high at the same time, which disfavours the total PFD. When testing is not simultaneous, but distributed in time (i.e., staggered testing), some components will have a low PFD(t) while others have a high one, and vice versa; this is beneficial to the total PFD. (For a more thorough discussion of non-simultaneous (staggered) testing, see e.g., /15/.)

In component structures featuring redundancy, the PFD(t) expression for the structure will contain terms that are the product of the PFD(t) of two or more components. For example, a parallel structure voted 1oo2 with two identical components with failure rate λ will have:

PFD(t) = PFD1(t) ∙ PFD2(t) = (λt)^2

This is a quadratic function in t, which in the case of simultaneous testing with test interval τ has the average value:

PFD = (1/τ) ∫₀^τ PFD1(t) ∙ PFD2(t) dt = (λτ)^2/3

We recognize this PFD expression from Table 3. If on the other hand we calculate the PFD by simply taking the product of the average PFDs, we get:

PFD1 ∙ PFD2 = (λτ/2) ∙ (λτ/2) = (λτ)^2/4

The results differ, and we see that by multiplying the average PFDs, the total PFD is underestimated. In this case a correction factor of 4/3 must be applied to obtain the correct answer. When the degree of redundancy increases, so does the power of the λτ terms in the overall PFD(t) expression, and so does the CF.
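
The factor 4/3 can also be verified numerically, as in the following sketch (assumed example values; a simple midpoint integration of PFD1(t) ∙ PFD2(t) over one test interval):

```python
# Sketch: average of a product vs product of averages for two identical,
# simultaneously tested components voted 1oo2 (independent failures).
lam, tau = 1.0e-6, 8760.0  # assumed example failure rate and test interval
n_steps = 100_000
dt = tau / n_steps

# Average of the product PFD1(t)*PFD2(t) = (lam*t)^2 over one test interval:
avg_of_product = sum((lam * (i + 0.5) * dt) ** 2 for i in range(n_steps)) * dt / tau

# Product of the two average PFDs:
product_of_averages = (lam * tau / 2) ** 2

print(avg_of_product / product_of_averages)  # approximately 4/3
```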

7.2 Motivation for Using Correction Factors (CF) in Multiple SIS Calculations
As a motivation for using CFs in multiple SIS calculations, we consider a multiple SIS comprising two
SISs, i.e., SIS1 and SIS2, and generalize when appropriate. Simultaneous testing is assumed. For simplicity
of notation – and without loss of generality – we also assume identical failure rates for the components.

We first consider the extreme case where each SIS consists of only one component. Each SIS will then have PFD = λτ/2, and the product of the individual PFDs is (λτ)^2/4. However, as discussed above, the actual PFD of the multiple SIS is (λτ)^2/3, and the appropriate CF is 4/3.

In the general case of N SISs (voted 1ooN), where each SIS is a single component, we get:

PFD = (1/τ) ∫₀^τ ∏ᵢ PFDᵢ(t) dt = (1/τ) ∫₀^τ (λt)^N dt = (λτ)^N/(N + 1)

The product of the average PFDs is:

∏ᵢ PFDᵢ = (λτ/2)^N = (λτ)^N/2^N


Hence, the appropriate CF is:

CF = PFD / ∏ᵢ PFDᵢ = 2^N/(N + 1)

Next, consider the case where SIS1 still consists of only one component, but SIS2 features redundancy in terms of a simple parallel structure of two identical components voted 1oo2. The PFD for SIS1 is still λτ/2, while the PFD for SIS2 is (λτ)^2/3; the product of the individual PFDs is now (λτ)^3/6. However, the two SISs combined are in effect a parallel structure voted 1oo3, so the actual PFD is (λτ)^3/4. Hence the appropriate correction factor is 3/2. Note that the "global" approach of disregarding the internal redundancy and viewing the multiple SIS as a global structure voted 1oo2 suggests a CF of 4/3, which will underestimate the actual PFD and give a non-conservative result.

In the general case of N SISs, where each SISₖ has nₖ components voted 1oonₖ, we get (products and sums taken over k = 1, …, N):

CF = ∏ₖ(nₖ + 1) / (1 + ∑ₖ nₖ)    (*)

We may generalize further to more complex voting logics. In the general case of N SISs where each SISₖ has nₖ components voted mₖoonₖ, it can be shown that the appropriate correction factor is:

CF = ∏ₖ(nₖ − mₖ + 2) / (1 + ∑ₖ(nₖ − mₖ + 1))    (**)

It is important to note that the above equations require that the PFD of each SIS is correctly calculated
according to the PDS formulas given in Table 3. The PDS formulas implicitly apply the CF in equation
(**) to the product of average PFDs of redundant components within an element.
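
Equation (**) is easily scripted; the following sketch is one possible implementation and reproduces the CFs derived in the examples above.

```python
def correction_factor(votings):
    """CF from equation (**) for N simultaneously tested SISs, where SIS_k
    is a single element voted m_k oo n_k; votings is a list of (m, n)."""
    numerator, s = 1, 0
    for m, n in votings:
        numerator *= n - m + 2
        s += n - m + 1
    return numerator / (1 + s)

print(correction_factor([(1, 1), (1, 1)]))  # two single components: 4/3
print(correction_factor([(1, 1), (1, 2)]))  # single + 1oo2 element: 3/2
print(correction_factor([(1, 1)] * 3))      # three single components: 2.0
```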

The above discussion highlights two main findings in the case of simultaneous testing:

1. A CF is required when multiplying average PFD values; not using a CF will give non-conservative results. CFs should be applied to all parallel structures, both for identical components within a SIS element (automatically catered for by using the PDS formulas) and between individual SISs in a multiple SIS.

2. The appropriate CF depends on the voting logic of the elements in the individual SISs. Disregarding internal redundancies in the elements and applying a CF according to the global 1ooN voting logic of the multiple SIS may also give non-conservative PFD figures.

So far we have only considered SISs consisting of a single element (possibly containing redundant
components). Normally a SIS will comprise several elements in series, and the above formulas will not
apply directly. Returning to our example, SIS1 is still a single component element while SIS2 consists of
two elements in series, where the first element is a single component and the second element is a parallel
structure with two components voted 1oo2 (Figure 10).


Figure 10: Example of a multiple SIS consisting of two layers. (The figure shows SIS1 as a single component, 1, and SIS2 as a single component, 2, in series with a 1oo2 parallel structure of components 3 and 4. If the 1oo2 element is negligible, the appropriate CF is 4/3; if the single component element is negligible, the appropriate CF is 3/2.)

We first assume that the 1oo2 element (components 3 and 4) has a negligible contribution to the PFD of
SIS2, so that SIS2 can be approximated by the single component element. As shown above, the appropriate
CF in this case is 4/3. We then assume the opposite, i.e., that the single component element in SIS2 has a
negligible PFD contribution, so that SIS2 can be approximated by the 1oo2 element. Using equation (*)
with n1 = 1 and n2 = 2, the CF in this case is 3/2. Normally we will have a situation between these two
extremes, so the appropriate CF will lie somewhere between 4/3 and 3/2.

This demonstrates another finding:

3. The PFD contribution from each element should be considered when deciding the appropriate CF. More influential elements should be given more weight in the process of determining the CF.

Given these findings in example cases involving simple, degenerated SISs, the question is how to
determine appropriate CFs in real cases where the SISs normally consist of several elements and where
redundancy is common. In addition, the underlying assumption of simultaneous testing of all equipment
involved will normally not be fulfilled.

7.3 Correction Factors for Simultaneous Testing


Several approaches may be envisaged to find an appropriate correction factor in the assumed case of
simultaneous testing of all equipment involved in the multiple SIS. Which approach to adopt depends on
the particular situation at hand in terms of system knowledge and the analyst's preferences. In this
handbook we recommend using the “global” approach for multiple SIS calculations in general situations.
Appendix D contains an overview and description of other possible approaches.

7.3.1 The “Global” Approach


The “global” approach considers the multiple SIS as a global system with a 1oo𝑁 voting logic, where 𝑁 is
the number of SISs. The internal structure of each SIS is disregarded; this implicitly assumes that each SIS
element – or at least each element with a non-negligible contribution to the total PFD of the SIS – is a
single component. The situation is illustrated in Figure 11.

Figure 11: The "global" approach to determining CF. (The N SISs, SIS1 … SISN, are viewed as N parallel single blocks voted 1ooN, giving CF = 2N/(N + 1).)


As shown earlier, the appropriate CF for a 1ooN structure is:

CF = 2N/(N + 1)

Numerical CF values for various N are given in Table 10.

Table 10: Correction factors for multiple SIS when the element structure of each SIS is unknown or disregarded. Equal test intervals are assumed for all equipment involved

Number of SISs      CF
1                   1
2                   1.33
3                   2
4                   3.2
N                   2N/(N + 1)

7.3.2 Discussion of the “Global” Approach


The “global” approach disregards internal redundancies in the SIS elements, which may give non-
conservative PFD figures, as discussed above. However, in real cases this is normally not a problem, for
two reasons described in the following.

First, single component elements are usually less reliable than elements featuring redundancy, making the
PFD contribution from the single component elements more dominant. Such elements should therefore be
attributed most weight in determining the CF. The error made when disregarding elements with redundancy
might be considered negligible.

Second, when there are no single component elements – or in (rare) cases where such elements have an
insignificant PFD contribution – elements with redundancy will dominate with respect to PFD contribution.
However, this contribution has two parts: one part stems from simultaneous independent failures of
redundant components, while the other part stems from common cause failures (CCF) of the components.

The relative contribution of these two parts will depend on:


• the failure rate
• the fraction of CCF (i.e., the 𝛽 factor)
• the length of the test interval
• the degree of redundancy

In most cases the CCF contribution will far outweigh the contribution from independent failures. The CCF
can be modelled as the failure of an artificial single component element, which may be included in the
functional/reliability block diagrams in the same way as “real” components (cf. figures in Appendix D).
Hence, the main PFD contribution in the SIS is essentially that of the artificial single component element,
which is always voted 1oo1, supporting the use of the “global” approach.


Although the “global” approach will be suitable and sufficient for the vast majority of multiple SIS
calculations, there might be special circumstances where it is advisable to use one of the other approaches
2–6. A guide to selecting a suitable calculation approach can be found in Appendix D.

7.4 Correction Factors for Non-Simultaneous Testing


So far we have considered simultaneous testing of all components, both within each SIS and between
different SISs. This will not always be the case. For a given SIS, the reliability of the various elements will
differ greatly; particularly logic units normally have a much higher reliability than sensors and actuating
units (e.g., due to diagnostic testing of logic). The need for functional testing is therefore different for the
elements. Similarly, personnel trained to test a particular type of equipment may not be suited to test other
types of equipment within the SIS. The expertise available for testing may therefore be different for the
elements. In consequence, this may lead to differences in the testing regimes for the elements in the SIS.
However, tests still tend to be performed simultaneously for practical reasons, e.g., during revision stops,
even though some tests then will be considered less necessary.

The effects of non-simultaneous testing are discussed in Appendix D. Dissimilarities in the test intervals,
both in length and phasing, reduce the test dependencies between components, and support the use of
average values without CF in the overall PFD calculations. Some main findings are summarised below.

7.4.1 The Effect of Different Phasing of Test Intervals


Components may have test intervals of equal length, but with different phasing (i.e., starting points). The appropriate CF to apply depends on the phasing of the test intervals.

• Simultaneous testing of components requires the highest CF. This is the assumed case throughout section 7.3.
• Evenly distributed testing of components requires the lowest CF. Test dependencies may be exploited to give a total PFD that is actually lower than the product of individual PFDs, so that a CF < 1 is appropriate.
• Independent testing of components requires no CF. If components are tested completely independently (corresponding to tests performed randomly in time from a uniform distribution), the total PFD is equal to the product of average PFDs.

7.4.2 The Effect of Different Lengths of Test Intervals


Components may have test intervals of unequal length. The appropriate CF to apply depends on the difference in interval length, as well as the difference in phasing. Consider for example two components voted 1oo2 where the test interval of component 2 is a multiple of the interval of component 1, i.e., τ2 = n ⋅ τ1. Testing is "quasi-simultaneous" in the sense that whenever component 2 is tested, component 1 is tested simultaneously (component 1 is tested alone n − 1 out of n times). It can be shown (Appendix D) that the required CF is:

CF = (3n + 1)/(3n)

The CF diminishes rapidly with increasing n, so the more dissimilar the test intervals are, the smaller the
correction will be. For instance, if the two components are tested quarterly and annually (𝑛 = 4), we get
CF = 1.08, i.e., a correction of only 8 %, compared to the “base case” CF of 1.33 corresponding to equal
intervals (𝑛 = 1).

Adding different phasing to the testing, we obtain the same results as summarized in section 7.4.1, i.e.,
simultaneous testing requiring the highest CF, evenly distributed testing requiring the lowest CF (< 1) and
independent testing requiring no CF at all.


Different phasing will also automatically result if the “quasi-simultaneousness” is removed by letting n not
be an integer, i.e., if the test interval of component 2 is not an exact multiple of that of component 1. This
will over time resemble the situation of independent testing, so no CF should be applied.

7.5 Concluding remarks


It is not straightforward to derive general rules for situations where multiplying average PFD values
without correction factors will be sufficient. It seems clear that in the extreme case of simultaneous testing
of all the equipment involved, using a CF as outlined in section 7.3 is appropriate. In the other extreme (and
theoretical) case of no particular patterns in the testing, no CF need be applied.

Nevertheless, the normal situation is represented by a considerable degree of simultaneousness in the testing, especially within a SIS, but also between SIS layers. As a general rule – and to ensure conservativeness – it is therefore recommended to use CFs as outlined in section 7.3.


8 QUANTIFICATION EXAMPLE
To illustrate the PDS method, a worked example for a topside HIPPS function is presented. Here the
critical safety unavailability (CSU) and the spurious trip rate (STR) are calculated (the downtime
unavailability, DTU, is assumed negligible in this example).

8.1 System Description – Topside HIPPS function


Figure 12 illustrates a HIPPS (High Integrity Pressure Protection System) protecting a vessel. The system
is implemented with 2oo3 voting on the pressure transmitters (PT), and redundant logic and valves (1oo2).

Figure 12: Example of a simplified topside HIPPS solution. (The figure shows three pressure transmitters, PT1–PT3, voted 2oo3, connected to redundant logic which closes the two HIPPS valves V1 and V2 via pilots on the line into the pressure vessel.)

Upon high pressure in the vessel, the pressure transmitters will trip and send a signal via the control logic
to shut down the HIPPS valves (V1 and V2).

8.2 Reliability Input Data


The quantification of safety unavailability for the above system is based on the input data presented in Table 11, which are taken from the updated PDS data handbook, /11/.


Table 11: Generic reliability data for the HIPPS components

                                            Failure rate per 10^6 hours
Component                                   λD     λS     λDU    λSU     PTIF        β      τ (months)
PT, Pressure transmitter                    1.5    0.5    0.5    0.4     5 ⋅ 10^-4   6 %    6
Trip amplifier / analogue input (single)    0.04   0.4    0.04   0.4
Logic solver (1oo1)                         0.03   0.3    0.03   0.3     5 ⋅ 10^-6   3 %    12
Digital output (single)                     0.04   0.4    0.04   0.4
HIPPS valve incl. actuator &
solenoid valve                              2.7    3.3    1.9    3.0     1 ⋅ 10^-4   5 %    6

A functional test interval (𝜏) of 6 months (4380 hours) has been assumed for the field equipment whereas
control logic units are assumed tested annually.

8.3 Loss of Safety Assessment – CSU


In this section we calculate the PFD, the TIF contribution and the CSU for the safety instrumented system.
The undesired event is failure of the HIPPS to close at least one of the valves in case of high pressure.

The system consists of three modules; the pressure transmitters, the logic and the valves. Figure 13 shows a
reliability block diagram with respect to loss of safety for the overpressure protection system in Figure 12.

Figure 13: Reliability block diagram for topside HIPPS example. (The 2oo3 transmitter voting is represented by three parallel pairs, PT1 ⋅ PT2, PT1 ⋅ PT3 and PT2 ⋅ PT3, in series with two parallel channels each consisting of an analogue input, a logic solver, a digital output and a HIPPS valve.)

Calculation of PFD
Here, two of the modules are voted 1oo2, while one is voted 2oo3. The common cause contributions from each module are first calculated from the simplified equations (ref. Table 3), i.e., β ⋅ λDU ⋅ τ/2 (for 1oo2 voting) and C2oo3 ⋅ β ⋅ λDU ⋅ τ/2 (for 2oo3 voting), where C2oo3 = 2.0 (ref. Table 2), giving:

PFDPT(2oo3)^(CCF) = 2.0 ⋅ 0.06 ⋅ 0.5 ⋅ 10^-6 ⋅ 4380/2 = 1.3 ⋅ 10^-4

PFDLogic(1oo2)^(CCF) = 0.03 ⋅ (0.04 ⋅ 10^-6 + 0.03 ⋅ 10^-6 + 0.04 ⋅ 10^-6) ⋅ 8760/2 = 1.4 ⋅ 10^-5

PFDValve(1oo2)^(CCF) = 0.05 ⋅ 1.9 ⋅ 10^-6 ⋅ 4380/2 = 2.1 ⋅ 10^-4

The above PFD figures include only the contributions from common cause failures. In addition, the contributions from independent failures must be added (or at least it must be verified whether these contributions are negligible or not). As seen from Table 3 (rightmost column), these independent-failure contributions are given by (λDU ⋅ τ)^2/3 and (λDU ⋅ τ)^2, for a 1oo2 and 2oo3 voting, respectively. Consequently:

PFDPT(2oo3)^(ind.) = (0.5 ⋅ 10^-6 ⋅ 4380)^2 = 4.8 ⋅ 10^-6

PFDLogic(1oo2)^(ind.) = [(0.04 ⋅ 10^-6 + 0.03 ⋅ 10^-6 + 0.04 ⋅ 10^-6) ⋅ 8760]^2/3 = 3.1 ⋅ 10^-7

PFDValve(1oo2)^(ind.) = (1.9 ⋅ 10^-6 ⋅ 4380)^2/3 = 2.3 ⋅ 10^-5

It is seen that the contribution from independent failures is small compared to the CCF contribution for all equipment groups. For the logic and the pressure transmitters the independent contribution may be considered negligible compared to the CCF contribution (less than 5 % of the total PFD of the component group). However, for the valves, which are the main contributors to the PFD, the contribution from independent valve failures is approximately 9 % of the total valve contribution and should therefore be included in the total PFD.
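
The PFD figures above can be reproduced with a short script, for instance as in the following sketch (input data from Table 11; formulas from Table 3):

```python
# Sketch reproducing the worked PFD figures for the HIPPS example.
tau_field, tau_logic = 4380.0, 8760.0
lam_du_pt, beta_pt, c_2oo3 = 0.5e-6, 0.06, 2.0
lam_du_logic, beta_logic = (0.04 + 0.03 + 0.04) * 1e-6, 0.03
lam_du_valve, beta_valve = 1.9e-6, 0.05

pfd_ccf = {
    "PT (2oo3)":     c_2oo3 * beta_pt * lam_du_pt * tau_field / 2,
    "Logic (1oo2)":  beta_logic * lam_du_logic * tau_logic / 2,
    "Valves (1oo2)": beta_valve * lam_du_valve * tau_field / 2,
}
pfd_ind = {
    "PT (2oo3)":     (lam_du_pt * tau_field) ** 2,         # 2oo3 voting
    "Logic (1oo2)":  (lam_du_logic * tau_logic) ** 2 / 3,  # 1oo2 voting
    "Valves (1oo2)": (lam_du_valve * tau_field) ** 2 / 3,  # 1oo2 voting
}
for module in pfd_ccf:
    total = pfd_ccf[module] + pfd_ind[module]
    print(f"{module}: CCF = {pfd_ccf[module]:.1e}, "
          f"independent = {pfd_ind[module]:.1e}, total = {total:.1e}")
```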

Contribution from Test Independent (Systematic) Failures

In order to calculate the CSU, we now calculate the TIF probabilities using the formulas (from Table 4) PTIF(1oo2) = β ⋅ PTIF and PTIF(2oo3) = C2oo3 ⋅ β ⋅ PTIF, giving (with C2oo3 = 2.0):

PTIF-PT(2oo3) = 2.0 ⋅ 0.06 ⋅ 5 ⋅ 10^-4 = 6.0 ⋅ 10^-5

PTIF-Logic(1oo2) = 0.03 ⋅ 5 ⋅ 10^-6 = 1.5 ⋅ 10^-7

PTIF-Valve(1oo2) = 0.05 ⋅ 1 ⋅ 10^-4 = 5.0 ⋅ 10^-6

Summing up the CSU

The results from the above calculations are summarised in Table 12, giving CSU = PFD + PTIF. These calculations indicate that the total loop has a CSU = 4.4 ⋅ 10^-4.

Table 12: Summary of PFD, PTIF and CSU for the HIPPS example

Module          PFD (CCF + independent failures)   PTIF          CSU
PT (2oo3)       1.3 ⋅ 10^-4                        6.0 ⋅ 10^-5   1.9 ⋅ 10^-4
Logic (1oo2)    1.4 ⋅ 10^-5                        1.5 ⋅ 10^-7   1.4 ⋅ 10^-5
Valves (1oo2)   2.3 ⋅ 10^-4                        5.0 ⋅ 10^-6   2.4 ⋅ 10^-4
Total           3.7 ⋅ 10^-4                        6.5 ⋅ 10^-5   4.4 ⋅ 10^-4

Some observations from the calculations are worth mentioning. First, we see that it is important always to
check whether the contribution from independent failures can be disregarded or not. In our case it was
relevant to include the contribution from independent failures of the valves.

If we assume that the topside HIPPS function on average is triggered twice a year, the expected frequency of observing a failure of this function to close on demand, i.e., the hazard rate (HR), becomes:

HR = 4.4 ⋅ 10^-4 ⋅ 2 year^-1 = 8.8 ⋅ 10^-4 year^-1.


Thus, the mean time to observe such a failure is:

1/(8.8 ⋅ 10^-4 year^-1) ≈ 1100 years.
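
The same arithmetic in a short sketch (the demand rate of two per year is the assumption stated above):

```python
# Sketch: hazard rate and mean time to hazard for the HIPPS example.
csu = 4.4e-4          # total CSU from Table 12
demands_per_year = 2  # assumed average demand rate
hr = csu * demands_per_year
print(f"HR = {hr:.1e} per year; mean time to hazard = {1 / hr:.0f} years")
```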

8.4 Spurious Trip Assessment


The spurious trip event for this example system is that production is unintentionally shut down due to spurious operation of the HIPPS.

The simplified formulas to be used for calculation of the STR are taken from Table 8 (section 5.4), i.e., STR = 2 ⋅ λSU (for a 1oo2 voting) and STR = C2oo3 ⋅ β ⋅ λSU (for a 2oo3 voting). The resulting figures are given in Table 13.

The estimated spurious trip rate of this example system is STR ≈ 8 ⋅ 10^-6 failures per hour. Multiplied by 8760 hours, this gives an expected number of spurious trip events of 0.07 per year, i.e., approximately one trip every 14 years. (Note that for simplicity we have assumed the same β value for spurious operation failures as for dangerous failures.)

Table 13: Data used and results from the loss of production calculation

Module                            STR formula      λSU [hr^-1]   β     Module STR [hr^-1]
Pressure transmitters (2oo3)      2.0 ⋅ β ⋅ λSU    0.4 ⋅ 10^-6   6 %   4.8 ⋅ 10^-8
Control logic incl. I/O (1oo2)    2 ⋅ λSU          1.1 ⋅ 10^-6   -     2.2 ⋅ 10^-6
HIPPS valves (1oo2)               2 ⋅ λSU          3.0 ⋅ 10^-6   -     6.0 ⋅ 10^-6
Total STR                         -                -             -     8.2 ⋅ 10^-6
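
The Table 13 figures can likewise be reproduced with a short sketch (λSU values from Table 11):

```python
# Sketch reproducing the Table 13 STR figures.
beta_pt, c_2oo3 = 0.06, 2.0
lam_su_pt, lam_su_logic, lam_su_valve = 0.4e-6, 1.1e-6, 3.0e-6

str_pt = c_2oo3 * beta_pt * lam_su_pt  # 2oo3: CCF contribution only
str_logic = 2 * lam_su_logic           # 1oo2: any single SU failure trips
str_valve = 2 * lam_su_valve           # 1oo2

total_str = str_pt + str_logic + str_valve
print(f"Total STR = {total_str:.1e} per hour "
      f"= {total_str * 8760:.2f} spurious trips per year")
```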

All the modules of the HIPPS loop are redundant, and the components are tested one at a time. Consequently, the contributions to loss of production from repair of dangerous failures and from testing are both negligible (production is maintained upon single component failure and during testing).


9 REFERENCES

/1/ IEC 61508 Standard. Functional safety of electrical/electronic/programmable electronic (E/E/PE) safety related systems, part 1-7, Edition 2.0, 2010.

/2/ IEC 61511 Standard. Functional safety - safety instrumented systems for the process industry sector,
part 1-3, Edition 1, 2004.

/3/ 070 - Norwegian Oil and Gas recommended guidelines for application of IEC 61508 and IEC 61511
in the Norwegian Petroleum Industry, Rev. 02, October 2004.

/4/ PSA (2001). Norwegian Petroleum Directorate. Regulations relating to Management in the Petroleum Activities (see http://www.npd.no/regelverk/r2002/frame_e.htm).

/5/ Centre for Chemical Process Safety. Guidelines for safe and reliable instrumented protective
systems. Wiley, 2007.

/6/ Hauge S, Lundteigen M.A., Hokstad P. and Håbrekke S. (2010). Reliability Prediction Method for
Safety Instrumented Systems. PDS Method Handbook. SINTEF report STF50 A13503.

/7/ Hauge, S., Lundteigen M.A. and Rausand M. (2009). Updating Failure Rates and Test Intervals in
the Operational Phase: A Practical Implementation of IEC 61511 and IEC 61508. Proceedings of
ESREL 2009, Prague, Czech Republic.

/8/ Hauge, S. and Lundteigen M.A. (2008). Guidelines for Follow-up of Safety Instrumented Systems
(SIS) in the Operating Phase. SINTEF report STF50 A8788.

/9/ A. Summers (2008). IEC Product Approval – Veering Off Course. Article posted 11.06.08 in
www.controlglobal.com

/10/ A. Fijan (2010). How reliable is your SIL verification actually? Article in Inside Functional Safety
01, 2010.

/11/ Håbrekke S. and Hauge S. (2013). Reliability Data for Safety Instrumented Systems – PDS Data
Handbook, May 2013. Report no. SINTEF A24443.

/12/ Hauge S. and Håbrekke S. and Lundteigen M.A. (2010). PDS Example Collection, 2010 Edition.
SINTEF report F15574. 1st edition, December 2010.

/13/ OREDA participants, OREDA; Offshore Reliability Data Handbook, Volume 1 - topside data and
Volume 2 – subsea data. 5th edition, 2009.

/14/ Hauge S. and Onshus T. (2010). Reliability Data for Safety Instrumented Systems – PDS Data
Handbook, 2010 Edition. SINTEF report A13502.

/15/ Rausand, M. and Høyland, A. (2004). System Reliability Theory. Models, Statistical Methods, and
Applications. 2nd edition, Wiley.

/16/ Hokstad, P. A Generalisation of the Beta Factor Model. In Probabilistic Safety Assessment and
Management Eds.: C. Spitzer, U. Schmocker & V.N. Dang, Springer 2004. Proceedings from
PSAM7-ESREL ‘04.


/17/ SKI Technical report NR 91:6. CCF analysis of high redundancy systems, safety/relief valve data
analysis and reference BWR application. Stockholm 1992.

/18/ Hokstad P., Håbrekke S., Lundteigen M.A. and Onshus T. (2009). Use of the PDS Method for
Railway Application. SINTEF report A11612.

/19/ Vesely, W. E. (1977). Estimating common cause failure probabilities in reliability and risk analysis:
Marshall-Olkin specializations. In Nuclear Systems Reliability Engineering and Risk Assessment
(eds. J. B. Fussell and G. R. Burdick), pp. 314-341. Philadelphia: Society for Industrial and Applied
Mathematics.

/20/ Atwood, C. L. (1986). The binomial failure rate common cause model. In Technometrics, Vol. 28,
pp. 139-148.

/21/ Lundteigen, M.A. and Rausand, M. (2008). Spurious activation of safety instrumented systems in
the oil and gas industry: Basic concepts and formulas. In Reliability Engineering & System Safety,
Volume 93, Issue 8, August 2008, Pages 1208-1217.

/22/ IEC 61508 Standard. Functional safety of electrical/electronic/programmable electronic (E/E/PE) safety related systems, part 1-7, Edition 1.0, (various dates).

/23/ Jin, H., Lundteigen, M.A. and Rausand, M. (2013). New PFH-formulas for k-out-of-n: F-systems. In
Reliability Engineering and System Safety, Vol. 111 (March 2013), pp. 112-118.

/24/ ISO TR 12489. Petroleum, petrochemical and natural gas industries — Reliability modelling and
calculation of safety systems. Edition 1, 2013.


APPENDIX A: NOTATION AND ABBREVIATIONS

Table A.1 gives a complete list of notation used in this handbook.

Table A.1: Reliability parameters and performance measures

PFD: Probability of failure on demand. This is the measure for loss of safety caused by dangerous undetected failures detectable by functional testing.

DTU: Downtime unavailability. This is the “known” downtime unavailability caused by by-pass during repair or functional testing. The downtime unavailability comprises two elements:
• The unavailability related to repair of dangerous failures (with rate λD). The average duration of this period is the mean time to restoration (MTTR). This downtime unavailability is also denoted DTUR.
• The unavailability resulting from planned activities such as testing, maintenance and inspection (of average time t). This downtime unavailability is also denoted DTUT.

PTIF: The probability that the component/system will fail to carry out its intended function due to a (latent) failure not detectable by functional testing (hence the name “test independent failure”).

PTC: Proof test coverage. Fraction of failures detected during functional proof testing.

CSU: Critical safety unavailability. The probability that the component/system will fail to automatically carry out a successful safety action on the occurrence of a hazardous/accidental event. The critical safety unavailability comprises: CSU = PFD + DTU + PTIF.

MTTR: Mean time to restoration, i.e., the time from a failure is detected/revealed until the function is restored (“restoration period”). Note that this restoration period may depend on a number of factors. It can differ for detected and undetected failures: undetected failures are revealed and handled by functional testing and could have a shorter MTTR than detected failures. The MTTR could also depend on configuration, operational philosophy and failure multiplicity.

PFH: Probability of dangerous failure per hour. In IEC 61508-4 defined as the average frequency of a dangerous failure per hour.

STR: Spurious trip rate. Rate of spurious trips of the safety system (or set of redundant components), taking into consideration the voting configuration.

HR: Hazard rate. The rate of demands where the SIS has failed, thus giving a hazardous event.

λcrit: Rate of critical failures, i.e., failures that may cause loss of one of the two main functions of the component/system. Critical failures include dangerous (D) failures, which may cause loss of the ability to shut down production when required, and safe (S) failures, which may cause loss of the ability to maintain production when safe (i.e., spurious trip failures).

λD: Total rate of dangerous failures, including detected as well as undetected failures. λD = λDU + λDD.

λDU: Rate of dangerous undetected failures, i.e., failures undetected by the automatic self-test (only revealed during a functional test or upon a demand).

λDU−RH: Rate of dangerous undetected random hardware failures, i.e., the part of λDU originating from random hardware failures (equals the λDU as defined in IEC 61508).

λDU−SYST: Rate of dangerous undetected systematic failures, i.e., the part of λDU originating from systematic failures detectable by functional testing.

r: The fraction of λDU originating from random hardware failures, r = λDU−RH/λDU. Hence, 1 − r is the fraction of λDU originating from systematic failures detectable by functional testing, i.e., 1 − r = λDU−SYST/λDU.

λDD: Rate of dangerous failures detected by the automatic self-test.

λS: Rate of safe (spurious trip) failures, including both undetected and detected failures. λS = λSU + λSD.

λSU: Rate of safe (spurious trip) undetected failures, i.e., failures undetected by the automatic self-test (or personnel).

λSD: Rate of safe (spurious trip) detected failures, i.e., failures detected by the automatic self-test (or personnel).

λNONC: Rate of failures that are not critical (affecting neither loss of safety nor loss of production).

λ: Total failure rate. λ = λcrit + λNONC.

λundet: Rate of critical undetected failures. λundet = λDU + λSU.

λdet: Rate of critical detected failures. λdet = λDD + λSD.

c: Coverage. Percentage of critical failures detected by the automatic self-test.

cD: Fraction of dangerous failures detected by the automatic self-test. cD = (λDD/λD) · 100 %.

cS: Fraction of safe (spurious trip) failures that are detected by the automatic self-test (or by personnel) so that a spurious trip of the component is avoided. cS = (λSD/λS) · 100 %.

SFF: Safe failure fraction. SFF = (1 − λDU/λcrit) · 100 %.

β: The fraction of failures of a single component that causes both components of a redundant pair to fail “simultaneously”. β is application specific and should therefore, preferably, reflect application specific conditions.

CMooN: Modification factor for voting configurations other than 1oo2 in the beta-factor model (e.g., 1oo3, 2oo3 and 2oo4 voting logics). MooN voting (with respect to safety) implies that at least M out of N components must function for the safety function to work (on demand).

τ: Interval of functional test (time between functional tests of a component).

t: Length of by-pass period during functional testing.


APPENDIX B: THE CONFIGURATION FACTOR CMOON


This appendix considers in some more detail the configuration factors, CMooN, related to the PDS common
cause factor model. First, the determination of specific “base case” values for CMooN is discussed;
secondly, general formulas for these factors are provided.

B.1 Determining the Configuration Factor CMooN


In the 2010 version of the PDS method and data handbooks (/6/ and /14/), updated values of CMooN for
some typical voting configurations were given, see Table B.1.

Table B.1: CMooN factors for different voting logics from the PDS 2010 handbook

M \ N   N = 2         N = 3         N = 4         N = 5          N = 6
M = 1   C1oo2 = 1.0   C1oo3 = 0.5   C1oo4 = 0.3   C1oo5 = 0.21   C1oo6 = 0.17
M = 2   -             C2oo3 = 2.0   C2oo4 = 1.1   C2oo5 = 0.7    C2oo6 = 0.4
M = 3   -             -             C3oo4 = 2.9   C3oo5 = 1.8    C3oo6 = 1.1
M = 4   -             -             -             C4oo5 = 3.7    C4oo6 = 2.4
M = 5   -             -             -             -              C5oo6 = 4.3

Observe that C1oo2 = 1. Hence, for the 1oo2 voting the specified β value is used without any
modification.

The values from Table B.1 were partly supported by empirical results from a study of Swedish power
plants, /17/. This reference suggests the following:

• Given a failure of two redundant components, the likelihood of a simultaneous failure of a third
  added component may sometimes be as high as 0.5.
• When introducing more and more redundant components, it appears that the effect of added
  redundancy decreases as the number of components increases.
• For systems where the number N of parallel components is high (say more than 7–8 components),
  the likelihood of having N simultaneous failures seems to be higher than that of having exactly
  N − 1 (or N − 2, etc., depending on the magnitude of N) components failing.

In order to cater for these effects and also to ensure conservativeness in the proposed values for the C𝑀oo𝑁
factors, the figures in Table B.1 were suggested. These values were based on the following assumptions:

• Given a common cause failure of two redundant components, the probability that a third similar
  component also fails due to the same cause will be 50 %.
• To cater for the (observed) effect of added redundancy decreasing as N increases, it was assumed
that:
o When 3 components are known to have failed, the probability of a fourth component also
  failing will be 60 %.
o When going from 4 to 5 components, the probability of the fifth component also failing
  will be 70 %, etc.
o Finally, if 7 components are known to have failed in a CCF, the likelihood of the remaining
  components also failing is 100 % (for any N ≥ 8).


Unfortunately, when using these values in the specified algorithm for calculating all CMooN values (ref.
section B.2 in /6/), negative values of CMooN are obtained for some voting configurations with large N
(N ≥ 10).

Hence, although the model described above can be used for most practical applications (i.e., for N up to
at least 7), we want it to be valid for all MooN configurations. Thus, a slightly updated model is now
introduced. In this model it is assumed that there are two different “mechanisms” causing a CCF:

1. Any CCF will with probability q be a “lethal shock” causing all N components to fail (the concept
   of “lethal shocks” was originally introduced in the so-called binomial failure rate model by Vesely
   in 1977, /19/, and further improved by Atwood in 1986, /20/).

2. The residual CCF will with probability 1 − q follow the logic of the model described in the
   previous PDS handbook /6/, but with slightly different parameter values to avoid the problem of
   negative CMooN values.

Note that the shock model has been introduced to cater for the effect that for high N values, the likelihood
of all (𝑁) components failing seems to be higher than having exactly 𝑁 − 1 (and sometimes 𝑁 − 2, etc.)
components failing. Here 𝑞 = 0.05 has been suggested as a base case value in the shock model.

The “non-lethal-shock” part of the updated CCF model applies with probability 1 − 𝑞 and is based on the
following parameters:
• Given a common cause failure of two redundant components (say A and B), the probability of
  a third similar component failing due to the same cause equals β2. Further, given the failure of k
  specific components, the probability of failure of another specified component equals θ (k = 3, 4, …).
  This last assumption simplifies all formulas, and as long as θ ≥ β2 they will never result in negative
  values for any CMooN.
• The following parameter values have been chosen as the “base case”:
  o The probability β2 = 0.5 (as in the previous PDS handbook /6/), and θ = 0.6, which is the
    value used for β3 in the previous PDS handbook /6/.

Note that the above parameter values for q, β2 and θ are pragmatically chosen to give a CMooN model very
close to the model in /6/, while at the same time avoiding unacceptable (negative) CMooN values for any N.

This updated model is further described in the next section. When using the formulas given there, we get
an updated table of C𝑀oo𝑁 values (for 𝑁 ≤ 6) as shown in Table B.2.


Table B.2: Updated CMooN factors for different voting logics

M \ N   N = 2         N = 3         N = 4         N = 5         N = 6
M = 1   C1oo2 = 1.0   C1oo3 = 0.5   C1oo4 = 0.3   C1oo5 = 0.2   C1oo6 = 0.15
M = 2   -             C2oo3 = 2.0   C2oo4 = 1.1   C2oo5 = 0.8   C2oo6 = 0.6
M = 3   -             -             C3oo4 = 2.8   C3oo5 = 1.6   C3oo6 = 1.2
M = 4   -             -             -             C4oo5 = 3.6   C4oo6 = 1.9
M = 5   -             -             -             -             C5oo6 = 4.5

B.2 Formulas for the Configuration Factor CMooN


Now general formulas for the C𝑀oo𝑁 factor are provided, deriving the specific “base case” values given in
Table B.2, and also providing the users with a possibility of modifying the factors, based on application
specific experience and knowledge.


Now let C*MooN be the CMooN factor calculated from the “old” model described in /6/ (with the new
parameter values). According to the modified model described above, CMooN is then found from

CMooN = q + (1 − q) · C*MooN

As discussed above, this corresponds to a situation where a fraction q of the CCFs can be described as
“lethal shocks” (causing all N components to fail), while the fraction 1 − q follows the logic of the
previous CCF model of PDS /6/. Observe that q = 1 means all CCFs are lethal, corresponding to the
standard beta-factor model (CMooN = 1 for all M, N), whereas q = 0 corresponds to a CCF model of the
type described in /6/.


As a motivation for estimating the C*MooN values, we first look at the case with N = 3 components and let
(see section B.1 above):

β2 = the probability that a third component C fails, given that there has just been a dependent failure
(CCF) affecting both components A and B.

From this definition we then have that C*1oo3 = β2 and C*2oo3 = 3 − 2β2 (ref. Figure 6). If we let β2 = 0.5
(as in /6/), it follows that C*1oo3 = 0.5 and C*2oo3 = 2.0.

For the general MooN voting, let βk (k ≥ 2) be the probability that a CCF which is known to affect k
specific components also affects a (specific) component number k + 1 (for k = 1 we let β1 = β, i.e., the
standard beta).


The general expression for C*MooN is rather complex, and reference is therefore made to /16/ for the
details. As pointed out in section B.1, the general case also has the unfortunate feature of often leading
to unacceptable (negative) values. Thus, we restrict attention to the simplified case where the βk‘s (for
k ≥ 3) are constant, i.e., βk = θ for k ≥ 3. Provided θ ≥ β2, it can be proved that we then get acceptable
(non-negative) C*MooN values calculated from the formulas:

C*MooN = β2 · Σ_{j=N−M+1}^{N} (N choose j) · θ^(j−3) · (1 − θ)^(N−j) ;   M = 1, 2, …, N − 2


C*(N−1)ooN = (N choose 2) · (1 − β2/θ) + β2 · Σ_{j=2}^{N} (N choose j) · θ^(j−3) · (1 − θ)^(N−j)

Observe that the summation in C*MooN is explained by the fact that C*MooN corresponds to the failure of at
least N − M + 1 components. For instance, M = 1 gives C*1ooN = β2 · θ^(N−3), referring to failure of all N
components; the probability of this equals Q · β · C*1ooN = Q · β · β2 · θ^(N−3), where Q is the probability
(rate) that any component fails.

For the new “base case” we have chosen β2 = 0.5 and θ = 0.6, and the above formulas can be applied to
obtain the corresponding C*MooN values. Further, by inserting q = 0.05 we get the new “base case” values
for CMooN as given in Table B.2.
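
To make the computation concrete, the following small Python sketch (not part of the handbook's formal method description) implements the two formulas above together with CMooN = q + (1 − q) · C*MooN, and reproduces the base case values of Table B.2 when run:

from math import comb

BETA2, THETA, Q = 0.5, 0.6, 0.05   # base case parameter values

def c_star(M, N):
    """C*_MooN of the 'non-lethal shock' part of the model."""
    if M >= N - 1:
        # C*_(N-1)ooN; by definition C*_NooN is set equal to C*_(N-1)ooN
        return comb(N, 2) * (1 - BETA2 / THETA) + BETA2 * sum(
            comb(N, j) * THETA ** (j - 3) * (1 - THETA) ** (N - j)
            for j in range(2, N + 1))
    # M = 1, 2, ..., N-2: failure of at least N - M + 1 components
    return BETA2 * sum(
        comb(N, j) * THETA ** (j - 3) * (1 - THETA) ** (N - j)
        for j in range(N - M + 1, N + 1))

def c_moon(M, N):
    """C_MooN including the lethal shock fraction q."""
    return Q + (1 - Q) * c_star(M, N)

for N in range(2, 7):
    print(", ".join(f"C{M}oo{N} = {c_moon(M, N):.2f}" for M in range(1, N)))

Running the sketch gives, e.g., C2oo3 = 1.95 ≈ 2.0 and C5oo6 = 4.53 ≈ 4.5; the values in Table B.2 are the rounded results.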

Further, it can be noted that the formulas given in chapter 5 are in various ways approximate. In particular,
observe that the total rate of DU failures for a system with N components does not equal N · λDU (which is
in fact a conservative approximation). This is, e.g., due to the fact that some failures are multiple;
actually, we can show that the total rate of single, double, triple, etc., DU failures equals:

(N − CN · β) · λDU ,   where CN = Σ_{M=1}^{N−1} CMooN

In particular, for 𝑁 = 2, we have C2 = 1, and the sum of single and double DU failures then becomes:
(2 − 𝛽) ∙ λDU. In this case the system’s rate of single failures equals 2(1 − 𝛽) ∙ λDU, and the rate of double
failures equals 𝛽 ∙ λDU, (which sums up to (2 − 𝛽) ∙ λDU ).

For 𝑁 = 3 we get that C3 = 0.5 + 2.0 = 2.5, and the total rate of single, double and triple DU failures for
the system becomes (3 − 2.5 ∙ 𝛽) ∙ λDU, cf. illustration in Figure 6 in section 4.3.

In the more explicit formulas introduced in Appendix C, we will for a system with N components also
introduce the parameter HN. Here β · HN is the fraction of a component's failure rate that represents
multiple failures (i.e., CCF of any multiplicity). So

(1 − HN · β) · λDU

is the rate of independent (single) failures for a specific component. Here H2 = 1, and for the new base
case values, H3 = 1.5 and H4 = 1.8 (cf. Figure 6 for N = 2 and N = 3). In general we can prove that

HN = (Σ_{M=1}^{N} CMooN) / N = (CN + CNooN) / N

Here we have introduced the artificial parameter CNooN = C(N−1)ooN to get a generic PFD formula for all
MooN, see Appendix C. (This also gives a simpler formula for HN.) Note that C(N−1)ooN · β is the fraction
of failures being CCF of any multiplicity (i.e., multiplicity ≥ 2), and so it could similarly be said that
CNooN · β is the fraction of CCF with multiplicity ≥ 1 (being the same).

Below we present an extended table of “base case” configuration factors, including C𝑁 and H𝑁 .


Table B.3: CCF configuration factors CMooN, and CN, HN (base case values)

CMooN   N = 2   N = 3   N = 4   N = 5   N = 6
M = 1   1.0     0.5     0.3     0.22    0.15
M = 2   -       2.0     1.1     0.8     0.6
M = 3   -       -       2.8     1.6     1.2
M = 4   -       -       -       3.6     1.9
M = 5   -       -       -       -       4.5
CN      1.0     2.5     4.2     6.2     8.35
HN      1.0     1.5     1.8     2.0     2.15
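
As a quick consistency check, the following sketch derives the CN and HN rows directly from the tabulated CMooN values (small deviations from the table are due to rounding of the tabulated CMooN values):

# C[(M, N)] = C_MooN base case values from Table B.3
C = {(1, 2): 1.0,
     (1, 3): 0.5,  (2, 3): 2.0,
     (1, 4): 0.3,  (2, 4): 1.1, (3, 4): 2.8,
     (1, 5): 0.22, (2, 5): 0.8, (3, 5): 1.6, (4, 5): 3.6,
     (1, 6): 0.15, (2, 6): 0.6, (3, 6): 1.2, (4, 6): 1.9, (5, 6): 4.5}

for N in range(2, 7):
    c_N = sum(C[(M, N)] for M in range(1, N))   # C_N = sum over M = 1..N-1
    h_N = (c_N + C[(N - 1, N)]) / N             # H_N with C_NooN := C_(N-1)ooN
    print(f"N = {N}: C_N = {c_N:.2f}, H_N = {h_N:.2f}")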


APPENDIX C: DETAILED FORMULAS FOR PFD AND DTU


Formulas for PFD and DTU were presented in chapter 5. In this appendix, we take a closer look at
these formulas. Section 5.3.1 presented quite general (approximate) expressions for PFD; slightly more
accurate formulas are given below. Further, the derivation of DTUR is discussed in some more detail (cf.
section 5.3.3), and expressions for DTUR are given for a specific set of assumptions. Note that DTUT and
PTIF are not considered in the present appendix.

It should be noted that the unavailability formulas presented herein and in chapter 5 are actually
approximations, and some assumptions/limitations therefore apply (see section 5.2). However, in the
present appendix we will, (as also discussed in Appendix B), consider the following more accurate
assumptions:
• For a duplicated system, a single component’s rate of independent DU failures equals (1 − 𝛽) ⋅
λDU , rather than λDU (as was used as an approximation in chapter 5). A similar modification
applies for other configurations.
• The total failure rate for an 𝑁oo𝑁 configuration was in chapter 5 approximated with 𝑁 ∙ λDU. A
more accurate expression is used below.
• In the DTUR formulas we refrain from the assumption that the contribution from MTTR is
  negligible. That is, MTTR is not necessarily very short, and periods of degradation may therefore
  give a contribution towards safety unavailability (caused by DD failures).

PFD Formulas
The formulas for PFD in section 5.3 account both for CCF and (combination of) independent failures
occurring in the same test interval. Consider a duplicated system where each component has failure rate
λDU . Then there is a rate 𝛽λDU of CCF and therefore each component has a rate (1 − 𝛽)λDU of
independent failures. In the formulas of section 5.3.1 we used λDU, which approximately equals
(1 − β)λDU when β is small. Similarly, for a triplicated system, the rate of independent DU failures for one
component equals (1 − 1.5β)λDU when we use “base case” values for the configuration factors (ref. Figure 6
in section 4.3). Following this more accurate approach, we will now in general write (1 − HN·β)·λDU for
the rate of independent failures of a single component in an MooN system (see the definition of HN in
Appendix B.2, and “base case” values given in Table B.3).

For the 𝑁oo𝑁 configurations we similarly need the total failure rate (sum of CCF and independent
failures). For 2oo2 this sum equals �(1 − 𝛽) + (1 − 𝛽) + 𝛽� ∙ λDU = (2 − 𝛽) ∙ λDU , which in chapter 5
was approximated with 2 ∙ λDU , and for a 3oo3 voting the total rate of failures for all three components
will be (3 − 2.5 ∙ 𝛽) ∙ λDU (which approximately equals 3 ∙ λDU , as used in chapter 5). Following this more
accurate approach, we will now in general write (𝑁 − C𝑁 𝛽) ∙ λDU for the total failure rate of the 𝑁oo𝑁
configuration, (see definition of C𝑁 in Appendix B.2, and “base case” values given in Table B.3).

DTUR Formulas
Regarding repairs, the following assumption applies for the formulas given below:

• MTTR is constant and does not depend on the number of failed components. However, the formulas
  could easily be modified to account for an increased MTTR if more than one component fails
  simultaneously (and only one repair team is available); e.g., for two components being repaired in
  series, we could replace MTTR by 2 · MTTR (in case a double failure is accounted for).

As stressed in section 5.3.2, the formulas for DTUR will depend on the action taken upon detection of
failure (i.e., during the repair/restoration period). The formulas given below are valid for the following
operating philosophy:

71 of 93
Reliability Prediction Method for Safety Instrumented Systems
PDS Method Handbook, 2013 Edition

• Degraded operation takes place if possible. During repair we assume that in case of a single failure, an
𝑀oo𝑁 voting degrades to (𝑀 − 1)oo(𝑁 − 1) when 𝑀 ≥ 2, 𝑁 ≥ 2, whilst 1oo𝑁 degrades to 1oo(𝑁 −
1) if 𝑁 ≥ 2.
• If all (redundant) components have failed, operation is continued without protection.

Regarding degraded operation we should observe the following:

• Essentially we only account for degraded operation due to single failures.


o Since an (𝑀 − 1)oo(𝑁 − 1) voting is safer than an 𝑀oo𝑁 voting (𝑀 ≥ 2), this means that
the formulas only account for degradation when going from 1oo𝑁 to 1oo(𝑁 − 1) voting.
o Degraded operation can of course be applied also if two or more components have failed.
However, this event will have a much smaller probability, and the effect on DTUR can for
most practical purposes be ignored.
• The contribution to DTUR due to repair of a new failure during degraded operation can usually be
ignored, and will not be accounted for here.

Considering a specific component group (e.g., valves), two contributions to DTUR are then accounted for:

i. The increase in DTUR due to DU failures occurring during degraded operation. Note that for
   configurations where degradation gives a “better” configuration safety-wise, there will be no such
   contribution towards DTUR. This implies that only a single component DD failure in a 1ooN voting
   (which is degraded to a 1oo(N − 1) configuration), N ≥ 2, will give a contribution.

ii. Lack of protection during repair if all components have failed. This will, for all configurations
    except 1oo1, give a contribution (C1ooN · β · λDD) · MTTR, where the first factor is the rate of all
    N components failing (due to a DD common cause failure) and the second factor, MTTR, is the
    duration of the period of unavailability due to this failure. For 1oo1 this contribution is simply
    λDD · MTTR.

So we adopt the restriction of just considering the contribution of degradation and repair due to DD
failures. As stated in section 5.3, similar terms may be added for contribution to DTU of both DU failures
(repaired at functional testing or after a true demand) and of SD failures. This would, however, require
separate arguments.

The formulas in Tables C.1 and C.2 below apply for all voting configurations 𝑀oo𝑁; 𝑁 ≤ 3, providing
separate tables for PFD (giving the contributions from 𝜆DU ) and DTUR (giving the contributions from
𝜆DD ). The tables include

• The voting configuration.
• The contributions to PFD from both CCF and independent failures, including a brief explanation.
• The contributions to DTUR from repair upon failure of all components and from degraded operation
  (contributing for 1ooN only), including a brief explanation.
• For the frequently used 2oo3 voting, the contribution to DTUR of a double failure (leading to a 1oo1
  voting) is also included. Otherwise, double failures leading to degradation are not included.


Table C.1: PFD formulas for specific configurations; (values for CN and HN from Table B.3)

Voting   Formula for PFD                      Explanation of contributions

1oo1     λDU · τ/2                            DU failure of the single component.

1oo2     β · λDU · τ/2                        Common cause failure of both components.
         + ((1 − β) · λDU · τ)² / 3           Independent DU failure of both components.

2oo2     (2 − β) · λDU · τ/2                  DU failure of at least one of the two components.

1oo3     C1oo3 · β · λDU · τ/2                Common cause failure of all three components.
         + ((1 − 1.5β) · λDU · τ)³ / 4        Independent DU failure of all three components; (H3 = 1.5).

2oo3     C2oo3 · β · λDU · τ/2                Common cause failure of two or three components.
         + ((1 − 1.5β) · λDU · τ)²            Independent DU failure of (any) two of the three components
                                              (independent failure of all three components neglected); (H3 = 1.5).

3oo3     (3 − 2.5β) · λDU · τ/2               A DU failure of at least one of the components; (C3 = 2.5).

Table C.2: DTUR formulas for specific configurations; (values for CN and HN from Table B.3)

Voting   Formula for DTUR (given operating philosophy)       Explanation of contributions

1oo1     λDD · MTTR                                          Repair of a DD failure of the single component.

1oo2     β · λDD · MTTR                                      Repair of a CCF of both components.
         + (2(1 − β) · λDD · MTTR) · (λDU · τ/2)             Repair of a DD failure of either component (degradation)
                                                             and repair of a DU failure upon a demand.

2oo2     C1oo2 · β · λDD · MTTR                              Repair of a CCF of both components; (upon a single DD
                                                             failure the voting degrades to 1oo1, which is better
                                                             safety-wise).

1oo3     C1oo3 · β · λDD · MTTR                              Repair of a CCF of all three components.
         + (3(1 − 1.5β) · λDD · MTTR) · (β · λDU · τ/2)      Repair of a DD failure of either component (degradation)
                                                             and a CCF of the two other components upon a demand.

2oo3     C1oo3 · β · λDD · MTTR                              Repair of a CCF of all three components.
         + ((C2oo3 − C1oo3) · β · λDD · MTTR) · (λDU · τ/2)  Repair of a CCF of exactly two components (degradation
                                                             to 1oo1 voting) and a DU failure of the third component
                                                             upon a demand.

3oo3     C1oo3 · β · λDD · MTTR                              Repair of a CCF of all three components; (upon a single
                                                             or double DD failure the voting degrades to 2oo2 or 1oo1,
                                                             which is better safety-wise).
Note that these formulas should not be interpreted as being absolutely correct. They are intended to capture
the main contributors to PFD and DTUR in a rather transparent way.
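
As an illustration, the sketch below evaluates the 2oo3 rows of Tables C.1 and C.2 numerically; all parameter values are hypothetical and chosen only to show the mechanics of the formulas:

# Illustrative sketch only: evaluating the Table C.1 and C.2 formulas for a
# 2oo3 voted group. All parameter values below are hypothetical.
lam_du = 2.0e-6    # rate of DU failures per hour (assumed)
lam_dd = 1.0e-6    # rate of DD failures per hour (assumed)
beta   = 0.05      # CCF fraction (assumed)
tau    = 8760.0    # functional test interval in hours (assumed: one year)
mttr   = 8.0       # mean time to restoration in hours (assumed)
C1oo3, C2oo3 = 0.5, 2.0    # base case configuration factors (Table B.3)

# PFD for 2oo3 (Table C.1): CCF of two or three components, plus an
# independent DU failure of (any) two of the three components
pfd_2oo3 = (C2oo3 * beta * lam_du * tau / 2
            + ((1 - 1.5 * beta) * lam_du * tau) ** 2)

# DTU_R for 2oo3 (Table C.2): repair of a CCF of all three components, plus
# degradation to 1oo1 after a double CCF combined with a DU failure on demand
dtu_r_2oo3 = (C1oo3 * beta * lam_dd * mttr
              + ((C2oo3 - C1oo3) * beta * lam_dd * mttr) * (lam_du * tau / 2))

print(f"PFD(2oo3)   = {pfd_2oo3:.2e}")
print(f"DTU_R(2oo3) = {dtu_r_2oo3:.2e}")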

The analogous formulas for a general 𝑀oo𝑁 voting are presented in tables C.3 and C.4. Note the
following:


• Except for the first formula in Table C.4, all formulas are also valid for N = 2 (but these cases are
  covered in Tables C.1 and C.2).
• It is easily seen that the first formula (1ooN) of Table C.3 is a special case of the second one (MooN);
  i.e., the MooN formula is also valid for 1ooN.
• The formula for M < N (second line) of Table C.3 is actually valid also for M = N. This follows since
  CN = N · HN − CNooN (see Appendix B.2). Thus, the MooN formula for PFD is actually valid for all
  configurations, N ≥ 2. Further, in Table C.4 it is easily seen that the MooN (M < N) formula for
  DTUR is also valid for NooN.
• The DTUR formula for 2oo3 given in Table C.2 is somewhat more accurate than that obtained
  from Table C.4, since for the 2oo3 voting the contribution to DTUR of a double failure (degradation to
  1oo1) is included (in Table C.2).

Table C.3: Generic formulas for PFD presenting main contributions

Voting                 Formula for PFD                                        Explanation of contributions

1ooN;                  C1ooN · β · λDU · τ/2                                  Common cause failure of all N components.
N ≥ 2                  + ((1 − HN·β) · λDU · τ)^N / (N + 1)                   Independent DU failure of all N components.

MooN;                  CMooN · β · λDU · τ/2                                  Common cause failure of N − M + 1 or more components.
M < N, M ≥ 1, N ≥ 2    + (N choose N−M+1) · ((1 − HN·β) · λDU · τ)^(N−M+1)    Independent DU failure of (any) N − M + 1 of the N
                         / (N − M + 2)                                        components (independent failures of more components
                                                                              neglected).

NooN;                  (N − CN · β) · λDU · τ/2                               DU failure of at least one of the N components.
N ≥ 2

Table C.4: Generic formulas for DTUR presenting main contributions

Voting                 Formula for DTUR (given operating philosophy)          Explanation of contributions

1ooN;                  C1ooN · β · λDD · MTTR                                 Repair of a CCF of all N components.
N ≥ 2                  + (N(1 − HN·β) · λDD · MTTR)                           Repair of a DD failure of any of the N components
                         · (C1oo(N−1) · β · λDU · τ/2)                        (degradation) and a CCF of the N − 1 remaining
                                                                              components upon a demand.

MooN;                  C1ooN · β · λDD · MTTR                                 Repair of a CCF of all N components (note that upon
M < N, M ≥ 1, N ≥ 2                                                           failure of fewer than N components the voting degrades
                                                                              to something better safety-wise).

NooN;                  C1ooN · β · λDD · MTTR                                 Repair of a CCF of all N components (note that upon
N ≥ 2                                                                         failure of fewer than N components the voting degrades
                                                                              to something better safety-wise).
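
The generic Table C.3 formula is straightforward to implement. The sketch below (with the base case factors from Table B.3 hard-coded, and hypothetical input values) computes the PFD for an arbitrary MooN voting:

from math import comb

# Base case configuration factors from Table B.3: C[(M, N)] = C_MooN
C = {(1, 2): 1.0, (1, 3): 0.5, (2, 3): 2.0, (1, 4): 0.3, (2, 4): 1.1,
     (3, 4): 2.8, (1, 5): 0.22, (2, 5): 0.8, (3, 5): 1.6, (4, 5): 3.6,
     (1, 6): 0.15, (2, 6): 0.6, (3, 6): 1.2, (4, 6): 1.9, (5, 6): 4.5}

def c_n(N):
    return sum(C[(M, N)] for M in range(1, N))      # C_N over M = 1..N-1

def h_n(N):
    return (c_n(N) + C[(N - 1, N)]) / N             # C_NooN := C_(N-1)ooN

def pfd_moon(M, N, lam_du, beta, tau):
    """Average PFD of an MooN voting per the generic Table C.3 formula."""
    if M == N:  # NooN: a DU failure of at least one component is critical
        return (N - c_n(N) * beta) * lam_du * tau / 2
    k = N - M + 1   # number of components that must fail to lose the function
    ccf = C[(M, N)] * beta * lam_du * tau / 2
    indep = comb(N, k) * ((1 - h_n(N) * beta) * lam_du * tau) ** k / (k + 1)
    return ccf + indep

# Hypothetical example: lambda_DU = 2e-6 per hour, beta = 0.05, tau = 1 year
for (M, N) in [(1, 2), (2, 3), (2, 4)]:
    print(f"PFD({M}oo{N}) = {pfd_moon(M, N, 2.0e-6, 0.05, 8760.0):.2e}")

For M < N, the corresponding DTUR from Table C.4 reduces to the single term C1ooN · β · λDD · MTTR, i.e., C[(1, N)] * beta * lam_dd * mttr in the notation above.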


APPENDIX D: MULTIPLE SIS – BACKGROUND AND CALCULATIONS


The first part of this appendix provides a description of the various alternative approaches to determining
an appropriate correction factor (CF) for multiple SIS in the case of simultaneous testing. The second part
contains a discussion of the effects of non-simultaneous testing, both regarding different phasing and
different length of test intervals. This gives justification for the conclusions presented in section 7.4 of the
handbook concerning the effects of non-simultaneous testing.

D.1 Approaches to determining CF in case of simultaneous testing


This section presents six alternative approaches to determining an appropriate CF for multiple SIS in the
assumed case of simultaneous testing of all equipment involved in the multiple SIS, as well as a
comparison and discussion of their use. A common example is used throughout the section to illustrate the
approaches and their differences.

D.1.1 Overview of possible approaches


Several approaches may be envisaged for finding the appropriate CF. Which approach to adopt depends on
the level of knowledge about the elements in each SIS in terms of:
• element structure (i.e., voting architecture of the elements)
• PFD contribution from each element to the total PFD of the SIS
• the presence of single component elements with a dominant PFD contribution

It also depends on the analyst's preferences with respect to:


• the importance of being conservative in the calculation, as opposed to realistic
• the importance of being accurate in the calculation, as opposed to approximate
• the amount of effort to put in the calculation

An overview of suggested approaches with respect to system knowledge and analyst preferences is given
in Table D.1. The numbering indicates an increasing level of sophistication in the approaches.

Table D.1: Possible approaches to determining the appropriate CF for a multiple SIS

Approach                SIS element          Element PFD          Conservative    Approximate   Calculation   Dominant single
                        structure            contribution         or realistic    or accurate   effort        elements
1 “Global”              Unknown/disregarded  Unknown/disregarded  Realistic       Approximate   Low           Yes
2 “Maximal order”       Known                Unknown/disregarded  Conservative    Approximate   Low           No
3 “Minimal order”       Known                Unknown/disregarded  Realistic       Approximate   Low           Yes
4 “Dominant element”    Known                Known                Realistic       Approximate   Low           No
5 “Structural average”  Known                Known                Conservative/   Accurate      Medium        No
                                                                  Realistic
6 “Cut set”             Known                Known                Conservative/   Accurate      High          No
                                                                  Realistic

The common objective of all the approaches 1–5 is to establish a single representative m-oo-n structure for
each SIS in the multiple SIS, and calculate the corresponding CF according to equation (**) in section 7.2:

CF = [ Π_{k=1}^{N} (n_k − m_k + 2) ] / [ 1 + Σ_{k=1}^{N} (n_k − m_k + 1) ]          (**)
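
As a minimal sketch of equation (**) in code, the following function computes the CF from a list of representative (m, n) structures, one per SIS layer (the example values anticipate the figures later in this appendix):

from math import prod

def correction_factor(structures):
    """CF per equation (**); structures = [(m1, n1), (m2, n2), ...]."""
    num = prod(n - m + 2 for m, n in structures)
    den = 1 + sum(n - m + 1 for m, n in structures)
    return num / den

print(correction_factor([(1, 1), (1, 1)]))   # two single-element layers: 1.33
print(correction_factor([(1, 2), (1, 3)]))   # a 1oo2 and a 1oo3 layer: 2.0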


The “cut set” approach, on the other hand, combines element structures across individual SIS to find the
appropriate CF.

Normally, both structure and PFD contribution for individual SIS elements will be known, so the “cut set”
approach will give the most realistic result. However, there might be cases where this information is not
easily available, and there may also be good reasons for disregarding this information and adopting one of
the simpler approaches. A guide to selecting a suitable calculation approach is given in Figure D.1.

[Figure D.1 shows a decision flow chart: depending on whether the element structure and the element PFD
contributions are considered, whether a conservative or a realistic result is preferred, whether an
approximate or an accurate result is preferred, the acceptable calculation effort, and the presence of
dominant single elements, the chart leads to the “global”, “maximal order”, “minimal order”, “dominant
element”, “structural average” or “cut set” approach.]

Figure D.1: Guide to selecting a suitable CF calculation approach

In this handbook we recommend using the “global” approach for multiple SIS calculations in general
situations. This approach is described in section 7.3, while the remaining approaches 2–6 are described
below. Although the “global” approach will be suitable and sufficient for the vast majority of multiple SIS
calculations, there might be special situations where it is advisable to use one of the other approaches 2–6.

D.1.2 “Maximal order” approach


When the structure (i.e., voting logic) of each element in each individual SIS is known, this information
may be used to derive a representative m-oo-n structure for each SIS. In the “maximal order” approach,
one element structure in each SIS is chosen as representative for the SIS, and the order of that structure is
used in the CF calculation. The order 𝑜𝑒 of an m-oo-n element structure is defined as

𝑜𝑒 = 𝑛 − 𝑚 + 1

A single component will then have order 1, while a 1oo2 (or any k-oo-(k+1)) structure will have order 2, a
1ooN structure will have order N, and so forth. The order corresponds to the power of the λτ terms in the
PFD expression; e.g., an element of order 2 gives rise to (λτ)² terms, etc. Based on the orders of the
individual elements, we identify the order O_SIS of a SIS as the maximum of the element orders, i.e.,

O_SIS = max{o_e}

O_SIS will then reflect the structure with the highest degree of redundancy. This is based on the observation
that the CF increases with the degree of redundancy; in order to ensure conservativeness in the PFD
estimates (when the element PFD contributions are unknown/disregarded), the maximum order should be
used.

The SIS is then considered to be represented by a 1oo𝑂SIS structure, and is combined with the
representative structures of the other SIS layers as described in section 7.2. For a multiple SIS comprising
N SISs, we let 𝑂𝑘 be the order of SIS𝑘 , and calculate the CF equivalently to equation (**) above:

CF = [ Π_{k=1}^{N} (O_k + 1) ] / [ 1 + Σ_{k=1}^{N} O_k ]          (*)


The situation is illustrated in the example in Figure D.2. In the figure, the order of each element is
indicated. Using equation (*), the appropriate CF is 2.

[Figure D.2 shows the two SISs of the running example. SIS1 consists of the elements 1A (2oo3, order 2),
1B (order 1), 1C (1oo2, order 2) and the CCF elements CCF 1A and CCF 1C (order 1); its maximal order is 2,
giving a representative 1oo2 structure. SIS2 consists of 2A (1oo3, order 3), 2B (1oo2, order 2), 2C
(order 1) and the CCF elements CCF 2A and CCF 2B (order 1); its maximal order is 3, giving a representative
1oo3 structure. The resulting CF is 2.]

Figure D.2: Example of the “maximal order” approach to determining CF. Element order is
indicated beneath each element, and the maximal order is highlighted for each SIS

Using equation (*), we obtain CFs for various combinations of SIS orders as shown in Table D.2 for two
layers and Table D.3 for three layers.

In the current (draft) IEC 61511-3 standard conservative CFs are recommended based on the maximal
order of the minimal cut sets, which is equivalent to the maximal order approach.

Table D.2: CFs for multiple SIS with two layers when the element structure of each SIS is known

Order of the multiple SIS (O1 + O2)   2     3     4     4     5     5
SIS1 order (O1)                       1     1     1     2     1     2
SIS2 order (O2)                       1     2     3     2     4     3
CF                                    1.3   1.5   1.6   1.8   1.7   2.0

Table D.3: CFs for multiple SIS with three layers when the element structure of each SIS is known

Order of the multiple SIS   3     4     5     5     6     6     6
SIS1 order                  1     1     1     1     1     1     2
SIS2 order                  1     1     1     2     1     2     2
SIS3 order                  1     2     3     2     4     3     2
CF                          2.0   2.4   2.7   3.0   2.9   3.4   3.9
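
A sketch of equation (*) in code, reproducing selected values from Tables D.2 and D.3:

from math import prod

def cf_from_orders(orders):
    """CF per equation (*), given the representative order of each SIS."""
    return prod(o + 1 for o in orders) / (1 + sum(orders))

for orders in [(1, 1), (1, 2), (1, 3), (2, 2), (1, 4), (2, 3)]:
    print(orders, round(cf_from_orders(orders), 1))       # Table D.2
for orders in [(1, 1, 1), (1, 1, 2), (1, 2, 2), (2, 2, 2)]:
    print(orders, round(cf_from_orders(orders), 1))       # Table D.3 (selection)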


D.1.3 “Minimal order” approach


The “maximal order” approach described above implicitly assumes that the most important contributor to
the PFD of a SIS is the element of highest order, i.e., the element with the highest degree of redundancy.
This is very conservative, considering that:

• Elements with a high degree of redundancy are often more reliable than elements with low
redundancy (e.g., single components), thereby contributing less to the total PFD of the SIS. This
may be verified if the PFD contribution from each element is known.
• Elements with redundancy are often modelled with an additional common cause failure
contribution, which far outweighs the contribution from individual failures of components in that
element. This means that the redundant structure may be disregarded, while the CCF “element” is
retained and considered as an artificial single component element.

Hence, one could argue that a minimal order approach will be more realistic, where the SIS order is
defined as O_SIS = min{o_e}. The situation is illustrated in the example in Figure D.3. In the figure, the
order of each element is indicated. Using equation (*), the appropriate CF is 4/3.

[Figure D.3 shows the same two SISs as Figure D.2, but with the minimal element order highlighted: both
SIS1 and SIS2 contain order-1 elements (single components and CCF elements), so both are represented by a
single component structure, and the resulting CF is 4/3.]

Figure D.3: Example of the “minimal order” approach to determining CF. Element order is
indicated beneath each element, and the minimal order is highlighted for each SIS

This will in most cases be order 1 due to the presence of single component elements (including artificial
CCF elements), which will make the approach identical to the “global” approach described in the
handbook. There is a risk of underestimating the actual PFD, but this may be considered negligible. The
suitability of the approach may be assessed if the PFD contribution from each element is known.

D.1.4 “Dominant element” approach


The approaches based on element order do not explicitly take into account the PFD contributions from the
elements in each SIS. If information about PFD contributions is available, it can be used to better select a
representative element for each SIS. In the “dominant element” approach, the element that contributes most
to the PFD of a SIS is taken as the representative for that SIS, and the particular structure of this element
is taken as the representative structure for the SIS.


The situation is illustrated in the example in Figure D.4. In the figure, the relative PFD contribution from
each element is indicated. Using equation (**) above, the appropriate CF is 4/3.

[Figure D.4 shows the same two SISs with the relative PFD contribution of each element indicated. SIS1:
1A (2oo3) 6 %, 1B 10 %, 1C (1oo2) 3 %, CCF 1A 54 %, CCF 1C 27 %. SIS2: 2A (1oo3) 2 %, 2B (1oo2) 0 %,
2C 70 %, CCF 2A 23 %, CCF 2B 5 %. The dominant elements, CCF 1A and 2C (both single components),
are highlighted.]

Figure D.4: Example of the “dominant element” approach to determining CF. PFD contribution is
indicated beneath each element, and the dominant element is highlighted for each SIS

Note that this approach often reduces to the “global” approach due to the presence of dominant single
component elements. But this does not always have to be the case; one can envisage situations where
single component elements are either absent or have a non-dominant contribution (e.g., single logic with
very low failure rates). This would in addition require that CCF elements are either not modelled or have a
non-dominant contribution.

The “dominant element” approach is very simple and intuitive, but there are some issues that need consideration:

• Often there will be no single dominant element, and the sum of all the other contributors to the
PFD may be substantial, even outweighing the biggest contributor. In this case one can hardly say
that the biggest contributor is representative.
• This approach disregards the structural information in the non-dominant elements. It might be
difficult to assess whether this leads to a conservative or non-conservative approximation.

“Dominant order” approach


These issues with the “dominant element” approach might to some extent be alleviated by adopting a more
accurate variant based on element order (cf. the order-based approaches above). In what can be termed the
“dominant order” approach, the order of each element is identified, and the PFD contributions from
elements of the same order are summed. The order corresponding to the highest sum of PFD contributions
is then considered as the “dominant order” for the SIS, and the CF is calculated using equation (*) above.

In our example the dominant order of SIS1 is O1 = 1 (91 % of the PFD contribution), while the dominant
order of SIS2 is likewise O2 = 1 (98 % of the PFD contribution). Hence, the approach based on dominant
order gives the same result as the approach based on dominant element. This will be true in most cases, and
the (limited) additional calculation effort might not be worthwhile.


D.1.5 “Structural average” approach


In order to resolve the issues of the “dominant element” approach, one could argue that the representative
m-oo-n structure of each SIS should be a weighted average of all the contributing element structures, with
the PFD of the elements as weights. The “structural average” approach is illustrated in the example in
Figure D.5. In the figure, the relative PFD contribution from each element is indicated.

[Figure D.5 shows the same two SISs with the relative PFD contributions of the elements used as weights:
the weighted average of the element structures gives the representative structure 1.06-oo-1.15 for SIS1
and 1-oo-1.04 for SIS2, resulting in CF = 1.36.]

Figure D.5: Example of the “structural average” approach to determining CF. Each element vote for
the representative structure using their PFD contribution as weights

This approach creates artificial, non-integer m-oo-n representative SIS structures, but this does not pose a
problem in the calculations. The calculation is shown in Table D.4 for SIS1, yielding a 1.06-oo-1.15
representative structure. Similarly, for SIS2 the representative structure is 1-oo-1.04. Using equation (**)
above, the appropriate CF is then 1.36.

Table D.4: Calculation for finding the representative m-oo-n structure for SIS1

Element   Structure m-oo-n   m   n   PFD [%]   Weighted m   Weighted n
1A        2oo3               2   3   6         0.12         0.18
1B        1oo1               1   1   10        0.1          0.1
1C        1oo2               1   2   3         0.03         0.06
CCF 1A    1oo1               1   1   54        0.54         0.54
CCF 1C    1oo1               1   1   27        0.27         0.27
Total                                100       1.06         1.15
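
A sketch of the full “structural average” calculation for the example, with element data (m, n, PFD weight) read off Figure D.5:

sis1 = [(2, 3, 0.06), (1, 1, 0.10), (1, 2, 0.03), (1, 1, 0.54), (1, 1, 0.27)]
sis2 = [(1, 3, 0.02), (1, 2, 0.00), (1, 1, 0.70), (1, 1, 0.23), (1, 1, 0.05)]

def average_structure(elements):
    """PFD-weighted average (m, n) over (m, n, weight) element tuples."""
    m_avg = sum(m * w for m, n, w in elements)
    n_avg = sum(n * w for m, n, w in elements)
    return m_avg, n_avg

reps = [average_structure(sis1), average_structure(sis2)]
# reps = [(1.06, 1.15), (1.0, 1.04)], then equation (**) gives the CF:
num = 1.0
for m, n in reps:
    num *= n - m + 2
den = 1 + sum(n - m + 1 for m, n in reps)
print(f"CF = {num / den:.2f}")   # 1.36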

D.1.6 “Cut set” approach


The information about PFD contributions from individual elements may be used to derive even more
accurate CFs. In the “cut set” approach we do not seek a single representative m-oo-n structure for each


individual SIS, but rather let all possible element combinations (one from each SIS) vote for a CF, using
the element PFDs as weights. The situation is illustrated in Figure D.6. In the figure, the relative PFD
contribution from each element is indicated, as well as the resulting set of combinations with associated
CFs and weights.

[Figure D.6 shows the two SISs with the voting logic and relative PFD contribution of each element,
together with the resulting element combinations: e.g., the combination (1A, 2A) of a 2oo3 and a 1oo3
element gets CF = 2.0 with weight 0.12 %, (1A, CCF 2B) gets CF = 1.5 with weight 0.3 %, and
(CCF 1C, CCF 2B) gets CF = 1.33 with weight 1.35 %.]

Figure D.6: Example of the "cut set" approach to determining CF. The voting logic and the relative
PFD contribution of the individual elements are indicated

In this example there are 25 possible element combinations. Table D.5 shows selected combinations with
the associated CF calculated according to equation (**) above, as well as a weighted CF contribution from
each combination. The overall appropriate CF is 1.35, calculated as the sum of weighted CF contributions
from all combinations i, i.e.,

CF = Σ_i CF_i · w_i

Table D.5: Example of finding an appropriate CF using the “cut set” approach. Only selected
combinations are shown

SIS1 element   Structure   PFD [%]   SIS2 element   Structure   PFD [%]   CF_i   Weight w_i [%]   CF_i · w_i
1A             2oo3        6         2A             1oo3        2         2.0    0.12             0.0024
1A             2oo3        6         2B             1oo2        0         1.8    0                0
...            ...         ...       ...            ...         ...       ...    ...              ...
1A             2oo3        6         CCF 2B         1oo1        5         1.5    0.3              0.0045
1B             1oo1        10        2A             1oo3        2         1.6    0.2              0.0032
...            ...         ...       ...            ...         ...       ...    ...              ...
CCF 1A         1oo1        54        2C             1oo1        70        1.33   37.8             0.504
...            ...         ...       ...            ...         ...       ...    ...              ...
CCF 1C         1oo1        27        CCF 2B         1oo1        5         1.33   1.35             0.018
Total                                                                            100              1.35
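
The full weighted sum over all 25 combinations is easily automated; a sketch with the element data of Figure D.6:

from itertools import product

sis1 = [(2, 3, 0.06), (1, 1, 0.10), (1, 2, 0.03), (1, 1, 0.54), (1, 1, 0.27)]
sis2 = [(1, 3, 0.02), (1, 2, 0.00), (1, 1, 0.70), (1, 1, 0.23), (1, 1, 0.05)]

def cf_pair(e1, e2):
    """Equation (**) for one element combination of (m, n, weight) tuples."""
    (m1, n1, _), (m2, n2, _) = e1, e2
    return ((n1 - m1 + 2) * (n2 - m2 + 2)) / (1 + (n1 - m1 + 1) + (n2 - m2 + 1))

cf = sum(cf_pair(e1, e2) * e1[2] * e2[2] for e1, e2 in product(sis1, sis2))
print(f"CF = {cf:.2f}")   # 1.35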


Relation to MCS analysis


It can be shown that the “cut set” approach is equivalent to applying a separate CF to each minimal cut set
(MCS) in a fault tree analysis, where the CF is determined by the order of the MCS (i.e., the number of
components in the MCS). An MCS is a set of components whose simultaneous failure will be both
necessary and sufficient for the multiple SIS to fail. An MCS comprising n components can be considered
a parallel structure with a 1oon voting logic, suggesting a CF of 2^n/(n + 1). Each possible
combination of elements (one element from each SIS) yields an MCS, except in cases of advanced voting
logics (m-oo-n; m > 1), where several MCSs will arise.

This result can be used to calculate the PFD of the overall SIS quite accurately without explicitly
identifying a CF for the multiple SIS by the “cut set” method, as long as all the MCSs are known. The PFD
contribution from an MCS is the product of the PFDs of the components in the MCS, multiplied by the CF,
and the total PFD of the multiple SIS is the sum of all contributions, i.e.,

PFD = Σ_{k=1}^{K} [ 2^(n_k) / (n_k + 1) ] · Π_{i=1}^{n_k} PFD_ki

where K is the number of MCSs, n_k is the number of components in MCS k, and PFD_ki is the PFD of
component i in MCS k.

It should be noted that the “cut set” approach is rather advanced compared to the other approaches
discussed in this appendix. The required calculations are somewhat tedious to perform by hand, but may
easily be automated in a spreadsheet or a more dedicated computer tool.
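
As a sketch of the MCS-based PFD formula above, with a hypothetical list of cut sets and component PFD values:

def pfd_from_mcs(mcs_list):
    """Total PFD = sum over MCSs of 2^n/(n+1) times the product of PFDs."""
    total = 0.0
    for mcs in mcs_list:
        n = len(mcs)
        contribution = 2 ** n / (n + 1)   # order-dependent CF for the cut set
        for component_pfd in mcs:
            contribution *= component_pfd
        total += contribution
    return total

# Hypothetical cut sets: one single (e.g., a CCF "component") and two doubles
mcs_list = [[1.0e-2], [5.0e-3, 2.0e-3], [1.0e-2, 4.0e-3]]
print(f"PFD = {pfd_from_mcs(mcs_list):.2e}")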

D.1.7 Discussion

Comparing the six approaches discussed in this handbook, we see that they all end up with very similar
CFs when applied to a common example (Table D.6), except the very conservative “maximal order”
approach.

Table D.6: Summary of possible approaches to determining the CF for a multiple SIS

Approach                SIS element structure   Element PFD contribution   CF (example)
1 “Global”              Unknown/disregarded     Unknown/disregarded        1.33
2 “Maximal order”       Known                   Unknown/disregarded        2.0
3 “Minimal order”       Known                   Unknown/disregarded        1.33
4 “Dominant element”    Known                   Known                      1.33
5 “Structural average”  Known                   Known                      1.36
6 “Cut set”             Known                   Known                      1.35

The similar results are explained by the fact that in the given example each SIS is dominated by single
component elements with respect to PFD contribution. Especially the CCF elements, which are always
modelled as voted 1oo1, have a significant contribution. This dominance of single component elements
makes each approach identical or very similar to the “global” approach. If the single component elements
were less dominant, e.g., if CCF failures were not modelled, there would be a greater variation in the
suggested CFs from the various approaches.


More than anything, the results show that the simple “global” approach will be adequate for the vast
majority of conceivable multiple SIS calculations, and that there is seldom a need to adopt one of the
more advanced or overly conservative approaches.

D.2 The effect of differences in testing


In order to be able to assess the need for correction factors for multiple SIS, we first need to study the
effect of testing on the PFD estimates. In the following we consider test intervals of different phasing
and/or different length. Furthermore, we consider simple SIS structures consisting of two components (in
parallel), and generalize to three or more components when appropriate.

D.2.1 Different phasing of test intervals


We first consider the case with two components voted 1oo2 having failure rates 𝜆1 and 𝜆2 , respectively,
and the same test interval 𝜏, but where the testing is not done simultaneously. The situation is illustrated in
Figure D.7.

[Figure: PFD1(t) ≈ λ1·t for the component tested at 0, τ, 2τ, …, and PFD2(t) ≈ λ2·(t + τ − a) for t < a
and ≈ λ2·(t − a) for t ≥ a, for the component tested at a, τ + a, ….]
Figure D.7: PFD(t) for two components with equal test intervals, but not tested simultaneously. The
representative interval (0,τ) is in grey

Assuming that component 2 is tested at time 𝑎 inside the test interval of component 1, we have:

PFD(a) = (1/τ) ∫₀^τ PFD1(t) · PFD2(t) dt
       = (1/τ) ∫₀^a λ1·t · λ2·(t + τ − a) dt + (1/τ) ∫ₐ^τ λ1·t · λ2·(t − a) dt
       = (4/3 − 2a/τ + 2a²/τ²) · λ1λ2τ²/4
       = (4/3 − 2a/τ + 2a²/τ²) · PFD1 · PFD2

The expression in parentheses is the CF needed when calculating with average PFD values.

PFD(a) attains its maximum value

PFD_max = (4/3) · PFD1 · PFD2

when a = 0 or a = τ, i.e., when the components are tested simultaneously. This corresponds to the well-known
CF of 4/3 for two redundant components. Further, PFD(a) attains its minimum value when
a = τ/2, i.e., when component 2 is tested in the middle of the test interval of component 1:

PFD_min = (5/6) · PFD1 · PFD2


Note that this minimum PFD is actually lower than the PFD obtained when simply multiplying the average
PFD values of the components. This implies a CF of less than 1, i.e., the uncorrected product of average
PFD values is actually conservative. Compared to the case of simultaneous testing, we see a PFD reduction
of 38 % in the case of “optimal” testing. Hence, there is a substantial potential for improvement in the total
PFD if components are tested at different times. This is exploited in staggered testing, /15/, where
components are not tested simultaneously, but tests are distributed as evenly as possible in time. Staggered
testing is intended to reduce the impact of common cause failures, but it has a significant positive effect on
the PFD arising from independent failures as well. In our example, staggered testing would imply testing
component 2 at t = τ/2, which yields the minimum PFD value for the system of two components. Despite
the positive effect, staggered testing is often a desk exercise; in real operation such a test regime may prove
impractical.

If we have no prior knowledge of a, i.e., if the two components are tested completely independently (this
corresponds to selecting a at random), the expected average PFD can be obtained by integrating PFD(a).
(Note the difference in interpretation of PFD(a) and PFD: PFD(a) is the average PFD of the 1oo2 structure
given that component 2 is tested at time a, while PFD is the average PFD of the 1oo2 structure when we
have no information about a.)

PFD = (1/τ) ∫₀^τ PFD(a) da = λ1λ2τ²/4 = PFD1 · PFD2

We see that this result is equal to the PFD obtained when simply multiplying the average PFD values.
Hence, with independent testing, there is no correlation between the PFD(𝑡) functions of the components,
and the systemic dependencies vanish.
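
These closed-form results are easy to check numerically. The sketch below (with hypothetical failure rates) integrates the product of the two sawtooth PFD curves for a few phase offsets a and compares the resulting CF with the analytic expression 4/3 − 2a/τ + 2a²/τ²:

lam1, lam2, tau = 1.0e-5, 2.0e-5, 8760.0   # hypothetical rates and interval
steps = 100_000

def avg_pfd(a):
    """Time-average of PFD1(t)*PFD2(t) when component 2 is tested at t = a."""
    dt, acc = tau / steps, 0.0
    for i in range(steps):
        t = (i + 0.5) * dt
        pfd1 = lam1 * t                    # component 1 tested at t = 0
        pfd2 = lam2 * ((t - a) % tau)      # component 2 tested at t = a
        acc += pfd1 * pfd2 * dt
    return acc / tau

prod_of_averages = (lam1 * tau / 2) * (lam2 * tau / 2)
for a in (0.0, tau / 4, tau / 2):
    cf_num = avg_pfd(a) / prod_of_averages
    cf_ana = 4 / 3 - 2 * (a / tau) + 2 * (a / tau) ** 2
    print(f"a/tau = {a / tau:.2f}: CF = {cf_num:.4f} (analytic: {cf_ana:.4f})")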

1oo3 voting
Next we consider the case with three components voted 1oo3 having the same test interval τ, but where
the testing is not done simultaneously. We assume that components 2 and 3 are tested at times a and b
inside the test interval of component 1, such that 0 ≤ a ≤ b ≤ τ. Analogous to the 1oo2 case, it can be
shown that the PFD of this system of three components is:

PFD(a, b) = ( 2 − 8(a + b)/(3τ) + 4a(a + b)/τ² + [8(a³ + b³) − 12ab(a + b)]/(3τ³) ) · PFD1 · PFD2 · PFD3

Similar to the 1oo2 voting, this function attains its maximum value when the components are tested
simultaneously (i.e., a = b = 0) and its minimum value when the tests are evenly distributed in time (i.e.,
a = τ/3 and b = 2τ/3):

PFD_max = 2 · PFD1 · PFD2 · PFD3

PFD_min = (2/3) · PFD1 · PFD2 · PFD3

The lengthy expression in parentheses above is the required CF, and we have CF = 2 for PFD_max and
CF = 2/3 for PFD_min. Also, in the case of no prior knowledge of a and b, i.e., if the three components are
tested completely independently, the average PFD is again equal to the product of average PFD values:

PFD = (2/τ²) ∫₀^τ ∫₀^b PFD(a, b) da db = λ1λ2λ3τ³/8 = PFD1 · PFD2 · PFD3



Generalization to 1-oo-n voting


Although it is not proven here, these results are generalizable to the 1-oo-n case, where n redundant
components have the same test interval 𝜏, but where the testing is not done simultaneously. This means
that:

1. The PFD attains its maximum value when the components are tested simultaneously:

   PFD_max = (Π_{i=1}^{n} λ_i) · τⁿ/(n + 1) = [2ⁿ/(n + 1)] · Π_{i=1}^{n} PFD_i

2. PFD_max is always higher than the product of average PFD values, corresponding to a CF =
   2ⁿ/(n + 1). This implies that the product of average PFD values is in this case non-conservative.

3. The PFD attains its minimum value when tests are distributed evenly in time, i.e., at times
   t_k = kτ/n, k = 1, …, n.

4. PFD_min is always lower than the product of average PFD values, corresponding to a CF < 1. This
   implies that the product of average PFD values is in this case conservative.

5. The average PFD with independent testing is equal to the product of average PFD values. This implies
   that no CF is necessary:

   PFD = Π_{i=1}^{n} (λ_i τ/2) = Π_{i=1}^{n} PFD_i
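
A small numerical illustration of points 1, 3 and 5 for n = 3 (hypothetical failure rates; the CF is estimated by numerical integration of the product of the three sawtooth PFD curves, and the random-phasing average by Monte Carlo):

import random

lams, tau, steps = (1.0e-5, 2.0e-5, 3.0e-5), 8760.0, 20_000

def cf(phases):
    """Numerical CF for given test times (phases) of the n components."""
    dt, acc = tau / steps, 0.0
    for i in range(steps):
        t = (i + 0.5) * dt
        p = 1.0
        for lam, a in zip(lams, phases):
            p *= lam * ((t - a) % tau)     # sawtooth PFD of each component
        acc += p * dt
    prod_avg = 1.0
    for lam in lams:
        prod_avg *= lam * tau / 2
    return (acc / tau) / prod_avg

print(cf((0.0, 0.0, 0.0)))                 # simultaneous: ~2.0 = 2^3/4
print(cf((0.0, tau / 3, 2 * tau / 3)))     # evenly staggered: ~0.667
random.seed(1)
mc = [cf((0.0, random.uniform(0, tau), random.uniform(0, tau)))
      for _ in range(100)]
print(sum(mc) / len(mc))                   # random phasing: ~1.0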

D.2.2 Different length of test intervals


We consider the case with two components voted 1oo2 with failure rates 𝜆1 and 𝜆2 , respectively, and test
intervals 𝜏1 and 𝜏2 , respectively. We first assume that the test interval of component 2 is a multiple of the
interval of component 1, i.e., 𝜏2 = 𝑛𝜏1 , and that the testing is “quasi-simultaneous” in the sense that
whenever component 2 is tested, component 1 is tested also. The situation is illustrated in Figure D.8 with
𝑛 = 4.

[Figure: sawtooth curves PFD1(t), for the component tested at τ1, 2τ1, 3τ1, …, and PFD2(t), for the
component tested at τ2 = 4τ1, with component 2 accumulating a correspondingly higher PFD over its
longer test interval.]

Figure D.8: PFD(t) for two components with different test intervals and failure rates. The case of
τ2=4τ1 is shown

The overall PFD for this system is:


PFD = (1/(nτ)) ∫₀^{nτ} PFD1(t) · PFD2(t) dt = (1/(nτ)) Σ_{k=1}^{n} ∫_{(k−1)τ}^{kτ} λ1·(t − (k − 1)τ) · λ2·t dt
    = [(3n + 1)/(3n)] · PFD1 · PFD2          (D.2)

(Here and below, τ denotes τ1.)

If there is a difference in phasing of the tests, we know from the discussion above that this will reduce the
PFD. Assuming that component 2 is tested at time 𝑎 inside the test interval of component 1, it can be
shown that:
PFD(a) = [ (3n + 1)/(3n) − 2a/(nτ) + 2a²/(nτ²) ] · PFD1 · PFD2

Mirroring the results in section D.2.1, this function attains its maximum value (with CF > 1) when the
components are tested simultaneously, and its minimum value (with CF < 1) when a = τ/2:

PFD_max = [(3n + 1)/(3n)] · PFD1 · PFD2

PFD_min = [(6n − 1)/(6n)] · PFD1 · PFD2

Also, in the case of no prior knowledge of a, i.e., if the two components are tested completely independently,
the average PFD is equal to the product of average PFD values, and no CF is needed:

PFD = (1/τ) ∫₀^τ PFD(a) da = PFD1 · PFD2

We see that as n grows, the effect of the different phasing diminishes rapidly. For example, for two
components tested quarterly and yearly, respectively (n = 4), calculating with individual average PFD
values will require a CF of 1.08. In other words, omitting the CF in the calculation underestimates the
actual PFD by a mere 8 %.

Formula (D.2) may therefore be used as a fairly good, and conservative, approximation also in cases
of different phasing. Furthermore, the formula applies for integer n, but can also be used as an
approximation in cases where τ2 is not an exact multiple of τ1.
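
A sketch of these expressions in code (the CF bounds depend only on n):

def cf_max(n):
    """CF for quasi-simultaneous testing, formula (D.2): (3n+1)/(3n)."""
    return (3 * n + 1) / (3 * n)

def cf_min(n):
    """CF for optimal phasing (a = tau/2): (6n-1)/(6n)."""
    return (6 * n - 1) / (6 * n)

for n in (1, 2, 4, 12):
    print(f"n = {n:2d}: CF_max = {cf_max(n):.3f}, CF_min = {cf_min(n):.3f}")
# n = 4 gives CF_max = 1.083, i.e., the 8 % underestimation mentioned above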


APPENDIX E: PFD VERSUS PFH AND THE EFFECT OF DEMANDS


This Appendix provides some of the background for the main results presented in chapter 6. In addition to
the standard parameters, we now introduce the following notation:

X = number of demands in a test interval (0, τ]. X is Poisson distributed with parameter δ · τ, where

δ = constant rate of demands which serve as a functional test.

𝒁 = Number of hazards in a test interval (0, 𝜏], i.e., number of demands not being “prevented” by the SIS,
due to a DU failure of the SIS.

𝒀𝟏 , 𝒀𝟐 , … , 𝒀𝒌 = Occurrences of the demands in a test interval, given 𝑋 = 𝑥, where 0 < 𝑌1 < 𝑌2 < ⋯ <
𝑌𝑥 < 𝜏.

Regarding PFD, PFH and HR we use the following notation:

𝐏𝐅𝐃(𝒕) = Instantaneous PFD at time 𝑡; 0 < 𝑡 ≤ 𝜏, the general case also accounting for demands.

𝐏𝐅𝐃 = Average PFD over the test interval (0, 𝜏]; i.e., general case when demands are accounted for.

𝐏𝐅𝐃𝒙 = Average PFD, when demands are accounted for, and given that 𝑋 = 𝑥 in the test interval (0, 𝜏].

𝐏𝐅𝐃𝟎 = “Traditional” PFD. Average PFD over the test interval, not accounting for demands, (i.e., X = 0).

𝐏𝐅𝐃𝟎 (𝒕) = Instantaneous PFD at time 𝑡, not accounting for demands, i.e., 𝑋 = 0.

𝐏𝐅𝐇𝟎 (𝒕) = Rate of SIS failures at time 𝑡, not accounting for demands, i.e., 𝑋 = 0.

𝐏𝐅𝐇𝟎 = Average rate of SIS failures, given X = 0.

𝐏𝐅𝐇(𝒕) = Rate of SIS failures at time 𝑡; 0 < 𝑡 ≤ 𝜏, the general case also accounting for demands.

𝐏𝐅𝐇 = Average rate of SIS failures, general case accounting for demands.

𝐇𝐑 = Hazard rate, i.e., rate of demands not prevented by the SIS and thus giving a hazardous event
(HR = E(Z)/𝜏).

Here it is assumed that we have an on demand system where the DU failures make the essential
contribution to loss-of-safety (this is often a fair assumption, since when D failures are detected
“immediately”, safety precautions may be taken to avoid hazards). Further, a constant rate of DU
failures is assumed, and we make the standard simplifying assumption PFD0 (𝑡) = 1 − e−𝜆DU⋅𝑡 ≈ 𝜆DU ⋅ 𝑡
(giving e.g., PFD0 = 𝜆DU ⋅ 𝜏/2). The system is assumed to be as good as new after each functional test,
and the demands included here are also considered to give a “perfect test”.

It should be noted that if a demand does not always serve as a functional test of the system, 𝛿 should be
defined as the rate of demands which do serve as a functional test (this might depend on the voting
configuration), and 𝑋 is then the number of such demands.

First it is observed that the occurrence of demands essentially does not affect the SIS failure rate PFH.
Unless we account for combinations of independent failures, the PFH is independent of δ and equal to
PFH0 . Further, assuming a constant PFH (= PFH0 ), we get PFD0 = PFH0 ⋅ τ/2. This is the relation
between PFD and PFH when demands are not accounted for (i.e., X = 0).


The objective now is to derive expressions for PFD, PFH (and HR) that also take the effect of demands
into account. We present the results for the case where PFD(𝑡) is a step-wise linear function of 𝑡 (the
“linear case”), corresponding to the time-dependent PFH being constant. This means that we consider
either a 𝑁oo𝑁 configuration or, for 𝑀oo𝑁 configurations (𝑀 < 𝑁), restrict attention to the CCF
contribution (i.e., PFH0 = C𝑀oo𝑁 ⋅ 𝛽 ⋅ 𝜆DU ). This is a standard assumption and not considered a serious
limitation, and we therefore skip discussing the contribution from combinations of independent DU failures.

The approach chosen is to derive expressions for the basic parameters (such as HR) by conditioning on the
values of 𝑋 and 𝑌1 , 𝑌2 , … , 𝑌𝑋 . The approach follows the line of probabilistic arguments found e.g., in
/15/. First we give the following basic probabilistic result:

Lemma
Given 𝑋 = 𝑥, the distribution of 𝑌1 , 𝑌2 , … , 𝑌𝑥 will be that of the order statistics of 𝑥 independent uniform
variables over (0, 𝜏]. That is, the joint probability density function (pdf) of 𝑌1 , 𝑌2 , … , 𝑌𝑥 given 𝑋 = 𝑥 equals

f_Y(y_1, y_2, \ldots, y_x) = \frac{x!}{\tau^x}; \qquad 0 < y_1 < y_2 < \cdots < y_x < \tau. \quad \Box

Proof
Let 𝑋 = 2. Then,

P(Y_1 \le y_1, Y_2 \le y_2 \mid X = 2) = \frac{P(Y_1 \le y_1, Y_2 \le y_2 \cap X = 2)}{P(X = 2)}.

According to the Poisson distribution, P(X = 2) = \frac{(\delta\tau)^2}{2!} \mathrm{e}^{-\delta\tau}. Further, P(Y_1 \le y_1, Y_2 \le y_2 \cap X = 2) is
the probability of the event that

• In the interval (0, 𝑦1 ] there is exactly one demand;
• In the interval (𝑦1 , 𝑦2 ] there is exactly one demand;
• In the interval (𝑦2 , 𝜏] there is no demand.

It follows that,

P(𝑌1 ≤ 𝑦1 , 𝑌2 ≤ 𝑦2 ∩ 𝑋 = 2) = 𝛿𝑦1 e−δy1 ⋅ 𝛿(𝑦2 − 𝑦1 )e−𝛿(𝑦2 −𝑦1 ) ⋅ e−𝛿(𝜏−𝑦2 ) .

And so the conditional cumulative distribution function equals

P(Y_1 \le y_1, Y_2 \le y_2 \mid X = 2) = \frac{y_1 \cdot (y_2 - y_1)}{\tau^2 / 2!},

giving the conditional pdf

f_Y(y_1, y_2) = \frac{2}{\tau^2}; \qquad 0 < y_1 < y_2 < \tau.

This proves the result for 𝑋 = 2. The general result is proved similarly. □
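
The Lemma may also be illustrated by simulation. The sketch below (with arbitrary parameter values) conditions a Poisson process on exactly two arrivals in (0, 𝜏] and compares the mean arrival times with those of the order statistics of two independent uniform variables:

```python
import numpy as np

rng = np.random.default_rng(0)

# A simulation sketch of the Lemma for x = 2 (arbitrary parameter values):
# Poisson arrival times in (0, tau], conditioned on exactly two arrivals,
# compared with the order statistics of two independent Uniform(0, tau]
# variables. Both should have mean arrival times (tau/3, 2*tau/3).
tau, delta, runs = 1.0, 3.0, 50_000

conditioned = []
while len(conditioned) < runs:
    arrivals = np.cumsum(rng.exponential(1.0 / delta, size=20))
    arrivals = arrivals[arrivals <= tau]
    if arrivals.size == 2:
        conditioned.append(arrivals)

uniform_order = np.sort(rng.uniform(0.0, tau, size=(runs, 2)), axis=1)

print(np.mean(conditioned, axis=0))  # -> approx. [0.333, 0.667]
print(uniform_order.mean(axis=0))    # -> approx. [0.333, 0.667]
```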

A hazard is defined as a demand occurring when the SIS is in a failed state (not responding properly to the
demand). The hazard rate HR is given as E(𝑍)/𝜏, and so we first find E(𝑍). Given 𝑋 = 𝑥 demands,
the number of hazards in (0, 𝜏] can be written as


𝑍 = 𝐼1 + 𝐼2 + ⋯ + 𝐼𝑥 ,

where 𝐼𝑘 equals 1 if demand no. 𝑘 results in a hazard, that is, if the SIS has a DU failure at that demand
(occurring at time 𝑌𝑘 ), and 0 otherwise. So P(𝐼𝑘 = 1) = E(𝐼𝑘 ) = the probability that demand no. 𝑘 meets a
failed SIS. First consider the conditional case where both 𝑋 and 𝑌1 , 𝑌2 , … , 𝑌𝑋 are given. In this conditional
case (Figure E.1):

P(𝐼𝑘 = 1) = 𝜆DU ⋅ (𝑦𝑘 − 𝑦𝑘−1 ); 𝑘 = 2,3, … , 𝑥, and P(𝐼1 = 1) = 𝜆DU ⋅ 𝑦1

(writing out the result just for a single SIS component).

It follows by a simple calculation that, conditionally, given 𝑋 = 𝑥 and (𝑌1 , 𝑌2 , … , 𝑌𝑥 ) = (𝑦1 , 𝑦2 , … , 𝑦𝑥 ):

𝐸(𝑍) = E(𝐼1 + 𝐼2 + ⋯ + 𝐼𝑥 ) = 𝜆DU ⋅ 𝑦𝑥 .

Figure E.1: PFD(t) for a single component, given X = x demands, and given the instances (yk) of
demand (“linear case”, DU failures only)

By integrating this over the joint pdf of 𝑌1 , 𝑌2 , … , 𝑌𝑥 (given 𝑋 = 𝑥), we obtain the conditional value of the
mean number of hazards, E(𝑍), in an interval with 𝑋 = 𝑥 demands:

𝐸(𝑍|𝑋 = 𝑥) = 𝑥 ⋅ 𝜆DU 𝜏/(𝑥 + 1); (= 𝜆DU 𝜏/(1 + 𝑥⁻¹) for 𝑥 > 0).

For 𝑥 = 0, it is observed that we get the obvious result, E(𝑍│𝑋 = 0) = 0; (there can be no hazard in an
interval with no demand). Further, E(𝑍│𝑋 = 1) = 𝜆DU 𝜏/2, which is also as expected: the single demand
occurs randomly in the interval, and the average probability that the demand “meets” a failed system
equals the average PFD0 = 𝜆DU 𝜏/2. However, as the number of demands 𝑥 increases, the mean number
of hazards, 𝐸(𝑍│𝑋 = 𝑥), also increases, approaching 𝜆DU 𝜏 = 2 ⋅ PFD0 .

It remains to find the unconditional value of E(𝑍). Since 𝑋 has a Poisson distribution with mean 𝛿 ∙ 𝜏, it
can be derived that

E(𝑍) = ∑𝑥 E(𝑍|𝑋 = 𝑥) ⋅ P(𝑋 = 𝑥) = 𝜆DU (e−𝛿𝜏 − 1 + 𝛿𝜏)/𝛿.

And so

HR = E(𝑍)/𝜏 = 𝜆DU (e−𝛿𝜏 − 1 + 𝛿𝜏)/(𝛿𝜏).

This rather fundamental result shows how HR depends on all the parameters 𝜆DU , 𝜏 and 𝛿. It is
valid for a SIS with a constant failure rate, here denoted 𝜆DU (i.e., a 1oo1 configuration). However, if we
restrict attention to CCF – which is usually the main contribution to loss of safety – the SIS failure rate is
constant also for 𝑀oo𝑁 configurations (𝑀 < 𝑁), i.e., PFH0 = C𝑀𝑜𝑜𝑁 ⋅ 𝛽 ⋅ 𝜆DU , provided each component
has a constant failure rate. The same holds for a 𝑁oo𝑁 configuration, with PFH0 = 𝑁 ⋅ 𝜆DU . In these
cases we have the more general result

HR = PFH0 ⋅ (e−δτ − 1 + 𝛿𝜏)/(𝛿𝜏).

So the fundamental parameter HR is now given as the product of PFH0 and a factor, which we see is
entirely determined by 𝛿𝜏 (= mean number of demands in the test interval). Actually, by expanding e−δτ,
we get (e−δτ − 1 + 𝛿𝜏)/(𝛿𝜏) ≈ 𝛿𝜏/2 − (𝛿𝜏)²/6 ≈ 𝛿𝜏/2 (for “small” 𝛿𝜏). It follows that

HR ≈ PFH0 ⋅ 𝛿𝜏/2 (small 𝛿𝜏).
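
The HR formula above lends itself to a Monte Carlo check. The following sketch (with illustrative parameter values) draws Poisson demand times and lets every demand act as a perfect functional test; it uses the exact failure probability 1 − e−𝜆DU⋅(𝑦𝑘 − 𝑦𝑘−1), which agrees with the linear approximation used above to first order:

```python
import numpy as np

rng = np.random.default_rng(1)

# Monte Carlo sketch of the hazard rate for a single component (1oo1) with
# Poisson demands of rate delta, where every demand acts as a perfect
# functional test. Parameter values are illustrative only.
lam_du = 1e-5       # DU failure rate per hour
tau = 8760.0        # proof test interval (one year)
delta = 2.0 / tau   # on average two demands per test interval

def simulated_hr(n_intervals=100_000):
    hazards = 0
    for _ in range(n_intervals):
        x = rng.poisson(delta * tau)  # number of demands in this interval
        gaps = np.diff(np.sort(rng.uniform(0.0, tau, x)), prepend=0.0)
        for gap in gaps:
            # Hazard if a DU failure arose since the last (perfect) test;
            # the demand then also reveals and "repairs" the failure.
            if rng.random() < 1.0 - np.exp(-lam_du * gap):
                hazards += 1
    return hazards / (n_intervals * tau)

dt = delta * tau
analytic = lam_du * (np.exp(-dt) - 1.0 + dt) / dt
print(simulated_hr())  # -> approx. 5.7e-6 per hour
print(analytic)        # -> approx. 5.7e-6 per hour
```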

Next, we investigate the expression for PFD when demands are accounted for. Following the same
procedure as above, we first find the time-dependent PFD, i.e., PFD(𝑡), given 𝑋 = 𝑥 and (𝑌1 , 𝑌2 , … , 𝑌𝑥 ) =
(𝑦1 , 𝑦2 , … , 𝑦𝑥 ). For a 1oo1 system, PFD(𝑡) increases linearly from 0 to 𝜆DU 𝑦1, where it drops to 0, and
then again increases linearly to 𝜆DU (𝑦2 − 𝑦1 ) at time 𝑦2 , etc.; see Figure E.1. From this we can derive
the conditional average PFD, given 𝑋 = 𝑥:

PFD𝑥 = 𝜆DU ⋅ 𝜏/(𝑥 + 2),

showing how PFD decreases as 𝑥 increases (due to the “testing” performed by the demands). Of course we
get the special case PFD0 = 𝜆DU ⋅ 𝜏/2. From PFD𝑥 we can now derive the overall average PFD =
∑𝑥 PFDx ⋅ P(𝑋 = 𝑥). However, the overall PFD is most easily obtained by utilizing that the hazard rate at
time 𝑡 equals HR(𝑡) = PFD(𝑡) ⋅ 𝛿; integrating over (0, 𝜏] then gives

HR = PFD ∙ 𝛿.

From the above result on HR it then directly follows that

PFD = 𝜆DU (e−𝛿𝜏 − 1 + 𝛿𝜏)/(𝛿²𝜏).

Again, we may replace 𝜆DU ⋅ 𝜏 by using 𝑃𝐹𝐷0 = 𝜆DU ⋅ 𝜏/2, and the average PFD becomes

PFD = PFD0 ⋅ 2(e−𝛿𝜏 − 1 + 𝛿𝜏)/(𝛿𝜏)²

or

PFD = PFD0 ⋅ 𝑓(𝛿𝜏),

where we define

𝑓(𝑥) = 2(e−𝑥 − 1 + 𝑥)/𝑥².

Here it can be proved that 𝑓(𝑥) → 1 as 𝑥 → 0, which assures that PFD approaches PFD0 as 𝛿 → 0.

By performing an expansion of e−x , we get the approximation 𝑓(𝑥) ≈ 1 − 𝑥/3, for small 𝑥. Then, the
PFD becomes:

\mathrm{PFD} \approx \mathrm{PFD}_0 \left(1 - \frac{\delta\tau}{3}\right) \qquad (\text{small } \delta\tau).


For “small” 𝛿𝜏 this gives a simple relation between PFD and the “traditional” PFD0 . Note that the
exact formula is obtained by multiplying the “traditional” PFD0 by a simple factor which depends only on 𝛿𝜏.

Summary of main results


Note that the above results are valid in general for the “linear case”, i.e., PFH0 being constant:

PFH0 ≈ 𝑁 ⋅ 𝜆DU for the 𝑁oo𝑁 configuration, and

PFH0 ≈ C𝑀𝑜𝑜𝑁 ⋅ 𝛽 ⋅ 𝜆DU , for the 𝑀oo𝑁 configuration, 𝑀 < 𝑁,

an approximation being valid when combinations of independent failures are not taken into consideration.
It is seen that PFH0 then depends neither on the length of the test interval, 𝜏, nor on the demand rate, 𝛿. So
this approximation, which usually represents the major contribution to PFH0 , captures the effect of neither 𝜏
nor 𝛿.

Further, by introducing 𝑓(𝑥) = 2(e−𝑥 − 1 + 𝑥)/𝑥² we have the following relations between the three
measures, PFH0 , PFD and HR, (with PFH = PFH0 ):

PFD = PFD0 𝑓(𝛿𝜏) = PFH0 ∙ 𝑓(𝛿𝜏)𝜏/2.

HR = PFD ∙ 𝛿 = PFD0 ∙ 𝛿𝑓(𝛿𝜏).

HR = PFH0 ∙ 𝑓(𝛿𝜏) ∙ 𝛿𝜏/2.

In particular, HR is a product of the constant PFH0 and a factor which is a function of the mean number
of demands in the test interval, i.e., 𝛿𝜏.
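
As a small worked example of these relations (with illustrative parameter values, here assuming on average one demand per test interval):

```python
import numpy as np

# A worked example (illustrative values) of the summary relations above:
# PFD = PFH0*f(dt)*tau/2 and HR = PFH0*f(dt)*dt/2, with dt = delta*tau.
pfh0 = 5.0e-7        # per hour, e.g. C_MooN * beta * lam_DU (illustrative)
tau = 8760.0         # test interval in hours
delta = 1.0 / tau    # one demand per test interval on average, so dt = 1

dt = delta * tau
f = 2.0 * (np.exp(-dt) - 1.0 + dt) / dt**2  # modification factor, approx. 0.74

pfd0 = pfh0 * tau / 2   # "traditional" PFD, no demands
pfd = pfd0 * f          # demand-corrected PFD
hr = pfh0 * f * dt / 2  # hazard rate per hour (equals pfd * delta)

print(pfd0, pfd, hr)    # -> approx. 2.19e-3, 1.61e-3, 1.84e-7
```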

These results generalize the standard results, which disregard the effect of demands. However, note that
the relations are given under the assumption that PFH remains constant and equal to PFH0 throughout the
test interval (which is usually a fair approximation).

Some numerical results


Table E.1 indicates the effect of the demands on PFD by giving the “modification factor”, 𝑓(𝛿𝜏) =
PFD/PFD0 , for some values of 𝛿𝜏. Note that 𝛿𝜏 is interpreted as the mean number of demands
(acting as a test) in one test interval. As stated above, the modification factor approaches 1 as 𝛿𝜏 approaches
0. If, for instance, there are five demands per test interval (acting as tests), PFD is reduced to
about one third of its value when there are no demands.

Table E.1: Values of “modification factor” = PFD/PFD0, as a function of δτ. Last line: HR/PFD0,
(with τ = 1)
𝛿𝜏 = mean number of demands per test interval:   0.1   0.2   0.5   1     2     5     10
𝑓(𝛿𝜏) = PFD/PFD0 :                               0.97  0.94  0.85  0.74  0.57  0.32  0.18
𝛿 ∙ 𝑓(𝛿𝜏) = HR/PFD0 (𝜏 = 1):                     0.10  0.19  0.43  0.74  1.1   1.6   1.8
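
The table values are straightforward to reproduce; a minimal sketch:

```python
import numpy as np

# A minimal sketch reproducing Table E.1: the modification factor
# f(x) = 2*(exp(-x) - 1 + x)/x**2 and HR/PFD0 = delta*f(delta*tau) for tau = 1.
def f(x):
    x = np.asarray(x, dtype=float)
    return 2.0 * (np.exp(-x) - 1.0 + x) / x**2

dt = np.array([0.1, 0.2, 0.5, 1.0, 2.0, 5.0, 10.0])
print(np.round(f(dt), 2))  # -> [0.97 0.94 0.85 0.74 0.57 0.32 0.18]
print(dt * f(dt))          # -> approx. [0.10 0.19 0.43 0.74 1.1 1.6 1.8]
```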

The last line of the table gives the factor HR/PFD0 = 𝛿 ∙ 𝑓(𝛿𝜏), inserting 𝜏 = 1 (𝜏 is chosen as the time
unit, and 𝛿 is then interpreted as the mean number of demands in a test interval). For small 𝛿, it is seen that
we actually get HR/PFD0 ≈ 𝛿, i.e., HR ≈ 𝛿 ⋅ PFD0 , and the error made by calculating HR from 𝛿 ⋅ PFD0 is
negligible. But as 𝛿 increases (above 1), HR/PFD0 becomes << 𝛿, and so using PFD0 rather than PFD to
calculate HR leads to a distinct overestimation. For 𝛿 = 1, the “true” value HR = 𝛿 ∙ PFD is 74 % of the
approximation 𝛿 ⋅ PFD0 . This might not be too bad, but for 𝛿 = 10 this percentage is reduced to 18 %, i.e.,
the approximation is conservative by more than a factor of five.

Conclusions regarding the use of loss-of-safety measures


The recommendation presented in the IEC 61508 standard seems to be based on the assumption that we
have to choose between PFD0 and PFH0 (that is, measures which do not take demands into account). In
that case PFD0 has an obvious drawback for an increasing demand rate, but neither is PFH0 a good
measure in that case. Therefore, the choice of loss-of-safety parameter should be based on analyses, as
presented above, showing the effect of all relevant parameters on these measures.

That is, IEC 61508 seems to reject the use of PFD for moderate/large 𝛿 since PFD0 does not work well as a
measure in that case. Here we have instead chosen to derive a PFD expression that works well also in this
case. From the investigations summarized above we conclude as follows regarding measures for
loss-of-safety due to DU failures:

• PFH0 = PFH seems a sensible measure for systems operated in continuous mode, where failure
detection is more or less immediate.
• PFH0 alone is not suited as a measure for loss of safety for on demand systems, (neither low
demand nor high demand); irrespective of 𝜏 and 𝛿.
o The main contributor to PFH0 (i.e., the CCF or complete failure rate for 𝑁oo𝑁) does not
depend on the length of the test interval, 𝜏. So the decrease in safety experienced by
increasing 𝜏 is essentially not captured by PFH0 .
o No argument is found for using PFH0 instead of PFD if the number of demands is above 1
per year. PFH0 is constant, independent of the demand rate, 𝛿, and so does not reveal how
the risk depends on the demand rate.
• However, note that by starting from PFH0 , we can easily calculate both PFD and HR. These
relations are quite simple both for X = 0 and for the general case (𝛿 > 0). So no harm is done by
calculating PFH0 as long as the result is afterwards used to calculate HR.
• PFD (and in particular HR = 𝛿 ∙ PFD) is well suited to describe how loss-of-safety depends on both
𝜏 and 𝛿. It is highly recommended to calculate HR. However, PFD is also a measure reflecting both
𝜏 and 𝛿, and there is little doubt that PFD is a better loss-of-safety parameter than PFH0 (when
dormant failures are the main issue).
• When demands actually serve as functional tests, it is recommended that the expressions for HR
and PFD referred to above be used to determine whether the SIS has an acceptable reliability
(i.e., whether the loss-of-safety is sufficiently low).
o It is observed that PFD depends on the demands only through 𝛿𝜏 = the mean number of
demands in a test interval. We take this as an indication that the previous IEC 61508
definition of low demand mode in /23/ is more sensible than the new one, which focuses
entirely on the number of demands in one year (as one year also seems a rather arbitrary time unit).
