

A method for barrier-based incident investigation

Article  in  Process Safety Progress · June 2015


DOI: 10.1002/prs.11738



Journal: Process Safety Progress
Manuscript ID: Draft
Wiley - Manuscript type: Original Article
Date Submitted by the Author: n/a
Complete List of Authors: Pitblado, Robin; DNV GL, Oil & Gas Risk Management Solutions
Potts, Tony; DNV GL
Fisher, Mark; DNV GL
Greenfield, Stuart; DNV GL
Keywords: Incident Investigations, Risk Assessment
A Method for Barrier-based Incident Investigation


Robin Pitblado (a), Tony Potts (b), Mark Fisher (b), Stuart Greenfield (b)
(a) DNV GL, 1400 Ravello Dr, Katy TX 77441
(b) DNV GL, Highbank House, Exchange St, Stockport, UK

Incident investigation is a formal requirement for high hazard facilities, with the aim of learning from each
incident and preventing recurrence. There are many published investigation methods, with most driving to
the management system root cause and some applying newer barrier-based methods. However, these methods
either do not link tightly to the facility risk assessment or are very difficult to apply, and lessons from incidents
that might reveal weaknesses, especially relating to major accidents, can be missed. This paper describes a novel
method for incident investigation (BSCAT) that combines the ideas of barrier-based risk assessment with a
well-established systems-based root cause analysis method (SCAT). The method is efficient and can be
applied by properly trained supervisors, which potentially allows every incident or near-miss event to be
assessed in a consistent risk-based format. The method clearly establishes links back to the facility risk
assessment and thus identifies risk pathways that are potentially too optimistic (i.e. the risk is higher than
predicted), whether because of initial optimism or degradation of safety barriers (human or hardware).

INTRODUCTION
Formal incident investigation is required by US regulations for onshore high hazard facilities and by SEMS
regulations for offshore facilities. A similar requirement applies under safety case regulations in Europe,
both onshore and offshore. None of these, however, prescribes a particular method; the operator is free to
select any method deemed suitable.
Early incident investigation focused too much on direct causes and on assigning blame, and rarely delved into
system causes. Bird et al [1] quote statistics from 1490 old incident reports that show ineffective investigations:
only 1% of incidents were identified as the fault of the employer, with the bulk of the remainder classed as either
unpreventable (65%) or some kind of human error (31%). With this depth of analysis, it is not surprising that
accidents continued without significant reduction, as the true underlying causes were not being identified.
Modern investigation techniques drive beyond the initial or direct causes and attempt to identify deeper root
causes, usually linked to the management system (CCPS [2]). A selection of current techniques that drive to
root causes is provided in Table 1.
With such a full list of methods, it might be asked why a new investigation method is needed.
Incident investigation techniques need to evolve to match the management processes in use; otherwise the lessons
learned through the investigation will not match the system being employed. At a relatively simple level this
means the system categories generated by the investigation should match the management system elements
employed at the facility (e.g. OSHA PSM [3], CCPS [4], or ISRS [5]). At a deeper level, however, there has been
a major change in the management of high hazard facilities, from a traditional safety management structure (as in
OSHA PSM) towards a risk-based structure (as in CCPS and ISRS). Extracting lessons is more than
matching incidents to management system elements; ideally it should also provide a direct linkage to the risk
tools being used to manage the facility risks.
Investigation methods can be characterized by the amount of structure inherent in the method and by the
complexity of applying it. For example, MORT and SCAT both have a high degree of methodology
structure, and the user mostly selects options from within this predefined structure; however, the complexity of
application differs greatly between the two, with SCAT requiring less specialist investigation
knowledge and MORT much more. The 5 Why’s and the Fault Tree methods provide a similar pair of examples
in the flexible area. Here the methods do not provide predefined options and the user must develop the solution
from first principles using the methodology rule set. BSCAT uses the fixed structure of SCAT but combines this
with the flexibility of a bow tie model, so it sits midway on the structure/flexibility axis. Similarly, in
terms of detail it extends the simple model of SCAT to address the risk domain, but not in as much detail as some
complex techniques, so it lies midway on the Overview/Detailed axis. CGE Risk have developed a figure
mapping the different techniques (Figure 1). While subjective, it does show that BSCAT provides a good
balance between structure and complexity, making it suitable for general application by facility supervisors
rather than only by highly qualified investigation specialists. More sophisticated techniques like Tripod Beta are
more difficult to apply and are suitable for only a subset of total incidents. This limits their lessons learned
potential across all barriers, although the restriction would be justified by the greater depth of information on the
cultural influences that would show in most incidents.
Figure 1. Application features of several investigation methods (CGE Risk)
In the following sections, the authors review the new barrier-based operational risk assessment method,
frequently termed bow tie diagrams, and then show how these can be adapted to incident investigation using only
those arms of the bow tie that capture the accident pathway. The well-established SCAT method (Systematic
Cause Analysis Technique) is then described. BSCAT (Barrier-based SCAT) merges the two techniques,
allowing a tight link to be established between the risk assessment and the root causes. Finally, a worked example
shows the application of the method to the Buncefield oil terminal fire event.

BARRIER RISK METHODS


Barrier-based risk assessment has been applied to process safety risks for over two decades, with Shell taking
a lead (Zuijderduijn [6]). The original thinking derives from the well-known Swiss Cheese model proposed by
James Reason, but the method does not follow his structure. Regulators also recognized the value of this
risk-based approach (UK Parliament [7]) as it permits a focus on major accident risks during the operational phase;
most other risk techniques focus on the design stage. The model shows a number of safety barriers lying between
the threats and the major accident outcome. The barriers are not perfect, hence the holes, which represent the
failure modes associated with individual barriers. If all the holes “line up”, then the unwanted event occurs. The
model is intuitive and easy to explain; a safer system would employ more barriers with smaller holes.
Currently there is no publicly available guideline document describing the bow tie method, although CCPS has a
working party (Project 237) on this. In the meantime shorter method statements have been published
(Zuijderduijn [6], Pitblado and Weijand [8]) or are available as software support manuals (from ABS for Thesis and
CGE Risk [9] for BowTieXP). In the absence of a formal specification, the terminology used for the elements of
the bow tie tends to vary, although the method is similar in all cases.
Figure 2 shows the primary elements of a bow tie diagram. At the top is the hazard: the material or
condition that, if control is lost, could give rise to the unwanted consequences. The hazard leads directly to the top
event, which is the central circle. This is the specific loss of control or loss of containment of the hazard (e.g. leak
of a hazardous material). On the left side are various threats or causes (e.g. corrosion, dropped object) that could
cause this loss of control, and on the right side are the consequences or undesired outcomes (e.g. injury, explosion).
Between the threats and the top event are prevention barriers (or safeguards / controls), and similarly on
the other side are mitigation barriers (or safeguards / controls). Not shown on this simplified diagram are barrier
decay mechanisms (also known as escalation factors), which show how individual barriers can degrade (e.g.
failure to inspect), and the additional barriers installed (e.g. inspection and preventive maintenance programs) to
keep these at their performance standard. Barriers are more than bars on a bow tie diagram: each represents an
AND gate with inputs of “demand” and “barrier fails”. This provides an underpinning of sound safety science to
the method. The barrier decay mechanism builds out the fault tree AND gate by showing the mechanisms through
which the barrier might fail. Pitblado and Weijand [8] give multiple examples of good and poor bow tie elements
and how these can affect the quality and utility of the final bow tie.
Figure 2. Bow Tie diagram elements
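The elements described above map naturally onto a small data structure. The following Python sketch is illustrative only: the class and function names (Barrier, BowTie, pathway_breached) and the example barrier names are assumptions made for this example and are not taken from BowTieXP, Thesis, or any other bow tie software. It treats each barrier as an AND gate, so an event passes a barrier only when there is a demand and the barrier fails.

```python
from dataclasses import dataclass, field


@dataclass
class Barrier:
    """One bar on a bow tie arm (hypothetical structure for illustration)."""
    name: str
    failed: bool = False  # True if the barrier did not perform on demand

    def passes_event(self, demand: bool) -> bool:
        # AND gate: the event gets past only if there is a demand AND the barrier fails.
        return demand and self.failed


@dataclass
class BowTie:
    hazard: str
    top_event: str
    # threat -> ordered prevention barriers between that threat and the top event
    threats: dict[str, list[Barrier]] = field(default_factory=dict)
    # consequence -> ordered mitigation barriers between the top event and that outcome
    consequences: dict[str, list[Barrier]] = field(default_factory=dict)


def pathway_breached(barriers: list[Barrier], demand: bool = True) -> bool:
    """Return True if a demand propagates through every barrier on one arm."""
    for barrier in barriers:
        demand = barrier.passes_event(demand)
    return demand


# Threat and consequence names come from the text; barrier names are hypothetical.
bow_tie = BowTie(
    hazard="hazardous material",
    top_event="loss of containment",
    threats={"corrosion": [Barrier("barrier P1", failed=True),
                           Barrier("barrier P2", failed=False)]},
    consequences={"explosion": [Barrier("barrier M1", failed=True)]},
)

print(pathway_breached(bow_tie.threats["corrosion"]))       # False: one barrier held
print(pathway_breached(bow_tie.consequences["explosion"]))  # True: every barrier failed
```

In this sketch a near miss corresponds to an arm where at least one barrier held, and an accident to a pathway on which every barrier failed.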
Real bow ties are more complex than shown in this figure, often with 5-8 threat arms entering and 2-4
consequence arms emerging. Generally, 3-4 barriers per arm represent a well-protected system. Examples are
seen with many more than this, but that is most often due to faults in drawing the bow tie, with
barrier decay mechanism barriers incorrectly promoted onto the main pathway. Shell guidance [6] is that 10-15
bow ties are sufficient to capture the most important top events and barriers, and usually little value is obtained
from creating a greater number.
An important opportunity is to link incident investigations to these facility risk assessment bow ties, showing
which barriers must have failed in order to have an accident (reaching all the way to the right hand side) or a near
miss (having stopped somewhere along the accident pathway).

The SCAT Root Cause Methodology


The Systematic Cause Analysis Technique (SCAT) was developed in the 1980s by Frank Bird [1]. It is
based on the DNV GL Loss Causation Model (Figure 3). When this model is used from right to left to investigate
incidents, it is the SCAT approach. The model shows that a Loss (e.g. occupational accident, fire, or near miss
event) is created by an Incident. Incidents have an Immediate Cause, which is categorized as due to Substandard
Acts or Substandard Conditions. These in turn have a deeper Basic Cause, which is categorized as due to a
Personal Factor or a Job / System Factor. These basic causes lead to the management system lack of control areas
that may be in need of corrective action. The corrective action type will depend on whether the lack of control is
due to an inadequate system, inadequate standards within the system, or poor compliance with those standards.

Figure 3. Loss Causation Model
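The chain in the Loss Causation Model can be written down as a simple record. The Python sketch below is an illustration under stated assumptions: the type and field names are invented for this example, and the example record is hypothetical, reusing category labels from Tables 2 and 3 rather than findings from any real investigation.

```python
from dataclasses import dataclass
from enum import Enum


class ImmediateCauseType(Enum):
    SUBSTANDARD_ACT = "substandard act"
    SUBSTANDARD_CONDITION = "substandard condition"


class BasicCauseType(Enum):
    PERSONAL_FACTOR = "personal factor"
    JOB_SYSTEM_FACTOR = "job/system factor"


class LackOfControl(Enum):
    INADEQUATE_SYSTEM = "inadequate system"
    INADEQUATE_STANDARDS = "inadequate standards"
    INADEQUATE_COMPLIANCE = "inadequate compliance"


@dataclass(frozen=True)
class CausationChain:
    loss: str
    incident: str
    immediate_cause: str
    immediate_type: ImmediateCauseType
    basic_cause: str
    basic_type: BasicCauseType
    control_gap: LackOfControl

    def right_to_left(self) -> list[str]:
        """Walk the model in the SCAT direction, from loss back to lack of control."""
        return [
            f"Loss: {self.loss}",
            f"Incident: {self.incident}",
            f"Immediate cause ({self.immediate_type.value}): {self.immediate_cause}",
            f"Basic cause ({self.basic_type.value}): {self.basic_cause}",
            f"Lack of control: {self.control_gap.value}",
        ]


# Hypothetical record for illustration only.
chain = CausationChain(
    loss="near miss event",
    incident="hypothetical incident",
    immediate_cause="Using Defective Equipment",
    immediate_type=ImmediateCauseType.SUBSTANDARD_ACT,
    basic_cause="Inadequate Maintenance Inspection and Controls",
    basic_type=BasicCauseType.JOB_SYSTEM_FACTOR,
    control_gap=LackOfControl.INADEQUATE_COMPLIANCE,
)
for line in chain.right_to_left():
    print(line)
```

Using enumerations for the cause types mirrors the fixed structure of SCAT: the analyst selects from predefined options rather than inventing new categories.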


The aim of SCAT is that it can be applied to all incidents or near miss events by supervisors who have some
training in investigation but who are not specialists. To aid them in the correct categorization of immediate and
basic causes, the SCAT system has predefined categories of substandard acts and conditions, and similarly for
personal and job factors. Supervisors, after collecting all needed evidence (interviews, documents, photographs,
etc.), refer to these lists to match the incident immediate and basic causes as closely as possible to the available
categories. The current SCAT (version 8) has immediate cause categories with 28 substandard acts and 21
substandard conditions; some examples are shown in Table 2.
The list of basic causes is longer, and to make this manageable these are divided into main categories and
subcategories. There are 8 personal factors and 10 job/system factors, and each of these has around 8 to 20
subcategories, giving a total of over 200 subcategories. A sampling of these is provided in Table 3.
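As a quick consistency check on these counts, the short Python calculation below adds up the figures quoted in the text. The variable names are chosen for this example only, and the subcategory bounds are rough because the text gives only an approximate range per category.

```python
substandard_acts = 28
substandard_conditions = 21
immediate_cause_categories = substandard_acts + substandard_conditions

personal_factors = 8
job_system_factors = 10
basic_cause_categories = personal_factors + job_system_factors

# "around 8 to 20 subcategories" per main category gives rough bounds only.
min_subcategories = basic_cause_categories * 8
max_subcategories = basic_cause_categories * 20

print(immediate_cause_categories)            # 49
print(basic_cause_categories)                # 18
print(min_subcategories, max_subcategories)  # 144 360
```

The quoted total of over 200 subcategories falls inside these bounds.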
The purpose of these lists of categories is to help the user define the immediate and basic causes correctly.
Without such a list, it might be possible to confuse causes and assign a basic cause as an immediate cause, or to
list two immediate causes as the immediate and basic cause combination. This would not point correctly towards
the lack of control issue, and a faulty corrective action might be developed. When assigning categories there is no
restriction to a single immediate or basic cause; in fact most incidents have multiple immediate and basic causes.
These lists have been the subject of much feedback and careful revision over the years and the current list is
considered effective.
The lack of control categories should match the facility safety management system. If this is the risk-based
International Safety Rating System (ISRS v8 [5]) then there are 15 elements; if it is based on the CCPS Risk
Based Process Safety [4] then there are 20 elements.

BSCAT METHOD
Accidents can be converted from a traditional description or storyboard diagram into a bow tie pathway
showing the barriers that were degraded or failed. This pathway can be in the form of a bow tie diagram with a
single top event in the center and barriers on either side, or as a sequence of intermediate events with barriers
around these. The sequence can be initiated by a single failure (e.g. dropped object) or by multiple failures (e.g.
corrosion and excess pressure). Similarly there can be one or more consequences (e.g. safety, environment, asset
damage). An advantage of the bow tie diagram format is that the incident analysis can link directly back to
the facility risk assessment diagrams.
The BSCAT methodology follows the CCPS [2] approach in terms of collecting evidence (physical/positional,
photographs/video, witness statements, paper records, and electronic data) and organizing this with the aid of a
timeline or storyboard. This collates multiple different sources and helps resolve conflicts in evidence. The new
part involves reviewing the existing bow ties and selecting the bow tie most closely matching the actual incident
and selecting amongst the threat and consequence arms for the relevant pathway.
The BSCAT approach combines the incident bow tie with the SCAT analysis. Each barrier failure is treated
as an incident and a SCAT analysis is applied. The difference between a traditional SCAT and BSCAT is shown
in Figure 4. It might be assumed from this figure that BSCAT requires significantly more effort than SCAT, but
this is not the case. All the barriers that failed and are analyzed in BSCAT need to be identified and assessed in
SCAT as well, but SCAT gives no clear guide as to how deep the analysis should proceed. With the barrier
model, all the barrier failures must be developed.
Figure 4. Comparison of SCAT and BSCAT approach
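The rule that every barrier failure must be developed can be expressed as a completeness check. The Python sketch below is a minimal illustration, assuming hypothetical structures (ScatAnalysis, PathwayBarrier) and placeholder barrier names; it is not the logic of any particular software tool.

```python
from dataclasses import dataclass, field


@dataclass
class ScatAnalysis:
    # Lists, because most incidents have multiple immediate and basic causes.
    immediate_causes: list[str]
    basic_causes: list[str]
    control_findings: list[str]


@dataclass
class PathwayBarrier:
    name: str
    failed: bool
    analyses: list[ScatAnalysis] = field(default_factory=list)


def undeveloped_failures(pathway: list[PathwayBarrier]) -> list[str]:
    """Names of failed barriers that do not yet have a SCAT analysis."""
    return [b.name for b in pathway if b.failed and not b.analyses]


# Placeholder pathway: barrier names and cause texts are hypothetical.
pathway = [
    PathwayBarrier("barrier A", failed=True, analyses=[
        ScatAnalysis(["immediate cause text"], ["basic cause text"], ["finding text"]),
    ]),
    PathwayBarrier("barrier B", failed=True),
    PathwayBarrier("barrier C", failed=False),
]

print(undeveloped_failures(pathway))  # ['barrier B']
```

An investigation would be considered complete, in this sketch, only when the returned list is empty.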
WORKED EXAMPLE – BUNCEFIELD INCIDENT


The Buncefield incident provides a good example of the application of the BSCAT methodology and is
well known publicly. The incident has been well investigated and the HSE [10] published a
summary report with their overall assessment of the causes, which were seen to be due to a series of “broader
management system failings”. The authors have used this report exclusively as the source of information for this
BSCAT worked example.
The Buncefield Oil Storage Depot explosion and fires occurred on 11 December 2005 at an oil storage facility
located just north of London. A storage tank was overfilled with unleaded gasoline, which escaped over the rim
of the tank, causing the loss of about 300 tonnes of fuel. The splashing to ground formed a massive vapor cloud
in still weather, which eventually found an ignition source and caused a series of explosions and resulting fires
involving 20 large storage tanks. Analysis of damage and later experiments at the DNV GL Spadeadam test site
showed this was probably a DDT (Deflagration to Detonation Transition) event. There were no fatalities in the
adjacent business park as the event occurred on a Sunday morning; however, there was significant property
damage and environmental impact.
The HSE report allowed a series of intermediate or key events to be determined. These events are points at
which the potential for an incident either increased or decreased, i.e. control was lost or regained. The key events
help prompt for barriers that were, could, or should have been in place. It is possible for different analysts to
choose different sets of key events, but the barrier failures all need to be mapped and well selected events help
identify all of these. For Buncefield the following key events are selected:
• Filling the Tank with Gasoline (the threat)
• Bulk Storage of Gasoline / Overfill, spill and formation of vapor cloud (top event)
• Ignited Release causing Explosion and Fire (the consequence)
These key events relate directly to the Cause, Top Event and Consequence of an incident bow tie pathway.
Since there were no preexisting bow ties, the incident bow tie had to be created from first principles. It can be
useful to choose several intermediate key events as this encourages deeper thinking about the incident and
associated barriers. It is also recommended to map barriers that could have been in place according to
legislation, company standards and / or international best practices, even if they were not present; these are
shown as “missing” barriers.
Once added to the diagram, the barriers can then be classified as one of the following types: present and
operational (i.e. worked as designed), missing, failed, or low reliability (where it is unclear whether the barrier
actually worked or, if it did work, whether it might not on the next demand). For the Buncefield incident,
Figure 5 shows an extract of key events and barriers. Only half the barriers are shown to aid clarity, with the
left and right segments shown vertically to improve readability. All barriers have been assumed to have failed
(shown as broken bars).

Figure 5. Key events and barriers summary
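The four barrier classifications can be captured as an enumeration and tallied for a pathway. The Python sketch below is illustrative: the names BarrierStatus and summarize are assumptions for this example, and apart from the automatic tank gauging system, which is named in the text, the barrier entry is a hypothetical placeholder.

```python
from collections import Counter
from enum import Enum


class BarrierStatus(Enum):
    OPERATIONAL = "present and operational"
    MISSING = "missing"
    FAILED = "failed"
    LOW_RELIABILITY = "low reliability"


def summarize(statuses: dict[str, BarrierStatus]) -> dict[str, int]:
    """Count barriers on an incident pathway by classification."""
    counts = Counter(status.value for status in statuses.values())
    # Report every classification, including those with no barriers.
    return {status.value: counts.get(status.value, 0) for status in BarrierStatus}


# All barriers were assumed to have failed in the Buncefield extract.
buncefield = {
    "automatic tank gauging system": BarrierStatus.FAILED,
    "placeholder barrier": BarrierStatus.FAILED,  # hypothetical entry
}
print(summarize(buncefield))
```

Reporting every classification, including empty ones, keeps the summary format identical across investigations, which is useful when results are compared over time.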


The next stage in the BSCAT analysis is to complete the SCAT (or root cause analysis) for each of the barriers,
and this is shown in Figure 6. The SCAT development appears as text boxes beneath each barrier.
Referring to the first barrier in Figure 5 (automatic tank gauging system), the top two boxes are the immediate
cause and its category (here IC21: defective equipment), the next two are the basic cause and its category (BC13:
inadequate maintenance/inspection), and the bottom two are the finding or recommendation and the safety
management system category (MSF10.3: execution of maintenance).
Figure 6 also shows two display formats, the BSCAT results in full and partial modes. In full mode (Part a)
the free text description and associated SCAT categories are displayed, and in partial display mode (Part b) only
the free text description is displayed. The full display mode is normally used only by the analyst, to ensure that
the free text is correctly categorized as a valid immediate or basic cause and to collect statistics on longer term
trends of causes. Neither of these is directly important to readers of the investigation; with the categories removed,
as in Part b of the figure, the BSCAT diagram is simpler to read, and this would be the normal format of
presentation.
Figure 6. BSCAT analysis for Buncefield incident
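The six text boxes beneath a barrier, and the two display modes, can be sketched as follows. This Python example is an assumption-laden illustration: the class names and the render method are invented here and do not describe how IncidentXP stores or draws its data, and the free text entries are placeholders because the free text from Figure 6 is not reproduced in this paper's body. Only the category labels are taken from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CategorizedText:
    text: str      # free text description written by the analyst
    category: str  # SCAT or management system category code and label


@dataclass(frozen=True)
class BarrierFinding:
    barrier: str
    immediate_cause: CategorizedText
    basic_cause: CategorizedText
    recommendation: CategorizedText

    def render(self, full: bool = True) -> list[str]:
        """Return the text boxes shown beneath a barrier, top to bottom."""
        boxes = []
        for item in (self.immediate_cause, self.basic_cause, self.recommendation):
            boxes.append(item.text)
            if full:
                # Full mode adds the category box after each free text box.
                boxes.append(item.category)
        return boxes


finding = BarrierFinding(
    barrier="automatic tank gauging system",
    immediate_cause=CategorizedText("<immediate cause text>", "IC21: defective equipment"),
    basic_cause=CategorizedText("<basic cause text>", "BC13: inadequate maintenance/inspection"),
    recommendation=CategorizedText("<finding or recommendation text>",
                                   "MSF10.3: execution of maintenance"),
)

print(len(finding.render(full=True)))   # 6 boxes in full mode
print(len(finding.render(full=False)))  # 3 boxes in partial mode
```

Keeping the category alongside each free text entry lets the analyst collect cause statistics from the same record that produces the simpler partial view for readers.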


CONCLUSIONS
The BSCAT method was developed to update the well-established SCAT method by addressing the barrier
theory of accident causation. It provides a transparent linkage to the risk management system and to modern
risk-based management systems, and compared to other investigation systems it has a good balance between level
of detail and formal structure. Its simplicity, its use of checklist categories and, where they exist, its use of
pre-constructed bow tie risk diagrams help both experienced analysts and supervisors to apply the method to all
incidents or near miss events. This allows every investigation not only to identify the management system root
causes, but also to document which safety barriers failed or were degraded and, importantly, which worked.
A feature of bow ties is that many barriers repeat between different bow ties and even different arms of the
same bow tie. For example, if on one incident bow tie the cause is related to failure to calibrate inspection
equipment, then that barrier in otherwise unrelated bow ties would also be suspect. Software can automatically
detect and communicate such common failures and display these on all the bow ties where that barrier appears – it
does not require active intervention or insight by a safety specialist. Thus over time, owners of bow ties will see
many of the barriers overlaid with failure events (often from other bow ties and other incidents). This is a visual
indication of robustness of the barrier system against each threat and a powerful lessons learned feature.
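The overlay of failure events onto every bow tie that shares a barrier can be sketched in a few lines. The Python function below is a minimal illustration under stated assumptions: overlay_failures is a hypothetical name, barriers are matched simply by identical name, and the bow tie, barrier, and incident labels are placeholders. Commercial bow tie software may work quite differently.

```python
from collections import defaultdict


def overlay_failures(bow_ties: dict[str, set[str]],
                     failures: list[tuple[str, str]]) -> dict[str, dict[str, list[str]]]:
    """Map each bow tie to the failure events recorded against its barriers.

    bow_ties: bow tie name -> names of the barriers that appear on it
    failures: (barrier name, incident reference) pairs from past investigations
    """
    events_by_barrier: dict[str, list[str]] = defaultdict(list)
    for barrier, incident in failures:
        events_by_barrier[barrier].append(incident)

    overlay: dict[str, dict[str, list[str]]] = {}
    for name, barriers in bow_ties.items():
        # Membership test does not create keys in the defaultdict.
        overlay[name] = {b: events_by_barrier[b]
                         for b in sorted(barriers) if b in events_by_barrier}
    return overlay


# Hypothetical data: one shared barrier appears on two otherwise unrelated bow ties.
bow_ties = {
    "bow tie A": {"inspection equipment calibration", "barrier X"},
    "bow tie B": {"inspection equipment calibration", "barrier Y"},
}
failures = [("inspection equipment calibration", "incident 1")]

for name, flagged in overlay_failures(bow_ties, failures).items():
    print(name, flagged)
```

A failure recorded in one investigation is therefore flagged on both bow ties without any manual cross-referencing.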
The authors have applied the BSCAT method on multiple occasions and generally have found it aids in
communication of the final results as recommendations are directly linked to barrier failures. It tends to reduce
good practice recommendations, not directly related to the incident causation, which can sometimes confuse
investigations. These are better captured as additional findings.
The visual nature of the result merges some features of a modified storyboard (here showing the sequence
of barrier failures) with the root cause of each failure. This enhances communication and allows the facility risk
assessment to be reinforced with every investigation. Faulty risk assessments will be quickly identified and
rectified, without waiting for a five year revision requirement.
REFERENCES
1. F. Bird, G. Germain, D. Clark. Practical Loss Control Leadership, 3rd Ed. Atlanta, DNV GL (2003).
2. CCPS. Guidelines for Investigating Chemical Process Accidents, 2nd Ed. New York, Wiley / AIChE (2003).
3. Occupational Safety and Health Administration, Process Safety Management of Highly Hazardous Chemicals
Regulation, OSHA 29 CFR 1910.119 (1992).
4. CCPS, Risk Based Process Safety. New York, Wiley / AIChE (2007).
5. DNV GL, International Safety Rating System (ISRS 8th Ed), Manchester UK (2012).
6. C. Zuijderduijn, Risk management by Shell refinery/chemicals at Pernis, the Netherlands. EU Safety
Conference: Implementation of the Seveso II Directive Athens (2000).
7. UK Parliamentary Office of Science and Technology. Managing Human Error, Report 156, London (2001).
8. R. Pitblado and P. Weijand, Barrier diagram (bow tie) quality issues for operating managers. Process Safety
Progress, published online 27 Jan 2014, DOI: 10.1002/prs.11666 (2014).
9. CGE Risk. Bow Tie XP Software Manual. Leidschendam, the Netherlands (2010).
10. Health and Safety Executive, Buncefield: Why did it happen?
http://www.hse.gov.uk/comah/buncefield/buncefield-report.pdf, accessed Nov 25, 2014.


Acknowledgement:
Images in this paper were created in IncidentXP Software from CGE Risk, Leidschendam, The Netherlands.

Table 1. Selected Incident Investigation methods

Category                        Name
Generic                         5 Why’s
                                Fishbone Diagrams
                                Fault Trees
Proprietary (developer name)    Common List of Causes (BP)
                                MORT - Management Oversight and Risk Tree (US Department of Energy)
                                Source (ABS)
                                TapRoot (System Improvements Inc.)
                                TriPod Beta (Reason and Hudson)
                                SCAT and BSCAT (DNV GL)

Table 2. Examples of SCAT list of Immediate Causes

Substandard Acts                                        Substandard Conditions
Operating Equipment without Authority                   Inadequate or Improper Protective Equipment
Failure to Warn/Secure                                  Failure to Reach Business Goals and/or Objectives
Making Safety Devices Inoperative                       Presence of Fire and/or Explosion Hazards
Using Defective Equipment                               Inadequate Information Data/Indicators
Improper Operation of Equipment                         Inadequate Preparation/Planning
Improper Employee/Management Behavior                   Inadequate Support/Assistance/Resources
Being Under the Influence of Alcohol or Other Drugs     Inadequate EQSH System
Etc.                                                    Etc.

Table 3. Examples of SCAT list of Basic Causes (Categories with sub-categories)

Personal Factors                                        Job / System Factors
Inadequate Physical/Physiological Capability            Inadequate Leadership and/or Supervision
- Inappropriate height, weight, size, strength, etc.    - Unclear or conflicting reporting relationships
- Restricted range of body movements                    - Lack of supervisory/management job knowledge
- Substance sensitivities                               - Improper or insufficient delegation
Inadequate Mental/Psychological Capability              Inadequate Maintenance Inspection and Controls
- Fears and phobias                                     - Inadequate inspections
- Mental illness/emotional disturbance                  - Part substitution
- Intelligence level                                    Etc.
Etc.