Root Cause Analysis

Presented by
James J. Rooney
Presentation to ASQ Seattle Section

Terminology

• Accident/adverse event/loss event –
an unplanned sequence of events that
resulted in personnel injuries,
environmental insult, reliability
impacts, and/or quality
nonconformities
• Near miss – an unplanned sequence
of events that reasonably could have
resulted in an accident or more severe
consequences
• Incident = accidents + near misses

What Is Root Cause Analysis (RCA)?

• A systematic method that:
– investigates an incident or series of
incidents
– attempts to understand the underlying
causes of the incident(s)
– generates effective corrective actions to
prevent and mitigate the incident(s)

Survey Question – RCA Goals

What is the primary goal of performing an
RCA at your organization?

Survey Question – Most Common
Types of Recommendations

What do the most common types of
recommendations from your current RCAs
involve?

Why Perform RCAs for Equipment
Failures?

• The first level of analysis
– Repair/replace the failed equipment and
restart the system
– Analyze complicated failures to determine
exactly what equipment failed and how
– “Fix it and forget it” approach

How Effective Is the “Fix It and Forget It”
Approach in Preventing the Next Failure?

• Question: Will implementing these
recommendations prevent the next failure?
• Answer: Not unless we understand and
address why the equipment failed.
– To understand why the equipment failed,
we need to understand the underlying
human errors that led to the failure
– All equipment failures can be traced back to
some human error

Why Perform RCAs for Human
Errors?

• Even when the equipment
operates properly, human
errors can prevent the
system from achieving its
goal
• We need to investigate
human errors so we can
identify and correct the
error
– Do we just need to provide
a “fix it and forget it”
approach for human
errors?

Let’s Try “Fixing” the Person

• Let’s “fix” the person by punishing him/her
• We’ll punish the personnel (employees and
contractors) who committed the error to
ensure they don’t commit the error again

Direct Punishment

• We’ll ensure the personnel involved in
the incident will not commit the same
mistake by:
– assigning them the worst jobs
– reassigning them to the worst shift times
– providing them time off
– terminating them

What Is the Purpose of These
Punishments?

• The employee needs an attitude adjustment
– time off and lousy work assignments will convince
the employee that he/she works for a great
organization and the employee should care more
about it
– the employee’s improved attitude will be contagious
and everyone else’s attitude will improve too
• The employee needs to be terminated
– getting rid of this person and replacing him/her with
another will eliminate the undesirable incidents
– this action will also eliminate this undesirable
behavior in other workers

More Subtle Punishments

• We counsel them – If they had a better safety
attitude, a better attitude toward their job,… they
wouldn’t be making these errors
• We provide training – If they just knew how to do
their job they wouldn’t make this mistake again
• We review the procedure – We review the
procedure in detail so they will know what to do
• We ask for task performance demonstrations –
We ask them to demonstrate proper task
performance so we can be sure they really have
the skills to do the task
These are punishments when the cause of the
incident is unrelated to the “solution”

Have You Ever Made a Math Error?

• What if, following the error, you were required to:
– attend a class on basic math (addition, subtraction)
– get counseling on the “goodness” of math
– demonstrate your math skill by passing a test
• How would this action make you feel toward the
organization that assigned these corrective
actions to you?

Survey Question – Recommendation
Effectiveness

• How effective have the following measures
been in preventing incidents at your
facility?
– counseling
– training
– procedure reviews
– termination/time off

These Recommendations Are Not Always Bad – All of
These Methods Are Appropriate in SOME Incidents

• These can be effective corrective actions when:
– there is a willful desire to perform the task
incorrectly, knowing it will turn out wrong, or
– lack of knowledge results in the wrong behavior
• We need to sort out the situations where the task
was willfully performed incorrectly to “help” the
company (“I did it wrong on purpose.”)
– examples
• the task cannot be performed correctly
• the worker (or the company) gets punished for
performing the task correctly
• the worker gets rewarded for performing the task
incorrectly

These Recommendations Are Not Always Bad – All of
These Methods Are Appropriate in SOME Incidents

• In SOME cases, personnel really do develop poor
attitudes or don’t care about the organization or
their fellow employees
– shouldn’t we have dealt with this bad attitude
long before an RCA identified the issue?
• In some cases, training on how and why to
perform the task properly can help
– but if they weren’t trained, shouldn’t we find
out why they weren’t trained?

Why Don’t Most of These Methods
Work for the TYPICAL Incident?

• Because most employees ARE trying to do the job
right
– in fact, this is often why they get involved in the
incident in the first place
– in order to get the job done on time and with the
tools and information available, they take shortcuts
and push the limits, believing they are doing the
best thing for the organization
• Many factors affect our personnel’s job
performance

How Effective Is the “Fix It and Forget It”
Approach in Preventing the Next Error?

• Question: Will implementing these actions
prevent the next error?
• Answer: Not unless we understand why
the human made this error.
– To understand why the human committed
the error, we need to understand the
underlying causes that led to the failure
– This helps us determine what, if anything,
is wrong with the person involved and/or
the situation they were in

Performance Shaping Factors (PSFs)

• Factors that influence human performance
• Types of PSFs
– internal PSFs – the person
– external PSFs – the work environment
• Most PSFs are within management control

Internal PSFs – Factors Associated
with the Individual Worker

• Can be modified “easily”
– training/experience/skill level
– knowledge of required standards and goals
– physical condition
• Are difficult to modify
– personality
– overall intelligence
– motivation and attitudes
– emotional state
– stresses outside the work site

How many of these factors are you addressing in
the corrective actions from your RCAs?

External PSFs – Situational

• Factors related to the general work
environment
– general work environment
– work hours/schedule/shift rotation
– availability/adequacy of tools/equipment
– staffing parameters
– supervision
– planning
– rewards/recognitions/benefits
– goals/objectives
– general work standards, policies,
and administrative controls

How many of these factors are you addressing in
the corrective actions from your RCAs?

External PSFs – Job Directions

• Job and task instructions
• Written and unwritten procedures
• Written and oral communications
• Task-specific work standards, policies, and
administrative controls

How many of these factors are you addressing in
the corrective actions from your RCAs?

External PSFs – Task and Equipment
Characteristics

• Control-display relationships
• Human-machine interface parameters
• Equipment functionality and constancy
(maintenance and design)
• Anticipatory requirements
• Complexity
• Memory requirements
• Calculational requirements
• Feedback
• Team structure and communication
• Dynamic versus step-by-step

How many of these factors are you addressing in
the corrective actions from your RCAs?

Relationship of PSFs

Organizational Environment
• General work environment
• Standards, policies, & administrative controls
• Tools and equipment
• Staffing
• Supervision
• Planning
• Rewards/punishments

Task characteristics
• Task Resources
– procedures
– equipment (design, HF engineering, maintenance)
– personnel
• Environment
– conditions
– protective equipment

Situational stressors

Personal factors
– Training and experience
– Experiences
– Inherent characteristics
– External experiences

Information, Actions, Feedback

How many of these factors are you addressing in your RCAs?

How Do Organizations Control These
PSFs?

• Through management systems
– design engineering and human factors
– equipment records
– maintenance programs
– procedures
– training
– communications practices
– standards and policies
– procurement standards
– human resources
– quality inspections and customer interfaces
– management of change (MOC) programs
– proactive and reactive analyses

What Should These Management
Systems Control?

• You can’t control everything
• So, what should you control?
– those incidents that can cause intolerable pain
– incidents with the highest risk – a combination of
frequency and consequences
• Proactive analysis helps to identify high risk
issues
– proactive analysis methods include:
• hazard and operability (HAZOP)
• failure modes and effects analysis (FMEA)
• reliability-centered maintenance (RCM)
• fault tree analysis (FTA)
• what-if/checklist
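
The idea of controlling the highest-risk incidents first can be sketched in a few lines of Python. This is a minimal sketch, not the presenter's method: it assumes the "combination of frequency and consequences" is a simple product, and the `Scenario` class, the scenario names, and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Scenario:
    name: str
    frequency_per_year: float  # assumed unit: expected occurrences per year
    consequence: float         # assumed unit: relative severity score


def risk(s: Scenario) -> float:
    # Assumption: the "combination" is modeled as a simple product.
    return s.frequency_per_year * s.consequence


def rank_by_risk(scenarios):
    # Highest-risk scenarios come first.
    return sorted(scenarios, key=risk, reverse=True)


# Hypothetical scenarios, for illustration only.
scenarios = [
    Scenario("pump seal leak", frequency_per_year=4.0, consequence=2.0),
    Scenario("tank overfill", frequency_per_year=0.5, consequence=40.0),
    Scenario("mislabeled sample", frequency_per_year=10.0, consequence=0.5),
]

ranked = rank_by_risk(scenarios)
for s in ranked:
    print(f"{s.name}: risk score {risk(s):.1f}")
```

In this made-up data the rare but severe scenario outranks the frequent but minor one, which is the point of ranking on both factors rather than on frequency alone.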

Survey Question – Proactive Analysis
Techniques

• Which of the following proactive analysis
techniques does your organization typically
use to proactively identify risks?
– HAZOP
– FMEA
– RCM
– what-if/checklist analysis
– None

Relationship Among Analyses, Operations,
and Management Systems

PROACTIVE ANALYSIS
Perform proactive analysis to identify significant risks and safeguards to prevent
and mitigate the associated consequences
• What could go wrong?
• What are the consequences of these incidents?
• What could cause these consequences?
• How likely are these consequences?

MANAGEMENT SYSTEMS
Set up systems to manage equipment and human behavior within our system
to adequately control risk. Examples of management systems include:
• Equipment design
• Maintenance strategies, methods, and procedures
• Administrative processes
• Training
• Employee screening

OPERATIONS
Operation of the facility in accordance with the management systems

Unacceptable failures, losses, and inefficiencies

REACTIVE ANALYSIS (Incident Investigation/Root Cause Analysis)
Perform reactive analysis to identify improvements in the safeguards to prevent and mitigate the
associated consequences to adequately control risk
• What did go wrong?
• What were the consequences of these incidents?
• What caused these consequences?
• What changes should be made to the proactive analysis process and the management
systems to adequately control risk?

What Is Root Cause Analysis?

• A systematic method that:
– investigates an incident or series of incidents
– attempts to understand the underlying causes of
the incident(s), including the management system
issues that caused or led to the incident
– generates effective corrective actions to prevent
and mitigate the incident(s) including the
identification of changes to management systems

Levels of Investigative Analysis

Depth of analysis, level of learning, and scope of corrective actions
all increase from the first level to the last:
• Individual error or equipment failure – causal factors
(performance gaps for front-line personnel)
• Task control issues
• Process control issues – root causes
(performance gaps for support organizations)
• Management system issues
• Organizational culture issues

The Real Goal of RCA

• To develop and implement systems to set
up our frontline personnel to succeed
instead of setting them up for failure
• The same goal we have in proactive analysis

An Overview of the RCA Process

Overall RCA Program Management System

• Analyze now?
– Yes: Initiate investigation → Gather data → Analyze data → Identify root causes →
Develop recommendations → Follow up and resolve recommendations
– No: Trend?
• Trend?
– Yes: Enter into incident database
– No: No formal analysis
• Trending: Trend incident characteristics, trend root causes, and
analyze data to find chronic incidents
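
The two decision boxes in the flowchart can be sketched as a small routing function. This is a minimal illustration under stated assumptions: the slides do not say how "Analyze now?" or "Trend?" are decided, so both are passed in as plain booleans, and the `min_count` threshold for calling an incident type chronic is hypothetical.

```python
from collections import Counter


def triage(analyze_now: bool, trend: bool) -> str:
    """Route an incident through the two decision boxes of the flowchart."""
    if analyze_now:
        return "initiate investigation"
    if trend:
        return "enter into incident database"
    return "no formal analysis"


def chronic_incidents(database, min_count=3):
    """Return incident types that recur at least min_count times.

    min_count is a hypothetical threshold, not taken from the slides.
    """
    counts = Counter(record["type"] for record in database)
    return sorted(t for t, n in counts.items() if n >= min_count)


# Hypothetical incident database entries.
database = [
    {"type": "valve misalignment"},
    {"type": "valve misalignment"},
    {"type": "label error"},
    {"type": "valve misalignment"},
]

print(triage(analyze_now=False, trend=True))
print(chronic_incidents(database))
```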

An Overview of the RCA Process
(cont.)

[Flowchart repeated: Overall RCA Program Management System]

Types of Data Needed for an RCA

Types of data:
• People*
• Paper*
• Position
• Electronic*
• Physical*

*Also a source of data


An Overview of the RCA Process
(cont.)

[Flowchart repeated: Overall RCA Program Management System]

Fault Tree Analysis

• Begins with a known event and describes possible
combinations of events and conditions that can
lead to this event
– troubleshooting
– structured guessing
• Looks backward in time to describe potential
causes
• Uses AND and OR logic to show the causes and
combinations of causes
• Smallest possible tree used for RCA
• Goal is to identify the performance gaps of
equipment and frontline personnel

Example Fault Tree

Incorrect gas delivered to patient (AND)
– Incorrect gas delivered through system (AND)
  • Tank with incorrect contents connected to the system (OR)
    – Tank filled with incorrect gas connected to system (OR)
      • Wrong tank connected to system
      • Tank mislabeled
    – Tank with correct gases not sufficiently mixed
  • Tank error not detected at point of connection (CF)
– Error not detected at patient location (OR)
  • No indication of wrong gas (OR)
    – Oxygen monitor not used (CF)
    – Indication of wrong gas not detected
  • Corrective action insufficient

Investigator notes on the chart: “Why? What methods do we have?”,
“Why? Policies?”, and “Why? Talk to supplier”. A third event is also flagged CF.

Causal Factor Charting

• Used to show time relationships
– similar to a timeline
• Used to show cause-effect relationships
– logic tests used to ensure a complete understanding
of an incident
• Brings together the data collected from multiple
sources
• Goal is to identify performance gaps of equipment
and frontline personnel (causal factors)

Example Causal Factor Chart

Events (in sequence):
• Centrifuge purchased (April 2002)
• Centrifuge started and operated properly (June 21, 2004)
• OPEN LID light illuminates while centrifuge is still in motion (June 21, 2004 @ ~1405)
• Centrifuge's interlock disengages while centrifuge is still in motion (June 21, 2004 @ ~1405)
• Technician immediately reaches into centrifuge to remove sample (June 21, 2004 @ ~1405)
• Technician's hand injured from spinning centrifuge (June 21, 2004 @ ~1405)

Conditions:
• Centrifuge has a factory installed lid, lid latch, and speed interlock (NA; manufacturer's data)
• Centrifuge was towards the end of the cycle (June 21, 2004 @ ~1405)
• Centrifuge is still spinning (June 21, 2004 @ ~1405)

Open questions:
• Prior modifications? Maintenance records?
• Prior maintenance? Maintenance records?
• Any similar problems with other centrifuges of this type/model?
• Why? What is the factory setting? How does the system detect speed? (Manufacturer)
• Why? What controls the interlock? (Manufacturer)
• What could the technician see? (Technician)
• Extent of injuries? (Medical reports)

Data sources cited on the chart: purchasing records, testing of centrifuge, LT,
observation, technician, and logical conclusion. Two items are flagged CF.
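
A causal factor chart is essentially a time-ordered list of events with some of them flagged as causal factors. The sketch below shows that structure using the centrifuge example. It is a minimal sketch under stated assumptions: the `ChartEvent` class is hypothetical, the day used for "April 2002" is a placeholder, and which events carry the CF flag is assumed for illustration rather than read from the chart.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class ChartEvent:
    when: datetime
    description: str
    causal_factor: bool = False  # CF flags below are assumptions


# Entered out of order on purpose; the day in April 2002 is a placeholder.
events = [
    ChartEvent(datetime(2004, 6, 21, 14, 5),
               "OPEN LID light illuminates while centrifuge is still in motion"),
    ChartEvent(datetime(2004, 6, 21, 14, 5),
               "Centrifuge's interlock disengages while centrifuge is still in motion",
               causal_factor=True),
    ChartEvent(datetime(2002, 4, 1), "Centrifuge purchased"),
    ChartEvent(datetime(2004, 6, 21, 14, 5),
               "Technician immediately reaches into centrifuge to remove sample",
               causal_factor=True),
    ChartEvent(datetime(2004, 6, 21, 14, 5),
               "Technician's hand injured from spinning centrifuge"),
]


def timeline(chart):
    # sorted() is stable, so events with equal timestamps keep their entry order.
    return sorted(chart, key=lambda e: e.when)


def causal_factors(chart):
    return [e.description for e in timeline(chart) if e.causal_factor]


for e in timeline(events):
    flag = " [CF]" if e.causal_factor else ""
    print(f"{e.when:%Y-%m-%d %H:%M} {e.description}{flag}")
```

Because several events share the same approximate time (~1405), the order in which they are entered matters; a stable sort preserves it.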

Applicability of Analysis Techniques
to Different Types of Incidents

Type of Incident | Causal Factor Charting | Fault Tree Analysis
Acute incidents | Good | Good
Chronic incidents | Can only characterize typical incident | Good
People-oriented problems (large, acute accidents) | Best | Good
Incidents where timing is important | Best | Not very useful
Equipment and system-oriented problems (including most chronic problems) | Good | Best

Survey Question – RCA Analysis
Tools

• Which of the following types of analysis
tools do you currently use for your RCAs?
– Fault tree (or other logic tree)
– Causal factor chart (or other timeline/logic
technique)
– Other tools
– No formal tools

An Overview of the RCA Process
(cont.)

[Flowchart repeated: Overall RCA Program Management System]

What Is a Root Cause?

• For virtually every incident, some
improvement(s) in management systems
could have prevented most (or all) of the
contributing events from occurring
• The absence, neglect, or deficiencies of
management systems are fundamentally
the causes of incidents

Importance of Addressing
Root Causes

• Promotes more cost-effective solutions to
problems because the proper solutions
are implemented
• Provides a leveraged solution that prevents
or mitigates many past and potential
incidents

ABS Consulting’s Root Cause Map™

An Overview of the RCA Process
(cont.)

[Flowchart repeated: Overall RCA Program Management System]

Recommendation Characteristics

• Timing
– Short-term, medium-term, and long-term
• Recommendation levels
– Level 1: Address the performance gap (the causal
factor: human error or equipment failure)
– Level 2: Address the underlying cause of the
specific problem
– Level 3: Fix similar, existing problems (much like an
audit)
– Level 4: Correct the process that creates these
problems (the management system)
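
One way to use the four levels is as a checklist: after drafting recommendations, see which levels nothing addresses. The sketch below is illustrative only. The `Level` enum mirrors the four levels on this slide, while `missing_levels` and the sample recommendations are hypothetical.

```python
from enum import IntEnum


class Level(IntEnum):
    PERFORMANCE_GAP = 1    # address the causal factor itself
    UNDERLYING_CAUSE = 2   # address the cause of this specific problem
    SIMILAR_PROBLEMS = 3   # fix similar, existing problems
    MANAGEMENT_SYSTEM = 4  # correct the process that creates the problems


def missing_levels(recommendations):
    """Return the levels that no recommendation addresses, lowest first."""
    covered = {level for _, level in recommendations}
    return [level for level in Level if level not in covered]


# Hypothetical recommendations for a mislabeled-tank incident.
recommendations = [
    ("Relabel the affected tank", Level.PERFORMANCE_GAP),
    ("Clarify the labeling step in the filling procedure", Level.UNDERLYING_CAUSE),
]

for level in missing_levels(recommendations):
    print(f"No recommendation at Level {level.value}: {level.name}")
```

In this made-up case the check reports Levels 3 and 4 as uncovered, meaning the specific problem is fixed but similar tanks and the labeling process itself are not yet addressed.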

Recommendation Levels

[Figure: a Production System (Level 4) turns out a grid of Products. One product is marked X
(Levels 1 & 2); the rest are marked X? (Level 3).]

Recommendation Levels (cont.)

[Figure: a Management System (Level 4) turns out a grid of Products (procedures, MOC packages,
trained operators, designs). One product is marked X (Levels 1 & 2); the rest are marked X? (Level 3).]

Example Root Cause Summary Table

Documenting the Analysis

• Minimize the effort needed to develop the
report
• Use the analysis tools to provide the bulk
of the documentation
– causal factor chart/fault tree – what
happened and how it happened
– three-column form – performance gap, root
causes, and recommendations
– standard report form – basics

Questions?
James J. Rooney
ABS Consulting
Operational Performance and Risk Consulting Division
10301 Technology Drive
Knoxville, TN 37932-3392

Phone: (865) 671-5814
Fax: (865) 966-5287
jrooney@absconsulting.com