
Root Cause Analysis

Motivation, Process, Tools, and Perspectives


Summary
Root Cause Analysis (RCA) is a structured investigative process that aims to identify the true cause of a problem and the actions necessary to eliminate or mitigate it. The trigger to start an RCA can be a major accident or incident, or an overall improvement program in the areas of safety, quality, or production/maintenance. The article starts with an example of a major railway accident whose root causes needed to be investigated. A discussion of the RCA process follows, then a review of available RCA tools and the role of RCA in improvement programs. The article ends with references for further reading on this subject.
GS02003 Gerard Schram 15 pages May 2002 (Revised September 2004) SKF Reliability Systems @ptitudeXchange 5271 Viewridge Court San Diego, CA 92123 United States Tel. +1 858 496 3554 fax +1 858 496 3555 email: info@aptitudexchange.com Internet: www.aptitudexchange.com

Use of this document is governed by the terms and conditions contained in @ptitudeXchange.

Introduction
Importance of RCA
  Example: Railway Accident
RCA Process
RCA Tools/Methods
  Problem Identification/Understanding
  Possible Cause Generation and Consensus Reaching
  Problem and Cause Data Collection
  Possible Cause Analysis
  Cause-Effect Analysis
  Tool Selection
The Wider Perspective of RCA
  Role in HAZOP
  Role in TQM / Six Sigma
  Role in TPM
  Role in Asset Management
  Role in (S) RCM
  A Survey among Maintenance Professionals
The Consequences Of RCA
Commercial Methods/Software
  PROACT
  Taproot
Conclusion
References

2004 SKF Reliability Systems All Rights Reserved


Introduction
The greatest tragedy underlying errors and resultant failures is that many of them are avoidable. Yet one of the most effective concepts for improving reliability in engineering is often neglected: learning and continuous improvement from (historical) case analysis. Well-studied examples are failures in civil engineering structures, such as the collapse of various suspension bridges (for example, the Tacoma Narrows bridge, which went into an oscillating mode due to wind, 1940). Aeronautical and aerospace failures are also the subject of much attention, especially in the mass media. Nuclear and chemical engineering incidents can have major impacts too. Mechanical engineering failures generally result in somewhat less life-threatening situations, but can cause massive recall campaigns and product liability suits. It is obvious, then, that recognizing and understanding failure (or a near failure) plays a key role in error-free design and operation. This understanding is necessary to eliminate the same causes and effects in the future.

Apart from physical failures, safety incidents, quality defects, customer complaints, etc., can be the reason for a thorough investigation into their causes. In general, we can state that a problem is a deviation from what is defined as normal, with a negative impact. A problem is not always recognized (it can be perceived as normal). However, with an open-minded team and/or internal or external benchmarking, problems can be identified. Problem solving consists of identifying causes, finding ways to eliminate them, and preventing them from recurring. In other words, identifying the cause(s) is often half the answer.

NASA defines so-called "direct" or "proximate" causes as: The event(s) that occurred, including any condition(s) that existed immediately before the undesired outcome, directly resulted in its occurrence and, if eliminated or modified, would have prevented the undesired outcome. Regarding an "undesired outcome", NASA provides examples such as: failure, anomaly, schedule delay, broken equipment, product defect, problem, close call, mishap, etc.

As the definition of root cause, NASA states: One of multiple factors (events, conditions or organizational factors) that contributed to or created the proximate cause and subsequent undesired outcome and, if eliminated or modified, would have prevented the undesired outcome. Typically multiple root causes contribute to an undesired outcome.

NASA defines Root Cause Analysis (RCA) as: A structured evaluation method that identifies the root causes for an undesired outcome and the actions adequate to prevent recurrence.

The American Society for Quality (ASQ) defines Root Cause Analysis (RCA) as: RCA is a structured investigation that aims to identify the true cause of a problem, and the actions necessary to eliminate it.

In fact, RCA is a collective term used to describe a wide range of approaches, tools, and techniques used to uncover and model the causes of problems. RCA is a method that helps professionals determine what happened, how it happened, and why it happened. It allows learning from past problems, failures, and accidents. RCA can be applied to any organizational, production, or administrative (etc.) problem.

A problem is often the result of multiple causes at different levels. The root cause is "the evil at the bottom" that sets in motion the cause-and-effect chain and creates the problem.

There exist slightly different terms, including Failure Analysis (FA) and Root Cause Failure Analysis (RCFA). Failure Analysis refers to the observation, categorization, and possibly documentation of a failure. As such, it does not necessarily intend to find the root causes that resulted in that failure (how it failed). Root Cause Failure Analysis includes the investigation of root causes, but is somewhat limited by the term "failure." The term failure is biased towards physical failures, while root cause analysis is applicable to many more situations, such as safety incidents, quality problems, etc. Finally, Failure Mode and Effects Analysis (FMEA) is a more hypothetical analysis to determine how a component or process could fail (failure modes), including the associated risks and consequences. FMEA can be considered a proactive way to avoid problems that have not occurred before. RCA, on the other hand, is generally initiated when an unplanned problem occurs, and then focuses on preventing recurrence. The effect of the preventive actions on risks and consequences is generally not taken into account.

Importance of RCA
Why perform an RCA? The answer seems obvious if the gains from eliminating the problem and its consequences are larger than the effort put into the RCA. However, although eliminating the risk of recurrence of similar situations looks admirable, RCA could be perceived as the "program of the month." Moreover, a maintenance person may be recognized for resolving emergencies when they occur, while RCA aims to eliminate root causes and thereby reduces that person's responsibilities. Therefore, it is extremely important to align everyone in the same direction, both at management level and among production and maintenance personnel. Creating the right, open environment for learning from failures is essential [Latino, 2001].

Example: Railway Accident
A real example shows how small root causes can lead to serious damage. This example originates from SKF Belgium. A goods train traveled from Antwerp harbor to a factory in France. After 30 km, the train passed a station where the temperature of the axle boxes is measured to detect possible hot boxes. Everything was normal. 35 km further, the train derailed. Eight wagons were destroyed, and damage was done to the rails and overhead electrical cabling. Goods traffic was stopped for several hours. The accident happened in Belgium, the goods were French owned, and the railway wagons were the property of the German State Railways. The wagon in question had been overhauled just before the accident. (By international agreement, the Belgian Railways paid damages: > US $1,000,000.)

Figure 1. Relevant Locations within Belgium. (Labels: Starting point, Hot box control, derailment, 50 km.)

The remains of the failed axle box, equipped with two spherical roller bearings SKF 229750 J/C3R505 (Y 25 bogie 20-ton axle load design) are shown in Figure 2. We are looking for the root cause, as we want to eliminate this problem forever!


Figure 4. The Axle Box as Part of the Bogie.

Figure 2. Remains of the Axle Box Bearings.

The wagons were equipped with Y25 bogies, with axle boxes with double spring suspension. Maximum authorized axle load is 20 tons. The axle boxes incorporated spherical roller bearings SKF 229750 J/C3R505.

Figure 5. Technical Drawing of the Axle Box with Two Spherical Roller Bearings and the Spacer Ring.

Figure 3. The Wagons.

In the analysis of root causes, one can clearly see that this was more than a hot runner. The inside bearing was completely deformed from red-hot running. In fact, there are clues that indicate what happened:

There is a gap between the (inside bearing) outer ring and the labyrinth seal.
The inside bearing moved towards the outside. In principle, this should not be possible. For a 20 ton/axle arrangement, the distance ring on the axle between the bearings is 35 mm wide and fixes the precise bearing location.
The width of the distance ring (called the spacer ring) was 14 mm.
In fact, there are two different executions of this axle box: a 20 ton/axle payload axle box with a 35 mm spacer between the bearings, and a 22.5 ton/axle payload version, a similar but slightly narrower axle box with a 14 mm spacer between the bearings.
Somehow, the maintenance personnel installed the wrong spacer ring.
The bearing assembly was allowed to slide to the outside, which resulted in heavier axle load, more axle bending, material fatigue, and final collapse.
The bearing was running at more than red hot, and was completely deformed.

The train derailed just for a spacer!

This example shows the necessity of finding the root causes of a problem with the goal of preventing it from recurring. Human mistakes or erroneous procedures can be the root cause, but we should acknowledge the errors and learn from the mistakes.

RCA Process
The following steps are generally found in an RCA procedure:

Problem Identification: The problem should be recognized and assigned a name. If a problem is perceived as normal, it never improves. In the case of engineering constructions, the problem can be identified by symptom analysis and equipment inspections. In general, internal or external benchmarking can also identify problems (or opportunities).
Problem Understanding: It is necessary to understand the nature, or essential failure modes, of the problem.
Root Cause Identification: Find the correct root cause(s). This includes brainstorming and investigating possible root causes and cause-effect relationships.
Root Cause Elimination: Eliminate the root cause(s) to prevent the problem from recurring.
Symptom Monitoring: Monitor symptoms to show the presence or elimination of the problem. Perform regular performance checks.

Generally, a team performs the RCA process. As stated before, it is essential to create the right environment for an open, trustful approach. The following roles are distinguished within a manufacturing plant (2001):

Executives: Put a stamp of approval on RCA, including expectations and time lines. They should be fully educated in RCA.
RCA Champions: Administer, support, and ensure the RCA effort from a management standpoint. They should be a mentor to the drivers and analysts, and should have the authority to protect persons in case of politically sensitive facts. They set performance expectations.
RCA Drivers: Team leaders who organize all details. The team meets, analyzes, hypothesizes, verifies, and draws factual conclusions. They develop recommendations to eliminate root causes.

A structured RCA effort is intended to be a proactive task, so it should reside under the control of a reliability department. In the absence of such a department, RCA should be controlled by operations or engineering. The RCA effort should not be placed under the control of a reactive maintenance department, as its role is to respond to day-to-day activities in the field.

RCA Tools/Methods
The American Society for Quality distinguishes tools and methods by their specific purposes (2000):

Problem identification/understanding
Possible cause generation and consensus reaching
Problem and cause data collection
Possible cause analysis
Cause-and-effect analysis

We briefly mention the various techniques. Please refer to detailed publications, such as the original work of Ishikawa of the Asian Productivity Organization.

Problem Identification/Understanding
Problem identification and understanding includes tools to identify and gain a solid understanding of the problem.

Flowcharts: Many problems are connected to business or work processes. A process flowchart is an appropriate first step to illustrate where problems occur, and to provide an understanding of the processes that contain or influence problems.
Critical Incident: A method to explore the most critical issues in a situation. A group of people from different departments or functional areas is asked about the most critical incidents. The answers are collected, sorted, and analyzed based on frequency. The most critical ones are the starting point for RCA.
Spider Chart: The spider chart gives a graphical impression of how the performance of (business) processes compares with other organizations or departments (benchmarking). It compares and determines which problems are most critical from an external viewpoint.
Performance Matrix: Used to illustrate the performance and importance of problems and causes. Only problems and causes with high importance and high performance impact are selected.

Possible Cause Generation and Consensus Reaching
The following section covers idea-generating tools to determine possible problem causes, and tools to reach an agreement in case of disputes or different views.

Brainstorming: Generic process of generating a list of problem areas, consequences, causes, and ways to eliminate them. It can be structured or unstructured.
Brain Writing: Similar to brainstorming, brain writing uses written cards or a gallery of white boards or flip charts. It is preferred when the problem is complex, when dominant participants would otherwise steer the discussion, or when anonymity is desired.
Nominal Group Technique: A kind of brainstorming in which all participants have the same vote when selecting solutions / causes. Ideas are first generated, and then participants rank them individually. By totaling the points, a consensus is reached.
Paired Comparisons: Instead of comparing ideas all at once, they are compared pair-wise to reach a consensus.

Problem and Cause Data Collection
Here we include tools and techniques to collect reliable root cause analysis data.

Sampling: Sampling draws conclusions about a larger group based on a smaller sample. A minimum understanding of statistics is required to perform reliable sampling.
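The point-totaling step of the Nominal Group Technique described earlier can be illustrated with a short sketch. The function name, the candidate causes, and the point values below are hypothetical and chosen only for illustration; the article does not prescribe a scoring scheme, so "higher points means more important" is an assumption.

```python
from collections import defaultdict


def nominal_group_ranking(rankings):
    """Total the points each idea receives across all participants.

    rankings: list of dicts, one per participant, mapping idea -> points.
    Assumption: higher points mean the participant finds the idea more important.
    Returns (idea, total) pairs sorted by descending total.
    """
    totals = defaultdict(int)
    for participant in rankings:
        for idea, points in participant.items():
            totals[idea] += points
    # Ties are broken alphabetically so the result is deterministic.
    return sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical individual rankings from three participants.
votes = [
    {"wrong spacer": 3, "poor lubrication": 2, "overload": 1},
    {"wrong spacer": 2, "poor lubrication": 3, "overload": 1},
    {"wrong spacer": 3, "poor lubrication": 1, "overload": 2},
]

consensus = nominal_group_ranking(votes)
print(consensus)
```

Every participant contributes one ranking of equal weight, which mirrors the "same vote" property of the technique.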


Surveys: Used to collect data about attitudes, feelings, or opinions, such as customer satisfaction, needs, and/or expectations.
Check Sheets: A check sheet is a table used to systematically register data.

Table 1. Example of a Simple Check Sheet. (Columns: Cause of Machine Trouble, Jan, Feb, Totals per cause. Rows: unbalance, misalignment, bearings, with occurrences recorded as tally marks per month; totals per cause: 3, 4, and 2.)

A Computerized Maintenance Management System (CMMS) is another good source of data, provided that data entry is done properly. For example, statistics may be derived on breakdowns and possible causes. Again, a representative set of data should be present. Like the CMMS, other documentation on health/safety/environmental (HSE) accidents and incidents can be a valuable data source. Possibly, extra fields can be added to these systems to better trigger and track problems. Relevant data may also be found in general databases with reliability data (often referred to as RAM data). A few example databases:

OREDA for Offshore Reliability Data, with turbines, compressors, etc.: http://www.oreda.com
Process Equipment Reliability Database (PERD) of the American Institute of Chemical Engineers: http://www.AIChe.org

Possible Cause Analysis
Possible cause analysis covers techniques for analyzing the impact of different causes.

Histogram: A bar chart used to visualize the distribution and variation of a data set. The diagram helps to identify patterns or anomalies. The frequency of occurrence is depicted vertically, while the classes are ordered along the horizontal axis.

Figure 6. Histogram Example. (Vertical axis: frequency, 0 to 25; horizontal axis: shutdown, in classes <1hr, 1-4hr, 5-8hr, 8-24hr.)
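As a minimal sketch of how a histogram like the one described above might be built, the following code bins shutdown durations into the four classes named in the figure and prints a text bar chart. The duration values are hypothetical, and the exact class boundaries (for example, where a 4.5-hour shutdown falls) are an assumption, since the figure only gives the class labels.

```python
from collections import Counter

# Class labels as they appear on the horizontal axis of the histogram figure.
LABELS = ["<1hr", "1-4hr", "5-8hr", "8-24hr"]


def classify(hours):
    """Map a shutdown duration in hours to a class label.

    Assumption: upper bounds of 4, 8, and 24 hours are inclusive.
    """
    if hours < 1:
        return "<1hr"
    if hours <= 4:
        return "1-4hr"
    if hours <= 8:
        return "5-8hr"
    if hours <= 24:
        return "8-24hr"
    raise ValueError("duration outside the classes shown")


def histogram(durations):
    """Count occurrences per class, keeping the classes in axis order."""
    counts = Counter(classify(h) for h in durations)
    return {label: counts.get(label, 0) for label in LABELS}


def render(hist):
    """Draw one row per class; the bar length equals the frequency."""
    return "\n".join(f"{label:>7} | {'#' * n} {n}" for label, n in hist.items())


# Hypothetical shutdown durations in hours.
durations = [0.5, 0.2, 2, 3, 3.5, 6, 7, 12, 0.8, 1.5]
hist = histogram(durations)
print(render(hist))
```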

Pareto Chart: The Pareto principle states that most effects, often 80 percent, are the result of a small number of causes, often 20 percent. The main purpose of the chart is to show the causes sorted by the degree of seriousness, expressed as the frequency of occurrence, cost, performance, etc. It shows which causes need further attention. Figure 7 is a simple example, in which two causes cover 80% of the problem.
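The sorting and cumulative-percentage logic behind a Pareto chart can be sketched as follows. The cause names and counts are hypothetical, chosen so that two causes cover 80% of the occurrences; the 80% threshold is a parameter rather than a fixed rule.

```python
def pareto(counts):
    """Return (cause, count, cumulative_percent) rows, largest count first."""
    total = sum(counts.values())
    rows, running = [], 0
    for cause, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        running += n
        rows.append((cause, n, 100.0 * running / total))
    return rows


def vital_few(rows, threshold=80.0):
    """Smallest leading set of causes whose cumulative share reaches threshold."""
    selected = []
    for cause, _, cumulative in rows:
        selected.append(cause)
        if cumulative >= threshold:
            break
    return selected


# Hypothetical frequency of occurrence per cause.
counts = {"cause 1": 20, "cause 2": 12, "cause 3": 4, "cause 4": 3, "cause 5": 1}

rows = pareto(counts)
for cause, n, cumulative in rows:
    print(f"{cause}: {n:2d}  cumulative {cumulative:5.1f}%")
print(vital_few(rows))
```

The same functions work when the degree of seriousness is expressed as cost or lost performance instead of frequency; only the values in `counts` change.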

Figure 7. Pareto Chart Example. (Axes: Frequency and Cumulative %, 0% to 100%; bars for cause 1 through cause 5.)

Scatter Charts: Illustrate relationships between two causes or other variables in a problem situation. This is achieved by plotting at least 30 data pairs in one figure. Logarithmic axes may also be used. The data may be generated by experiments in which variables are changed and the effects are plotted.

Figure 8. Scatter Chart Example. (Axes: Paper thickness and "knob A".)

Relations Diagram: A tool to identify logical relationships between different ideas or issues in a complex or confusing situation. The factors under investigation are distributed in an empty chart area, and arrows illustrate the relationships between them.

Affinity Diagram: A chart approach that helps group seemingly unrelated ideas, causes, or other concepts so they can be explored collectively. It is a way to handle and brainstorm about causes qualitatively rather than quantitatively.

Cause-Effect Analysis
The last stage is the cause-effect analysis. A few tools are mentioned here.

Cause-Effect Chart: This is a well-known technique used to relate possible causes to a problem. It is also called the Ishikawa diagram or fishbone diagram. After completing the cause-effect diagram, examples / facts can also be entered. These illustrate the relationships, and provide an idea about their strength. The cause-effect diagram shows that multiple causes can result in the same problem. The diagram can be used as a discussion aid to determine which causes are considered the primary (root) causes of the actual problem. If enough data is available, a probabilistic approach could yield the most likely root causes.

Figure 9. Cause-Effect Diagram (Fishbone).


Fault Trees: Another visual way to represent cause-effect relationships. The fault tree starts with the faults / problems. Causes, possibly in several layers, are then depicted with arrows indicating the relationships.

Matrix Diagram: A visual technique for arranging possible causes by their contribution to the problem. Problem characteristics are ordered vertically, and possible causes horizontally. The contributions of each cause to the problem characteristics are depicted in the matrix. By accumulating the individual contributions, you get an idea of which causes are most significant. It is also sometimes referred to as a cause-effect matrix.

Five Whys: The main purpose is to keep asking "why" each time a cause is identified. Each cause is examined to determine whether it is a symptom, a lower-level cause, or a root cause. The chains of causes can be drawn in a simple chart. The rule of thumb is that the method often requires five rounds of the question "why."

Advanced Tools: There are various other ways to model cause-effect relations based on (statistical) correlations or regression techniques. However, they fall outside the scope of this introductory article on RCA. Other advanced techniques stem from artificial intelligence, such as artificial neural networks, fuzzy models, logical decision trees, and other network representations. The cause-effect networks are used to reason forward or backward. The network, together with reasoning capacity, forms a so-called expert system, or knowledge-based system. These tools can be tuned by both "data" and "heuristics." For example, the Bayesian network is used to model cause-effect relations, where the strength of each relationship is modeled as a probability. SKF applies the Bayesian network to support bearing failure or damage investigations.
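The Five Whys chain described above can be represented as a simple lookup from each cause to the answer given when asking "why" once more. The sketch below uses the railway accident from this article as the example; the wording of each step is a paraphrase made for illustration, and the dictionary and function names are hypothetical.

```python
# Each key is an observed effect; its value is the answer to "why?".
# The steps paraphrase the railway example in this article.
WHY = {
    "train derailed": "axle collapsed",
    "axle collapsed": "material fatigue from extra axle bending",
    "material fatigue from extra axle bending": "bearing assembly slid towards the outside",
    "bearing assembly slid towards the outside": "spacer ring was 14 mm instead of 35 mm",
    "spacer ring was 14 mm instead of 35 mm": "maintenance personnel installed the wrong spacer ring",
}


def five_whys(problem, why, max_rounds=10):
    """Follow the chain of causes until no deeper cause is recorded.

    max_rounds guards against an accidental cycle in the lookup table.
    """
    chain = [problem]
    while chain[-1] in why and len(chain) <= max_rounds:
        chain.append(why[chain[-1]])
    return chain


chain = five_whys("train derailed", WHY)
for depth, cause in enumerate(chain):
    print("  " * depth + cause)
print("rounds of 'why':", len(chain) - 1)
```

The last element of the chain is only a candidate root cause: each step still needs to be verified against facts, as the article stresses.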

Figure 10. A Bayesian Network Used to Model Relations Between Causes and Effects. The Arrows Denote Relationships, While Numbers and Red Bars Denote Probability of Occurrence.
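To give a feel for the probabilistic reasoning involved, the sketch below applies Bayes' rule to a single layer of causes and one observed symptom. This is a deliberate simplification of a Bayesian network and is not SKF's model: the cause names and all probabilities are hypothetical, and the causes are assumed to be mutually exclusive and exhaustive.

```python
def posterior(priors, likelihoods):
    """Bayes' rule: P(cause | symptom) for every candidate cause.

    priors:      P(cause), assumed to sum to 1 over exclusive causes.
    likelihoods: P(symptom | cause) for the observed symptom.
    """
    joint = {cause: priors[cause] * likelihoods[cause] for cause in priors}
    evidence = sum(joint.values())  # P(symptom)
    return {cause: p / evidence for cause, p in joint.items()}


# Hypothetical numbers for a bearing damage investigation.
priors = {"lubrication": 0.5, "contamination": 0.3, "misalignment": 0.2}
likelihoods = {"lubrication": 0.2, "contamination": 0.1, "misalignment": 0.9}

post = posterior(priors, likelihoods)
most_likely = max(post, key=post.get)
for cause, p in post.items():
    print(f"{cause}: {p:.3f}")
print("most likely cause:", most_likely)
```

Although misalignment has the lowest prior here, it becomes the most likely cause once the symptom is observed, which illustrates reasoning backward from effect to cause.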


Tool Selection
These tools and methods are aids to reach the goal, rather than the solution itself. In the general RCA process, the tools support the problem understanding and root cause identification steps. The American Society for Quality further outlines the particular strengths and weaknesses of the tools (2000). In general, the selection is very situation dependent. Doggett (2004), after investigating three RCA tools (Cause-and-Effect Diagrams, Interrelationship Diagrams, and Current Reality Trees), concludes that none of the tools was perceived as significantly better in terms of finding root causes. On the other hand, the complexity of the tools varies, and with it the training requirements.

The Wider Perspective of RCA
Root cause analysis can be used after a major incident or accident like the railway problem outlined earlier. However, RCA can also be part of a bigger improvement program, such as a safety, quality, or maintenance improvement program. RCA identifies problems (opportunities to improve) and finds root causes.

Role in HAZOP
A Hazard and Operability (HAZOP) study is a methodical review of a defined operation system to identify potential hazards and operability problems. It identifies and defines process and design deficiencies, the potential for and consequences of human and organizational error, accidents from neighboring plants or activities, natural occurrences and catastrophes, and the possibilities of equipment component failures. As such, many RCA tools and methods can play a role in a HAZOP study.

Role in TQM / Six Sigma
Total Quality Management (TQM) and Six Sigma stand for a stream of programs aimed at tackling major causes of quality defects. We can state that RCA originates from quality improvement philosophies, and many RCA tools / methods are present in TQM and Six Sigma. Some RCA tools can be embedded in a plant's quality procedures, as one main goal is achieving a continuous process of quality improvement. For example, critical incident investigations, performance spider charts, etc., can be done on a regular basis.

Role in TPM
Total Productive Maintenance (TPM) stands for an improvement program that covers both production and maintenance functions. It is founded on the concept of ownership and complete integration of the production and maintenance functions. The prime driver for TPM is the concept of Overall Equipment Effectiveness (OEE). The philosophy hinges on making equipment effectiveness the concern of everyone in the organization. OEE requires strict attention to the measurement and quantification of losses. When identifying big losses and their root causes, RCA tools play a useful role. As such, RCA tools can be part of a TPM program.

Role in Asset Management
Asset Management (AM) tries to attain the lowest life cycle cost with maximum availability, performance efficiency, and quality (maximum OEE). In other words, AM is the systematic planning and control of a physical asset throughout its life. An outcome of AM is the definition of which specific maintenance practices need to be undertaken, while considering the optimum means of implementing them. This is where RCA tools can again play a useful role.
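The TPM and Asset Management discussions refer to OEE in terms of availability, performance efficiency, and quality. The article does not give a formula; the sketch below uses the commonly cited definition, OEE = availability x performance x quality, with each factor expressed as a fraction between 0 and 1. The function names and the input values are hypothetical.

```python
def oee(availability, performance, quality):
    """Overall Equipment Effectiveness as the product of its three factors.

    Assumption: the widely used definition, not one stated in this article.
    """
    for name, value in (
        ("availability", availability),
        ("performance", performance),
        ("quality", quality),
    ):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    return availability * performance * quality


def biggest_loss(availability, performance, quality):
    """Name the factor with the lowest value, a first candidate for RCA."""
    factors = {
        "availability": availability,
        "performance": performance,
        "quality": quality,
    }
    return min(factors, key=factors.get)


# Hypothetical plant figures.
value = oee(0.90, 0.95, 0.99)
print(f"OEE = {value:.3f}")
print("largest loss in:", biggest_loss(0.90, 0.95, 0.99))
```

Quantifying the losses per factor in this way points to where an RCA effort is likely to pay off first.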


Role in (S) RCM
Reliability Centered Maintenance (RCM) and SRCM are structured processes to proactively identify the equipment modifications and/or safety devices required to avoid any significant consequence of equipment failure. Consequences can be operational loss, safety, health, or environmental. In an RCM study, all of the potential modes of failure are uncovered, and a maintenance strategy is devised to mitigate the consequences of the failure based on the criticality of the failure mode. In RCM, these failure modes are identified as the root cause(s) of the failure. This is where the main difference lies. The purpose of RCA is to uncover the underlying reasons (root causes) why an event (not just equipment-related events, but any type of event) is occurring, so that the necessary steps can be taken to eliminate the event in its entirety. This is accomplished by analyzing the modes (the point at which RCM stops). RCA uses, for example, a logic tree that stresses verification at every level. The advantage is that the root causes that are uncovered are facts derived from the verification process. RCM is driven by deriving a maintenance strategy, while RCA is driven by maintenance prevention.

Within RCM, FMEA stands as the central vehicle; however, the RCA tools and methods can be of additional help when performing FMEA where deeper investigation of the failure modes is needed. Secondly, RCA is to be used in the periodic updating of the maintenance strategy derived from RCM, so that a continuous improvement of the maintenance strategy is achieved.

A Survey among Maintenance Professionals
A survey of the use of RCA techniques by maintenance professionals was conducted on the Plant Maintenance Resource Center in 2000. See the results at: http://www.plantmaintenance.com/articles/rca-survey-01.shtml

The key findings are:

59% of respondents indicated that they use some form of RCA process.
Of those who indicated that they used some form of RCA, 79% indicated that they used formal, structured processes.
Those using formal processes considered the overall effectiveness of their approach to be significantly better than did those using informal processes.
Supervisory and technical staff are more likely to be involved in RCA than shop floor personnel.
The greatest benefits appear to be in the area of improved equipment availability and reliability.
60% of respondents indicated that they used external consultants to assist with their RCA implementation.
55% of respondents indicated that they used software to assist with their RCA process.

The survey shows that RCA is quite widespread among maintenance functions, and that a structured process is key to making RCA effective.

The Consequences Of RCA


To prevent the problem from recurring, the root cause(s) should be eliminated. The results of the root cause investigation and the necessary actions are considered the outcome of RCA. It is essential to know the cause-effect relationships to prevent problems from recurring. The assessment of these actions is generally not addressed within the RCA context. This is typically the second part of an FMEA process, whereby possible actions are assessed by their effect, in terms of the decrease in risk or consequence. It is worthwhile to consider this approach when assessing alternative actions. @ptitudeXchange provides articles on FMEA for further reading.

In order to arrive at continuous improvement, RCA needs to be embedded into the normal work processes. As an example, within the SKF concept of Proactive Reliability Maintenance™ (PRM), an improvement loop is defined (Figure 11). Starting with an operational review, a predictive maintenance program is set up. Where critical anomalies are detected, RCA is applied, providing corrective actions to prevent the anomalies from occurring again. The process is monitored by means of a number of key performance indicators.
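The FMEA-style assessment of alternative actions mentioned above can be sketched with a Risk Priority Number (RPN). The RPN, the product of severity, occurrence, and detection scores, is a common FMEA convention and is not defined in this article; the 1 to 10 scales, the candidate actions, and all scores below are hypothetical.

```python
def rpn(severity, occurrence, detection):
    """Risk Priority Number: product of three scores (assumed 1-10 scales)."""
    return severity * occurrence * detection


def rank_actions(baseline, actions):
    """Rank candidate actions by how much they reduce the baseline RPN.

    baseline: (severity, occurrence, detection) before any action.
    actions:  mapping of action name -> scores expected after the action.
    """
    base = rpn(*baseline)
    reductions = {name: base - rpn(*scores) for name, scores in actions.items()}
    return sorted(reductions.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical scores for the "wrong spacer ring installed" failure cause.
baseline = (9, 4, 6)
actions = {
    "colour-code spacer rings": (9, 2, 6),           # lowers occurrence
    "add spacer width check to checklist": (9, 4, 2),  # improves detection
}

ranking = rank_actions(baseline, actions)
print("baseline RPN:", rpn(*baseline))
for name, reduction in ranking:
    print(f"{name}: RPN reduced by {reduction}")
```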

Figure 11. Proactive Reliability Maintenance™.

These types of work processes generally require adjustments to the standard job plans. For example, anomalies detected during predictive maintenance should feed/start RCA procedures. RCA results have to be documented extensively (see e.g. Reed, 2003), and recorded appropriately in the CMMS to keep a good machinery history. Corrective work (e.g., cleaning, repair) or adjustments in maintenance strategy (e.g., preventive vs. predictive) need to be planned and scheduled.

In case of large changes, a change management project may follow RCA. For example, when changing organizational structure or major responsibilities, a structured management of change is needed (Schram & Yolton, 2004).

Commercial Methods/Software
Just two of the many tools are mentioned here. Most commercial tools allow cause-failure trees to be built or searched, and then visualized. It should again be emphasized that RCA is more a process than a tool: the tool supports the structuring of the process.

PROACT
Reliability Center Inc. offers a method called PROACT, accompanied by a software tool. PROACT stands for:
PReserving event data
Ordering the analysis team
Analyzing event data
Communicating findings and recommendations
Tracking for bottom line results

The method is clear, and a great deal of attention is paid to human and organizational errors. Many other software tools focus only on (modeling) the mechanical issues. More information can be found at: http://www.reliability.com/

Taproot
System Improvements Inc. offers a software suite called TapRoot. The suite includes Root Cause Tree software, which provides the investigator with a fairly comprehensive list of causes that should be considered for any incident. Each causal factor that contributed to the incident should be analyzed one at a time. A dictionary provides explanations and definitions of each part of the root cause tree. This yields consistent, non-overlapping root causes that can be trended in a database. The suite also includes a checklist that ensures consideration of the most frequently occurring human performance contributors to an incident, which helps narrow down the seven basic cause categories. It also helps keep the investigator's mind open and focused. A second software tool, Equifactor, was built on Heinz Bloch's equipment troubleshooting techniques. These techniques include:
Equipment Troubleshooting Tables
Component Troubleshooting Tables
FRETT Analysis
Equipment 7 Cause Categories

More information can be found at: http://www.taproot.com/

Summary: PROACT is a process with an empty, supportive tool, while TapRoot is a step-by-step search in a database with tables and trees.
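The idea of a fixed cause tree whose consistent, non-overlapping root causes can be trended in a database can be sketched in a few lines. The tree below is a hypothetical three-category example, not the actual TapRoot tree or its seven cause categories, and the incident identifiers and causes are invented for illustration.

```python
from collections import Counter

# Hypothetical two-level cause tree: category -> allowed root causes.
# A fixed vocabulary keeps root causes consistent and non-overlapping.
CAUSE_TREE = {
    "procedures": {"procedure not used", "procedure wrong"},
    "training": {"no training", "training incomplete"},
    "equipment": {"design flaw", "inadequate maintenance"},
}


def category_of(root_cause: str) -> str:
    """Look up the category of a root cause; reject causes outside the tree."""
    for category, causes in CAUSE_TREE.items():
        if root_cause in causes:
            return category
    raise ValueError(f"unknown root cause: {root_cause!r}")


def trend(incidents: dict[str, list[str]]) -> Counter:
    """Count how often each category appears across all causal factors."""
    return Counter(
        category_of(cause)
        for causes in incidents.values()
        for cause in causes
    )


# Each incident lists the root cause found for each of its causal factors.
incidents = {
    "INC-001": ["procedure not used", "no training"],
    "INC-002": ["inadequate maintenance"],
    "INC-003": ["procedure wrong", "inadequate maintenance"],
}
counts = trend(incidents)
print(counts.most_common())
```

Rejecting causes that are not in the tree is a deliberate choice here: free-text causes would make the counts incomparable between investigators.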

Conclusion
Root Cause Analysis (RCA) is a structured investigation that aims to identify the true cause of a problem, the cause-effect relationships, and the actions necessary to eliminate it. The trigger to start an RCA can be a major accident or incident, or an overall improvement program in the areas of safety, quality, or production/maintenance. The RCA process starts with problem identification and understanding. The outcomes of RCA are recommendations for change and monitoring to keep the problem from recurring. Several tools and methods exist that can support the RCA process.

Acknowledgements
The author would like to thank Wayne Reed for his contributions to this paper.

References
Petroski, H. Design Paradigms - Case Histories of Error and Judgment in Engineering. Cambridge University Press, United Kingdom: 1994.
Ishikawa, K. Guide to Quality Control. Asian Productivity Organization: 1982.
Magnusson, K., Kroslid, D., Bergman, B. Six Sigma - The Pragmatic Approach. Studentlitteratur, Lund: 2000.
Burr, J.T. "The Tools of Quality, Part I: Going with the Flow", Quality Progress, June 1990.
Sarazan, S. "The Tools of Quality, Part II: Cause-and-effect Diagrams", Quality Progress, July 1990.
Shaldin, P.D. "The Tools of Quality, Part III: Control Charts", Quality Progress, August 1990.
Juran Institute. "The Tools of Quality, Part IV: Histograms", Quality Progress, September 1990.
Juran Institute. "The Tools of Quality, Part V: Check Sheets", Quality Progress, October 1990.
Burr, J.T. "The Tools of Quality, Part VI: Pareto Charts", Quality Progress, November 1990.
Burr, J.T. "The Tools of Quality, Part VII: Scatter Diagrams", Quality Progress, December 1990.
Anderson, B., Fagerhaus, T. Root Cause Analysis - Tools and Techniques. American Society for Quality (ASQ), Quality Press, Milwaukee, Wisconsin: 2000.
NASA. Root Cause Analysis Overview, July 2003. Office of Safety & Mission Assurance, Chief Engineers Office.
Doggett, A.M. A Statistical Comparison of Three Root Cause Analysis Tools. Journal of Industrial Technology, Vol. 20(2), 2004.
Latino, R.J. "Creating the environment for RCA to succeed", Maintenance Technology Magazine, April 2001.
Latino, K.C. "Fighting Failure", Maintenance Technology Magazine, December 2001.
Latino, R.J., Latino, K.C. Root Cause Analysis - Improving Performance for Bottom Line Results. CRC Press: 1999. (Second edition 2002.)
Clements-Jewery, K. "Structure the machine failure information", In: Asset Maintenance Management, Wilson, A. (Ed.). Conference Communication, Monks Hill, UK: 1999.
Goodacre, J. "Identifying current needs using root cause analysis", Maintenance & Asset Management, Vol. 16(6), 18-23, 2001.
Bloch, H.P., Geitner, F.K. Machinery Failure Analysis & Troubleshooting (3rd Edition). Gulf Professional Publishing: 1997.
Paradies, M., Unger, L. Taproot: The System for Root Cause Analysis, Problem Investigation & Proactive Improvement. System Improvements Inc.: 2000.
Schram, G., van der Vorst, B. Decision Support System for Bearing Failure Mode Analysis. EVOL03_no1_p25. http://www.aptitudeXchange.com
System Improvements Inc. Equipment Troubleshooting. SI04003. http://www.aptitudeXchange.com
Barratt, M. Proactive Maintenance. MB02028. http://www.aptitudeXchange.com
Reed, W. RCFA Report Template. GS03008. http://www.aptitudeXchange.com
Schram, G., Yolton, J. Change Management. GS04015. http://www.aptitudeXchange.com
