

Reliability Engineering and System Safety 95 (2010) 87–98


Development and evaluation of a computer-aided system for analyzing human error in railway operations
Dong San Kim a, Dong Hyun Baek b,*, Wan Chul Yoon c
a Department of Industrial & Systems Engineering, Korea Advanced Institute of Science and Technology, 373-1 Guseong-dong, Yuseong-gu, Daejeon 305-701, South Korea
b Department of Business Administration, Hanyang University, 1271 Sa-1 dong, Sangnok-ku, Ansan, Kyeonggi-do 426-791, South Korea
c Department of Knowledge Service Engineering, Korea Advanced Institute of Science and Technology, 373-1 Guseong-dong, Yuseong-gu, Daejeon 305-701, South Korea
* Corresponding author. Tel.: +82 31 400 5636; fax: +82 31 400 5591. E-mail addresses: kimdongsan@gmail.com (D.S. Kim), estarbaek@hanyang.ac.kr (D.H. Baek), wcyoon@kaist.ac.kr (W.C. Yoon).

ARTICLE INFO

Article history:
Received 21 April 2009
Received in revised form 4 August 2009
Accepted 22 August 2009
Available online 29 August 2009

Keywords:
Human error analysis
Computer-aided analysis
Accident analysis
Accident model
Railway accident

ABSTRACT

As human error has been recognized as one of the major contributors to accidents in safety-critical systems, there has been a strong need for techniques that can analyze human error effectively. Although many techniques have been developed so far, much room for improvement remains. Because human error analysis is a cognitively demanding and time-consuming task, it is particularly necessary to develop a computerized system supporting this task. This paper presents a computer-aided system for analyzing human error in railway operations, called Computer-Aided System for Human Error Analysis and Reduction (CAS-HEAR). It helps analysts find multiple levels of error causes and their causal relations by using predefined links between contextual factors and causal factors as well as links between causal factors. In addition, it is based on a complete accident model; hence, it helps analysts conduct a thorough analysis without missing any important part of human error analysis. A prototype of CAS-HEAR was evaluated by nine field investigators from six railway organizations in Korea. Its overall usefulness in human error analysis was confirmed, although the development of a simplified version and some modification of the contextual factors and causal factors are required in order to ensure its practical use.

Crown Copyright © 2009 Published by Elsevier Ltd. All rights reserved.
doi:10.1016/j.ress.2009.08.005

1. Introduction

With the development of technology, system reliability has increased dramatically during the past decades, while human reliability has remained unchanged over the same period. Accordingly, human error is now considered the most significant source of accidents or incidents in safety-critical systems. According to statistics on railway accidents in Korea from 1998 to 2007 [1], 68% of the train accidents involving collisions, derailments, and fires were attributed to human error. In addition, 92% of level crossing accidents were caused by human error, most of which were violations by car drivers. In the US railroad industry as well, train accidents related to human factors make up a significant proportion of all train accidents [2]. The trend is similar in other industries such as aviation and chemical processing. Therefore, there is a definite need for techniques that can help identify the causes of human error and derive effective countermeasures to prevent or reduce future recurrences. In safety-critical industries such as nuclear power and aviation, many researchers from various fields have, since the 1980s, devoted much effort to understanding how and why human error occurs. As a result, a variety of methods and techniques for analyzing human error have been developed, but little is known of computer-aided systems that support the analysis of human error effectively and efficiently. Although several systems for human error analysis or accident analysis have been developed, there is much room for improvement. Because human error analysis is a cognitively demanding and time-consuming task, there is a definite need for a well-developed computerized system supporting it.

This research was conducted as part of an effort to develop a managerial error analysis system, referred to as Human Error Analysis and Reduction (HEAR), for use in the Korean railway industry. HEAR, which includes a detailed procedure, useful tools, and recording forms, was developed first. A prototype of a computer-aided system for HEAR (CAS-HEAR) was then developed to increase the quality and efficiency of human error analysis using HEAR. To develop HEAR and CAS-HEAR, the advantages and disadvantages of existing techniques for human error analysis were critically reviewed, and a complete model of accident causation was developed from which the main components of the analysis were derived. For each step of the analysis, the functional and design requirements for CAS-HEAR were derived in
terms of improving the quality and efficiency of the analysis. A prototype of CAS-HEAR, with only the features pertaining to the analysis itself, was then implemented based on these requirements. As finding the root causes of an error is the most important and demanding task in human error analysis, the prototype system focuses especially on aiding this task. To support it, CAS-HEAR provides causal links between contextual factors and causal factors and between causal factors. The causal links between related factors not only improve the quality of the analysis, by helping analysts identify the higher-level latent causes as well as the external and observable causes of an error, but also reduce the time and effort required to find the multiple levels of error causes and their causal relations.

The remainder of this study is organized as follows: Section 2 briefly reviews several representative methods and techniques for human error analysis. Section 3 explains the underlying model and procedure of CAS-HEAR, and the main aiding features of CAS-HEAR are presented in Section 4. The evaluation results of CAS-HEAR are described in Section 5, which is followed by conclusions and future work in Section 6.

2. Related work

Since the Three Mile Island (1979) and Chernobyl (1986) accidents, extensive research on human error has been conducted, especially in the nuclear power industry. The human performance enhancement system (HPES) [3] and the human performance investigation process (HPIP) [4] are the two representative methods for analyzing and managing human error in nuclear power plants. These methods cover the full range of human error analysis, from analyzing the accident sequence to reporting the results; at each step, a detailed procedure, useful techniques, and worksheets are provided. They have been updated over time based on both continued feedback from field applications and theoretical developments (e.g. [5]). HPES has been adapted by other countries since its development in the United States: K-HPES has been developed and used in Korea, and J-HPES in Japan. TapRooT®, which has procedures and methods similar to those of HPIP but is intended for use in any industry, is being used in a wide variety of industries such as health care, railway, oil, chemical, airlines, and construction [6]. Moreover, according to a recent survey [7], TapRooT® software commands nearly 50% of the market share of root cause analysis (RCA) software.

In the aviation industry, the human factors analysis and classification system (HFACS) [8] is widely used as a technique for human error analysis. It does not cover the full range of the analysis, unlike HPES and HPIP, but systematically classifies the types and causes of errors by human operators. It was developed based on a model of the causality behind an accident, known as the "Swiss cheese" model [9]. The Technique for the Retrospective and predictive Analysis of Cognitive Errors in air traffic control (TRACEr) [10] is composed of eight classification schemes grouped into three classes: those describing the context within which the error occurred, those explaining how the error was produced, and those describing error recovery. TRACEr makes it possible to analyze the modes and mechanisms of human error more deeply than HFACS does; it also includes an analysis of error detection and correction. Recently, the two techniques were adapted to the railway industry as HFACS-RR (railroad) [2] and TRACEr for driving tasks [11], both of which slightly modified the original techniques. A recent study also showed that HFACS was effective in categorizing errors from Australian rail incident/accident investigation reports and in capturing the full range of relevant rail human factors data [12].

In the maritime industry, the Casualty Analysis Methodology for Maritime Operations (CASMET) was developed as part of the movement towards an integrated system for human factors and accident analysis in Europe. The human factors classification of the Marine Accident Investigation Branch (MAIB) in the UK has also been used in this industry [13]. Recently, the Human Factors Investigation Tool (HFIT) was developed to analyze human errors in the UK offshore oil and gas industry [14].

Among the techniques developed for use in any industry, the Cognitive Reliability and Error Analysis Method (CREAM) is one of the best known [15]. It provides detailed classification schemes of erroneous actions and causal links between genotypes, or error causes, but more specialized classification schemes are needed before it can be used in a specific domain. While other techniques are mainly concerned with the completeness of human error analysis, providing detailed procedures or classification systems, CREAM provides not only detailed procedures and classification schemes but also effective means of increasing the efficiency of the analysis. One approach is to use the description of common performance conditions (CPCs) as the basis for determining probable causes. This is done using a simple matrix that indicates the relationship between nine common performance conditions (e.g., adequacy of organization, working conditions) and three main genotype groups (i.e., person-, technology-, and organization-related genotypes). Another way to enhance the efficiency of the analysis is to provide the possible relationships between the classification groups; in other words, the possible cause-and-effect relationships between the elements in the classification groups are predefined. These links make it easier to determine the causes of human error and their relationships. For practical use, however, these two means should be refined and specified for a certain domain.

Although few in number, several computer-aided systems exist that support human error analysis. In Korean nuclear power plants, a computerized version of K-HPES, referred to as CAS-HPES [16], has been used since 2000. It has recently been revised as a web-based system [17]. In the railroad industry, the Rail Accident Investigation Tool (RAIT) [18], a computer-based tool, was developed on solid theoretical grounds; however, at present it is difficult to find examples of its application. There are also commercial software tools for root cause analysis (RCA) or human factors analysis (e.g., TapRooT®, Apollo, ProAct®, Reason®, RAID™). These commercial software tools, however, were developed for use in any industry, and most of them are intended to investigate general problems including human error. For this reason, their processes and techniques are rather simple and general, and the aiding features they offer are not sufficient. Table 1 summarizes the advantages and disadvantages of the main techniques for analyzing human error in safety-critical industries.
Table 1
Advantages and disadvantages of the main techniques for human error analysis.

Nuclear power plant: K-HPES/CAS-HPES [16]
  Advantages: detailed procedure and supporting tools; detailed classification of error causes; classification of error types based on a model of human decision making; computer-aided analysis (CAS-HPES).
  Disadvantages: no link between contextual analysis and root cause analysis; no consideration of violations as a type of error; no analysis of the error detection and recovery process.

Nuclear power plant: HPIP/HPEP (a) [3,4]
  Advantages: detailed procedure and supporting tools; detailed classification of error causes; decision flowcharts to reduce the complexity of analysis.
  Disadvantages: no analysis of the error detection and recovery process.

Aviation: HFACS [8]
  Advantages: classification system based on an accident causation model; systematic analysis from error types to organizational factors; easy-to-understand or general terms are used.
  Disadvantages: little consideration of cognitive error causes; no analysis of the error detection and recovery process.

Aviation: TRACEr [10]
  Advantages: detailed analysis of individual cognitive processes; decision flowcharts to reduce the complexity of analysis; analysis of the error detection and recovery process.
  Disadvantages: difficult to determine internal error modes and psychological error mechanisms.

Railway: RAIT [18]
  Advantages: computer-aided analysis; systematic analysis of barriers and organizational factors.
  Disadvantages: difficult to find its detailed procedures and its application cases; no analysis of the error detection and recovery process.

Oil & gas: HFIT [14]
  Advantages: analysis procedure based on an accident causation model; analysis of the error detection and recovery process; decision flowcharts to reduce the complexity of analysis.
  Disadvantages: no consideration of barriers in the model.

Any industry (b): TapRooT® [6]
  Advantages: detailed procedures and tools for human error analysis; analysis of the error detection and recovery process; decision flowcharts to reduce the complexity of analysis; computer-aided analysis (S/W).
  Disadvantages: little or no consideration of error types and cognitive error causes; no analysis of the error detection and recovery process; incomplete for use in a specific domain.

Any industry (b): CREAM [15]
  Advantages: links between contextual analysis and root cause analysis; links between error causes; contextual and flexible classification system.
  Disadvantages: difficult to understand the method and terms; incomplete for use in a specific domain.

(a) Human performance evaluation process [4].
(b) Techniques in this group are not intended for use in a specific domain.

3. Underlying model and procedure of CAS-HEAR

3.1. Underlying model of CAS-HEAR

To conduct a systematic and thorough analysis of human error in an accident, it is essential to understand the process of accident causation. In safety-critical systems, most accidents and incidents are not the result of a single event but the consequence of multiple events, including human error and mechanical failure. When an adverse event occurs, there are two measures that can prevent it from causing another adverse event or an accident: one is the automatic response of built-in protective systems, the other is human response. Accident analysis, therefore, should include analysis of both the response of protective systems and the human response. Of these, the analysis of the former has long been known as "barrier analysis". There are several accident models that emphasize the roles of barriers in the occurrence of an accident. The most representative are Reason's model of organizational accident causation [19], which is the most widely known model of accident causation, and the Accident Evolution and Barrier Function (AEB) model [20].

Reason's model describes two interrelated causal pathways: an active failure pathway running from organizational processes via error- and violation-producing conditions in the workplace to unsafe acts (errors and violations) of an individual or team, and a latent condition pathway running directly from the organizational processes to deficiencies in the defenses. According to this model, modern technological systems have many defensive layers. Each layer, like a slice of Swiss cheese, has many holes created by combinations of active failures and latent conditions. When the holes in many defensive layers momentarily line up, a window of opportunity exists for an accident trajectory to bring hazards into damaging contact with victims. Reason's model has been the basis for a great number of techniques for human error analysis (e.g., [8]), and it has been used to analyze accidents and incidents from a systems perspective (e.g., [21]). Reason's model, however, has some limitations: (1) it considers human failure as the only adverse event, while "technical failures" and "external intrusions" are not (or at least not explicitly) included in the model; (2) it does not (or at least not explicitly) include error handling processes such as error detection and recovery; and (3) as it does not contain cycles of events, it cannot model a chain of multiple events that leads to an accident.

The AEB model makes it possible to model accident evolution as a series of interactions between human and technical systems. It focuses on understanding why a number of barrier functions failed and how they could be reinforced or supported by other barrier functions. Therefore, for each link between consecutive error event boxes, barrier functions that failed to arrest the link and possible barrier functions that could have arrested it are identified [22]. The AEB model overcomes some of the limitations of Reason's model in that it forces analysts to consider human error events and technical error events simultaneously, and it can describe the chain of accident evolution using a flow diagram. However, the AEB model also has its own limitations, such as little or no consideration of error handling processes and organizational factors.

Less is known about the roles of human responses to adverse events than about the roles of protective systems [23]. Although both Reason's model and the AEB model regard the "human barrier" (especially front-line operators) as a kind of barrier system, human barrier functions are primarily related to visual inspection or checking of the system state [24].
Error handling processes, in which a human operator detects and recovers from an error made by herself or by other operators, are not, or at least not explicitly, considered. Since the early 1990s, however, the mechanisms of error handling processes have been elucidated by several researchers [23,25–27], and in recent years several techniques that include the analysis of error handling processes have been developed [10,11,14]. There are a number of accident models that focus on error handling processes, including the HFIT model of incident causation [14] and the failure compensation process model [23].

The HFIT model of incident causation is similar to Reason's model in that it describes the causal sequence of incidents, from threats (situations that can encourage the occurrence of errors), to situation awareness, to action errors, to an accident. However, by adding a new category called "error recovery" in place of the defenses in Reason's model, it explains that if an action error or reduced situation awareness is detected and recovered from before an accident occurs, a near-miss results. On the basis of this model, Gordon et al. [14] developed the Human Factors Investigation Tool (HFIT). The HFIT model also has some disadvantages: it does not consider technical failures and external intrusions as types of adverse events, there are no cycles in the model, and, by focusing on the human operator's contribution to recovery, it does not consider the roles of protective systems (or barriers) in the recovery from failures.

The failure compensation process model is distinguished from other models in that it describes the entire failure compensation process in detail, from failures (any possible combination of human, technical, or organizational failures) via a dangerous and unwanted situation to failure compensation (detection, explanation, and correction) and its outcomes. This model, however, has drawbacks in that it does not include external intrusions as a type of failure and, unlike Reason's model and the HFIT model, it cannot explain the variety of factors affecting the occurrence of adverse events.

Although the models discussed so far have their own advantages, none of them includes all elements related to the occurrence of an accident; hence, analyses using these models may be effective but are ultimately incomplete. For a more thorough analysis, a complete model of accident causation was developed in this study, as shown in Fig. 1.

The model explains how an accident/incident or a near-miss occurs, what types of adverse events can contribute to the accident, which factors can influence the events, and how an adverse event or an accident can be prevented from occurring. For an accident to occur, the initiating event must occur by penetrating barriers in a normal situation. There are three types of adverse events: (1) human failures (errors or violations by operators at the sharp end), (2) technical failures (faults in hardware or software), and (3) external intrusions (e.g., a person on the railroad track).

Fig. 1. HEAR model of accident causation. (The diagram shows organizational factors (process/policy, supervision, safety culture, design/maintenance, procedures, training) acting on local factors (human factors, task characteristics, tools/equipment, work environment); these shape the occurrences (human failure, i.e. error or violation; technical failure in hardware or software; external intrusion), the interventions (human responses of detection/diagnosis/correction and physical or organizational protective systems), and the outcome (accident/incident or near miss), starting from a normal situation that turns into an unsafe situation.)


None of the aforementioned models considers "external intrusion" as a type of adverse event, but in open systems such as aircraft and railroads, unlike closed systems such as nuclear power plants (NPPs) and hospitals, external intrusions should be included.

When an adverse event occurs, the entire system in operation enters an unsafe situation. In this situation, there are two possibilities: (1) another adverse event occurs by breaking through barriers (the left loop at the bottom in Fig. 1), or (2) the adverse event is detected and recovered from by humans, by protective systems (physical or organizational barriers), or by the interaction of both. In the latter case, if the intervention is implemented in a timely and accurate manner, or if luck is involved, a near-miss results; otherwise, an accident/incident results with loss of life and/or property, or the entire system continues in the risky and sometimes worsened situation (the right loop at the bottom in Fig. 1).

The two loops at the bottom of Fig. 1 indicate that both occurrences and interventions are cyclical processes. Two or more adverse events (human failures, technical failures, or external intrusions) can occur one after another before an accident/incident or a near-miss results. For example, a preceding human error can trigger another human error. Likewise, it is possible that two or more interventions occur before an accident/incident or a near-miss results if the earlier intervention(s) were incorrect or incomplete.

According to this model, human error can occur at two different stages: the occurrence stage and the intervention stage. In other words, human error can not only be a new event, but can also be a failure of the human response to adverse events that have already occurred. A variety of factors can cause human error. Local factors, which include human factors, task characteristics, tools/equipment, and the work environment, influence human performance directly. Likewise, local factors are under the influence of organizational factors, which include organizational processes/policies, supervision, safety culture, system design and maintenance, procedures, and training. The arrows between the local factors and the organizational factors indicate that there are causal links between these various factors. The causal links between causal factors are very useful for identifying the root causes of errors. How the causal links support error analysis is described in detail in Section 4.

The HEAR model can help prevent analysts from missing any important aspects of human error analysis, which include event sequence analysis, context analysis, root cause analysis, barrier analysis, and the analysis of error detection and recovery processes. The usefulness and comprehensiveness of the model were validated by explaining nearly a hundred accident or incident cases with it.

Table 2 summarizes several components deemed important for models of accident causation and shows which of the components are covered by the aforementioned models. A circle indicates that the model explicitly considers the component; a triangle denotes that the model takes the component into account but the consideration is implicit or insufficient.

Table 2
Comparison of accident causation models (○ = explicitly considered; △ = considered implicitly or insufficiently; ✗ = not considered).

Reason's model: human failures ○; technical failures ✗; external intrusions ✗; protective systems (barriers) ○; human responses △; outcome (incident/accident, near miss) ○; workplace and organizational factors ○; cycle (event chain) ✗.
AEB model: human failures ○; technical failures ○; external intrusions ✗; protective systems ○; human responses △; outcome ○; workplace and organizational factors △; cycle ○.
HFIT model: human failures ○; technical failures ✗; external intrusions ✗; protective systems ✗; human responses ○; outcome ○; workplace and organizational factors ○; cycle ✗.
FCP model (failure compensation process model): human failures ○; technical failures ○; external intrusions ✗; protective systems ○; human responses ○; outcome ○; workplace and organizational factors △; cycle ○.
HEAR model: human failures ○; technical failures ○; external intrusions ○; protective systems ○; human responses ○; outcome ○; workplace and organizational factors ○; cycle ○.

3.2. Procedure of CAS-HEAR

The procedure of CAS-HEAR was developed based on the HEAR model of accident causation. It also maximizes the strengths and minimizes the weaknesses of the earlier techniques, considering both the quality and the efficiency of human error analysis. The analysis procedure and information flow of CAS-HEAR are depicted in Fig. 2. The procedure consists of nine steps. It starts by selecting the human errors to be analyzed in detail and ends after evaluating the corrective actions that have been developed.

As Fig. 2 suggests, it is assumed that all related information (e.g., physical evidence, documents, and the initial statements of the people involved) has been collected and that the accident sequence analysis has already been completed. The full version of HEAR, however, also provides worksheets and guidelines for information collection and accident sequence analysis [28]. As interviews are essential for reconstructing an accident sequence correctly, the guideline for information collection includes recommendations for effective interviews (e.g., "While interviewing, the interviewer has to find not only commission errors of the interviewee, but especially omission errors as well"). The scope of the accident sequence analysis covers what types of events occurred in what order, how these events were mediated by humans or protective systems, and what the consequences were. All of the events that occurred, the resulting system states, and the causal relationships between the events are represented in a chart similar to a Sequentially Timed Events Plotting (STEP) chart [29]. The procedure in Fig. 2 shows a Lite version of the original HEAR, which omits some parts judged to be inadequate for a computer-based analysis. Details of the CAS-HEAR procedure are as follows:
Fig. 2. The CAS-HEAR procedure and information flow. (The flowchart runs from the accident sequence and the collected information through nine steps: 1. select human errors to be analyzed; 2. analyze the context, for each human operator involved; 3. identify error types; 4. identify error causes; 5. analyze error handling; 6. analyze barriers, with steps 3–6 repeated for each human error; 7. review the causal analysis; 8. develop corrective actions; 9. evaluate corrective actions. Intermediate products include candidate causes, why-because trees, causes of error handling failures, failed barriers and the causes of barrier failures, key causes, and corrective actions.)

1. Select human errors to be analyzed: Human errors to be analyzed are selected from the accident sequence. In other words, the analyst determines the critical human errors, for each of which a detailed causal analysis is needed. In principle, it is recommended that analysts select all human errors in the sequence. In cases that involve numerous human errors, however, it may be inefficient to perform a causal analysis (steps 3–6) of all of them. For this reason, a number of selection criteria are given (e.g., "a human error triggered by the preceding error can be eliminated if there is no particular cause except the preceding error").
2. Analyze the context: For each of the operators involved in the selected errors, the context is analyzed. Four tables are provided for analyzing the operator-related, task-related, environment-related, and organization-related contexts. Each table consists of approximately 11 factors, with a total of 45 factors. For each factor, the content is recorded and the degree of influence on the accident is rated on a five-level scale ranging from "very low" to "very high". Each contextual factor has links to the causal factors analyzed in step 4, and causal factors related to the contextual factors rated as "high" or "very high" are highlighted to support the task of identifying error causes. For each of the selected errors, steps 3 through 6 are iterated.
3. Identify error types: First, the task types related to the error are selected. Second, the error types are determined. CAS-HEAR provides five error types. Four of these are based on a model of human decision making: perception error, situation assessment error, decision-making error, and execution error. The fifth is violation, that is, non-compliance with rules or procedures. Each type is subdivided into several sub-types, and two or more types can be selected.
4. Identify error causes: All factors that influenced the occurrence of the error, often known as performance-shaping factors (PSFs), are identified using a given classification scheme. The classification scheme has 13 categories: mental states of operators, physical states of operators, knowledge/experiences/abilities of operators, task characteristics, tools/equipment, work environment, train/infrastructure, rules/procedures, human resource management, communications, team factors, supervision, and organizational processes/policies/culture. Each category consists of approximately 10 causal factors, giving 138 causal factors in total. The classification scheme also includes causal links between the factors in order to reduce the complexity of finding the root causes. As mentioned above, causal factors linked to contextual factors rated as problematic in step 2 are considered candidate causes. After a causal factor is selected as an error cause, other factors that influenced the selected factor are identified by following the causal links between causal factors. This process is repeated until the root causes of the error are found. While the error causes are being identified, a why-because tree, which represents the multiple levels of error causes and their causal relationships, is drawn automatically.
5. Analyze error handling processes: As the HEAR model (Fig. 1) suggests, if an error that has occurred is detected and recovered from in a timely and accurate manner, a near-miss rather than an accident or incident results. Therefore, analyzing how an error was handled after its occurrence is as important as analyzing why the error occurred. At this step, error handling processes, including error detection and recovery, are analyzed.
First, the factors related to when, how, and by whom the error was detected are analyzed. If the error was detected, when, how, and by whom it was recovered from are then analyzed. Finally, the factors that had positive or negative influences on the detection and recovery processes are identified.
6. Analyze barriers: First, whether barriers existed that could have prevented the error from occurring is analyzed. If they existed, the reason behind the failure of each barrier is analyzed. Second, whether barriers existed that could have prevented the error from proceeding to an accident is analyzed. If they existed, the reasons for the failure of each of these barriers are analyzed. Four failure types of barriers are provided: not existent, not working (fault), not practical, and operator error. Barriers are analyzed not only for understanding the accident but also for finding ways to prevent the same or similar accidents from occurring in the future [24]. Thus, the result of this step is very useful in developing corrective actions in step 8. In CAS-HEAR, barriers can be of two types: (1) physical barriers such as Automatic Train Control (ATC), warning devices, signs, and signals; and (2) organizational barriers such as Standard Operating Procedures (SOPs), checklists, restrictions, and laws.
7. Review causal analysis: At this step, the analysis results of steps 3–6 (i.e., the types and why-because tree of each error, the factors that had negative effects on error detection and recovery, and the failed barriers and their causes) are put together. If there is a causal relation between two errors, the why-because trees of the errors can be combined by that causality. The key causes of the accident are then determined.
8. Develop corrective actions: For the key causes, corrective actions are derived along two dimensions. The first focuses on improving physical systems; the second focuses on improving organizational systems. For each category, several subcategories are provided so that analysts can develop corrective actions from various viewpoints. As mentioned above, the result of the barrier analysis (step 6) is used to develop corrective actions.
9. Evaluate corrective actions: Each corrective action developed in the previous step is evaluated by four criteria, which are adapted from SMARTER (specific, measurable, accountable, reasonable, timely, effective, and reviewed) as used in TapRooT® [5].
4. Major aiding features of CAS-HEAR

There is no disagreement that root cause analysis, which corresponds to step 4 of the CAS-HEAR procedure, is the most important and demanding step in accident analysis or human error analysis. Although many existing techniques provide a comprehensive taxonomy of causal factors, it is often cognitively demanding and time-consuming to find the causes of an error from among numerous possible causes, sometimes over one hundred. Furthermore, even after a list of causes has been selected from a classification scheme, it is another complicated task to determine the causal relationships between the causes so that the root causes can be found. There is a strong need to maximize the efficiency of this process without losing the quality of the result. Thus, CAS-HEAR focuses on aiding the task of finding the root causes of human errors. For this, Hollnagel's ideas [15] of linking causal factors ("genotypes" in his words) with contextual factors ("common performance conditions" in his words) and of linking causal factors to one another were refined and extended for practical use in the railway industry.
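The two kinds of predefined links can be pictured as simple lookup tables. The sketch below mixes factor codes visible in Fig. 3 with invented entries; the contextual-factor keys other than "sleeping hours" (the example used in Figs. 4 and 5), the variable names, and the exact contents are illustrative assumptions, not the links actually shipped with CAS-HEAR.

```python
# Links from contextual factors to causal factors (Section 4.1): a contextual
# factor rated as problematic points at the causal factors that become
# candidate causes in the error cause analysis (step 4).
CONTEXT_TO_CAUSAL = {
    "sleeping hours": ["2.1"],        # 2.1 = physical fatigue (example in Figs. 4 and 5)
    "time pressure": ["1.1", "1.2"],  # hypothetical operator-related contextual factor
}

# Links between causal factors (Section 4.2), using a few of the example codes
# shown in Fig. 3: a causal factor selected as an error cause points at the
# factors that may lie one level behind it; a factor with no further links is
# a candidate root cause.
CAUSAL_TO_CAUSAL = {
    "1.1": ["1.7", "3.8", "4.1", "4.4", "9.5"],
    "2.3": ["3.1", "6.1", "6.2", "7.13"],
    "9.5": ["9.10", "13.8"],
    "9.10": [],                       # no further link
}
```

Sections 4.1 and 4.2 describe how these two tables are used during the analysis.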

Fig. 3. An illustration of the causal links between factors. (The left side of the figure shows the context analysis, with operator-, task-, work environment-, and organization-related contextual factors marked as problematic; the right side shows the error cause analysis, where categories of causal factors such as 1. mental states of operators, 2. physical states of operators, and 3. knowledge/abilities of operators are listed, each causal factor pointing to its related causal factors at the next level, e.g. 1.1 to 1.7, 3.8, 4.1, 4.4, and 9.5, or 2.3 to 3.1, 6.1, 6.2, and 7.13; factors such as 9.10 and 7.17 have no further links.)


Fig. 3 shows a schematic picture of the links between contextual and causal factors and of the links between causal factors. The links are predefined; they are incomplete but plausible. Contextual factors marked as problematic send their related causal factors (e.g., 1.1, 1.2, 2.3, 2.1, 3.1, and 3.4 in Fig. 3) to the error cause analysis step, where these causal factors are highlighted and used as candidate causes. Causal factors marked as error causes (e.g., 1.1 and 2.3), whether highlighted or not, send their related causal factors (e.g., 1.7, 3.8, 4.1, 4.4, 9.5, 3.1, 6.1, 6.2, and 7.13) to the analysis of the second-level causes. Likewise, causal factors marked as second-level causes (e.g., 9.5, 3.1, and 7.13) send their related causal factors to the analysis of the third-level causes. This is the mechanism by which the error cause analysis of CAS-HEAR proceeds. If a causal factor selected as an error cause has no further link (e.g., 9.10 and 7.17), other factors not linked to it are searched for, or the error cause analysis ends.

The links support human error analysis in terms of both quality and efficiency. With regard to the quality of the analysis, the links lead analysts to find not just the external and observable causes of an error but also its higher-level causes, most of which are organizational factors. With regard to the efficiency of the analysis, the links reduce the time and effort required.

4.1. Linking contextual factors with causal factors

As an accident or a human error occurs in a specific context, it is unnecessary to examine all causal factors in a given taxonomy; it is also impractical in the field, because there are a great many accidents or incidents to be analyzed while time and resources are limited. For any given context, some causal factors are more likely to apply than others. Therefore, it can be helpful to use the result of assessing the context as the basis for determining the probable causes of an error.

As a starting point for this approach, Hollnagel [15] proposed a simple matrix indicating the relationship between the common performance conditions (CPCs) and the main genotype groups. This simple matrix, however, is not very helpful in practice. As he noted, a specific version of the matrix has to be developed for practical use in a given domain. In CAS-HEAR, probable links between the 45 contextual factors and the 138 causal factors are provided; in other words, each contextual factor has links to the causal factors related to it.

As mentioned in Section 3, the analysis of the context is conducted as the second step of CAS-HEAR. Fig. 4 displays part of the context analysis (the operator-related context analysis). For each of the human operators involved, four tables (operator-related, task-related, environment-related, and organization-related) are recorded. While the content of a contextual factor is being recorded, the corresponding contents for the other operators are shown on the left side of the screen as a reference. As it is often the case that the operators involved in an accident work in the same environment and in the same organization, this aiding feature is helpful especially when analyzing the environment- and organization-related factors.

If the influence of a contextual factor on the accident is rated as "high" or "very high", such as "sleeping hours" in Fig. 4, the causal factors related to it (e.g., "physical fatigue") are highlighted in the error cause analysis (step 4) for the human errors made by that operator, as shown in Fig. 5. The highlighted causal factors can be used as candidate causes of the errors. In addition, the part of the context analysis results related to the highlighted causal factors is presented on the left side of the display for reference. The predefined links between contextual factors and causal factors are not complete, but they reduce the complexity of the analysis. As Hollnagel [15] states, however, the analysis of the context enables a prior sensitization to the various causes, but it cannot lead to the exclusion of any of them. The highlighted causal factors should therefore not be used as absolute criteria but only as a reference.

Fig. 4. Screenshot of a context analysis screen.
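A minimal sketch of the highlighting rule just described, reusing the kind of link table shown earlier: causal factors linked to contextual factors rated "high" or "very high" are offered as candidate causes. The function name and the example codes are hypothetical.

```python
from typing import Dict, List, Set

def candidate_causes(ratings: Dict[str, str],
                     context_to_causal: Dict[str, List[str]]) -> Set[str]:
    """Return the causal factors to highlight on the error cause analysis screen.

    `ratings` maps a contextual factor to its rated influence on the accident
    (five-level scale); only "high" and "very high" trigger highlighting.
    """
    problematic = {factor for factor, level in ratings.items()
                   if level in ("high", "very high")}
    highlighted: Set[str] = set()
    for factor in problematic:
        highlighted.update(context_to_causal.get(factor, []))
    return highlighted

# Example in the spirit of Figs. 4 and 5:
# candidate_causes({"sleeping hours": "very high"}, {"sleeping hours": ["2.1"]})
# returns {"2.1"}, i.e. "physical fatigue" would be highlighted.
```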


4.2. Linking between causal factors

An accident occurs through multiple layers of factors rather than through only one factor [9]. Most human errors also occur through several levels of causes; for an effective error analysis, therefore, not only the immediate causes but also the root causes of the error should be determined.

Fig. 5. Screenshot of an error cause analysis screen (the main window).

Fig. 6. Screenshot of an error cause analysis screen (pop-up windows).

Although a large number of classification schemes for error causes have been developed thus far, it is not easy to find the root causes of an error using these schemes. Most classification schemes merely provide a list or a structured taxonomy of possible error causes and give no information about the relationships between them. If the possible relationships between various causes are provided, the root causes of an error can be found relatively easily. It is impossible to determine the causal links between causal factors completely, but it is possible to establish effective links based on theory and experience.

In contrast to other classification schemes, CREAM [15] provides possible links between groups of error causes. For practical use in a particular domain, however, the elements in each group and the links between them need to be extended. Considering these limitations of existing schemes, CAS-HEAR selected 138 causal factors for the railway industry, classified them into 13 categories, and determined possible causal links between the causal factors.

The analysis of the causes of an error is performed for each category of causal factors in turn. When the analysis of a category (e.g., "2. Physical states of operators" in Fig. 5) is completed, if one or more factors in the category (e.g., "2.1. physical fatigue") have been checked as error causes, the analyst clicks the "why-because tree" button at the top left of the screen. A new window displaying a why-because tree of the error then appears, as shown in Fig. 6. A description of the error at hand is in the first column of the table; the factors that have been checked as error causes, the first-level causes, are shown in the second column.

For each first-level cause, the second-level causes should be found. When "find the next level of causes" is selected from the context menu, which is opened by pressing the right mouse button, a new, smaller window appears (the popup window in Fig. 6). It shows a list of possible second-level causes taken from the predefined links between the causal factors. The results of the context analysis are also used in this window: among the causal factors in the list, those related to contextual factors rated as problematic are highlighted (e.g., "poor management of working hours" in Fig. 6). Moreover, the part of the context analysis results related to the highlighted causal factors is presented at the top of the window as a reference. If there is no appropriate factor in the list, it is possible to search for other causal factors by typing in keywords. Causal factors, whether highlighted or not, that are selected as second-level causes are placed in the third column of the why-because tree table. The causal relationships between the first-level and second-level causes are indicated by directional arrows. In this way, the third-level, fourth-level, and fifth-level causes (and further levels, if they exist) can be found. This process continues until the root causes of the error are found.
Fig. 7. Screenshot of a why-because tree drawn by CAS-HEAR.

If a causal factor is selected in the popup window and added to the why-because tree, it is also checked automatically in the main window of the error cause analysis. According to the predefined links, causal factors in the later categories, in particular the 12th category (supervision) and the 13th category (organizational processes/policies/culture), are more likely to be the root causes of an error.

While the error cause analysis is performed, the why-because tree is generated automatically. Fig. 7 shows an example of a completed why-because tree. Marking a box (i.e., an error cause) as a key cause of the error (e.g., the shaded boxes in Fig. 7), deleting a box, and moving a box are possible through the context menu.

5. Evaluation of CAS-HEAR

The usefulness of CAS-HEAR was evaluated by nine field investigators from six Korean main train operating companies, including KORAIL and Seoul Metro, and by one investigator from the Korean Aviation and Railway Accident Investigation Board (ARAIB). Their experience in accident investigation ranges from 5 to 10 years. The evaluation procedure was as follows:

(1) A workshop for accident investigators in Korean train operating companies was held to introduce HEAR and CAS-HEAR to them and to disseminate the manual.
(2) After the workshop, nine investigators from six organizations volunteered to participate in the CAS-HEAR evaluation.
(3) Each participant, in his own office, selected a railway accident that had recently occurred mainly because of human error and reanalyzed it with CAS-HEAR. In this process, some participants asked several questions about the procedure and method of CAS-HEAR by phone or email; the answers were provided immediately.
(4) After reanalyzing an accident with CAS-HEAR, each participant completed an evaluation form and submitted it to one of the authors.

The evaluation of CAS-HEAR consisted of two parts: a qualitative evaluation, in which its advantages and possible improvements were described freely, and a quantitative evaluation, in which 19 aspects of CAS-HEAR were rated on a seven-point scale and the reasons for the ratings were described. The major advantages of CAS-HEAR reported in the qualitative evaluation are as follows:

- Since it includes almost all elements necessary in human error analysis, investigators are directed not to overlook any important aspects.
- It makes it possible to identify error causes and develop corrective actions consistently across investigators.
- Analyzing the context in terms of operator, task, work environment, and organization is of great use.
- The links between contextual factors and causal factors help identify key factors in error cause analysis.
- Representing the multiple levels of causes behind an error in the form of a why-because tree is useful.

On the other hand, most participants suggested the need for a more simplified procedure for analyzing minor accidents or incidents, since it is not practical to apply the whole CAS-HEAR procedure to every human error analysis given the time constraints in the field. In particular, the causal analysis, steps 3–7 in Fig. 2, was judged the most complicated. Furthermore, many comments indicated that some of the contextual factors and causal factors are ambiguous to judge or improperly named.

Table 3 shows the result of the quantitative evaluation of CAS-HEAR. The average score over the 19 questions was 5.67 out of 7, and the average standard deviation was 0.8. The participants gave the highest rating, 6.22, to Question 2, "Is it useful to analyze the context in terms of operator, task, work environment, and organizations prior to causal analysis?", and to Question 16, "Is the barrier analysis useful to identify root causes and develop corrective actions?"; they gave the lowest score, 5.11, to Question 11, "Is it adequate to classify causal factors into 13 categories?" An interesting point is that the questions on the procedure and method of the context analysis (Questions 2, 7, and 12) received high scores, while the questions on the adequacy of the contextual factors themselves (Questions 3–6) received relatively low scores. In addition, the questions on the error cause analysis table (Questions 10, 11, and 13) obtained the lowest scores. Regarding the adequacy of the contextual factors and causal factors themselves, the standard deviation, indicating the spread between the participants' ratings, was also the largest. These results strongly suggest that it is necessary to revise and reorganize the contextual factors and causal factors.

Concerning the two major aiding features described in Section 4, mixed results were obtained. The question about linking contextual factors with causal factors, Question 12, received a relatively high score (5.89) and the difference between the participants' ratings was small (standard deviation: 0.33). In contrast, the question about linking between causal factors, Question 13, received a relatively low score (5.33) and the difference between the participants' ratings was relatively large (standard deviation: 0.87).
Table 3
The result of the quantitative evaluation of CAS-HEAR. Each question was rated on a seven-point scale; the average rating is given with the standard deviation in parentheses.

1. Are the criteria for selecting the human errors to be analyzed useful? 5.67 (SD 0.71)
2. Is it useful to analyze the context in terms of operator, task, work environment, and organizations prior to causal analysis? 6.22 (SD 0.44)
3. Are the operator-related contextual factors adequate on the whole? 5.33 (SD 0.87)
4. Are the task-related contextual factors adequate on the whole? 5.56 (SD 1.01)
5. Are the work environment-related contextual factors adequate on the whole? 5.33 (SD 1.00)
6. Are the organization-related contextual factors adequate on the whole? 5.33 (SD 1.00)
7. Is it useful to assess the degree of influence on the accident on a five-point scale for each contextual factor? 5.78 (SD 0.67)
8. Is the classification of error types adequate? 5.67 (SD 0.87)
9. Is it useful to determine the decision-making step (among perception, situation assessment, decision making, and execution) to which the occurred error pertains? 5.67 (SD 0.71)
10. Are the causal factors used in the error cause analysis adequate on the whole? 5.33 (SD 1.22)
11. Is it adequate to classify causal factors into 13 categories? 5.11 (SD 1.27)
12. Is it useful for error cause analysis to link the contextual factors rated as "high" or "very high" in the degree of influence on the accident with related causal factors? 5.89 (SD 0.33)
13. Is it useful for error cause analysis to link between causal factors? 5.33 (SD 0.87)
14. Is it useful to represent the multiple levels of causes behind an error in the form of a why-because tree? 6.11 (SD 0.60)
15. Is it useful to analyze the error handling processes in detail? 5.78 (SD 0.97)
16. Is the barrier analysis useful to identify root causes and develop corrective actions? 6.22 (SD 0.44)
17. Are the given failure types of barriers adequate? 6.00 (SD 0.50)
18. Are the categories of corrective actions useful for developing various corrective actions? 5.78 (SD 0.83)
19. Are the four criteria for evaluating the developed corrective actions useful? 5.56 (SD 0.88)
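The overall figures quoted above (an average score of 5.67 and an average standard deviation of 0.8) can be reproduced directly from the per-question values in Table 3; the short snippet below, with the values transcribed from the table, does exactly that.

```python
averages = [5.67, 6.22, 5.33, 5.56, 5.33, 5.33, 5.78, 5.67, 5.67, 5.33,
            5.11, 5.89, 5.33, 6.11, 5.78, 6.22, 6.00, 5.78, 5.56]
std_devs = [0.71, 0.44, 0.87, 1.01, 1.00, 1.00, 0.67, 0.87, 0.71, 1.22,
            1.27, 0.33, 0.87, 0.60, 0.97, 0.44, 0.50, 0.83, 0.88]

print(round(sum(averages) / len(averages), 2))  # 5.67
print(round(sum(std_devs) / len(std_devs), 2))  # 0.8
```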

Some participants stated that the links between causal factors are useful for finding the root causes of an error because the causal relations among the causal factors are well established; other participants stated that it is sometimes difficult to select appropriate factors from the given related causal factors and that the given related causal factors hinder, rather than support, the identification of the real causes of an error.

6. Conclusions and future work

A prototype of CAS-HEAR, a computer-aided system for analyzing human error in railway operations, was developed based on a critical review of existing techniques for human error analysis. This system is intended to reduce the cognitive load of human error analysis and to improve the quality of the analysis. It helps the analyst find multiple levels of error causes and their causal relations efficiently by using both predefined links between contextual factors and causal factors and links between causal factors. Furthermore, it is based on a complete model of accident causation; hence it helps the analyst conduct a thorough analysis without missing any significant part of human error analysis.

The procedure of CAS-HEAR was subjectively evaluated by nine experienced field investigators. Its overall usefulness in human error analysis was confirmed, although the development of a simplified version and some modification of the contextual factors and causal factors are required in order to ensure its practical use. For a more thorough evaluation, inter-rater reliability tests also have to be performed. In this case, the level of agreement between investigators should be measured in terms of the critical human errors of an accident/incident, the types and causes of the errors, the corrective actions, and so on. In addition, more extensive field applications of CAS-HEAR are clearly needed before its actual use in the field. These tests are expected to be conducted when the system is fully implemented and its use in field analysis is approved by the relevant regulatory authorities.

Further research should be directed at extending the aiding features to support the cognitively burdensome aspects of human error analysis and to ensure the quality of the analysis. First, as the current predefined links of CAS-HEAR are not complete, a function to manage the contextual and causal factors and their links should be added, through which the factors and the links between them can be continuously updated as the number of cases analyzed with CAS-HEAR increases. Moreover, it is possible to provide the analyst with the analysis results of similar accident cases using techniques of case-based reasoning (CBR), although this kind of feature should be designed carefully because of the risk that the analyst is prematurely biased toward the results of the similar cases provided. If this feature is designed carefully, the efficiency of error analysis can be much enhanced without losing the quality of the analysis. Providing statistical analysis functions will also add to the usefulness of the system.

Although CAS-HEAR was developed specifically for the railway industry, it can also be used in other industries with minor modifications, including the customization of some of the contextual factors and causal factors for a particular industry. The examination of the transferability of CAS-HEAR to other domains represents another potential future effort.

Acknowledgements

This research was supported by a grant (05RS-B02) from the Railroad Technology Development Program funded by the Ministry of Land, Transport, and Maritime Affairs of the Korean government.

References

[1] MLTM. Statistics of Korean railway accidents. Korean Ministry of Land, Transport, and Maritime Affairs. Retrieved March 30, 2009, from www.mltm.go.kr.
[2] Reinach S, Viale A. Application of a human error framework to conduct train accident/incident investigations. Accid Anal Prev 2006;38:396–406.
[3] INPO. Human performance enhancement system. INPO 90-005. Atlanta: Institute of Nuclear Power Operations; 1990.
[4] NRC. Development of the NRC's human performance investigation process (HPIP). NUREG/CR-5455, SI-92-01, vol. 1. Washington, DC: US Nuclear Regulatory Commission; 1993.
[5] NRC. The human performance evaluation process: a resource for reviewing the identification and resolution of human performance problems. NUREG/CR-6751. Washington, DC: US Nuclear Regulatory Commission; 2001.
[6] Paradies M, Unger L. TapRooT®: the system for root cause analysis, problem investigation, and proactive improvement. System Improvements Inc.; 2000.
[7] Retrieved January 10, 2008, from www.plant-maintenance.com.
[8] Wiegmann DA, Shappell SA. A human error approach to aviation accident analysis: the human factors analysis and classification system. Aldershot, UK: Ashgate Publishing Company; 2003.
[9] Reason J. Human error. New York: Cambridge University Press; 1990.
[10] Shorrock ST, Kirwan B. Development and application of a human error identification tool for air traffic control. Appl Ergon 2000;33:319–36.
[11] RSSB. Rail-specific human reliability assessment technique for driving tasks. Research Project T270, Final Report. London: Rail Safety & Standards Board; 2005.
[12] Baysari MT, McIntosh AS, Wilson JR. Understanding the human factors contribution to railway accidents and incidents in Australia. Accid Anal Prev 2008;40:1750–7.
[13] Thomas LJ, Rhind DJA. Human factors tools, methodologies and practices in accident investigation: implications and recommendations for a database for the rail industry. Technical report, Version 4. Cranfield University; 2003.
[14] Gordon R, Flin R, Mearns K. Designing and evaluating a human factors investigation tool (HFIT) for accident analysis. Saf Sci 2005;43:147–71.
[15] Hollnagel E. Cognitive reliability and error analysis method. Oxford: Elsevier; 1998.
[16] KEPCO Research Center. Development of Korean HPES for nuclear power plants (II). Final report. Korea Electric Power Corporation; 1998.
[17] KHNP. Development of analyzing method for near misses and improvement of K-HPES (II). A05NJ14, Final report. Korea Hydro & Nuclear Power Co. Ltd; 2007.
[18] Reason J, Free R, Havard S, Benson M, Van Oijen P. Railway Accident Investigation Tool (RAIT): a step by step guide for new users. Department of Psychology, University of Manchester; 1994.
[19] Reason J. A systems approach to organizational error. Ergonomics 1995;38:1708–21.
[20] Svenson O. The accident evolution and barrier function (AEB) model applied to incident analysis in the processing industries. Risk Anal 1991;11:499–507.
[21] Lawton R, Ward NJ. A systems analysis of the Ladbroke Grove rail crash. Accid Anal Prev 2005;37:235–44.
[22] Svenson O. Accident and incident analysis based on the accident evolution and barrier function (AEB) model. Cogn Tech Work 2001;3:42–52.
[23] Kanse L, Van der Schaaf TW. Recovery from failures in the chemical process industry. Int J Cogn Ergon 2001;5(3):199–211.
[24] Hollnagel E. Accident analysis and barrier functions. Halden, Norway: Institute of Energy Technology; 1999.
[25] Rizzo A, Ferrante D, Bagnara S. Handling human error. In: Hoc JM, Cacciabue PC, Hollnagel E, editors. Expertise and technology: cognition & human-computer cooperation. Hillsdale, NJ: Lawrence Erlbaum Associates; 1995.
[26] Kontogiannis T. User strategies in recovering from errors in man-machine systems. Saf Sci 1999;32:49–68.
[27] Sarter NB, Alexander HM. Error types and related error detection mechanisms in the aviation domain: an analysis of aviation safety reporting system incident reports. Int J Aviat Psychol 2000;10(2):189–206.
[28] KAIST. A manual for human error analysis and reduction (HEAR) system. Final report. Korea Advanced Institute of Science and Technology; 2007.
[29] Hendrick K, Benner L. Investigating accidents with sequentially timed events plotting (STEP). New York: Marcel Dekker; 1987.
