
SANDIA REPORT

SAND2018-0725
Unlimited Release
Printed February 2018

Demonstrating the Value of Human Factors for Process Design in a Controlled Experiment

Judi E. See

Prepared by
Sandia National Laboratories
Albuquerque, New Mexico 87185 and Livermore, California 94550

Sandia National Laboratories is a multimission laboratory managed and operated by National Technology and Engineering Solutions of Sandia, LLC, a wholly owned subsidiary of Honeywell International, Inc., for the U.S. Department of Energy’s National Nuclear Security Administration under contract DE-NA0003525.

Issued by Sandia National Laboratories, operated for the United States Department of Energy by
National Technology and Engineering Solutions of Sandia, LLC.

NOTICE: This report was prepared as an account of work sponsored by an agency of the United
States Government. Neither the United States Government, nor any agency thereof, nor any of their
employees, nor any of their contractors, subcontractors, or their employees, make any warranty,
express or implied, or assume any legal liability or responsibility for the accuracy, completeness, or
usefulness of any information, apparatus, product, or process disclosed, or represent that its use
would not infringe privately owned rights. Reference herein to any specific commercial product,
process, or service by trade name, trademark, manufacturer, or otherwise, does not necessarily
constitute or imply its endorsement, recommendation, or favoring by the United States Government,
any agency thereof, or any of their contractors or subcontractors. The views and opinions expressed
herein do not necessarily state or reflect those of the United States Government, any agency thereof,
or any of their contractors.

Printed in the United States of America. This report has been reproduced directly from the best
available copy.

Available to DOE and DOE contractors from
   U.S. Department of Energy
   Office of Scientific and Technical Information
   P.O. Box 62
   Oak Ridge, TN 37831
   Telephone: (865) 576-8401
   Facsimile: (865) 576-5728
   E-Mail: reports@osti.gov
   Online ordering: http://www.osti.gov/scitech

Available to the public from
   U.S. Department of Commerce
   National Technical Information Service
   5301 Shawnee Rd
   Alexandria, VA 22312
   Telephone: (800) 553-6847
   Facsimile: (703) 605-6900
   E-Mail: orders@ntis.gov
   Online order: https://classic.ntis.gov/help/order-methods/

SAND2018-0725
Printed February 2018
Unlimited Release

Demonstrating the Value of Human Factors for Process Design in a Controlled Experiment

Judi E. See
Nuclear Weapons Systems Analysis Department
Sandia National Laboratories
P. O. Box 5800
Albuquerque, New Mexico 87185-MS0151

Abstract
A controlled between-groups experiment was conducted to demonstrate the value of
human factors for process design. Most evidence to convey the benefits of human
factors is derived from reactive studies of existing flawed systems designed with little
or no human factors involvement. Controlled experiments conducted explicitly to
demonstrate the benefits of human factors have been scarce since the 1990s. Further,
most previous research focused on product or interface design as opposed to process
design. The present study was designed to fill these research gaps. Toward that end, 24
Sandia National Laboratories employees completed a simple visual inspection task
simulating receipt inspection. The experimental group process was designed to
conform to human factors and visual inspection principles, whereas the control group
process was designed without consideration of such principles. Results indicated the
experimental group exhibited superior performance accuracy, lower workload, and
more favorable usability ratings as compared to the control group. Given the
differences observed in the simple task used in the present study, the author concluded
that incorporating human factors should have even greater benefits for complex
products and processes. The study provides evidence to help human factors
practitioners revitalize the critical message regarding the benefits of human factors
involvement for a new generation of designers.

ACKNOWLEDGMENTS

Dr. Susan Adams and Ms. Victoria Newton provided constructive feedback during concept
development. Ms. Allison Noble, Ms. Kay Rivers, and Ms. Liza Kittinger conducted human factors
heuristic evaluations of the experimental group and control group task designs. Dr. Thor Osborn
located a reference supporting the value of process variation reduction and discussed the concept
with the author. Mr. Richard Craft, Dr. Mallory Stites, and Ms. Victoria Newton completed peer
reviews of the final paper.

TABLE OF CONTENTS

1. Introduction
   1.1. Reactive Case Studies
   1.2. Controlled Experiments
   1.3. Research Gaps
   1.4. Objectives of the Present Study
2. Methodology
   2.1. Participants
   2.2. Design
   2.3. Primary Task Design
   2.4. Procedure
      2.4.1. Orientation and Practice
      2.4.2. Task Implementation
3. Results
   3.1. Task Accuracy and Speed
      3.1.1. Task Accuracy
      3.1.2. Task Speed
      3.1.3. Error Analysis
      3.1.4. Reductions in Variability
   3.2. Workload
      3.2.1. NASA-TLX Workload Ratings
      3.2.2. Secondary Task Analysis
   3.3. Usability Ratings
4. Discussion
   4.1. Process Variation Reduction
   4.2. Limitations of the Present Study
   4.3. Directions for Future Research
5. Conclusions
   5.1. Key Points
References

FIGURES

Figure 1. Grammatical Reasoning Task Examples
Figure 2. Experimental and Control Group Sorting Techniques
Figure 3. Experimental and Control Group Data Entry Forms for Acceptable Tiles
Figure 4. Mean NASA-TLX Subscale Scores in the Experimental and Control Groups

TABLES

Table 1. Primary Task Design for the Experimental and Control Groups
Table 2. Acceptable Tile Accuracy Results
Table 3. Rejectable Tile Accuracy Results
Table 4. Control Group Errors and Experimental Group Mitigations
Table 5. Experimental and Control Group Standard Deviations
Table 6. Experimental and Control Group Usability Ratings

NOMENCLATURE

Abbreviation Definition
CI confidence interval
dBA decibel, A scale
F false
GSA General Services Administration
ID identification
M mean
min minutes
NASA-TLX National Aeronautics and Space Administration – Task Load Index
SD standard deviation
sec seconds
SNL Sandia National Laboratories
T true

1. INTRODUCTION
It is well established that the discipline of human factors provides numerous benefits
throughout the product lifecycle (Bailey, 1993; Bruseberg, 2008; Burgess-Limerick,
Cotea, & Pietrzak, 2010; Hendrick, 1996; Hendrick, 2008; Rouse, Kober, & Mavor,
1997; Sager & Grier, 2005; Shaver & Braun, 2008; Yousefi & Yousefi, 2011). Such
benefits have been demonstrated through positive examples, which highlight the value
of including human factors engineers early and often throughout the lifecycle; and
negative examples, which underscore the adverse consequences of neglecting human
factors (Burgess-Limerick, Cotea, & Pietrzak, 2010). Prominent success stories include
the Comanche helicopter acquisition program, maintainability of the F119 engine for
the F-22 Raptor, C-141 cargo plane development, and the F/A-18 Hornet (Burgess-
Limerick, Cotea, & Pietrzak, 2010; Sager & Grier, 2005). Well-known historical
failures include the Three Mile Island nuclear reactor accident in 1979, the Titan II
missile explosion in 1980, the Bhopal chemical leak in 1984, the Chernobyl accident in
1986, and the grounding of the Royal Majesty cruise ship in 1995 (Burgess-Limerick,
Cotea, & Pietrzak, 2010; See, 2017).
Demonstrated benefits encompass both the design process itself and the subsequent
operations and maintenance phase of the lifecycle. In design, incorporating human factors engineering can reduce product development time and costs by as much as 50% (Sager & Grier, 2005). As just one example, Bailey (1993)
demonstrated that interfaces developed by human factors experts had fewer design
errors after a single iteration as compared to the same interfaces developed by
programmers after three to five iterations. In sum, the extra time, labor, and costs for
programmers to develop an effective and usable product could have been saved by
incorporating human factors experts from the beginning.
In the operations and maintenance phase, human factors involvement increases advantageous outcomes and reduces detrimental ones. Advantages include improved safety,
effectiveness, efficiency, productivity, and operator satisfaction (Burgess-Limerick,
Cotea, & Pietrzak, 2010; Sager & Grier, 2005). Positive reductions include decreased
training time and costs, accidents, error rates, maintenance costs, and equipment damage
(Shaver & Braun, 2008). One particularly noteworthy positive reduction involves a
decrease in the number of errors requiring resolution during operations and
maintenance, attributable explicitly to investing in human factors early in design. First,
errors can become 30 to 1500 times costlier to correct in operations and maintenance as
compared to early design phases (Stecklein, Dabney, Dick, Haskins, Lovell, &
Moroney, 2004). Second, most of the costs during operations and maintenance, as much
as 67%, stem from modifications to resolve operator dissatisfaction with the original
system (Rauterberg & Strohm, 1992). Ultimately, a large portion of detrimental impacts
typically associated with error cost escalation could be avoided with proper attention to
human factors during design.
Some benefits such as productivity can be expressed in quantitative cost savings; other
less tangible benefits such as improved operator attitudes and enhanced safety may be
difficult to measure and quantify, but nevertheless have a positive influence. In a review
of 24 diverse human factors projects, Hendrick (2008) concluded that human factors typically has a direct cost-benefit ratio of approximately 1:10 or better, with a typical payback period
of 6 to 24 months. He further inferred that earlier incorporation of human factors
translates into even lower costs and greater benefits.
Despite a proven record of success, human factors practitioners continue to face
challenges convincing personnel outside their field. As Sager and Grier (2005) stated,
“inadequate consideration of human factors engineering issues is a familiar
problem…issues are not expressly considered, they are considered but their importance
is underestimated, or they are considered too late in the design process” (p. 1). Hence,
there is an ongoing need for evidence documenting the benefits of human factors.

1.1. Reactive Case Studies


Much of this ongoing evidence is derived from reactive case studies that begin with
fielded systems exhibiting signs of trouble. Specifically, such studies represent an
instantiation of operations and maintenance phase errors stemming from failure to
properly include human factors during development. As one example of this approach,
Sen and Yeow (2003) analyzed an existing electronic motherboard that suffered from
70% rejects and poor quality, leading to low productivity and an overall loss at the
factory. Their review indicated that the motherboard design created problems for
operators during manual soldering of certain components and required considerable
unproductive manual cleaning. When the motherboard was redesigned to correct the
poor design issues, manual soldering was replaced by machine soldering and the need
for manual cleaning was significantly reduced. The time to fabricate a manufacturing
lot was reduced from six to two shifts. Human factors interventions eliminated rejects,
reduced repairs, recovered lost business, and saved the factory approximately $582k per
year. With the redesigned motherboard, the job also became less boring and presented
fewer ergonomic issues for operators. In effect, Sen and Yeow (2003) demonstrated that
human factors is an investment, not an expenditure. Benefits exceeded costs by nearly
246 times.
In a similar reactive study, Yeow and Sen (2004) investigated a failing visual inspection
process at a printed circuit assembly factory that caused an annual rejection cost of
nearly $300k, poor quality, customer dissatisfaction, and operator occupational safety
and health issues. Human factors interventions resolved three primary issues
encompassing operator eye problems, insufficient time for 100% inspection involving
an average of 7.5 components per second, and ineffective visual inspection processes.
Specifically, magnifying glass usage was minimized, inspection template glare was
diminished by changing the template angle and substituting a less reflective material,
steps requiring visual inspection were eliminated wherever functional tests were
available, and the random search pattern used during visual inspection was replaced
with systematic scanning techniques. Human factors interventions resolved all three
operator issues, reduced customer site defects by 2.5%, improved customer satisfaction,
and saved the factory over $250k per year.

1.2. Controlled Experiments


While reactive case studies are informative, they lack the rigor of controlled experiments and cannot support definitive conclusions about causality. Unlike field studies, controlled experiments permit precise manipulation of independent variables, which allows cause-and-effect relationships to be established, as well as control of extraneous variables that could otherwise bias results. To be sure, some researchers have conducted controlled experiments comparing
“old designs” (without human factors involvement) and “new designs” (with human
factors involvement) to provide additional confidence in reactive studies demonstrating
the value of human factors.
Walkenstein and Eisenberg (1996) conducted an experimental comparison of old and
new designs of a computer-telephony product. The original version had been developed
without human factors engineering involvement. Subsequently, human factors
practitioners were asked to redesign the product late in the development cycle, under
time constraints, and with limitations imposed on the amount and types of changes that
could be made. Twenty-three target customers used either the old design or the new
design to complete a series of 13 tasks. Results indicated that 9 of 15 features were
significantly easier to use in the new design, and overall ease of use was rated more
highly for the new design. Participants working with the new design were able to
successfully complete 11 of 13 assigned tasks, whereas participants working with the
old design could complete only 8 of 13 tasks. The authors concluded that the direct
involvement of human factors engineers led to substantial improvements in the user
interface, despite the issues associated with involvement late in the development cycle.
Similarly, two other studies used controlled experiments to demonstrate the value of
human factors in medicine. Lin, Isla, Doniz, Harkness, Vicente, and Doyle (1995)
compared the original design for a patient-controlled analgesia machine interface and a
redesigned interface guided by a cognitive task analysis as well as a set of human factors
design principles. Twenty-four novice users who programmed three different sets of
doctors’ orders into both the old and new machines demonstrated faster performance,
fewer errors, and lower workload with the new interface. Russ et al. (2014)
demonstrated that applying human factors principles for the redesign of a medication
alert interface improved performance time and usability, while reducing prescriber
workload and prescribing errors.

1.3. Research Gaps


While some studies employed controlled experiments, their bases were still rooted in
reactive investigation of existing flawed systems designed with little or no human
factors involvement. The author was unable to locate any study that used a truly bottom-
up approach, starting with a novel design created simultaneously with and without
incorporation of human factors principles, followed by a controlled experimental
comparison of the two designs. Further, existing research of this nature has focused
primarily on product and interface design as opposed to process design. Here, too, the
author was unable to locate any study focusing on explicit demonstration within a
controlled experiment of the value of human factors for process execution. Finally, even
experimental comparisons of old and new designs have been scarce since the 1990s.
Most current human factors research implicitly attests to the value of human factors, but demonstrating that value is not typically an explicit goal. The purpose of the present study was to fill these gaps
and revitalize the message regarding the benefits of human factors involvement for a
new generation of designers.

1.4. Objectives of the Present Study
The primary goal of the present study was to conduct a controlled experiment to
demonstrate the value of human factors for process design. Toward that end, a visual
inspection task simulating a typical receipt inspection process was designed. The
process was designed with adherence to common human factors principles
(experimental group) and without (control group). Impacts of incorporating human
factors into process design were evaluated in terms of performance accuracy and speed,
workload, and ratings of process usability. It was hypothesized that the experimental group
would exhibit more accurate performance, faster task completion time, reduced
workload, and higher usability ratings as compared to the control group.

2. METHODOLOGY
This research complied with the American Psychological Association Code of Ethics
and was approved by the Institutional Review Board at SNL (ID# SNL000155).
Informed consent was obtained from each participant.

2.1. Participants
Twenty-five SNL employees volunteered to participate in the experiment in response to
an advertisement published in the electronic Sandia Daily News. No special knowledge,
skills, or previous experience were required to participate. However, participants had to
meet a criterion of 70% correct on the secondary task used in the experiment in order to
continue. As a result, one individual who volunteered was dismissed after the practice
session and replaced with the next volunteer. The 24 participants (14 males) who
completed the experiment ranged in age from their twenties to their sixties, with 42%
of participants in their thirties.

2.2. Design
A between-groups design with two groups (N = 12 per group) was used. The
experimental group task was designed to conform to human factors principles, whereas
the control group task was designed without consideration of human factors principles.

2.3. Primary Task Design


The primary task simulated a receipt inspection process wherein a lot of vendor parts is
visually inspected to accept quality parts and remove flawed items, based on pre-defined
defects that might impact functionality during subsequent operations. As documented
in numerous empirical investigations, visual inspection is a difficult task susceptible to
human error if not designed in accordance with human factors principles and research
lessons learned (Drury & Watson, 2002; See, 2012).
To eliminate requirements for prior experience and minimize participant training, parts
for inspection in the present study consisted of 350 tiles from the Hasbro, Inc. Scrabble
game. The criterion for acceptability was based on a single tile feature. Namely,
acceptable parts contained any one of six different Roman characters, and rejectable
parts contained any one of four different Cyrillic characters. The inspector’s task was to
sort the tiles, categorize them by acceptability and letter type, determine the quantities
of each letter type, and calculate vendor fees. Fees were calculated using the number on
each tile to represent its dollar value. The product of value and quantity equaled the total
dollar amount for each letter type, constituting either an amount to pay (acceptable parts)
or charge (rejectable parts) the vendor. To simulate the low defect rates typically seen
in visual inspection, only 15 tiles (4%) were rejectable (See, 2012).
Tasks were designed with and without adherence to general human factors and specific
visual inspection principles (Table 1). In brief, the experimental group task conformed
to the principles of user-centered design, whose goal is to develop usable and
understandable products and processes based on user needs and interests (Norman,
1988). Accordingly, the experimental group task was structured to accommodate the
range of participant physical dimensions and preferences and to facilitate all components of the inspection process—sorting and counting tiles and calculating fees—
to maximize accuracy, speed, and usability and minimize workload.
The control group task design provided the minimum tools necessary to complete the
process, without consideration of usability or user preferences. While designing the
control group task, every effort was made to avoid intentionally exaggerating task
difficulty (i.e., to prevent artificially biasing outcomes in favor of the experimental
group). Toward that end, the process design for the control group was grounded in issues
commonly reported in the research literature (Table 1). In reality, the greater difficulty
was refraining from incorporating human factors principles in the control group task.
For example, the experimenter had to overcome a natural inclination to format the
control group work instruction for readability and ease of use and to supply well-
designed ergonomic tools for task completion (e.g., calculator).
Application of human factors principles for the experimental group (or lack thereof for
the control group) was confirmed through independent heuristic evaluations of each
design. For example, the control group heuristic evaluator recommended providing
sorting bins to facilitate counting and a calculator with larger buttons and a larger
display for usability. Design of the experimental group task effectively eliminated or
resolved these issues. Specifically, the experimental group heuristic evaluator
highlighted the benefits of workspace customization; sorting trays with divided,
redundantly coded slots that each held five tiles; pictorial labels on each sorting tray;
and electronic spreadsheet design (preloaded with tile pictures and values, formatted
with variable shading to support row scanning, and designed to provide automatic
calculations).

2.4. Procedure
The experiment occurred in a private enclosed office at SNL. Two side-by-side sit-stand
workstations provided a large adjustable workspace for task completion. Light levels
during the experiment, achieved via overhead LED lights as well as natural outdoor
lighting from the office window, were 430 lux. All sessions were conducted weekdays
between 8:30 a.m. and 3:30 p.m. Given the environments in which visual inspection
may occur in the field, no attempt was made to minimize surrounding office noise. Each
participant individually completed a study session lasting approximately 1.3 hours.

2.4.1. Orientation and Practice


Upon reporting for the experiment, participants were randomly assigned to the
experimental or control group, with an equal number of participants (N = 12) in each
group. The experimenter briefly described the activities required during the session and
gave participants time to review and sign the informed consent form (which had
previously been e-mailed to help volunteers decide whether to participate). Afterwards,
participants had an opportunity to practice each of the elements of the experiment: the
NASA-TLX technique for workload ratings, the secondary grammatical reasoning task,
and the primary visual inspection task.

Table 1. Primary Task Design for the Experimental and Control Groups

Configure Workspace
  Control: could use any available table space for work area, but no other customization options were offered
  Experimental: customized workspace for standing/sitting, table/chair heights, tool placement, and laptop/tablet mode
  Issue/Principle:
  • Configurability adheres to principle of fitting the job to worker human body dimensions for fit and reach (Kroemer & Grandjean, 1997)
  • Workspace flexibility supports usability/efficiency (Nielsen, 1995)

Follow Work Instruction
  Control: minimal formatting; broad text definitions of inspection criteria; no photos of acceptable/rejectable tiles
  Experimental: comprehensive formatting, with steps in blocks or fields; precise text and photo definitions of inspection criteria
  Issue/Principle:
  • Formatted instructions with structured blocks of information reduce cognitive effort (Luna, Sturdivant, & McKay, 1988)
  • Clear defect definitions and standards for comparison improve visual inspection (See, 2012)

Sort Tiles
  Control: process individually determined (no specific work instruction guidance); no sorting bins
  Experimental: process informed by work instruction and setup; labeled, numbered, and color-coded sorting bins provided
  Issue/Principle:
  • Procedure ambiguity increases human error (Matzen, 2009)
  • Redundant coding reduces errors (Konz & Johnson, 2008)
  • Labels adhere to principles for visual display of static information (Sanders & McCormick, 1993)

Count Tiles
  Control: counting process individually determined
  Experimental: numbered slots in sorting bins facilitated counting
  Issue/Principle:
  • Poor equipment increases process variation and human error (Drury & Watson, 2002)
  • Proper apparatus is critical to improve inspection (See, 2012)

Enter Quantities
  Control: paper form to manually record letter types, tile values, and quantities
  Experimental: electronic form pre-populated with date, lot number, letter type (listed alphabetically), and tile value; usability formatting (Figure 3)
  Issue/Principle:
  • Paper forms reduce legibility and present more opportunities for human error (GSA, 1993)
  • Row scanning cues, alphabetical ordering, and automatic generation of routine/static data reduce errors (Smith & Mosier, 1986)

Compute Totals
  Control: handheld calculator provided for manual calculations
  Experimental: automatic calculations built into electronic form
  Issue/Principle:
  • Poor tool ergonomics increases errors, dissatisfaction, and completion time (Sanders & McCormick, 1993)
  • Automatic calculations adhere to error prevention principle for usable interfaces (Nielsen, 1995)

2.4.1.1. NASA-TLX Practice
First, participants practiced the NASA-TLX workload rating scale (Hart & Staveland,
1988). The NASA-TLX provides an overall workload score based on a weighted
average of six subscale ratings (mental demand, physical demand, temporal demand,
performance, effort, and frustration). Weightings are achieved by presenting 15 pairwise
comparisons and asking participants to choose which subscale in each pair was more
important to task workload. The number of times each subscale is selected provides a
weighting to compute an overall workload score. Both subscale scores and overall
weighted workload scores range from 0 to 100, with higher scores representing greater
workload. After the experimenter described the NASA-TLX, participants practiced
using the rating scale to rate the workload of a simple task (sorting an ordinary deck of
cards based on suit).
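
To make the weighting procedure concrete, the minimal sketch below computes a global weighted workload score from six subscale ratings and the pairwise-comparison tallies. The variable names and example values are illustrative assumptions, not data from this study.

```python
# Illustrative NASA-TLX global workload computation (hypothetical example values).
SUBSCALES = ["mental", "physical", "temporal", "performance", "effort", "frustration"]

# Raw subscale ratings on the 0-100 scale.
ratings = {"mental": 40, "physical": 10, "temporal": 35,
           "performance": 20, "effort": 45, "frustration": 15}

# Number of times each subscale was chosen across the 15 pairwise comparisons;
# the six tallies always sum to 15.
weights = {"mental": 5, "physical": 0, "temporal": 3,
           "performance": 2, "effort": 4, "frustration": 1}
assert sum(weights.values()) == 15

# Global workload is the weighted average of the subscale ratings (also 0-100).
global_workload = sum(ratings[s] * weights[s] for s in SUBSCALES) / 15
print(f"Global weighted workload: {global_workload:.1f}")  # 36.0 for these values
```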

2.4.1.2. Secondary Task Practice


Second, the experimenter described the secondary task that was designed to provide
another indicator of primary task demand to supplement the NASA-TLX workload
ratings. The task was modeled after Baddeley’s (1968) paper-and-pencil grammatical reasoning test, adapted for electronic implementation. The grammatical reasoning task has
previously been used as a secondary task due to its sensitivity to multiple stressors. This
type of secondary task might exhibit workload impacts by depleting cognitive resources
like those used in the primary task to process alphanumeric stimuli. It further simulates
the disruptive nature of interruptions commonly encountered during visual inspection
in the field (Drury, 1985). This simple reasoning test involves processing short
sentences of various levels of complexity for veracity. In brief, each stimulus consists
of a pair of letters, followed by a sentence that varies in terms of four dimensions:
positive/negative, active/passive, precede/follow, and true/false (Figure 1).

BA  “A precedes B”
BA  “A follows B”
AB  “A is not followed by B”
AB  “B is preceded by A”

Figure 1. Grammatical Reasoning Task Examples
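
To make the statement logic concrete, the sketch below shows one way such items could be scored. The is_true helper is hypothetical and written for illustration only; it is not the software used in the study.

```python
def is_true(pair: str, subject: str, relation: str, negated: bool, obj: str) -> bool:
    """Evaluate a grammatical reasoning statement against a two-letter pair.

    relation is 'precedes' or 'follows'; passive statements reduce to an active
    equivalent (e.g., "B is preceded by A" makes the same claim as "A precedes B").
    """
    comes_first = pair.index(subject) < pair.index(obj)
    result = comes_first if relation == "precedes" else not comes_first
    return (not result) if negated else result

# The four examples from Figure 1, rewritten in active form:
print(is_true("BA", "A", "precedes", False, "B"))  # "A precedes B" for BA -> False
print(is_true("BA", "A", "follows", False, "B"))   # "A follows B" for BA -> True
print(is_true("AB", "B", "follows", True, "A"))    # "A is not followed by B" for AB -> False
print(is_true("AB", "A", "precedes", False, "B"))  # "B is preceded by A" for AB -> True
```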

Participants first practiced the grammatical reasoning task with five paper examples,
circling T or F to indicate whether each statement was true or false. The majority of
participants (88%) answered all five examples correctly; the remaining three
participants missed only one example. The experimenter reviewed response accuracy
with participants before directing them to a computer practice session. For the electronic
practice session, 10 grammatical reasoning problems with feedback were presented.
Each stimulus remained on the screen until participants pressed either the T key (true)
or the F key (false). Given that the grammatical reasoning task was designed to run in
the background while participants performed the primary visual inspection task, a one-
second 57 dBA auditory stimulus served as a warning signal to indicate that a grammatical reasoning problem was on the computer screen awaiting response. During
practice, the auditory alert preceded each stimulus to prepare participants for its
occurrence during the primary task. Participants had up to three opportunities to
complete the electronic practice session and reach a criterion of 70% correct by the final
attempt. All participants achieved at least 90% accuracy. Two participants required two
attempts, and two participants required three attempts.

2.4.1.3. Primary Visual Inspection Task Practice


Third, the final practice session was devoted to the primary visual inspection task. The
experimenter described the purpose and importance of a generic receipt inspection task,
referencing real-world examples relevant to the SNL mission. The experimenter then
reviewed the work instruction. Participants were informed that Scrabble tiles were used
for convenience to minimize training, but advised to regard the task with a level of rigor
appropriate for the SNL mission. They were also informed that both speed and accuracy
were important for the primary task. After any questions were addressed, participants
completed two levels of practice for the primary task. For the first level of practice,
participants verbally provided an inspection decision for each of five tiles and stated the
tile values. All participants achieved 100% accuracy on the first level of practice. For
the second level of practice, participants completed the entire end-to-end inspection
process (sort, count, and derive total fees) with a small set of 20 tiles. The structure of
this second level of practice was congruent with the experimental and control group
process designs for the main task (e.g., control group participants completed the practice
without sorting bins, and they used paper versus electronic forms). The majority of
participants (92%) achieved 100% accuracy in the second practice session. Two
participants each made a single error recording acceptable tile quantities, which led to
shortages of $1 for the acceptable tile total dollar amount. The experimenter reviewed
these errors with participants before continuing to the main task.

2.4.2. Task Implementation


Before starting the primary task, experimental group participants had an opportunity to
customize the workspace. Customization included the ability to change the heights of
the sit-stand tables to support a preference for sitting or standing during the task, arrange
the sorting and counting trays, and set up the work area with respect to proximity to the
computer presenting the secondary task. Experimental group participants also had the
option to configure the Surface Pro 4 used to complete the electronic spreadsheet in
either laptop mode or tablet mode. Control group participants were informed they could
use any of the available workspace on the two sit-stand workstations to complete the
task, but were not offered customization options.
All participants received a single bin containing the lot of tiles for inspection in a
random arrangement. At that time, they were informed that the entire task would take approximately 20 minutes. This benchmark was based on pilot testing and designed to
amplify the demand of what proved to be a relatively simple primary task. During the
primary visual inspection task, participants periodically responded to the secondary
grammatical reasoning task when they heard the auditory alert. They were advised to
respond quickly and accurately, without detracting from the primary task. Both reaction time and accuracy on the grammatical reasoning task were recorded for analysis. Stimuli
were presented for the duration of the primary visual inspection task at random intervals
of either 30 or 45 seconds. The experimenter remained in the room during task
completion to document observations for post-session interviews.
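
The pacing just described can be summarized in a brief scheduling sketch. The report does not describe the presentation software, so the schedule_secondary_task helper and all implementation details below are assumptions for illustration only.

```python
import random

def schedule_secondary_task(total_duration_sec: int):
    """Yield elapsed times (in seconds) at which a grammatical reasoning stimulus
    would appear, using random inter-stimulus intervals of 30 or 45 seconds."""
    elapsed = 0
    while True:
        elapsed += random.choice([30, 45])
        if elapsed >= total_duration_sec:
            return
        yield elapsed

# Example: presentation times across a 25-minute primary task; each presentation
# is signaled by the one-second auditory alert described above.
for t in schedule_secondary_task(25 * 60):
    print(f"Stimulus at {t} s")
```
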
Most of the task time was consumed by sorting and categorizing the tiles. Experimental
group participants used trays to sort and categorize tiles. Each of the six acceptable letter
types had its own labeled sorting tray, with numbered and color-coded slots holding
exactly five tiles each. A separate tray contained four labeled holding fixtures, one for
each rejectable letter type. The wooden racks included in Scrabble games to hold tiles
were used in the experiment as holding fixtures for rejectable tiles. The wooden racks
were labeled with numbered slots to facilitate subsequent counting of tile quantities.
Control group participants received only the bin containing the lot of tiles to be
inspected. They developed their own individual methods to sort and categorize tiles
within the available workspace. Techniques included sorting tiles into rows, columns,
piles, or vertical stacks according to letter type. Figure 2 illustrates sorting techniques
used for acceptable tiles in the experimental and control groups (experimental group
trays for rejectable tiles are not shown).

[Photographs: Experimental Group; Control Group]

Figure 2. Experimental and Control Group Sorting Techniques

Following sorting and categorization, participants calculated vendor fees. Experimental group participants used electronic forms pre-populated with letter type and tile values
as well as built-in formulas to automatically calculate totals, based on tile quantity
entries. Control group participants manually recorded letter types, tile values, and
quantities on paper forms and computed totals with a handheld calculator. Three primary
elements on the forms (Figure 3) supported subsequent analyses of task accuracy: tile
values, quantities, and dollar amounts.
After finishing the primary task, participants completed a survey requesting basic
background information as well as inspection task usability and NASA-TLX workload
ratings. The usability scale was adapted from Lewis’ (1995) after-scenario usability questionnaire to address inspection task (1) ease of completion, (2) amount of time, and
(3) task work instructions. A seven-point rating scale for each item ranged from Strongly
Disagree (1) to Strongly Agree (7). Before concluding the session, the experimenter
interviewed participants to gain insight into their thought processes throughout the
experiment, collect subjective descriptions of any errors that occurred, and discuss
experimenter observations.

[Form images: Experimental Group; Control Group]

Figure 3. Experimental and Control Group Data Entry Forms for Acceptable Tiles

3. RESULTS
Incorporating human factors in process design led to superior performance accuracy,
lower workload, and more favorable usability ratings in the experimental group as
compared to the control group. The experimental group process design promoted more
uniform task approaches among participants, effectively reducing process variation and
mitigating or eliminating errors observed in the control group.

3.1. Task Accuracy and Speed

3.1.1. Task Accuracy


Accuracy was addressed by examining errors recording tile values, quantities, and dollar
amounts. For acceptable tiles, all experimental group and control group participants
correctly recorded all tile values. However, differences emerged between the two groups
in terms of accuracy for quantities and dollar amounts (Table 2). Control group
participants either miscounted acceptable tiles or mistakenly categorized a rejectable
letter type as acceptable. These quantity errors led to underpayments ranging from $44 to $98 and overpayments of up to $24. The single experimental group error resulted from a simple miscount, which led to a $5 overpayment. None of the differences in
accuracy was statistically significant.

Table 2. Acceptable Tile Accuracy Results

Dependent Variable | Incorrect Responses (Experimental) | Incorrect Responses (Control) | Statistical Significance
Acceptable Tile Values | 0 | 0 | Dependent variable is a constant; no statistics computed
Acceptable Quantities (a) | 1 | 3 | p = .295, Fisher’s exact test, one-tailed
Acceptable Dollar Amounts | 1 | 5 | p = .077, Fisher’s exact test, one-tailed

(a) Signal detection theory analysis of the present data using hits (percentage of flawed tiles correctly rejected) and false alarms (percentage of acceptable tiles incorrectly rejected) was not possible due to the absence of false alarms in the experimental group. No experimental group participant incorrectly rejected an acceptable tile, but one participant did err in recording quantities of correctly categorized tiles. Therefore, tile quantities were analyzed instead.
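
The one-tailed p-values reported in Tables 2 and 3 follow from the counts of participants with incorrect responses (12 participants per group), assuming each participant is classified as correct or incorrect. The sketch below shows one way to reproduce the acceptable quantities result with SciPy; the 2 x 2 table layout is an illustration, not code from the study.

```python
from scipy.stats import fisher_exact

# Rows: experimental, control. Columns: participants with an incorrect response
# vs. participants with no incorrect responses (12 per group).
# Counts taken from Table 2, acceptable quantities: 1 vs. 3 incorrect.
table = [[1, 12 - 1],
         [3, 12 - 3]]

# One-tailed test of whether incorrect responses were less likely in the experimental group.
odds_ratio, p_value = fisher_exact(table, alternative="less")
print(f"p = {p_value:.3f}")  # ~0.295, matching the value reported in Table 2
```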

For rejectable tiles, there were accuracy differences between the two groups for tile
values, quantities, and dollar amounts (Table 3). In all instances, errors were confined
to the control group. Value errors occurred when value and quantity entries for a single
tile were transposed on the paper form. Rejectable quantities were all under-recorded by
1 to 3 tiles due to the transposition error, misclassifying one rejectable tile type as
acceptable, and miscounting tiles. All four dollar amount errors consisted of
undercharging, ranging from $2 to $24. Differences in accuracy for quantities and dollar
amounts were statistically significant.

Table 3. Rejectable Tile Accuracy Results

Dependent Variable | Incorrect Responses (Experimental) | Incorrect Responses (Control) | Statistical Significance
Rejectable Tile Values | 0 | 1 | p = .500, Fisher’s exact test, one-tailed
Rejectable Quantities | 0 | 4 | p = .047, Fisher’s exact test, one-tailed
Rejectable Dollar Amounts | 0 | 4 | p = .047, Fisher’s exact test, one-tailed

3.1.2. Task Speed


With respect to task completion time, the experimental group averaged 25 minutes
(SD = 6), while the control group averaged 26 minutes (SD = 3). This difference was
not statistically significant, F(1, 22) = .222, p = .642, 95% CI of the difference [-3, 5],
d = .21.
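
For reference, the two-group ANOVA F statistic and Cohen's d can be approximated from summary statistics alone; a sketch follows. Because the means and standard deviations reported above are rounded, the computed F only approximates the reported F(1, 22) = .222, while d matches at .21.

```python
import math

def two_group_f_and_d(m1, sd1, m2, sd2, n):
    """Approximate the one-way ANOVA F statistic and Cohen's d for two groups
    of equal size n from summary statistics."""
    pooled_var = (sd1 ** 2 + sd2 ** 2) / 2         # pooled variance (equal n)
    t = (m1 - m2) / math.sqrt(pooled_var * 2 / n)  # independent-samples t
    d = abs(m1 - m2) / math.sqrt(pooled_var)       # Cohen's d
    return t ** 2, d                               # F(1, 2n - 2) = t**2

# Task completion time: control 26 min (SD = 3) vs. experimental 25 min (SD = 6).
f_stat, d = two_group_f_and_d(26, 3, 25, 6, 12)
print(f"F = {f_stat:.3f}, d = {d:.2f}")
```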

3.1.3. Error Analysis


Accuracy differences can also be understood by analyzing the specific types of errors
that occurred. Eleven different types of errors were observed in the present study, all of
which occurred in the control group (Table 4). The experimental group process design
mitigated or prevented each error type. Apart from miscounts, all errors were prevented
in the experimental group. The experimental group setup minimized miscounting, as
evidenced by an overall reduction in quantity errors from seven (control group) to one
(experimental group), but did not prevent miscounting altogether, given the occurrence
of a single miscount of the acceptable tiles in the experimental group.
The failure types in Table 4 can further be interpreted with respect to Reason’s (1990)
four categories of errors: slips, lapses, mistakes, and violations. In the present study,
violations or intentional inappropriate acts were not observed; however, the remaining
three error types did occur, primarily in the control group. For example, slips took the
form of incorrectly recording a calculated dollar amount of $49 as $4 on paper. Slips
occur when an action is taken, but it is not the action the individual intended. Forgetting
to record the date and lot number on the control group paper form is an example of a
lapse, wherein individuals forget to perform an activity they meant to accomplish.
Incorrect categorizations represent mistakes—people perform the act they intended, but
the act itself is inappropriate. In particular, some control group participants incorrectly
categorized a rejectable tile type as acceptable, due to work instruction ambiguity. The
experimental group process design successfully mitigated or prevented the three
categories of errors that occurred in the control group.

Table 4. Control Group Errors and Experimental Group Mitigations

Observed Error | Control Group Instantiation | Experimental Group Mitigation
Miscounts | Tiles placed into piles or groupings prone to miscounting | Sorting trays contained multiple slots that each accommodated five tiles to minimize miscounting
Incorrect Categorizations | Rejectable tiles incorrectly categorized as acceptable | Work instruction and sorting trays contained photos of acceptable and rejectable tile types
Incorrect Tile Values | Tile values entered incorrectly on paper form | Electronic spreadsheet was pre-populated with static information such as tile values
Incorrect Calculations | Dollar amounts calculated and recorded incorrectly on paper form | Electronic spreadsheet automatically calculated dollar amounts
Overturned Tiles | Stacks of sorted tiles bumped during inspection of remaining tiles | Sorting trays contained inspected tiles separate from unsorted tiles
Missing Entries | Study ID, date, and lot number fields left blank | Electronic spreadsheet was pre-populated with this information
Scratchouts | Incorrect entries scratched out or overwritten | Changes made in the electronic form replaced existing entries
Handwriting | Handwriting sometimes ambiguous and open to interpretation | Electronic spreadsheet used only legible typewritten entries
Space Allocation | Amount of space required to sort 350 tiles not well planned | Sorting trays accommodated all 350 tiles and required a finite, identifiable amount of table space
Re-Counting | Tiles arranged in configurations that did not support possible need for re-counting | Sorting trays used numbered and divided slots to facilitate re-counting and verification of counts
End State | End state not conducive to transfer for follow-on work (tiles in various types of groupings on the table) | Sorting trays also served as a convenient mechanism to transfer tiles for next level of work

3.1.4. Reductions in Variability


Impacts of human factors interventions in the experimental group can additionally be
considered in terms of reductions in variability. First, the experimental group exhibited
a smaller number and variety of errors during task completion as compared to the control
group (refer to Table 4). The experimental group work instruction and tools prompted
a more consistent, uniform approach and mitigated or eliminated the types of errors
observed in the control group. Second, standard deviations were lower in the
experimental group for 11 of 13 key dependent variables, and 7 of the differences were
statistically significant (Table 5). Reduced standard deviations signify the process design minimized individual differences that contribute to process variation and hinder
consistency in manufacturing.

Table 5. Experimental and Control Group Standard Deviations

Dependent Variable | Experimental Mean (SD) | Control Mean (SD)
Acceptable Quantity Recorded* | 335.08 (.29) | 335.50 (1.0)
Acceptable Dollar Amount Recorded* | $433.42 ($1.44) | $419.25 ($37.27)
Acceptable Percent Correct* | 100.0% (0.00%) | 99.9% (.29%)
Rejectable Quantity Recorded* | 15.00 (0.00) | 14.33 (1.16)
Rejectable Dollar Amount Recorded* | $58.00 ($0.00) | $53.00 ($9.32)
Rejectable Percent Correct* | 100.0% (0.0%) | 95.6% (7.7%)
Task Duration | 25 min (6 min) | 26 min (3 min)
NASA-TLX Global Workload* | 14.9 (7.9) | 26.4 (14.2)
Grammatical Reasoning Task Accuracy | 94.1% (6.1%) | 94.1% (5.7%)
Grammatical Reasoning Task Reaction Time | 7.5 sec (3.9 sec) | 20.1 sec (38.6 sec)
Ease of Completion Usability Rating | 6.7 (.65) | 6.0 (1.09)
Amount of Time Usability Rating | 6.3 (.62) | 5.8 (1.03)
Work Instructions Usability Rating | 6.8 (.39) | 6.7 (.65)

*p < .05, Levene’s test for equality of variances
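
The Levene's tests flagged in Table 5 operate on the raw per-participant values, which are not reproduced in this report. The sketch below shows the form of the analysis with hypothetical placeholder data, not the study data.

```python
from scipy.stats import levene

# Hypothetical per-participant values for one dependent variable (12 per group);
# the actual analysis would use the recorded data summarized in Table 5.
experimental = [15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15]
control = [15, 15, 14, 15, 15, 13, 15, 15, 14, 15, 12, 15]

# Levene's test for equality of variances, centered on the group means.
statistic, p_value = levene(experimental, control, center="mean")
print(f"W = {statistic:.2f}, p = {p_value:.4f}")
```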

3.2. Workload

3.2.1. NASA-TLX Workload Ratings


The average global weighted NASA-TLX score was 14.9 (SD = 7.9) in the experimental
group and 26.4 (SD = 14.2) in the control group. The mean difference was statistically
significant, F(1, 22) = 6.03, p = .022, 95% CI of the difference [1.6, 21.4], d = 1.0.
Participant comments at the end of the task revealed that two primary challenges led to
increased task demand in the control group—devising an efficient sorting and counting
method (while striving to meet the 20-minute completion time) and performing manual
counts and calculations. Three control group participants indicated that sorting tiles
without bins and using a calculator with small buttons also added to the demand. Indeed,
as shown in Figure 4, ratings for all six NASA-TLX subscales were higher in the control
group as compared to the experimental group. However, one-way ANOVAs of each
workload dimension indicated that none of the differences was statistically significant
(p > .05).

[Bar chart: mean rating (0 to 50 shown) for each NASA-TLX subscale (Mental, Physical, Temporal, Performance, Effort, Frustration), Experimental vs. Control]

Figure 4. Mean NASA-TLX Subscale Scores in the Experimental and Control Groups. Error bars represent standard errors.

3.2.2. Secondary Task Analysis


Unlike the NASA-TLX global workload ratings, the grammatical reasoning task did not
exhibit any statistically significant differences between the experimental and control
groups (p > .05). In terms of accuracy, both groups averaged 94% correct. While the
average reaction time appeared much lower in the experimental group (M = 7.5 sec,
SD = 3.9 sec) as compared to the control group (M = 20.1 sec, SD = 38.6 sec), this
disparity was largely driven by one control group individual who reported refraining
from responding to the secondary task until each immediate primary task action was
thoroughly complete. Excluding this individual from the analysis reduced average
reaction time to 9.0 seconds (SD = 2.7 sec), but did not alter statistical significance.
End-of-session comments suggest that many participants viewed the grammatical
reasoning task as a mildly distracting but welcome diversion from the tedium of sorting
tiles. They surmised that the grammatical reasoning task prolonged inspection task
completion time, but enjoyed periodically addressing a different type of challenge. As
one participant stated, “If I had to do this for eight hours, I might long for the ping [of
the auditory alert].” Another participant claimed “it was nice to have a break between
each task [segment on the visual inspection task].”

3.3. Usability Ratings


In accordance with the approach used by Walkenstein and Eisenberg (1996), a criterion
was established specifying the process would be deemed usable if 80% of participants
provided ratings of 6 or 7 for all three usability items. This criterion was met in the
experimental group (92%), but not the control group (67%). In fact, experimental group
participants did not have any ratings below 5. In contrast, control group participants
assigned ratings of 4 and used more ratings of 5 than the experimental group (Table 6).

Table 6. Experimental and Control Group Usability Ratings

Group | Participant | Ease of Completion | Amount of Time | Task Work Instructions | All 6 or 7?
Experimental | 1 | 5 | 5 | 6 | No
Experimental | 2 | 7 | 6 | 7 | Yes
Experimental | 3 | 7 | 6 | 7 | Yes
Experimental | 4 | 7 | 7 | 7 | Yes
Experimental | 5 | 7 | 6 | 7 | Yes
Experimental | 6 | 7 | 6 | 7 | Yes
Experimental | 7 | 7 | 6 | 7 | Yes
Experimental | 8 | 6 | 6 | 6 | Yes
Experimental | 9 | 7 | 7 | 7 | Yes
Experimental | 10 | 6 | 6 | 7 | Yes
Experimental | 11 | 7 | 7 | 7 | Yes
Experimental | 12 | 7 | 7 | 7 | Yes
Control | 13 | 5 | 4 | 7 | No
Control | 14 | 5 | 6 | 7 | No
Control | 15 | 5 | 5 | 5 | No
Control | 16 | 4 | 4 | 6 | No
Control | 17 | 7 | 6 | 7 | Yes
Control | 18 | 7 | 7 | 7 | Yes
Control | 19 | 6 | 6 | 6 | Yes
Control | 20 | 7 | 7 | 7 | Yes
Control | 21 | 7 | 7 | 7 | Yes
Control | 22 | 6 | 6 | 7 | Yes
Control | 23 | 7 | 6 | 7 | Yes
Control | 24 | 6 | 6 | 7 | Yes
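
The criterion percentages can be checked directly against the ratings in Table 6; a short sketch follows, with the ratings transcribed from the table.

```python
# Ratings from Table 6 as (ease of completion, amount of time, task work instructions).
experimental = [(5, 5, 6), (7, 6, 7), (7, 6, 7), (7, 7, 7), (7, 6, 7), (7, 6, 7),
                (7, 6, 7), (6, 6, 6), (7, 7, 7), (6, 6, 7), (7, 7, 7), (7, 7, 7)]
control = [(5, 4, 7), (5, 6, 7), (5, 5, 5), (4, 4, 6), (7, 6, 7), (7, 7, 7),
           (6, 6, 6), (7, 7, 7), (7, 7, 7), (6, 6, 7), (7, 6, 7), (6, 6, 7)]

def percent_meeting_criterion(group):
    """Percentage of participants whose three ratings were all 6 or 7."""
    meets = [all(rating >= 6 for rating in ratings) for ratings in group]
    return 100 * sum(meets) / len(meets)

print(f"Experimental: {percent_meeting_criterion(experimental):.0f}%")  # 92%
print(f"Control: {percent_meeting_criterion(control):.0f}%")            # 67%
```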

4. DISCUSSION
The value of human factors was demonstrated in a controlled between-groups
experiment for a simple visual inspection process. The experimental group achieved
greater performance accuracy, with reduced workload and more favorable usability
ratings, as compared to the control group. Such improvements stemmed from applying
a user-centered design approach for the experimental group that focused on thorough
consideration of general human factors and specific visual inspection principles. The
result was a more efficient and usable process for the experimental group that lends
itself well to follow-on manufacturing steps and analysis. For example, use of electronic
spreadsheets to enter inspection outcomes resulted in legible records for archiving and
future analysis, with none of the scratchouts or writeovers observed in the control group
(refer to Figure 3). Further, the experimental group process design resulted in an end
state configuration suitable for the next level of processing (refer to Figure 2). By
contrast, control group participants concluded the task with rows, columns, piles, or
vertical stacks of tiles scattered across the table.
In a realistic manufacturing process, inspected parts must be organized to support
subsequent processing, either installation in the next level of assembly (acceptable
parts) or preparation for analysis and troubleshooting (rejectable parts). In the present
study, this step was incorporated into the experimental group process via sorting bins
and trays; however, this step was not specifically required in the control group. At a
minimum, depositing tiles into separate bins after the sorting was done would have not
only prolonged completion time for the control group but also increased opportunities
for error. In effect, although task completion time differences were not statistically significant, the experimental group was effectively faster, because the control group would still have needed to complete this final organizing step.

4.1. Process Variation Reduction


Observed improvements in the present study were possible because the experimental
group process design promoted more uniform task approaches among participants,
effectively reducing process variation and mitigating or eliminating errors observed in
the control group. Specifically, the observed number and variety of errors as well as
standard deviations for most dependent variables were smaller in the experimental
group as compared to the control group. Such differences indicate the process design,
including the work instruction and tools, reduced individual differences in experimental
group task approaches.
This reduction in process variation implies that focusing human factors effort on even a single component of a manufacturing process, in this case visual inspection, can generate a more consistent, repeatable process overall.
Understanding and reducing process variation in all components of the manufacturing
process is critical. Excess variation at any point can increase scrap and rework, require
additional inspections, impair functionality, and reduce reliability and durability
(Steiner & MacKay, 2014).

4.2. Limitations of the Present Study
The magnitude of effects in the present study was limited by inspection task simplicity,
as evidenced in part by performance accuracy ceiling effects for acceptable tiles. This
outcome was the result of applying a simple inspection criterion based on a single tile
feature (the character printed on the tile). In reality, receipt inspection typically involves
simultaneous inspection for numerous defect types such as scratches, dents,
discolorations, and the presence of foreign material. As indicated in See’s (2012)
review, inspection only becomes more difficult as the number of different defect types
increases, magnifying task demand and reducing performance accuracy. Thus,
additional differences between the experimental and control groups might have
occurred in the current study with a more complex inspection task.
Another limitation in the present study was the relative ineffectiveness of the secondary
grammatical reasoning task as a supplemental measure of task demand. Previous
research has demonstrated its sensitivity to various stressors such as narcosis during
diving, automobile driving, and white noise (Baddeley, 1968). In the car driving study,
as in the present study, the grammatical reasoning task was used as a secondary task. In
that study, grammatical reasoning task reaction times increased by 44% and accuracy
fell by 28% when participants completed an auditory version of the task while
navigating a vehicle. In comparison to previous applications, the inspection task used in
the present study may not have been stressful enough to impact grammatical reasoning
task performance. Namely, 83% of NASA-TLX frustration ratings, which encompass stress, annoyance, and irritation, were 10 or below. As in the driving study, an auditory
version of the grammatical reasoning task may also have been useful for a more
continuous secondary task presentation minimizing task switching.

4.3. Directions for Future Research


Future research might focus on designing a more complex process to increase task
demands, while striving to minimize the amount of required participant training and
preparation. Human error and workload tend to increase as task complexity increases,
while usability tends to degrade (Swain & Guttmann, 1983). Therefore, a more
complex task should generate more robust effects on performance, workload, and
usability while simultaneously enhancing ecological validity. With a higher-demand, more
complex task, the grammatical reasoning task or other form of secondary task might
also become more suitable as a supplemental indicator of inspection task demand. At
the same time, however, any task should be simple enough to minimize requirements
for specialized skills and training, accommodating the general population and reducing
participant time burdens.

5. CONCLUSIONS
In summary, if the incorporation of human factors can make a difference in a simple
task such as that used in the present study, even greater benefits might be expected to
accrue for more complex products and processes. In effect, designing a task simply by
using available tools, without true consideration of the human in the system, might
yield a workable process, but not an optimal process that promotes effectiveness,
reduces workload, and enhances usability. Non-human factors practitioners may
periodically require current, relevant evidence to help convince them that human
factors issues must be expressly considered early and often throughout the lifecycle.
To paraphrase Walkenstein and Eisenberg (1996), experimental results such as these
help demonstrate the value of, and need for, involving human factors engineering in the
design and development process and making it an integral part of that process.

5.1. Key Points


• The value of human factors was demonstrated in a controlled between-groups
experiment for a simple visual inspection process.
• Twenty-four SNL employees completed a receipt inspection task whose process was
designed either with consideration of human factors (experimental group) or without
(control group).
• The experimental group exhibited superior performance accuracy, reduced
workload, and more favorable usability ratings as compared to the control group.
• Results suggest that incorporating human factors during process design could have
even greater benefits for complex products and processes.
• The study provides evidence to revitalize the critical message regarding the
benefits of human factors involvement for a new generation of designers.

REFERENCES

1. Baddeley, A. D. (1968). A 3 min reasoning test based on grammatical transformation.
Psychonomic Science, 10, 341-342.
2. Bailey, G. (1993). Iterative methodology and designer training in human-computer interface
design. In S. Ashlund, A. Henderson, E. Hollnagel, K. Mullet, & T. White (Eds.), Proceedings of
INTERCHI ’93. Paper presented at INTERCHI ’93: Conference on Human Factors in
Computing Systems, Amsterdam, The Netherlands (pp. 198-205). Amsterdam, The
Netherlands: IOS Press.
3. Bruseberg, A. (2008). Presenting the value of human factors integration: Guidance,
arguments, and evidence. Cognition, Technology & Work, 10, 181-189.
4. Burgess-Limerick, R., Cotea, C., & Pietrzak, E. (2010). Human systems integration is worth
the money and effort! The argument for the implementation of human systems integration
processes in defence capability acquisition. Commonwealth of Australia: Department of
Defence.
5. Drury, C. G. (1985). Stress and quality control inspection. In C. I. Cooper & M. J. Smith
(Eds.), Job stress and blue collar work (pp. 113-129). Chichester, UK: Wiley.
6. Drury, C. G., & Watson, J. (2002). Good practices in visual inspection. Washington, D. C.:
Federal Aviation Administration/Office of Aviation Medicine. Retrieved from
http://www.faa.gov/about/initiatives/maintenance_hf/library/documents/media/human_fact
ors_maintenance/good_practices_in_visual_inspection_-_drury.doc
7. GSA (1993). Electronic Forms Systems Analysis and Design. Report KMP-92-6-R.
Washington, D. C.: U. S. General Services Administration Information Resources
Management Service. Retrieved from
http://webapp1.dlib.indiana.edu/virtual_disk_library/index.cgi/4280768/.../Efsadg.pdf
8. Hart, S. G., & Staveland, L. E. (1988). Development of NASA-TLX (Task Load Index):
Results of empirical and theoretical research. Advances in Psychology, 52, 139-183.
9. Hendrick, H. W. (1996). The ergonomics of economics is the economics of ergonomics.
Proceedings of the Human Factors and Ergonomics Society 40th Annual Meeting, 40, 1-10.
10. Hendrick, H. W. (2008). Applying ergonomics to systems: Some documented “lessons
learned.” Applied Ergonomics, 39, 418-426.
11. Konz, S., & Johnson, S. (2008). Work design: Occupational ergonomics (7th ed.). Scottsdale,
AZ: Holcomb Hathaway.
12. Kroemer, K. H. E., & Grandjean, E. (1997). Fitting the task to the human: A textbook of
occupational ergonomics (5th ed.). Boca Raton, FL: CRC Press.
13. Lewis, J. R. (1995). IBM computer usability satisfaction questionnaires: Psychometric
evaluation and instructions for use. International Journal of Human-Computer Interaction,
7, 57-78.
14. Lin, L., Isla, R., Doniz, K., Harkness, H., Vicente, K., & Doyle, D. J. (1995). Analysis,
redesign, and evaluation of a patient-controlled analgesia machine interface. Proceedings of
the Human Factors and Ergonomics Society 39th Annual Meeting, 39, 738-741.

15. Luna, S. F., Sturdivant, M. H., & McKay, R. C. (1988). Factoring humans into procedures.
In Human Factors and Power Plants, 1988, Conference Record for 1988 IEEE Fourth
Conference on Human Factors and Power Plants. Paper presented at 1988 IEEE Fourth
Conference on Human Factors and Power Plants, Monterey, CA (pp. 201-207). Monterey,
CA: Institute of Electrical and Electronics Engineers.
16. Matzen, L. E. (2009). Recommendations for Reducing Ambiguity in Written Procedures.
Report SAND2009-7522. Albuquerque, NM: Sandia National Laboratories.
17. Nielsen, J. (1995). 10 usability heuristics for user interface design. Retrieved from
https://www.nngroup.com/articles/ten-usability-heuristics/
18. Norman, D. (1988). The psychology of everyday things. New York: Basic Books, Inc.
19. Rauterberg, M., & Strohm, O. (1992). Work organization and software development. Annual
Review of Automatic Programming, 16, 121-128.
20. Reason, J. T. (1990). Human error. Cambridge, England: Cambridge University Press.
21. Rouse, W., Kober, N., & Mavor, A. (Eds.) (1997). The case for human factors in industry
and government: Report of a workshop. Washington, D. C.: National Academy Press.
22. Russ, A. L., Zillich, A. J., Melton, B. L., Russell, S. A., Chen, S., Spina, J. R.,…Saleem, J.
J. (2014). Applying human factors principles to alert design increases efficiency and reduces
prescribing errors in a scenario-based simulation. Journal of the American Medical
Informatics Association, 21, 287-296.
23. Sager, L., & Grier, R. A. (2005). Identifying and measuring the value of human factors to an
acquisition project. Paper presented at the Human Systems Integration Symposium,
Arlington, VA.
24. Sanders, M. S., & McCormick, E. J. (1993). Human factors in engineering and design (7th
ed.). New York: McGraw-Hill.
25. See, J. E. (2017). Human Factors for NES: 18 Fundamental Topics. Report SAND2017-
2739 O. Albuquerque, NM: Sandia National Laboratories.
26. See, J. E. (2012). Visual Inspection: A Review of the Literature. Report SAND2012-8590.
Albuquerque, NM: Sandia National Laboratories.
27. Sen, R. N., & Yeow, P. H. P. (2003). Cost effectiveness of ergonomic redesign of electronic
motherboard. Applied Ergonomics, 34, 453-463.
28. Shaver, E. F., & Braun, C. C. (2008). The return on investment (ROI) for human factors and
ergonomics initiatives. Moscow, ID: Benchmark Research & Safety, Inc.
29. Smith, S. L., & Mosier, J. N. (1986). Guidelines for Designing User Interface Software.
Report ESD-TR-86-278. Bedford, MA: The MITRE Corporation. Retrieved from
http://www.hcibib.org/sam/
30. Stecklein, J. M., Dabney, J., Dick, B., Haskins, B., Lovell, R., & Moroney, G. (2004, June).
Error Cost Escalation Through the Project Life Cycle. Report JSC-CN-8435. Houston, TX:
NASA Johnson Space Center.
31. Steiner, S. H., & MacKay, R. J. (2014). Statistical engineering and variation reduction.
Quality Engineering, 26, 44-60.

32. Swain, A. D., & Guttmann, H. E. (1983). Handbook of Human Reliability Analysis with
Emphasis on Nuclear Power Plant Applications. Technical Report NUREG/CR-1278-F,
SAND80-0200. Albuquerque, NM: Sandia Corporation.
33. Walkenstein, M., & Eisenberg, R. (1996). Benefiting design even late in the development
cycle: Contributions by human factors engineers. Proceedings of the Human Factors and
Ergonomics Society 40th Annual Meeting, 40, 318-322.
34. Yeow, P.H.P., & Sen, R.N. (2004). Ergonomics improvements of the visual inspection
process in a printed circuit assembly factory. International Journal of Occupational Safety
and Ergonomics, 10, 369-385.
35. Yousefi, P., & Yousefi, P. (2011). Cost justifying usability: A case study at Ericsson
(Unpublished master’s thesis). Blekinge Institute of Technology, Karlskrona, Sweden.

DISTRIBUTION

1 MS0148 Allison Noble 10112


1 MS0150 Daniel Briand 00150
1 MS0151 Steven M. Trujillo 00151
1 MS0151 Judi See 00151
1 MS0152 Courtney Dornburg 09131
1 MS0152 Susan Adams 09131
1 MS0152 Kay Rivers 09131
1 MS0152 Liza Kittinger 09131
1 MS0158 Richard Craft 00158
1 MS0158 Thor Osborn 00159
1 MS0469 Caren Wenner 09130
1 MS0472 Jason Morris 02337
1 MS0348 Victoria Newton 10653
1 MS0481 Nathan Brannon 02220
1 MS1327 Phil Bennett 01463
1 MS1327 Laura Matzen 01463
1 MS1327 Mallory Stites 01463

1 MS0899 Technical Library 09536 (electronic copy)
