
Journal of Safety Research xxx (xxxx) xxx

Contents lists available at ScienceDirect

Journal of Safety Research


journal homepage: www.elsevier.com/locate/jsr

The effect of punishment and feedback on correcting erroneous behavior


Curtis G. Calabrese ⇑, Brett R.C. Molesworth, Julie Hatfield
School of Aviation, University of New South Wales, Sydney, NSW 2052, Australia

Article info

Article history:
Received 15 February 2023
Received in revised form 2 June 2023
Accepted 1 September 2023
Available online xxxx

Keywords:
Error
Multitasking
Punishment
Feedback
Restorative justice
Retributive justice

Abstract

Introduction: Understanding the consequences of non-punitive sanctions and feedback for nonintentional deviations (i.e., errors) is important to effective safety policy. This study aims to address a lack of research on the effects of punishment and feedback on correcting erroneous behavior in the context of multitasking. Method: A Multi-Attribute Task Battery (MATB-II) was employed to simulate the demands of aviating, an important area of applied safety. Sixty participants were randomly assigned to one of four experimental groups (no intervention, punishment, feedback, punishment + feedback) and asked to perform the MATB-II. Punishment, feedback, and punishment + feedback decreased error and increased performance, with punishment alone having the greatest effect. Results: The results highlight the need for behavioral consequences or feedback to reduce erroneous behavior. Practical Applications: From an applied perspective, these results have implications for policy and training.

© 2023 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

⇑ Corresponding author at: University of New South Wales, Sydney, School of Aviation, Old Main Building, Room 205, NSW 2052, Australia.
E-mail address: c.calabrese@unsw.edu.au (C.G. Calabrese).

https://doi.org/10.1016/j.jsr.2023.09.001
0022-4375/© 2023 The Authors. Published by Elsevier Ltd.

1. Introduction

Multitasking is common and prevalent in many safety-critical professions, including aviation, where pilots must simultaneously operate flight controls, monitor flight instruments, and communicate with air traffic control, among other tasks. Multitasking does, however, heighten the propensity to commit errors. Hence, it is imperative to examine the factors contributing to pilot errors during multitasking and explore potential strategies to mitigate them, thereby bolstering aviation safety. This study aims to investigate the impact of two approaches to discouraging erroneous behavior: retributive justice (punishment) and restorative justice (feedback).

Erroneous behavior is the leading cause of incidents and accidents in aviation (Visser, Pijl, Stolk, Neeleman, & Rosmalen, 2007). Erroneous behavior can typically be broken down into two categories, namely violations and errors. Violations are intentional actions (or inactions) that deviate from established rules, procedures, or norms; while errors are unintentional actions (or inactions) that are said to occur outside of conscious thought and/or control. Existing research generally focuses on violations and the use of punishment to reduce them, in accordance with deterrence theory (Chalfin & McCrary, 2017). However, there is limited literature on the effects of punishment on actions (or inactions) outside an individual’s conscious thought (i.e., errors) and overall human performance.

To discourage reckless or erroneous behavior, regulatory bodies have implemented various jurisprudential models. Retribution and restitution are common approaches to achieving social order and justice in modern correctional theory (Cullen & Jonson, 2016; Tabachnick & Fidell, 2013). Retributive justice relies on deterrence theory (e.g., punishment) to reduce recidivism and dissuade others from wrongdoing. Conversely, restorative justice promotes victim and offender restoration without the use of offender punishment. Such restorative-based systems predominantly rely on counseling and informational feedback to promote offender compliance (Marshall, 1999). Both justice systems aim to maintain social order by increasing compliance through behavioral modification, but through different mechanisms (e.g., punishment vs. feedback). How these differing mechanisms affect human performance in aviation, and specifically within a multitasking context, is the main focus of the present study.

Restorative justice is translated into practice using a wide range of methods (e.g., performance feedback, counseling, restitution) and with varying degrees of success (Cullen & Jonson, 2016). One such feedback scheme is employed by the U.S. Federal Aviation Administration (FAA) via its ‘Compliance Program,’ this study’s impetus. The Compliance Program primarily utilizes verbal counseling, comprised of informational feedback about the deviation and suggestions for improvement, as the regulatory consequence. The FAA claims that under this form of justice (compared to retributive systems), individuals are more likely to report errors, so allowing for these errors to be addressed at either the individual or systemic level (Federal Aviation Administration, 2015). However, recent research has highlighted how such an approach may
adversely affect safety (Calabrese, Molesworth, Hatfield, & Slavich, 2022).

Informational feedback informs an individual about the correctness, physical effect, or social or emotional impact of their behavior through words, sounds, visual cues, or other stimuli (Chui, Molesworth, & Bromfield, 2021). It may be delivered during or after the relevant behavior (Kulhavy & Stock, 1989; VandenBos, 2007). The primary role of feedback is to provide individuals with information about how to improve their behavior. Feedback may also influence motivation by serving as positive reinforcement (reward) for correct behavior, or punishment for incorrect (or imperfect) behavior. The effectiveness of informational feedback appears mixed. In a meta-analysis, Mory (1992) found that feedback improved test performance in approximately half of the studies, while the other half did not show improvement. A meta-analysis conducted by Kluger and DeNisi (1996) found that feedback interventions produced negative effects on performance; and they suggested that the effectiveness of feedback decreases as attention decreases during task performance. A more recent meta-analysis of 435 information feedback studies, ranging from a low level of feedback (referred to as ‘simple’) to a high level of feedback (referred to as ‘elaborate’), concluded that the effectiveness of feedback is related to the amount of information contained within the feedback (Wisniewski, Zierer, & Hattie, 2020). The variance in studies comes as no surprise considering the wide range of feedback implementation methods (e.g., type, time at which provided, quality of feedback, delivery methods – who or what). Similarly, and as noted with the literature examining punishment, this literature does not bifurcate between intentional (violations) and unintentional actions (errors). The latter is central to the aim of the present study, and specifically in a multitasking context.

In contrast to restorative justice, retributive justice aims to reduce the likelihood of noncompliance through retribution (i.e., punitive sanctions). According to deterrence theory, the threat of punishment deters nonconforming behavior (Grosvenor, Toomey, & Wagenaar, 1999; Nader, 1986) at both the individual and population levels. Punishing offenders directly (specific deterrence) reduces their likelihood of reoffending (Freeman, Szogi, Truelove, & Vingilis, 2016; Nagin, Cullen, & Jonson, 2009), as well as deterring members of the public who know of the punishment and criminal act/s (general deterrence; Chalfin & McCrary, 2017).

Deterrence theory aligns with behavioralist Skinner’s (1938) theories on behavior modification through reinforcement and punishment. Skinner (1938) postulated that behavior can be shaped by its consequences, underscoring the significance of environmental factors in shaping behavior. He introduced the concepts of positive reinforcement (i.e., supplying positive outcomes; e.g., reward, praise) and negative reinforcement (i.e., removing negative outcomes; e.g., stopping nagging) to increase desired behavior, and positive punishment (i.e., implementing negative outcomes; e.g., fines) and negative punishment (i.e., removing positive outcomes; e.g., withholding payment) to decrease undesired behavior. While Skinner did not explicitly concentrate on motivation as an internal state, his theories on reinforcement and punishment can be related to the effort that drives behavior.

Motivation has been variously defined (Sundberg, 2013); in the current context, motivation is considered as the effort that drives behavior (VandenBos, 2007). According to Rasmussen (1982), behavior/actions can be performed either consciously through deliberate thought, or unconsciously. Violations are defined as intentional, and are therefore under conscious control/thought. In contrast, errors are defined as unintentional, and are therefore thought to be outside of conscious thought. It is plausible that through effort (i.e., motivation), actions typically thought to be outside of conscious thought (i.e., errors) can be modified (Calabrese et al., 2022). Take for example the common act of speeding. The threat of punishment for exceeding the speed limit, such as a monetary penalty, has the potential to motivate individuals to pay more attention to their speedometer, and thus reduce the likelihood of unintentionally (i.e., error) exceeding the speed limit, as well as intentionally exceeding the speed limit (i.e., violation).

There is a wealth of literature on the effects of punishment for intentional offenses (defined as “violations”) on human behavior, and meta-analyses (Pratt, Cullen, Blevins, Daigle, & Madensen, 2006; Rupp, 2008) broadly suggest that punishment is effective in reducing violations, consistent with deterrence theory (Chalfin & McCrary, 2017). In contrast, there is limited literature about punishment’s effects on unintentional actions, conceived here as errors, and even less about its effects on ‘human performance.’ An error is “a generic term to encompass all those occasions in which a planned sequence of mental or physical activities fails to achieve its intended outcome, and when these failures cannot be attributed to the intervention of some chance agency” (Reason, 1990, p. 195); whereas ‘performance’ is the measure of a task outcome (Fleishman, 1975). Hence, errors can be a component of performance indices.

In terms of performance, Visser, van der Put, and Assink (2022) conducted a meta-analysis examining punishment’s effect on student spelling, reading, vocabulary, and mathematics scores. The results showed a negative relationship between punishment and performance. Likewise, the Akhtar and Awan (2018) survey of 300 participants concluded that corporal punishment negatively affects student performance in public schools. As noted above, these studies do not bifurcate between intentional (violations) and unintentional actions (errors) resulting in punishment.

In terms of error, there appear to be few studies that examine the relationship between punishment and errors. One of the studies investigated how punishment and reward affected errors and reaction time on a visual task. Error did not differ based on punishment or reward; however, response time did. Response time was faster in the punishment condition as opposed to the reward condition (Stürmer, Nigbur, Schacht, & Sommer, 2011).

The aim of the present research is to evaluate the effects of punishment and feedback on erroneous behavior. Punishment is operationalized in the form of monetary loss (a negative punishment), while feedback is operationalized in the form of verbal counseling, which consisted of identifying task correctness, errors, and methods to improve. In pursuit of this aim, the study seeks to answer the following research question: How do punishment and feedback affect errors and performance on a multitasking activity?

2. Methodology

2.1. Participants

Graduate and undergraduate students from the University of New South Wales (UNSW) were invited to participate in a project about human error and performance in multitasking, involving “completing a computer-based task where you are to manage several different activities (i.e., multitasking) simultaneously, like pilots do during flight. The project will last no longer than 90 minutes.” Sixty participants (30 male) volunteered for the research. The average age was 25.93 (SD = 5.73) years, and participants were aged between 18 and 46 years. The sample size was deemed sufficient to detect between-group effects using Cohen’s criteria. Power calculations demonstrated that this sample size was sufficient to reveal a medium to large effect size (effect size = 0.45, alpha = 0.05, actual power = 0.828, groups = 4, N = 60). The research, including all stimuli, was approved in advance by the University of New South Wales (UNSW), Sydney Human Research Ethics Committee.
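The inputs to the reported power figure can be related to one another with a short worked calculation. This is only a sketch: it assumes that the stated effect size of 0.45 is Cohen's f for a one-way ANOVA and that the noncentrality parameter follows the common convention of f squared times N. The source does not state which software or convention was used.

```latex
% Assumptions (not stated in the source): effect size is Cohen's f,
% and the noncentrality parameter is \lambda = f^2 N.
\begin{align*}
\lambda &= f^{2} N = (0.45)^{2} \times 60 = 12.15, \\
df_{1} &= k - 1 = 4 - 1 = 3, \qquad df_{2} = N - k = 60 - 4 = 56, \\
1 - \beta &= P\!\left( F'_{3,\,56}(\lambda) > F_{0.95;\,3,\,56} \right).
\end{align*}
```

Here F' denotes the noncentral F distribution and the right-hand term is the critical value of the central F distribution at alpha = 0.05. Under these assumptions, evaluating the last line numerically is the step that would produce a power value such as the 0.828 reported above.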

2.2. Design

The study comprised a single factorial between-groups design with four levels. The control group (group 1) completed the multitasking activity with no intervention. The three experimental groups received forms of punishment and/or feedback and included: an informational feedback group (i.e., instructions on how to improve performance based on participant performance/error; group 2), a punishment group (i.e., payment reduced by $0.53 for each error; group 3), and a punishment plus informational feedback group (group 4).

To determine an appropriate monetary level of punishment (balance between perceived impact and severity) for the main experiment, a pilot study was conducted, employing identical procedures to the main experiment and six participants. Financial penalties ranged from 25c to 1.50c per error. Evaluation of the data revealed that a penalty of approximately AUD$0.50 per error emerged as a suitable and effective motivator. A punishment of AUD$0.53 per error was elected to minimize additional cognitive load associated with participants calculating their total punishment easily (7 × 0.53 is more difficult to calculate than 7 × 0.50).

Both the control and intervention groups were subjected to tasks of identical duration, facilitated by the use of the MATB-II, which inherently dictates a set time. However, the temporal dynamics of completion diverged slightly between the groups, primarily due to the method of breaks, feedback, and punishment utilized in each case. The control group finished the experiment slightly faster (3 minutes) due to the lack of intervention. The two dependent variables were multi-tasking error and overall performance (both including metrics from the communication, resource management, system monitoring, and tracking tasks that are described in detail in Procedures).

2.3. Materials, software, and equipment

The material comprised a paper demographics questionnaire (i.e., age, sex, piloting hours, and handedness), a paper penalty form (used by the researcher to convey error count and financial penalty to the two punishment groups), and a feedback script used by the researcher with the two feedback groups. The feedback script involved three components: self-debriefing (e.g., asking the participant about their errors), reinforcing comments (e.g., information on what went well), and corrective comments (e.g., how to improve).

The computer-based multitasking software Multi-Attribute Task Battery II (MATB) was utilized in this experiment. This program is a widely used research platform for multitasking assessment designed to evaluate human multitasking performance and workload (Liu, 2017; Santiago-Espada, Myer, Latorella, & Comstock Jr, 2011). Liu and Nam (2018) summarize 18 previous studies of performance measures utilizing MATB. This tried and tested software is ideal for this experiment because of its ability to capture multitasking performance of non-pilot operators.

The MATB-II requires the simultaneous performance of four tasks: (1) system monitoring, (2) resource management, (3) communications, and (4) tracking (Santiago-Espada et al., 2011). Fig. 1 identifies these tasks on the MATB interface with headings above the respective task. Each task and the measures derived from them are described in Procedures.

The data captured through MATB-II (Fig. 1) was analyzed using the SPSS software v28. The equipment comprised: a Dell Inspiron 3505 computer operating with Windows 10, two Dell P2722H 27″ monitors, one Logitech Extreme 3D Pro joystick, one Dell wired mouse, and a pair of Logitech Z200 speakers (Fig. 2). The computer, including one monitor and mouse, was set up in the researcher observer station outside of the research cubicle (2 × 3 m). The second monitor was placed alongside the joystick and speakers in the research cubicle. The computer simultaneously ran MATB-II and Microsoft Visual Studio for continuous measurements for the tracking task required for the main analysis.

2.4. Procedures

Participants were recruited through an internal mailing list for students. Interested individuals selected a time for the experiment. On the day of testing, participants first completed the demographics questionnaire. Next, participants were instructed on the use of MATB-II and task objectives as described below.

The system monitoring task requires participants to identify and respond to the non-normal state of two warning lights (represented by the squares labeled F5 and F6, Fig. 1) and four moving scales (represented by the columns labeled F1, F2, F3, and F4, Fig. 1) by clicking on the associated light/scale. Participants were informed that the lights are currently in the normal color state (i.e., green and grey, respectively) and if the light changes color, participants would have seven seconds to respond correctly by clicking the light with the mouse, otherwise one error would be recorded.

In the resource management task, participants attempted to keep two simulated fuel tanks within defined parameters by turning various pumps on and off. Participants were informed that the green represents fuel and the blue rectangles labeled A through F represent fuel tanks. The rectangles labeled 1 through 8 represent fuel pumps, with the arrow indicating the direction of fuel flow. Participants were instructed to keep the level of fuel in tanks A and B as close to the blue line (i.e., 2,500 units) as possible and within the light blue band (i.e., 2,000–3,000 units). An error is recorded for every four seconds the fuel is beyond the blue band.

The communications task prompts participants with aural commands to change radio frequencies. Participants were instructed to respond only to aural commands indicated by the call sign “NASA 504.” For example, “NASA 504, tune your NAV 1 radio to 113.525.” In this case, the participant must select the respective NAV1 radio button, then use arrows to select the proper frequency. Participants were informed that they had 20 seconds to correctly respond or one error would be recorded.

Lastly, the tracking task requires participants to hold a constantly moving dot near the center of an x–y axis using a joystick. Participants were informed to keep the dot as close to the center of the x–y axis as possible and to not go outside the blue circle. They were also informed that one error would be recorded for every two-second period the dot is outside the circle.

Multi-tasking error was the total error count of each task across the five experimental trials. Performance encompassed three metrics: errors, accuracy, and efficiency (time). It was computed by adding the z-scores of the following task scores: (1) System Monitoring: average system monitoring reaction time and error count; (2) Resource Management: average difference from 2,500 units and error count; (3) Communications: average communications reaction time and error count; and (4) Tracking: average distance from the center of the x–y axis when inside the circle, the average number of two-second periods outside the circle (i.e., error count), and the average number of seconds outside the circle.

Participants learned MATB-II through an initial training block of seven three-minute trials. Each trial focused on teaching the individual tasks and built upon these tasks (i.e., compartmentalization training as occurs with flight training) in the following order: (1) tracking, (2) system monitoring, (3) tracking and system monitoring, (4) tracking and communications, (5) tracking, system monitoring and communications, (6) resource management, and (7) tracking and resource management. Learning and performance stabilization was observed through a second training block of 10

three-minute training trials. Through a pilot study with five participants, it was observed that learning was achieved by the fifth trial. Five more training trials were included to accommodate potential outliers in aptitude.

Fig. 1. MATB-II User Interface.

Fig. 2. Participant Workstation.

The difficulty of the second set of training trials (10 in total) and the following five experimental trials was equal. Difficulty was achieved through a set amount of visual and auditory prompts (i.e., non-normal indication requiring participant response). Each trial contained the same number of prompts: 22 system monitoring (11 lights and 11 scales), 5 communication prompts, and 10 resource management (i.e., pump failures). Two prompts triggered within five seconds are defined as a ‘difficult pairing.’ Each trial contained an equal distribution of nine difficult pairings. All other singular prompts were triggered at intervals of no less than six seconds.

Following the final training trial, the participants were informed that there would be five more three-minute experimental trials, and the difficulties of the experimental trials and training trials would be the same. The Control Group received no further instructions and no intervention during the experimental trials. The Feedback Group was told that the researcher would enter the room immediately following each test session and provide feedback to the participant on their performance. Feedback was conducted in the same manner for each participant utilizing a script (see Appendix A). The Punishment Group was told that the amount of their payment was predicated on their performance for these last five sessions. The researcher entered the room immediately following each session and recorded the participant’s error count, error penalty (i.e., trial total), and monetary penalty on the penalty form. The Punishment and Feedback Group first received the treatment, then the instructions of both the Feedback Group and the Penalty Group.

Participants completed the experiment in 90 minutes and were given an AUD$40 gift voucher (given in four AUD$10 amounts) at the conclusion of their experimental session. Although the threat of monetary loss loomed as a punishment for those in the designated

groups, every participant ultimately received the full AUD$40 upon the successful conclusion of their data collection session.

3. Results

3.1. Data screening

Prior to the conduct of the main analysis, a simple factorial analysis examining differences between groups based on age and baseline performance was performed to determine if the randomization of participants in the four groups was effective. With alpha set at 0.05 and test assumptions satisfactory, no significant group differences in age or baseline performance were observed (largest F, F(3, 56) = 2.11, p = .109 for Age).

Next, to examine whether performance had plateaued following training, four separate paired sample t tests (one for each group) were conducted comparing mean error on the eighth baseline session to mean error on the ninth and tenth baseline sessions. Assumptions of normality were met, and the results of the four separate t tests revealed no statistically significant difference (largest t, t(14) = 1.192, p = .253; punishment group).

Pre-data screening using the interquartile range technique (Moore, McCabe, & Craig, 2012) revealed the presence of outliers. This outlying data (three participants) was transformed using the next most extreme score in the distribution plus/minus one technique, as recommended by Tabachnick and Fidell (2013). Pearson product-moment correlational analysis showed a strong positive relationship between overall error score and performance score (since the four performance measures were different, a z-score was employed to provide a unified, single performance measure), r(60) = 0.89, p = .001. To check if this result was influenced by one or a number of the different experimental groups, four additional correlational analyses were performed. With alpha adjusted to control for familywise error (Bonferroni adjustment 0.05/4), the results revealed a strong positive relationship between error and performance score for the Control (r(30) = 0.864, p < .001) and the Punishment groups (r(30) = 0.693, p < .041), but not the Feedback (r(30) = 0.114, p = .685) and Feedback + Punishment groups (r(30) = 0.164, p = .560). This result highlights that while ‘performance’ data encapsulates ‘error’ data, they measure two different, yet complementary artefacts (i.e., performance is more granular). Therefore, both dependent variables (i.e., error score and performance) were examined in the main analysis.

3.2. Main analysis

The next series of analyses examined differences between groups based on the two dependent variables, ‘error’ score and ‘performance’ on MATB-II. With assumptions of normality satisfactory, the one-way ANOVA for ‘error’ score revealed a statistically significant difference among the groups (F(3, 56) = 35.41, p < .001). A similar result was evident with ‘performance’ (F(3, 56) = 41.02, p < .001). Table 1 displays the results of two separate Sidak post-hoc tests, one for each of the dependent variables (error score and performance). As can be seen in Table 1, significant differences were evident based on ‘error’ scores in all but two group pairings, namely between Feedback vs. Punishment, and Feedback vs. Feedback + Punishment. In terms of ‘performance,’ significant differences were evident in all but one group pairing, namely between Feedback vs. Punishment. As can be seen in Fig. 3, with both ‘error’ score and ‘performance,’ punishment yielded the greatest improvement relative to the control group followed by feedback, and feedback + punishment.

Table 1
Sidak post-hoc descriptive statistics for error score and performance.

                                       Error Score                              Performance (z-score)
Experimental Conditions                Mean Difference, 95% CI   p        r²     Mean Difference, 95% CI   p        r²
Control vs. Feedback                   9.48, [6.35, 12.60]       <0.001   0.58   7.882, [5.40, 10.36]      0.077    0.60
Control vs. Punishment                 11.34, [8.29, 14.40]      <0.001   0.67   12.16, [9.83, 14.48]      0.007    0.80
Control vs. Feedback + Punishment      7.79, [4.60, 10.97]       0.001    0.47   5.79, [2.88, 8.69]        <0.001   0.37
Feedback vs. Punishment                1.87, [3.14, 0.59]        0.078    0.25   4.27, [2.86, 5.70]        0.062    0.57
Feedback vs. Feedback + Punishment     1.69, [3.26, 0.13]        0.864    0.15   2.09, [4.34, 0.15]        0.017    0.12

4. Discussion

This study examined the effects of punishment and feedback on operator performance in a multitasking task. Operator performance was measured based on errors, as well as at a more granular level (i.e., performance score), which incorporated task accuracy, reaction time, displacement tracking, and error. The results revealed that participants in the control group (i.e., absence of any punishment or feedback) exhibited significantly worse performance compared to the other experimental groups, irrespective of the performance measure (error vs. performance). Punishment alone yielded the most substantial improvement. Likewise, feedback also improved performance (error and performance score); however, its effect was not as pronounced as punishment, and combining it with punishment appeared to temper its effect. There are several possible mechanisms that could account for these findings.

It is possible that motivation (e.g., threat of punishment) induces more control over behavior, which reduces error. Punishment’s ability to motivate compliance may be cognitively mediated (Trang & Brendel, 2019; Xue, Liang, & Wu, 2011). It is plausible that the motivational effect of punishment was a significant driver (i.e., stimulating excellence), offsetting any fatigue or complacency with the task. It is also plausible that the combination of feedback and punishment was cognitively taxing, thereby tempering improvements in performance. While punitive deterrence has been shown (i.e., deterrence theory) to positively affect behavior that is conscious (i.e., violation; Axelrod & Apsche, 1983; Braga, Weisburd, & Turchan, 2018), there are limited studies that show its effect on unconscious behavior (i.e., errors). In contrast, there is a breadth of literature indicating the effect of cognitive load on unconscious as well as conscious behavior (Engström, Markkula, Victor, & Merat, 2017).

Punishment may also have influenced behavior or motivation through the elicitation of negative emotions, such as disappointment. Van Dijk (1999) underscores the importance of examining the emotional ramifications stemming from adverse outcomes. Among these emotions, disappointment is particularly notable, as it is more closely linked to the absence of a desired outcome in comparison to other related negative emotions, such as sadness, anger, frustration, and regret. Although emotional measures were not specifically assessed in this study, the majority of participants voluntarily mentioned that their performance was influenced by factors such as the threat of punishment, the presence of an


authority figure, and their aspiration to impress, sometimes resulting in disappointment due to perceived failure.

Fig. 3. Performance (z score) and error score (sum of error counts) for the four groups. The error bars denote standard error (95% CI).

In the punishment group, during training the researcher instructed the participants on MATB-II in a supporting role. This shifted when the experimental manipulation occurred, and the researcher became the enforcer. This change of role may have triggered the psychological effect of disappointment, embarrassment, and a motivation to please and perform (Grasmick & Bursik Jr, 1990; Williams & Hawkins, 1986).

The applied implications of the findings are not straightforward. Results suggest that if the sole objective is to obtain performance improvements, then adopting an intervention involving punishment would yield the best result. However, if performance improvements are sought in combination with the reporting of erroneous actions (e.g., self-disclosure to support systemic improvements), punitive repercussions may undermine reporting, thereby countering overall benefits at a system level. Punishment is a form of reprisal, which has been shown to adversely affect reporting of safety information (McMurtrie & Molesworth, 2018).

5. Limitations and future research

This experiment is the first of its kind to evaluate the effects of punishment and feedback on solely nonintentional behavior (i.e., errors). While it provides valuable insights, there are several limitations to consider. The task employed in this study represented a multitasking activity. While this task has been used widely and no doubt involves multitasking, it is not an operational task. Additionally, the effects of punishment likely vary based on context and application. How punishment and feedback affect real-world operational tasks remains unknown. Similarly, the operators of the task were students in an experiment. How performance would have varied if they were professionals performing in a particular profession remains unknown, and hence is another area for research. How both punishment and feedback affect individual cognitive functions, such as attention, memory, and decision-making, was not investigated in the present study.

Punishments’ perceived magnitude, swiftness, clarity, and certainty have been shown to affect intentional behavior (e.g., compliance) to varying degrees (Choi, 2005). If the level of punishment is set too high, individuals may avoid, refuse, or eliminate the behavior or activity. Conversely, if the punishment is low, it may be such that the punishment doesn’t exist at all. Consistent with this, in our pilot study, it was observed that the level of punishment (i.e., financial penalty) affected performance. Further research is required to support decision-making about the characteristics of punishment that would best support desired behavior. From an applied perspective, the findings should be interpreted with caution, especially in safety–critical industries where the introduction of punishment might interact with other factors such as reporting behavior.

6. Conclusions

Punishment and feedback affect individuals’ motivation and attitude (Tricomi & DePasque, 2016), both of which are central in the commission of violations (Ajzen, 2005; Neal & Griffin, 2006). Motivation drives behavior; thus, the manipulation of motivation influences intentional behavior. However, Calabrese et al. (2022) observed that nonintentional deviations increased following a decrease in punishment, suggesting that such factors might also extend to errors.

This experiment evaluated the effect of punishment and feedback on both error (i.e., nonintentional behavior) and performance through a multitasking activity. Results show that punishment and feedback increase human performance and decrease error. Conversely, an absence of these stimuli was shown to have the opposite effect. The results are the first of their kind to illustrate the effect of punishment and feedback on unintentional behavior. They also highlight the benefit of delineating between error and the more granular performance metrics when examining human behavior.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Appendix A

Verbal Counseling Protocol Example
Research Team:
At the end of each session, conduct the following counseling:
Counseling and Punishment and Counseling Groups only:

(1) Ask the participant if they know they erred.
(2) Tell the participant their total number of errors and errors per task.
(3) Ask the participant if they know why they erred.
(4) Give instruction on how to improve (e.g., techniques). Instructions are limited to:
    a. Remind the participant that each three-minute session is the same level of difficulty as the previous baseline sessions.
    b. Do not focus on any one task, continue to scan.
    c. Communications Task: Verbalize/repeat auditory commands.
    d. Tracking Task: Use your peripheral vision for the tracking task.
    e. System Monitoring Task: Keep your mouse in this area for faster response.
    f. Resource Management: Do not forget about the transfer pumps.
(5) Ask the participant if they are ready to continue.

References

Ajzen, I. (2005). Attitudes, personality, and behavior. London, England: McGraw-Hill Education.
Akhtar, S. I., & Awan, A. G. (2018). The impact of corporal punishment on students’ performance in public schools. Global Journal of Management, Social Sciences and Humanities, 4(3), 606–621.
Axelrod, S., & Apsche, J. (1983). The effects of punishment on human behavior. New York: Academic Press.


Calabrese, C. G., Molesworth, B. R. C., Hatfield, J., & Slavich, E. (2022). Effects of the Federal Aviation Administration’s Compliance Program on aircraft incidents and accidents. Transportation Research Part A: Policy and Practice, 163, 304–319. https://doi.org/10.1016/j.tra.2022.07.016.
Chalfin, A., & McCrary, J. (2017). Criminal deterrence: A review of the literature. Journal of Economic Literature, 55(1), 5–48. https://doi.org/10.1257/jel.20141147.
Choi, K. (2005). The effects of actual punishment levels on perceptions of punishment: A multi-level approach. The Florida State University.
Chui, T. K., Molesworth, B. R., & Bromfield, M. A. (2021). Feedback and student learning: Matching learning and teaching style to improve student pilot performance. The International Journal of Aerospace Psychology, 31(2), 71–86.
Cullen, F. T., & Jonson, C. L. (2016). Correctional theory: Context and consequences (2nd ed.). Thousand Oaks, CA: Sage Publications.
Engström, J., Markkula, G., Victor, T., & Merat, N. (2017). Effects of cognitive load on driving performance: The cognitive control hypothesis. Human Factors, 59(5), 734–764.
Federal Aviation Administration (2015). Federal Aviation Administration Compliance Philosophy (Order 8000.373). Washington, D.C.
Fleishman, E. A. (1975). Toward a taxonomy of human performance. American Psychologist, 30(12), 1127.
Freeman, J., Szogi, E., Truelove, V., & Vingilis, E. (2016). The law isn’t everything: The impact of legal and non-legal sanctions on motorists’ drink driving behaviors. Journal of Safety Research, 59, 53–60. https://doi.org/10.1016/j.jsr.2016.10.001.
Grasmick, H. G., & Bursik, R. J. Jr, (1990). Conscience, significant others, and rational choice: Extending the deterrence model. Law and Society Review, 24(3), 837–861. https://doi.org/10.2307/3053861.
Grosvenor, D., Toomey, T. L., & Wagenaar, A. C. (1999). Deterrence and the adolescent drinking driver. Journal of Safety Research, 30(3), 187–191. https://doi.org/10.1016/S0022-4375(99)00013-4.
Kluger, A. N., & DeNisi, A. (1996). The effects of feedback interventions on performance: A historical review, a meta-analysis, and a preliminary feedback intervention theory. Psychological Bulletin, 119(2), 254.
Kulhavy, R. W., & Stock, W. A. (1989). Feedback in written instruction: The place of response certitude. Educational Psychology Review, 1(4), 279–308.
Liu, S. (2017). Quantitative modeling of user performance in multitasking environments Ph.D. Raleigh, North Carolina: North Carolina State University.
Liu, S., & Nam, C. S. (2018). Quantitative modeling of user performance in multitasking environments. Computers in Human Behavior, 84, 130–140.
Marshall, T. F. (1999). Restorative justice: An overview. Home Office London.
Moore, D. S., McCabe, G. P., & Craig, B. A. (2012). Introduction to the practice of statistics (6th ed.).
Mory, E. H. (1992). The use of informational feedback in instruction: Implications for future research. Educational Technology Research and Development, 40(3), 5–20. https://doi.org/10.1007/BF02296839.
Nader, L. (1986). The law as a behavioral instrument (Vol. 33) Lincoln, NE: U of Nebraska Press.
Nagin, D. S., Cullen, F. T., & Jonson, C. L. (2009). Imprisonment and reoffending. Crime and Justice, 38(1), 115–200. https://doi.org/10.1086/599202.
Neal, A., & Griffin, M. A. (2006). A study of the lagged relationships among safety climate, safety motivation, safety behavior, and accidents at the individual and group levels. Journal of Applied Psychology, 91(4), 946–953. https://doi.org/10.1037/0021-9010.91.4.946.
Pratt, T. C., Cullen, F. T., Blevins, K. R., Daigle, L. E., & Madensen, T. D. (2006). The empirical status of deterrence theory. A Meta-Analysis.
Reason, J. (1990). Human error. New York, NY: Cambridge University Press.
Rupp, T. (2008). Meta analysis of crime and deterrence: A comprehensive review of the literature. BoD–Books on Demand.
Santiago-Espada, Y., Myer, R. R., Latorella, K. A., & Comstock Jr, J. R. (2011). The multi-attribute task battery ii (matb-ii) software for human performance and workload research: A user’s guide.
Skinner, B. F. (1938). The behavior of organisms: an experimental analysis. Oxford, England: Appleton-Century.
Stürmer, B., Nigbur, R., Schacht, A., & Sommer, W. (2011). Reward and punishment effects on error processing and conflict control. Frontiers in Psychology, 2, 335.
Sundberg, M. L. (2013). Thirty points about motivation from skinner’s book verbal behavior. Analysis of Verbal Behavior, 29(1), 13–40. https://doi.org/10.1007/bf03393120.
Tabachnick, B. G., & Fidell, L. S. (2013). Using multivariate statistics (Vol. 6) Boston, MA: Pearson Education.
Trang, S., & Brendel, B. (2019). A meta-analysis of deterrence theory in information security policy compliance research. Information Systems Frontiers, 21(6), 1265–1284.
Tricomi, E., & DePasque, S. (2016). The role of feedback in learning and motivation. Recent developments in neuroscience research on human motivation. Emerald Group Publishing Limited.
Van Dijk, W. W. (1999). Not having what you want versus having what you do not want: The impact of type of negative outcome on the experience of disappointment and related emotions. Cognition and Emotion, 13(2), 129–148. https://doi.org/10.1080/026999399379302.
VandenBos, G. R. (2007). APA dictionary of psychology. American Psychological Association.
Visser, E., Pijl, Y. J., Stolk, R. P., Neeleman, J., & Rosmalen, J. G. (2007). Accident proneness, does it exist? A review and meta-analysis. Accident Analysis & Prevention, 39(3), 556–564.
Visser, L. N., van der Put, C. E., & Assink, M. (2022). The association between school corporal punishment and child developmental outcomes: A meta-analytic review. Children, 9(3), 383.
Williams, K. R., & Hawkins, R. (1986). Perceptual research on general deterrence: A critical review. Law and Society Review, 20(4), 545–572. https://doi.org/10.2307/3053466.
Wisniewski, B., Zierer, K., & Hattie, J. (2020). The power of feedback revisited: A meta-analysis of educational feedback research. Frontiers in Psychology, 10, 3087.
Xue, Y., Liang, H., & Wu, L. (2011). Punishment, justice, and compliance in mandatory IT settings. Information Systems Research, 22(2), 400–414.

Curtis Calabrese is a PhD candidate at UNSW Sydney and a commercial airline pilot. His area of expertise is in Systems Engineering (BS) and Aviation (MS Hons and Airline Transport Pilot Licence). His research interests include understanding policy, behavior, and human factors in aviation.

Brett Molesworth is a Professor at UNSW Sydney. His area of expertise is Human Factors and Aviation Safety with qualifications in Psychology (PhD and is a registered Psychologist in Australia) and in Aviation (BAv Hons and Commercial Pilot Licence). His research interests focus on understanding human performance in complex socio-technical environments.

Julie Hatfield is an Associate Professor at UNSW. She holds a PhD in Psychology from the University of Sydney. Her primary research areas include: psychological contributors to risky behaviour; distraction; driver education; vulnerable road users and active transport; human response to aircraft noise.
