Algorithmic Versus Human Advice
To cite this article: Sangseok You, Cathy Liu Yang & Xitong Li (2022) Algorithmic versus Human
Advice: Does Presenting Prediction Performance Matter for Algorithm Appreciation?, Journal of
Management Information Systems, 39:2, 336-365, DOI: 10.1080/07421222.2022.2063553
ABSTRACT

We propose a theoretical model based on the judge-advisor system (JAS) and empirically examine how algorithmic advice, compared to identical advice from humans, influences human judgment. This effect is contingent on the level of transparency, which varies with whether and how the prediction performance of the advice source is presented. In a series of five controlled behavioral experiments, we show that individuals largely exhibit algorithm appreciation; that is, they follow algorithmic advice to a greater extent than identical human advice due to a higher trust in an algorithmic than human advisor. Interestingly, neither the extent of higher trust in algorithmic advisors nor the level of algorithm appreciation decreases when individuals are informed of the algorithm’s prediction errors (i.e., upon presenting prediction performance in an aggregated format). By contrast, algorithm appreciation declines when the transparency of the advice source’s prediction performance further increases through an elaborated format. This is plausibly because the greater cognitive load imposed by the elaborated format impedes advice taking. Finally, we identify a boundary condition: algorithm appreciation is reduced for individuals with a lower dispositional need for cognition. Our findings provide key implications for research and managerial practice.

KEYWORDS: Algorithmic advice; algorithm appreciation; algorithmic transparency; online trust; cognitive load; prediction performance
Introduction
Human judgment is often influenced by other people’s opinions [4, 13, 37]. Meanwhile,
predictions from algorithms are increasingly used as advice sources (so-called “algorithmic
advice”) to aid human judgment. Recent field studies have shown that decision makers follow
algorithmic advice when making important decisions, such as providing medical diagnoses
[27, 33] and releasing criminals on parole [7]. Despite evidence that individuals rely on
algorithmic advice to some degree, the literature suggests that people depend less on such
advice than on advice from a human advisor—even when the algorithm makes better
predictions (see review paper by Meehl [42]). By contrast, the nascent literature demonstrates
algorithm appreciation: individuals are more likely to be influenced by an algorithm than by
humans, presumably because people believe that prediction performance of algorithms is
superior to that of humans in various contexts [12, 16, 17, 26, 38, 39]. Such findings imply an
algorithm’s greater efficacy over humans in influencing individuals’ decisions.
Research Question 1 (RQ1): Does increased transparency of an advisor’s prediction performance, conveyed through different prediction performance formats (PPFs), influence individuals’ algorithm appreciation?

Research Question 2 (RQ2): What roles do individuals’ trust in the advisor, cognitive load, and need for cognition (NC) play in the relationship between increased performance transparency (per PPFs) and algorithm appreciation?
To answer our research questions, we propose a research model nesting the hypotheses
within our theoretical framework. We empirically examine this model via five controlled
behavioral experiments. The findings show that algorithm appreciation manifests when no
performance information is presented or when the PPF is aggregated; this appreciation
declines when the PPF is elaborated. An elaborated PPF offers the most transparent prediction performance, but at the same time it imposes a greater cognitive load and thus impedes a judge’s reliance on advice. Furthermore, individuals with higher NC exhibit greater algorithm appreciation when presented with an elaborated PPF. Our results contribute to the burgeoning literature on algorithm appreciation and algorithmic transparency by highlighting the roles of trust and cognitive properties in the efficacy of algorithmic advice.
Theoretical Background
Here we present the theoretical background of our research based on Figure 1, outlining
advisor and judge properties that can influence one’s degree of algorithm appreciation.
Table 1. Related Studies on Advice Taking from Algorithmic Versus Human Advisors

| Research | Dependent Variable (Mediator?) | Trust Measured? | Performance Revealed? | PPF | Compares PPFs? | Advice Quality Controlled? | Performance Controlled? | Cognitive Load Measured? |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Promberger and Baron [50] | Alignment with the recommendation (No) | Yes | No | NA | NA | Yes | No | No |
| Önkal et al. [47] | Degree of advice taking (No) | Yes | No | NA | NA | Yes | NA | No |
| Dietvorst et al. [16] | Choice of algorithmic over human prediction (Yes) | Yes | Yes | Elaborated | Yes (no information vs. elaborated) | No | No | No |
| Dietvorst et al. [17] | Choice of algorithmic over human prediction (No) | Yes | Yes | Aggregated | No | No | No | No |
| Gunaratne et al. [26] | Degree of advice taking (NA) | No | No | NA | NA | Yes | NA | No |
| Castelo et al. [12] | Preference for algorithm relative to human advisor (No) | Yes | Yes | Aggregated | Yes (no information vs. aggregated) | No | No | No |
| Liel and Zalmanson [38] | Degree of advice taking (No) | Yes | No | NA | No | Yes | NA | No |
| Logg et al. [39] | Degree of advice taking (No) | Yes | No | NA | NA | Yes | NA | No |
| Longoni et al. [40] | Choice of algorithmic over human recommender (NA) | No | Yes | Aggregated | No | Yes | Yes | No |
| Yeomans et al. [73] | Choice of algorithmic over human recommender (No) | Yes | Yes | Elaborated | No | Yes | No | No |
| Our paper | Degree of advice taking (Yes) | Yes | Yes | Aggregated and Elaborated | Yes (no information vs. aggregated vs. elaborated) | Yes | Yes | Yes |
Castelo et al. [12] and Dietvorst et al. [17] show that, when presented with prediction
performance, individuals display higher trust in an algorithm than a human when they can
modify their final prediction outcomes based on the advisor’s prediction.
Higher trust in an algorithmic versus human advisor is consistent with findings in the
information systems literature. The findings suggest that an algorithm’s perceived higher
reliability lowers individuals’ decision-related uncertainty when interacting with recommendation agents [1, 35, 67, 68, 72]. In this paper, we follow the recommendation agent
literature by studying the role that trust plays in advice taking using the JAS paradigm.
Given the evidence that advisor trust positively mediates one’s extent of advice taking [e.g.,
24, 55], we explore whether trust in an advisor’s prediction performance positively mediates
the impact of advice source (algorithm vs. human) on the degree of advice taking.
Dietvorst et al. [16], however, find the opposite in their Study 4: individuals choose a human over an algorithm to make predictions on their behalf when notified of the algorithm’s prediction errors compared to when not notified. These mixed findings can be explained by the different PPFs used in Castelo et al. [12] (an aggregated format) and Dietvorst et al. [16] (an elaborated format). Taking the two studies together, one might speculate that higher transparency in prediction performance reduces algorithm appreciation because the elaborated PPF in Dietvorst et al. [16] communicates more information about algorithmic prediction than the aggregated PPF in Castelo et al. [12]. However, it is difficult to draw this conclusion given the different measures of algorithm appreciation, varying degrees of flexibility enabling individuals to modify prediction outcomes, and no exploration of the underlying mechanism.
As a result, we explore whether and how algorithm appreciation varies with increased
levels of prediction performance transparency (per PPFs) under the JAS paradigm. To
benchmark our work with prior findings, we consider three PPFs from the algorithm
appreciation literature (see Column 5 in Table 1): (1) no performance presentation; (2)
an aggregated PPF (presenting average prediction accuracy) [12, 17, 40]; and (3) an
elaborated PPF (presenting the prediction accuracy of individual prediction cases) [17,
73]. Notably, mixed findings persist regarding whether increased algorithmic transparency
enhances trust [12, 16, 20, 49]. We therefore explore whether increased performance
transparency influences individuals’ trust in an advisor’s prediction performance and whether this trust mediates the impact of advice source on the degree of advice taking under the JAS paradigm.
A heavy cognitive load is known to impair performance on tasks such as recalling instructional content [5, 62, 64]. Under the JAS paradigm, a heavier cognitive load (due to an elaborated presentation of an advisor’s performance information) may hinder the judge from integrating advice to make a decision [14, 48].
To the best of our knowledge, no research has shown direct process evidence of how
increased performance transparency (via PPFs) affects one’s cognitive load and subsequent
advice taking. Studies on algorithmic advice taking and the adoption of recommendation
agents provide preliminary evidence for our conjecture. Xu et al. [72] show that increased transparency (i.e., by revealing trade-off information among features) could increase individuals’ cognitive load and reluctance to use a recommender system. Poursabzi-Sangdeh et al. [49] find that higher algorithmic transparency in how a linear prediction model works does not increase one’s reliance on algorithmic advice—people may face cognitive overload when encountering an algorithm’s details. Springer and Whittaker [57] suggest that individuals do not exhibit greater intentions to use an algorithm given higher algorithmic transparency, possibly because increased transparency may induce a heavier cognitive load.
We aim to fill these research gaps by measuring PPF-induced changes in cognitive load and
related effects on algorithm appreciation.
Hypothesis Development
Following our theoretical framework in Figure 1, we propose a theoretical model which
uncovers the underlying mechanisms of algorithm appreciation. In this model, we focus
on the mediating role of higher trust in an algorithmic than human advisor and the
moderating role of PPFs that may impose different levels of cognitive load due to
increased performance transparency. We investigate three PPFs with increasing levels of prediction performance transparency: no performance presentation, an aggregated
PPF [12, 17, 40], and an elaborated PPF [17, 73]. We develop our hypotheses in two
sections. The first section (H1–H3) focuses on an aggregated PPF, compared to no performance presentation, where performance transparency is increased but cognitive load is not. Specifically, we explore the impact of increasing performance transparency on algorithm appreciation (H1), followed by investigating the mediating role of trust (H2 and H3). The second section (H4–H7) further explores an elaborated PPF, where performance transparency is further increased (compared to an aggregated PPF and no performance presentation) and a heavy cognitive load is induced (H4). In particular, we investigate the impact of an elaborated PPF (vs. an aggregated PPF and no performance presentation) on the degree of advice taking (H5), the mediating role of trust under different levels of cognitive load (H6), and the moderating role of need for cognition (H7).
Hypothesis 1 (H1): Individuals exhibit a lower degree of advice taking from an algorithmic relative to human advisor when the advisor’s prediction performance is displayed in an aggregated format than when no prediction performance is provided.
Because an algorithm is expected to have higher prediction accuracy and reliability than
humans even for challenging prediction tasks [1, 20, 35, 67, 68, 72], people often have higher
trust in an algorithmic than human advisor under different tasks and interfaces with various
information presentation formats [1, 39]. The literature on algorithm appreciation
consistently shows that individuals demonstrate higher trust in an algorithmic than human advisor when prediction performance is not presented [16, 38, 39, 47, 50], presented in an aggregated format [12, 17], or presented in an elaborated format [16].
In addition, the higher trust in an algorithmic than human advisor does not seem to
differ when prediction performance is presented [17] versus when it is not presented at all
When individuals do not have the option to modify the algorithmic prediction outcome, Dietvorst et al. [16] find that the higher trust in an algorithm than a human declines when prediction performance is presented (vs. not at all)—possibly due to a lack of sense of control. Dietvorst et al. [17] further provide empirical evidence and posit that the sense of control engendered by the option to modify final prediction outcomes could maintain individuals’ higher trust in an algorithmic than human advisor. This finding could be explained by the fact that a sense of control reduces individuals’ skepticism of an algorithm’s prediction competence [15].
Taken together, we expect individuals to exhibit higher trust in an algorithmic than human advisor. This higher trust does not differ across PPFs under the JAS paradigm where individuals can determine their final prediction outcomes:

Hypothesis 2 (H2): Individuals exhibit higher trust in an algorithmic than human advisor, and this higher trust does not differ between when the advisor’s prediction performance is displayed in an aggregated format and when no prediction performance is provided.
Prior research shows the mediating role of trust in the impact of advice source
(algorithmic vs. human advisor) on one’s degree of advice taking when prediction
performance is not presented [16]. Little research explores whether this mediation effect applies when the advisor’s prediction performance is presented in an aggregated format. Prior studies on trust in recommendation agents document
a strong link between individuals’ trust in an agent and their behavioral outcomes
(e.g., using a recommendation agent and relying on its recommendations) [35, 67].
The JAS literature consistently indicates that trust in the advisor positively mediates
one’s degree of advice taking [e.g., 24, 55]. Together with H2, we expect individuals’
trust in the advice source to mediate the impact of advice source (algorithmic vs.
human advisor) on the degree of advice taking, regardless of whether the advisor’s
prediction performance is presented in an aggregated format or not at all. We therefore hypothesize the following:
Hypothesis 3 (H3): Trust mediates the impact of the advice source on one’s degree of advice taking
when the advisor’s prediction performance is displayed in an aggregated format and when no
prediction performance is provided.
Research suggests that an increased cognitive load may impede one’s ability to integrate
advice with one’s own judgment when making a final decision [14, 48]. Individuals under
a relatively high cognitive load may have difficulties relying on external advice when making
a prediction regardless of the advice source. An elaborated PPF will likely impose a higher
cognitive load; thus, we expect this type of PPF to hinder individuals from integrating
external advice into their final predictions irrespective of whether the advice source is an
algorithm or human. We therefore posit that an elaborated PPF reduces the impact of the
advice source (algorithmic vs. human advisor) on the degree of advice taking, compared to
an aggregated PPF and no performance presentation. As such, we hypothesize the
following:
Hypothesis 5 (H5): Individuals exhibit a lower degree of advice taking from an algorithmic relative to human advisor when the advisor’s prediction performance is displayed in an elaborated format than when prediction performance is not displayed or is displayed in an aggregated format.
As indicated in H2 and H3, we do not expect PPF to moderate the impact of advice source
(algorithmic vs. human advisor) on one’s trust in the advisor—even if the elaborated PPF
induces a higher cognitive load, compared to an aggregated format or no prediction
performance presentation. Research suggests that a higher cognitive load decreases individuals’ trust in a recommendation system [2, 75], which is unlikely to be moderated by the
type of advice source. Ahmad et al. [2] find no support for the moderating role of advice
source in the impact of cognitive load on trust. As a result, we do not anticipate that
individuals will reduce their trust more in an algorithmic than human advisor when
experiencing a relatively high cognitive load. Rather, we expect that the relatively high
cognitive load induced by an elaborated PPF will hamper one’s ability to integrate external
advice when making a final prediction regardless of advice source [14, 48]. To this end,
while we hypothesize that individuals’ trust mediates the impact of advice source on the
degree of advice taking (see H2), we presume that a high cognitive load significantly
undermines the mediating role of trust. Therefore, we hypothesize the following.
Hypothesis 6 (H6): Trust mediates the impact of the advice source on one’s degree of advice
taking, but only when the cognitive load is relatively low (vs. high).
Overview of Experiments
In a series of five between-subjects experiments, we explore how and why increased
transparency in performance (as manifested in PPFs) influences algorithm appreciation
based on the theoretical model in Figure 2.
Advice-Taking Task
Following previous literature [26, 39, 47], we construct an advice-taking task and benchmark
it with existing studies on algorithm appreciation. We specifically adapt an objective prediction task from Dietvorst et al. [16, 17] and develop a real algorithm based on actual data.
Across the five experiments, participants are asked to perform a prediction task in which
they predict a target student’s standardized math score (ranging from 0 to 100) based on
nine pieces of information about the student (see Figure A1 in Online Supplemental
Appendix A for details). Under the JAS paradigm, each participant is asked to predict the
target student’s standardized math score twice: before and after being presented with advice
generated by the algorithmic prediction regarding the student’s predicted score.
Participants’ second (final) predictions are incentivized; in addition to earning $1 for
completing a survey session, all participants can earn a bonus payment of at least $0.20
for an answer within 6 points of the target student’s actual math score. The payment
increases by $0.10 for each additional 2 points closer to the truth. To ensure that participants understand the incentive-alignment mechanism, we ask them to answer a question
about the incentive scheme before proceeding to the prediction task. This question also
serves as an attention check.
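To make the incentive scheme concrete, the short sketch below computes the bonus from a participant’s absolute prediction error. The exact tiering between the stated anchor points (e.g., how boundaries are handled) is our assumption from the description above, not a detail reported in the paper.

```python
def bonus_payment(final_estimate: float, true_score: float) -> float:
    """Bonus (in dollars) for a final prediction, per the scheme described above.

    Assumed reading: within 6 points of the truth earns $0.20, and the bonus
    grows by $0.10 for each additional 2 points closer (within 4 -> $0.30,
    within 2 -> $0.40, exact -> $0.50). Boundary handling is an assumption.
    """
    error = abs(final_estimate - true_score)
    if error > 6:
        return 0.0
    full_steps_closer = int((6 - error) // 2)  # complete 2-point steps inside the band
    return 0.20 + 0.10 * full_steps_closer

# Example: the advice (62) is 1 point from the truth (63), so a participant
# who fully follows the advice lands in the within-2-points tier.
print(bonus_payment(62, 63))  # 0.40 under these assumptions
```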
The target student’s true standardized math score is 63, and the advice (i.e., the algorithmically predicted math score for the target student) is 62. We intentionally pick a target student for whom the advice is close to the truth (i.e., 1 point off) so that participants with a higher degree of advice taking are rewarded with a higher bonus. The algorithmic performance is reasonably
good—the average absolute error (AAE) is 3.5/100 points off the actual score with 2 out of
10 perfect predictions in the test set. The advice and performance (if presented) are identical
for participants randomly assigned to the human and algorithmic conditions. Details about
the algorithm, the selection criteria for the prediction target, the advisor’s performance, and
experimental stimuli appear in the Online Supplemental Appendix A.
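For reference, the average absolute error over the n = 10 test-set predictions follows the standard definition:

$$\mathrm{AAE} = \frac{1}{n}\sum_{i=1}^{n}\lvert \hat{y}_i - y_i \rvert,$$

where $\hat{y}_i$ is the advisor’s predicted math score and $y_i$ the actual score for test case $i$.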
Advice Source
The first manipulation involves the advice source (or advisor): an algorithm or humans
(past participants). We keep the advice identical across all participants and inform each
participant that the advisor makes predictions about the same target student. Framings of
the advice sources in our experiments are adapted from prior studies [16, 17, 39]. We
choose past participants (peers) instead of an expert as the human advisor. The literature on
wisdom of the crowd suggests that peers are more persuasive than another person even
when that person is an expert [44, 58]. We also test a case in which the human advisor is an
expert in Study 1C and find similar results.
Advisor’s PPF
The second manipulation involves the advisor’s PPF. We adopt algorithmic PPFs common
in real life as well as in the literature. Table 2 lists our four selected parsimonious PPFs with
increasing performance transparency.
No Error (NE)
We define NE as the format where no information about advisor performance is presented
(i.e., the lowest performance transparency). The NE format is often applied in research on
individuals’ responses to algorithmic versus human aids [12, 16, 38, 39, 50].
Manipulation Check
We conduct a manipulation check and confirm that increasing the amount of information
about the advisor’s prediction performance via NE, aggregated, and elaborated formats
increases participants’ perceived information transparency. Our manipulation check and
results are detailed in Online Supplemental Appendix B.
Participants
We recruit English-speaking participants via Prolific and participants residing in North
America (the US and Canada) via Amazon Mechanical Turk (MTurk) because the
online experiment instrument is in English. We exclude the following categories of
participants: those who decline to consent; fail to pass the attention check; complete the
survey more than once; have poor-quality responses (e.g., straight liners); or complete
the survey on a mobile device, such as a phone or tablet. In the analysis, participants whose initial predictions are identical to the advice are excluded; the weight of advice (WOA) for these participants is undefined, as the denominator when calculating WOA is zero [23, 24, 28, 39, 47].
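Throughout, WOA follows the standard judge-advisor system measure [e.g., 28, 39]: the shift from the initial toward the final estimate, expressed as a fraction of the distance between the initial estimate and the advice,

$$\mathrm{WOA} = \frac{\text{final estimate} - \text{initial estimate}}{\text{advice} - \text{initial estimate}},$$

which is undefined when the initial estimate equals the advice (zero denominator) and, per Note ii, is winsorized to the interval [0, 1].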
Summary of Studies
Table 3 summarizes our experimental design and the sample size in each of the five
experiments. Study 1A and Study 2 are our main studies, designed to test H1–H6 and
answer RQ1 and RQ2. Study 1B is conducted to rule out an alternative explanation
that our findings in Study 1A are subject to the specific choice of the prediction target;
Study 1C is conducted to rule out an alternative explanation that our findings in Study
1A are subject to the choice in the type of human advisor (past participants as
opposed to an expert). Study 3 tests H7 by exploring the moderating role of NC on
algorithm appreciation with an elaborated PPF. Sample sizes are determined by
benchmarking each to the sample sizes of previous studies that employ similar
approaches [16, 17, 39]. Sample sizes are further validated with our own calculations
of effect size and statistical power.
Study 1
Study 1 aims to test H1, H2, and H3 by examining how and why algorithm appreciation is
affected by increased performance transparency with an aggregated PPF compared to NE.
We operationalize the aggregated format by presenting prediction performance in AAE.
Study 1A
Method. We conduct a 2 (advice source: algorithm vs. humans) by 2 (PPF: NE vs. AAE) between-subjects experiment. We recruit 540 participants from Prolific (50.56 percent women, M_age = 26.68) and randomly assign them to one of the four experimental conditions. We exclude 28 participants who are straight liners and 34 participants whose initial
predictions happen to be identical to the advice. The final sample thus includes 478
participants.
After being given the advice and before making their final judgment about the target student’s standardized math score, participants rate their trust in the advice source using a 3-item scale adapted from Komiak and Benbasat [35]. The items are scored on a 7-point Likert scale (1 = strongly disagree, 7 = strongly agree). An example item is “[Source] is with good knowledge about high school students’ math scores.” The items are reliable (Cronbach’s α = 0.77), and we use the average rating as a composite index to measure participants’ trust in the advice source.
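Cronbach’s α for the k = 3 items is the usual internal-consistency index:

$$\alpha = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k}\sigma^{2}_{i}}{\sigma^{2}_{\text{total}}}\right),$$

where $\sigma^{2}_{i}$ is the variance of item $i$ and $\sigma^{2}_{\text{total}}$ the variance of the summed items.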
To test H3, we conduct a mediation analysis (advice source→trust in advice source→WOA) moderated by PPF (NE vs. AAE), controlling for participants’ age and education, and estimate effects via bootstrapping with 5000 samples (Model 15; Hayes [29]). We find a positive indirect effect of advice source on WOA
through trust in advice source under the NE (95 percent confidence interval: [0.0079,
0.0431]) and AAE formats (95 percent confidence interval: [0.0070, 0.0481]). We find no
conditional direct effect of advice source on WOA under the NE (95 percent confidence
interval: [−0.0148, 0.0859]) or AAE format (95 percent confidence interval: [−0.0229,
0.0752]). We also do not find that the mediation effect of trust in advice source on the
impact of advice source on WOA is moderated by PPF (NE vs. AAE) (95 percent confidence
interval: [−0.0234, 0.0274]).
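Hayes’s Model 15 is typically estimated with the PROCESS macro; as a rough illustration of the bootstrapping logic only (not the authors’ exact procedure, which also includes the moderator and covariates), the sketch below resamples participants to build a percentile confidence interval for a simple indirect effect. The column names (`source`, `trust`, `woa`) are hypothetical.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

def boot_indirect(df: pd.DataFrame, n_boot: int = 5000, seed: int = 0) -> np.ndarray:
    """Percentile bootstrap CI for the indirect effect source -> trust -> woa.

    Simplified sketch: the moderator and the age/education covariates used in
    the paper's Model 15 analysis are omitted for brevity.
    """
    rng = np.random.default_rng(seed)
    effects = np.empty(n_boot)
    for i in range(n_boot):
        s = df.sample(len(df), replace=True, random_state=int(rng.integers(2**31)))
        a = smf.ols("trust ~ source", data=s).fit().params["source"]           # path a
        b = smf.ols("woa ~ trust + source", data=s).fit().params["trust"]      # path b
        effects[i] = a * b  # indirect effect a * b per resample
    return np.percentile(effects, [2.5, 97.5])  # 95 percent percentile CI
```

With `source` coded 0/1 (human vs. algorithm), an interval excluding zero mirrors the significance criterion used above.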
Study 1B
Study 1B is carried out to ensure the robustness of findings from Study 1A by using
a different student as the prediction target. The true math score of the target student is 56
while the advice is 55, different from the target student in Study 1A. We recruit 493
participants from Amazon MTurk (35.70 percent women, M_age = 33.91) and exclude 43
participants whose initial predictions are identical to the advice. Our final sample contains
450 participants. The experimental procedure is the same as in Study 1A except that we
do not measure participants’ trust in the advisor when making the final estimates, given
our focus on replicating the behavioral outcome based on the experimental design in
Study 1A.
The two-way ANCOVA results after controlling for participants’ age and education
show a significant main effect of advice source on WOA [F(1, 444) = 11.19, p < 0.001, η² = 0.02]; that is, participants adjust their final predictions towards the advice to a larger extent
when the advice comes from the algorithm (M = 0.45, SD = 0.40) versus humans (M = 0.33,
SD = 0.38). Similar to Study 1A, we find no main effect of PPF (NE vs. AAE) or any
interaction effect (Fs < 1). A post-hoc Tukey test shows no significant difference in WOA
between NE and AAE regardless of whether the advice comes from an algorithm (M_NE = 0.45, SD_NE = 0.41; M_AAE = 0.46, SD_AAE = 0.39; p = 0.99) or humans (M_NE = 0.31, SD_NE = 0.38; M_AAE = 0.34, SD_AAE = 0.38; p = 0.94). Figure C1 in Online Supplemental Appendix C shows the plot of the average WOA per experimental condition with error bars denoting 95 percent confidence intervals.
Study 1C
We conduct Study 1C to ensure the robustness of results in Study 1A by using a different
human advisor—an expert—instead of past participants. We recruit 494 participants from
Amazon MTurk (49.80 percent women, M_age = 39.49). Forty-three participants whose
initial predictions happen to be identical to the advice are excluded, leaving 451 participants
as the final sample. The experimental procedure is identical to that in Study 1B except that
we replace the human advisor from “past participants” with “an expert” while using the
same prediction target as in Study 1A.
Two-way ANCOVA results after controlling for participants’ age and education show
a significant main effect of advice source on WOA [F(1, 445) = 4.13, p < 0.05, η² = 0.01].
Participants thus adjust their final predictions towards the advice to a larger extent when
the advice comes from the algorithm (M = 0.43, SD = 0.40) versus the expert (M = 0.36,
SD = 0.37). Again, we find no main effect of the PPF (NE vs. AAE) or any interaction
effect (Fs < 1). A post-hoc Tukey test also indicates no significant difference in WOA between NE and AAE when the advice comes from the algorithm (M_NE = 0.46, SD_NE = 0.39; M_AAE = 0.40, SD_AAE = 0.40; p = 0.57) or the expert (M_NE = 0.38, SD_NE = 0.38; M_AAE = 0.33, SD_AAE = 0.36; p = 0.76). Figure C2 in the Online Supplemental Appendix C depicts the plot of the average WOA per experimental condition with error bars denoting 95 percent confidence intervals.
Discussion
The results of Study 1 echo those of Logg et al. [39] wherein individuals demonstrate algorithm appreciation under the NE format. Our findings show that algorithm appreciation does not seem to decrease under the AAE compared to the NE format, thus rejecting H1. This answers RQ1. In fact, our findings suggest that individuals can show similar levels of algorithm appreciation under the AAE compared to the NE format. The rejection of H1 might be due to the relatively high prediction performance of the algorithmic advisor (3.5/100 points off the actual score) in our experimental design. That is, the relatively high prediction performance does not evoke individuals’ lower tolerance of or higher sensitivity to algorithmic prediction errors [16, 20]. The results of Study 1A answer
RQ2 by showing that individuals typically trust an algorithmic advisor more than
a human irrespective of PPF in the NE or AAE format, confirming H2. Furthermore,
trust in advice source also mediates algorithm appreciation under both the NE and AAE
formats, confirming H3.
Study 2
In Study 2, we further examine the impact of increased performance transparency on
algorithm appreciation. Specifically, in addition to the aggregated and NE formats in
Study 1, we introduce a Table-Diff format that tabulates the individual past absolute
deviation of predictions from the truth to operationalize the elaborated format. Because
people can derive AHR and AAE from the Table-Diff format, we use AAE-AHR to
operationalize the aggregated format in this study. We test H4, H5, and H6 through this
experimental setup.
Method
Participants are randomly assigned to one of six conditions in a 2 (advice source: an
algorithm vs. humans) by 3 (PPF: NE vs. AAE-AHR vs. Table-Diff) between-subjects
design. We recruit 612 participants from Amazon MTurk (44.61 percent women, Mage
= 37.98) and exclude 46 participants whose initial predictions are identical to the
advice. The final sample consists of 566 participants. Participants are asked to rate
their trust in the advice source based on the same scale used in Study 1A (Cronbach’s
α = 0.76).
Different from Study 1A, we record each participant’s response time in seconds when
making their initial and final estimates, which immediately precede and follow (respectively) the presentation of advice along with prediction performance (except in the NE condition) on a separate page. We do not expect differences in individuals’ response time
when giving their initial estimates across conditions; however, we anticipate slower
response time when making final predictions under the Table-Diff format compared to
the NE and AAE-AHR formats due to higher cognitive load per H4. Response time is an
objective measure of extraneous cognitive load [60]. Following Ward and Mann [70], we
account for individual differences in making predictions by using an adjusted response time
in the final estimate to measure cognitive load (i.e., by subtracting each participant’s
response time in making their initial estimate from their response time in making their final estimate).
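Formally, the cognitive-load measure for participant $i$ is

$$\text{AdjRT}_i = \mathrm{RT}^{\text{final}}_i - \mathrm{RT}^{\text{initial}}_i,$$

so a larger (less negative) value indicates a slower final estimate relative to the participant’s own baseline speed.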
Results
Cognitive Load
A two-way ANCOVA controlling for participants’ age and education reveals no significant main effect of advice source (F < 1), PPF [F(2, 552) = 1.45, p = 0.24, η² = 0.005], or their interaction (F < 1) on response time (log-transformed due to right skewness) when participants make initial estimates. To test H4, we submit the adjusted response time to a two-way ANCOVA controlling for participants’ age and education. We observe a significant main effect of PPF on adjusted response time when making the final estimate [F(2, 552) = 3.05, p < 0.05, η² = 0.01]. We find no significant main effect of advice source (F < 1) or its interaction with PPF on the adjusted response time when participants make their final estimates [F(2, 552) = 2.18, p = 0.11, η² = 0.008]. We further contrast the mean adjusted response time in the final estimate under the Table-Diff format against those under the NE and AAE-AHR formats (adjusted response times are negative because participants make their final estimates more quickly than their initial estimates). Results suggest that participants are slower in giving their final estimates under the Table-Diff format than under the NE and AAE-AHR formats (M_NE = −31.89, SD_NE = 35.77; M_AAE-AHR = −29.67, SD_AAE-AHR = 36.56; M_Table-Diff = −24.92, SD_Table-Diff = 27.23; t(552) = 2.41, p < 0.05). We find no significant difference in adjusted response time when making the final estimate between the NE and AAE-AHR formats [t(552) = 0.54, p = 0.59].
WOA
To test H5, we submit WOA to a two-way ANCOVA controlling for participants’ age and education given the 2-by-3 experimental design. Results indicate a significant main effect of advice source on WOA [F(1, 552) = 12.65, p < 0.001, η² = 0.02], demonstrating that participants adjust their final predictions towards the advice to a larger extent when the advice comes from an algorithm (M = 0.40, SD = 0.40) versus other humans (M = 0.35, SD = 0.41). We find a marginally significant main effect of PPF (M_NE = 0.40, SD_NE = 0.40; M_AAE-AHR = 0.41, SD_AAE-AHR = 0.41; M_Table-Diff = 0.32, SD_Table-Diff = 0.40; F(2, 552) = 2.44, p < 0.10, η² = 0.01). We also identify a significant interaction effect of advice source and PPF [F(2, 552) = 6.19, p < 0.01, η² = 0.02]. Figure 4 presents a plot of the average WOA; Table C1
in the Online Supplemental Appendix C lists the WOA per experimental condition.
We perform a post-hoc test with Tukey’s HSD to further understand the interaction
effect. We find no significant difference in WOA when the advice is given by humans among
the NE, AAE-AHR, and Table-Diff formats (pairwise ps > 0.65). By contrast, we find that
WOA under the Table-Diff format is significantly smaller than under the NE format (p < 0.01) and marginally significantly smaller than under the AAE-AHR format (p < 0.10). No significant
difference emerges in WOA between the NE and AAE-AHR formats (p = 0.91). In sum,
these results show that the Table-Diff format greatly reduces algorithm appreciation
compared to the NE and AAE-AHR formats.
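For readers who want to reproduce this style of analysis, a minimal sketch using statsmodels appears below. The column names (`woa`, `source`, `ppf`, `age`, `education`) are hypothetical, and the Tukey step here ignores the covariates for simplicity, unlike the covariate-adjusted analysis reported above.

```python
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf
from statsmodels.stats.multicomp import pairwise_tukeyhsd

def ancova_and_posthoc(df: pd.DataFrame) -> None:
    # Two-way ANCOVA: advice source x PPF, with age and education as covariates.
    model = smf.ols("woa ~ C(source) * C(ppf) + age + education", data=df).fit()
    print(sm.stats.anova_lm(model, typ=2))  # Type II sums of squares

    # Post-hoc pairwise comparisons of the three PPFs within the algorithm
    # condition (covariates not partialed out in this simplified step).
    algo = df[df["source"] == "algorithm"]
    print(pairwise_tukeyhsd(algo["woa"], algo["ppf"], alpha=0.05))
```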
Moderated Mediation
To test H6, we carry out a mediation analysis (advice source→trust in advice
source→WOA) moderated by participants’ cognitive load, controlling for participants’
age and education. We estimate the indirect effect of the moderated mediation analysis
via bootstrapping with 5000 samples (Model 14; Hayes [29]). The impact of advice
source on WOA, mediated through trust in advice source, is negatively moderated by
the level of cognitive load (95 percent confidence interval: [−0.0006, −0.0001]). We
find a positive indirect effect of the advice source on WOA through trust in advice
source when participants face a relatively low cognitive load (one standard deviation
below the mean; 95 percent confidence interval: [0.0084, 0.0309]). We find no indirect
effect of advice source on WOA through trust in advice source when participants
encounter a relatively high cognitive load (95 percent confidence interval: [−0.0063,
0.0162]).
Discussion
Results from Study 2 further answer RQ1 and RQ2 by showing that increased performance
transparency via PPFs can influence individuals’ level of algorithm appreciation.
Specifically, we show that the Table-Diff format increases participants’ cognitive load as
evidenced by longer (adjusted) response time when making final estimates; this pattern
supports H4. In addition, compared with Study 1, further boosting performance transparency by presenting performance in an elaborated format (Table-Diff) can lead to lower algorithm appreciation, supporting H5. More importantly, our moderated mediation analysis indicates that a higher cognitive load under the Table-Diff format impedes advice
taking, despite participants’ higher trust in the algorithmic than human advisor (supporting
H2). Trust still mediates the impact of advice source (algorithm vs. human) on the degree of
advice taking when individuals encounter a relatively low cognitive load, supporting H6. It
is worth noting that our results in the elaborated PPF (Table-Diff) do not necessarily suggest
that people exhibit algorithm aversion as suggested by Dietvorst et al. [16]. We do not find
that individuals demonstrate greater advice taking upon considering advice from humans
versus from an algorithm.
Study 3
Now we aim to test H7 and examine the boundary condition of algorithm appreciation under the Table-Diff format that induces a relatively high cognitive load. Before
the main study, we conduct an offline pilot study and find that a sample of students
among the top 0.1 percent of scorers on a national exam in a European country
exhibit algorithm appreciation under the Table-Diff condition (see details in Online
Supplemental Appendix D). These students may have a higher cognitive tendency
than participants recruited from MTurk in Study 2. In Study 3, we also consider
whether NC moderates the effect of algorithm appreciation under the Table-Diff
format.
Method
We recruit 195 participants from Amazon MTurk (43.60 percent women, M_age = 33.88).
Each participant is randomly assigned to one of two conditions in a one-factor design
(advice source: an algorithm vs. humans) in which prediction errors are presented in Table-
Diff format. We exclude 11 participants whose initial predictions are identical to the advice;
the final sample consists of 184 participants. Different from earlier studies, after making the
final prediction, each participant indicates their NC through an index of three items [9, 30]
scored on a 7-point Likert scale (1 = strongly disagree, 7 = strongly agree; Cronbach’s α =
0.90). An example item is “I like to have the responsibility of handling a situation that
requires a lot of thinking.” We conduct a median split on the composite NC index,
calculated by averaging the three items, to divide our sample into low- and high-NC groups.
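The composite index and median split can be implemented directly; the item columns (`nc1`, `nc2`, `nc3`) are hypothetical names for the three NC items.

```python
import pandas as pd

def add_nc_groups(df: pd.DataFrame) -> pd.DataFrame:
    """Composite NC index (mean of the three items) and a median split."""
    df = df.copy()
    df["nc"] = df[["nc1", "nc2", "nc3"]].mean(axis=1)
    # Participants exactly at the median fall into the low-NC group here;
    # the paper does not specify how ties at the median are handled.
    df["nc_group"] = (df["nc"] > df["nc"].median()).map({True: "high", False: "low"})
    return df
```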
Results
To test H7, we submit WOA to a two-way ANCOVA while controlling for participants’ age and education. Advice source serves as the independent variable with NC as the moderator. We find a significant moderation effect of NC [F(1, 178) = 4.93, p < 0.05, η² = 0.02], suggesting that participants with higher NC show a greater tendency for algorithm appreciation (M_algo = 0.35, SD_algo = 0.41; M_human = 0.21, SD_human = 0.33) than those with lower NC (M_algo = 0.19, SD_algo = 0.33; M_human = 0.32, SD_human = 0.42); see Figure 5.
Discussion
Our results support H7 in that algorithm appreciation under an elaborated PPF (Table-Diff)
might be contingent on individuals’ NC. Specifically, when prediction performance is
presented in the Table-Diff format, people with higher NC show a greater tendency for
algorithm appreciation than those with lower NC. Interestingly, people with relatively high
NC display a similar pattern of algorithm appreciation under the NE and the aggregated
PPFs (AAE and AAE-AHR) in previous studies. These results suggest that an elaborated
PPF (Table-Diff) does not necessarily deter people with relatively high NC from appreciating algorithmic advice.
General Discussion
Summary of Main Findings
In a series of five experiments, we find consistent evidence of algorithm appreciation; that is, individuals show greater advice taking from an algorithm than from humans. More importantly, we show whether and how increased performance transparency influences one’s level
of algorithm appreciation. In response to RQ1, Studies 1 and 2 indicate that individuals
exhibit similar levels of algorithm appreciation between the NE and aggregated PPFs.
Further increasing performance transparency via an elaborated PPF lowers algorithm
appreciation. The findings of Studies 1, 2, and 3 also answer RQ2. Results from Studies 1
and 2 indicate that trust in an advisor’s prediction performance mediates the impact of
advice source on the degree of advice taking when individuals are likely to experience
a relatively low cognitive load from the NE and an aggregated PPF. An elaborated PPF
increases cognitive load and inhibits the mediation effect of trust in the impact of advice
source on the degree of advice taking. Finally, the results from Study 3 reveal that
individuals with higher NC are more prone to algorithm appreciation despite the relatively
high cognitive load induced by an elaborated PPF. Table 4 outlines these findings and their
relationships with our research questions and hypotheses.
Implications for Research

First, we show that trust in the advisor and the cognitive load induced by PPFs jointly influence individuals’ advice taking. These findings enhance our understanding of the central role of trust as an underlying mechanism of algorithm appreciation. Our results suggest that trust in an advisor mediates the impact of advice source on one’s degree of advice taking when performance information is either not provided or appears in an aggregated PPF, but this mediation diminishes with an elaborated PPF. Our findings are broadly consistent with previous research on the mediating effects of trust in the adoption of recommendation agents [1, 6, 35, 67, 68, 72], although our results indicate that these effects are not uniform; rather, they might depend on how an algorithm’s performance information is displayed.
Second, we enrich the algorithmic transparency literature by investigating performance
transparency, which conveys why an individual should rely on algorithmic advice rather than
explaining how an algorithm works [10, 18, 54, 57, 71]. Our work is likely the first to
systematically examine how variation in performance transparency influences algorithm
appreciation while controlling for advice quality and the advisor’s prediction performance.
Our findings provide a plausible explanation for the contradictory results from Dietvorst et al.
[16] and Castelo et al. [12] regarding how increased performance transparency affects algorithm appreciation. The results of Study 1 accord with those of Castelo et al. [12]: increasing performance transparency by presenting performance in an aggregated format (vs. no performance presentation) does not necessarily decrease algorithm appreciation. Our findings
from Study 2 also align with Dietvorst et al. [16]: enhancing performance transparency by
presenting performance in an elaborated format (vs. no performance presentation) lessens
algorithm appreciation. Taken together, prior mixed findings could be partially attributed to
different PPFs inducing distinct cognitive loads that moderate algorithm appreciation.
Lastly, the majority of prior studies on algorithmic advice taking focus on trust as a driver to enhance individuals’ adherence to algorithmic advice [12, 16, 17, 21, 39]. In contrast, our paper demonstrates that the cognitive load induced by PPFs is another important factor contributing to when and why individuals appreciate algorithms more than humans. This insight may illuminate future algorithmic advice-taking research. In particular, we speculate that the increased performance transparency stemming from an elaborated compared to an aggregated PPF could result in a higher cognitive load and thus compromise interpretability [36, 49]. While our research does not directly examine algorithms’ interpretability, our findings suggest that information presentation entailing relatively low cognitive effort in interpretation is likely to enhance one’s reliance on algorithmic advice. Moreover, the cognitively effortful interpretation of algorithmic advice differentially affects algorithm appreciation. In this sense, our paper contributes to the literature by highlighting the importance of considering one’s cognitive properties in algorithmic decision making. Future research can consider the impacts of interpretability on algorithmic advice taking among individuals with varying cognitive resources.
algorithm [3, 63]. Third, our results can alleviate concerns among individuals or organizations implementing algorithmic recommendations subject to legal enforcement (e.g., GDPR). Presenting algorithmic performance (even when prediction performance is not perfect) does not necessarily lead to lower algorithm appreciation as long as individuals can easily process this information. Lastly, for algorithm managers, offering the highest algorithmic transparency is not always optimal: we show that performance presentation in an elaborated format does not necessarily boost humans’ algorithm appreciation due to an increased cognitive load. A possible solution to this problem is to offer individuals PPF options so they may choose a format that suits their cognitive tendencies.
Conclusion
Despite converging evidence of algorithm appreciation (i.e., the higher persuasive efficacy of algorithmic advice compared to identical human advice), the scarce existing research presents mixed findings regarding whether and how an increase in prediction performance transparency impacts one’s appreciation of algorithmic relative to human decision aids. To address these issues, we propose a theoretical model based on the JAS paradigm and show that presenting prediction performance does not necessarily deter individuals from relying more on algorithmic than human advice. Individuals’ higher trust in an algorithmic relative to a human advisor drives their higher reliance on algorithmic advice, but only if the presentation of an advisor’s prediction performance does not induce a high cognitive load. To the best of our knowledge, we are among the first to highlight the importance of considering individuals’ cognitive aspects when increasing algorithmic transparency in the context of algorithmic decision making. Our findings provide insights for algorithm managers and avenues for future research.
Notes
i We also confirm this finding in a preliminary study. Among 125 Amazon Mechanical Turkers
(52.8 percent women, M_age = 39.8), 60 percent prefer to know the prediction accuracy rather
than the algorithm’s mechanism when asked to imagine their use of an algorithmic decision aid
in making predictions (Chi-square test p < 0.03).
ii Following prior research [23, 39], WOA is winsorized at 0 and 1; that is, any value of WOA
greater than 1 is replaced by 1, and any value of WOA less than 0 is replaced by 0. The value of
WOA is not available if the initial judgment is identical to the advice.
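In code, this winsorization is a simple clip over the WOA values; the array names below (`initial`, `final`, `advice`) are hypothetical.

```python
import numpy as np

def winsorized_woa(initial: np.ndarray, final: np.ndarray, advice: np.ndarray) -> np.ndarray:
    """WOA = (final - initial) / (advice - initial), clipped to [0, 1].

    Cases where the initial judgment equals the advice are left as NaN
    (undefined), matching the exclusions described in the Participants section.
    """
    denom = advice - initial
    woa = np.full(denom.shape, np.nan, dtype=float)
    np.divide(final - initial, denom, out=woa, where=denom != 0)
    return np.clip(woa, 0.0, 1.0)
```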
Disclosure Statement
No potential conflict of interest was reported by the authors.
Notes on contributors
Sangseok You (sangyou@skku.edu) holds a Ph.D. from University of Michigan and is an assistant
professor of Information Systems at Sungkyunkwan University, South Korea. His research focuses on
understanding how teams working with technologies operate and promote team outcomes, encompassing such topics as human-robot collaboration, artificial intelligence, and open and virtual collaboration. Dr. You’s work has appeared in Journal of the Association for Information Systems,
Journal of the Association for Information Science and Technology, and other journals, and in the
proceedings of Academy of Management Annual Meeting and International Conference on
Information Systems.
Cathy Liu Yang (yang@hec.fr) is an assistant professor in the Information Systems Department at
HEC Paris. She received her Ph.D. in marketing from Columbia Business School. Her research
interests include modeling consumer information processing to improve preference measurement
and understanding the causal impact of algorithmic influence by using secondary and laboratory
data. Dr. Yang’s award-winning work has appeared in Journal of Marketing Research.
Xitong Li (lix@hec.fr; corresponding author) is an associate professor of Information Systems at HEC
Paris, France. He received his Ph.D. in Management from MIT Sloan School of Management and his
Ph.D. in Engineering from Tsinghua University, China. His research interests include the identification of causal impacts of using online data/information. Dr. Li’s work has appeared in journals including Information Systems Research, Journal of Management Information Systems, MIS Quarterly, and various ACM/IEEE Transactions. His research has been funded by the French national research agency (ANR AAPG France) and has received several best paper awards.
Funding
This research work is partly supported by funding from the Hi! PARIS Fellowship and the French
National Research Agency (ANR) Investissements d’Avenir LabEx Ecodec [Grant ANR-11-LABX-0047].
References
1. Adomavicius, G.; Bockstedt, J.C.; Curley, S.P.; and Zhang, J. Reducing recommender system
biases: An investigation of rating display designs. MIS Quarterly: Management Information
Systems, 43, 4 (2019), 1321–1341.
2. Ahmad, M.I.; Bernotat, J.; Lohan, K.; and Eyssel, F. Trust and cognitive load during human-
robot interaction. arXiv:1909.05160 [cs], (September 2019).
3. Ahsen, M.E.; Ayvaci, M.U.S.; and Raghunathan, S. When algorithmic predictions use
human-generated data: A bias-aware classification algorithm for breast cancer diagnosis.
Information Systems Research, 30, 1 (March 2019), 97–116.
4. Banerjee, A.V. A simple model of herd behavior. The Quarterly Journal of Economics, 107, 3
(August 1992), 797–817.
5. Barrouillet, P.; Bernardin, S.; Portrat, S.; Vergauwe, E.; and Camos, V. Time and cognitive load
in working memory. Journal of Experimental Psychology: Learning, Memory, and Cognition, 33,
3 (2007), 570–585.
6. Benbasat, I.; and Wang, W. Trust in and adoption of online recommendation agents. Journal of
the Association for Information Systems, 6, 3 (March 2005).
7. Berk, R. An impact assessment of machine learning risk forecasts on parole board decisions
and recidivism. Journal of Experimental Criminology, 13, 2 (June 2017), 193–216.
8. Bettman, J.R.; and Kakkar, P. Effects of information presentation format on consumer information acquisition strategies. Journal of Consumer Research, 3, 4 (March 1977), 233–240.
9. Cacioppo, J.T.; Petty, R.E.; Feinstein, J.A.; and Jarvis, W.B.G. Dispositional differences in
cognitive motivation: The life and times of individuals varying in need for cognition.
Psychological Bulletin, 119, 2 (1996), 197–253.
10. Cai, C.J.; Winter, S.; Steiner, D.; Wilcox, L., and Terry, M. “Hello AI”: Uncovering the
onboarding needs of medical practitioners for human-AI collaborative decision-making. Proceedings of the ACM on Human-Computer Interaction, 3, CSCW (November 2019), 1–24.
11. Canfield, C.; Bruin, W.B. de; and Wong-Parodi, G. Perceptions of electricity-use communications: Effects of information, format, and individual differences. Journal of Risk Research, 20, 9
(September 2017), 1132–1153.
12. Castelo, N.; Bos, M.W.; and Lehmann, D.R. Task-dependent algorithm aversion. Journal of
Marketing Research, 56, 5 (October 2019), 809–825.
13. Chevalier, J.A.; and Mayzlin, D. The effect of word of mouth on sales: Online book reviews.
Journal of Marketing Research, 43, 3 (August 2006), 345–354.
14. Chun, W.Y.; and Kruglanski, A.W. The role of task demands and processing resources in the
use of base-rate and individuating information. Journal of Personality and Social Psychology, 91,
2 (2006), 205–217.
15. Das, T.K.; and Teng, B.-S. Trust, control, and risk in strategic alliances: An integrated
framework. Organization Studies, 22, 2 (March 2001), 251–283.
16. Dietvorst, B.J.; Simmons, J.P.; and Massey, C. Algorithm aversion: People erroneously avoid
algorithms after seeing them err. Journal of Experimental Psychology: General, 144, 1 (2015),
114–126.
17. Dietvorst, B.J.; Simmons, J.P.; and Massey, C. Overcoming algorithm aversion: people will use
imperfect algorithms if they can (even slightly) modify them. Management Science, 64, 3
(March 2018), 1155–1170.
18. Dove, G.; Balestra, M.; Mann, D.; and Nov, O. Good for the many or best for the few?
A dilemma in the design of algorithmic advice. Proceedings of the ACM on Human-
Computer Interaction, 4. CSCW2, 2020, pp. 1–22.
19. Duan, J.; Xia, X.; and Van Swol, L.M. Emoticons’ influence on advice taking. Computers in
Human Behavior, 79, (February 2018), 53–58.
20. Dzindolet, M.T.; Peterson, S.A.; Pomranky, R.A.; Pierce, L.G.; and Beck, H.P. The role of trust
in automation reliance. International Journal of Human-Computer Studies, 58, 6 (June 2003),
697–718.
21. Fügener, A.; Grahl, J.; Gupta, A.; and Ketter, W. Will humans-in-the-loop become borgs?
Merits and pitfalls of working with AI. Management Information Systems Quarterly, 45, 3
(2021), 1527–1556.
22. Ganzach, Y. Predictor representation and prediction strategies. Organizational Behavior and
Human Decision Processes, 56, 2 (November 1993), 190–212.
23. Gino, F. Do we listen to advice just because we paid for it? The impact of advice cost on its use.
Organizational Behavior and Human Decision Processes, 107, 2 (2008), 234–245.
24. Gino, F.; and Schweitzer, M.E. Blinded by anger or feeling the love: How emotions influence
advice taking. Journal of Applied Psychology, 93, 5 (2008), 1165.
25. Goodman, B.; and Flaxman, S. European Union regulations on algorithmic decision-making
and a “right to explanation.” AI Magazine, 38, 3 (October 2017), 50–57.
26. Gunaratne, J.; Zalmanson, L.; and Nov, O. The persuasive power of algorithmic and
crowdsourced advice. Journal of Management Information Systems, 35, 4 (October 2018),
1092–1120.
27. Hainc, N.; Federau, C.; Stieltjes, B.; Blatow, M.; Bink, A., and Stippich, C. The bright, artificial
intelligence-augmented future of neuroimaging reading. Frontiers in Neurology, 8, (2017), 489.
doi:10.3389/fneur.2017.00489.
28. Harvey, N.; and Fischer, I. Taking advice: Accepting help, improving judgment, and
sharing responsibility. Organizational Behavior and Human Decision Processes, 70, 2
(1997), 117–133.
29. Hayes, A.F. Introduction to Mediation, Moderation, and Conditional Process Analysis, Second
Edition: A Regression-Based Approach. New York, NY: Guilford Publications, 2017.
30. de Holanda Coelho, G.L.; Hanel, P.H.; and Wolf, L.J. The very efficient assessment of need for
cognition: developing a six-item version. Assessment, 1, (2018), 16.
31. Hong, W.; Thong, J.Y.L.; and Tam, K.Y. The effects of information format and shopping task
on consumers’ online shopping behavior: A cognitive fit perspective. Journal of Management
Information Systems, 21, 3 (November 2004), 149–184.
32. Jarvenpaa, S.L. The effect of task demands and graphical format on information processing
strategies. Management Science, 35, 3 (March 1989), 285–303.
33. Jussupow, E.; Spohrer, K.; Heinzl, A.; and Gawlitza, J. Augmenting medical diagnosis decisions? An investigation into physicians’ decision-making process with artificial intelligence.
Information Systems Research, 32, 3 (September 2021), 713–735.
34. Kalyuga, S. Cognitive load theory: How many types of load does it really need? Educational
Psychology Review, 23, 1 (2011), 1–19.
35. Komiak, S.Y.X.; and Benbasat, I. The effects of personalization and familiarity on trust and
adoption of recommendation agents. MIS Quarterly, 30, 4 (2006), 941–960.
36. Lage, I.; Chen, E.; He, J.; Narayanan, M.; Kim, B.; Gershman, S.; and Doshi-Velez, F. An
evaluation of the human-interpretability of explanation. arXiv preprint arXiv:1902.00006.
(2019).
37. Li, X.; and Wu, L. Herding and social media word-of-mouth: Evidence from Groupon.
Management Information Systems Quarterly, 42, 4 (December 2018), 1331–1351.
38. Liel, Y.; and Zalmanson, L. What If an AI Told You That 2 + 2 Is 5? Conformity to Algorithmic
Recommendations. ICIS 2020 Proceedings, 17 (2020). https://aisel.aisnet.org/icis2020/hci_artintel/hci_artintel/17
39. Logg, J.M.; Minson, J.A.; and Moore, D.A. Algorithm appreciation: People prefer algorithmic
to human judgment. Organizational Behavior and Human Decision Processes, 151, (March
2019), 90–103.
40. Longoni, C.; Bonezzi, A.; and Morewedge, C.K. Resistance to medical artificial intelligence.
Journal of Consumer Research, 46, 4 (December 2019), 629–650.
41. McKnight, D.H.; Liu, P.; and Pentland, B.T. Trust change in information technology products.
Journal of Management Information Systems, 37, 4 (October 2020), 1015–1046.
42. Meehl, P.E. Clinical versus Statistical Prediction: A Theoretical Analysis and a Review of the
Evidence. Minneapolis, MN: University of Minnesota Press, 1954.
43. Meyer, J.; Shamo, M.K.; and Gopher, D. Information structure and the relative efficacy of tables
and graphs. Human Factors, 41, 4 (December 1999), 570–587.
44. Mollick, E.; and Nanda, R. Wisdom or madness? Comparing crowds with expert evaluation in
funding the arts. Management Science, 62, 6 (June 2016), 1533–1553.
45. Montazemi, A.R.; and Wang, S. The effects of modes of information presentation on
decision-making: A review and meta-analysis. Journal of Management Information Systems,
5, 3 (December 1988), 101–127.
46. Mousavi, S.Y.; Low, R.; and Sweller, J. Reducing cognitive load by mixing auditory and visual
presentation modes. Journal of Educational Psychology, 87, 2 (1995), 319–334.
47. Önkal, D.; Goodwin, P.; Thomson, M.; Gönül, S.; and Pollock, A. The relative influence of
advice from human experts and statistical methods on forecast adjustments. Journal of
Behavioral Decision Making, 22, 4 (2009), 390–409.
48. Pierro, A.; Mannetti, L.; Erb, H.-P.; Spiegel, S.; and Kruglanski, A.W. Informational length and
order of presentation as determinants of persuasion. Journal of Experimental Social Psychology,
41, 5 (September 2005), 458–469.
49. Poursabzi-Sangdeh, F.; Goldstein, D.G.; Hofman, J.M.; Wortman Vaughan, J.W.; and
Wallach, H. Manipulating and Measuring Model Interpretability. In Proceedings of the 2021
CHI Conference on Human Factors in Computing Systems, New York, NY. Association for
Computing Machinery, 2021, pp. 1–52.
50. Promberger, M.; and Baron, J. Do patients trust computers? Journal of Behavioral Decision
Making, 19, 5 (2006), 455–468.
51. Qiu, L.; and Benbasat, I. Evaluating anthropomorphic product recommendation agents:
A social relationship perspective to designing information systems. Journal of Management
Information Systems, 25, 4 (April 2009), 145–182.
52. Remus, W. A study of graphical and tabular displays and their interaction with environmental
complexity. Management Science, 33, 9 (September 1987), 1200–1204.
53. Riedl, R.; Mohr, P.N.; Kenning, P.H.; Davis, F.D.; and Heekeren, H.R. Trusting humans and
avatars: A brain imaging study based on evolution theory. Journal of Management Information
Systems, 30, 4 (2014), 83–114.
54. Samson, B.P.V., and Sumi, Y. Exploring factors that influence connected drivers to (not) use or
follow recommended optimal routes. In Proceedings of the 2019 CHI Conference on Human
Factors in Computing Systems. Glasgow, Scotland, UK: Association for Computing Machinery,
2019, pp. 1–14.
55. Sniezek, J.A.; and Buckley, T. Cueing and cognitive conflict in judge-advisor decision making.
Organizational Behavior and Human Decision Processes, 62, 2 (May 1995), 159–174.
56. Sniezek, J.A.; and Van Swol, L.M. Trust, confidence, and expertise in a judge-advisor system.
Organizational Behavior and Human Decision Processes, 84, 2 (March 2001), 288–307.
57. Springer, A.; and Whittaker, S. Progressive disclosure: When, why, and how do users want
algorithmic transparency information? ACM Transactions on Interactive Intelligent Systems,
10, 4 (October 2020), 29:1–29:32.
58. Surowiecki, J. The Wisdom Of Crowds: Why the Many are Smarter than the Few and How
Collective Wisdom Shapes Business, Economies, Societies, and Nations. New York, NY:
Doubleday & Co, 2004.
59. Sweller, J. Cognitive load during problem solving: Effects on learning. Cognitive Science, 12, 2
(April 1988), 257–285.
60. Sweller, J. Element interactivity and intrinsic, extraneous, and germane cognitive load.
Educational Psychology Review, 22, 2 (June 2010), 123–138.
61. Sweller, J.; van Merriënboer, J.J.; and Paas, F. Cognitive architecture and instructional design:
20 years later. Educational Psychology Review, 31, 2 (2019), 261–292.
62. Sweller, J.; van Merrienboer, J.J.G.; and Paas, F.G.W.C. Cognitive architecture and instructional design. Educational Psychology Review, 10, 3 (September 1998), 251–296.
63. Teodorescu, M.; Morse, L.; Awwad, Y.; and Kane, G. Failures of fairness in automation require
a deeper understanding of human-ML Augmentation. Management Information Systems
Quarterly, 45, 3 (September 2021), 1483–1500.
64. Van Merriënboer, J.J.G.; and Sweller, J. Cognitive load theory and complex learning: Recent
developments & future directions. Educational Psychology Review, (2005), 147–177.
65. Van Merriënboer, J.J.G.; and Sweller, J. Cognitive load theory in health professional education:
design principles and strategies. Medical Education, 44, 1 (January 2010), 85–93.
66. Vessey, I. Cognitive fit: A theory-based analysis of the graphs versus tables literature. Decision
Sciences, 22, 2 (1991), 219–240.
67. Wang, W.; and Benbasat, I. Recommendation agents for electronic commerce: Effects of
explanation facilities on trusting beliefs. Journal of Management Information Systems, 23, 4
(May 2007), 217–246.
68. Wang, W.; and Benbasat, I. Empirical assessment of alternative designs for enhancing different
types of trusting beliefs in online recommendation agents. Journal of Management Information
Systems, 33, 3 (July 2016), 744–775.
69. Wang, X.; Guo, Y.; and Xu, C. Recommendation algorithms for optimizing hit rate, user
satisfaction and website revenue. In Proceedings of the 24th International Conference on
Artificial Intelligence. Buenos Aires, Argentina: AAAI Press, 2015, pp. 1820–1826.
70. Ward, A.; and Mann, T. Don’t mind if I do: Disinhibited eating under cognitive load. Journal of
Personality and Social Psychology, 78, 4 (2000), 753–763.
71. Wolf, C.; and Blomberg, J. Evaluating the promise of human-algorithm collaborations in
everyday work practices. Proceedings of the ACM on Human-Computer Interaction, 3, CSCW
(November 2019), 143:1–143:23.
72. Xu, J.; Benbasat, I.; and Cenfetelli, R.T. The nature and consequences of trade-off transparency
in the context of recommendation agents. MIS Quarterly, 38, 2 (February 2014), 379–406.
73. Yeomans, M.; Shah, A.; Mullainathan, S.; and Kleinberg, J. Making sense of recommendations.
Journal of Behavioral Decision Making, 32, 4 (2019), 403–414.
74. Yu, S.; Chai, Y.; Chen, H.; Brown, R.A.; Sherman, S.J.; and Nunamaker Jr, J.F. Fall detection with wearable sensors: A hierarchical attention-based convolutional neural network approach.
Journal of Management Information Systems, 38, 4 (2021), 1095–1121.
75. Zhou, J.; Arshad, S.Z.; Luo, S., and Chen, F. Effects of uncertainty and cognitive load on user
trust in predictive decision making. In R. Bernhaupt, G. Dalvi, A. Joshi, D. K. Balkrishan,
J. O’Neill, and M. Winckler (eds.), Human-Computer Interaction – INTERACT 2017.
Mumbai, India: Springer International Publishing, Cham, 2017, pp. 23–39.