
JOURNAL OF MANAGEMENT INFORMATION SYSTEMS
2022, VOL. 39, NO. 2, 336–365
https://doi.org/10.1080/07421222.2022.2063553

Algorithmic versus Human Advice: Does Presenting Prediction Performance Matter for Algorithm Appreciation?

Sangseok You (a), Cathy Liu Yang (b), and Xitong Li (b)

(a) SKK Business School, Sungkyunkwan University (SKKU), Seoul, South Korea; (b) Department of Information Systems and Operations Management, HEC Paris, Jouy-en-Josas, France

ABSTRACT
We propose a theoretical model based on the judge-advisor system (JAS) and empirically examine how algorithmic advice, compared to identical advice from humans, influences human judgment. This effect is contingent on the level of transparency, which varies with whether and how the prediction performance of the advice source is presented. In a series of five controlled behavioral experiments, we show that individuals largely exhibit algorithm appreciation; that is, they follow algorithmic advice to a greater extent than identical human advice due to a higher trust in an algorithmic than human advisor. Interestingly, neither the extent of higher trust in algorithmic advisors nor the level of algorithm appreciation decreases when individuals are informed of the algorithm's prediction errors (i.e., upon presenting prediction performance in an aggregated format). By contrast, algorithm appreciation declines when the transparency of the advice source's prediction performance further increases through an elaborated format. This is plausibly because the greater cognitive load imposed by the elaborated format impedes advice taking. Finally, we identify a boundary condition: algorithm appreciation is reduced for individuals with a lower dispositional need for cognition. Our findings provide key implications for research and managerial practice.

KEYWORDS
Algorithmic advice; algorithm appreciation; algorithmic transparency; online trust; cognitive load; prediction performance

Introduction
Human judgment is often influenced by other people’s opinions [4, 13, 37]. Meanwhile,
predictions from algorithms are increasingly used as advice sources (so-called “algorithmic
advice”) to aid human judgment. Recent field studies have shown that decision makers follow
algorithmic advice when making important decisions, such as providing medical diagnoses
[27, 33] and releasing criminals on parole [7]. Despite evidence that individuals rely on
algorithmic advice to some degree, the literature suggests that people depend less on such
advice than on advice from a human advisor—even when the algorithm makes better
predictions (see review paper by Meehl [42]). By contrast, the nascent literature demonstrates
algorithm appreciation: individuals are more likely to be influenced by an algorithm than by
humans, presumably because people believe that prediction performance of algorithms is
superior to that of humans in various contexts [12, 16, 17, 26, 38, 39]. Such findings imply an
algorithm’s greater efficacy over humans in influencing individuals’ decisions.

CONTACT Xitong Li lix@hec.fr Department of Information Systems and Operations Management, HEC Paris, V-207, 1 Rue de la Libération, 78351 Jouy-en-Josas, France
Supplemental data for this article can be accessed online at https://doi.org/10.1080/07421222.2022.2063553
© 2022 Taylor & Francis Group, LLC

Despite compelling evidence of algorithm appreciation, little is known about its underlying mechanism. Consistent with the recommendation agent literature [1, 6, 35, 67, 68, 72], studies have offered preliminary evidence that algorithm
appreciation could be driven by individuals’ higher trust in an algorithmic advisor’s ability
to make good predictions (i.e., prediction performance) [12, 16, 17, 38, 39, 47, 50, 73].
However, relevant results have been inconclusive. Research on algorithm appreciation has
generally revealed higher trust in algorithmic prediction performance versus humans [12,
16, 17, 38, 39]. Only Dietvorst et al. [16] have thus far examined the mediating role of trust
in the impact of source (algorithm vs. human) on one's acceptance of delegation. Moreover, possibly due to reduced information asymmetry between a "black-box" algorithm
and individual decision makers, increased algorithmic transparency has been found to
increase people’s trust in an algorithm [20, 68]. In this paper, we study transparency in
prediction performance that should directly affect individuals’ trust in algorithmic (relative
to human) predictions and hence influence algorithm appreciation [12, 16, 17, 38, 39].
Recent legal enforcement on increasing the transparency of algorithmic decision-making
tools (e.g., GDPR and California Privacy Act) [25] has spawned machine-learning research
on algorithmic transparency in its mechanisms (e.g., model types and input-variable
weights). Explaining how an algorithm works may increase individuals’ trust in and
adoption of algorithmic decision aids [36, 49, 73, 74]. Yet, studies on human–artificial
intelligence (AI) collaboration indicate that transparency in why an algorithm is trust­
worthy—instead of how an algorithm works—appears more relevant to individuals’ trust in
algorithmic decision aids [10, 18, 54, 57, 71]. Transparency in prediction performance
enriches one’s understanding of why an algorithm is trustworthy. Therefore, we aim to
examine how transparency in prediction performance affects individuals’ trust and reliance
on algorithmic decision aids relative to the human decision aids.
Scarce research has addressed algorithm appreciation while offering a certain level of
performance transparency [12, 16, 17, 40, 73]. As notable exceptions, Castelo et al. [12] and
Dietvorst et al. [16] have demonstrated the impact of variation in performance transparency
on algorithm appreciation. Yet, these findings are contradictory, possibly due to their
different measures of algorithm appreciation and levels of transparency in performance
information (i.e., performance presentation format [PPF]). In addition, neither Castelo
et al. [12] nor Dietvorst et al. [16] control for advice quality (i.e., source-contingent advice)
or sources’ prediction performance (i.e., highlighting an algorithm’s superior performance
over a human advisor). Such contradictory findings may undermine the implications of
algorithmic advice in practice. In particular, managers might be reluctant to provide
algorithmic decision aids or to disclose algorithmic prediction performance.
To address the abovementioned issues, we aim to fill a research gap by investigating
whether and how increased performance transparency via distinct PPFs influences algorithm
appreciation. We propose a theoretical framework based on the judge-advisor system (JAS)
paradigm (see Figure 1). This paradigm entails collaborative decision-making processes
involving two parties: an advisor who provides advice, and a judge who receives and
evaluates the advice to make a final judgment [24, 28, 56]. The JAS paradigm helps us
explore a nomological network of variables related to advice-taking components.
Specifically, we consider how advisor properties (e.g., advisor type and variation in perfor­
mance transparency via PPFs) influence a judge’s advice taking (i.e., judgment) and under­
lying mechanisms through the judge’s trust in advisor and cognitive properties.

Figure 1. Theoretical framework.

The JAS paradigm allows us to employ a uniform measurement of algorithm appreciation. Subsequently, we can understand how algorithm appreciation is affected by variation in performance transparency while controlling for (a) the advisor's advice quality
and prediction performance and (b) a judge’s overconfidence in their own judgment [26,
38, 39]. We compare three PPFs derived from the literature, which reflect increasing
performance transparency: no performance presentation [12, 16, 39, 50], an aggregated
PPF [12, 16, 40], and an elaborated PPF [16, 73]. Thus, our first research question is as
follows:

Research Question 1: Does a higher level of performance transparency increase or decrease algorithm appreciation?

We next investigate the mechanism underlying algorithm appreciation with variation in performance transparency via PPFs based on our theoretical framework. Increasing performance transparency could influence trust in an advisor [12, 16, 20, 49]. However, the
additional performance information accompanying greater transparency could increase
one’s cognitive load [1, 8, 32]. Cognitive load theory (CLT) suggests that relatively high
cognitive load (i.e., low available working memory) could hinder the judge from integrating
advice to one’s final decision and thus result in a decreased degree of advice taking [14, 48].
Therefore, we posit that cognitive load plays a significant role in predicting one’s degree of
advice taking.
We propose a theoretical model highlighting two judge-related aspects of the JAS
paradigm that could be influenced by performance transparency: (1) trust in the
advisor’s prediction performance and (2) cognitive load. To provide further evidence
of how cognitive load stemming from increased prediction performance transparency
impacts algorithm appreciation, we investigate a boundary condition when individuals
experience a high cognitive load. Prior research suggests that a decline in advice taking
due to a greater cognitive load could be contingent on individuals’ need for cognition
(NC) [14, 48]. We thus include NC, which refers to individuals' dispositional tendency to engage in cognitively challenging tasks [9], in our research model. Taken together,
our second research question seeks to uncover how variation in performance transpar­
ency influences algorithm appreciation through the underlying mechanisms involving
one’s trust in an advisor, cognitive load, and NC.

Research Question 2: What roles do individuals’ trust in advisor, cognitive load, and NC play in
the relationship between increased performance transparency (per PPFs) and algorithm
appreciation?

To answer our research questions, we propose a research model nesting the hypotheses
within our theoretical framework. We empirically examine this model via five controlled
behavioral experiments. The findings show that algorithm appreciation manifests when no
performance information is presented or when the PPF is aggregated; this appreciation
declines when the PPF is elaborated. An elaborated PPF offers the most transparent
prediction performance, but in the meantime, it imposes a greater cognitive load and
thus impedes a judge’s reliance on advice. Furthermore, individuals with higher NC exhibit
greater algorithm appreciation when presented with an elaborated PPF. Our results con­
tribute to the burgeoning literature on algorithm appreciation and algorithmic transparency
by highlighting the roles of trust and cognitive properties in the efficacy of algorithmic
advice.

Theoretical Background
Here we present the theoretical background of our research based on Figure 1, outlining
advisor and judge properties that can influence one’s degree of algorithm appreciation.

Algorithm Appreciation and Trust


Recent work on algorithm appreciation indicates that people exhibit greater advice taking
when the advice comes from an algorithm compared with identical advice from humans.
This pattern holds for various prediction tasks, whether subjective (e.g., predicting
a potential romantic partner’s attraction) or objective (e.g., financial market forecasts)
[26, 38, 39]. Castelo et al. [12] show that individuals favor algorithmic over human
advice. Dietvorst et al. [16] find that participants prefer to delegate their decisions to
an algorithm rather than a human when they are not presented with algorithmic or
human prediction errors; this pattern holds when participants can adjust algorithmic
prediction outcomes in a follow-up study [17]. One’s extent of algorithm appreciation
also depends on contingent factors, such as control over the prediction outcome [17],
task subjectivity [12, 73], task expertise [40, 47, 50], and information on the algorithm’s
prediction performance [16]. Table 1 summarizes the recent literature on algorithm
appreciation.
Individuals’ trust in algorithmic (and human) advisors has been commonly explored to
uncover the underlying mechanism of algorithm appreciation [12, 16, 17, 39] (see Column 3
in Table 1). Trust, defined as one’s willingness to be vulnerable to other agents’ actions, can
explain individuals’ willingness to rely on intelligent agents and recommender systems [1, 6,
35, 41, 51, 53, 57, 67, 68]. With respect to algorithm appreciation literature, advisor trust is
measured as one’s trust in an advisor’s ability to make good predictions (i.e., prediction
performance). The extent of higher trust in an algorithmic than human advisor correlates with one's extent of algorithm appreciation. Dietvorst et al. [16], Liel and
Zalmanson [38], and Logg et al. [39] find that people are more likely to trust an algorithm
to make good predictions than a human even without knowing the prediction performance.
Table 1. Comparison of our research with prior studies on algorithm appreciation.

Research | Dependent Variable | Trust Measured? (Mediator?) | Performance Revealed? | PPF | Compares PPFs? | Advice Quality Controlled? | Performance Controlled? | Cognitive Load Measured?
Promberger and Baron [50] | Alignment with the recommendation | Yes (No) | No | NA | NA | Yes | No | No
Önkal et al. [47] | Degree of advice taking | Yes (No) | No | NA | NA | Yes | NA | No
Dietvorst et al. [16] | Choice of algorithmic over human prediction | Yes (Yes) | Yes | Elaborated | Yes (no information vs. elaborated) | No | No | No
Dietvorst et al. [17] | Choice of algorithmic over human prediction | Yes (No) | Yes | Aggregated | No | No | No | No
Gunaratne et al. [26] | Degree of advice taking | No (NA) | No | NA | NA | Yes | NA | No
Castelo et al. [12] | Preference for algorithm relative to human advisor | Yes (No) | Yes | Aggregated | Yes (no information vs. aggregated) | No | No | No
Liel and Zalmanson [38] | Degree of advice taking | Yes (No) | No | NA | No | Yes | NA | No
Logg et al. [39] | Degree of advice taking | Yes (No) | No | NA | NA | Yes | NA | No
Longoni et al. [40] | Choice of algorithmic over human recommender | No (NA) | Yes | Aggregated | No | Yes | Yes | No
Yeomans et al. [73] | Choice of algorithmic over human recommender | Yes (No) | Yes | Elaborated | No | Yes | No | No
Our paper | Degree of advice taking | Yes (Yes) | Yes | Aggregated and Elaborated | Yes (no information vs. aggregated vs. elaborated) | Yes | Yes | Yes

Castelo et al. [12] and Dietvorst et al. [17] show that, when presented with prediction
performance, individuals display higher trust in an algorithm than a human when they can
modify their final prediction outcomes based on the advisor’s prediction.
Higher trust in an algorithmic versus human advisor is consistent with findings in the
information systems literature. The findings suggest that an algorithm’s perceived higher
reliability lowers individuals’ decision-related uncertainty when interacting with recom­
mendation agents [1, 35, 67, 68, 72]. In this paper, we follow the recommendation agent
literature by studying the role that trust plays in advice taking using the JAS paradigm.
Given the evidence that advisor trust positively mediates one’s extent of advice taking [e.g.,
24, 55], we explore whether trust in an advisor’s prediction performance positively mediates
the impact of advice source (algorithm vs. human) on the degree of advice taking.

Algorithmic Transparency in PPF


Algorithmic transparency depends on the extent to which algorithm-related information is
revealed and presented [36, 49]. Studies of human–AI collaboration [10, 18, 36, 49, 54, 57,
71, 73] and recommender systems [68, 72] indicate that increasing algorithmic transpar­
ency can increase individuals’ trust and reliance on algorithmic decision aids.
Amid recent legal enforcement on increasing the transparency and explainability of
algorithmic decision-making tools (e.g., GDPR and California Privacy Act), early research
in machine learning focuses on whether greater algorithmic transparency (i.e., by explain­
ing how a “black-box” algorithm works) increases individuals’ trust and reliance on
algorithmic assistance [36, 49, 73]. However, recent studies on human–AI collaboration
imply that transparency in why an algorithm is trustworthy from a user-centric perspective
—instead of how an algorithm works from a technic-centric perspective—is more relevant
to individuals’ use of algorithmic decision aids [10, 18, 54, 57, 71].i Specifically, several
qualitative studies indicate that individuals evaluate interactions with algorithms by prior­
itizing an understanding of algorithms’ purpose rather than executional details [10, 54, 71].
Dove et al. [18] suggest that describing an algorithm’s goal (i.e., explaining why the
algorithm is useful) is more effective in convincing an individual to adopt an algorithm
compared to a means-end persuasion. Similarly, Springer and Whittaker [57] show that
additional information about an algorithm does not necessarily inform an individual’s
judgment about why they should trust the algorithm to make predictions.
In this paper, we consider whether and how variation in prediction performance
transparency influences algorithm appreciation. As transparency rises with the amount of
information revealed, performance transparency increases when prediction performance is
presented (vs. no performance presentation at all). Under an incentivized JAS paradigm,
individuals receive higher payoffs when making more accurate predictions. Thus, an
advisor’s prediction performance represents a salient cue that may reduce uncertainty
about the advisor making good predictions. The judge may then trust the algorithm to
facilitate more accurate predictions, leading to a higher payoff.
Among prior studies on algorithm appreciation, only Castelo et al. [12] and Dietvorst
et al. [16] vary the transparency in prediction performance, but their findings are mixed as
mentioned earlier. Castelo et al.’s [12] Study 3 shows that merely highlighting superior
algorithmic performance over humans (vs. no performance presentation) increases people’s
stated preference for an algorithm relative to humans. However, Dietvorst et al. [16] find the

opposite in their Study 4: individuals choose a human over an algorithm to make predic­
tions on their behalf when notified of the algorithm’s prediction errors compared to when
not being notified. These mixed findings can be explained by different PPFs used in Castelo
et al. [12] (an aggregated format) and Dietvorst et al. [16] (an elaborated format). Taking the
two studies together, one might speculate that higher transparency in prediction perfor­
mance reduces algorithm appreciation because the elaborated PPF in Dietvorst et al. [16]
communicates more information about algorithmic prediction than the aggregated PPF in
Castelo et al. [12]. However, it is difficult to draw this conclusion given different measures of
algorithm appreciation, varying degrees of flexibility enabling individuals to modify pre­
diction outcomes, and no exploration of the underlying mechanism.
As a result, we explore whether and how algorithm appreciation varies with increased
levels of prediction performance transparency (per PPFs) under the JAS paradigm. To
benchmark our work with prior findings, we consider three PPFs from the algorithm
appreciation literature (see Column 5 in Table 1): (1) no performance presentation; (2)
an aggregated PPF (presenting average prediction accuracy) [12, 17, 40]; and (3) an
elaborated PPF (presenting the prediction accuracy of individual prediction cases) [17,
73]. Notably, mixed findings persist regarding whether increased algorithmic transparency
enhances trust [12, 16, 20, 49]. We therefore explore whether increased performance
transparency influences individuals’ trust in an advisor’s prediction performance and
whether their trust mediates the impact of advice source on the degree of advice taking under the
JAS paradigm.

Impact of PPF on Cognitive Load


Despite its potential impact on trust in an advisor, the additional performance information
accompanying increased transparency could influence individuals’ decisions via cognitive
processes [1, 8, 32]. Making advice-based judgments involves rational evaluation of the
advice and its source [28, 55]. Research suggests that advice taking depends on individuals’
cognitive load [14, 48]; as such, we posit that the advisor’s PPF plays a key role in one’s
degree of advice taking by influencing cognitive load.
CLT is often applied to discern the efficacy of learning and instruction by explaining how
information presentations influence one’s available working memory (i.e., cognitive load)
[59, 62, 64]. CLT usually considers two types of cognitive load that occupy working
memory: intrinsic and extraneous cognitive load [34, 59, 65]. Whereas intrinsic cognitive
load reflects a task’s inherent cognitive load, extraneous cognitive load is associated with the
presentation of information [34, 59, 65]. Intrinsic and extraneous cognitive load are additive
and bounded by an individual’s working memory. It is thus recommended to minimize
extraneous cognitive load so that sufficient working memory can be allocated to one’s
intrinsic cognitive load to facilitate decisions [61, 64].
Higher levels of transparency in prediction performance are associated with a greater
amount of information. Compared to either an aggregated PPF displaying a single number
(e.g., average absolute deviation) or no performance presentation, presenting a series of
numbers to show prediction performance in an elaborated format increases performance
transparency. This format is also likely to generate a higher cognitive load by occupying
one’s working memory [34, 59, 62]. Findings in CLT suggest that a higher cognitive load
impedes information processing, as evidenced by longer decision time and difficulty

recalling instructional content [5, 62, 64]. Under the JAS paradigm, a heavier cognitive load
(due to an elaborated presentation of an advisor’s performance information) may hinder
the judge from integrating advice to make a decision [14, 48].
To the best of our knowledge, no research has shown direct process evidence of how
increased performance transparency (via PPFs) affects one’s cognitive load and subsequent
advice taking. Studies on algorithmic advice taking and the adoption of recommendation
agents provide preliminary evidence for our conjecture. Xu et al. [72] show that increased
transparency (i.e., by revealing trade-off information among features) could increase indi­
viduals’ cognitive load and reluctance to use a recommender system. Poursabzi-Sangdeh
et al. [49] find that higher algorithmic transparency in how a linear prediction model works
does not increase one’s reliance on algorithmic advice—people may face cognitive overload
when encountering an algorithm’s details. Springer and Whittaker [57] suggest that indi­
viduals do not exhibit greater intentions to use an algorithm given higher algorithmic
transparency, possibly because increased transparency may induce a heavier cognitive load.
We aim to fill these research gaps by measuring PPF-induced changes in cognitive load and
related effects on algorithm appreciation.

Moderating Role of NC Under the Elaborated PPF


One’s cognitive load may increase when processing performance information in an elabo­
rated format (vs. an aggregated format or no performance presentation). However, research
suggests that a possible decline in advice taking due to a greater cognitive load is susceptible
to individuals’ engagement in information processing [14, 48]. Individuals with relatively
high NC tend to strive to understand information and enjoy cognitively challenging
problems, whereas those with relatively low NC often rely on automatic processes. When
given an advice-taking task, NC can moderate one’s judgment ability [19]. Therefore,
individuals may undergo different cognitive processes related to advice taking based on
their NC when presented with prediction performance in an elaborated format.

Hypothesis Development
Following our theoretical framework in Figure 1, we propose a theoretical model which
uncovers the underlying mechanisms of algorithm appreciation. In this model, we focus
on the mediating role of higher trust in an algorithmic than human advisor and the
moderating role of PPFs that may impose different levels of cognitive load due to
increased performance transparency. We investigate three PPFs with an increased level
of prediction performance transparency: no performance presentation, an aggregated
PPF [12, 17, 40], and an elaborated PPF [17, 73]. We develop our hypotheses in two
sections. The first section (H1–H3) focuses on an aggregated PPF, compared to no
performance presentation, where performance transparency is increased but cognitive
load is not. Specifically, we explore the impact of increasing performance transparency on
algorithm appreciation (H1), followed by investigating the mediating role of trust (H2
and H3). The second section (H4–H7) further explores an elaborated PPF where perfor­
mance transparency is further increased (compared to an aggregated PPF and no per­
formance presentation) with heavy cognitive load induced (H4). In particular, we
investigate the impact of an elaborated PPF (vs. an aggregated PPF and no performance

Figure 2. Hypotheses and research model.

presentation) on algorithm appreciation (H5), followed by understanding the role of trust and cognitive load (H6). In addition, we investigate the role of NC as a boundary
condition of algorithm appreciation under high cognitive load induced by an elaborated
PPF (H7). Figure 2 illustrates our proposed theoretical model with the different
hypotheses.

Algorithm Appreciation Under No Performance Presentation and Aggregated PPF


Given no indication of the advisor’s performance, Logg et al. [39] show that individuals
display algorithm appreciation (i.e., a higher degree of advice taking from an algorithmic than
human advisor) across various advice-taking scenarios, possibly because people perceive an
algorithm as having higher prediction accuracy and reliability than a human advisor [1, 35,
67, 68, 72]. Conversely, displaying an advisor’s prediction performance in an aggregated
format reveals uncertainty in the advisor’s predictions. Due to an algorithm’s “black-box”
nature, individuals may be more sensitive to algorithm- (vs. human-) related information
(e.g., prediction errors). Dietvorst et al. [16] show that displaying prediction performance in
an elaborated format (vs. no performance information) decreases one’s algorithm apprecia­
tion and can even evoke algorithm aversion. Such decrease in algorithm appreciation could
be due to individuals’ lower tolerance of algorithmic than human prediction uncertainty.
Dzindolet et al. [20] also find that individuals are more sensitive to errors made by automation than by humans, such that people rate an automated partner making prediction errors more negatively than a human partner making identical errors. Given potentially higher sensitivity to algorithmic advisors' prediction uncertainty, we expect that individuals show a lower level of reliance
on an algorithmic relative to human advisor when prediction performance is presented in an
aggregated format (vs. not presented at all). The following hypothesis is thus proposed:

Hypothesis 1 (H1): Individuals exhibit a lower degree of advice taking from an algorithmic
relative to human advisor when the advisor’s prediction performance is displayed in an aggre­
gated format than when no prediction performance is provided.

Because an algorithm is expected to have higher prediction accuracy and reliability than
humans even for challenging prediction tasks [1, 20, 35, 67, 68, 72], people often have higher
trust in an algorithmic than human advisor under different tasks and interfaces with various
information presentation formats [1, 39]. The literature on algorithm appreciation

consistently shows that individuals demonstrate higher trust in an algorithmic than human
advisor when prediction performance is not presented [16, 38, 39, 47, 50], presented in an aggregated format [12, 17], or presented in an elaborated format [16].
In addition, the higher trust in an algorithmic than human advisor does not seem to
differ when prediction performance is presented [17] versus when it is not presented at all
[16]. When individuals do not have the option to modify the algorithmic prediction out­
come, Dietvorst et al. [16] find that the higher trust in an algorithm than a human declines
when prediction performance is presented (vs. not at all)—possibly due to a lack of sense of
control. Dietvorst et al. [17] further provide empirical evidence and posit that the sense of
control engendered by the option to modify final prediction outcomes could maintain
individuals’ higher trust in an algorithmic than human advisor. This finding could be
explained by the fact that a sense of control reduces individuals’ skepticism of an algo­
rithm’s prediction competence [15].
Taken together, we expect individuals to exhibit higher trust in an algorithmic than
human advisor. This higher trust does not differ across PPFs under the JAS paradigm where
individuals can determine their final prediction outcomes:

Hypothesis 2 (H2): Individuals exhibit higher trust in an algorithmic compared to human advisor.

Prior research shows the mediating role of trust in the impact of advice source
(algorithmic vs. human advisor) on one’s degree of advice taking when prediction
performance is not presented [16]. Little research explores whether
this mediation effect applies when the advisor’s prediction performance is presented in
an aggregated format. Prior studies on trust in recommendation agents document
a strong link between individuals’ trust in an agent and their behavioral outcomes
(e.g., using a recommendation agent and relying on its recommendations) [35, 67].
The JAS literature consistently indicates that trust in the advisor positively mediates
one’s degree of advice taking [e.g., 24, 55]. Together with H2, we expect individuals’
trust in the advice source to mediate the impact of advice source (algorithmic vs.
human advisor) on the degree of advice taking, regardless of whether the advisor’s
prediction performance is presented in an aggregated format or not at all. We there­
fore hypothesize the following:

Hypothesis 3 (H3): Trust mediates the impact of the advice source on one’s degree of advice taking
when the advisor’s prediction performance is displayed in an aggregated format and when no
prediction performance is provided.

Although we postulate that the aggregated PPF (vs. no prediction performance presentation) moderates algorithm appreciation (see H1), we do not assume that the
aggregated PPF moderates individuals’ trust in the advice source, consistent with
Dietvorst et al. [16] and Dietvorst et al. [17]. The moderation effect of the aggregated
PPF (if any) could result from individuals’ lower tolerance [16] and/or higher sensi­
tivity [20] to the prediction uncertainty (i.e., errors) of an algorithmic versus human
advisor.

Impact of Elaborated PPF on Algorithm Appreciation


CLT asserts that PPFs affect extraneous cognitive load by requiring a certain portion of
individuals’ working memory [59, 62]. As noted, we compare PPFs from the algorithm
appreciation literature: no performance presentation, an aggregated PPF, and an elaborated
PPF. These PPFs reflect growing performance transparency as an increased amount of
performance information is presented. First, an aggregated PPF simply communicates
a summarized number (e.g., average absolute deviation) with which individuals can easily
gauge an advisor’s prediction accuracy. Processing this number does not consume sub­
stantial working memory or increase one’s cognitive load [64]. Second, an elaborated PPF
which details an advisor’s past prediction performance presents a series of prediction
outcomes and errors. Processing and integrating such detailed information impose
a much greater cognitive load [46, 64]. We do not expect the advice source to moderate
the impact of the elaborated PPF (vs. an aggregated PPF or no performance presentation)
on cognitive load, because the amount of performance information does not differ between
an algorithmic and human advisor contingent on each PPF [61]. As such, we posit that an
elaborated PPF will likely elicit a higher cognitive load than an aggregated PPF and no
performance presentation, leading to the following hypothesis:
Hypothesis 4 (H4): Individuals exhibit a higher cognitive load when the advisor’s prediction
performance is displayed in an elaborated format than when prediction performance is not
displayed or displayed in an aggregated format.

Research suggests that an increased cognitive load may impede one’s ability to integrate
advice with one’s own judgment when making a final decision [14, 48]. Individuals under
a relatively high cognitive load may have difficulties relying on external advice when making
a prediction regardless of the advice source. An elaborated PPF will likely impose a higher
cognitive load; thus, we expect this type of PPF to hinder individuals from integrating
external advice into their final predictions irrespective of whether the advice source is an
algorithm or human. We therefore posit that an elaborated PPF reduces the impact of the
advice source (algorithmic vs. human advisor) on the degree of advice taking, compared to
an aggregated PPF and no performance presentation. As such, we hypothesize the
following:
Hypothesis 5 (H5): Individuals exhibit a lower degree of advice taking from an algorithmic
relative to human advisor when the advisor’s prediction performance is displayed in an elabo­
rated format than when prediction performance is not displayed or displayed in an aggregated
format.

As indicated in H2 and H3, we do not expect PPF to moderate the impact of advice source
(algorithmic vs. human advisor) on one’s trust in the advisor—even if the elaborated PPF
induces a higher cognitive load, compared to an aggregated format or no prediction
performance presentation. Research suggests that a higher cognitive load decreases indivi­
duals’ trust in a recommendation system [2, 75], which is unlikely to be moderated by the
type of advice source. Ahmad et al. [2] find no support for the moderating role of advice
source in the impact of cognitive load on trust. As a result, we do not anticipate that
individuals will reduce their trust more in an algorithmic than human advisor when
experiencing a relatively high cognitive load. Rather, we expect that the relatively high
cognitive load induced by an elaborated PPF will hamper one’s ability to integrate external

advice when making a final prediction regardless of advice source [14, 48]. To this end,
while we hypothesize that individuals’ trust mediates the impact of advice source on the
degree of advice taking (see H3), we presume that a high cognitive load significantly
undermines the mediating role of trust. Therefore, we hypothesize the following.
Hypothesis 6 (H6): Trust mediates the impact of the advice source on one’s degree of advice
taking, but only when the cognitive load is relatively low (vs. high).

Boundary Conditions of Algorithm Appreciation with Elaborated PPF


NC plays an important role when individuals experience a high cognitive load [9]. Studies
indicate that the extent to which cognitive load reduces one’s degree of advice taking
depends on one’s engagement in information processing [14, 48]. We expect individuals
with a relatively high (vs. low) NC to be less influenced by the higher cognitive load induced
by an elaborated PPF because these people tend to be more engaged in the advice-taking
task. Greater engagement leads to a higher cognitive capacity, enabling them to integrate
advice into their final judgment and producing greater algorithm appreciation. Therefore,
we hypothesize:
Hypothesis 7 (H7): NC moderates the impact of the advice source on one’s degree of advice taking
when the advisor’s performance information is presented in an elaborated format.

Overview of Experiments
In a series of five between-subjects experiments, we explore how and why increased
transparency in performance (as manifested in PPFs) influences algorithm appreciation
based on the theoretical model in Figure 2.

Advice-Taking Task
Following previous literature [26, 39, 47], we construct an advice-taking task and benchmark
it with existing studies on algorithm appreciation. We specifically adapt an objective predic­
tion task from Dietvorst et al. [16, 17] and develop a real algorithm based on actual data.
Across the five experiments, participants are asked to perform a prediction task in which
they predict a target student’s standardized math score (ranging from 0 to 100) based on
nine pieces of information about the student (see Figure A1 in Online Supplemental
Appendix A for details). Under the JAS paradigm, each participant is asked to predict the
target student’s standardized math score twice: before and after being presented with advice
generated by the algorithmic prediction regarding the student’s predicted score.
Participants’ second (final) predictions are incentivized; in addition to earning $1 for
completing a survey session, all participants can earn a bonus payment of at least $0.20
for an answer within 6 points of the target student’s actual math score. The payment
increases by $0.10 for each additional 2 points closer to the truth. To ensure that partici­
pants understand the incentive-alignment mechanism, we ask them to answer a question
about the incentive scheme before proceeding to the prediction task. This question also
serves as an attention check.
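To make the incentive scheme concrete, the following minimal Python sketch implements the bonus rule as we read it from the description above; the exact step boundaries and the top payout for a perfect answer are our assumptions rather than details stated in the paper.

```python
def bonus_payment(prediction: float, actual: float) -> float:
    """Illustrative bonus rule (our reading of the scheme): $0.20 for an answer
    within 6 points of the actual score, plus $0.10 for every additional
    2 points closer to the truth; no bonus beyond 6 points."""
    error = abs(prediction - actual)
    if error > 6:
        return 0.0
    steps_closer = int((6 - error) // 2)  # error 6 -> 0 steps, 4 -> 1, 2 -> 2, 0 -> 3 (assumed cap)
    return round(0.20 + 0.10 * steps_closer, 2)

# The true score is 63 and the advice is 62, so fully following the advice
# yields a 1-point error and, under this reading, a $0.40 bonus.
print(bonus_payment(62, 63))  # 0.40
```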

The target student’s true standardized math score is 63, and the advice (i.e., the algor­
ithmically predicted target student's math score) is 62. We intentionally pick a target student for whom the advice is close to the truth (i.e., 1 point off) so that participants with a higher degree of
advice taking are rewarded with a higher bonus. The algorithmic performance is reasonably good: the average absolute error (AAE) is 3.5 points on the 0–100 scale, with 2 out of 10 perfect predictions in the test set. The advice and performance (if presented) are identical
for participants randomly assigned to the human and algorithmic conditions. Details about
the algorithm, the selection criteria for the prediction target, the advisor’s performance, and
experimental stimuli appear in the Online Supplemental Appendix A.

Measuring Algorithm Appreciation


We measure algorithm appreciation by comparing the extent to which individuals take
advice from an algorithm to the extent to which they take identical advice from humans.
Our key dependent variable is weight on advice (WOA), which measures the extent to which
a judge adjusts their final judgment towards the advice. WOA has been widely used in the
JAS literature, and is calculated as the difference between one’s initial and final judgment
divided by the difference between the initial judgment and advice [23, 24, 28, 39]. WOA
indicates the weight of the advice in a judge’s final decision, ranging between 0 (completely
ignoring the advice) and 1 (fully taking the advice).ii Algorithm appreciation is observed if
the WOA of algorithmic advice is larger than the WOA of the same advice from humans.
WOA = (final judgment - initial judgment) / (advice - initial judgment)
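A minimal Python sketch of this computation is given below; the guard for the undefined case (initial prediction equal to the advice) mirrors the exclusion rule described later, while clipping WOA to [0, 1] is our own assumption about how out-of-range values are handled.

```python
def weight_on_advice(initial: float, final: float, advice: float) -> float | None:
    """WOA = (final judgment - initial judgment) / (advice - initial judgment)."""
    denominator = advice - initial
    if denominator == 0:
        return None  # WOA undefined; such participants are excluded from the analyses
    woa = (final - initial) / denominator
    return max(0.0, min(1.0, woa))  # clipping to [0, 1] is our assumption

# Example: initial judgment 70, advice 62, final judgment 64
# WOA = (64 - 70) / (62 - 70) = 0.75, i.e., the judge moves 75% of the way toward the advice.
print(weight_on_advice(initial=70, final=64, advice=62))  # 0.75
```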

Advice Source
The first manipulation involves the advice source (or advisor): an algorithm or humans
(past participants). We keep the advice identical across all participants and inform each
participant that the advisor makes predictions about the same target student. Framings of
the advice sources in our experiments are adapted from prior studies [16, 17, 39]. We
choose past participants (peers) instead of an expert as the human advisor. The literature on
wisdom of the crowd suggests that peers are more persuasive than another person even
when that person is an expert [44, 58]. We also test a case in which the human advisor is an
expert in Study 1C and find similar results.

Advisor’s PPF
The second manipulation involves the advisor’s PPF. We adopt algorithmic PPFs common
in real life as well as in the literature. Table 2 lists our four selected parsimonious PPFs with
increasing performance transparency.

No Error (NE)
We define NE as the format where no information about advisor performance is presented
(i.e., the lowest performance transparency). The NE format is often applied in research on
individuals’ responses to algorithmic versus human aids [12, 16, 38, 39, 50].

Table 2. Advisor performance presentation formats.

Condition | PPF | Definition
NE (No Error) | No Information | No performance information at all
AAE (Average Absolute Error) | Aggregated | An aggregate indication: average absolute error of past predictions
AAE-AHR (AAE and Average Hit Rate) | Aggregated | Two aggregate indications: average absolute error of past predictions (AAE) and the proportion of perfect predictions over the total number of past predictions (AHR)
Table-Diff (Differences in Table) | Elaborated | Detailed deviations of past predictions from the truth (zero deviation indicates a perfect prediction)

Average Absolute Error (AAE)


We define AAE as an aggregated format containing a summarized performance metric,
which shows the average absolute error of the advisor’s past predictions. AAE provides
greater performance transparency than NE given additional information about the advisor’s
performance. AAE is frequently used to capture an algorithm’s prediction performance
both in real life (e.g., demand or inventory forecast accuracy) and in the algorithm
appreciation literature [12, 17, 40].

Average Absolute Error and Average Hit Rate (AAE-AHR)


We define another aggregated PPF, AAE-AHR. This metric comprises the average percen­
tage of perfect past predictions, called the average hit rate (AHR), along with AAE. AHR is
relevant to real-world practice, such as in reflecting the accuracy of perfect predictions for
continuous or dichotomous outcomes. AAE-AHR has been adopted in recommendation
algorithms for e-commerce websites [69]. Of note, AAE-AHR provides more performance
information than AAE by incorporating AHR information while retaining an aggregated
PPF. We use AAE-AHR to convey as much aggregated performance information as possible
from a more detailed PPF.
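The two aggregated metrics can be stated precisely with a short Python sketch; the sample predictions below are purely hypothetical and are not the advisor's actual test-set data.

```python
from statistics import mean

def average_absolute_error(predictions, actuals):
    """AAE: mean absolute deviation of past predictions from the truth."""
    return mean(abs(p - a) for p, a in zip(predictions, actuals))

def average_hit_rate(predictions, actuals):
    """AHR: proportion of past predictions that exactly match the truth."""
    return sum(p == a for p, a in zip(predictions, actuals)) / len(predictions)

# Hypothetical past predictions (not the advisor's actual test-set data):
preds  = [60, 55, 71, 48, 63]
truths = [62, 55, 68, 50, 63]
print(average_absolute_error(preds, truths))  # 1.4
print(average_hit_rate(preds, truths))        # 0.4
```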

Differences in Table (Table-Diff)


Studies of algorithm appreciation use an elaborated format to list the prediction performance of
individual cases [16, 73], where participants sequentially observe a series of individual algo­
rithmic prediction errors presented numerically. The information presentation format has been
shown to affect cognitive processing [66]. In particular, people exhibit easier information
processing and better comprehension when presented with detailed numerical information
in a table rather than a list or graph—particularly when encountering cognitively demanding
tasks involving numeracy processing [11, 31, 43]. This tabular format (or its variation) is often
used to display analytical information [11, 22, 43, 45, 52]. We thus operationalize the elaborated
PPF by tabulating a series of absolute errors of past predictions, referred to as the differences-in-
table format (Table-Diff). This tabular format should be less cognitively demanding than a list
or graphic representation, leading to a conservative estimate of how an elaborated PPF affects
cognitive load. The Table-Diff format nests AAE and AHR performance information while

presenting a larger amount of information in an elaborated format compared to the aggregated formats (i.e., AAE and AAE-AHR). Table-Diff therefore provides the highest performance transparency among all PPFs in this study (i.e., NE, AAE, AAE-AHR, and Table-Diff).

Manipulation Check
We conduct a manipulation check and confirm that increasing the amount of information
about the advisor’s prediction performance via NE, aggregated, and elaborated formats
increases participants’ perceived information transparency. Our manipulation check and
results are detailed in Online Supplemental Appendix B.

Participants
We recruit English-speaking participants via Prolific and participants residing in North
America (the US and Canada) via Amazon Mechanical Turk (MTurk) because the
online experiment instrument is in English. We exclude the following categories of
participants: those who decline to consent; fail to pass the attention check; complete the
survey more than once; have poor-quality responses (e.g., straight liners); or complete
the survey on a mobile device, such as a phone or tablet. In the analysis, participants
whose initial predictions are identical to the advice are excluded; the WOA of these
participants is undefined, as the denominator when calculating WOA is zero [23, 24, 28,
39, 47].

Summary of Studies
Table 3 summarizes our experimental design and the sample size in each of the five
experiments. Study 1A and Study 2 are our main studies, designed to test H1–H6 and
answer RQ1 and RQ2. Study 1B is conducted to rule out an alternative explanation
that our findings in Study 1A are subject to the specific choice of the prediction target;
Study 1C is conducted to rule out an alternative explanation that our findings in Study
1A are subject to the choice in the type of human advisor (past participants as
opposed to an expert). Study 3 tests H7 by exploring the moderating role of NC on
algorithm appreciation with an elaborated PPF. Sample sizes are determined by
benchmarking each to the sample sizes of previous studies that employ similar
approaches [16, 17, 39]. Sample sizes are further validated with our own calculations
of effect size and statistical power.

Table 3. Overview of experiments.


Experiment Experimental Design N
Study 1A 2 (Algorithm vs. Humans) ✕ 2 (NE vs. AAE) 478
Study 1B 2 (Algorithm vs. Humans) ✕ 2 (NE vs. AAE) 450
Study 1C 2 (Algorithm vs. Expert) ✕ 2 (NE vs. AAE) 451
Study 2 2 (Algorithm vs. Humans) ✕ 3 (NE vs. AAE-AHR vs. Table-Diff) 566
Study 3 2 (Algorithm vs. Humans) ✕ 1 (Table-Diff) 184

Study 1
Study 1 aims to test H1, H2, and H3 by examining how and why algorithm appreciation is
affected by increased performance transparency with an aggregated PPF compared to NE.
We operationalize the aggregated format by presenting prediction performance in AAE.

Study 1A
Method. We conduct a 2 (advice source: algorithm vs. humans) by 2 (PPF: NE vs. AAE)
between-subjects experiment. We recruit 540 participants from Prolific (50.56 percent
women, mean age = 26.68) and randomly assign them to one of the four experimental conditions. We exclude 28 participants who are straight liners and 34 participants whose initial
predictions happen to be identical to the advice. The final sample thus includes 478
participants.
After each participant is given the advice and before they give their final judgment about the target student's standardized math score, we measure their trust in the advice source using a 3-item scale adapted from Komiak and Benbasat [35]. The items are scored
on a 7-point Likert scale (1 = strongly disagree, 7 = strongly agree). An example item is
“[Source] is with good knowledge about high school students’ math scores.” The items are
reliable (Cronbach’s α = 0.77), and we use the average rating as a composite index to
measure participants’ trust in the advice source.
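As an illustration of how the composite trust index and its reliability could be computed, here is a brief sketch; the response matrix is hypothetical and the layout (one row per participant, one column per item) is our assumption.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents x k_items) matrix of item scores:
    alpha = k / (k - 1) * (1 - sum of item variances / variance of the total score)."""
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)
    total_variance = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_variances.sum() / total_variance)

# Hypothetical 7-point responses to the three trust items (one row per participant):
responses = np.array([[5, 6, 5], [3, 4, 4], [6, 6, 7], [2, 3, 2], [4, 5, 4]])
trust_index = responses.mean(axis=1)  # composite index: average of the three items
print(round(cronbach_alpha(responses), 2), trust_index)
```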

Results. To test H1, we conduct a two-way analysis of covariance (ANCOVA) of WOA while controlling for participants' age and education. Results show a significant main effect
of advice source on WOA [F(1, 472) = 10.96, p < 0.01, η2 = 0.02], indicating that participants
adjust their final predictions towards the advice to a larger extent when the advice comes
from the algorithm (M = 0.42, SD = 0.38) versus humans (M = 0.32, SD = 0.36).
Interestingly, we find no significant main effect of PPF (NE vs. AAE) or any significant
interaction effect between the advice source and PPF (Fs < 1). A post-hoc Tukey test
suggests no significant difference in WOA between NE and AAE irrespective of whether
the advice comes from an algorithm (M_NE = 0.45, SD_NE = 0.39; M_AAE = 0.41, SD_AAE = 0.37; p = 0.90) or humans (M_NE = 0.32, SD_NE = 0.35; M_AAE = 0.32, SD_AAE = 0.38; p = 1.00).
Figure 3 shows a plot of the average WOA per experimental condition with error bars
indicating 95 percent confidence intervals.
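For illustration, the two-way ANCOVA reported here could be run along the following lines; the data frame, its column names, and the coding of education as a numeric covariate are our assumptions, not the authors' actual analysis script.

```python
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Assumed layout: one row per participant with columns
# woa, source ("algorithm"/"human"), ppf ("NE"/"AAE"), age, education
df = pd.read_csv("study1a.csv")  # hypothetical file name

model = smf.ols("woa ~ C(source) * C(ppf) + age + education", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))  # main effects, interaction, and covariates
```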
To test H2, we submit the measure of trust in advice source to a two-way ANCOVA,
controlling for participants’ age and education. We observe a significant main effect of
advice source on trust [F(1, 472) = 65.05, p < 0.001, η2 = 0.12]. Findings show that partici­
pants have higher trust in the algorithm (M = 4.43, SD = 1.03) than humans (M = 3.66, SD =
1.06). We note no significant main effect of PPF (NE vs. AAE) [F(1, 472) = 1.37, p = 0.25, η2
= 0.002] and no significant interaction effect on trust [F(1, 472) = 1.79, p = 0.18, η2 = 0.003]. A post-hoc Tukey test reveals no significant difference in trust in advice source between NE and AAE irrespective of whether the advice comes from the algorithm (M_NE = 4.45, SD_NE = 1.09; M_AAE = 4.42, SD_AAE = 0.97; p = 0.98) or humans (M_NE = 3.55, SD_NE = 1.12; M_AAE = 3.78, SD_AAE = 1.00; p = 0.25).
To test H3, we perform a mediation analysis (advice source→trust in advice
source→WOA) moderated by PPF controlling for participants’ age and education. We
estimate the indirect effect of the moderated mediation analysis via bootstrapping with 5000

Figure 3. Weight on advice (WOA) by experimental conditions in Study 1A.

samples (Model 15; Hayes [29]). We find a positive indirect effect of advice source on WOA
through trust in advice source under the NE (95 percent confidence interval: [0.0079,
0.0431]) and AAE formats (95 percent confidence interval: [0.0070, 0.0481]). We find no
conditional direct effect of advice source on WOA under the NE (95 percent confidence
interval: [−0.0148, 0.0859]) or AAE format (95 percent confidence interval: [−0.0229,
0.0752]). We also do not find that the mediation effect of trust in advice source on the
impact of advice source on WOA is moderated by PPF (NE vs. AAE) (95 percent confidence
interval: [−0.0234, 0.0274]).
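For readers unfamiliar with bootstrapped indirect effects, the following simplified sketch conveys the idea behind this analysis; it estimates a plain indirect effect (advice source, through trust, to WOA) with a percentile bootstrap rather than reproducing Hayes's PROCESS Model 15, and it assumes a numerically coded source variable (0 = human, 1 = algorithm) plus hypothetical column names.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

def indirect_effect(d: pd.DataFrame) -> float:
    """a*b indirect effect of source on WOA via trust, controlling for age and education."""
    a = smf.ols("trust ~ source + age + education", data=d).fit().params["source"]
    b = smf.ols("woa ~ source + trust + age + education", data=d).fit().params["trust"]
    return a * b

def bootstrap_ci(d: pd.DataFrame, n_boot: int = 5000, seed: int = 1):
    rng = np.random.default_rng(seed)
    estimates = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(d), size=len(d))  # resample rows with replacement
        estimates.append(indirect_effect(d.iloc[idx]))
    return np.percentile(estimates, [2.5, 97.5])    # 95% percentile confidence interval
```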

Study 1B
Study 1B is carried out to ensure the robustness of findings from Study 1A by using
a different student as the prediction target. The true math score of the target student is 56
while the advice is 55, different from the target student in Study 1A. We recruit 493
participants from Amazon MTurk (35.70 percent women, mean age = 33.91) and exclude 43
participants whose initial predictions are identical to the advice. Our final sample contains
450 participants. The experimental procedure is the same as in Study 1A except that we
do not measure participants’ trust in the advisor when making the final estimates, given
our focus on replicating the behavioral outcome based on the experimental design in
Study 1A.
The two-way ANCOVA results after controlling for participants’ age and education
show a significant main effect of advice source on WOA [F(1, 444) = 11.19, p < 0.001, η2 =
0.02]; that is, participants adjust their final predictions towards the advice to a larger extent
when the advice comes from the algorithm (M = 0.45, SD = 0.40) versus humans (M = 0.33,
SD = 0.38). Similar to Study 1A, we find no main effect of PPF (NE vs. AAE) or any
interaction effect (Fs < 1). A post-hoc Tukey test shows no significant difference in WOA
between NE and AAE regardless of whether the advice comes from an algorithm (M_NE = 0.45, SD_NE = 0.41; M_AAE = 0.46, SD_AAE = 0.39; p = 0.99) or humans (M_NE = 0.31, SD_NE = 0.38; M_AAE = 0.34, SD_AAE = 0.38; p = 0.94). Figure C1 in Online Supplemental Appendix
C shows the plot of the average WOA per experimental condition with error bars denoting
95 percent confidence intervals.

Study 1C
We conduct Study 1C to ensure the robustness of results in Study 1A by using a different
human advisor—an expert—instead of past participants. We recruit 494 participants from
Amazon MTurk (49.80 percent women, mean age = 39.49). Forty-three participants whose
initial predictions happen to be identical to the advice are excluded, leaving 451 participants
as the final sample. The experimental procedure is identical to that in Study 1B except that
we replace the human advisor from “past participants” with “an expert” while using the
same prediction target as in Study 1A.
Two-way ANCOVA results after controlling for participants’ age and education show
a significant main effect of advice source on WOA [F(1, 445) = 4.13, p < 0.05, η2 = 0.01].
Participants thus adjust their final predictions towards the advice to a larger extent when
the advice comes from the algorithm (M = 0.43, SD = 0.40) versus the expert (M = 0.36,
SD = 0.37). Again, we find no main effect of the PPF (NE vs. AAE) or any interaction
effect (Fs < 1). A post-hoc Tukey test also indicates no significant difference in WOA
between NE and AAE when the advice comes from the algorithm (M_NE = 0.46, SD_NE = 0.39; M_AAE = 0.40, SD_AAE = 0.40; p = 0.57) or the expert (M_NE = 0.38, SD_NE = 0.38; M_AAE = 0.33, SD_AAE = 0.36; p = 0.76). Figure C2 in the Online Supplemental Appendix C depicts the plot of the average WOA per experimental condition with error bars denoting 95 percent
confidence intervals.

Discussion
The results of Study 1 echo those of Logg et al. [39] wherein individuals demonstrate
algorithm appreciation under the NE format. Our findings show that algorithm apprecia­
tion does not seem to decrease under the AAE compared to the NE format, thus rejecting
H1. This answers RQ1. In fact, our findings suggest that individuals can show similar
levels of algorithm appreciation under the AAE compared to the NE format. The rejection
of H1 might be due to the relatively high prediction performance of the algorithmic
advisor (an AAE of 3.5 points on the 0–100 scale) given our experimental design. That is, the relatively high prediction performance does not evoke individuals' lower tolerance or higher sensitivity to algorithmic prediction errors [16, 20]. The results of Study 1A answer
RQ2 by showing that individuals typically trust an algorithmic advisor more than
a human irrespective of PPF in the NE or AAE format, confirming H2. Furthermore,
trust in advice source also mediates algorithm appreciation under both the NE and AAE
formats, confirming H3.

Study 2
In Study 2, we further examine the impact of increased performance transparency on
algorithm appreciation. Specifically, in addition to the aggregated and NE formats in
Study 1, we introduce a Table-Diff format that tabulates the individual past absolute
deviation of predictions from the truth to operationalize the elaborated format. Because

people can derive AHR and AAE from the Table-Diff format, we use AAE-AHR to
operationalize the aggregated format in this study. We test H4, H5, and H6 through this
experimental setup.

Method
Participants are randomly assigned to one of six conditions in a 2 (advice source: an
algorithm vs. humans) by 3 (PPF: NE vs. AAE-AHR vs. Table-Diff) between-subjects
design. We recruit 612 participants from Amazon MTurk (44.61 percent women, Mage
= 37.98) and exclude 46 participants whose initial predictions are identical to the
advice. The final sample consists of 566 participants. Participants are asked to rate
their trust in the advice source based on the same scale used in Study 1A (Cronbach’s
α = 0.76).
Different from Study 1A, we record each participant’s response time in seconds when
making their initial and final estimates, which immediately precede and follow (respec­
tively) the presentation of advice along with prediction performance (except for the NE
condition) on a separate page. We do not expect differences in individuals’ response time
when giving their initial estimates across conditions; however, we anticipate slower
response time when making final predictions under the Table-Diff format compared to
the NE and AAE-AHR formats due to higher cognitive load per H4. Response time is an
objective measure of extraneous cognitive load [60]. Following Ward and Mann [70], we
account for individual differences in making predictions by using an adjusted response time
in the final estimate to measure cognitive load (i.e., by subtracting each participant’s
response time in making their initial estimate from their response time in making their
final estimate).
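
For illustration, a minimal Python sketch of these response-time measures follows; the column names (rt_initial, rt_final) and the data values are hypothetical, not the study's actual variables.

```python
import numpy as np
import pandas as pd

# Hypothetical per-participant response times in seconds (illustrative values).
df = pd.DataFrame({
    "rt_initial": [12.3, 8.1, 20.4],  # time to give the initial estimate
    "rt_final": [5.2, 6.8, 16.0],     # time to give the final estimate
})

# Log-transform the initial response time to reduce right skewness.
df["log_rt_initial"] = np.log(df["rt_initial"])

# Adjusted response time: final minus initial, accounting for individual
# differences in prediction speed (negative values mean the final estimate
# was given more quickly than the initial one).
df["rt_adjusted"] = df["rt_final"] - df["rt_initial"]

print(df)
```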

Results
Cognitive Load
A two-way ANCOVA controlling for participants’ age and education reveals no significant
main effect of advice source (F < 1), PPF [F(2, 552) = 1.45, p = 0.24, η2 = 0.005], or their
interaction (F < 1) on response time (log-transformed due to right skewness) when parti­
cipants make initial estimates. To test H4, we submit the adjusted response time to a two-
way ANCOVA controlling for participants’ age and education. We observe a significant
main effect of PPF on adjusted response time when making the final estimate [F(2, 552) =
3.05, p < 0.05, η2 = 0.01]. We find no significant main effect of advice source (F < 1) or its
interaction with PPF on the adjusted response time when participants make their final
estimates [F(2, 552) = 2.18, p = 0.11, η2 = 0.008]. We further contrast the means of adjusted
response time when making the final estimate (negative for adjusted response time due to
making the final estimate more quickly than the initial estimate) of the Table-Diff format to
that of AAE-AHR and NE. Results suggest that participants are slower in giving their final
estimates under the Table-Diff format compared to the NE and AAE-AHR formats (MNE
= -31.89, SDNE = 35.77; MAAE-AHR = -29.67, SDAAE-AHR = 36.56; MTable-Diff = -24.92,
SDTable-Diff = 27.23; t(552) = 2.41, p < 0.05). We find no significant difference in adjusted
response time when making the final estimate between the NE and AAE-AHR formats [t
(552) = 0.54, p = 0.59].
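
As a sketch of this kind of analysis (not the authors' code), the two-way ANCOVA can be estimated with statsmodels; the file name and column names (rt_adjusted, source, ppf, age, education) are assumptions.

```python
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

# One row per participant; column names are illustrative assumptions.
df = pd.read_csv("study2.csv")  # hypothetical data file

# ANCOVA as a linear model: advice source and PPF as categorical factors,
# their interaction, and age and education as covariates.
model = smf.ols("rt_adjusted ~ C(source) * C(ppf) + age + education", data=df).fit()

# Type II sums of squares are a common choice for factorial ANCOVA designs.
print(anova_lm(model, typ=2))
```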

WOA
To test H5, we submit WOA to a two-way ANCOVA controlling for participants’ age and
education given the 2-by-3 experimental design. Results indicate a significant main effect of
advice source on WOA [F(1, 552) = 12.65, p < 0.001, η2 = 0.02], demonstrating that parti­
cipants adjust their final predictions towards the advice to a larger extent when the advice
comes from an algorithm (M = 0.40, SD = 0.40) versus other humans (M = 0.35, SD = 0.41).
We find a marginally significant main effect of PPF (MNE = 0.40, SDNE = 0.40;
MAAE-AHR = 0.41, SDAAE-AHR = 0.41; MTable-Diff = 0.32, SDTable-Diff = 0.40; F(2, 552) = 2.44,
p < 0.10, η2 = 0.01). We also identify a significant interaction effect of advice source and PPF
[F(2, 552) = 6.19, p < 0.01, η2 = 0.02]. Figure 4 presents a plot of the average WOA; Table C1
in the Online Supplemental Appendix C lists the WOA per experimental condition.
We perform a post-hoc test with Tukey’s HSD to further understand the interaction
effect. We find no significant difference in WOA when the advice is given by humans among
the NE, AAE-AHR, and Table-Diff formats (pairwise ps > 0.65). By contrast, we find that
WOA under the Table-Diff format is significantly smaller than the NE format (p < 0.01) and
marginally significantly smaller than the AAE-AHR format (p < 0.10). No significant
difference emerges in WOA between the NE and AAE-AHR formats (p = 0.91). In sum,
these results show that the Table-Diff format greatly reduces algorithm appreciation
compared to the NE and AAE-AHR formats.
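
A simple way to approximate these post-hoc comparisons is Tukey's HSD on WOA within each advice-source condition; the sketch below uses assumed column names and, unlike the ANCOVA, does not adjust for the age and education covariates.

```python
import pandas as pd
from statsmodels.stats.multicomp import pairwise_tukeyhsd

df = pd.read_csv("study2.csv")  # hypothetical file with woa, source, ppf columns

# Pairwise Tukey HSD comparisons of WOA across the three PPFs,
# computed here for the algorithmic-advice condition only.
algo = df[df["source"] == "algorithm"]
tukey = pairwise_tukeyhsd(endog=algo["woa"], groups=algo["ppf"], alpha=0.05)
print(tukey.summary())
```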

Trust in Advice Source


We further test H2 in Study 2 using a two-way ANCOVA controlling for participants’ age
and education. We observe a significant main effect of advice source on trust (MHuman =
4.50, SDHuman = 1.22; MAlgo = 5.07, SDAlgo = 1.06; F(1, 552) = 13.52, p < 0.001, η2 = 0.02). We
identify no significant main effect of PPF or its interaction effect with advice source (Fs < 1).
The means and standard deviations of trust in advice source by condition are displayed in
Table C2 of the Online Supplemental Appendix C.

Figure 4. Weight on advice (WOA) by experimental conditions in study 2.



Moderated Mediation
To test H6, we carry out a mediation analysis (advice source→trust in advice
source→WOA) moderated by participants’ cognitive load, controlling for participants’
age and education. We estimate the indirect effect of the moderated mediation analysis
via bootstrapping with 5000 samples (Model 14; Hayes [29]). The impact of advice
source on WOA, mediated through trust in advice source, is negatively moderated by
the level of cognitive load (95 percent confidence interval: [−0.0006, −0.0001]). We
find a positive indirect effect of the advice source on WOA through trust in advice
source when participants face a relatively low cognitive load (one standard deviation
below the mean; 95 percent confidence interval: [0.0084, 0.0309]). We find no indirect
effect of advice source on WOA through trust in advice source when participants
encounter a relatively high cognitive load (95 percent confidence interval: [−0.0063,
0.0162]).
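
To make the moderated mediation concrete, the sketch below bootstraps the conditional indirect effect by hand, mirroring the logic of Hayes's Model 14 (mediator: trust; moderator of the b path: cognitive load). It is a simplified illustration with assumed column names, not the PROCESS macro or the authors' code.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("study2.csv")  # hypothetical file with trust, woa, rt_adjusted, etc.
df["algo"] = (df["source"] == "algorithm").astype(int)

np.random.seed(0)
low_w = df["rt_adjusted"].mean() - df["rt_adjusted"].std()   # low cognitive load (-1 SD)
high_w = df["rt_adjusted"].mean() + df["rt_adjusted"].std()  # high cognitive load (+1 SD)
boot_low, boot_high = [], []

for _ in range(5000):  # 5000 bootstrap resamples, as in the paper
    s = df.sample(frac=1.0, replace=True)
    # a path: trust regressed on advice source (plus covariates).
    a_fit = smf.ols("trust ~ algo + age + education", data=s).fit()
    # b path with moderation: WOA on source, trust, cognitive load, and their interaction.
    b_fit = smf.ols("woa ~ algo + trust * rt_adjusted + age + education", data=s).fit()
    a = a_fit.params["algo"]
    b1 = b_fit.params["trust"]
    b3 = b_fit.params["trust:rt_adjusted"]
    boot_low.append(a * (b1 + b3 * low_w))
    boot_high.append(a * (b1 + b3 * high_w))

for label, draws in [("low cognitive load", boot_low), ("high cognitive load", boot_high)]:
    lo, hi = np.percentile(draws, [2.5, 97.5])
    print(f"conditional indirect effect at {label}: 95% CI [{lo:.4f}, {hi:.4f}]")
```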

Discussion
Results from Study 2 further answer RQ1 and RQ2 by showing that increased performance
transparency via PPFs can influence individuals’ level of algorithm appreciation.
Specifically, we show that the Table-Diff format increases participants’ cognitive load as
evidenced by longer (adjusted) response time when making final estimates; this pattern
supports H4. In addition, compared with Study 1, further boosting performance transpar­
ency by presenting performance in an elaborated format (Table-Diff) can lead to lower
algorithm appreciation, supporting H5. More importantly, our moderated mediation ana­
lysis indicates that a higher cognitive load under the Table-Diff format impedes advice
taking, despite participants’ higher trust in the algorithmic than human advisor (supporting
H2). Trust still mediates the impact of advice source (algorithm vs. human) on the degree of
advice taking when individuals encounter a relatively low cognitive load, supporting H6. It
is worth noting that our results in the elaborated PPF (Table-Diff) do not necessarily suggest
that people exhibit algorithm aversion as suggested by Dietvorst et al. [16]. We do not find
that individuals take advice from humans to a greater extent than identical advice from an algorithm.

Study 3
Now we aim to test H7 and examine the boundary condition of algorithm apprecia­
tion under the Table-Diff format that induces a relatively high cognitive load. Before
the main study, we conduct an offline pilot study and find that a sample of students
among the top 0.1 percent of scorers on a national exam in a European country
exhibit algorithm appreciation under the Table-Diff condition (see details in Online
Supplemental Appendix D). These students likely have a stronger tendency toward effortful cognition than participants recruited from MTurk in Study 2. In Study 3, we therefore examine whether NC moderates algorithm appreciation under the Table-Diff format.

Method
We recruit 195 participants from Amazon MTurk (43.60 percent women, Mage = 33.88).
Each participant is randomly assigned to one of two conditions in a one-factor design
(advice source: an algorithm vs. humans) in which prediction errors are presented in Table-
Diff format. We exclude 11 participants whose initial predictions are identical to the advice;
the final sample consists of 184 participants. Different from earlier studies, after making the
final prediction, each participant indicates their NC through an index of three items [9, 30]
scored on a 7-point Likert scale (1 = strongly disagree, 7 = strongly agree; Cronbach’s α =
0.90). An example item is “I like to have the responsibility of handling a situation that
requires a lot of thinking.” We conduct a median split on the composite NC index,
calculated by averaging the three items, to divide our sample into low- and high-NC groups.
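
A minimal sketch of the NC composite and the median split follows; the item column names (nc1–nc3) and the data file are assumptions, and the Cronbach's alpha helper implements the standard formula rather than reproducing the authors' code.

```python
import numpy as np
import pandas as pd

def cronbach_alpha(items: pd.DataFrame) -> float:
    """Cronbach's alpha for a set of scale items held in columns."""
    k = items.shape[1]
    item_var_sum = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_var_sum / total_var)

df = pd.read_csv("study3.csv")           # hypothetical data file
nc_items = df[["nc1", "nc2", "nc3"]]     # three 7-point NC items (assumed names)

print("Cronbach's alpha:", round(cronbach_alpha(nc_items), 2))

# Composite NC index (mean of the three items) and a median split into groups.
df["nc"] = nc_items.mean(axis=1)
df["nc_group"] = np.where(df["nc"] > df["nc"].median(), "high NC", "low NC")
```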

Results
To test H7, we submit WOA to a two-way ANCOVA while controlling for participants’ age
and education. Advice source serves as the independent variable with NC as the moderator.
We find a significant moderation effect of NC [F(1, 178) = 4.93, p < 0.05, η2 = 0.02],
suggesting that participants with higher NC show a greater tendency for algorithm appre­
ciation (Malgo = 0.35, SDalgo = 0.41; Mhuman = 0.21, SDhuman = 0.33) than those with lower
NC (Malgo = 0.19, SDalgo = 0.33; Mhuman = 0.32, SDhuman = 0.42); see Figure 5.

Figure 5. Moderation effect of need for cognition in table-diff condition in study 3.

Discussion
Our results support H7 in that algorithm appreciation under an elaborated PPF (Table-Diff)
might be contingent on individuals’ NC. Specifically, when prediction performance is
presented in the Table-Diff format, people with higher NC show a greater tendency for
algorithm appreciation than those with lower NC. Interestingly, people with relatively high NC display a similar pattern of algorithm appreciation under the NE and the aggregated
PPFs (AAE and AAE-AHR) in previous studies. These results suggest that an elaborated
PPF (Table-Diff) does not necessarily deter people with relatively high NC from appreciat­
ing algorithmic advice.

General Discussion
Summary of Main Findings
In a series of five experiments, we find consistent evidence of algorithm appreciation; that is,
individuals take advice from an algorithm to a greater extent than from humans. More impor­
tantly, we show whether and how increased performance transparency influences one’s level
of algorithm appreciation. In response to RQ1, Studies 1 and 2 indicate that individuals
exhibit similar levels of algorithm appreciation between the NE and aggregated PPFs.
Further increasing performance transparency via an elaborated PPF lowers algorithm
appreciation. The findings of Studies 1, 2, and 3 also answer RQ2. Results from Studies 1
and 2 indicate that trust in an advisor’s prediction performance mediates the impact of
advice source on the degree of advice taking when individuals are likely to experience
a relatively low cognitive load, as under the NE and aggregated PPFs. An elaborated PPF increases cognitive load and suppresses this mediating role of trust in the effect of advice source on the degree of advice taking. Finally, the results from Study 3 reveal that
individuals with higher NC are more prone to algorithm appreciation despite the relatively
high cognitive load induced by an elaborated PPF. Table 4 outlines these findings and their
relationships with our research questions and hypotheses.

Table 4. Summary of hypotheses and empirical results.
Study 1A-C, RQ1, H1: Individuals exhibit a lower degree of advice taking from an algorithmic relative to human advisor when the advisor's prediction performance is displayed in an aggregated format than when no prediction performance is provided. Result: Rejected.
Study 1A, RQ2, H2: Individuals exhibit higher trust in an algorithmic compared to human advisor. Result: Supported.
Study 1A, RQ2, H3: Trust mediates the impact of the advice source on one's degree of advice taking when the advisor's prediction performance is displayed in an aggregated format and when no prediction performance is provided. Result: Supported.
Study 2, RQ2, H4: Individuals exhibit a higher cognitive load when the advisor's prediction performance is displayed in an elaborated format than when the prediction performance is not displayed or displayed in an aggregated format. Result: Supported.
Study 2, RQ1, H5: Individuals exhibit a lower degree of advice taking from an algorithmic relative to human advisor when the advisor's prediction performance is displayed in an elaborated format than when prediction performance is not displayed or displayed in an aggregated format. Result: Supported.
Study 2, RQ2, H6: Trust mediates the impact of the advice source on one's degree of advice taking, but only when the cognitive load is relatively low (vs. high). Result: Supported.
Study 3, RQ1, H7: NC moderates the impact of the advice source on one's degree of advice taking when the advisor's performance information is presented in an elaborated format. Result: Supported.

Implications for Theory and Research


Our paper has several implications for theory and research on algorithmic advice taking. First, we provide a theoretical framework and research model based on the JAS paradigm to demonstrate how the trust and design aspects of human–AI collaboration can jointly influence individuals' advice taking. These findings enhance our understanding of the
central role of trust as an underlying mechanism of algorithm appreciation. Our results
suggest that trust in an advisor mediates the impact of advice source on one’s degree of
advice taking when performance information is either not provided or appears in an
aggregated PPF, but this mediation diminishes with an elaborated PPF. Our findings
corroborate previous research on the mediating role of trust in the adoption of recommendation agents [1, 6, 35, 67, 68, 72], while showing that this mediation is not unconditional; rather, it depends on how an algorithm's performance information is displayed.
Second, we enrich the algorithmic transparency literature by investigating performance
transparency, which conveys why an individual should rely on algorithmic advice rather than
explaining how an algorithm works [10, 18, 54, 57, 71]. Our work is likely the first to
systematically examine how variation in performance transparency influences algorithm
appreciation while controlling for advice quality and the advisor’s prediction performance.
Our findings provide a plausible explanation for the contradictory results from Dietvorst et al.
[16] and Castelo et al. [12] regarding how increased performance transparency affects algo­
rithm appreciation. The results of Study 1 accord with those of Castelo et al. [12]: increasing
performance transparency by presenting performance in an aggregated format (vs. no per­
formance presentation) does not necessarily decrease algorithm appreciation. Our findings
from Study 2 also align with Dietvorst et al. [16]: enhancing performance transparency by
presenting performance in an elaborated format (vs. no performance presentation) lessens
algorithm appreciation. Taken together, prior mixed findings could be partially attributed to
different PPFs inducing distinct cognitive loads that moderate algorithm appreciation.
Lastly, the majority of prior studies on algorithmic advice taking focus on trust as a driver to
enhance individuals’ adherence to algorithmic advice [12, 16, 17, 21, 39]. Differently, our paper
demonstrates that the cognitive load induced by PPFs is another important factor contributing
to when and why individuals appreciate algorithms than humans. This insight may illuminate
future algorithm advice-taking research. In particular, we speculate that the increased perfor­
mance transparency stemming from an elaborated compared to an aggregated PPF could result
in a higher cognitive load and thus compromise interpretability [36, 49]. While our research
does not directly examine algorithms’ interpretability, our findings suggest that the information
presentation entailing relatively low cognitive effort in interpretation is likely to enhance one’s
reliance on algorithmic advice. Moreover, cognitively effortful interpretation of algorithmic advice affects algorithm appreciation differently across individuals. In this sense, our paper contributes to the
literature by highlighting the importance of considering one’s cognitive properties in algorith­
mic decision making. Future research can consider the impacts of interpretability on algorith­
mic advice taking among individuals with varying cognitive resources.

Implications for Practice


Our research also provides valuable implications for managers and practice. First, organiza­
tion managers should be aware that algorithms can communicate advice more effectively than humans because they elicit stronger consumer trust. Managers implementing algorithmic
decision aids should ensure that users develop trust towards the algorithm when interacting
with it. Second, algorithm appreciation can be expected to generate better decision out­
comes if algorithmic advice benefits individuals (e.g., debiasing human decisions). Even so,
managers should be cautious about the bias individuals may absorb when integrating advice from a biased algorithm [3, 63]. Third, our results can alleviate concerns among individuals or organiza­
tions implementing algorithmic recommendations subject to legal requirements (e.g.,
GDPR). Presenting algorithmic performance (even when prediction performance is not
perfect) does not necessarily lead to lower algorithm appreciation as long as individuals can
easily process this information. Lastly, for algorithm managers, offering the highest algo­
rithmic transparency is not always optimal: we show that presenting performance in an elaborated format can lower algorithm appreciation because of the increased cognitive load it imposes. A possible solution to this problem is to offer individuals PPF
options so they may choose a format that suits their cognitive tendencies.

Limitations and Future Research


Our research is not without limitations, which can inform future studies. First, our
research model highlights the role of advisor trust in algorithmic decisions. Although
we only examine how advice source and performance transparency affect individuals’
algorithm appreciation, future work could investigate factors that might alter people’s
trust in an advisor’s prediction performance (e.g., the type of prediction task, indivi­
duals’ expertise, and prediction performance varying from poor to good). Second, our
study underscores the roles of individuals’ cognitive characteristics in algorithmic
decision making. We explore how three PPFs influence algorithm appreciation; future
research could investigate how information presentation formats influence other types
of algorithmic decisions (e.g., algorithmic delegation). Last, personalized algorithmic
recommendations should consider not only individuals’ product consumption utility
but also the utility associated with processing information, which depends on how the algorithmic information is presented.

Conclusion
Despite the converging evidence of algorithm appreciation (i.e., higher persuasive
efficacy of algorithmic advice compared to identical human advice), the scarce existing research presents mixed findings regarding whether and how an increase in prediction perfor­
mance transparency impacts one’s appreciation of algorithmic relative to human
decision aids. To address these issues, we propose a theoretical model based on the
JAS paradigm and show that presenting prediction performance does not necessarily
deter individuals from relying more heavily on algorithmic than on human advice. Individuals' higher trust in an algorithmic relative to a human advisor drives this greater reliance on algorithmic advice, but only if an advisor's prediction performance
presentation does not induce a high cognitive load. To the best of our knowledge, we
are among the first to highlight the importance of considering individuals’ cognitive
aspects when increasing algorithmic transparency in the context of algorithmic deci­
sion making. Our findings provide insights for algorithmic managers and avenues for
future research.

Notes
i We also confirm this finding in a preliminary study. Among 125 Amazon Mechanical Turkers
(52.8 percent women, Mage = 39.8), 60 percent prefer to know the prediction accuracy rather
than the algorithm’s mechanism when asked to imagine their use of an algorithmic decision aid
in making predictions (Chi-square test p < 0.03).
ii Following prior research [23, 39], WOA is winsorized at 0 and 1; that is, any value of WOA
greater than 1 is replaced by 1, and any value of WOA less than 0 is replaced by 0. The value of
WOA is not available if the initial judgment is identical to the advice.
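
For concreteness, a sketch of this computation is given below, assuming the standard judge-advisor definition of WOA as (final estimate - initial estimate) / (advice - initial estimate); the function and example values are illustrative only.

```python
import math

def weight_on_advice(initial: float, final: float, advice: float) -> float:
    """WOA winsorized at 0 and 1; undefined when the initial estimate equals the advice."""
    if initial == advice:
        return math.nan  # such cases are excluded from the analysis
    woa = (final - initial) / (advice - initial)
    return min(max(woa, 0.0), 1.0)  # replace values below 0 with 0 and above 1 with 1

# Example: moving two-thirds of the way from an initial estimate of 60 toward advice of 75.
print(weight_on_advice(initial=60, final=70, advice=75))  # approximately 0.67
```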

Disclosure Statement
No potential conflict of interest was reported by the authors.

Notes on contributors
Sangseok You (sangyou@skku.edu) holds a Ph.D. from University of Michigan and is an assistant
professor of Information Systems at Sungkyunkwan University, South Korea. His research focuses on
understanding how teams working with technologies operate and promote team outcomes, encom­
passing such topics as human-robot collaboration, artificial intelligence, and open and
virtual collaboration. Dr. You's work has appeared in Journal of the Association for Information Systems,
Journal of the Association for Information Science and Technology, and other journals, and in the
proceedings of Academy of Management Annual Meeting and International Conference on
Information Systems.
Cathy Liu Yang (yang@hec.fr) is an assistant professor in the Information Systems Department at
HEC Paris. She received her Ph.D. in marketing from Columbia Business School. Her research
interests include modeling consumer information processing to improve preference measurement
and understanding the causal impact of algorithmic influence by using secondary and laboratory
data. Dr. Yang’s award-winning work appeared in Journal of Marketing Research.
Xitong Li (lix@hec.fr; corresponding author) is an associate professor of Information Systems at HEC
Paris, France. He received his Ph.D. in Management from MIT Sloan School of Management and his
Ph.D. in Engineering from Tsinghua University, China. His research interests include the identifica­
tion of causal impacts of using online data/information. Dr. Li’s work has appeared in journals,
including Information Systems Research, Journal of Management Information Systems, MIS Quarterly,
and various ACM/IEEE Transactions. His research work has been granted by the French national
research agency ANR AAPG France and has received several best paper awards.

Funding
This research work is partly supported by funding from the Hi! PARIS Fellowship and the French
National Research Agency (ANR) Investissements d’Avenir LabEx Ecodec [Grant ANR-11-LABX
-0047].

References
1. Adomavicius, G.; Bockstedt, J.C.; Curley, S.P.; and Zhang, J. Reducing recommender system
biases: An investigation of rating display designs. MIS Quarterly: Management Information
Systems, 43, 4 (2019), 1321–1341.
2. Ahmad, M.I.; Bernotat, J.; Lohan, K.; and Eyssel, F. Trust and cognitive load during human-
robot interaction. arXiv:1909.05160 [cs], (September 2019).

3. Ahsen, M.E.; Ayvaci, M.U.S.; and Raghunathan, S. When algorithmic predictions use
human-generated data: A bias-aware classification algorithm for breast cancer diagnosis.
Information Systems Research, 30, 1 (March 2019), 97–116.
4. Banerjee, A.V. A simple model of herd behavior. The Quarterly Journal of Economics, 107, 3
(August 1992), 797–817.
5. Barrouillet, P.; Bernardin, S.; Portrat, S.; Vergauwe, E.; and Camos, V. Time and cognitive load
in working memory. Journal of Experimental Psychology: Learning, Memory, and Cognition, 33,
3 (2007), 570–585.
6. Benbasat, I.; and Wang, W. Trust in and adoption of online recommendation agents. Journal of
the Association for Information Systems, 6, 3 (March 2005).
7. Berk, R. An impact assessment of machine learning risk forecasts on parole board decisions
and recidivism. Journal of Experimental Criminology, 13, 2 (June 2017), 193–216.
8. Bettman, J.R.; and Kakkar, P. Effects of information presentation format on consumer infor­
mation acquisition strategies. Journal of Consumer Research, 3, 4 (March 1977), 233–240.
9. Cacioppo, J.T.; Petty, R.E.; Feinstein, J.A.; and Jarvis, W.B.G. Dispositional differences in
cognitive motivation: The life and times of individuals varying in need for cognition.
Psychological Bulletin, 119, 2 (1996), 197–253.
10. Cai, C.J.; Winter, S.; Steiner, D.; Wilcox, L., and Terry, M. “Hello AI”: Uncovering the
onboarding needs of medical practitioners for human-ai collaborative decision-making.
Proceedings of the ACM on Human-Computer Interaction, 3. CSCW, Austin, Texas,
November 2019, pp. 1–24.
11. Canfield, C.; Bruin, W.B. de; and Wong-Parodi, G. Perceptions of electricity-use communica­
tions: effects of information, format, and individual differences. Journal of Risk Research, 20, 9
(September 2017), 1132–1153.
12. Castelo, N.; Bos, M.W.; and Lehmann, D.R. Task-dependent algorithm aversion. Journal of
Marketing Research, 56, 5 (October 2019), 809–825.
13. Chevalier, J.A.; and Mayzlin, D. The effect of word of mouth on sales: Online book reviews.
Journal of Marketing Research, 43, 3 (August 2006), 345–354.
14. Chun, W.Y.; and Kruglanski, A.W. The role of task demands and processing resources in the
use of base-rate and individuating information. Journal of Personality and Social Psychology, 91,
2 (2006), 205–217.
15. Das, T.K.; and Teng, B.-S. Trust, control, and risk in strategic alliances: An integrated
framework. Organization Studies, 22, 2 (March 2001), 251–283.
16. Dietvorst, B.J.; Simmons, J.P.; and Massey, C. Algorithm aversion: People erroneously avoid
algorithms after seeing them err. Journal of Experimental Psychology: General, 144, 1 (2015),
114–126.
17. Dietvorst, B.J.; Simmons, J.P.; and Massey, C. Overcoming algorithm aversion: people will use
imperfect algorithms if they can (even slightly) modify them. Management Science, 64, 3
(March 2018), 1155–1170.
18. Dove, G.; Balestra, M.; Mann, D.; and Nov, O. Good for the many or best for the few?
A dilemma in the design of algorithmic advice. Proceedings of the ACM on Human-
Computer Interaction, 4. CSCW2, 2020, pp. 1–22.
19. Duan, J.; Xia, X.; and Van Swol, L.M. Emoticons’ influence on advice taking. Computers in
Human Behavior, 79, (February 2018), 53–58.
20. Dzindolet, M.T.; Peterson, S.A.; Pomranky, R.A.; Pierce, L.G.; and Beck, H.P. The role of trust
in automation reliance. International Journal of Human-Computer Studies, 58, 6 (June 2003),
697–718.
21. Fügener, A.; Grahl, J.; Gupta, A.; and Ketter, W. Will humans-in-the-loop become borgs?
Merits and pitfalls of working with AI. Management Information Systems Quarterly, 45, 3
(2021), 1527–1556.
22. Ganzach, Y. Predictor representation and prediction strategies. Organizational Behavior and
Human Decision Processes, 56, 2 (November 1993), 190–212.
23. Gino, F. Do we listen to advice just because we paid for it? The impact of advice cost on its use.
Organizational behavior and human decision processes, 107, 2 (2008), 234–245.

24. Gino, F.; and Schweitzer, M.E. Blinded by anger or feeling the love: How emotions influence
advice taking. Journal of Applied Psychology, 93, 5 (2008), 1165.
25. Goodman, B.; and Flaxman, S. European Union regulations on algorithmic decision-making
and a “right to explanation.” AI Magazine, 38, 3 (October 2017), 50–57.
26. Gunaratne, J.; Zalmanson, L.; and Nov, O. The persuasive power of algorithmic and
crowdsourced advice. Journal of Management Information Systems, 35, 4 (October 2018),
1092–1120.
27. Hainc, N.; Federau, C.; Stieltjes, B.; Blatow, M.; Bink, A., and Stippich, C. The bright, artificial
intelligence-augmented future of neuroimaging reading. Frontiers in Neurology, 8, (2017), 489.
doi:10.3389/fneur.2017.00489.
28. Harvey, N.; and Fischer, I. Taking advice: Accepting help, improving judgment, and
sharing responsibility. Organizational behavior and human decision processes, 70, 2
(1997), 117–133.
29. Hayes, A.F. Introduction to Mediation, Moderation, and Conditional Process Analysis, Second
Edition: A Regression-Based Approach. New York, NY: Guilford Publications, 2017.
30. de Holanda Coelho, G.L.; Hanel, P.H.; and Wolf, L.J. The very efficient assessment of need for
cognition: developing a six-item version. Assessment, 1, (2018), 16.
31. Hong, W.; Thong, J.Y.L.; and Tam, K.Y. The effects of information format and shopping task
on consumers’ online shopping behavior: A cognitive fit perspective. Journal of Management
Information Systems, 21, 3 (November 2004), 149–184.
32. Jarvenpaa, S.L. The effect of task demands and graphical format on information processing
strategies. Management Science, 35, 3 (March 1989), 285–303.
33. Jussupow, E.; Spohrer, K.; Heinzl, A.; and Gawlitza, J. Augmenting medical diagnosis deci­
sions? An investigation into physicians’ decision-making process with artificial intelligence.
Information Systems Research, 32, 3 (September 2021), 713–735.
34. Kalyuga, S. Cognitive load theory: How many types of load does it really need? Educational
Psychology Review, 23, 1 (2011), 1–19.
35. Komiak, S.Y.X.; and Benbasat, I. The effects of personalization and familiarity on trust and
adoption of recommendation agents. MIS Quarterly, 30, 4 (2006), 941–960.
36. Lage, I.; Chen, E.; He, J.; Narayanan, M.; Kim, B.; Gershman, S.; and Doshi-Velez, F. An
evaluation of the human-interpretability of explanation. arXiv preprint arXiv:1902.00006.
(2019).
37. Li, X.; and Wu, L. Herding and social media word-of-mouth: Evidence from Groupon.
Management Information Systems Quarterly, 42, 4 (December 2018), 1331–1351.
38. Liel, Y.; and Zalmanson, L. What If an AI Told You That 2 + 2 Is 5? Conformity to Algorithmic
Recommendations. (2020). ICIS 2020 Proceedings. 17. https://aisel.aisnet.org/icis2020/hci_
artintel/hci_artintel/17
39. Logg, J.M.; Minson, J.A.; and Moore, D.A. Algorithm appreciation: People prefer algorithmic
to human judgment. Organizational Behavior and Human Decision Processes, 151, (March
2019), 90–103.
40. Longoni, C.; Bonezzi, A.; and Morewedge, C.K. Resistance to medical artificial intelligence.
Journal of Consumer Research, 46, 4 (December 2019), 629–650.
41. McKnight, D.H.; Liu, P.; and Pentland, B.T. Trust change in information technology products.
Journal of Management Information Systems, 37, 4 (October 2020), 1015–1046.
42. Meehl, P.E. Clinical versus Statistical Prediction: A Theoretical Analysis and a Review of the
Evidence. Minneapolis, MN: University of Minnesota Press, 1954.
43. Meyer, J.; Shamo, M.K.; and Gopher, D. Information structure and the relative efficacy of tables
and graphs. Human Factors, 41, 4 (December 1999), 570–587.
44. Mollick, E.; and Nanda, R. Wisdom or madness? Comparing crowds with expert evaluation in
funding the arts. Management Science, 62, 6 (June 2016), 1533–1553.
45. Montazemi, A.R.; and Wang, S. The effects of modes of information presentation on
decision-making: A review and meta-analysis. Journal of Management Information Systems,
5, 3 (December 1988), 101–127.

46. Mousavi, S.Y.; Low, R.; and Sweller, J. Reducing cognitive load by mixing auditory and visual
presentation modes. Journal of Educational Psychology, 87, 2 (1995), 319–334.
47. Önkal, D.; Goodwin, P.; Thomson, M.; Gönül, S.; and Pollock, A. The relative influence of
advice from human experts and statistical methods on forecast adjustments. Journal of
Behavioral Decision Making, 22, 4 (2009), 390–409.
48. Pierro, A.; Mannetti, L.; Erb, H.-P.; Spiegel, S.; and Kruglanski, A.W. Informational length and
order of presentation as determinants of persuasion. Journal of Experimental Social Psychology,
41, 5 (September 2005), 458–469.
49. Poursabzi-Sangdeh, F.; Goldstein, D.G.; Hofman, J.M.; Wortman Vaughan, J.W.; and
Wallach, H. Manipulating and Measuring Model Interpretability. In Proceedings of the 2021
CHI Conference on Human Factors in Computing Systems, New York, NY. Association for
Computing Machinery, 2021, pp. 1–52.
50. Promberger, M.; and Baron, J. Do patients trust computers? Journal of Behavioral Decision
Making, 19, 5 (2006), 455–468.
51. Qiu, L.; and Benbasat, I. Evaluating anthropomorphic product recommendation agents:
A social relationship perspective to designing information systems. Journal of Management
Information Systems, 25, 4 (April 2009), 145–182.
52. Remus, W. A study of graphical and tabular displays and their interaction with environmental
complexity. Management Science, 33, 9 (September 1987), 1200–1204.
53. Riedl, R.; Mohr, P.N.; Kenning, P.H.; Davis, F.D.; and Heekeren, H.R. Trusting humans and
avatars: A brain imaging study based on evolution theory. Journal of Management Information
Systems, 30, 4 (2014), 83–114.
54. Samson, B.P.V., and Sumi, Y. Exploring factors that influence connected drivers to (not) use or
follow recommended optimal routes. In Proceedings of the 2019 CHI Conference on Human
Factors in Computing Systems. Glasgow, Scotland, UK: Association for Computing Machinery,
2019, pp. 1–14.
55. Sniezek, J.A.; and Buckley, T. Cueing and cognitive conflict in judge-advisor decision making.
Organizational Behavior and Human Decision Processes, 62, 2 (May 1995), 159–174.
56. Sniezek, J.A.; and Van Swol, L.M. Trust, confidence, and expertise in a judge-advisor system.
Organizational Behavior and Human Decision Processes, 84, 2 (March 2001), 288–307.
57. Springer, A.; and Whittaker, S. Progressive disclosure: When, why, and how do users want
algorithmic transparency information? ACM Transactions on Interactive Intelligent Systems,
10, 4 (October 2020), 29:1–29:32.
58. Surowiecki, J. The Wisdom Of Crowds: Why the Many are Smarter than the Few and How
Collective Wisdom Shapes Business, Economies, Societies, and Nations. New York, NY:
Doubleday & Co, 2004.
59. Sweller, J. Cognitive load during problem solving: Effects on learning. Cognitive Science, 12, 2
(April 1988), 257–285.
60. Sweller, J. Element interactivity and intrinsic, extraneous, and germane cognitive load.
Educational Psychology Review, 22, 2 (June 2010), 123–138.
61. Sweller, J.; van Merriënboer, J.J.; and Paas, F. Cognitive architecture and instructional design:
20 years later. Educational Psychology Review, 31, 2 (2019), 261–292.
62. Sweller, J.; van Merrienboer, J.J.G.; and Paas, F.G.W.C. Cognitive architecture and instruc­
tional design. Educational Psychology Review, 10, 3 (September 1998), 251–296.
63. Teodorescu, M.; Morse, L.; Awwad, Y.; and Kane, G. Failures of fairness in automation require
a deeper understanding of human-ML Augmentation. Management Information Systems
Quarterly, 45, 3 (September 2021), 1483–1500.
64. Van Merriënboer, J.J.G.; and Sweller, J. Cognitive load theory and complex learning: Recent
developments & future directions. Educational Psychology Review, (2005), 147–177.
65. Van Merriënboer, J.J.G.; and Sweller, J. Cognitive load theory in health professional education:
design principles and strategies. Medical Education, 44, 1 (January 2010), 85–93.
66. Vessey, I. Cognitive fit: A theory-based analysis of the graphs versus tables literature. Decision
Sciences, 22, 2 (1991), 219–240.

67. Wang, W.; and Benbasat, I. Recommendation agents for electronic commerce: Effects of
explanation facilities on trusting beliefs. Journal of Management Information Systems, 23, 4
(May 2007), 217–246.
68. Wang, W.; and Benbasat, I. Empirical assessment of alternative designs for enhancing different
types of trusting beliefs in online recommendation agents. Journal of Management Information
Systems, 33, 3 (July 2016), 744–775.
69. Wang, X.; Guo, Y.; and Xu, C. Recommendation algorithms for optimizing hit rate, user
satisfaction and website revenue. In Proceedings of the 24th International Conference on
Artificial Intelligence. Buenos Aires, Argentina: AAAI Press, 2015, pp. 1820–1826.
70. Ward, A.; and Mann, T. Don’t mind if I do: Disinhibited eating under cognitive load. Journal of
Personality and Social Psychology, 78, 4 (2000), 753–763.
71. Wolf, C.; and Blomberg, J. Evaluating the promise of human-algorithm collaborations in
everyday work practices. Proceedings of the ACM on Human-Computer Interaction, 3, CSCW
(November 2019), 143:1–143:23.
72. Xu, J.; Benbasat, I.; and Cenfetelli, R.T. The nature and consequences of trade-off transparency
in the context of recommendation agents. MIS Quarterly, 38, 2 (February 2014), 379–406.
73. Yeomans, M.; Shah, A.; Mullainathan, S.; and Kleinberg, J. Making sense of recommendations.
Journal of Behavioral Decision Making, 32, 4 (2019), 403–414.
74. Yu, S.; Chai, Y.; Chen, H.; Brown, R.A.; Sherman, S.J.; and Nunamaker Jr, J.F. Fall Detection
with wearable sensors: A hierarchical attention-based convolutional neural network approach.
Journal of Management Information Systems, 38, 4 (2021), 1095–1121.
75. Zhou, J.; Arshad, S.Z.; Luo, S., and Chen, F. Effects of uncertainty and cognitive load on user
trust in predictive decision making. In R. Bernhaupt, G. Dalvi, A. Joshi, D. K. Balkrishan,
J. O’Neill, and M. Winckler (eds.), Human-Computer Interaction – INTERACT 2017.
Mumbai, India: Springer International Publishing, Cham, 2017, pp. 23–39.
