
Original Research Article

Shall AI moderators be made visible? Perception of accountability and trust in moderation systems on social media platforms

Marie Ozanne (1), Aparajita Bhandari (2), Natalya N Bazarova (2) and Dominic DiFranzo (3)

Big Data & Society, July–December 2022: 1–13. © The Author(s) 2022. DOI: 10.1177/20539517221115666. journals.sagepub.com/home/bds

(1) Cornell SC Johnson College of Business, Cornell University, Ithaca, NY, USA
(2) Department of Communication, Cornell University, Ithaca, NY, USA
(3) Lehigh University, P.C. Rossin College of Engineering and Applied Science, Bethlehem, PA, USA

Corresponding author: Marie Ozanne, Cornell SC Johnson College of Business, Cornell University, 336 Statler Hall, Ithaca, NY 14853, USA. Email: m.ozanne@cornell.edu

Creative Commons Non Commercial CC BY-NC: This article is distributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 License (https://creativecommons.org/licenses/by-nc/4.0/) which permits non-commercial use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access page (https://us.sagepub.com/en-us/nam/open-access-at-sage).
Abstract

This study examines how visibility of a content moderator and ambiguity of moderated content influence perception of the moderation system in a social media environment. In the course of a two-day pre-registered experiment conducted in a realistic social media simulation, participants encountered moderated comments that were either unequivocally harsh or ambiguously worded, and the source of moderation was either unidentified, or attributed to other users or an automated system (AI). The results show that when comments were moderated by an AI versus other users, users perceived less accountability in the moderation system and had less trust in the moderation decision, especially for ambiguously worded harassment, as opposed to clear harassment cases. However, no differences emerged in perceived moderation fairness, objectivity, and participants' confidence in their understanding of the moderation process. Overall, our study demonstrates that users tend to question the moderation decision and system more when an AI moderator is visible, which highlights the complexity of effectively managing the visibility of automatic content moderation in the social media environment.

Keywords

Social media, online harassment, content moderation, flagging, AI moderator, automated moderation system

Introduction

Moderation in social media platforms has become a prominent topic, especially with the rising wave of cyberbullying and online harassment. Online harassment ranges from offensive name-calling or purposeful embarrassment to stalking, physical threats or harassment over a sustained period (Vogels, 2021). According to surveys published by Pew Research, as many as 41% of U.S. adults have personally experienced online harassment (Vogels, 2021), and 66% have witnessed potentially harassing behavior directed toward others online (Duggan, 2017). To understand how content moderation works to prevent and remove inappropriate content, activists have pushed for more transparency in moderation decisions (Santa Clara Principles, 2018), especially since each social media platform (i.e. Facebook, Twitter, etc.) has its own set of community guidelines and regulations. Yet, most moderation systems and decisions remain opaque (Banchik, 2020; Crawford and Gillespie, 2016; Suzor et al., 2019), which can leave users confused and unsure about how and why content moderation occurs (Crawford and Gillespie, 2016; Jhaver et al., 2019a). While there is abundant critical scholarship focused on the issue of content moderation (e.g. Barocas et al., 2013; Gillespie, 2014, 2020; Kroll et al., 2016), few empirical studies have looked at how information visibility on moderation decisions affects users. One stream of research has started to focus on the reaction of the moderated users (Jhaver et al., 2019a, 2019b), while other literature has started to explore the effect of information visibility
2 Big Data & Society

on online community members (e.g. Matias et al., 2020). For instance, when a welcome note to newcomers explained the community's norms and mentioned that harassment was uncommon within the community, newcomers' participation in online feminism discussions increased (Matias et al., 2020). Yet, there is limited understanding of how lay users perceive moderation with different types of content and moderation source visibility.

Previous research has shown that users develop folk theories about why or how content has been taken down, and users often think that other humans are primarily responsible for such actions (Myers West, 2018). Yet, social media companies often use a mix of different moderators to regulate content on their website, from automated systems to commercial moderators to users themselves, but we do not know whether users react differently to different sources of moderation. In this study, therefore, our goal is to explore how the display of the source of moderation (AI vs. other users vs. unspecified source) affects users' perception of, agreement with, trust in, and perceived fairness of the moderation decision. We also investigate how the source of moderation affects users' belief certainty in how moderation works on the social media site as well as perceived objectivity and accountability of the moderation system.

This study is important for several reasons. First, it is critical to focus on online community members' perspectives as they, as witnesses of problematic content, are indirectly impacted by harassment and, consequently, by the moderation systems in place. Information visibility, and specifically the source of moderation, can have important consequences on how users perceive social media platforms, learn from the moderation system in place, and whether they decide to become members and participate in an online community (Jhaver et al., 2019b; Matias et al., 2020). Second, existing approaches to algorithm analysis have paid limited attention to the relationships between platform policy and user perspectives (Shin and Park, 2019), so understanding users' perception is needed to fully comprehend the ecosystem of moderation decisions.

Literature review

Sources of moderation

Social media platforms rely on both automated and human moderation systems to govern the content allowed on their sites. Platforms are working toward creating automated tools to identify hate speech, adult content, and extremism more quickly and at a larger scale than human reviewers, ideally removing this type of content before any human sees it. However, many researchers caution against a full reliance on automatic moderation. Researchers have pointed out that automated moderation tools are not very good at accounting for content subtlety, sarcasm, or cultural meaning (Duarte et al., 2018). They are also not able to easily adjust for context, for example, terrorist speech conveyed within a journalistic post (Llanso, 2019). Additionally, automated moderation systems can exacerbate rather than ameliorate the content policy problems that platforms face. Specifically, these systems, even when well-designed, increase opacity, make moderation practices more difficult to understand, and can reduce fairness within large-scale platforms (Gorwa et al., 2020). In the same vein, Gollatz and colleagues (2018) point out similar issues with ambiguity in the automation of content moderation, and that opaque implementation, vague definitions, and lack of accountability stand in the way of accuracy for automated content moderation.

The other type of moderation employed by platforms, "human" moderation, can take the form of commercial content moderation or community-based moderation. There has been an extensive discussion in the literature around the issues that arise from platforms' use of commercial content moderators, people whose job it is to look through potentially harmful content before it is posted on social media platforms. Specifically, commercial moderation has been criticized for its harmful labor practices, including extremely stressful and mentally taxing working conditions and mental health challenges faced by moderators, as well as for its overall lack of transparency and accountability (Gillespie, 2018; Roberts, 2016).

On the other hand, sites like Wikipedia, Reddit, and Twitch rely on their own user communities to do most of the moderation. Seering and colleagues (2019) found that this type of moderation system leads to significantly different dynamics compared to when moderation is driven top-down by company policy. User-driven moderation is an intensely social process that is core to community development (see Gillespie, 2020; Llansó, 2020). Even sites that do use top-down moderation approaches like commercial moderators often do so in conjunction with user flagging, which is framed in terms of self-regulation of social media platforms. Many platforms have specific community guidelines and encourage users to flag content that violates these guidelines to distribute the labor of moderation. This also shifts responsibility (or at least perceived responsibility) onto the shoulders of the users rather than on the site.

Calls for increased transparency have grown in the literature, as it is believed that transparency increases understanding, fairness and trust (see Santa Clara Principles, 2018). However, many researchers have rightfully pointed out that blanket transparency may not truly be desirable, especially because transparency can be manipulated, by emphasizing specific information and downsizing other types of information (Berkelaar and Harrison, 2017; ter Hoeven et al., 2021). For example, too much transparency as to how decisions are made could lead to increases in individuals' attempts to "game" the system. Some scholars have
pointed to ways that moderating algorithms can be "gamed" and moderation can be used to harass and silence certain (often minority) groups online (Diakopoulos, 2015; Munger, 2017; Noble, 2018).

Given the different types of moderation source and the impact that revealing each of the sources can have on users, we explore how each of the moderation sources (AI, other users or unknown) affects users' perceptions of 1) the moderation decision via trust and fairness, their agreement with the moderation decision, and likelihood to look at the site content policy; 2) the moderation process via belief certainty; and 3) the moderation system via perceived system accountability and objectivity. In the next section, we explain the reasoning behind each of our research questions, moving from users' perceptions of moderation decisions, to their general understanding of the moderation process, to the platform's moderation system as a whole.

Fairness and Trust in the Moderation Decision. As the use of algorithmic systems has increased across platforms and technologies, many researchers have started to examine users' perceptions of algorithms' fairness, accountability, and transparency (FAT) (Shin and Park, 2019). Broadly speaking, perceived fairness of algorithms reflects a tenet that algorithmic decisions should not be discriminatory or prejudiced and should not lead to unjust consequences (Yang and Stoyanovich, 2017). For example, Facebook states that they strive to have fair moderation decisions, meaning that the same content posted by two users would be equally likely to be either pulled down or left up regardless of who posted the content (Gorwa, 2018). However, fairness can get quite complicated as the perception of a decision's fairness is largely contextual and subjective. Gorwa (2018) points out that aspects of fairness involve balancing individual rights to expression against potential widespread societal harm, and this balancing aspect may vary from user to user or from platform to platform.

In addition, another key facet of user perception of moderation is trust. Shin and Park (2019) found that when users had higher levels of trust in algorithmic systems, they were more likely to see algorithms as fair and accurate. Additionally, trust moderated the relationship between FAT and user satisfaction in their study. Other research found that transparency of an automatic moderation system is a prerequisite for trust in that system (Brunk et al., 2019). Yet, an important question is whether trust and perceived fairness of the moderation decision vary by type of moderation source (AI vs. user-based) and its visibility.

Agreement with the Moderation Decision and Likelihood to Look at the Site Content Policy. In addition to perceptions of the moderation decision, it is important to understand users' actions around it. While social media companies give users the tools to help moderate inappropriate content, not knowing the source of moderation can confuse users and lead to inaccurate guesses about moderating decisions (Myers West, 2018; Santa Clara Principles, 2018; Suzor et al., 2019). Similarly, visibility of the source of moderation might influence users' decision to check the site content policy when seeing a moderated comment. Looking at the site content policy can serve as an indicator of whether users question or want to gain more information on the moderation system and regulation.

Belief Certainty in the Moderation Process and Perceived Accountability and Objectivity of the Moderation System. For the perception of the moderation process, we examine belief certainty, referring to users' confidence around their understanding of how the moderation process works on the site. Certainty is an important dimension of attitudes and beliefs because it influences behavior's stability and durability over time (Bargh et al., 1992; Grant et al., 1994; Gross et al., 1995). People's confidence around the moderation process is likely to be linked to visibility or transparency around the identity of the moderators and explanations for why certain content has been taken down (Grimmelmann, 2015). When the moderation process is obscure, users lack the basis for evaluating moderation and interpret moderators' decisions in a subjective (and often inaccurate) way (Kempton, 1986). For example, when a moderation system was opaque, users whose content was moderated by algorithms often erroneously attributed moderation to other users instead of algorithms (see Eslami et al., 2015; Myers West, 2018). In other words, in the absence of transparency around the moderation process, users resort to their own folk theories (Eslami et al., 2016), which are a combination of a priori beliefs about how the system works and new information gained from direct experience with the platform (DeVito et al., 2018; French, 2017). Similar to how an explanation around content removal facilitates users' understanding of how moderation works (Jhaver et al., 2019a), it can increase their belief certainty, and we specifically examine how disclosing the source of moderation and types of moderators (users vs. AI vs. unknown) influence belief certainty in the moderation process.

On the platform level, we examine users' perceptions of the moderation system's accountability and objectivity. Algorithmic accountability refers to perceived impacts or consequences of an algorithmic system, and the extent to which the algorithmic system (or its creators) is held accountable for unintended harms that arise due to the system's decisions. Users may be concerned that algorithmic systems are vulnerable to making mistakes or can lead to undesired consequences (Lee, 2018). Accountability is also often linked to visibility of information. Being able to make observations from visible information helps create insights and knowledge required to hold systems accountable. It has been shown that visibility of
4 Big Data & Society

information can create two forms of perceived accountability: a "soft" accountability, where systems and platforms must answer for their action, and a "hard" accountability, where insights and knowledge that come as a result of visibility bring about "power to sanction and demand compensation for harms" (Ananny and Crawford, 2018: 976; see Fox, 2007). Following Rader and colleagues (2018), we refer to perceived accountability as participants' beliefs of "how the system might be accountable to them, as individual users" (p. 8) through measuring their perceived control over the outputs and perceived system's fairness, and we raise a question of how visibility around the source of moderation might influence perceptions of system accountability.

Finally, we also explore perceived objectivity of the moderation system. Perceived objectivity has been linked to higher perceived trust and credibility in algorithmic decisions (Sundar and Nass, 2001). People often assume that algorithmic decisions are more credible than human decisions because computers are more objective than humans (Sundar and Nass, 2001). However, individuals can vary in their experience with algorithms, and when a machine is too "machine-like," system objectivity can backfire and users can infer more trust in human than machine decisions (Dietvorst et al., 2015; Waddell, 2018). Yet, perceived objectivity in moderation decisions has not yet been explored in relation to transparency around the source of moderation.

Ambiguity in moderation

As discussed previously, there is a great deal of opaqueness surrounding content moderation. Content ambiguity can add another layer of opacity, especially because different social media platforms have different strategies for dealing with problematic content and even have different definitions of allowable or appropriate content. Pater and colleagues (2016) conducted a content analysis of the harassment regulation policies for fifteen different social media platforms. They find what they call a "striking inconsistency" between different platforms in their definitions of harassment. This lack of a common definition of harassment across platforms can cause uncertainty on how to respond to different types of offenses. As a result, there is a discrepancy between the ways in which different platforms respond to harassment. For example, some platforms merely censor the offensive content, while other policies point to the potential involvement of government and law enforcement such as police and intelligence agencies (Brown and Pearson, 2018).

The context and intentionality, previous interactions between the harasser and the person being harassed, offline power dynamics and many other factors can all be important in determining if a post or a comment constitutes harassment (Langos, 2012). In fact, flagging can sometimes move from a mechanism of reporting and upstanding into something that can be "gamed" and abused (Crawford and Gillespie, 2016). Finally, because of ambiguity in harassment comments, building automated systems to detect harassment can be difficult and these systems cannot necessarily understand and interpret ambiguity (Nadali et al., 2013).

Therefore, we explore how content ambiguity can moderate effects of source visibility on users' perceptions around the moderation decision (trust and fairness), the moderation process (belief certainty), and the moderation system (accountability and objectivity), as well as people's agreement with the moderation decision and their likelihood to look at the site policy. A summary of our predictions is presented in Figure 1.

Figure 1. Summary of predictions.

Methods

This study was part of a larger experiment, which resulted in two separate studies, each one involving different research questions, hypotheses, and dependent variables (first study: Bhandari et al., 2021). The experiment was pre-registered on the Open Science Framework, with the research questions, measures, and analysis plan available at the following link: https://osf.io/bwm8e. It involved a custom-made social media platform called "EatSnap.Love," which focuses on sharing pictures of food and reproduces basic functionalities of a social network site (DiFranzo et al., 2018; Taylor et al., 2019). This platform was designed as a testbed to study users' reactions to harassment online in a realistic yet controlled environment. As a cover story, we told participants that they would be beta testing a new social media platform. On this platform, users can create
and share their own social media posts through a newsfeed, and comment on, like, and flag posts. Within EatSnap.Love, we have used a social media simulation engine called Truman, which mimics other users' behaviors on the site through pre-programmed bots that could comment on or like participants' posts and others' posts that were displayed on the participant's newsfeed as if they were coming from real users on the site. This way the Truman platform allows for a realistic social media experience while preserving experimental control through identically curated socio-technical environments for each participant assigned to a similar experimental treatment. The Truman platform is open-source and allows for easy replication of this or any other study, as all simulation material (bots, posts, images, etc.) as well as the platform itself are available on GitHub: https://github.com/cornellsml/truman.

Participants

We recruited 582 participants from Amazon MTurk to participate in our study in exchange for a $10 compensation. The power analysis calculation using the software program G*Power yielded a target sample size of 350, with .80 power to detect a medium effect size of .20 at the standard .05 error probability. To receive the full payment, participants were asked to log into EatSnap.Love at least twice a day for two days and to create at least one post each day. When signing up on the platform, participants could see the terms of service as well as the community rules (see supplementary material for details). After removing participants who did not complete at least one day of the two-day study, our final sample size was 397 (54% female). Participants were located in the US, and their ages ranged from 18 to 70 (M = 35.50, SD = 9.70). More than 70% of them had some college degree or a 4-year college degree, and 74% were white (296), 10.5% were Asian, 7% were black or African American, 6% were Hispanic or Latino, 0.5% were American Indian or Native American, 1% were Asian and 2% indicated being another race or ethnicity outside of these options.
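The sample-size target of 350 is reported as G*Power output, but the exact test family and design inputs are not given in the text. The R sketch below, using the pwr package and a one-way ANOVA approximation across the six experimental cells, is therefore only illustrative of how such a calculation can be reproduced; it will not return exactly 350.

```r
# Illustrative power analysis; assumptions: six cells (3 source x 2 ambiguity),
# Cohen's f = 0.20, alpha = .05, power = .80. The paper reports only the
# G*Power result (target N = 350), not the exact configuration used.
library(pwr)

res <- pwr.anova.test(k = 6, f = 0.20, sig.level = 0.05, power = 0.80)
res                   # n is the required sample size per cell
ceiling(res$n) * 6    # approximate total sample size under these assumptions
```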
Figure 2. Screenshot of the moderated social media comment. The screenshot shows the full post for Day 1 for the unambiguous comment and for the AI as moderation source condition. For the other conditions, instead of "our automated system has flagged this comment as harassment," it was written for the unspecified source condition: "This comment has been flagged as harassment" and for the other users condition: "Other users have flagged this comment as harassment."

Experimental design

Participants were randomly assigned to one of the six conditions, and in all the conditions, they encountered in their newsfeed one moderated comment per day. We employed a 3 (other users vs. an automated AI system vs. no source identified) × 2 (ambiguous vs. clear harassment comment) between-subject factorial design, as described in the pre-registration. The visibility of the source of moderation of the harassment comment was manipulated by mentioning that either an automated system or other users on the site moderated the comment, or not mentioning the source of moderation at all (see Figure 2 for screenshots of the conditions). Depending on the experimental treatment, the ambiguity of the moderated comment was also manipulated, making the moderated comment either clearly harassing or ambiguously worded (see Table 1).

To determine the appropriate level of ambiguity of the comments, we pilot tested 40 comments that were evaluated by 145 participants recruited on MTurk. The comments were initially created by our research team based on real comments they saw on social media platforms. The pilot test ascertained the difference between ambiguous and clear harassment content by evaluating perception of harassment for each comment (3 questions, e.g. to what extent do you see this as online harassment? 1 = not at all to 9 = very much so). Two ambiguous and two clear harassment comments were selected for both Day 1 and Day 2, and the results of the pilot test showed that the comments
6 Big Data & Society

(ambiguous vs. clear harassment) differed significantly in perceived harassment for Day 1, t(144) = 16.52, p < .01, and for Day 2, t(144) = 19.61, p < .01.
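The degrees of freedom reported above (144, with 145 pilot raters) are consistent with a within-rater comparison of each rater's mean ratings of the two comment types, although the exact test is not spelled out in the text. A minimal R sketch under that assumption:

```r
# Sketch of the pilot comparison, assuming a data frame `pilot` with one row per
# rater (n = 145) and hypothetical columns holding each rater's mean perceived-
# harassment rating (1-9 scale) for the clear and the ambiguous comments of a day.
t.test(pilot$clear_mean, pilot$ambiguous_mean, paired = TRUE)
# A paired t-test over 145 raters has df = 144, matching the reported t(144) values.
```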
Table 1. Comments selected for the experiment across conditions and days.

Condition | Day 1 | Day 2
Flagged, unambiguous | CMT1.1: "Is she pregnant? Or did she just gain 500 pounds? LOL #chubs" | CMT2.1: "This photo is uglier than you and that's saying something #cheesecake"
Flagged, ambiguous | CMT 2.1: "ummmm imma pass on ever eating this" | CMT 2.2: "plzzz don't ever cook this for me"

Procedure

Participants first completed a pre-survey that collected demographic information as well as information on their social media use, web skills, and tolerance for ambiguity and affinity for technology. When registering on the study's social media site, participants could read the community rules (e.g. no bullying, no non-food posts, etc.). Each day of their participation in the study, all participants were exposed to a harassment comment between two different EatSnap.Love users (bots) that was moderated by one of the moderation sources, randomly assigned to each participant out of the three choices described above. The moderated comment was always displayed near the top of the newsfeed, randomly placed between the first and sixth post. Participants visited EatSnap.Love on average 12.5 times on the first day and 8.5 times the second day. Participants also engaged with the site by liking on average 23 posts, commenting on 3.68 posts, and flagging 1.62 comments. At the end of the two days, participants completed the post-survey, and their account was deactivated.

Measures

There were two types of measures used in the study: two behavioral measures captured through log data during participants' interaction on the platform, and post-survey measures. The behavioral measures for this study were: 1) agreement with the moderation decision (a "yes"/"no" prompt asking if they agreed with the moderator's decision) (see Figure 2) and 2) the choice to view the site's moderation policy (after participants indicated their answer to the agreement with the moderation decision question, another question appeared asking if participants wished to review the site's community rules; their positive response would take them to the community rules page). The post-survey included measures on 1) the moderation decision (trust in the decision and its perceived fairness), 2) the moderation process, capturing participants' belief certainty regarding how they understood the moderation process on the platform, and 3) perceived accountability and objectivity of the platform moderation system.

For judgments on the moderation decisions, we evaluated trust in the moderation decision ("How much do you trust that EatSnap.Love make(s) good moderation decisions?", measured on 1 = no trust at all to 7 = extreme trust), and perceived fairness of the moderation decision ("How fair is it for [scenario subject] that their comment was flagged?", measured on 1 = very unfair to 7 = very fair; Lee, 2018). Belief certainty about how the moderation process works was captured with three items (alpha = .90; e.g. "How certain are you that you know how EatSnap.Love's moderation process works?" 1 = not at all certain to 5 = extremely certain) that we adapted from French (2017). Finally, we captured judgments about the platform moderation system through perception of system accountability with three items (alpha = .87) (e.g. "EatSnap.Love's flagging system acts in the best interest of the people who use it," "EatSnap.Love's flagging system behaves the same way for everyone who uses it." 1 = strongly disagree to 7 = strongly agree), adapted from Rader and colleagues (2018). For perception of system objectivity, we adapted the scale from Sundar and Nass (2001), consisting of four adjectives (objective, fair, fact-based, and biased (reverse coded)) and asking participants to rate to what extent they believed that each of them described the EatSnap.Love flagging system (1 = describes very poorly to 7 = describes very well; alpha = .87).

Results

All the analyses include the following control variables: web skills, affinity with technology, tolerance for ambiguity, age, gender, and education; they are reported only when found to be significant predictors of the dependent variables. We present below our most pertinent results organized by participants' perceptions of (1) the moderation decision, (2) the moderation process, (3) the moderation system, and an additional category of (4) behavioral variables (agreement with the moderation decision and choice to review the site's moderation policy). Moderation decision is based on the perception of the moderation related to the harassment comment; moderation process emphasizes the understanding of how the process works; and perception of the moderation system focuses on the judgment of the overall moderation system. A complete set of the results can be found in Tables 2 and 3. For thoroughness and consistency, we ran the main effects and interactions on all our dependent variables, which means that some results presented below are not included in our pre-registration. For continuous dependent variables, we used the aov modeling function from the stats package in R to perform ANCOVAs to highlight differences in means between the moderation sources. For categorical dependent variables, we used the glmer modeling function from the lme4 package (Bates et al., 2015) to perform binary logistic regressions.
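As a rough illustration of these two model families (not the authors' actual analysis script, which is not reproduced here), the sketch below assumes a participant-level data frame d with hypothetical names for the manipulated factors, covariates and outcomes:

```r
# Sketch of the analyses described above; all variable names are hypothetical.
library(lme4)

# ANCOVA for a continuous outcome (e.g. trust in the moderation decision),
# with the two manipulated factors, their interaction, and the listed covariates
m_trust <- aov(trust ~ source * ambiguity + web_skills + tech_affinity +
                 ambiguity_tolerance + age + gender + education,
               data = d)
summary(m_trust)

# Binary logistic regression for a categorical outcome (e.g. agreement with the
# moderation decision, 0 = no / 1 = yes). glmer() fits a mixed model, so a random
# intercept per participant is assumed here (each participant encountered a
# moderated comment on each of the two days); the paper does not state the
# random-effects structure, so this is an assumption.
m_agree <- glmer(agree ~ source * ambiguity + web_skills + tech_affinity +
                   ambiguity_tolerance + age + gender + education +
                   (1 | participant_id),
                 family = binomial, data = d)
summary(m_agree)
```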
Table 2. ANCOVA results.

Source of moderation (A)
Dependent variable | df | F | Cohen's d
Perceived trust | 2 | 3.25** | 0.13
Perceived fairness | 2 | 0.33 | 0.04
Belief certainty in the system | 2 | 1.48 | 0.09
Perception of system accountability | 2 | 2.73* | 0.12
Perception of system objectivity | 2 | 0.26 | 0.04

Ambiguity of harassment comment (B)
Dependent variable | df | F | Cohen's d
Perceived trust | 1 | 31.92**** | 0.29
Perceived fairness | 1 | 241.92**** | 0.79
Belief certainty in the system | 1 | 12.47**** | 0.18
Perception of system accountability | 1 | 20.60**** | 0.23
Perception of system objectivity | 1 | 27.62**** | 0.27

Interaction (A) * (B)
Dependent variable | df | F | Cohen's d
Perceived trust | 2 | 3.48** | 0.13
Perceived fairness | 2 | 1.31 | 0.27
Belief certainty in the system | 2 | 1.47 | 0.09
Perception of system accountability | 2 | 3.38** | 0.13
Perception of system objectivity | 2 | 0.56 | 0.05

Note. Control variables included in the analyses: web skills, affinity with technology, tolerance for ambiguity, age, gender, and education.
**** = p < .001, *** = p < .01, ** = p < .05, * = p < .1.

Table 3. Binary logistic regression results.

Agreement with moderation decision (responding vs. ignoring the question)
Predictor | b (a) | Z value | S.E.
Source of moderation (AI vs. unknown) (A) | 0.05 | 0.11 | 0.44
Source of moderation (AI vs. Users) (B) | 0.17 | 0.40 | 0.43
Ambiguity of harassment comment (C) | −1.21*** | −3.14 | 0.38
Interaction (A) * (C) | 0.10 | 0.11 | 0.88
Interaction (B) * (C) | 0.65 | 0.75 | 0.87

Agreement with moderation decision (answering no vs. yes)
Predictor | b (a) | Z value | S.E.
Source of moderation (AI vs. unknown) (A) | −0.17 | −0.45 | 0.38
Source of moderation (AI vs. Users) (B) | 0.23 | 0.59 | 0.39
Ambiguity of harassment comment (C) | 3.88**** | 6.77 | 0.57
Interaction (A) * (C) | −0.71 | −0.92 | 0.78
Interaction (B) * (C) | −0.52 | −0.65 | 0.80

Likelihood to look at site policy (responding vs. ignoring the question)
Predictor | b (a) | Z value | S.E.
Source of moderation (AI vs. unknown) (A) | 0.23 | 0.94 | 0.24
Source of moderation (AI vs. Users) (B) | 0.46* | 1.87 | 0.24
Ambiguity of harassment comment (C) | −0.17 | −0.87 | 0.20
Interaction (A) * (C) | 0.29 | 0.61 | 0.48
Interaction (B) * (C) | −0.20 | −0.41 | 0.49

Likelihood to look at site policy (answering no vs. yes)
Predictor | b (a) | Z value | S.E.
Source of moderation (AI vs. unknown) (A) | 0.43 | 1.39 | 0.31
Source of moderation (AI vs. Users) (B) | 0.22 | 0.69 | 0.33
Ambiguity of harassment comment (C) | −0.34 | 0.20 | 0.26
Interaction (A) * (C) | −0.06 | −0.10 | 0.62
Interaction (B) * (C) | −0.07 | −0.11 | 0.65

Note. Control variables included in the analyses: web skills, affinity with technology, tolerance for ambiguity, age, gender, and education.
(a) Unstandardized coefficient estimate.
**** = p < .001, *** = p < .01, ** = p < .05, * = p < .1.

Perception of moderation decision: fairness and trust

Perception of the moderation decision was measured via fairness and trust. While the moderation source did not change perception of fairness (see Table 2), it did impact perception of trust in the moderation decision, F(2, 389) = 3.25, p = .04, Cohen's d = .13. A pairwise comparison
8 Big Data & Society

assessing mean differences showed that participants in the automated moderation condition reported lower trust in the moderation decision (M = 4.89, SE = .12) than those in the other users as source of moderation condition (M = 5.27, SE = .10, p = .04). No differences were detected when comparing these two moderators with the unidentified source condition (M = 5.11, SE = .11). From all the control variables, only affinity with technology had an effect on trust, with higher affinity for technology indicating higher trust in the moderation decision, F(1, 389) = 6.21, p = .01, Cohen's d = .16.

Furthermore, there was a significant interaction between the source of moderation and comment ambiguity on perceived trust, F(2, 387) = 3.48, p = .03, Cohen's d = .13. When the comment was ambiguously harassing another user, participants in the automated condition perceived less trust in the moderation decision (M = 4.38, SE = .18) than those in the other users condition (M = 4.95, SE = .15), b = −.59, SE = .20, p = .01, or when the moderation source was unknown (M = 4.94, SE = .16), b = −.55, SE = .20, p = .02. Therefore, participants perceived less trust in the moderation when the decision was made by an AI than by other users or an unidentified moderation source, but only when the comment was ambiguously worded. Indeed, no significant differences emerged in the clear harassment comment condition between the AI and the unknown moderation source conditions, b = .22, SE = .22, p = .56, and between the AI and the other users conditions, b = −.29, SE = .22, p = .39.

Ambiguity of the harassment comment did impact both perception of fairness, F(1, 389) = 241.92, p < .001, Cohen's d = .79, and trust, F(1, 389) = 31.92, p < .001, Cohen's d = .29. Specifically, participants in the ambiguous condition perceived lower trust in the moderation decision (M = 4.74, SE = .10) than those in the clear harassment comment condition (M = 5.48, SE = .08) as well as lower perceived fairness (ambiguously worded comment: M = 3.93, SE = .13; clearly harassing comment: M = 6.37, SE = .08).
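The pairwise and simple-effect contrasts reported in this subsection can be obtained from the fitted ANCOVA; the paper does not name the routine used, so the emmeans-based sketch below is only one possible way to reproduce this kind of comparison (m_trust is the hypothetical model object from the earlier sketch).

```r
# Simple effects of moderation source within each level of comment ambiguity,
# using estimated marginal means; an assumption, not the authors' documented code.
library(emmeans)

emm <- emmeans(m_trust, ~ source | ambiguity)
pairs(emm, adjust = "none")   # e.g. AI vs. other users, AI vs. unknown source
```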
27.62, p < .000, Cohen’s d = .27. When the comment was
ambiguous, participants reported lower perception of the
Perception of moderation process: belief certainty system acting in users’ best interests (M = 4.86, SE = .10)
For perception of the moderation process, we captured and lower feelings of system objectivity (M = 4.36, SE =
belief certainty about how the moderation process .07) compared to when exposed to clear harassment
works. The results demonstrate that the moderation comment (accountability: M = 5.49, SE = .09, objectivity:
source did not impact how certain participants felt M = 4.85, SE = .05). Tolerance for ambiguity was also a
about how the moderation process worked, and there significant covariate when looking at accountability, with
was no interaction between moderation source and higher tolerance indicating higher feelings of accountabil-
comment type (see Table 2). However, comment type ity, F(1389) = 4.04, p = .04, Cohen’s d = .10.
significantly impacted belief certainty F(1389) = 12.47,
p < .000, Cohen’s d = .18, such that, unsurprisingly, par- Likelihood to check content policy and agreement
ticipants in the ambiguously worded condition perceived
lower belief certainty in the system (M = 2.50, SE = .07) with the moderation decision
than those in the clear harassment comment condition For our final analyses, we examined the effect of source
(M = 2.86, SE = .08). visibility and comment ambiguity on behavioral actions.
We used logistic regressions to test whether the source of moderation, the comment type and their interaction impacted participants' decision to agree with the moderation decision and their likelihood to check the content policy. We analyzed agreement with the moderation decision in two ways: first, by comparing those who responded to the prompt "do you agree with the moderation decision?" and those who ignored it completely, and second, within the group who responded, we compared participants who agreed versus those who disagreed with the moderation decision. The only significant impact in both cases was produced by the content ambiguity, with participants in the clearly harassing cases being more likely to respond than to ignore the prompt, b = −1.21, SE = .64, p < .01, and, from everyone who responded, participants in the clearly harassing cases were more likely to agree with the moderation decision, b = 3.88, SE = .57, p < .001. Specifically, 94.5% of those in the clear harassment condition answered the prompt, compared to 84.1% of participants in the ambiguous condition; and from everyone who responded to the prompt, 94.7% of the respondents agreed with the moderation decision ("yes") in the clear harassment condition versus only 27.1% in the ambiguous comment condition. Moderation source was not a significant predictor, and there were no moderation effects (see Table 3 for a complete set of results). Regarding the likelihood to check the content policy, results showed non-significant differences between those who responded and those who did not, as well as between participants who responded "yes" or "no" to the prompt, for each of our manipulated conditions (see Table 3).

Discussion

The goal of this study was to understand how making a content moderator visible in a social media environment may influence perceptions of the moderation system. We found that users tend to question (1) their trust in the moderation decision and (2) the accountability of the moderation system when moderation was attributed to AI compared to other users or an unknown source. However, instead of AI moderation being seen as less trustworthy and accountable across the board, the difference between AI and the other sources of moderation only emerged for content moderation of ambiguously worded harassment, and not for clear harassment cases. Below we explain the meaning of these results from the point of view of users' engagement with problematic content moderated by different sources, as well as the implications for making moderation sources visible on social media platforms.

With regard to the perceptions of the moderation decision, we found that users were less trusting when exposed to an automated system moderator (vs. other users or unknown source) when the harassment comment was ambiguously worded. Interestingly, no differences in source of moderation nor in the interaction of source of moderation and comment type were found for fairness of the moderation decision. This is somewhat surprising as fairness is a critical component of trust in policy making (see OECD, 2017), with higher perceived fairness leading to higher trust in the decision. One possible explanation may have to do with a priori knowledge of the outcome. When people make judgments about a decision (or about the decision maker) with the knowledge of a certain outcome, they tend to place less weight on fairness (Tyler, 1996). Knowing that the comment was already flagged may have dampened users' judgment of fairness. However, our participants tended to have less trust in the moderation decision when the moderator was an AI (vs. other users or an unidentified source), when paired up with ambiguous content, which may be due to inherent biases that humans have toward machines. As pointed out by Lee's (2018) research on algorithm fairness and trust, while people can trust computers to do their job, they don't give them full trust because of the possibility of system glitches. When combined with ambiguously worded content, people may become less trusting toward machine moderation (compared to other users or unidentified sources) due to this inherent bias and possibility of error on the part of AI moderators. In other words, the window of trust appears to be smaller for AI compared to other users and unidentified moderation sources when content is obscure and does not lend itself to an unequivocal harassment interpretation.

With regard to the perception of the moderation process, we found that when the harassment comment was clearly harassing, participants expressed that they were more certain in their understanding of how moderation worked (i.e. more belief certainty) compared to when the comment was ambiguous. Interestingly, there was no difference in belief certainty depending on the moderation source, indicating that automatic moderation did not cause more doubts in their understanding of the moderation system. The literature suggests that when AI is understood and analyzed by lay users, AI systems can be better integrated in day-to-day activities (Hagras, 2018).

Perception of the moderation system was examined via participants' assessment of the system's accountability and objectivity. Participants did not judge the system objectivity differently based on the moderation source, but there were significant differences in perceptions of system accountability depending on the moderator (i.e. an automated moderation system vs. other users or unknown source) when the comment was ambiguously harassing another user. While system objectivity refers to general fairness of the moderation system, accountability goes beyond fairness and captures whether the system is perceived to be acting in the best interest of all users (Rader et al., 2018). It is possible that in the more ambiguous situation, users are concerned about AI's moderation because of the above-mentioned inherent bias toward machines (Lee, 2018)
10 Big Data & Society

foregrounded by ambiguous content. In other words, people may be willing to yield to machine judgments in clear-cut and unequivocal cases, but any kind of contextual ambiguity resurfaces underlying questions about the AI moderation system because of a possibility of errors and questions about whose interests and values it reflects. In contrast, when participants see "other users" or even an unidentified source flagging an ambiguous comment, they may be less likely to question their actions and intentions because the trust window is larger with other users than with AI moderation.

Finally, we looked at whether visibility of moderators impacted users' agreement with the moderation decision and likelihood to look at the site content policy. Moderation source did not impact users' agreement (vs. disagreement) with the moderation decision, but the exposure to different types of comments led participants to take different actions on the site: in the clear (vs. ambiguous) harassment comment condition, participants were more likely to answer the agreement question than to skip it, presumably because they were more confident in judging the comment as harassment, and particularly to answer "yes" (vs. "no"). This kind of prompt may have important implications: the more harassment comments are reported by users, the higher the likelihood that the platform will remove the reported comment and the greater users' engagement on the platform (Jiménez Durán, 2022).

Overall, these results highlight how ambiguity in harassing comments strongly impacts perceptions of moderation decisions and moderation systems, especially when the moderator is an AI system. The results show that when users feel uncertain about whether the comment is harassing, they are more likely to question the moderation decision and the moderation system, similar to how they question the certainty of their own understanding of the system (Blackwell et al., 2018). Furthermore, the exposure to the same ambiguous content led to more questioning of AI moderation than the other moderation sources (i.e. other users and unknown sources). This finding adds a more nuanced understanding of folk theories about the role of a priori beliefs and direct experience with content moderation (DeVito et al., 2018; Eslami et al., 2016; French, 2017). It suggests that users assign different weights to direct experiences with different types of moderators (AI vs. other users or unknown source), likely because of the differences in a priori beliefs about them, and future research should examine the interplay of a priori beliefs and direct experiences in how humans perceive and interact with AI content moderation systems.

Limitations

As in any research, this study comes with some limitations. Despite our ability to reproduce a simulated social media environment to increase the ecological validity of the experiment, the social media site we used (EatSnap.Love) was new to our participants. Being in a new social media environment may have made our participants act differently than in their commonly used social media sites. Yet, even in this new environment, and not knowing the other users of the site, participants still trusted the moderation decision for ambiguous content more when other users flagged the comment than when an automated system did so. Future research could extend the test of moderation source visibility to real-world platforms, for example, by using field experiments, similar to the study done by Matias (2019). Second, the results do not elucidate participants' interpretations of their experiences with the moderation system, which could be probed with open-ended questions and qualitative data. Finally, while we examined three different moderators, subsequent research should also look at the effect of commercial moderators on perceptions of moderation decisions and systems. Adding this fourth category would give a broader understanding of how people engage with different types of content moderation.

Conclusion

The findings from our study reveal the complexity of how users interact with different types of content moderation on social media platforms. Specifically, users are more likely to question AI moderators, especially how much they can trust their moderation decisions and the moderation system's accountability, but only when the moderated content is inherently ambiguous. In other words, rather than invariably trusting AI's moderation less and consistently seeing AI as less accountable compared to other users' moderation, the difference only emerges with ambiguous content. Those findings are increasingly relevant to the question of whether AI should be used solely to moderate content (Gillespie, 2020), as they highlight the differences in how people engage with an AI content moderation system compared to human moderators.

While media companies continue to use artificial intelligence to improve users' experience (Moss, 2021) and to maintain engagement and profitability (Jiménez Durán, 2022), a promising approach could be identifying ways for AI and humans to work together as "moderating teammates," in line with the "machines as teammates" paradigm (see Seeber et al., 2020). Even if AI could effectively moderate content, human moderators remain necessary because community rules are constantly changing and cultural contexts differ, so there is never a shared value system that everyone agrees on (Gillespie, 2020). Furthermore, AI moderators can help with content moderation scalability and reduce the burden on professional moderators who deal daily with harassment comments, violence, or conspiracies (Irwin, 2022). As our findings suggest, when a comment is clearly harassing another user, using AI as moderator does not seem to affect users' perception of the
moderation system. However, for an ambiguously harassing comment, platforms could rely on users rather than AI, or look for ways to engage automated moderation in a way that supports "partnership between human and automated moderation" (Gillespie, 2020: 4), for example, by AI tools providing users with contextual information when users are not sure about ambiguously worded harassment. Similar to how users expand their understanding of the breadth and impact of a harassment problem through the act of labeling negative online experiences as "online harassment" (Blackwell et al., 2018), users may increase their understanding of AI moderation systems through direct experience.

Acknowledgments

We want to thank the reviewers for their comments. We also thank Anna Spring for her help with developing the simulation, Katherine Miller for providing support with participant compensation, and research assistants Annika Pinch, Suzanne Lee and Hyun Seo (Lucy) Lee for their help with testing the simulation and managing data collection.

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the USDA NIFA HATCH (grant number 2020-21-276).

ORCID iDs

Marie Ozanne https://orcid.org/0000-0001-5355-0397
Natalya N Bazarova https://orcid.org/0000-0001-5375-6598

References

Ananny M and Crawford K (2018) Seeing without knowing: limitations of the transparency ideal and its application to algorithmic accountability. New Media & Society 20(3): 973–989. DOI: 10.1177/1461444816676645.
Banchik AV (2020) Disappearing acts: content moderation and emergent practices to preserve at-risk human rights–related content. New Media & Society 23(6): 1527–1544.
Bargh JA, Chaiken S, Govender R, et al. (1992) The generality of the automatic attitude activation effect. Journal of Personality and Social Psychology 62(6): 893.
Barocas S, Hood S and Ziewitz M (2013) Governing algorithms: a provocation piece. Available at SSRN 2245322.
Bates D, Maechler M, Bolker BM, et al. (2015) Fitting linear mixed-effects models using lme4. Journal of Statistical Software 67: 1–48.
Berkelaar BL and Harrison MA (2017) Information visibility. In: Oxford Research Encyclopedia of Communication.
Bhandari A, Ozanne M, Bazarova NN, et al. (2021) Do you care who flagged this post? Effects of moderator visibility on bystander behavior. Journal of Computer-Mediated Communication 26(5): 284–300.
Blackwell L, Chen T, Schoenebeck S, et al. (2018) When online harassment is perceived as justified. In: Twelfth International AAAI Conference on Web and Social Media.
Brown KE and Pearson E (2018) Social media, the online environment and terrorism. In: Routledge Handbook of Terrorism and Counterterrorism. Oxon, UK: Routledge, 149–164.
Brunk J, Mattern J and Riehle DM (2019) Effect of transparency and trust on acceptance of automatic online comment moderation systems. In: 2019 IEEE 21st Conference on Business Informatics (CBI), 2019, pp. 429–435. IEEE.
Crawford K and Gillespie T (2016) What is a flag for? Social media reporting tools and the vocabulary of complaint. New Media & Society 18(3): 410–428. DOI: 10.1177/1461444814543163.
DeVito MA, Birnholtz J, Hancock JT, et al. (2018) How people form folk theories of social media feeds and what it means for how we study self-presentation. In: Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, Montreal, QC, Canada, 19 April 2018, pp. 1–12. ACM. DOI: 10.1145/3173574.3173694.
Diakopoulos N (2015) Algorithmic accountability: journalistic investigation of computational power structures. Digital Journalism 3(3): 398–415.
Dietvorst BJ, Simmons JP and Massey C (2015) Algorithm aversion: people erroneously avoid algorithms after seeing them err. Journal of Experimental Psychology: General 144(1): 114.
DiFranzo D, Taylor SH, Kazerooni F, et al. (2018) Upstanding by design: bystander intervention in cyberbullying. In: Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems (CHI '18), Montreal, QC, Canada, 2018, pp. 1–12. ACM Press. DOI: 10.1145/3173574.3173785.
Duarte N, Llanso E and Loup AC (2018) Mixed messages? The limits of automated social media content analysis. In: FAT, 2018, p. 106.
Duggan M (2017) Witnessing online harassment. In: Pew Research Center: Internet, Science & Tech. Available at: https://www.pewresearch.org/internet/2017/07/11/witnessing-online-harassment/ (accessed 15 April 2022).
Eslami M, Karahalios K, Sandvig C, et al. (2016) First I 'like' it, then I hide it: folk theories of social feeds. In: Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, San Jose, CA, USA, 7 May 2016, pp. 2371–2382. ACM. DOI: 10.1145/2858036.2858494.
Eslami M, Rickman A, Vaccaro K, et al. (2015) 'I always assumed that I wasn't really that close to [her]': reasoning about invisible algorithms in news feeds. In: Proceedings of the 33rd Annual ACM Conference on Human Factors in Computing Systems, 2015, pp. 153–162.
Fox J (2007) The uncertain relationship between transparency and accountability. Development in Practice 17(4–5): 663–671.
French MR (2017) Algorithmic Mirrors: An Examination of How Personalized Recommendations Can Shape Self-Perceptions and Reinforce Gender Stereotypes. PhD Thesis, Stanford University, CA, USA. Available at: https://www.proquest.com/docview/2436884514/abstract/D2717BFCBCD441FPQ/1 (accessed 9 August 2021).
12 Big Data & Society

Gillespie T (2014) The relevance of algorithms. Media Technologies: Essays on Communication, Materiality, and Society 167(2014): 167.
Gillespie T (2018) Custodians of the Internet: Platforms, Content Moderation, and the Hidden Decisions That Shape Social Media. New Haven, CT: Yale University Press.
Gillespie T (2020) Content moderation, AI, and the question of scale. Big Data & Society 7(2): 2053951720943234.
Gollatz K, Beer F and Katzenbach C (2018) The Turn to Artificial Intelligence in Governing Communication Online. Berlin. Available at: https://www.ssoar.info/ssoar/handle/document/59528.
Gorwa R (2018) Towards fairness, accountability, and transparency in platform governance. AoIR Selected Papers of Internet Research.
Gorwa R, Binns R and Katzenbach C (2020) Algorithmic content moderation: technical and political challenges in the automation of platform governance. Big Data & Society 7(1): 2053951719897945.
Grant MJ, Button CM and Noseworthy J (1994) Predicting attitude stability. Canadian Journal of Behavioural Science/Revue Canadienne des Sciences du Comportement 26(1): 68.
Grimmelmann J (2015) The virtues of moderation. Yale JL & Tech. 17: 42.
Gross SR, Holtz R and Miller N (1995) Attitude certainty. Attitude Strength: Antecedents and Consequences 4: 215–245.
Hagras H (2018) Toward human-understandable, explainable AI. Computer 51(9): 28–36. DOI: 10.1109/MC.2018.3620965.
Irwin V (2022) Two content moderators file class-action lawsuit against TikTok. Available at: https://www.protocol.com/bulletins/tiktok-content-moderation-lawsuit (accessed 14 April 2022).
Jhaver S, Appling DS, Gilbert E, et al. (2019a) 'Did you suspect the post would be removed?': understanding user reactions to content removals on Reddit. Proceedings of the ACM on Human-Computer Interaction 3(CSCW): 1–33. DOI: 10.1145/3359294.
Jhaver S, Birman I, Gilbert E, et al. (2019b) Human-machine collaboration for content regulation: the case of Reddit Automoderator. ACM Transactions on Computer-Human Interaction 26(5): 1–35. DOI: 10.1145/3338243.
Jiménez Durán R (2022) The economics of content moderation: theory and experimental evidence from hate speech on Twitter. SSRN Scholarly Paper 4044098, 25 February. Rochester, NY: Social Science Research Network. DOI: 10.2139/ssrn.4044098.
Kempton W (1986) Two theories of home heat control. Cognitive Science 10(1): 75–90.
Kroll JA, Huey J, Barocas S, et al. (2016) Accountable algorithms. University of Pennsylvania Law Review 165: 633.
Langos C (2012) Cyberbullying: the challenge to define. Cyberpsychology, Behavior, and Social Networking 15(6): 285–289.
Lee MK (2018) Understanding perception of algorithmic decisions: fairness, trust, and emotion in response to algorithmic management. Big Data & Society 5(1): 2053951718756684.
Llanso E (2019) Platforms want centralized censorship. That should scare you. Wired, 18 April.
Llansó EJ (2020) No amount of "AI" in content moderation will solve filtering's prior-restraint problem. Big Data & Society 7(1): 2053951720920686.
Matias JN (2019) Preventing harassment and increasing group participation through social norms in 2,190 online science discussions. Proceedings of the National Academy of Sciences 116(20): 9785–9789. DOI: 10.1073/pnas.1813486116.
Matias JN, Simko T and Reddan M (2020) Study results: reducing the silencing role of harassment in online feminism discussions.
Moss S (2021) Facebook plans huge $29–34 billion capex spending spree in 2022, will invest in AI, servers, and data centers. Available at: https://www.datacenterdynamics.com/en/news/facebook-plans-huge-29-34-billion-capex-spending-spree-in-2022-will-invest-in-ai-servers-and-data-centers/ (accessed 14 April 2022).
Munger K (2017) Tweetment effects on the tweeted: experimentally reducing racist harassment. Political Behavior 39(3): 629–649. DOI: 10.1007/s11109-016-9373-5.
Myers West S (2018) Censored, suspended, shadowbanned: user interpretations of content moderation on social media platforms. New Media & Society 20(11): 4366–4383. DOI: 10.1177/1461444818773059.
Nadali S, Murad MAA, Sharef NM, et al. (2013) A review of cyberbullying detection: an overview. In: 2013 13th International Conference on Intelligent Systems Design and Applications, 2013, pp. 325–330. IEEE.
Noble SU (2018) Algorithms of Oppression. New York: NYU Press.
OECD (2017) Trust and public policy: how better governance can help rebuild public trust. Available at: https://www.oecd-ilibrary.org/governance/trust-and-public-policy_9789264268920-en (accessed 9 August 2021).
Pater JA, Kim MK, Mynatt ED, et al. (2016) Characterizations of online harassment: comparing policies across social media platforms. In: Proceedings of the 19th International Conference on Supporting Group Work, 2016, pp. 369–374.
Rader E, Cotter K and Cho J (2018) Explanations as mechanisms for supporting algorithmic transparency. In: Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, 2018, pp. 1–13.
Roberts ST (2016) Commercial content moderation: digital laborers' dirty work.
Santa Clara Principles (2018) Santa Clara Principles on transparency and accountability in content moderation. Available at: https://santaclaraprinciples.org/images/scp-og.png (accessed 9 August 2021).
Seeber I, Bittner E, Briggs RO, et al. (2020) Machines as teammates: a research agenda on AI in team collaboration. Information & Management 57(2): 103174. DOI: 10.1016/j.im.2019.103174.
Seering J, Wang T, Yoon J, et al. (2019) Moderator engagement and community development in the age of algorithms. New Media & Society 21(7): 1417–1443.
Shin D and Park YJ (2019) Role of fairness, accountability, and transparency in algorithmic affordance. Computers in Human Behavior 98: 277–284.
Sundar SS and Nass C (2001) Conceptualizing sources in online news. Journal of Communication 51(1): 52–72.
Suzor NP, West SM, Quodling A, et al. (2019) What do we mean when we talk about transparency? Toward meaningful transparency in commercial content moderation. International Journal of Communication 13: 18.
Taylor SH, DiFranzo D, Choi YH, et al. (2019) Accountability and empathy by design: encouraging bystander intervention to cyberbullying on social media. Proceedings of the ACM on Human-Computer Interaction 3(CSCW): 1–26.
ter Hoeven CL, Stohl C, Leonardi P, et al. (2021) Assessing organizational information visibility: development and validation of the information visibility scale. Communication Research 48(6): 895–927.
Tyler TR (1996) The relationship of the outcome and procedural fairness: how does knowing the outcome influence judgments about the procedure? Social Justice Research 9(4): 311–325. DOI: 10.1007/BF02196988.
Vogels E (2021) The state of online harassment. In: Pew Research Center: Internet, Science & Tech. Available at: https://www.pewresearch.org/internet/2021/01/13/the-state-of-online-harassment/ (accessed 9 August 2021).
Waddell TF (2018) What does the crowd think? How online comments and popularity metrics affect news credibility and issue importance. New Media & Society 20(8): 3068–3083. DOI: 10.1177/1461444817742905.
Yang K and Stoyanovich J (2017) Measuring fairness in ranked outputs. In: Proceedings of the 29th International Conference on Scientific and Statistical Database Management, 2017, pp. 1–6.
