Characterizing the Depths of the Discord Platform in Brazil

Arthur Buzelin*, Yan Aquino*, Victoria Estanislau*, Pedro Bento*, Lucas Dayrell*,
Caio Santana*, Pedro Robles*, Ana Paula Couto, Virgilio Almeida,
Fabricio Benevenuto, Wagner Meira Jr.
Universidade Federal de Minas Gerais, Brazil
{arthurbuzelin, yanaquino, victoria.estanislau, pedro.bento, lucasdayrell, caiosantana, ana.coutosilva, virgilio, fabricio, meira}@dcc.ufmg.br, pedroroblesduten@ufmg.br

Abstract

Discord, a prominent social platform specialized in gaming and voice communication, has recently attracted significant attention in the Brazilian media, which has highlighted many instances of toxic behavior. To understand this phenomenon, we collected data from 1,232 groups over a one-year period, encompassing 700 million messages. From that, we determined the main characteristics of the network and, using a fine-tuned BERT model, stratified the hateful messages into different types of hate speech, in order to understand the efficiency of Discord's current moderation, the groups most affected by hateful environments, and the evolution of users engaged in the platform. Our findings reveal that Discord fails to moderate accurately: the prevalence of hate speech remains statistically the same between users whose accounts were deleted (by themselves or by Discord) and those who remain active. Furthermore, sexism emerges as the primary form of hate speech in the network as a whole, with concerning age demographics in groups of sensitive categories. Finally, we observe that Discord's toxic environments tend to escalate users' toxicity, highlighting the need for improved moderation techniques.

Introduction

Originally designed as a user-friendly platform for online gaming with real-time voice chatting capabilities, Discord evolved into a prominent social platform serving a diverse array of communities. Comparable to services like WhatsApp and Telegram, it is particularly known for being popular among young adolescents, who use it as a space for social interaction and virtual engagement.

The platform is structured into user-created and managed groups¹, which can be public or private. Each group hosts multiple coexisting voice and text channels, supporting a variety of communication forms including audio, images, videos, and file sharing. This study focuses specifically on text messages within public groups.

In recent years the platform experienced particular growth, reaching more than half a billion users worldwide [Statista 2023]. This increase in popularity put Discord in a spotlight it had never been in before; however, it also shed light on worrying incidents involving the network in Brazil, leading to investigations and reports from major news outlets in the country. Those reports highlighted toxic behavior and illegal activities committed through the platform and attracted national attention and commotion. The crimes ranged from psychological abuse to encouragement of self-harm and pedophilia, mostly perpetrated against young women.

Other social networks have already been studied in terms of hate speech, such as Reddit [Rieger et al. 2021], Twitter [Watanabe, Bouazizi, and Ohtsuki 2018] and WhatsApp [Saha et al. 2021], but we found little to no similar work on Discord in Brazil. There were a few studies with some kind of characterization of the platform, but none targeting hate speech detection, which is surprising and concerning given how popular it is today and all the problems lying within it, as previously stated.

Aiming to fill this gap, we address and investigate the following research questions:

RQ1: How effectively does Discord enforce user moderation in accordance with its guidelines?
RQ2: What groups are targeted by hate speech and exposed to toxic environments on Discord?
RQ3: How do the profiles of Discord users in Brazil evolve over time in relation to the environment they are engaged in on the platform?

We shed light on these questions by collecting data from 1,232 public groups, adding up to 700 million messages from 2.2 million distinct users. We fine-tuned a BERT [Devlin et al. 2019] model designed for tweets in Portuguese [Pérez, Giudici, and Luque 2021], using an extensive and manually labeled collection of messages, in order to classify messages into five different categories of hate speech: 'Racism', 'Homophobia', 'Sexism', 'Body Shaming' and 'Ideology'.

By comparing the proportion of hateful messages originating from users who were either banned or deleted their accounts and those who remain active, we discovered that the two populations exhibit statistically identical rates, revealing flaws in Discord's moderation system.

* These authors contributed equally.
Copyright © 2024, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
¹ The groups are called "servers" by Discord and its users, but to avoid terminology confusion in this paper and maintain compatibility with related work, they will be referred to as "groups".
We also typified the kinds of hate speech, retrieved the self-declared age demographics, and categorized the groups, allowing us to uncover: (i) the age distribution within each group category; (ii) the prevalence of each type of hate speech across categories, revealing a generally high level of sexism and notable spikes in certain categories, such as ideology-related hatred in politics groups; and (iii) the concerning presence of minors in sensitive groups, such as those focused on pornography and online dating. Ultimately, by analyzing toxicity trends over time in each environment, we concluded that users tend to become more toxic in environments prone to hate speech.

Background and Related Work

In this section, we begin by reviewing previous work that explored messaging platforms, especially Discord. Then, we examine prior research concerning social network moderation effectiveness and hate speech classification.

Messaging Platforms

In the last few years, many mainstream platforms have intensified moderation, removing misinformation and conspiracy theories that notably have promoted violence in the real world [BBC 2020, 2021, Zadrozny and Collins 2018, 2020].

Not surprisingly, after these content moderation interventions, this kind of content migrated to digital spaces with less moderation, like Parler [Aliapoulios et al. 2021] and Gab [Lima et al. 2018], or to messaging platforms like WhatsApp and Telegram [Hoseini et al. 2023, La Morgia et al. 2021].

Consequently, there are multiple research efforts that report how massive misinformation campaigns exploit WhatsApp and Telegram, particularly during elections in Brazil [Tardaguila, Benevenuto, and Ortellado 2018, Resende et al. 2019, Melo et al. 2019, Júnior et al. 2022]. In India, the spread of rumors through WhatsApp caused a series of violent lynchings around the country [Arun 2019]. Another issue regarding these platforms, in particular Telegram, is that they are also commonly used for diverse forms of digital scams [La Morgia et al. 2021].

A few previous studies have explored various uses of Discord. Research [Wiles and Simmons 2022, Lacher and Biehl 2018] investigates its application in the educational context, while other studies focus on the platform's use for drug trafficking [van der Sanden et al. 2022]. Additionally, a comparative analysis between Discord, WhatsApp, and Telegram highlighted group characterization, post assessment, and aspects of privacy and security on each network [Hoseini et al. 2020]. In a different context, an investigation conducted through one of the forums that provide links to public Discord groups [Disboard 2023] revealed the presence of a variety of extremist groups [Heslep and Berge 2024]. Another contribution was the development of a dataset of juvenile messages classified as hateful, which can help improve hate speech identification [Fillies, Peikert, and Paschke 2024], while another work discussed specific challenges of moderation in voice channels [Jiang et al. 2019].

Moderation on Social Media

As a key set of analyses in our work explores moderation on Discord, next we briefly review efforts related to moderation on different digital platforms.

Social networks, ranging from mainstream platforms like Facebook and the former Twitter, now known as X, to alternative forums such as Gab and 4chan, have been the subject of study in academic circles due to their recognized impact on people's daily lives. These platforms are analyzed not only to understand their overall network behavior but also to examine the effects of digital integration. The ease with which content can be disseminated on these networks [Bakshy et al. 2012] often leads to the spread of hateful content and extremist beliefs and opinions. The study [Chandrasekharan et al. 2017] discusses the challenges of moderation and the consequences that its absence causes for users. A practical example is the study [Wang et al. 2023], which identifies classes of user behaviors that promote QAnon theories and evaluates the impacts of Twitter moderation on this content.

There are different approaches to moderating a social network, which vary depending on the platform's operation. Moderation can be applied to individual posts, users, or even entire groups or communities, and can be characterized as either soft or hard intervention. Soft intervention, as explored in [Zannettou 2021], is characterized by warnings to the user, hiding or limiting the reach of content, as well as limiting the interactions the user can make on the platform. Hard intervention, exemplified in [Cheng, Danescu-Niculescu-Mizil, and Leskovec 2016], involves the removal of content, users, or communities deemed problematic from the platform [Chandrasekharan et al. 2017, Jiang et al. 2023]. A few efforts address moderation on Reddit, where subreddit moderators are volunteers who, in addition to manually conducting interventions, can make use of automation tools such as AutoModerator, a customizable bot that assists in comment monitoring, or ModMail, a tool that facilitates user communication with the moderators [Dosono and Semaan 2019, Gilbert 2020, Thach et al. 0].

Hate Speech Detection

As a relevant part of our work consists of measuring toxicity in Discord messages, next we briefly discuss efforts on measuring hate speech.

There are different methods for identifying hate speech on social media networks, as explored in previous studies [Alkomah and Ma 2022, Saha et al. 2023, Davidson et al. 2017]. For instance, the study [Efstratiou et al. 2023] utilized the Perspective API [Lees et al. 2022] to analyze interactions in toxic subreddits, grouping users based on the similarity of their behaviors. Other investigations on Reddit focused on identifying hate speech using specialized lexicons [Olteanu et al. 2018, Schmitz, Muric, and Burghardt 2022], with the former assessing user behavior during periods of extremist attacks and the latter exploring subreddits already known for their hate speech. Additionally, the study [Lima et al. 2020] compared hate speech on Twitter, a moderated platform, and Gab, an unmoderated platform, identifying toxic expressions through a combination of pre-selected terms and the Perspective API.
Other efforts have analyzed which groups are most targeted by hate speech on social media [Chiril et al. 2021]. Also, [Silva et al. 2016] conducted a systematic large-scale measurement study on Twitter and Whisper, providing information on the types of hate occurring on each social network and relating them to structural properties of the network. Another study [Obermaier and Schmuck 2022] investigates the reasons why many teenagers and young adults frequently become targets of online hate speech.

Source          # of Links
Discovery       1,086
Disboard        1,200
Discadia        181
Top.gg          137
Discords        22
Total Unique    2,097

Table 1: Distribution of collected group links across all gathered public sources.

Research Gap

While platforms such as WhatsApp and Telegram have been widely studied recently, Discord remains relatively underexplored. This oversight is especially significant in the Brazilian context, where recent incidents have caused national uproar and highlighted the platform's role in facilitating harmful behaviors. Motivated by these developments, our study seeks to address these crucial gaps by providing a comprehensive analysis of user interactions and the effectiveness of moderation within Brazilian Discord communities. By focusing on this specific geographical and cultural setting, we aim to uncover nuanced insights that previous studies may have overlooked, thus contributing significantly to the broader understanding of how digital spaces can influence user behavior.

Dataset Collection

We collected messages from public Brazilian Discord groups, following the platform policies². These groups are split into two distinct types: (i) those featured in Discovery³, an official in-app feature for browsing new groups to join, which must adhere to stringent guidelines, including maintaining a safe, moderated environment and avoiding sensitive or controversial topics; and (ii) those with publicly available invite links on external websites and online forums.

We then developed a web scraper to collect invitation links. This data was extracted from Discord's Discovery feature and the four most prominent Discord link-sharing websites. The data collection occurred in October 2023; Table 1 summarizes the number of groups identified from each publicly available source. In this effort, we ensured that our dataset included all publicly accessible groups in Brazil.

The collection was done using a custom crawler based on the official Discord API⁴. Once configured, our crawler entered public Discord groups, those whose invitation link was published on the cited public sources, and saved all text messages shared on their channels from the creation of Discord, May 13th, 2015, up to October 1st, 2023. Moreover, for each message, we collected its corresponding timestamp, author's username, author's unique ID, and content. Due to the deactivation of 77 groups, our crawler returned approximately 3 billion text messages from 2,020 unique groups.
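For concreteness, the sketch below shows how such a crawl can be expressed with the discord.py wrapper around the official API. The bot token, CSV output, and error handling are illustrative placeholders rather than our exact implementation, which additionally handles rate limits and incremental checkpointing.

```python
# Minimal crawler sketch: dump the full text history of every channel the
# account can read. Assumes a discord.py (>= 2.0) bot added to the groups.
import csv
import discord

intents = discord.Intents.default()
intents.message_content = True  # required to read message text
client = discord.Client(intents=intents)

@client.event
async def on_ready():
    with open("messages.csv", "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["group", "channel", "timestamp",
                         "author_id", "author_name", "content"])
        for guild in client.guilds:              # every group joined
            for channel in guild.text_channels:
                try:
                    # limit=None walks the entire history of the channel
                    async for msg in channel.history(limit=None, oldest_first=True):
                        writer.writerow([guild.id, channel.id,
                                         msg.created_at.isoformat(),
                                         msg.author.id, str(msg.author),
                                         msg.content])
                except discord.Forbidden:        # channels we cannot read
                    continue
    await client.close()

client.run("BOT_TOKEN")  # placeholder token
```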
Although our crawling process covers a time window of more than 8 years, we restricted our analysis to the period in which users' activity, in terms of exchanged messages, reached its highest volume. The period with the largest engagement occurred between October 1st, 2022 and September 30th, 2023. Moreover, we filtered out messages from groups with fewer than 100 exchanged messages, since those comprised abandoned groups. In total, approximately 2.3 billion messages were filtered out of the analyses.

We then proceeded with our filtering process, turning our attention to the presence of bots and deleted users in our dataset. Bots on the platform are automated accounts that perform functions like greeting new members, moderating content, playing multimedia files, or even hosting games among the users. Deleted users, in turn, are those whose account was deleted either by the user itself or by Discord due to a violation of the terms of service. We identified bots and "Deleted Users" using the Discord API, finding 30,331,773 messages delivered by bots and 55,878,395 by "Deleted Users". The messages sent by bots were filtered out, reducing our dataset to 677,494,339 messages sent by 2,239,337 unique users, while the "Deleted Users" messages were marked for later analysis. The entire data filtering process, including these criteria and the subsequent steps, is detailed in Figure 1.

² https://discord.com/safety/360043709612-our-policies
³ https://discord.com/guild-discovery
⁴ https://discord.com/developers/applications

Figure 1: A diagram displaying the three stages of the dataset collection and filtering process, alongside the total number of groups, messages, and users after filtering.

Age Retrieval

In light of reports from Brazil, where a significant number of incidents involved teenagers both as perpetrators and victims, understanding the age demographics within our dataset becomes crucial. This demographic insight is essential not only for evaluating the effectiveness of Discord's moderation strategies but also for gaining a deeper understanding of user interactions on the platform.

Given that Discord's API does not provide users' ages, we explored a distinctive feature of Brazilian Discord groups: many have an "introduce yourself" text channel in which users typically share personal details, including their age. This study therefore concentrates on leveraging this self-reported age information, retrieved only from public channels of public groups.

Our method involved deploying a script that utilizes regular expressions to extract age data from these channels.

To avoid repetition, we implemented a system where only the most recent age reported by a user was retained, thus limiting our dataset to one age entry per unique user ID.

We considered age reports between 6 and 99 years as plausible, based on the assumption that children under six are unlikely to use social media and that centenarians represent a very small proportion of the Discord user base. This age range was selected to create a realistic and representative sample for our analysis.

To assess the accuracy of this approach, we manually reviewed 1,000 random messages. Our analysis revealed that 96% of these messages did in fact represent age declarations, indicating the effectiveness of our approach.
tiveness in our approach.

Categories Classification Table 2: Overview of group distribution by category

To enhance our understanding of user characteristics within different kinds of groups on Discord, and to analyze the specific "paths" these users follow, we categorized each group into distinct categories. With this process, we were able to infer with more certainty what communities and behaviors are expressed on Discord, which will be discussed further later.

We initiated our analysis by classifying each group into multiple predefined categories found on the forums and official Discord pages. As the analysis progressed, we continually assessed the relevance and scope of these categories. For each group, we manually assigned a broader, more fitting category based on the group's description and required tags to ensure the optimal classification.

Through this iterative process, unnecessary categories were identified and subsequently merged into more relevant ones. This methodical refinement continued until we reached a total of 11 distinct categories. The final structure of these categories, along with the total number of groups within each, is detailed in Table 2.

Hate Speech Classification Model

To investigate the toxicity and presence of hateful messages in Discord groups, we required a reliable machine learning model to detect this kind of content. In our context, we faced two main challenges: (i) our content is written in Brazilian Portuguese, for which several NLP machine learning models have been shown not to achieve high accuracy; and (ii) the intrinsic characteristics of Discord interactions and style of writing, which may impact the accuracy of the task we are interested in. Therefore, before building the detection model, we first created an annotated corpus to be applied in fine-tuning our classification model.

Annotated Corpus

To conduct our labeling process, we randomly selected a set of 16,000 Discord messages, which were divided into two distinct subsets. The first, containing 10,000 messages, was chosen through a weighted random selection process that prioritized messages from larger groups to ensure a representative sample across different user populations. The second subset, consisting of 6,000 messages, was built using a term-search algorithm designed to identify messages containing specific terms frequently associated with toxic content. This approach allowed us to balance randomness with targeted selection when analyzing message dynamics within the platform. The targeted selection aimed to compile approximately 1,200 messages for each category, deliberately including both hate speech and neutral messages to avoid term-associated bias in our dataset.
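A minimal sketch of this two-part selection is shown below. The `messages`, `group_sizes`, and `seed_terms` inputs are assumptions standing in for our message store, the per-group message counts, and the curated term list (which we do not reproduce here).

```python
import numpy as np

def build_annotation_sample(messages, group_sizes, seed_terms,
                            n_random=10_000, n_targeted=6_000, seed=42):
    """Two-part corpus: a group-size-weighted random subset plus a
    term-targeted subset, mirroring the selection described above.
    `messages` is a list of (group_id, text) pairs; `group_sizes` maps a
    group_id to its message count; `seed_terms` is the curated term list."""
    rng = np.random.default_rng(seed)
    # Sampling probability proportional to the size of the message's group.
    weights = np.array([group_sizes[g] for g, _ in messages], dtype=float)
    weights /= weights.sum()
    idx = rng.choice(len(messages), size=n_random, replace=False, p=weights)
    random_subset = [messages[i] for i in idx]
    # Targeted subset: messages containing any of the seed terms.
    targeted = [m for m in messages
                if any(t in m[1].lower() for t in seed_terms)][:n_targeted]
    return random_subset + targeted
```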
This sample of messages was divided into 8 batches of 2,000 examples each. We recruited 8 undergraduate and graduate Computer Science students as annotators and gave each of them 2 batches, so that every message was classified twice in order to ensure concordance among the annotators. The students were instructed to label each message with one of five available categories: Racism, Homophobia, Sexism, Body-Shaming, or Ideology. A message was considered toxic if it fit into at least one of these categories. For annotation purposes, we defined each category and how to classify it in a document for all the students to follow.

We applied Cohen's Kappa statistic [Cohen 1960] to measure inter-rater agreement. The kappa values are shown in Table 3. These values demonstrate a substantial level of agreement among the raters, reinforcing the validity of our classification terms and the effectiveness of our guidelines.

Figure 2: Our hate speech detection model.

Category        Kappa Value
Racism          0.97
Homophobia      0.95
Sexism          0.96
Body-Shaming    0.95
Ideology        0.92

Table 3: Inter-Rater Agreement (Cohen's Kappa).

Category        Acc.    Prec.   Rec.    F1 Score
Racism          0.989   0.889   0.981   0.933
Homophobia      0.985   0.924   0.930   0.927
Sexism          0.987   0.940   0.886   0.913
Body-Shaming    0.985   0.887   0.993   0.937
Ideology        0.993   0.948   0.958   0.953

Table 4: BERT metrics for the prediction of each class, validated on our test set.
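For reference, the per-category agreement reported in Table 3 can be computed with scikit-learn's implementation of Cohen's kappa. The dictionary layout of the two annotators' labels is an assumption for illustration:

```python
from sklearn.metrics import cohen_kappa_score

CATEGORIES = ["Racism", "Homophobia", "Sexism", "Body-Shaming", "Ideology"]

def agreement_per_category(labels_a, labels_b):
    """Cohen's kappa per category. `labels_a` and `labels_b` map a category
    name to the two annotators' binary label lists for the same messages."""
    return {c: cohen_kappa_score(labels_a[c], labels_b[c]) for c in CATEGORIES}
```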
Classification Model

To perform a robust, large-scale analysis of toxicity and hateful messages on the Discord network, we developed a fine-tuned version of a pre-trained BERT model [Devlin et al. 2019]. BERT, an encoder-only transformer model for NLP, typically undergoes extensive self-supervised pre-training, engaging in tasks such as masked word prediction across large text corpora. This pre-training process lays a solid foundation of linguistic knowledge, which can be refined for specific tasks through fine-tuning.

Our decision to train a custom model was driven by initial tests showing that off-the-shelf solutions like the Perspective API and PySentimiento [Pérez, Giudici, and Luque 2021] underperformed on our dataset. With a finely labeled dataset and additional fine-tuning, we aimed to develop a model specifically tailored to the unique context of Discord and the specific types of messages prevalent on this platform, since known solutions, such as base BERT, only generalize across similar datasets [Fortuna, Soler-Company, and Wanner 2021].

The initial training of the PySentimiento hate speech model started from the BERTabaporu weights [da Costa et al. 2023], derived from self-supervised pre-training on 238 million tweets in Portuguese. The model was then fine-tuned by the PySentimiento authors for the multi-class hate speech classification task using 6,000 tweets in Portuguese. For the construction of our classifier, we leveraged the weights provided by the PySentimiento framework, which includes a BERT model specifically trained for hate speech detection across five categories: racism, homophobia, body shaming, ideology, and sexism.

In our research, we conducted further fine-tuning of the PySentimiento hate speech model with the same classes, but trained on our labeled dataset of Discord messages. This procedure refined the model to better suit our specific classification needs, allowing us to capture nuanced characteristics inherent in our data.

For training our model, we randomly divided the dataset into three non-overlapping sets: training, validation, and test, comprising 81%, 9%, and 10% of the labeled dataset, respectively. The fine-tuning was executed using the PyTorch library [Paszke et al. 2019] with the AdamW optimizer [Loshchilov and Hutter 2017] on the training set, with a learning rate of 5 × 10⁻⁶. The model was trained on one NVIDIA RTX 4090 GPU, and an early stopping approach was employed based on the validation set. Furthermore, classification thresholds were established from the output probabilities of the model, defined per class as the threshold that yielded the best F1-score on our validation set.
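A condensed sketch of this setup, using the Hugging Face transformers and PyTorch APIs, is shown below. The checkpoint identifier, the multi-label head configuration, and the data loader layout are assumptions for illustration; the early stopping loop is omitted.

```python
import numpy as np
import torch
from sklearn.metrics import f1_score
from torch.optim import AdamW
from transformers import AutoModelForSequenceClassification, AutoTokenizer

LABELS = ["racism", "homophobia", "sexism", "body_shaming", "ideology"]
MODEL_ID = "pysentimiento/bertabaporu-pt-hate-speech"  # assumed checkpoint id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_ID,
    num_labels=len(LABELS),
    problem_type="multi_label_classification",  # sigmoid outputs + BCE loss
    ignore_mismatched_sizes=True,               # in case the head differs
)
optimizer = AdamW(model.parameters(), lr=5e-6)  # learning rate from the paper

def train_one_epoch(train_loader):
    """One pass over the labeled Discord messages. `train_loader` is an
    assumed DataLoader yielding {'text': list[str], 'labels': FloatTensor}."""
    model.train()
    for batch in train_loader:
        enc = tokenizer(batch["text"], truncation=True, padding=True,
                        return_tensors="pt")
        loss = model(**enc, labels=batch["labels"]).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

def pick_thresholds(val_probs, val_labels):
    """Per-class decision threshold maximizing F1 on the validation set,
    as described above. Both arguments are (N, 5) numpy arrays."""
    grid = np.linspace(0.05, 0.95, 19)
    return [max(grid, key=lambda t: f1_score(val_labels[:, c],
                                             val_probs[:, c] >= t))
            for c in range(len(LABELS))]
```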
Our classifier, designed for detecting Portuguese hate speech, demonstrated highly reliable prediction metrics on the test set, as detailed in Table 4. These results showcase the model's capability to accurately identify hate speech within Brazilian Discord communities and to discern subtle variations in language. This performance ensures that our findings are robust and can contribute significantly to more effective moderation strategies, potential policy implementations, and the broader discourse on maintaining community health on social platforms.

Our model serves as a foundational step towards understanding the scale of hate speech on the platform and paves the way for future studies to explore the implications of these findings in broader social, psychological, and regulatory contexts.

Results

This section presents our results and findings.
Figure 3: Confidence interval plot of hate speech between deleted and non-deleted users.

Figure 4: Age distribution histogram across all groups.


User Moderation

Discord's Terms of Service claim the use of multiple moderation tools, including machine learning technologies for identifying malicious content, complemented by human-led investigations. Despite these measures, our study raises concerns about the sufficiency of this moderation. The platform's terms of service state that Discord has the right, but not the obligation, to review user reports and block or remove content [Discord 2023b], placing heavy reliance on the active and responsible participation of group administrators. Ineffective group management can lead to unmoderated spaces, resulting in toxic environments.

This section explores the correlation between hate speech incidents among active users and those who have been deleted. The purpose of this analysis is to determine whether there are notable differences in the reasons for account deletions. We hypothesize that the absence of statistically significant disparities would indicate one of two scenarios: (i) most users delete their own accounts, rather than being banned by Discord's administrative actions; or (ii) if the platform does ban users, it is not for hate speech related reasons.

Figure 3 shows confidence intervals for the percentage of hate speech among all messages across the groups, as well as among those originating specifically from users whose accounts were subsequently deleted. Our analysis substantiates the initial hypothesis: the observed results are remarkably consistent with expected trends. Despite some variances, none are considerable enough to indicate effective moderation by the platform, since all the intervals intersect. This assertion is supported by the confidence intervals outlined previously: the percentages of hate speech from deleted users fall within the 90% expected range.

This analysis employed bootstrap tests with 100,000 random subsets, about 100 times the number of groups, to estimate the group metrics based on a weighted mean that reflects the size of each group. This approach aims to mimic real-world conditions and mitigate potential biases, thereby providing a reliable estimate of expected outcomes.
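A minimal sketch of this weighted bootstrap, assuming per-group counts of hateful and total messages as inputs:

```python
import numpy as np

def bootstrap_ci(hate_counts, totals, n_boot=100_000, alpha=0.10, seed=0):
    """Resample groups with replacement and recompute the message-weighted
    hate speech rate each time; return the 90% confidence interval."""
    rng = np.random.default_rng(seed)
    hate_counts = np.asarray(hate_counts, dtype=float)  # hateful msgs per group
    totals = np.asarray(totals, dtype=float)            # all msgs per group
    n = len(totals)
    rates = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)                       # resampled groups
        rates[b] = hate_counts[idx].sum() / totals[idx].sum()  # weighted mean
    lo, hi = np.quantile(rates, [alpha / 2, 1 - alpha / 2])
    return lo, hi
```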
Discord claims to have an automated system for user deletion based on user reports [Discord 2023a]. However, our findings indicate no substantial difference between the categories, calling into question the effectiveness of said AI moderation. This is particularly concerning given that 2.5% of the messages in our dataset contain hate speech. This rate is alarming, especially in Brazil, where racism and homophobia are criminal offenses punishable by imprisonment under Brazilian law [Brasil 1989].

These findings are further corroborated by a survey [Discord 2024] revealing that, during the fourth quarter of 2023, only 3,313 accounts globally were banned for hateful conduct out of more than 200,000 total bans. This discrepancy highlights a significant oversight in Discord's moderation practices, particularly considering that within our dataset alone, focusing on Brazilian Discord, approximately 16 million messages were identified as hateful. This vast number of unchecked hateful messages could potentially contribute to the types of severe incidents reported.

Age Demographics and Implications

Understanding the age demographics of its users is pivotal, particularly in light of Brazilian media reports frequently focusing on issues affecting teenagers and children on the platform. However, Discord does not directly provide users' age data, presenting a challenge for accurate demographic analysis.

The age distribution of Discord users is illustrated in Figures 4 and 5. In Figure 4, ages are grouped into six-year intervals. To provide a clearer view of the younger demographics, Figure 5 focuses specifically on users up to the age of 23, highlighting the predominant age group on this social media platform.

It is evident that a majority of Discord's user base is underage. Notably, thousands of users are below the age of thirteen. This observation is concerning not only because it suggests a large presence of young individuals on the platform, but also because it implies that many users falsify their age upon registration, contravening the platform's Terms of Use, which prohibit account creation by individuals under thirteen years of age.

The graphs also shed light on a possible reason why Discord has been overlooked in research and remains relatively obscure to many. The number of users declines sharply after the age of 20, indicating that adults are generally less aware of the activities and dynamics within the platform. This situation allows younger users to engage more freely, often without adequate moderation. This raises further concerns, especially considering that Discord largely relies on user-on-user moderation, which might mean that young users could be moderating their peers, an arrangement that poses several ethical questions.
Hate Speech Category    % of Total Messages
Sexism                  0.81%
Homophobia              0.44%
Ideology                0.58%
Body-Shaming            0.38%
Racism                  0.30%

Table 5: Percentage of total messages belonging to each hate speech category.

Figure 5: Age distribution histogram across all groups between 7 and 23 years.

To illustrate the dimension of our hypotheses, we conducted an experiment prior to the analysis to assess how easily underage users could engage with sensitive content. It aimed to provide indicative evidence rather than definitive proof. This segment of our study did not involve direct interaction with groups or users, but rather focused on observational data that had already been collected under normal circumstances via the API.

Drawing upon our prior understanding of groups related to adult content, such as those categorized as online dating or pornography, we randomly selected 100 groups from this pool. Then, we created accounts with the minimum age allowed by the platform, 13 years old.

We then proceeded to examine three critical aspects:
1. Whether we were able to enter these groups.
2. Access to and visibility of +18 text channels upon entering a group.
3. Access to and availability of +18 voice channels upon entering a group.

Out of the 100 randomly selected groups, 71 allowed entry. Among these accessible groups, 53 permitted access to +18 text channels, while 68 allowed access to +18 voice channels.

This observation raises significant concerns, as it indicates that underage users can potentially engage in voice calls with older individuals within groups designed for adults. Discord appears to lack adequate safeguards to address this issue, leading to serious child exposure.

Figure 6: Violin plot of users' age in group categories.

Category Analysis

As highlighted by recent news reports in Brazil, sexism is explicitly pointed out as prevalent within the Brazilian community, treated as a disturbingly normalized issue. To quantitatively assess this scenario, we utilized our classifier to analyze all messages in our dataset. The results, depicted in Table 5, illustrate the most common forms of toxicity identified in our analysis.

Building on the alarming findings of those reports, the quantitative data illustrate the gravity of sexism as the most prevalent form of toxicity on Discord within the Brazilian community: sexism significantly surpasses other types of toxic behavior, as seen in Table 5.

The high incidence of other forms of toxicity, such as homophobia, racism, and attacks based on body image and ideology, indicates a broader issue of aggressive and exclusionary behavior in online interactions. This issue is particularly alarming considering that 8 out of the 12 identified clusters primarily consist of users under the age of 18, notably including those involved in pornography and online dating, as seen in Figure 6. Furthermore, the four clusters with an average member age over 18 comprise only 4.59% of the total user base.

Categorizing groups reveals distinct patterns of hate speech prevalence, with the 'Pornography' category exhibiting the highest occurrence of sexism, while 'Politics' is intuitively characterized by high ideology-related toxicity, as seen in Figure 7.

It is noteworthy that the 'Games' category, which represents Discord's original purpose, exhibits a relatively low incidence of hate speech. This observation emphasizes the fact that many of the platform's most significant issues manifest in less visible or non-gaming contexts. Therefore, an analysis that focuses on Discord as merely a gaming network might fail to capture these broader, more pervasive problems.
Figure 7: Heatmap of toxicity in group categories.

Figure 8: Trends in users' toxicity.

We could also correlate users' ages with the types of hate they are most engaged with, whether reproducing it or being exposed to it. As shown in Figure 6 and Figure 7, 'Pornography', the category with the most pronounced hate speech occurrences, has a concerningly high incidence of younger audiences. This analysis underscores the critical importance of addressing safety concerns on platforms where children and teenagers engage with content and communities meant for adults. It also highlights a potential gap in parental oversight and the platform's responsibility to protect younger users from exposure to harmful content.

User Path

Users may enter various groups and interact with many other users. To effectively monitor user behavior, it is essential to first comprehend the different environments they may encounter. While it is challenging to pinpoint the exact triggers of toxic messages, analyzing patterns can provide valuable insights.

We conducted a detailed examination of user behavior over discrete intervals. Messages from each user were divided into five equal segments, each representing one-fifth of their total messages. This segmentation allows us to track the evolution of discourse across different periods for users who posted at least one toxic message. After evaluating all segments for a user, the pattern of behavior was analyzed to determine the trend: (i) if the toxicity rate of messages increased in each segment, the user was categorized under 'Increasing Toxicity'; (ii) conversely, if the trend showed a decrease in toxicity over time, the user fell into the 'Decreasing Toxicity' category; (iii) if no clear trend was observed and the user's behavior varied between segments, they were placed in the 'Inconsistent Toxicity' category.
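The following sketch illustrates this categorization; the exact handling of ties between segments is our assumption.

```python
import numpy as np

def toxicity_trend(is_toxic):
    """Classify a user's trajectory from chronologically ordered 0/1 flags
    (1 = toxic message), split into five equal segments."""
    segments = np.array_split(np.asarray(is_toxic, dtype=float), 5)
    rates = np.array([seg.mean() for seg in segments if seg.size > 0])
    diffs = np.diff(rates)
    if np.all(diffs >= 0) and diffs.sum() > 0:   # toxicity rises segment by segment
        return "Increasing Toxicity"
    if np.all(diffs <= 0) and diffs.sum() < 0:   # toxicity falls segment by segment
        return "Decreasing Toxicity"
    return "Inconsistent Toxicity"
```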
Based on the results shown in Figure 8, the majority of users fall under the 'Inconsistent' category, suggesting a sporadic presence of toxic behavior that does not follow a clear increasing or decreasing trend. This irregularity could indicate that users' behavior is context-dependent, influenced by external factors, or that Discord experiences a highly dynamic discourse environment where users oscillate between toxic and non-toxic interactions over time. Despite that, a relatively high number of users exhibit an increase in toxic behavior. The disparity between this group and those exhibiting decreasing toxicity highlights a potential escalation in negative discourse within the platform.

By analyzing Figures 7 and 9, we observe a discernible pattern of user migration among the 'Pornography', 'Anime', and 'Online Dating' clusters. This trend is particularly troubling, as it suggests a trajectory within these communities that may influence user behavior. The interconnectedness shown in the network graph, along with the heatmap data, highlights significant movements between these clusters, indicating that user behavior is not confined to isolated groups but spans multiple ones, potentially influencing and reinforcing harmful ideologies and behaviors.

To explore the impact of toxic messages on user behavior within Brazilian Discord communities, we conducted an analysis using a novel graphical representation. This graph, illustrated in Figure 10, tracks the time interval between an initial toxic message and subsequent toxic responses. The graph demonstrates a striking pattern: about 60% of follow-up toxic messages occur within one minute of the initial message. This insight highlights the rapid propagation of toxicity within digital communication channels, suggesting a potent, immediate influence exerted by the initial toxic message on subsequent interactions.

Studies of similar interactions on platforms like Twitter and Facebook have shown that toxic messages tend to provoke immediate follow-up responses, indicating a reactive and contagious nature of toxicity [Saveski, Roy, and Roy 2021]. This phenomenon, also evident in the Discord groups, suggests that once a toxic message is introduced into a conversation, it significantly increases the likelihood of an immediate toxic response.
Figure 10: Time interval to the next hate speech message after one occurs.
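The quantity behind Figure 10 can be computed as in the sketch below, assuming the toxic messages of a conversation are available with their timestamps:

```python
import numpy as np

def toxic_followup_stats(toxic_timestamps):
    """Gaps (in seconds) between consecutive toxic messages in a channel,
    given their datetime timestamps; the share of gaps under one minute
    corresponds to the quantity plotted in Figure 10."""
    ts = sorted(toxic_timestamps)
    gaps = np.array([(b - a).total_seconds() for a, b in zip(ts, ts[1:])])
    share_within_minute = float((gaps <= 60).mean()) if gaps.size else float("nan")
    return gaps, share_within_minute
```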

Figure 9: Weighted directed graph of user migration between group clusters. The arrow thickness and direction represent the amount of flow going from one cluster to another.

Conclusion

This study has critically evaluated the transformation of Discord from a platform originally designed for gamers into a network that fails to adequately prevent hate speech and toxic behaviors in Brazilian online communities. Through the analysis of over 700 million messages from 1,232 public groups, we addressed several pivotal research questions:

RQ1: Effectiveness of Discord's Moderation. The findings reveal that Discord's current moderation tools and community-led governance structures are insufficient for Brazilian groups. Despite the platform's attempts to deploy advanced moderation technologies and community guidelines, enforcement remains inconsistent, often unable to contain the proliferation of harmful behaviors.

RQ2: Prevalence and Targets of Hate Speech. Our analysis pointed to sexism as the main form of hate speech and indicated that certain groups, particularly those related to politics and adult content, are hotspots for toxic interactions. The persistence of harmful content in these groups emphasizes the need for targeted moderation strategies that are sensitive to the category and the age demographics of each group.

RQ3: User Profile Evolution in Relation to Toxic Environments. Analysis of users' behavior showed how they move across different categories of groups, revealing worrying migration patterns. We could also infer how toxic messages instigate more toxicity, especially in unmoderated environments, as indicated by the trends and the intervals between hateful messages.

Implications and Recommendations. The urgent need for enhanced moderation strategies is evident. Discord must adopt more sophisticated, real-time automated tools to detect and prevent hate speech and other harmful content more effectively, especially for underage users. Moreover, there is a critical need for rigorous enforcement of community guidelines and a reevaluation of the reliance on community moderators as the front-line defense against escalating toxicity.

Future Research Directions. Further research should explore the development of context-aware moderation technologies that can dynamically adapt to the complexities of language, culture, and interaction patterns unique to Brazilian Discord communities. Additionally, longitudinal studies could investigate the long-term impacts of exposure to toxic environments on user behavior, contributing to more effective digital citizenship and online community management strategies.

Our study contributes to the broader discourse on the responsibilities of social media platforms in regulating content and behavior, highlighting specific challenges and opportunities within the Brazilian context. This research not only deepens our understanding of digital communication dynamics on the platform but also provides a foundation for future interventions to foster healthier online communities.

Acknowledgments

The research leading to these results has been funded by CNPq, CAPES, and FAPEMIG.

Ethical Statement

The data in this paper is derived from the Discord API. It contains data from public groups and channels. No user or group identifier is published within this paper.

References
Aliapoulios, M.; Bevensee, E.; Blackburn, J.; De Cristofaro, E.; Stringhini, G.; and Zannettou, S. 2021. An Early Look at the Parler Online Social Network. In ICWSM.

Alkomah, F.; and Ma, X. 2022. A Literature Review of Textual Hate Speech Detection Methods and Datasets. Information, 13(6).

Arun, C. 2019. On WhatsApp, Rumours, and Lynchings. Economic & Political Weekly, 54(6): 30–35.

Bakshy, E.; Rosenn, I.; Marlow, C.; and Adamic, L. 2012. The Role of Social Networks in Information Diffusion. arXiv:1201.4145.

BBC. 2020. Facebook bans QAnon conspiracy theory accounts across all platforms. https://bbc.in/3hqARxH.

BBC. 2021. Twitter suspends 70,000 accounts linked to QAnon. https://bbc.in/2RlDyGn.

Brasil. 1989. Lei Nº 7.716, de 5 de Janeiro de 1989. Diário Oficial da República Federativa do Brasil.

Chandrasekharan, E.; Samory, M.; Srinivasan, A.; and Gilbert, E. 2017. The Bag of Communities: Identifying Abusive Behavior Online with Preexisting Internet Data. In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems, CHI '17, 3175–3187. New York, NY, USA: Association for Computing Machinery. ISBN 9781450346559.

Cheng, J.; Danescu-Niculescu-Mizil, C.; and Leskovec, J. 2016. Antisocial Behavior in Online Discussion Communities. arXiv:1504.00680.

Chiril, P.; Pamungkas, E. W.; Benamara, F.; Moriceau, V.; and Patti, V. 2021. Emotionally Informed Hate Speech Detection: A Multi-target Perspective. Cognitive Computation.

Cohen, J. 1960. A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20(1): 37–46.

da Costa, P.; Pavan, M.; dos Santos, W.; da Silva, S.; and Paraboni, I. 2023. BERTabaporu: Assessing a Genre-specific Language Model for Portuguese NLP. In Proceedings of the 14th International Conference on Recent Advances in Natural Language Processing, 217–223. Varna, Bulgaria: INCOMA Ltd.

Davidson, T.; Warmsley, D.; Macy, M.; and Weber, I. 2017. Automated Hate Speech Detection and the Problem of Offensive Language. arXiv:1703.04009.

Devlin, J.; Chang, M.-W.; Lee, K.; and Toutanova, K. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv:1810.04805.

Disboard. 2023. Disboard - Public Discord Server List. https://disboard.org/. Accessed: 2023-08-01.

Discord. 2023a. Discord Privacy Policy. https://discord.com/privacy. Accessed: 2023-08-01.

Discord. 2023b. Discord Terms of Service. https://discord.com/terms. Accessed: 2023-08-01.

Discord. 2024. Number of Discord accounts removed from the platform as of the 4th quarter 2023, by violation. https://www.statista.com/statistics/1286859/discord-banned-accounts-by-violation/. Accessed: 2024-09-04.

Dosono, B.; and Semaan, B. 2019. Moderation Practices as Emotional Labor in Sustaining Online Communities: The Case of AAPI Identity Work on Reddit. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, CHI '19, 1–13. New York, NY, USA: Association for Computing Machinery. ISBN 9781450359702.

Efstratiou, A.; Blackburn, J.; Caulfield, T.; Stringhini, G.; Zannettou, S.; and De Cristofaro, E. 2023. Non-polar Opposites: Analyzing the Relationship between Echo Chambers and Hostile Intergroup Interactions on Reddit. Proceedings of the International AAAI Conference on Web and Social Media, 17(1): 197–208.

Fillies, J.; Peikert, S.; and Paschke, A. 2024. Hateful Messages: A Conversational Data Set of Hate Speech Produced by Adolescents on Discord. In Haber, P.; Lampoltshammer, T. J.; and Mayr, M., eds., Data Science—Analytics and Applications, 37–44. Cham: Springer Nature Switzerland. ISBN 978-3-031-42171-6.

Fortuna, P.; Soler-Company, J.; and Wanner, L. 2021. How well do hate speech, toxicity, abusive and offensive language classification models generalize across datasets? Information Processing & Management, 58(3): 102524.

Gilbert, S. A. 2020. "I run the world's largest historical outreach project and it's on a cesspool of a website." Moderating a Public Scholarship Site on Reddit: A Case Study of r/AskHistorians. Proc. ACM Hum.-Comput. Interact., 4(CSCW1).

Heslep, D. G.; and Berge, P. 2024. Mapping Discord's darkside: Distributed hate networks on Disboard. New Media & Society, 26(1): 534–555.

Hoseini, M.; Melo, P.; Benevenuto, F.; Feldmann, A.; and Zannettou, S. 2023. On the Globalization of the QAnon Conspiracy Theory Through Telegram. In Proceedings of the 15th ACM Web Science Conference 2023, WebSci '23, 75–85. New York, NY, USA: Association for Computing Machinery. ISBN 9798400700897.

Hoseini, M.; Melo, P.; Júnior, M.; Benevenuto, F.; Chandrasekaran, B.; Feldmann, A.; and Zannettou, S. 2020. Demystifying the Messaging Platforms' Ecosystem Through the Lens of Twitter. In Proceedings of the ACM Internet Measurement Conference, IMC '20, 345–359. New York, NY, USA: Association for Computing Machinery. ISBN 9781450381383.

Jiang, J. A.; Kiene, C.; Middler, S.; Brubaker, J. R.; and Fiesler, C. 2019. Moderation Challenges in Voice-Based Online Communities on Discord. Proc. ACM Hum.-Comput. Interact., 3(CSCW).

Jiang, J. A.; Nie, P.; Brubaker, J. R.; and Fiesler, C. 2023. A Trade-off-centered Framework of Content Moderation. ACM Trans. Comput.-Hum. Interact., 30(1).

Júnior, M.; Melo, P.; Kansaon, D.; Mafra, V.; Sa, K.; and Benevenuto, F. 2022. Telegram Monitor: Monitoring Brazilian Political Groups and Channels on Telegram. In Proceedings of the 33rd ACM Conference on Hypertext and Social Media, HT '22, 228–231. New York, NY, USA: Association for Computing Machinery. ISBN 9781450392334.

La Morgia, M.; Mei, A.; Mongardini, A. M.; and Wu, J. 2021. Uncovering the Dark Side of Telegram: Fakes, Clones, Scams, and Conspiracy Movements. arXiv:2111.13530.

Lacher, L.; and Biehl, C. 2018. Using Discord to Understand and Moderate Collaboration and Teamwork (Abstract Only). In Proceedings of the 49th ACM Technical Symposium on Computer Science Education, SIGCSE '18, 1107. New York, NY, USA: Association for Computing Machinery. ISBN 9781450351034.

Lees, A.; Tran, V. Q.; Tay, Y.; Sorensen, J.; Gupta, J.; Metzler, D.; and Vasserman, L. 2022. A New Generation of Perspective API: Efficient Multilingual Character-level Transformers. In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 3197–3207.

Lima, L.; Reis, J. C.; Melo, P.; Murai, F.; Araujo, L.; Vikatos, P.; and Benevenuto, F. 2018. Inside the Right-leaning Echo Chambers: Characterizing Gab, an Unmoderated Social System. In ASONAM.

Lima, L.; Reis, J. C. S.; Melo, P.; Murai, F.; and Benevenuto, F. 2020. Characterizing (Un)moderated Textual Data in Social Systems. In 2020 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), 430–434.

Loshchilov, I.; and Hutter, F. 2017. Decoupled Weight Decay Regularization. arXiv:1711.05101.

Melo, P.; Messias, J.; Resende, G.; Garimella, K.; Almeida, J.; and Benevenuto, F. 2019. WhatsApp Monitor: A Fact-checking System for WhatsApp. In Proceedings of the International AAAI Conference on Web and Social Media, volume 13, 676–677.

Obermaier, M.; and Schmuck, D. 2022. Youths as targets: factors of online hate speech victimization among adolescents and young adults. Journal of Computer-Mediated Communication, 27(4): zmac012.

Olteanu, A.; Castillo, C.; Boy, J.; and Varshney, K. 2018. The Effect of Extremist Violence on Hateful Speech Online. Proceedings of the International AAAI Conference on Web and Social Media, 12(1).

Paszke, A.; Gross, S.; Massa, F.; Lerer, A.; Bradbury, J.; Chanan, G.; Killeen, T.; Lin, Z.; Gimelshein, N.; Antiga, L.; et al. 2019. PyTorch: An Imperative Style, High-performance Deep Learning Library. Advances in Neural Information Processing Systems, 32.

Pérez, J. M.; Giudici, J. C.; and Luque, F. 2021. pysentimiento: A Python Toolkit for Sentiment Analysis and SocialNLP Tasks. arXiv:2106.09462.

Resende, G.; Melo, P.; Sousa, H.; Messias, J.; Vasconcelos, M.; Almeida, J.; and Benevenuto, F. 2019. (Mis)Information Dissemination in WhatsApp: Gathering, Analyzing and Countermeasures. In The World Wide Web Conference, WWW '19, 818–828. New York, NY, USA: Association for Computing Machinery. ISBN 9781450366748.

Rieger, D.; Kümpel, A. S.; Wich, M.; Kiening, T.; and Groh, G. 2021. Assessing the Extent and Types of Hate Speech in Fringe Communities: A Case Study of Alt-Right Communities on 8chan, 4chan, and Reddit. Social Media + Society, 7(4): 20563051211052906.

Saha, P.; Das, M.; Mathew, B.; and Mukherjee, A. 2023. Hate Speech: Detection, Mitigation and Beyond. In Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining, WSDM '23, 1232–1235. New York, NY, USA: Association for Computing Machinery. ISBN 9781450394079.

Saha, P.; Mathew, B.; Garimella, K.; and Mukherjee, A. 2021. "Short is the Road that Leads from Fear to Hate": Fear Speech in Indian WhatsApp Groups. In Proceedings of the Web Conference 2021, WWW '21, 1110–1121. New York, NY, USA: Association for Computing Machinery. ISBN 9781450383127.

Saveski, M.; Roy, B.; and Roy, D. 2021. The Structure of Toxic Conversations on Twitter. In Proceedings of the Web Conference 2021, 1086–1097.

Schmitz, M.; Muric, G.; and Burghardt, K. 2022. Quantifying How Hateful Communities Radicalize Online Users. In 2022 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), 139–146. Los Alamitos, CA, USA: IEEE Computer Society.

Silva, L.; Mondal, M.; Correa, D.; Benevenuto, F.; and Weber, I. 2016. Analyzing the Targets of Hate in Online Social Media. In Proceedings of the International AAAI Conference on Web and Social Media, volume 10, 687–690.

Statista. 2023. Number of Discord user base per year. https://www.statista.com/statistics/1367922/discord-registered-users-worldwide/. Accessed: 2023-08-01.

Tardaguila, C.; Benevenuto, F.; and Ortellado, P. 2018. Fake News Is Poisoning Brazilian Politics. WhatsApp Can Stop It.

Thach, H.; Mayworm, S.; Delmonaco, D.; and Haimson, O. (In)visible Moderation: A Digital Ethnography of Marginalized Users and Content Moderation on Twitch and Reddit. New Media & Society, 0(0): 14614448221109804.

van der Sanden, R.; Wilkins, C.; Rychert, M.; and Barratt, M. J. 2022. The Use of Discord Servers to Buy and Sell Drugs. Contemporary Drug Problems, 49(4): 453–477.

Wang, E. L.; Luceri, L.; Pierri, F.; and Ferrara, E. 2023. Identifying and Characterizing Behavioral Classes of Radicalization within the QAnon Conspiracy on Twitter. arXiv:2209.09339.

Watanabe, H.; Bouazizi, M.; and Ohtsuki, T. 2018. Hate Speech on Twitter: A Pragmatic Approach to Collect Hateful and Offensive Expressions and Perform Hate Speech Detection. IEEE Access, 6: 13825–13835.

Wiles, A. M.; and Simmons, S. L. 2022. Establishment of an Engaged and Active Learning Community in the Biology Classroom and Lab with Discord. Journal of Microbiology & Biology Education, 23(1): e00334-21.

Zadrozny, B.; and Collins, B. 2018. Reddit bans QAnon subreddits after months of violent threats. https://nbcnews.to/3wgGOBR.

Zadrozny, B.; and Collins, B. 2020. YouTube bans QAnon. https://nbcnews.to/3ybJC4H.

Zannettou, S. 2021. "I Won the Election!": An Empirical Analysis of Soft Moderation Interventions on Twitter. arXiv:2101.07183.
Paper Checklist

1. For most authors...
(a) Would answering this research question advance science without violating social contracts, such as violating privacy norms, perpetuating unfair profiling, exacerbating the socio-economic divide, or implying disrespect to societies or cultures? Yes
(b) Do your main claims in the abstract and introduction accurately reflect the paper's contributions and scope? Yes
(c) Do you clarify how the proposed methodological approach is appropriate for the claims made? Yes
(d) Do you clarify what are possible artifacts in the data used, given population-specific distributions? NA
(e) Did you describe the limitations of your work? Yes
(f) Did you discuss any potential negative societal impacts of your work? NA
(g) Did you discuss any potential misuse of your work? NA
(h) Did you describe steps taken to prevent or mitigate potential negative outcomes of the research, such as data and model documentation, data anonymization, responsible release, access control, and the reproducibility of findings? Yes
(i) Have you read the ethics review guidelines and ensured that your paper conforms to them? Yes

2. Additionally, if your study involves hypotheses testing...
(a) Did you clearly state the assumptions underlying all theoretical results? Yes
(b) Have you provided justifications for all theoretical results? Yes
(c) Did you discuss competing hypotheses or theories that might challenge or complement your theoretical results? Yes
(d) Have you considered alternative mechanisms or explanations that might account for the same outcomes observed in your study? Yes
(e) Did you address potential biases or limitations in your theoretical framework? Yes
(f) Have you related your theoretical results to the existing literature in social science? Yes
(g) Did you discuss the implications of your theoretical results for policy, practice, or further research in the social science domain? Yes

3. Additionally, if you are including theoretical proofs...
(a) Did you state the full set of assumptions of all theoretical results? NA
(b) Did you include complete proofs of all theoretical results? NA

4. Additionally, if you ran machine learning experiments...
(a) Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? No, because we are unsure about releasing data from the Discord API, since it could contain data that harms the moral integrity of various groups and society in general.
(b) Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? Yes, see "Hate Speech Classification Model" section
(c) Did you report error bars (e.g., with respect to the random seed after running experiments multiple times)? NA
(d) Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? Yes
(e) Do you justify how the proposed evaluation is sufficient and appropriate to the claims made? Yes
(f) Do you discuss what is "the cost" of misclassification and fault (in)tolerance? Yes

5. Additionally, if you are using existing assets (e.g., code, data, models) or curating/releasing new assets, without compromising anonymity...
(a) If your work uses existing assets, did you cite the creators? Yes, all the methods used were correctly referenced in the paper
(b) Did you mention the license of the assets? NA
(c) Did you include any new assets in the supplemental material or as a URL? NA
(d) Did you discuss whether and how consent was obtained from people whose data you're using/curating? NA
(e) Did you discuss whether the data you are using/curating contains personally identifiable information or offensive content? Yes, see Ethical Statement
(f) If you are curating or releasing new datasets, did you discuss how you intend to make your datasets FAIR? NA
(g) If you are curating or releasing new datasets, did you create a Datasheet for the Dataset? NA

6. Additionally, if you used crowdsourcing or conducted research with human subjects, without compromising anonymity...
(a) Did you include the full text of instructions given to participants and screenshots? NA
(b) Did you describe any potential participant risks, with mentions of Institutional Review Board (IRB) approvals? NA
(c) Did you include the estimated hourly wage paid to participants and the total amount spent on participant compensation? NA
(d) Did you discuss how data is stored, shared, and deidentified? NA