Professional Documents
Culture Documents
Automatic Hate Speech Detection On Social Media: A Brief Survey
Automatic Hate Speech Detection On Social Media: A Brief Survey
A Brief Survey
Ahlam Alrehili
Computer Science Department
Taibah University
El-Madina El-Monawara, Kingdom of Saudi Arabia
amrehili@gmail.com
Abstract— Due to the advancement in technology and the There is a huge difference between free speech and
explosion of the information age, people communicate with hate speech, but we can consider the hate speech as a type of
each other indirectly via using the online social networks free speech. Free speech supports community or individuals
(OSNs), such as Facebook Snapchat, Instagram, and Twitter. to express their ideas and opinions without fear of sanction,
Users of OSNs can post anything without any control or
censorship, and retaliation [4]. Freedom of speech is required
constraint of the content, which leads to increase in spreading
of hateful and offensive speech among users, thus resulting in for democratic rights, and it ensures the autonomous
an increase in crimes, murder, and terrorism. Hence, this enjoyment of the individual. Hence, hate speech is contrary
paper provides a survey and state of the art natural language to free speech and violates the fundamental rights of
processing (NLP) technique that is used in automatic detection community and individuals as well as reduces the
of the hate speech on OSNs, such as dictionaries, bag-of-words, permissible limits in free speech.
N-gram etc. In the recent years, the automatic detection of hate
speech has become a hot and popular topic in the research
Keywords—hate speech, OSN, online social network,
field owing to several reasons. European Union Commission,
automatic detection, offensive, hateful dictionaries, bag-of-
words, N-gram.
in recent years, has contributed to reducing hate speech by
incorporating different initiatives, such as designing
programs aimed at fighting the hate speech. Moreover, it
I. INTRODUCTION imposed Microsoft, Twitter, YouTube, and Facebook to sign
In the past, people used to communicate with each a European Union hate speech code that necessitates to
other either by meeting at home or in a public place or by review the user post and delete any post containing hate
using the telephone or mobile phones etc. But, nowadays, speech in less than 24 hours [5]. On the other hand, there is
with the advancement of the technology and the explosion of no specialized tool for automatic detection of the hate
the information age, the communication between people has speech, and there is lack of data collection, documentation
become indirect, and the world has become a global village. and systematic monitoring of hate and violence.
So, people can easily and quickly communicate with each
other by using social networking sites such as Twitter, A hate speech is widely spread on online social
Facebook, and Snap Chat. However, each person may have network sites (OSN), such as Twitter, Facebook, and
one or more accounts on each social networking site and can Instagram, because it provides an open space for users to
post anything without any control or constraint. express their opinions, beliefs, and ideas without any
limitations. Moreover, it gathers users from different
Content posted by people may differ from a good countries and nationalities; each of them has a different
content that carries high value for others to abusive speech. background, customs, and traditions. However, OSN users
The reason behind this difference is the freedom and lack of tend to use hate speech for a variety of reasons, either the
restrictions that prevent abusive speech. Many online social person is proud of his/her nationality or religion or he/she
networking sites are trying to mitigate the effects of non- has the feeling of physical safety.
restrictive content by implementing algorithms solution to
detect and strictly prohibit any abusive speech. Despite all The aim of this research is to conduct a survey and
attempts for detecting abusive speech automatically, it is still state of the art technique regarding the detection of hate
a difficult task owing to multifacetedness of the content. speech in online social networking sites. This research aims
at studying the most common and famous natural language
Hate speech is one of the most abusive speeches processing (NLP) techniques, which have contributed to the
that has been recently spread on online social network sites. automatic disclosure of the hate speech.
Hate speech refers to the language that attacks a person or The rest of the paper is organized as follows:
group depending on a set of attributes, such as ethnic section II provides a hate speech technique that is mostly
origin, religion, sex, race, national origin, sexual orientation, used on NLP. Section III provides an overview of hate
gender identity, and disability[1]. Hate speech was defined speech techniques. Section IV presents challenges and
by Merriam Webster as a “speech expressing hatred of a opportunities, and this paper is we conclude in section V.
particular group of people''[2]. Moreover, hate speech is II. AUTOMATIC HATE SPEECH TECHNIQUES
defined as a “speech that is intended to insult, offend, or
intimidate a person because of some attribute (national There are many researchers who have conducted
origin, sexual orientation, disability, religion, or race)[3].'' research on the automatic detection of the hate speech in the
previous literature. These researches are distinguished by a