
Lok Nayak Jai Prakash Institute of Technology Chapra-841302

Project Synopsis
On
“Hate Speech Detection”

Team members:
S.no  Name                   College Roll no.  Registration no.
01    Ankit Raj              2k19-CSE-58       19105117055
02    Shashi Shekhar Sharma  2k19-CSE-53       19105117002
03    Rohit Gupta            2k19-CSE-20       19105117030
04    Vishal Kumar           2k19-CSE-45       19105117006
 Introduction

Hate speech is a serious problem seen daily on social media platforms such as Twitter and
Facebook. There is no universally accepted legal definition of hate speech, because people’s
opinions cannot easily be classified as hateful or offensive. Nevertheless, the United Nations
defines hate speech as any kind of spoken, written or behavioural communication that attacks
or uses discriminatory language with reference to a person or a group on the basis of their
identity, such as religion, ethnicity, nationality, race, color, ancestry, gender or any other
identity factor.

On social media platforms, an enormous number of comments and posts are published every
second, which makes it very difficult to trace or control the content of such platforms.
Social platforms therefore face the problem of limiting these posts while preserving
freedom of speech.

Technically, hate speech detection in social networks is an unstructured-text problem.
Extracting insights and patterns from such text is challenging because the interpretation
of natural language depends on context. Text mining techniques are designed to handle the
ambiguity and variability of unstructured data.

Social media platforms need to detect hate speech and either prevent it from going viral or
remove it in time. In this project, we therefore address the task of hate speech detection
with machine learning using the Python programming language.
 Aims & Objective

This project aims to classify textual content as hate speech or non-hate speech; for hate
speech, the method may also identify the targeted characteristics (i.e., the type of hate,
such as race or religion). The proposed solution employs different feature engineering
techniques and ML algorithms to classify content as hate speech.

 The main objective of this work is to develop an automated, deep-learning-based
approach for detecting hate speech and offensive language.

 Automated detection relies on machine learning, which includes supervised and
unsupervised learning. We use a supervised learning method to detect hate speech and
offensive language.

 Classify tweets into two categories based on tweet sentiment and other features that
a tweet demonstrates.
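
The feature engineering mentioned above can be illustrated with a minimal sketch. The
function name `extract_features`, the particular features chosen and the sample tweet are
assumptions made for illustration only; the synopsis does not fix a feature set.

```python
import re
from collections import Counter

def extract_features(tweet):
    """Turn one tweet into a dictionary of simple numeric features (illustrative only)."""
    # Lowercased word tokens give a bag-of-words representation.
    # Note: the text of mentions and hashtags is also counted as words here.
    tokens = re.findall(r"[a-z']+", tweet.lower())
    features = Counter("word=" + t for t in tokens)
    # A few surface features that can be used alongside the word counts.
    features["num_mentions"] = len(re.findall(r"@\w+", tweet))
    features["num_hashtags"] = len(re.findall(r"#\w+", tweet))
    features["num_exclamations"] = tweet.count("!")
    return dict(features)

# Hypothetical sample tweet, not taken from any real data set.
sample = "@user I can't stand you!! #angry"
print(extract_features(sample))
```

A supervised classifier would then be trained on such feature dictionaries together with
their hate or non-hate labels.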
 Hardware and Software required

Hardware:
Processor: Intel Core i5 or higher
Hard Disk: 10 GB or higher
RAM: 4 GB or higher

Software:
Anaconda Navigator
Jupyter Notebook or Google Colab
Python 3.0 or higher
 Flow of Project

1. Data collection: We can download the required data set from Kaggle, which
allows users to find and publish data sets and to explore and build models in a web-based
data-science environment.

2. Data Cleaning: Once the goal is defined and the data is collected, the next step
is cleaning. Data cleaning is the removal of missing, redundant, unnecessary and
duplicate data from the collection.

3. Data Analysis and Exploration: This is one of the most important steps in data
science, where we play detective with the data. It involves analyzing the structure of
the data, finding hidden patterns, studying behaviors, visualizing the effect of one
variable on others and then drawing conclusions. We can explore the data with graphs
produced by plotting libraries in any programming language.

4. Data Modelling: Once the exploratory study based on data visualization is
complete, we build a model that should yield good predictions on future data. Here, we
must choose an algorithm that fits our problem well. We train the model on the training
data and then test it on the test data.

5. Optimization and Deployment: Having followed each step, we have built a model
that we believe is the best fit. But how do we decide how well the model is performing?
This is where optimization comes in. We evaluate the model on the test data and check
its accuracy. In short, we measure the performance of the model and try to optimize it
for more accurate predictions. Deployment deals with launching the model so that people
outside the team can benefit from it.
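
The cleaning, modelling and evaluation steps above can be sketched in a few lines of
Python. This is only a sketch under stated assumptions: the synopsis does not name an
algorithm, so a multinomial Naive Bayes classifier with Laplace smoothing is swapped in
here, written with the standard library only. The helper names (`clean`,
`train_naive_bayes`, `predict`, `accuracy`) and the toy tweets are made up for
illustration; a real run would use the Kaggle data set instead.

```python
import math
import random
import re
from collections import Counter, defaultdict

def clean(text):
    """Data cleaning: lowercase, strip URLs and mentions, keep letters only."""
    text = text.lower()
    text = re.sub(r"https?://\S+|@\w+", " ", text)
    text = re.sub(r"[^a-z\s]", " ", text)
    return text.split()

def train_naive_bayes(samples):
    """Count words per label; samples is a list of (tokens, label) pairs."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for tokens, label in samples:
        label_counts[label] += 1
        word_counts[label].update(tokens)
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, label_counts, vocab

def predict(model, tokens):
    """Return the label with the highest log-probability (Laplace smoothing)."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in label_counts.items():
        score = math.log(n / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for t in tokens:
            if t in vocab:  # words never seen in training are skipped
                score += math.log((word_counts[label][t] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def accuracy(model, samples):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(predict(model, tokens) == label for tokens, label in samples)
    return correct / len(samples)

# Made-up toy tweets (assumption): stand-ins for a real labelled data set.
toy_data = [
    ("I hate those people, they are worthless", "hate"),
    ("Those people are disgusting and worthless", "hate"),
    ("I hate them, they should all leave", "hate"),
    ("They are worthless, get them out @user", "hate"),
    ("What a lovely day at the park", "non-hate"),
    ("I love this new phone https://example.com", "non-hate"),
    ("Great match today, well played everyone", "non-hate"),
    ("Lovely people at the park today", "non-hate"),
]

samples = [(clean(text), label) for text, label in toy_data]
random.Random(0).shuffle(samples)          # fixed seed for a repeatable split
split = int(0.75 * len(samples))           # 75% train, 25% test
train, test = samples[:split], samples[split:]

model = train_naive_bayes(train)
test_accuracy = accuracy(model, test)
print(f"Accuracy on held-out toy tweets: {test_accuracy:.2f}")
```

With only eight toy tweets the reported accuracy is not meaningful; the sketch is meant
to show the order of the steps. On real data, accuracy alone can also be misleading when
the classes are imbalanced, so measures such as precision and recall are worth checking
as well.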
 Block Diagram
 Conclusion

The propagation of hate speech on social media has increased significantly in recent
years, and it is recognized that effective countermeasures rely on automated data mining
techniques. This work contributes to the problem by proposing a method for automatically
classifying hate speech.

 Future Work

Future work can proceed in several directions, such as improving accuracy and using
metadata from Facebook, Instagram and other social media platforms. We will also
explore ways to detect hate speech in audio and video. Many further improvements and
features remain to be added.
