Detection of Fake News

DETECTION OF FAKE NEWS

PROJECT BY:
Kiran Varathan - 7172110525
Tamizh Mani - 7172110552
Manigandan -
Dillibabu -
Yokesh -

GUIDE BY:
Dr.SUSITHAM

Second Review
Department of Computer Science and Engineering
CONTENTS
 Problem Statement and Motivation
 Objectives
 Abstract
 System Architecture
 List of Modules
 References

Problem Statement and Motivation
 In democratic societies, informed citizens are essential for
effective governance. Fake news can manipulate public opinion,
influence elections, and undermine the democratic process by
spreading misinformation and falsehoods.
 Especially in times of crisis such as pandemics, fake news about
health-related issues can have serious consequences.
Misinformation about treatments, vaccines, or preventive
measures can lead to harmful behaviors and jeopardize public
health efforts.
Objectives
 The objective of this project is to develop a user-friendly web
application for detecting fake news using machine learning.
 The research evaluates and compares the performance of the
Naive Bayes classifier and other machine learning algorithms.
This comparison shows which algorithms are more effective at
detecting fake news.
 Implement a user-friendly web interface with the Flask framework
that integrates the selected machine learning model for fake
news detection.
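A minimal sketch of how a Flask application might wrap the detector. The `/predict` route, the `news` form field, and the `classify` placeholder are assumptions made for illustration; none of them comes from the project itself, and the keyword rule only stands in for the trained model.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


def classify(text: str) -> int:
    """Hypothetical placeholder for the trained model.

    Returns 1 (unreliable) or 0 (reliable). A real deployment would
    load the saved classifier instead of using this keyword rule.
    """
    return 1 if "miracle cure" in text.lower() else 0


@app.route("/predict", methods=["POST"])
def predict():
    # Read the article text submitted through the (assumed) "news" field.
    text = request.form.get("news", "").strip()
    if not text:
        return jsonify(error="no news text supplied"), 400
    label = classify(text)
    return jsonify(label=label, verdict="fake" if label == 1 else "real")
```

The app can be exercised without a browser through Flask's built-in test client, for example `app.test_client().post("/predict", data={"news": "..."})`.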
Abstract 1
 In today's digital age, the proliferation of fake news has become a significant
challenge, posing threats to democracy, public health, and societal stability.
 To address this issue, this research proposes the development of a
user-friendly web application for real-time fake news detection using machine
learning techniques, with a focus on the Naive Bayes (NB) algorithm. The
objective is to evaluate and compare the performance of the NB classifier
with other machine learning models.
 The study involves collecting and preprocessing a dataset of news articles
labeled as real or fake. The NB algorithm is trained using this dataset,
leveraging its simplicity and efficiency in handling text classification tasks.
 The user-friendly web application is built using the Flask framework,
providing an intuitive interface for users to input news articles.
Abstract 2
 The NB classifier is integrated into the application backend to enable real-
time fake news detection. Upon submission, the application processes the
input text, applies the NB algorithm, and presents the prediction results to
the user, indicating whether the news is classified as real or fake.
 The research evaluates the performance of the NB algorithm in terms of
accuracy, precision, recall, and F1-score, comparing it with alternative
machine learning models.
 The findings contribute valuable insights into the effectiveness of the NB
classifier for fake news detection, highlighting its potential as a reliable tool in
combating misinformation.
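The four evaluation measures named above can be computed from the counts of true and false positives and negatives. The sketch below assumes a binary task in which label 1 (fake) is the positive class; the toy labels are made up for illustration and are not results from this project.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1-score for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0  # flagged items that are fake
    recall = tp / (tp + fn) if tp + fn else 0.0     # fake items that were flagged
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels, invented for illustration: 1 = fake, 0 = real.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)
```

Running the same function on the predictions of each candidate model gives directly comparable numbers.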

System Architecture

[Figure: system architecture. Training path: news dataset, NLTK pre-processing, split into training data and testing data, model training (ML, Naive Bayes), evaluation, save model and weights. Prediction path: user input on the webpage, prediction, fake news detection message.]
List of Modules
 Data Collection
 Data Preprocessing
 Data Training
 Data testing
 Flask Framework
DATA COLLECTION

 id: unique id for a news article
 title: the title of a news article
 author: author of the news article
 text: the text of the article; could be incomplete
 label: a label that marks the article as potentially
unreliable
 1: unreliable
 0: reliable
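A minimal sketch of reading records with this column layout, using only the standard library. The two sample rows, the `Article` class, and the `load_articles` helper are invented for illustration; the project's actual loading code is not shown in the slides.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class Article:
    id: int
    title: str
    author: str
    text: str
    label: int  # 1 = unreliable, 0 = reliable


# Two made-up rows in the column layout listed above.
SAMPLE_CSV = """id,title,author,text,label
0,Council approves new budget,Jane Doe,The city council voted on Tuesday to approve the budget.,0
1,Miracle pill cures everything,,Doctors are stunned by this one simple trick,1
"""


def load_articles(handle):
    """Parse rows into Article records, tolerating empty author or text."""
    articles = []
    for row in csv.DictReader(handle):
        articles.append(Article(
            id=int(row["id"]),
            title=row["title"],
            author=row["author"] or "unknown",
            text=row["text"] or "",
            label=int(row["label"]),
        ))
    return articles


articles = load_articles(io.StringIO(SAMPLE_CSV))
texts = [a.text for a in articles]    # X: model input
labels = [a.label for a in articles]  # Y: target
```

Passing an open file handle instead of `io.StringIO` would read a dataset from disk in the same way.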
DATA PREPROCESSING
 Text Cleaning:
 Remove HTML tags: If the news data contains HTML tags, strip them to keep only the text content.
 Remove special characters and symbols: Eliminate non-alphanumeric characters, punctuation, and
symbols that do not convey meaningful information.
 Convert to lowercase: Convert all text to lowercase so that "News" and "news" count as the same word.
 Tokenization:
 Split text into individual words or tokens. This is essential for further analysis as it breaks down text into
its basic units.
 Stopword Removal:
 Remove common stopwords such as "the," "is," "and," "in," which do not carry much information for
classification. You can use libraries like NLTK or spaCy to help with this.
 Stemming or Lemmatization:
 Apply stemming or lemmatization to reduce words to their root form. This helps in grouping related words
together, reducing the feature space.
 Handling Abbreviations and Acronyms:
 Expand common abbreviations and acronyms to their full forms. This can improve the interpretability of
the text data.
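The steps above can be sketched in plain Python as follows. The stopword set, the abbreviation table, and `crude_stem` are small illustrative stand-ins (assumptions, not the project's code); a real pipeline would more likely use the NLTK or spaCy stopword lists and a proper stemmer or lemmatizer.

```python
import html
import re

# Small illustrative stopword list; NLTK and spaCy ship fuller ones.
STOPWORDS = {"the", "is", "and", "in", "a", "an", "of", "to", "are", "for"}
# Hypothetical abbreviation table, for illustration only.
ABBREVIATIONS = {"govt": "government", "pm": "prime minister"}


def crude_stem(word):
    """Strip a few common suffixes. A crude stand-in for a real
    stemmer; it is not linguistically accurate."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(raw):
    text = re.sub(r"<[^>]+>", " ", raw)       # remove HTML tags
    text = html.unescape(text)                # decode entities such as &amp;
    text = text.lower()                       # convert to lowercase
    text = re.sub(r"[^a-z0-9\s]", " ", text)  # remove special characters
    tokens = text.split()                     # tokenize on whitespace
    expanded = []
    for tok in tokens:                        # expand abbreviations
        expanded.extend(ABBREVIATIONS.get(tok, tok).split())
    kept = [t for t in expanded if t not in STOPWORDS]  # remove stopwords
    return [crude_stem(t) for t in kept]      # reduce words to a rough root


tokens = preprocess("<p>The Govt is <b>hiding</b> vaccines &amp; cures!</p>")
# tokens == ['government', 'hid', 'vaccin', 'cur']
```

The truncated stems show why stemming shrinks the feature space at the cost of readability; lemmatization would return dictionary forms instead.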

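TEXT PREPROCESSING

from nltk.tokenize import RegexpTokenizer

# TextProcessor is the project's helper class; its methods are not shown on
# this slide. The tokenizer is assumed to be stored as one of its attributes.
tokenizer = RegexpTokenizer(r'\w+')  # keeps runs of word characters only

message = TextProcessor.replace_emojis(message)              # handle emojis
message = TextProcessor.remove_mention_and_url(message)      # drop mentions and URLs
message = TextProcessor.remove_punctuation(message)          # strip punctuation
message = TextProcessor.tokenizer.tokenize(message.lower())  # lowercase, then tokenize
message = TextProcessor.remove_stopwords(message)            # remove stopwords
message = TextProcessor.word_stemmer(message)                # stem each token
message = TextProcessor.word_lemmatizer(message)             # lemmatize each token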
 Tokenization
 Stopwords, e.g. ‘i’, ‘me’, ‘my’, ‘myself’, ‘we’, ‘our’, ‘ours’, ‘ourselves’, ‘you’, “you’re”, “you’ve”, “you’ll”, “you’d”, ‘your’,
‘yours’, ‘yourself’, ‘yourselves’, ‘he’, ‘him’, ‘his’, ‘himself’, ‘she’, “she’s”, ‘her’, ‘hers’, ‘herself’, ‘it’, “it’s”, ‘its’, ‘itself’, ‘they’, ‘them’,
‘their’, ‘theirs’, ‘themselves’, ‘what’, ‘which’, ‘who’, ‘whom’, ‘this’, ‘that’, “that’ll”, ‘these’, and so on.
DATA FLOW

[Figure: data flow. 1: Input: the dataset. 2: Steps: pre-process the dataset, split it into X and Y, then train.]
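A minimal sketch of the split step in this flow, assuming X holds the preprocessed article texts and Y the 0/1 labels. The helper `split_train_test`, the 80/20 ratio, and the fixed seed are assumptions for illustration; the slides do not state the ratio the project uses.

```python
import random


def split_train_test(texts, labels, test_ratio=0.2, seed=42):
    """Shuffle (text, label) pairs together, then split into train and test."""
    if len(texts) != len(labels):
        raise ValueError("texts and labels must have the same length")
    indices = list(range(len(texts)))
    random.Random(seed).shuffle(indices)  # seeded for a reproducible split
    n_test = max(1, int(len(indices) * test_ratio))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    x_train = [texts[i] for i in train_idx]
    y_train = [labels[i] for i in train_idx]
    x_test = [texts[i] for i in test_idx]
    y_test = [labels[i] for i in test_idx]
    return x_train, x_test, y_train, y_test


# Made-up examples: X is the text, Y the matching 0/1 label.
X = [f"article {i}" for i in range(10)]
Y = [i % 2 for i in range(10)]
x_train, x_test, y_train, y_test = split_train_test(X, Y)
```

Shuffling indices rather than the two lists separately keeps each text aligned with its label.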


Algorithm used - NAÏVE BAYES
 Naive Bayes (NB) is a probabilistic machine
learning algorithm based on Bayes' theorem.
 NB works well with text data. In fake news
detection, news articles are typically
represented as bag-of-words (BoW) or term
frequency-inverse document frequency (TF-IDF)
vectors. These representations capture how often
each word occurs in the text; TF-IDF additionally
down-weights words that are common across all
documents.
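To make the idea concrete, here is a small from-scratch sketch of multinomial Naive Bayes over bag-of-words counts with Laplace (add-one) smoothing. The class name, the `alpha` parameter, and the toy documents are assumptions for illustration; the project itself may rely on a library implementation instead.

```python
import math
from collections import Counter, defaultdict


class MultinomialNB:
    """Multinomial Naive Bayes over token counts with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # smoothing strength

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # word counts per class
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
        self.vocab = {w for c in self.word_counts.values() for w in c}
        self.total_words = {c: sum(self.word_counts[c].values())
                            for c in self.class_counts}
        return self

    def log_scores(self, tokens):
        n_docs = sum(self.class_counts.values())
        scores = {}
        for c, n_c in self.class_counts.items():
            score = math.log(n_c / n_docs)  # log prior P(c)
            denom = self.total_words[c] + self.alpha * len(self.vocab)
            for w in tokens:
                if w in self.vocab:  # skip words never seen in training
                    count = self.word_counts[c][w]
                    score += math.log((count + self.alpha) / denom)
            scores[c] = score
        return scores

    def predict(self, tokens):
        scores = self.log_scores(tokens)
        return max(scores, key=scores.get)


# Toy tokenized documents, invented for illustration: 1 = fake, 0 = real.
train_docs = [
    ["shocking", "miracle", "cure", "doctors", "hate"],
    ["secret", "miracle", "trick", "shocking"],
    ["council", "approves", "budget", "vote"],
    ["minister", "announces", "budget", "plan"],
]
train_labels = [1, 1, 0, 0]
model = MultinomialNB().fit(train_docs, train_labels)
prediction = model.predict(["shocking", "budget", "miracle", "miracle"])
# prediction == 1
```

Working in log space avoids numerical underflow when many small word probabilities are multiplied, and the smoothing term keeps a word that is unseen in one class from forcing that class's probability to zero.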

References
 Hadeer Ahmed, Issa Traore and Sherif Saad, "Detection of online fake news using n-gram
analysis and machine learning techniques", International Conference on Intelligent Secure and
Dependable Systems in Distributed and Cloud Environments, pp. 127-138, 2017.
 Chih-Chung Chang and Chih-Jen Lin, LIBSVM - A Library for Support Vector Machines, July 2018.
 Niall J Conroy, Victoria L Rubin and Yimin Chen, "Automatic deception detection: Methods for
finding fake news", Proceedings of the Association for Information Science and Technology, vol.
52, no. 1, pp. 1-4, 2015.
 Chris Faloutsos, "Access methods for text", ACM Computing Surveys (CSUR), vol. 17, no. 1, pp.
49-74, 1985.
 Mykhailo Granik and Volodymyr Mesyura, "Fake news detection using naive bayes
classifier", 2017 IEEE First Ukraine Conference on Electrical and Computer Engineering
(UKRCON), pp. 900-903, 2017.
 Junaed Younus Khan, Md Khondaker, Tawkat Islam, Anindya Iqbal
and Sadia Afroz, A benchmark study on machine learning methods for fake news detection,
2019, [online] Available: .
 Cédric Maigrot, Ewa Kijak and Vincent Claveau, "Fusion par apprentissage pour la détection de
fausses informations dans les réseaux sociaux", Document numerique, vol. 21, no. 3, pp. 55-80,
2018.
Thank You
