Final Synopsis-Major Abhilasha, Ananya


Major Project Synopsis

On
FAKE NEWS DETECTION USING MACHINE LEARNING

SUBMITTED BY:
Abhilasha Goel (19102236)
Ananya Bose (19102064)

UNDER THE SUPERVISION OF: DR. SAMRITI KALIA

Submitted in partial fulfillment for the award of the degree of


BACHELOR OF TECHNOLOGY
in
ELECTRONICS & COMMUNICATIONS ENGINEERING

JAYPEE INSTITUTE OF INFORMATION TECHNOLOGY


SEC 62, NOIDA
Introduction

Brief overview of the project


Fake news detection refers to the process of identifying and verifying the accuracy of news stories and other
forms of information. It involves fact-checking, source evaluation, and content analysis. This project focuses
on content analysis and is based on a comprehensive literature survey of existing fake news detection
models, which we then implemented and compared to find the best results. Further, to increase accessibility
and convenience, the best models have been deployed on both a Telegram bot and a locally hosted webpage,
so that users can easily access and benefit from our findings.

Objectives and goals of the project


Our journey began with a simple goal: to learn about ML and develop a model that could accurately detect
fake news. The impact of false information can be devastating, leading to confusion, chaos, and even harm.
We felt compelled to take action and develop a solution that could curb the spread of fake news and prevent
its negative consequences.
As we progressed, we discovered that detecting fake news is far more complex than we had originally
thought. It is not just about identifying certain words or phrases; it also requires analyzing the sentiment
and coherence of the content. We therefore integrated sentiment analysis, a seemingly unrelated concept,
into detection, and we also worked to combat the bias in our datasets.
Our goal has been to create a powerful model that can keep learning from live data and classify news with
high accuracy. We wish to monitor the intent of the news we consume and help people make informed decisions.
Our project represents a small step forward in the field of fake news detection. We hope that our findings
prove valuable to academics, researchers, and policymakers alike, and we look forward to sharing them with
the broader community.

Overall methodology used


● Data collection: From the many large and diverse datasets of news articles available, each containing
both real and fake news, we shortlisted the most workable ones and merged them to reduce
dataset bias.
● Data preprocessing: The merged dataset was then cleaned and prepared for analysis, using
techniques such as stop-word removal, stemming, and lemmatization.
● Feature extraction: Next, the dataset was processed to extract features that help differentiate
real from fake news, such as the frequency of certain words or phrases, sentiment scores, and
the credibility of the news source.
● Model selection: Once the features had been extracted, different machine learning algorithms were
evaluated to determine which model performs best for the task of fake news
detection.
● Model training: The selected model was then trained on the preprocessed dataset using a supervised
learning approach.
● Model evaluation: Finally, the trained model was evaluated on a held-out test dataset to measure its
accuracy, precision, recall, and F1 score.
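
The four evaluation metrics named in the last step can be computed directly from the counts of true and false positives and negatives. The following is a minimal sketch using only the Python standard library, not the project's own evaluation code; the function name and the label lists are hypothetical, and we assume the convention 1 = fake, 0 = real.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    # Guard every ratio against an empty denominator.
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical labels (assumption): 1 = fake, 0 = real.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

With sklearn, which the project lists among its libraries, the same quantities are available from the `sklearn.metrics` module.
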
Phases of the project
● Literature Survey: In this phase, we conducted an extensive review of the existing literature on fake
news detection using machine learning and deep learning techniques. We reviewed and analyzed the
latest research papers, books, and online resources to identify the most relevant techniques and
algorithms that could be used in the project.
● Model Implementation and Comparative Analysis: In this phase, we implemented various
machine learning and deep learning models for fake news detection. We then compared the
performance of these models based on various evaluation metrics, such as accuracy, precision, recall,
and F1 score, to identify the best-performing model for the task. We also performed a comparative
analysis of the models to determine the strengths and weaknesses of each model.
● Evaluation and Improvements: In this phase, we evaluated the performance of the best-performing
models identified in the previous phase. We identified the areas where the models could be
improved, such as feature selection, data preprocessing, and hyperparameter tuning. We then made
the necessary improvements to the models to enhance their performance.
● Deployment: In this final phase, we deployed the best-performing models as a fake news detection
app on the web. We tested the app to ensure that it performed well in real-world scenarios and made
the necessary improvements based on user feedback. We also documented the app's architecture and
deployment process for future reference.

Figure: Timeline of the Project


Overall, the project was divided into four distinct phases that involved a combination of literature review,
model implementation, evaluation, and deployment. The phases were repeated separately for both
machine learning and deep learning models, allowing for a thorough exploration of the various
techniques and algorithms available for fake news detection.
Contributions

1. ML & DL Model Development


Flowchart:

Libraries Used:

Pandas: for importing the dataset.
Seaborn/Matplotlib: for data visualization.
NLTK: for removing stop words, punctuation, and irrelevant whitespace from the text.
sklearn: for feature extraction, model training, and model evaluation.
Keras: for DL model training and hyperparameter tuning.
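
To illustrate the text-cleaning and count-based feature extraction that these libraries perform, here is a small standard-library sketch. It is not the project's code: the stop-word set is a tiny hypothetical stand-in for the much larger English list shipped with NLTK, and the `Counter` stands in for a count-based vectorizer.

```python
import string
from collections import Counter

# Tiny illustrative stop-word set (an assumption, not NLTK's actual list).
STOP_WORDS = {"a", "an", "the", "is", "are", "was", "of", "to", "in", "on", "and", "that"}


def clean_text(text):
    """Lowercase, strip punctuation, collapse whitespace, drop stop words."""
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    # str.split() with no arguments also discards repeated whitespace.
    return [tok for tok in text.split() if tok not in STOP_WORDS]


def bag_of_words(tokens):
    """Count how often each remaining token occurs."""
    return Counter(tokens)


tokens = clean_text("  The  senator's claim is SHOCKING, shocking!!  ")
counts = bag_of_words(tokens)
print(tokens)
print(counts)
```

Note that stripping punctuation turns "senator's" into "senators"; a stemmer or lemmatizer would normally handle such forms more carefully.
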

Results:
2. Telegram Bot
Flowchart:

Results:

Figure: Successful Deployment of Telegram Bot


3. Fake News Detection App
Web deployment involves selecting a web server and hosting provider, configuring the server and
web application, testing the application, and publishing it on the internet.

The languages used for the web front end are:
● HTML: structures the content of a web page.
● CSS: defines the visual style and layout of a web page.
● JavaScript: adds interactive behaviour and functionality to a web page.

Flask Framework

Flask is a Python-based micro web framework that is widely used for developing web applications.
It is lightweight, with a simple and easy-to-understand syntax, and it lets developers build web
applications quickly by providing a range of built-in features and tools. Flask is built on Werkzeug, a
WSGI toolkit, and supports extensions that add functionality to the framework.
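Key features of the framework:
1. Routing
2. Template engine (Jinja2)
3. Debugging
4. Built-in development server
5. ORM support (through extensions such as Flask-SQLAlchemy)
6. REST API support (through the Flask-RESTful extension)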
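
Routing, the first feature in the list, means mapping a URL path to the Python function that answers it. The idea can be sketched without Flask, using only the WSGI interface on which Flask is built. This is an illustration of the concept, not the project's app: the route table, handler names, paths, and the `call` helper are all hypothetical. In Flask itself the same mapping is declared with the `@app.route` decorator.

```python
from wsgiref.util import setup_testing_defaults

ROUTES = {}  # maps a URL path to the function that handles it


def route(path):
    """Register a handler for a path (mimics the idea behind Flask's @app.route)."""
    def decorator(handler):
        ROUTES[path] = handler
        return handler
    return decorator


@route("/")
def home():
    return "Welcome to the fake news detector"


@route("/about")
def about():
    return "Prototype built for a major project"


def app(environ, start_response):
    """WSGI entry point: look up the requested path and return the handler's text."""
    handler = ROUTES.get(environ.get("PATH_INFO", "/"))
    if handler is None:
        status, body = "404 Not Found", "Not found"
    else:
        status, body = "200 OK", handler()
    start_response(status, [("Content-Type", "text/plain; charset=utf-8")])
    return [body.encode("utf-8")]


def call(path):
    """Invoke the app in-process with a fake request, for demonstration only."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the required WSGI keys
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(app(environ, start_response))
    return captured["status"], body.decode("utf-8")


print(call("/"))
print(call("/missing"))
```

To try the sketch in a browser, the same `app` object could be served locally with `wsgiref.simple_server.make_server`, which plays the role of Flask's built-in development server.
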

Resources
● Local host provider
● A development environment: Jupyter Notebook IDE and a source code editor
● Flask framework
● Any operating system (macOS, Windows, Linux)

Timeline
25 February ’23 – Web-Dev knowledge
5 March ’23 – Learning Flask basics
18 March ’23 – Localhost Development
15 April ’23 – Prototype and Testing
1 May ’23 – Documentation
How the app works
Results from the app
This web app employs a logistic regression ML model to detect fake news. The user enters the title of a
news article to check the article's credibility.
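
The prediction step behind the title input can be sketched as a logistic regression over word counts. The example below is a dependency-free illustration under stated assumptions, not the trained model used in the app: the six training titles, the labels (1 = fake, 0 = real), and the hyperparameters are hypothetical, and the project itself relies on sklearn for this step.

```python
import math
from collections import Counter

# Hypothetical toy training titles (assumption); 1 = fake, 0 = real.
TRAIN = [
    ("shocking miracle cure doctors hate", 1),
    ("you will not believe this shocking secret", 1),
    ("miracle pill melts fat overnight", 1),
    ("government publishes annual budget report", 0),
    ("city council approves new transport budget", 0),
    ("university publishes annual research report", 0),
]


def vectorize(title, vocab):
    """Turn a title into a list of word counts in vocabulary order."""
    counts = Counter(title.lower().split())
    return [counts[word] for word in vocab]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(samples, epochs=500, lr=0.5):
    """Fit logistic regression weights with plain batch gradient descent."""
    vocab = sorted({w for title, _ in samples for w in title.lower().split()})
    X = [vectorize(title, vocab) for title, _ in samples]
    y = [label for _, label in samples]
    weights, bias = [0.0] * len(vocab), 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(vocab), 0.0
        for xi, yi in zip(X, y):
            # Prediction error for this sample drives the gradient.
            error = sigmoid(sum(w * v for w, v in zip(weights, xi)) + bias) - yi
            grad_b += error
            for j, v in enumerate(xi):
                grad_w[j] += error * v
        weights = [w - lr * g / len(X) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(X)
    return vocab, weights, bias


def predict(title, model):
    """Return the label and the estimated probability that the title is fake."""
    vocab, weights, bias = model
    z = sum(w * v for w, v in zip(weights, vectorize(title, vocab))) + bias
    p_fake = sigmoid(z)
    return ("FAKE" if p_fake >= 0.5 else "REAL"), p_fake


model = train(TRAIN)
label, p_fake = predict("shocking miracle secret", model)
print(label, round(p_fake, 3))
```

Words that never appeared in training contribute nothing to the score, so a title made up entirely of unseen words is decided by the bias term alone; a real deployment needs a far larger vocabulary than this toy set.
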

The web app is a prototype and is not hosted on a public domain; the Telegram bot is
available for public use.

Figure: Welcome page of Credibl

Figure: Prediction page of Credibl

The figures below show user interaction with the bot and sample predictions by Credibl.
Conclusion
End-to-end fake news detection offers an exciting opportunity to improve model performance by adding
data and computation. Our results suggest that, given more data and larger samples than previous
iterations used, our model can narrow the performance gap with human operators. The project is built with
open-source software modules, mostly coded in Python. The modular design of the project makes it more
flexible and makes it easier to add new components without disrupting existing ones. The models we trained
showed accuracy between 95% and 98%. After optimizing for prediction time, accuracy, and the true negative
rate (negatives correctly classified as negative), we conclude that the logistic regression model with a count
vectorizer is the best choice for deployment. This result highlights the effectiveness of supervised ML
techniques in solving complex problems like this.

Potential applications of the project


● Fake news impact measurement (red alert creation)
● Reliability of the source predictor
● Rumour Classification
● Truth Discovery
● Clickbait Detection
● News classifier: categories like politics, sports, etc.
● Authentication of facts, quality of news
● Spammer and Bot Detection
● Hate speech detector

Limitations
1. Fact-checking
2. Source verification
3. Language style and grammar
4. Context and nuances
5. Image and video verification
6. Neural Fake News

Conclusion

We successfully completed the project, creating a deployable model file and a web front end using HTML and
the Flask framework. Deployment was achieved through a user-friendly web interface, making the model
accessible to a wide user base. Thorough testing with real and unseen datasets, consisting of articles from
various sources, demonstrated the model's high accuracy in identifying the veracity of news articles. This
showcases the potential of machine learning in addressing this issue. While there is room for further
improvement, particularly for use by large businesses, this deployment serves as a valuable tool for
journalists, researchers, and the general public in verifying the accuracy of news articles. In the face of
increasing misinformation, initiatives like these are essential to uphold the credibility of news sources and
foster an informed society.
References

1. Lectures on “Deep Learning for Multimedia” by Dr. Juhi Gupta
2. https://www.w3schools.com/whatis/
3. https://medium.com/swlh/localhost-to-com-deploying-a-web-app-for-beginners-ea05b0213eb7
4. https://tools.jboss.org/documentation/howto/servers_deploytolocalserver.html
5. https://www.tutorialspoint.com/flask/index.htm
6. https://flask.palletsprojects.com/en/2.3.x/
7. https://careerfoundry.com/en/blog/web-development/what-is-flask/#:~:text=Flask%20is%20what's%20known%20as,as%20the%20Jinja2%20template%20engine.
