
A

Mini Project or Internship Assessment Report


On

Title of Mini Project


Department of Information Technology

2022-23
Name of Supervisor
Name of Guide
Assistant Professor
(Mention your supervisor’s name with designation)
Name of Team Member(s)
Student Name (Roll No)

Department of Information Technology


G. L. Bajaj Institute of Technology and Management
Plot No 2, Knowledge Park-III, Greater Noida-201306
Dec 2023



Declaration

I/We hereby declare that the Mini project work presented in this report, entitled
“……………………………………………”, in Information Technology, submitted to
Dr. A.P.J. Abdul Kalam Technical University, Uttar Pradesh, is an authentic record
of my/our own work carried out in the Department of Information Technology & Engineering,
G.L. Bajaj Institute of Technology & Management, Greater Noida. It contains no
material previously published or written by another person except where due
acknowledgement has been made in the text. The Mini project work reported in this
report has not been submitted by me/us for the award of any other degree or certification.

Signature:

Name:

Roll No:

Date:

Place: Greater Noida

CERTIFICATE

This is to certify that the project titled “____________________________________________”
is the bona fide work carried out by ________________, a student of B.Tech (IT) of GL Bajaj
Institute of Technology and Management, affiliated to Dr. A.P.J. Abdul Kalam Technical
University, Lucknow (UP), India, during the academic year 2022-23, in partial fulfillment of the
requirements for the award of the degree of Bachelor of Technology (IT), and that the Mini
project has not previously formed the basis for the award of any other degree, diploma,
fellowship or any other similar title.

Date:

Name of Supervisor                    Dr. P C Vashist

(Designation)                         Head of Department

TABLE OF CONTENTS (Sample)

ABSTRACT .......... 8

CHAPTER-1: INTRODUCTION TO IMAGE PROCESSING .......... 9
1.1 INTRODUCTION .......... 9
1.2 HISTORY .......... 10
1.3 OPENCV .......... 11
1.4 METHODS OF IMAGE PROCESSING .......... 12
1.5 STAGES OF PREPROCESSING .......... 13
1.5.1 ACQUISITION OF IMAGE .......... 13
1.5.2 PREPROCESSING .......... 13
1.5.3 SEGMENTATION .......... 14
1.6 APPLICATION OF IMAGE PROCESSING .......... 14
1.7 ADVANTAGES .......... 15

CHAPTER-2: EXISTING TECHNOLOGIES .......... 18
2.1 VEO .......... 18
2.2 SOLOSHOT .......... 21

CHAPTER-3: IMAGE RECOGNITION IN REAL TIME .......... 23
3.1 INTRODUCTION .......... 23
3.2 REAL TIME FOOTBALL DETECTION .......... 23
3.3 WHAT IS YOLO OBJECT DETECTION .......... 24
3.4 APPLICATIONS OF YOLO .......... 25
3.5 PREREQUISITES OF YOLO ALGORITHM .......... 25
3.6 CONVOLUTIONAL NETWORK IN YOLO .......... 27
3.7 LOSS FUNCTIONS & REDUCTION .......... 30
3.8 YOLO ALGORITHM PROCESS .......... 33
3.8.1 WORKING .......... 33
3.9 ABNORMAL BOUNDARY CASES .......... 35
3.9.1 INTERSECTION OVER UNION .......... 35
3.9.2 NON-MAX SUPPRESSION .......... 37
3.9.3 ANCHOR BOXES .......... 39
3.10 CHALLENGES .......... 41

CHAPTER-4: INTERNET OF THINGS (IOT) .......... 43
4.1 WHAT IS IOT? .......... 43
4.2 HISTORY OF IOT .......... 43
4.3 HOW IOT WORKS .......... 44
4.4 USE OF IOT .......... 45
4.5 RASPBERRY PI .......... 47
4.6 SERVO MOTOR .......... 50
4.6.1 INSIDE OF A SERVO MOTOR .......... 51
4.6.2 SERVO MOTOR WORKING MECHANISM .......... 52
4.6.3 SERVO MOTOR WORKING PRINCIPLE .......... 52
4.6.4 HOW DO SERVO MOTOR WORK .......... 53
4.6.5 CONTROLLING A SERVO MOTOR .......... 53
4.7 MOVING CAMERA MODULE .......... 58

CHAPTER-5: INTEGRATING THE CONCEPT .......... 60
5.1 AGENDA .......... 60
5.2 THE IDEA AND EXECUTION .......... 60
5.3 DATASET AND TRAINING .......... 63
5.4 REQUIREMENTS .......... 63
5.5 SETUP .......... 64
5.6 DETECTION CODE WALKTHROUGH .......... 65
5.7 TRAINING CODE WALKTHROUGH .......... 70
5.8 TRAINING PROCESS IN DEPTH .......... 72
5.9 GOOGLE COLAB .......... 72
5.10 LABELLING TOOL .......... 73
5.11 STEPS (YOLO) .......... 73
5.12 SAMPLES OF LABELLING DATASET USING LABELIMG .......... 74
5.13 SAMPLE OUTPUT OF COLAB .......... 78

CHAPTER-6: CONCLUSION AND FUTURE SCOPE .......... 79
6.1 FUTURE SCOPE .......... 79
6.2 CONCLUSION .......... 79

CHAPTER-7: ENVIRONMENT AND SUSTAINABILITY .......... 80

REFERENCES .......... 82

Remark:
Instructions for Formatting the Mini Project Report:
1. All text should be in Times New Roman.
2. Headings should be in bold with font size 14.
3. Body text should be in font size 12.
4. There should be 1.5 line spacing between lines of text.
5. Each figure and its caption should be center-justified, with font size 10.
6. Each table and its caption should be center-justified, with font size 10.
7. All text in the project should be justified (select text -> Ctrl+J).
8. The project report should be plagiarism-free (less than 10% similarity).
9. The project report should not be less than 60 pages and should be printed on bond
paper.
10. Hard binding should be blue with golden print (contact your
supervisor).
CHAPTER-1
INTRODUCTION
News has been a source of information for centuries. Traditionally, news agencies were the
source of news, so reliability and confidentiality rested with the official organizations
themselves. In recent times, the internet has spread rapidly across rural and urban areas.
With this growth, more users from all over the world have gained access to the internet and
the ability to spread information in their own way [1].

According to a 2019 Economic Times report, there are 627 million internet users in India, which
makes India home to the world’s second-largest internet user base [2]. However, with the
increasing popularity of social media, the internet has become an ideal breeding ground for fake
news. Research by the BBC shows that nearly 72% of Indians struggled to distinguish between fake
and real news [3]. Websites like The Onion [4], News Thump [5], The Poke News [6], and The Mash
News [7] rank among the top propagators of ‘fake’ or ‘misleading’ news [8]. In response, many
online fact-checking resources such as Snopes [9], FactCheck.org [10], Factmata.com [11],
PolitiFact.com [12] and others have grown rapidly. Platforms such as Facebook,
WhatsApp, and Google have addressed this concern, but their efforts have done little to
solve the issue.

Approaches to Detect Fake News:

1.1 Detection approaches based on machine learning: Support Vector Machines (SVMs),
random forests, logistic regression models, Conditional Random Field (CRF) classifiers, and
Hidden Markov Models (HMMs) [13].
1.2 Detection approaches based on deep learning: The two most widely implemented
paradigms in modern artificial neural networks are Recurrent Neural Networks (RNNs) and
Convolutional Neural Networks (CNNs) [13].

This model will detect fake news by checking the credibility of the news provider, the sentiment
of the comments, and the content of the news itself. We will use Natural Language Processing to
pre-process the dataset and a machine learning approach to classify the news.

Figure 1: Fact Checker [14]

BACKGROUND
There are many models for fact checking and detecting fake news. PolitiFact [12] is a
fact-checking website operated by the Poynter Institute in St. Petersburg, Florida, which uses its
Truth-O-Meter to rate the truthfulness of a statement, article, event, image, or video. However,
its fact checking is limited to political news and hence fails to cover the broad spectrum of
news. According to a survey paper, fake news sources on Facebook can be flagged using BS
Detector [15]. Another fact-checking website, Factmata [11], scores content on nine signals,
including hate speech and political bias, to give a deeper understanding of the credibility and
safety of any content on the web. Flock, a messenger for businesses, has launched a fake news
detector that aims to stop false and misleading information from being introduced into its
environment [16].

In India, fact-checking services have recently been launched by India Today, Times of India, and
AFP India, but these resources do not provide a platform for users to check whether the news
article they are viewing is fake or real. AltNews [17] has been successful in India in providing a
platform for users to clear their doubts, though it has yet to become more efficient and reliable.
Models like Fact Finder only check whether the news is fake or real. On the other hand, the
AltNews website and app investigate fake news and publish articles on viral fake news. Our model
performs both tasks simultaneously.

PROPOSED WORK
In this paper, a model is built on data pre-processed with the NLTK library: all stop words such
as “the”, “is”, and “are” are removed, and only those words that are distinctive and provide
relevant information are kept. We also removed punctuation and numbers and converted the
dataset to lowercase. We then used a Count Vectorizer or a TF-IDF matrix, which captures how
often a word is used in a given article in our dataset. Figure 2 depicts the process from
collecting the news articles dataset to applying the news classification algorithm. Since the
problem concerns text classification and information extraction, we have used a Naïve Bayes
classifier for text-based classification. For training and testing, we have used Multinomial NB
and the Passive Aggressive Classifier, with 33% of the dataset used for training. We will also
remove rare words occurring in our corpus with the help of the Count Vectorizer [18-20].
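
Rare-word removal can be pictured as a document-frequency filter, which is roughly what the min_df parameter of scikit-learn's CountVectorizer does. The sketch below uses only the Python standard library; the function names, the toy documents, and the threshold of 2 are illustrative assumptions, not the project's actual settings.

```python
from collections import Counter


def build_vocabulary(docs, min_df=2):
    """Keep terms that appear in at least `min_df` documents."""
    # set(doc) makes each document count a term at most once.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    return sorted(term for term, df in doc_freq.items() if df >= min_df)


def count_vector(doc, vocabulary):
    """Count how often each vocabulary term occurs in one document."""
    counts = Counter(doc)
    return [counts[term] for term in vocabulary]


# Toy, already-tokenized documents (hypothetical data).
docs = [
    ["breaking", "news", "shocking", "claim"],
    ["news", "claim", "verified"],
    ["claim", "claim", "news", "debunked"],
]
vocabulary = build_vocabulary(docs, min_df=2)
vectors = [count_vector(doc, vocabulary) for doc in docs]
print(vocabulary)
print(vectors)
```

Words that occur in a single document, such as "shocking" here, drop out of the vocabulary, so the count vectors stay small and less noisy.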

The goal of the project is to build a website and app so that whenever a user selects a piece
of text, the app responds with a floating window showing the estimated percentage of fake and
real news in the selected text. The advantage of the app or website is that it detects fake news
without the user having to open or upload any content in the app.

Figure 2: Process Flow Diagram

METHODOLOGY
In this section, the methodology of the proposed model is described. Figure 3 represents the
workflow of the methods involved in creating the model. The major steps involved in building the
model are:
1. Corpus of Text Document

2. Text wrangling and pre-processing

3. Parsing and Basic Exploratory Data Analysis

4. Text representation using relevant feature engineering techniques

5. Modeling

6. Evaluation and Deployment


Figure 3: Methodology

Scraping News Articles for Data Retrieval

Currently, the model has been trained using a dataset from Kaggle [21] with 6335 rows and
4 columns. News articles will be scraped from inshorts [22] with the help of Python libraries,
along with NLTK and spaCy. A typical news article sits in a specific section of the page's HTML,
as depicted in the following image:

Figure 4: The landing page for technology news articles and its corresponding HTML structure [23]

The specific HTML tags that contain the textual content can also be used [24]. Hence, with
the help of libraries such as BeautifulSoup and requests, useful content will be scraped.
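
As a rough sketch of tag-based extraction, the example below pulls text out of specific tags using Python's standard-library html.parser instead of BeautifulSoup and requests, so it runs offline without third-party packages. The sample markup, the ArticleTextParser class, and the itemprop values are illustrative assumptions and do not describe the real structure of any site. The sketch also assumes well-nested markup without void elements such as br inside the captured tags.

```python
from html.parser import HTMLParser


class ArticleTextParser(HTMLParser):
    """Collects the text inside tags whose itemprop attribute is in `wanted`."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = set(wanted)
        self.current = None   # itemprop currently being captured, if any
        self.depth = 0        # nesting depth inside the captured tag
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if self.current is not None:
            self.depth += 1   # nested tag inside a captured region
            return
        prop = dict(attrs).get("itemprop")
        if prop in self.wanted:
            self.current = prop
            self.depth = 1
            self.fields.setdefault(prop, [])

    def handle_endtag(self, tag):
        if self.current is not None:
            self.depth -= 1
            if self.depth == 0:
                self.current = None

    def handle_data(self, data):
        if self.current is not None and data.strip():
            self.fields[self.current].append(data.strip())


def extract_article(html):
    """Return {field: text} for the headline and body of one article card."""
    parser = ArticleTextParser(["headline", "articleBody"])
    parser.feed(html)
    parser.close()
    return {key: " ".join(parts) for key, parts in parser.fields.items()}


# Hypothetical markup standing in for a downloaded news page.
sample_html = """
<div class="news-card">
  <span itemprop="headline">Sample headline</span>
  <div itemprop="articleBody">First sentence. <b>Second</b> sentence.</div>
</div>
"""
article = extract_article(sample_html)
print(article)
```

In a real scraper the HTML string would come from an HTTP response, and the tag and attribute names would have to be read off the target page.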

The collected dataset contains 6335 rows and 4 columns; the head of the dataset is depicted
in Figure 5:

Figure 5: Dataset of real and fake news articles

Text Wrangling, Cleaning and Pre-processing

Here, both the nltk and spaCy packages have been leveraged to process the data. A stop word list
is used to remove the most common words in our dataset, such as “and”, “the” and “is”. Along
with stop words, HTML tags, punctuation, numbers, and special characters also need to be
removed, since they do not provide relevant information; accented characters are normalized and
contractions are expanded. Lemmatization and stemming are done with the help of
functions such as lemmatize_text() and simple_stemmer() respectively.
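
The cleaning steps can be sketched with the standard library alone. This is a simplified stand-in, not the project's pipeline: the short STOP_WORDS set replaces NLTK's much longer English list, the clean_text name is hypothetical, and lemmatization, stemming, accent handling, and contraction expansion are left out.

```python
import re

# Tiny stand-in stop word set; NLTK's English list is much longer.
STOP_WORDS = {"the", "is", "are", "and", "a", "an", "of", "to", "in"}


def clean_text(text, stop_words=STOP_WORDS):
    """Strip HTML tags, lowercase, drop non-letters, and remove stop words."""
    text = re.sub(r"<[^>]+>", " ", text)     # remove HTML tags
    text = text.lower()
    text = re.sub(r"[^a-z\s]", " ", text)    # drop punctuation, digits, symbols
    return [tok for tok in text.split() if tok not in stop_words]


print(clean_text("<p>The 3 reports ARE fake, and the claim is false!</p>"))
```

Note that the [^a-z\s] pattern also discards accented letters, so real text would need accent normalization before this step.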

With the help of the TF-IDF vectorizer, the importance of each word in a given article relative
to the entire corpus is determined [25].
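
To make the weighting concrete, here is a minimal sketch of the textbook formula, term frequency multiplied by log(N / document frequency). It is an illustration only: library implementations such as scikit-learn's TfidfVectorizer apply smoothing and normalization, so their numbers will differ. The tf_idf function name and the toy documents are assumptions.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Toy, already-tokenized documents (hypothetical data).
docs = [
    ["election", "result", "fake"],
    ["election", "result", "official"],
    ["election", "rumour", "fake", "fake"],
]
weights = tf_idf(docs)
print(weights[2])
```

A word that appears in every document ("election") gets weight zero, while a word confined to one document ("rumour") is weighted most heavily.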

Data Visualization and Feature Extraction

For a better understanding of the dataset, we use the matplotlib and seaborn libraries for
visualization and plotting graphs. Using the stripplot() method from the seaborn library, the
statistical plot depicted in Figure 6 was produced, which shows that entries with index 0~5000
are REAL while entries with index 5000~10000 are FAKE. CountVectorizer was imported to remove
the rare words.

Figure 6: Dataset Visualization of Fake news and Real news using Seaborn
The x-axis represents the label (fake or real); the y-axis represents the index.

Modeling and Grid Search

With Multinomial NB and the Passive Aggressive Classifier, 33% of the dataset was used for
training and the remaining 67% for testing. The confusion matrix is then used to identify the
model with the highest accuracy [26].
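
For intuition about what the Multinomial NB classifier computes, the following is a small from-scratch sketch with add-one (Laplace) smoothing, written with the standard library only. It is not the scikit-learn implementation used for the reported results; the TinyMultinomialNB class and the toy articles are hypothetical, and the Passive Aggressive Classifier is not illustrated.

```python
import math
from collections import Counter, defaultdict


class TinyMultinomialNB:
    """Multinomial Naive Bayes over token lists, with add-one smoothing."""

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.vocab = {tok for doc in docs for tok in doc}
        self.class_counts = Counter(labels)
        self.token_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.token_counts[label].update(doc)
        self.total_docs = len(labels)
        return self

    def _log_score(self, doc, label):
        counts = self.token_counts[label]
        denom = sum(counts.values()) + len(self.vocab)
        # Log prior of the class plus log likelihood of each known token.
        score = math.log(self.class_counts[label] / self.total_docs)
        for tok in doc:
            if tok in self.vocab:  # words never seen in training are ignored
                score += math.log((counts[tok] + 1) / denom)
        return score

    def predict(self, doc):
        return max(self.classes, key=lambda label: self._log_score(doc, label))


# Toy training data (hypothetical, already tokenized and cleaned).
train_docs = [
    ["shocking", "secret", "cure"],
    ["miracle", "cure", "shocking"],
    ["government", "publishes", "budget"],
    ["court", "publishes", "ruling"],
]
train_labels = ["FAKE", "FAKE", "REAL", "REAL"]

model = TinyMultinomialNB().fit(train_docs, train_labels)
print(model.predict(["shocking", "miracle", "secret"]))
print(model.predict(["court", "budget", "report"]))
```

Working in log space avoids numeric underflow when many small word probabilities are multiplied together.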

Experimental and Result Analysis

Let us consider the result positive when the classifier classifies a news article as fake:

• The number of True Positives is the number of news articles correctly classified as Fake
News;
• The number of False Positives is the number of news articles incorrectly classified as
Fake News;
• The number of True Negatives is the number of news articles correctly classified as True
News;
• The number of False Negatives is the number of news articles incorrectly classified as
True News.
The precision of a classifier is calculated as follows:

Precision = tp / (tp + fp)

where:
tp – number of true positive examples;
fp – number of false positive examples.
The recall of a classifier is calculated as follows:

Recall = tp / (tp + fn) [27]

where fn is the number of false negative examples.
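
The two formulas can be checked with a short worked example. The counts below are hypothetical, chosen only to make the arithmetic easy to follow, and are not the project's results.

```python
def precision(tp, fp):
    """Fraction of articles flagged as fake that really are fake."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """Fraction of all fake articles that the classifier flagged."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical counts for illustration.
tp, fp, fn = 45, 5, 30
print(precision(tp, fp))  # 45 / 50
print(recall(tp, fn))     # 45 / 75
```

A classifier can score high on precision and low on recall at the same time: here most flagged articles are truly fake, yet many fake articles are missed.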

As depicted in Figure 7, a confusion matrix helps in evaluating the quality of the output of a
classifier, in this case Multinomial NB and the Passive Aggressive Classifier, on the fake or
real news dataset. Diagonal elements of the matrix represent the number of points where the
predicted label is equal to the true label, while off-diagonal elements represent the number of
points where the model's prediction fails.
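
A confusion matrix of this kind can be built by counting (true label, predicted label) pairs. The sketch below is illustrative only: the label lists are made up, and the layout with rows as true labels and columns as predicted labels is a stated assumption of the example.

```python
from collections import Counter

LABELS = ["FAKE", "REAL"]


def confusion_matrix(y_true, y_pred, labels=LABELS):
    """Rows are true labels, columns are predicted labels."""
    pairs = Counter(zip(y_true, y_pred))
    return [[pairs[(t, p)] for p in labels] for t in labels]


# Hypothetical labels for six articles.
y_true = ["FAKE", "FAKE", "FAKE", "REAL", "REAL", "REAL"]
y_pred = ["FAKE", "FAKE", "REAL", "REAL", "REAL", "FAKE"]
matrix = confusion_matrix(y_true, y_pred)

correct = sum(matrix[i][i] for i in range(len(LABELS)))  # diagonal entries
wrong = sum(map(sum, matrix)) - correct                  # off-diagonal entries
print(matrix, correct, wrong)
```

The diagonal sum divided by the total number of articles gives the accuracy, which is why the diagonal is the first thing to read in such a matrix.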

The figure shows the matrices without normalization. The results change as the
classification models or vectorizers are changed.

Matrix 1: combination of Multinomial NB and TF-IDF Vectorizer

Matrix 2: combination of Multinomial NB and Count Vectorizer

Matrix 3: combination of Passive Aggressive Classifier and TF-IDF Vectorizer

Matrix 4: combination of Passive Aggressive Classifier and Hashing Vectorizer


Figure 7: Confusion Matrix, without normalization

The precision of the given classification model is 0.902; recall, on the other hand, is 0.486.

Precision is the fraction of relevant instances among the retrieved instances,
while recall is the fraction of the total number of relevant instances that were actually retrieved.

CONCLUSION AND FUTURE SCOPE


In this project, the proposed Fake News Detection (FND) model uses text classification
algorithms to tell whether a news item is ‘fake’ or ‘real’. For training, 33% of the dataset
has been used, and 67% has been used for testing the FND model. The model identified
fake news with a precision of 90.2%.

In future, VADER can be used for sentiment analysis, as it is a more efficient algorithm, along
with a text classification model that provides the highest accuracy. Also, existing Fake News
Detection models have been applied to news and politics only; there is still scope for applying
them to stock markets, where share prices rise and fall very frequently.
REFERENCES
1. Kuriakose, Ammu, et al. "ALIKAH-A Clickbait and Fake News Detection System using Natural
Language Processing." 2019 3rd International Conference on Trends in Electronics and Informatics
(ICOEI). IEEE, 2019.
2. “India has second highest number of Internet users after China” – economictimes.com, 2019 [Online].
Available: https://economictimes.indiatimes.com
3. “Ordinary Indians are fueling the country’s fake-news crisis” – qz.com, 2018 [Online]. Available:
https://qz.com/india
4. “The Onion” – theonion.com [Online]. Available: https://www.theonion.com/
5. “News Thump” – newsthump.com [Online]. Available: https://newsthump.com/
6. “Poke News” – pokenews.com [Online]. Available:
https://thepoke.co.uk/category/news/
7. “Mash News” – mashnews.com [Online].
Available: https://www.thedailymash.co.uk/news
8. “Top 50 Fake News Websites And Blogs on the Web in 2019” – blog.feedspot.com, 2019 [Online].
Available: https://blog.feedspot.com/fake_news_blogs/
9. “Snopes” – snopes.com [Online]. Available: https://www.snopes.com/
10. “FACTCHECK.ORG” – factcheck.org [Online]. Available: https://www.factcheck.org/
11. “FACTMATA” – factmata.com [Online]. Available: https://factmata.com/
12. “Fact Checking U.S. Politics | PolitiFact ” – politifact.com [Online].
Available: https://politifact.com/
13. Bondielli, Alessandro, and Francesco Marcelloni. "A survey on fake news and rumour detection
techniques." Information Sciences 497 (2019): 38-55.
14. “Protecting the EU Elections From Misinformation and Expanding Our Fact-Checking Program to New
Languages” – about.fb.com [Online]. Available: https://about.fb.com/news
15. "B.S. Detector - Browser extension to identify fake news sites", Bsdetector.tech, 2018. [Online].
Available: http://bsdetector.tech/.
16. “Messenger platform Flock launches feature to identify fake news”, economictimes.com, 2019 [Online].
Available: https://m.economictimes.com/small-biz
17. “Alt News”, altnews.in [Online]. Available: https://www.altnews.in/
18. N. J. Conroy, V. L. Rubin, and Y. Chen, “Automatic deception detection: Methods for finding fake
news,” Proceedings of the Association for Information Science and Technology, vol. 52, no. 1, pp. 1–4,
2015.
19. S. Feng, R. Banerjee, and Y. Choi, “Syntactic stylometry for deception detection,” in Proceedings of the
50th Annual Meeting of the Association for Computational Linguistics: Short Papers-Volume 2,
Association for Computational Linguistics, 2012, pp. 171–175.
20. Shlok Gilda, “Evaluating Machine Learning Algorithms for Fake News Detection,” 2017 IEEE 15th
Student Conference on Research and Development (SCOReD), 2017.
21. “Kaggle”, kaggle.com [Online]. Available: https://kaggle.com
22. “inshorts - stay informed”, inshorts.com [Online]. Available: https://inshorts.com
23. “A Practitioner's Guide to Natural Language Processing (Part I) — Processing & Understanding Text”,
towardsdatascience.com, 2019 [Online]. Available: https://towardsdatascience.com
24. M. Pagliardini, P. Gupta, and M. Jaggi, “Unsupervised learning of sentence embeddings using
compositional n-gram features,” arXiv preprint arXiv:1703.02507, 2017.
25. H. Rashkin, E. Choi, J. Y. Jang, S. Volkova, Y. Choi, and P. G. Allen, “Truth of Varying Shades:
Analyzing Language in Fake News and Political Fact-Checking,” in Proceedings of the 2017
Conference on Empirical Methods in Natural Language Processing, 2017, pp. 2931–2937.
26. M. Balmas, “When Fake News Becomes Real: Combined Exposure to Multiple News Sources and
Political Attitudes of Inefficacy, Alienation, and Cynicism,” Communic. Res., vol. 41, no. 3, pp. 430–
454, 2014.
27. Naive Bayes classifier. (n.d.) Wikipedia. [Online].
Available: https://en.wikipedia.org/wiki/Naive_Bayes_classifier. Accessed Feb. 6, 2017.
