PROJECT REPORT
On
FalseDream
Submitted for Partial Fulfillment of Award of
BACHELOR OF TECHNOLOGY
In
COMPUTER SCIENCE AND ENGINEERING

(2022-23)
Submitted by
DEV DUTT PANDEY - 1712210041
DIVYANSH PANDEY - 1712210043
B.TECH
Under the guidance of
Mr. Brijesh Kr. Verma
BABU BANARASI DAS ENGINEERING COLLEGE,

LUCKNOW
Affiliated to
Dr. APJ ABDUL KALAM TECHNICAL

UNIVERSITY,LUCKNOW
CSE DEPARTMENT
BBDEC
CERTIFICATE
Certified that the project entitled “FALSEDREAM” submitted by Dev Dutt Pandey
[ROLL NO] and Divyansh Pandey [ROLL NO] in partial fulfillment of the
requirements for the award of the degree of Bachelor of Technology (Computer
Science and Engineering) of Dr. APJ Abdul Kalam Technical University is a record
of the students’ own work carried out under our supervision and guidance. The project
report embodies results of original work and studies carried out by the students, and the
contents do not form the basis for the award of any other degree to the candidates or to
anybody else.
Mr. Brijesh Kr. Verma Dr. Avinash Gupta

(Project Guide) (Head of Department)
CSE DEPARTMENT, BBDEC, Lucknow Page vii

DECLARATION
We hereby declare that the project entitled “FALSEDREAM” submitted by Dev
Dutt Pandey [ROLL NO] and Divyansh Pandey [ROLL NO] in partial
fulfillment of the requirements for the award of the degree of Bachelor of Technology
(Computer Science and Engineering) of Dr. APJ Abdul Kalam Technical University
is a record of our own work carried out under the supervision and guidance of
Mr. Brijesh Kr. Verma.
To the best of our knowledge, this project has not been submitted to Dr. APJ Abdul Kalam
Technical University or any other University or Institute for the award of any degree.
Name: Dev Dutt Pandey Name: Divyansh Pandey

Roll No: Roll No:
B.Tech-CSE B.Tech-CSE
Batch:2019-2023 Batch: 2019-2023

ACKNOWLEDGEMENT
We take this opportunity to express our profound gratitude and deep regards to our
guide Mr. Brijesh Kr. Verma and our coordinator Dr. Avinash Gupta for their
exemplary guidance, monitoring and constant encouragement. The blessing, help and
guidance given by them from time to time shall carry us a long way in the journey of
life on which we are about to embark.
We also take this opportunity to express a deep sense of gratitude to the Computer
Science and Engineering Department, BBDEC, Lucknow for their cordial support,
valuable information and guidance, which helped us in this task through various
stages.
We are obliged to the staff members of the Computer Science and Engineering
Department, BBDEC, for the valuable information provided by them in their
respective fields. We are grateful for their cooperation. We would like to express our
special gratitude and thanks to them for giving us such attention and time.
Last but definitely not least, we would like to thank our mothers, fathers, family
members and friends for the constant encouragement and support they showed us
throughout our entire period as college students, which helped us to keep
going and never give up.
From the bottom of our hearts, thank you all very much.
Name: Dev Dutt Pandey Name: Divyansh Pandey

Roll No: Roll No:
B.Tech-CSE B.Tech-CSE
Batch:2019-2023 Batch: 2019-2023

PREFACE
Email phishing is the most commonly used type of cyberattack. It uses email messages to trick
you into doing something dangerous that benefits the attacker.
Phishing uses impersonation and other kinds of deceptions to make you believe it is from
somebody you trust, and that the action you are taking will somehow benefit you. Phishing can
take many different forms, including simple attempts at deception that most people can spot. It
can also occur in much more complex situations that include a sequence of messages. The
process of deceiving people into taking some action is called social engineering. So phishing is
really a form of social engineering, like traditional scams and fraud schemes, except that the
attacks are launched using email messages. They are typically aimed at employees in businesses,
in the hope that staff have not had sufficient cyber security awareness training to spot these
attacks and avoid them.
The project report has been divided into multiple chapters.
The topics covered under each are as follows:
INTRODUCTION: This chapter gives our problem definition along with the aims
and objectives. It also includes sections on the objectives and the project analysis, which
give information about our project.
LITERATURE REVIEW: This chapter summarizes the material from the sources
that is relevant to the work as a whole.
PROPOSED METHODOLOGY: This chapter describes how phishing email detection
and analysis on mails works, the modules in our project and the model used. It also
defines the software/hardware requirements and specifications along with the
programming code used in the project.
RESULTS: This section gives the tasks and features that were accomplished module-wise
in the phishing email detection and analysis system that was developed.
DISCUSSION: This section describes the significance of the phishing email
detection system and provides new insights about the overall system.
CONCLUSION: This section covers the various inferences that were drawn after the
completion of the entire project.
FUTURE SCOPE: This section gives the future enhancements that can be made in
the project idea and its implementation.
REFERENCES: This section lists all the sources we have used in our project so that
readers can easily find what we have cited.

ABSTRACT
Phishing is a form of cybercrime in which an attacker imitates a real person or institution by
presenting themselves as an official person or entity through e-mail or other communication media.
In this type of cyber attack, the attacker sends malicious links or attachments through phishing
e-mails that can perform various functions, including capturing the login credentials or account
information of the victim. These e-mails harm victims through loss of money and identity theft.
In this study, software called “Anti Phishing Simulator” was developed, which gives information
about the phishing detection problem and how to detect phishing emails. With this software,
phishing and spam mails are detected by examining mail contents. Spam words added to the
database are classified using a Bayesian algorithm.

TABLE OF CONTENTS
CERTIFICATE
DECLARATION
ACKNOWLEDGEMENT
PREFACE
ABSTRACT
1. INTRODUCTION
Research background
Problem statement
Objective of the study
Scope of study
2. LITERATURE REVIEW
3. PROPOSED METHODOLOGY
Working
Block Diagram
Project Modules
Login/Signup Module
Client User dashboard
Admin dashboard
Implementation of ML
4. RESULT ANALYSIS AND DISCUSSION
Choosing right ML algorithm
Result Analysis of sentimental analysis
Machine Learning
Data Visualization
5. CONCLUSION
6. FUTURE SCOPE OF THE PROJECT
APPENDIX-A: LIST OF FIGURES
APPENDIX-B: CODE IMPLEMENTATION
REFERENCES

CHAPTER 1
INTRODUCTION
Research background:
Technology is growing rapidly, and this growth affects everyone in our society and across the
world. As technology grows, it also creates serious security problems. These problems affect not
only people's computers but also their personal lives, because confidential information such as
bank details, credit card details and personal identity information can be leaked. Among these
problems, one that deserves particular attention is phishing: a way for criminals to steal data
from electronic devices that uses social engineering and technology to steal a victim’s identity
data and account information. Email is one of the main channels of communication between users,
and email traffic over the internet keeps increasing. Because it is one of the fastest ways to
communicate, almost everyone now uses email to share information.
Data from recent years shows a sharp rise in phishing activity, and many victims have lost data,
money and other information. Phishing is the practice of luring people to a fraudulent website
and trying to obtain their passwords, email and account details and other credentials without
raising suspicion. In email, the lure is a fake message disguised as one sent by a reputable
company, often one in the financial sector. According to a report from the Anti-Phishing
Working Group (APWG), the number of phishing detections in the first quarter of 2020 reached
165,772, with the maximum in March, and in the second quarter of 2020 the total number of
phishing sites was 146,994, down 11% from Q1 2020. The numbers are generally comparable to
previous quarters: 139,685 in 1Q 2020, 132,553 in 4Q 2019, 122,359 in 3Q 2019, and 112,163 in
2Q 2019. These figures show a significant increase in phishing attacks compared with previous
years, which makes phishing one of the main concerns of companies and individuals today. Victims
suffer mentally as well as financially, and during the COVID-19 pandemic many people lost the
money they had saved to phishing. From this striking data it is clear that phishing has shown an
apparent upward trend in recent years, and the harm caused by phishing can be imagined as well.

Problem statement:
In this project, we use the sender's address and the links contained in an email and match them
against the blacklisted senders and blacklisted sites stored in our system. This lets us pinpoint
emails that are used for phishing so that they cannot do any harm to the users.
Objective of the study:
The system checks the sender and every link in an email against the stored data on blacklisted
persons and blacklisted sites in order to find phishing emails. Each email is then placed in one
of two categories: legitimate email or phishing email. The basic steps are:
• To detect phishing email by checking the URL.
• To store the data in two forms: phishing email and legitimate email.
Scope of study:
With the emergence of email, the convenience of communication has led to the problem of
spam and other types of electronic attack, especially phishing attacks through email. Various
anti-phishing technologies have been proposed to solve the problem of phishing attacks, and prior
work has studied the effectiveness of phishing blacklists. Blacklists mainly include sender
blacklists and link blacklists. The detection method used here first extracts the sender’s address
and then, as a further precaution, the link addresses in the message, and checks whether the
sender or the URL is blacklisted in order to distinguish a phishing email from a legitimate one.
Updates to a blacklist of mail addresses or links are usually reported by users, and whether a
reported site is a phishing website is identified manually. At present, two well-known phishing
blacklist services are PhishTank and OpenPhish. To some extent, the completeness of the blacklist
determines the effectiveness of blacklist-based phishing email detection. The current situation
is that new threats may not only cause severe damage to customers’ computers but also aim to
steal their money and identity.
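The blacklist check described in this section can be illustrated with a minimal Python sketch. It is not the project's actual implementation: the blacklist entries, the URL pattern and the function name classify_email are assumptions made for illustration, and a real system would load its blacklists from a maintained source rather than hard-coding them.

```python
import re
from email import message_from_string
from email.utils import parseaddr
from urllib.parse import urlparse

# Hypothetical blacklists, hard-coded only for illustration.
SENDER_BLACKLIST = {"scammer@bad.example"}
DOMAIN_BLACKLIST = {"phish.example"}

# Simple pattern for http/https links in plain text (an assumption;
# it does not cover every URL form found in real mail).
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def classify_email(raw_email: str) -> str:
    """Return "phishing" if the sender or a linked host is blacklisted."""
    message = message_from_string(raw_email)

    # Step 1: extract the sender's address from the From header.
    _, sender = parseaddr(message.get("From", ""))
    if sender.lower() in SENDER_BLACKLIST:
        return "phishing"

    # Step 2: collect the text of every non-multipart part of the message.
    body = "".join(
        part.get_payload()
        for part in message.walk()
        if not part.is_multipart()
    )

    # Step 3: extract each link and check its host against the blacklist.
    for url in URL_PATTERN.findall(body):
        host = (urlparse(url).hostname or "").lower()
        if host in DOMAIN_BLACKLIST:
            return "phishing"

    return "legitimate"
```

Checking the sender first keeps the common case cheap; the body is only scanned for links when the sender is not already blacklisted.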

CHAPTER 2
LITERATURE SURVEY
History
According to APWG, the term phishing was coined in 1996 following social engineering
attacks against America Online (AOL) accounts by online scammers. The term phishing comes
from fishing, in the sense that fishers (i.e. attackers) use a bait (i.e. socially engineered
messages) to fish (e.g. steal personal information of victims). However, it should be noted that
the theft of personal information is mentioned here only as an example, and that attackers are
not restricted to it, as previously defined in Section II. The replacement of the character f in
fishing by ph originates from one of the earliest forms of hacking, which targeted telephone
networks and was named Phone Phreaking. As a result, ph became a common hacking
replacement for f. According to APWG, by 1997 accounts stolen via phishing attacks were also
used as a currency between hackers, who traded hacking software in exchange for the stolen
accounts. Phishing attacks historically started by stealing AOL accounts, and over the
years moved on to more profitable targets, such as on-line banking and e-commerce
services. Currently, phishing attacks do not only target system endusers, but also technical
Phishing Motives
According to Weider D. et al. [6], the primary motives behind phishing
attacks, from an attacker’s perspective, are:
• Financial gain: phishers can use stolen banking credentials for their financial benefit.
• Identity hiding: instead of using stolen identities directly, phishers might sell the
identities to others, who might be criminals seeking ways to hide their identities and
activities (e.g. purchase of goods).
• Fame and notoriety: phishers might attack victims for the sake of peer recognition.
Brandwatch Sentiment Analysis
Brandwatch is a sentiment analysis tool developed by a team of PhD-qualified researchers in the
United Kingdom; it is currently commercially available. The tool tries to assess whether
a sentiment is positive, negative or neutral.
Importance.
According to APWG, phishing attacks were on the rise until August 2009, when an all-time high
of 40,621 unique phishing reports were submitted to APWG. The total number of unique phishing
websites associated with the 40,621 reports submitted in August 2009 was 56,362. As explained
by APWG, the drop in phishing campaign reports in 2010 and 2011 compared with 2009 was due to
the disappearance of the Avalanche gang which, according to APWG’s report for the 2nd half of
2010, was responsible for 66.6% of world-wide phishing attacks in the 2nd half of 2009 [7]. In
the 1st half of 2011, the total number of phishing reports submitted to APWG was 26,402, which
is 35% lower than the peak in 2009 [8]. However, according to APWG, the drop in phishing attacks
was due to a switch in the activities of the Avalanche gang from traditional phishing campaigns
to malware-based phishing campaigns. In other words, the Avalanche gang did not stop phishing
campaigns but rather switched their tactics toward malware-based phishing attacks (which still
require electronic communication channels and social engineering techniques to deliver malware).
Among the various types of malware that are used in phishing attacks, Trojan horse software
seems to be on the rise and is the most popular type of malware deployed by phishing attacks.
According to APWG, Trojans contributed 72% of the total malware detected in the 1st half of
2011, up from 55% in the 2nd half of 2010.
It is also important to note that although the number of phishing attack reports has dropped
since the peak in 2009, it is still high compared with the 2nd half of 2008, which saw an average
of 28,916 unique reports; in the 1st half of 2011 it ranged between 22,000 and 26,000 unique
reports each month. On the other hand, the 2nd half of 2011 saw a rise in phishing reports and
websites, which seems to be correlated with the holiday season [9], as depicted in Figures 1 and
2. The impact is further amplified by the fact that each phishing campaign can be sent to
thousands or even millions of users via electronic communication channels.
The year 2011 saw a number of notable spear phishing attacks against well-known security firms
such as RSA [10] and HBGary [2], which resulted in further hacks against their clients, such as
RSA’s client Lockheed Martin [3]. This shows that the dangers of phishing attacks, or security
vulnerabilities due to the human factor, are not limited to the naivety of end users, since
technical engineers can also be victims. Minimizing the impact of phishing attacks is extremely
important and adds great value to the overall security of an organization.
Determining the Semantic Orientation of Terms through Gloss Classification

Sentiment classification is a recent subdiscipline of text classification which is concerned not
with the topic a document is about, but with the opinion it expresses. This approach to
sentiment classification uses a method based on the quantitative analysis of the glosses
of terms, i.e. the definitions that these terms are given in on-line dictionaries, and on the
use of the resulting term representations for semi-supervised term classification [6].
Challenges
Because the phishing problem takes advantage of human ignorance or naivety with regard to
interaction with electronic communication channels (e.g. e-mail, HTTP, etc.), it is not
an easy problem to solve permanently. All of the proposed solutions attempt to minimize the
impact of phishing attacks. From a high-level perspective, there are generally two commonly
suggested solutions to mitigate phishing attacks:
• User education: the human is educated in an attempt to enhance his/her classification
accuracy in identifying phishing messages, and to then apply proper actions to the correctly
classified phishing messages, such as reporting attacks to system administrators.
• Software enhancement: the software is improved to better classify phishing messages on
behalf of the human, or to provide information in a more obvious way so that the human has
less chance of ignoring it.
The challenges with both of the approaches are:
• Non-technical people resist learning, and if they learn they do not retain their
knowledge permanently, so training should be made continuous. Although some
researchers agree that user education is helpful [1], [11], [12], a number of other researchers
disagree [13], [14]. Stefan Gorling [13] says that: “this is not only a question of knowledge, but
of utilizing this knowledge to regulate behavior. And that the regulation of behavior is
dependent on many more aspects other than simply the amount of education we have given to
the user”.
• Some software solutions, such as authentication and security warnings, are still
dependent on user behavior. If users ignore security warnings, the solution can be rendered
useless.
• Phishing is a semantic attack that uses electronic communication channels to deliver
content in natural languages (e.g. Arabic, English, French, etc.) to persuade victims to
perform certain actions. The challenge here is that computers have extreme difficulty in
accurately understanding the semantics of natural languages. A notable attempt is the
E-mail-Based Intrusion Detection System (EBIDS) [15], which uses Natural Language Processing
(NLP) techniques to detect phishing attacks; however, its performance evaluation showed a
phishing detection rate of only 75%. In our opinion, this explains why most well-performing
phishing classifiers do not rely on NLP techniques.
MITIGATION OF PHISHING ATTACKS: AN OVERVIEW
Due to the broad nature of the phishing problem, we find it important to visualize the life-cycle
of phishing attacks and, based on that, to categorize anti-phishing solutions. Based on our review
of the literature, we depict a flowchart describing the life-cycle of phishing campaigns from the
perspective of anti-phishing techniques, which is intended to be the most comprehensive
flowchart of phishing solutions. See Figure 3. When a phishing campaign is started (e.g. by
sending phishing emails to users), the first line of protection is detecting the campaign. The
detection techniques are broad and can incorporate techniques used by service providers to
detect the attacks, classification by end-user client software, and user awareness programs. More
details are in Section IV-A. The ability to detect phishing campaigns can be enhanced by learning
from each campaign that is detected. For example, by learning from previous phishing campaigns,
it is possible to enhance the detection of future phishing campaigns. Such learning can be
performed by a human observer or by software (i.e. via a machine learning algorithm).

CHAPTER 3
PROPOSED METHODOLOGY
With the emergence of email, the convenience of communication has led to the problem of
spam and other types of electronic attack, especially phishing attacks through email. Various
anti-phishing technologies have been proposed to solve the problem of phishing attacks, and prior
work has studied the effectiveness of phishing blacklists. Blacklists mainly include sender
blacklists and link blacklists. The detection method used here first extracts the sender’s address
and then, as a further precaution, the link addresses in the message, and checks whether the
sender or the URL is blacklisted in order to distinguish a phishing email from a legitimate one.
Working:
Figure 3.1. Process of Analysis
Updates to a blacklist of mail addresses or links are usually reported by users, and whether a
reported site is a phishing website is identified manually. At present, two well-known phishing
blacklist services are PhishTank and OpenPhish. To some extent, the completeness of the blacklist
determines the effectiveness of blacklist-based phishing email detection. The current situation
is that new threats may not only cause severe damage to customers’ computers but also aim to
steal their money and identity.

Among these threats, phishing is a noteworthy one. It is a criminal activity that uses social
engineering and technology to steal a victim’s identity data and account information. According
to reports from the Anti-Phishing Working Group, phishing has shown rapid growth in recent
years, which is one of the concerns.
The block diagram given above represents our approach to the problem. The description of each
block of the diagram is given below:
1. User: The user logs in, and the user ID is taken to identify and check the user.
2. Compose Mail: After a mail is composed, the RCNN-based detection algorithm is run on the
composed mail.
3. Detection System: This step detects whether the mail contains a malicious URL, whether the
person who composed the mail is blacklisted, or whether the URL comes from a suspicious site
that can harm the user's data.
4. Database: After the detection system has determined the type of user and URL, the mail is
stored in the database and placed in one of the following two categories:
➢ Phishing email: This category contains all emails that are sent by a blacklisted person or
contain a URL that is harmful to the user.
➢ Legitimate: This category contains all emails with clean data and no bad URL; only
legitimate data is contained in them.
5. Admin: The admin can prepare data for analysis and can also identify which emails are
phishing mails with more accuracy.
6. Result/graph: The phishing email detections are passed on to be presented as results and
graphs.
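The Database step above, in which each checked mail is stored under one of two categories, can be sketched as follows. This is a minimal illustration and not the project's actual schema: an in-memory SQLite database stands in for the MySQL database named in the requirements, and the table name, column names and function names are all assumptions.

```python
import sqlite3


def open_store() -> sqlite3.Connection:
    """Create the mail table (hypothetical schema) in an in-memory database."""
    # SQLite is used here only as a stand-in for the project's MySQL database.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE mail ("
        "id INTEGER PRIMARY KEY, "
        "sender TEXT NOT NULL, "
        "body TEXT NOT NULL, "
        "category TEXT NOT NULL "
        "CHECK (category IN ('phishing', 'legitimate')))"
    )
    return conn


def store_mail(conn: sqlite3.Connection, sender: str, body: str,
               is_phishing: bool) -> str:
    """Store one mail under the category chosen by the detection step."""
    category = "phishing" if is_phishing else "legitimate"
    conn.execute(
        "INSERT INTO mail (sender, body, category) VALUES (?, ?, ?)",
        (sender, body, category),
    )
    conn.commit()
    return category


def count_by_category(conn: sqlite3.Connection) -> dict:
    """Return the number of stored mails per category, e.g. for a graph."""
    rows = conn.execute(
        "SELECT category, COUNT(*) FROM mail GROUP BY category"
    )
    return dict(rows.fetchall())
```

Keeping the category in a single constrained column, rather than in two separate tables, makes the per-category counts needed by the admin graphs a single grouped query.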

Existing System
Various techniques for detecting phishing emails are mentioned in the literature. Over the course
of the technology's development, there have been three main types of technical method:
blacklist mechanisms, classification algorithms based on machine learning, and classification
algorithms based on deep learning. From previous work, the existing detection methods based on
the blacklist mechanism mainly rely on people identifying and reporting phishing links, which
requires a large amount of manpower and time. Detection methods based on machine learning
classification algorithms, in turn, require feature engineering to find representative features
manually, which makes them hard to transfer to new application scenarios.
Moreover, current detection methods based on deep learning are limited to word embeddings in
their representation of email content. These methods directly transfer natural language
processing (NLP) and deep learning technology while ignoring the specifics of phishing email
detection, so their results are not ideal. Given the methods mentioned above and their
corresponding problems, we set out to study phishing email detection systematically
based on deep learning. Specifically, this work makes the following contributions:
Disadvantages –
1. With respect to the particular nature of email text, we analyze the email structure and
mine the text features from four more detailed parts: the email header, the email body,
the word level, and the char level.
2. The email is modelled at multiple levels using an improved RCNN model, so that as little
noise as possible is introduced and the context information of the email is better
captured.

Proposed System
With the emergence of email, the convenience of communication has led to the problem of
massive spam, especially phishing attacks through email. Various anti-phishing technologies
have been proposed to solve the problem of phishing attacks, and prior work has studied the
effectiveness of phishing blacklists. Blacklists mainly include sender blacklists and link
blacklists. This detection method extracts the sender’s address and the link addresses in the
message and checks whether they are in the blacklist in order to decide whether the email is a
phishing email. Updates to a blacklist are usually reported by users, and whether a reported
site is a phishing website is identified manually. At present, two well-known phishing
blacklist services are PhishTank and OpenPhish. To some extent, the completeness of the blacklist
determines the effectiveness of blacklist-based phishing email detection. The current situation
is that new threats may not only cause severe damage to customers’ computers but also aim to
steal their money and identity. Among these threats, phishing is a noteworthy one. It is a
criminal activity that uses social engineering and technology to steal a victim’s identity data
and account information. According to reports from the Anti-Phishing Working Group, phishing
has shown an apparent upward trend in recent years. Similarly, the harm caused by phishing can
be imagined as well.
Advantages –
1. Phishing email refers to an attacker using a fake email to trick the recipient into
returning information such as an account password to a designated recipient.
2. Additionally, it may be used to trick recipients into entering special web pages, which
are usually disguised as real web pages, such as a bank’s web page, to convince users to
enter sensitive information such as a credit card or bank card number and password.
Although the attack of phishing email seems simple, its harm is immense.

Algorithm
R-CNN Algorithms
This section briefly summarizes the algorithms in the R-CNN family (R-CNN, Fast R-CNN,
and Faster R-CNN). This lays the ground for the implementation part, where bounding boxes
are predicted in previously unseen images (new data). R-CNN extracts a number of regions from
the given image using selective search, and then checks whether any of these boxes contains an
object. The regions are extracted first, and for each region a CNN is used to extract specific
features. Finally, these features are used to detect objects. Unfortunately, R-CNN is rather
slow because of the multiple steps involved in the process. Fast R-CNN, on the other hand,
passes the entire image to a ConvNet, which generates regions of interest (instead of passing
in regions extracted from the image). Also, instead of using three different models as R-CNN
does, it uses a single model that extracts features from the regions, classifies them into
different classes, and returns the bounding boxes. All these steps are done simultaneously,
which makes it faster than R-CNN. Fast R-CNN is, however, not fast enough when applied to a
large dataset, as it also uses selective search for extracting the regions.

Requirement Analysis
The project involved analyzing the design of a few applications so as to make the application
more user friendly. To do so, it was important to keep the navigation from one screen to
the other well ordered while reducing the amount of typing the user needs to do.
In order to make the application more accessible, the browser version had to be chosen so that
it is compatible with most browsers.
Functional Requirement
• Graphical user interface with the user.
Software Requirement
For developing the application, the following are the Software Requirements:
• Python
• Django
• MySQL
• MySQL client
• WampServer 2.4
Operating System supported
• Windows 7
• Windows XP
• Windows 8
• Windows 10

Technology and language used to develop
• Python
Debugger and Emulator
• Any Browser (Particularly Chrome)
Hardware Requirement
For developing the application, the following are the Hardware Requirements:
• Processor: Pentium IV or higher
• RAM: 256 MB
• Space on Hard Disk: minimum 512 MB

Project Modules:
• Module 1: Login/Signup module
• Module 2: Client dashboard
• Module 3: Admin dashboard
• Module 4: Applying machine learning to the data
Figure 3.3. Project Module

Login/Signup Module:
This is the first step of our project, in which the user can log in or sign up to the system
with his/her credentials. The login credentials are stored in the MySQL database
through a database connection.
Registration/ Signup Module:
The user can register with the system by filling in the details as shown in figure 3.4.
The user can then use these credentials to log in to the system and save his/her
details for further consideration. The client's personal details are entered here along
with the login details, and they can later be edited from the Client
dashboard.
Figure 3.4. Signup Module

Login Module:
Once the user has registered with the system, he or she can log in directly by filling in the
email and password as shown in figure 3.5. The same form is also used by the
admin to log in to the system and get access to the admin dashboard.
Figure 3.5. Login Module
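The signup and login flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's code: the in-memory dictionary stands in for the MySQL user table, the function names signup and login are hypothetical, and the use of a salted PBKDF2 hash is our own choice for the sketch since the report does not say how passwords are stored.

```python
import hashlib
import hmac
import os

# Hypothetical in-memory user table: email -> (salt, password hash).
# The report stores users in MySQL; a dict keeps the sketch self-contained.
_users = {}


def _hash_password(password: str, salt: bytes) -> bytes:
    """Derive a hash from the password and a per-user salt."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, 100_000
    )


def signup(email: str, password: str) -> bool:
    """Register a new user; return False if the email is already taken."""
    if email in _users:
        return False
    salt = os.urandom(16)
    _users[email] = (salt, _hash_password(password, salt))
    return True


def login(email: str, password: str) -> bool:
    """Return True only if the email exists and the password matches."""
    record = _users.get(email)
    if record is None:
        return False
    salt, stored_hash = record
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(stored_hash, _hash_password(password, salt))
```

Storing only a salted hash means that the plain-text password never needs to be kept in the database, whichever database is used.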

Client dashboard:
After logging in to the system, the client reaches the homepage, where there are
different options to select from on the Client dashboard, as shown in figure 3.6.
Figure 3.6. Client Dashboard Page
In the Client dashboard the user is given several options in the navigation bar: My
details, Compose mail, Check Phishing, View Phishing Details, Feedback and Logout.
When the user clicks on the My details button, he or she is redirected to the details
page as shown in figure 3.7.
Figure 3.7. My details Page

On the compose mail page, he or she can write a mail and send it to any other person. After
sending the mail, he or she can check the inbox to see all the emails they
have received; all those mails will be free of phishing, as shown in
figure 3.8.
Figure 3.8. compose mail Page
Once the client has checked all the mail, he or she can see how many phishing mails have been
sent to him or her, and by whom, through the phishing email option, which redirects to a new
page as shown in figure 3.9 below.
Figure 3.9. inbox mails

Once the user has gone through his or her mails, he or she can check whether the given links are
phishing links through the Check Phishing option, as shown in
figure 3.10 below.
Figure 3.10. Check Phishing details
Once user check the links he or she can see all the links history at once through view phishing
details option like shown in below figure 3.11
Figure 3.11. View Phishing details
After this Client user can click on the logout button and to come back to the login page.

Admin dashboard:
The admin logs in to the system by clicking the admin block on the main home page, as
shown in Figure 3.5.
From there, the admin enters his or her credentials to get access to the admin dashboard, as
shown in Figure 3.12.
Figure 3.12. Admin login Page
Let us now go through the tabs of the admin dashboard one by one. The first tab lists the
active users who have registered on the portal to express their views; here the admin can view
the details of each user, as seen in Figure 3.13.

Figure 3.13. Admin dashboard Page
The next tab on the dashboard is the User Details tab, where all user data are shown with
name and mail, as in Figure 3.14.
Figure 3.14. User Details

The next tab is the Analysis Graph tab, which shows the main graphical analysis result,
broken down by the types of mail that users have sent.
Figure 3.15. Analysis Page
As Figure 3.15 shows, the analysis graph is a pie chart of the types of mail (such as social
and other) that users have sent.
The next option is Phishing Attack, where the admin can see all phishing mails that users have
sent to other users and take action against those mails from the same page, as shown in
Figure 3.17.
Figure 3.17. Phishing Attack
The admin can also see the feedback given by the client users. The last tab on the dashboard
is the Logout tab, from which the admin can log out of the dashboard.

Applying machine learning to the data:
The major steps in the analysis, used to classify the mails as positive, negative, or
neutral, are:
 Data Set Description.
 Pre-Processing the Dataset.
 Feature Extraction.
 Data Visualization.
Data Set Description:
Data preprocessing is based on word embedding, which encodes the URL string into a
two-dimensional tensor that the deep learning model can accept. After data preprocessing,
each character is encoded as a fixed-length vector consisting of 0s and 1s, because the
neural network requires its input to be a vector of numbers when performing
mathematical operations.
First, we process the length of the URL string. The HTTP standard protocol document RFC2616
advises a limit on URL length: “Servers ought to be cautious about depending on
URL lengths above 255 bytes because some older client or proxy implementations might not
properly support these lengths.” We therefore fix the URL length at 255 characters: if a URL
is longer than 255 characters, only the first 255 characters are kept; if it is shorter, 0 is
appended to the end of the URL string until it is 255 characters long.
At the same time, we counted how often each character occurs across all URLs in the
dataset and selected the 59 most frequent characters as valid characters: 26 English
letters, 10 Arabic numerals, and 23 special characters including “@/: = #-.”
All other characters are grouped together as “special characters,” so each URL is treated as a
sequence of only 60 different characters. As shown in Figure 3, each character is encoded as a
60-bit binary string with a one at the position of that character's index and zeros
elsewhere. Then, we use the word2vec method from natural language processing to encode each
60-bit string as a 64-dimensional word vector. Thus, each URL is processed into a
two-dimensional matrix of size 255 × 64, which is then passed to the input of PDRCNN.
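
The length normalisation and one-hot step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's code: the 23 special characters in SPECIALS, the lower-casing of URLs, and the choice to represent padding as all-zero rows are all assumptions, because the text does not specify them. The later word2vec step is not shown.

```python
import string

MAX_LEN = 255  # URLs are truncated or padded to this many characters

# Assumed vocabulary: 26 letters + 10 digits + 23 special characters = 59.
# The exact special characters are illustrative, not taken from the dataset.
SPECIALS = "@/:=#-._?&%+~;,!$'()*[]"
VOCAB = string.ascii_lowercase + string.digits + SPECIALS
CHAR_TO_INDEX = {ch: i for i, ch in enumerate(VOCAB)}
OTHER_INDEX = len(VOCAB)    # index 59: any character outside the vocabulary
ROW_WIDTH = len(VOCAB) + 1  # 60 symbols in total


def encode_url(url):
    """Return a 255 x 60 one-hot matrix (list of lists) for one URL."""
    url = url.lower()[:MAX_LEN]  # keep only the first 255 characters
    matrix = [[0] * ROW_WIDTH for _ in range(MAX_LEN)]
    for pos, ch in enumerate(url):
        matrix[pos][CHAR_TO_INDEX.get(ch, OTHER_INDEX)] = 1
    return matrix  # rows past the end of the URL stay all zero


encoded = encode_url("http://example.com/login?id=1")
print(len(encoded), len(encoded[0]))  # 255 60
```

Each row holds at most a single 1, so the number of ones in the matrix equals the number of characters kept from the URL.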

Importing the necessary packages
Figure 3.18. Importing packages
Reading the train.csv file with Pandas

 In the first line we read the train.csv file using Pandas.
 In the second line we keep a copy of the original train data as a safe backup, so
that any changes we make to the dataset do not affect the original.
Data Pre-Processing:
Figure 3.19. steps of pre-processing

Let’s begin with the pre-processing of our dataset.
STEP — 1 :
Combine the train.csv and test.csv files.
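The Pandas DataFrame.append() function appends the rows of another dataframe to the end of
the given dataframe and returns a new dataframe object. Note that DataFrame.append() was
deprecated in pandas 1.4 and removed in pandas 2.0; pandas.concat() is its replacement.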
Overview of the combined train and test dataset.

Type combine.head() in the cell and you get the following result.
Again type combine.tail() in the cell and you get the following result.
Columns not in the original data frames are added as new columns and the new cells are
populated with NaN value.
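
A minimal sketch of this step is given here, using pandas.concat so that it also runs on current pandas versions. The two small dataframes stand in for train.csv and test.csv, and their column names (id, label, mail) are assumptions made for illustration only.

```python
import pandas as pd

# Hypothetical stand-ins for train.csv and test.csv; column names are assumed.
train = pd.DataFrame({"id": [1, 2], "label": [0, 1],
                      "mail": ["meeting at noon", "verify your account now"]})
test = pd.DataFrame({"id": [3], "mail": ["claim your free prize"]})

train_original = train.copy()  # backup, so later edits cannot touch the source

# pd.concat stacks the rows; ignore_index=True renumbers them 0..n-1.
combine = pd.concat([train, test], ignore_index=True, sort=False)

print(combine.head())
print(combine["label"].isna().sum())  # 1: the test row has no label
```

The test rows have no label column, so their label cells are filled with NaN in the combined dataframe.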
STEP — 2
Removing Garbage Data
In our analysis we can clearly see that the garbage data do not contribute anything significant to
solve our problem. So, it’s better if we remove them in our dataset.
Given below is a user-defined function to remove unwanted text patterns from the mails. It
takes two arguments, one is the original string of text and the other is the pattern of text that we
want to remove from the string. The function returns the same input string but without the given
pattern. We will use this function to remove the pattern from all the mails in our data.

Here NumPy's ‘np.vectorize()’ is used to apply the function to every mail in the column.
It is a convenience wrapper: internally it is essentially a Python loop, so it keeps the
code compact rather than making it faster than a conventional for loop.
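
The function described above might look like the following sketch. The name remove_pattern, the sample mail bodies, and the HTML-tag pattern are assumptions chosen for illustration; the report does not state which pattern was actually removed.

```python
import re
import numpy as np


def remove_pattern(text, pattern):
    """Return text with every match of the regex pattern removed."""
    return re.sub(pattern, "", text)


# Hypothetical mail bodies; the HTML-tag pattern is only an example of "garbage".
mails = np.array(["<p>Hello <b>World</b></p>", "No tags here"], dtype=object)

# otypes=[object] keeps the results as Python strings of any length.
clean = np.vectorize(remove_pattern, otypes=[object])(mails, r"<[^>]+>")
print(list(clean))  # ['Hello World', 'No tags here']
```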
STEP — 3
Removing Punctuation, Numbers, and Special Characters
Punctuation, numbers and special characters do not help much. It is better to remove them from
the text, just as we removed the garbage patterns. Here we replace everything except letters
and hashtags with spaces.
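
As a sketch of this step, a single regular-expression substitution is enough. The function name and the sample text are hypothetical.

```python
import re


def keep_letters_and_hashtags(text):
    """Replace every character that is not a letter or '#' with a space."""
    return re.sub(r"[^a-zA-Z#]", " ", text)


cleaned = keep_letters_and_hashtags("Win $1000 now!!! #prize")
print(cleaned.split())  # ['Win', 'now', '#prize']
```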
STEP — 4
Removing Short Words
We have to be a little careful in selecting the length of the words we want to
remove. Here we remove all words of length 3 or less. Many such short words are
stop words, which carry little meaning on their own.
For example, terms like “hmm”, “and”, “oh” are of very little use. It is better to get rid of them.
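
A minimal sketch of the short-word filter, assuming words are separated by whitespace; the function name and the min_length parameter are hypothetical.

```python
def remove_short_words(text, min_length=4):
    """Keep only the words with at least min_length characters."""
    return " ".join(word for word in text.split() if len(word) >= min_length)


print(remove_short_words("hmm and oh this mail looks suspicious"))
# this mail looks suspicious
```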
STEP — 5
Tokenization
Now we will tokenize all the cleaned mails in our dataset. Tokens are individual terms or words,
and tokenization is the process of splitting a string of text into tokens.
Here we tokenize our sentences because we will apply Stemming from the
“NLTK” package in the next step.
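
As an illustration, tokenization of the cleaned mails can be sketched with plain whitespace splitting; this is an assumption, since the report does not state which tokenizer was used. The sample mails are hypothetical.

```python
def tokenize(text):
    """Split a cleaned mail into a list of word tokens on whitespace."""
    return text.split()


cleaned_mails = ["this mail looks suspicious", "verify your account"]
tokenized = [tokenize(mail) for mail in cleaned_mails]
print(tokenized[0])  # ['this', 'mail', 'looks', 'suspicious']
```

Token lists of this form are what a word-level stemmer, such as the Porter stemmer shipped with NLTK, operates on one word at a time.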
These are the basic steps to follow when pre-processing a dataset containing textual data.
With data pre-processing done, we move on to the next step, feature extraction.
Feature Extraction:
We now discuss how to extract features from our textual dataset using Bag-of-Words.
Bag-of-Words Features

Bag of Words is a method to extract features from text documents. These features can be used
for training machine learning algorithms. It creates a vocabulary of all the unique words
occurring in all the documents in the training set.
Consider a corpus (a collection of texts) called C of D documents {d1, d2, ..., dD} and N unique
tokens extracted from the corpus C. The N tokens (words) form a list, and the size of the
bag-of-words matrix M is D × N. Row i of the matrix M contains the
frequency of each token in document di.
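
The D × N matrix defined above can be built by hand as in the sketch below. The function name and the two sample documents are hypothetical; a library vectorizer could be used instead, but the plain version makes the definition explicit.

```python
def bag_of_words(documents):
    """Build the vocabulary and the D x N count matrix for tokenized documents."""
    vocabulary = sorted({token for doc in documents for token in doc})
    column = {token: j for j, token in enumerate(vocabulary)}
    matrix = [[0] * len(vocabulary) for _ in documents]
    for i, doc in enumerate(documents):
        for token in doc:
            matrix[i][column[token]] += 1  # count of token j in document i
    return vocabulary, matrix


docs = [["verify", "your", "account"],
        ["your", "account", "your", "prize"]]
vocabulary, matrix = bag_of_words(docs)
print(vocabulary)  # ['account', 'prize', 'verify', 'your']
print(matrix)      # [[1, 0, 1, 1], [1, 1, 0, 2]]
```

Here D = 2 and N = 4, so M has two rows and four columns.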

CHAPTER-4
RESULT ANALYSIS AND DISCUSSION
Choosing the right machine learning algorithm:

Machine learning is part art and part science. When you look at machine learning
algorithms, there is no one solution or one approach that fits all. There are several
factors that can affect your decision to choose a machine learning algorithm.
Some problems are very specific and require a unique approach. For example, a
recommender system is a very common type of machine learning application, and it
solves a very specific kind of problem. Other problems are very open and need a
trial-and-error approach. Supervised learning, classification, regression, and so on
are very open: they could be used in anomaly detection, or to build more general
sorts of predictive models.
Besides, some of the decisions that we make when choosing a machine learning
algorithm have less to do with the optimization or the technical aspects of the
algorithm and more to do with business decisions. Below we look at some of the
factors that can help you narrow down the search for your machine learning
algorithm:
Data Science Process:
Before we start looking at different ML algorithms, we need to have a clear picture of
your data, your problem and your constraints.
a) Understand Your Data
The type and kind of data we have plays a key role in deciding which algorithm to
use. Some algorithms can work with smaller sample sets while others require tons and
tons of samples. Certain algorithms work with certain types of data. E.g. Naïve Bayes
works well with categorical input but is not at all sensitive to missing data.
Hence it is important that you:

 Know your data
   Look at summary statistics and visualizations:
   Percentiles can help identify the range for most of the data
   Averages and medians can describe central tendency
   Correlations can indicate strong relationships
 Visualize the data
   Box plots can identify outliers
   Density plots and histograms show the spread of data
   Scatter plots can describe bivariate relationships
 Clean your data
   Deal with missing values. Missing data affects some models more than others. Even
   models that handle missing data can be sensitive to it (missing data for
   certain variables can result in poor predictions).
   Choose what to do with outliers.
   Outliers can be very common in multidimensional data.
   Some models are less sensitive to outliers than others. Usually tree models are less
   sensitive to the presence of outliers. However, regression models, or any model that
   tries to use equations, could definitely be affected by outliers.
   Outliers can be the result of bad data collection, or they can be legitimate extreme
   values.
 Augment your data
   Feature engineering is the process of going from raw data to data that is ready for
   modeling. It can serve multiple purposes:
   Make the models easier to interpret (e.g. binning)
   Capture more complex relationships (e.g. NNs)
   Reduce data redundancy and dimensionality (e.g. PCA)
   Rescale variables (e.g. standardizing or normalizing)

Different models may have different feature engineering requirements. Some have
built-in feature engineering.
b) Categorize the problem
The next step is to categorize the problem. This is a two-step process.
 Categorize by input:
   If we have labelled data, it’s a supervised learning problem.
   If we have unlabelled data and want to find structure, it’s an unsupervised learning
   problem.
   If we want to optimize an objective function by interacting with an environment, it’s a
   reinforcement learning problem.
 Categorize by output:
   If the output of your model is a number, it’s a regression problem.
   If the output of your model is a class, it’s a classification problem.
   If the output of your model is a set of input groups, it’s a clustering problem.
c) Understand your constraints
 What is the data storage capacity?
Depending on the storage capacity of your system, you might not be able to store
gigabytes of classification/regression models or gigabytes of data to cluster. This is
the case, for instance, for embedded systems.
 Does the prediction have to be fast?
In real-time applications, it is obviously very important to have a prediction as fast as
possible. For instance, in autonomous driving, it’s important that the classification of
road signs be as fast as possible to avoid accidents.
 Does the learning have to be fast?
In some circumstances, training models quickly is necessary: sometimes, you need to
rapidly update, on the fly, your model with a different dataset.
d) Find the available algorithms
Now that we have a clear understanding of where we stand, we can identify the
algorithms that are applicable and practical to implement using the tools at our
disposal. Some of the factors affecting the choice of a model are:
i. Whether the model meets the business goals
ii. How much preprocessing the model needs
iii. How accurate the model is
iv. How explainable the model is
v. How fast the model is: how long it takes to build the model, and how long
the model takes to make predictions
vi. How scalable the model is
An important criterion affecting the choice of algorithm is model complexity. Generally
speaking, a model is more complex if:
i. It relies on more features to learn and predict (e.g. using two features vs ten
features to predict a target)
ii. It relies on more complex feature engineering (e.g. using polynomial terms,
interactions, or principal components)
iii. It has more computational overhead (e.g. a single decision tree vs. a random
forest of 100 trees).
Besides this, the same machine learning algorithm can be made more complex based
on the number of parameters or the choice of some hyperparameters. For example,
a regression model can have more features, or polynomial terms and interaction
terms, and a decision tree can have more or less depth. Making the same algorithm more
complex increases the chance of overfitting.

Machine Learning:
We generally use different models to see which best fits our dataset and then we use that model
for predicting results on the test data.
Here we will use 2 different models
 Logistic Regression
 RCNN
and then we will compare their performance and choose the best possible model with the best
possible feature extraction technique for predicting results on our test data.
Importing f1_score from sklearn
We will use the F1 score throughout to assess our model’s performance instead of accuracy. You
will get to know why at the end of this topic.
CODE :-
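
The original listing is not reproduced here. As a hedged illustration of the metric itself, the sketch below computes the F1 score by hand (sklearn.metrics.f1_score computes the same quantity for binary labels) and compares it with accuracy. The labels are made up to mimic an unbalanced dataset, and the function name is hypothetical.

```python
def f1_score_binary(y_true, y_pred, positive=1):
    """F1 = harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up labels: 8 legitimate mails (0) and 2 phishing mails (1).
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(accuracy)                                   # 0.9
print(round(f1_score_binary(y_true, y_pred), 3))  # 0.667
```

With few positive examples, a classifier that misses half of the phishing mails still reaches 90% accuracy, while the F1 score drops to about 0.67. This is one common reason to prefer F1 on unbalanced data.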
Now, let’s move on to applying different models on our dataset from the features extracted by
using Bag-of-Words.
1. Logistic Regression
The first model we are going to use is Logistic Regression.
Fitting the Logistic Regression Model.

Predicting the probabilities.
OUTPUT :-
Figure 3.20. Predicting the probabilities for a Phishing mail into either Positive or Negative class.
The output basically provides us with the probabilities of the mails falling into either of the
classes that is Negative or Positive.
Calculating the F1 score
Figure 3.21. F1 score of logistic regression.
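
The three steps above (fitting, predicting probabilities, and scoring) can be sketched with scikit-learn as follows. This is an illustration under stated assumptions, not the project's code: the six mails and their labels are made up, the 0.5 decision threshold is assumed, and the score is computed on the training data only to keep the example short.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score

# Hypothetical toy data; 1 = phishing, 0 = legitimate.
mails = ["verify your account now", "claim your free prize",
         "urgent password reset required", "meeting moved to friday",
         "lunch at noon tomorrow", "quarterly report attached"]
labels = [1, 1, 1, 0, 0, 0]

features = CountVectorizer().fit_transform(mails)  # bag-of-words matrix

model = LogisticRegression()
model.fit(features, labels)

# One row per mail; columns are P(class 0) and P(class 1).
probabilities = model.predict_proba(features)
predicted = (probabilities[:, 1] >= 0.5).astype(int)  # assumed 0.5 threshold

print(probabilities.shape)  # (6, 2)
print(f1_score(labels, predicted))
```

In practice the model would be fitted on the training split and scored on held-out data.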

2. RCNN (Recurrent Convolutional Neural Network)
The second and last model we use is RCNN.
Fitting the RCNN.
Predicting the probabilities.
Figure 3.22. Predicting the probability of a Phishing email being either legitimate or non-legitimate.
Calculating the F1 Score
Figure 3.23. F1 score of RCNN.

Summary
We applied machine learning models for classification, considered their advantages
and disadvantages, and looked at how to control model complexity for each of
them.
Algorithm              F1 score
RCNN                   0.572134
Logistic Regression    0.524879
We saw that for many algorithms, setting the right parameters is important for good
performance. Finally, we chose RCNN as our model because it achieved the higher F1 score.
Data Visualization:
Data visualization is one of the most important steps in machine learning projects because
it gives us an approximate idea of the dataset and what it contains before we proceed to
apply different machine learning models.
Graph:
Graph analysis is where the admin can view statistics about the system's activity. The
data are taken from the project flow and are shown up to the most recently updated values.
They give the admin a clear picture of areas for improvement, user satisfaction, and other
factors.
Result:
In the analysis of email structure, a circle represents a character and a rectangle represents a
word. A rectangle is filled with an indefinite number of circles, indicating that the word
consists of an indefinite number of characters.

CHAPTER-
CONCLUSION
We use a new deep learning model named THEMIS to detect phishing emails. The model employs
an improved RCNN to model the email header and the email body at both the character
level and the word level. Therefore, the noise is introduced into the model minimally. In
the model, we use the attention mechanism in the header and the body, making the
model pay more attention to the more valuable information between them. We use the
unbalanced dataset closer to the real-world situation to conduct experiments and
evaluate the model. The model obtains a promising result. Several experiments are
performed to demonstrate the benefits of the proposed model.

CHAPTER-
FUTURE SCOPE OF THE PROJECT
For future work, we will focus on how to improve our model for detecting phishing
emails with no email header and only an email body. The model employs an improved
RCNN to model the email header and the email body at both the character level and the
word level. Therefore, the noise is introduced into the model minimally. In the model,
we use the attention mechanism in the header and the body, making the
model pay more attention to the more valuable information between them. We use the
unbalanced dataset closer to the real-world situation to conduct experiments and
evaluate the model. The THEMIS model obtains a promising result.

FALSEDREAM
APPENDIX-A
LIST OF FIGURES
Figure                                                          Page No.
Figure 3.1. Process of Analysis                                 17
Figure 3.2. Block Diagram                                       19
Figure 3.3. Project Module                                      25
Figure 3.4. Signup Module                                       26
Figure 3.5. Login Page                                          27
Figure 3.6. Client Dashboard Page                               28
Figure 3.7.                                                     28
Figure 3.8. Upload Tweet Page                                   29
Figure 3.9. Inbox mail                                          30
Figure 3.10. Check Phishing details                             31
Figure 3.11. View Phishing details                              31
Figure 3.12. Admin login Page                                   32
Figure 3.13. Admin dashboard Page                               32
Figure 3.14. User Details                                       33
Figure 3.15. Analysis Page                                      34
Figure 3.16. steps of pre-processing                            35
Figure 3.17. Phishing Attack                                    37
Figure 3.18. after removing special character                   38
Figure 3.19. after removing stopwords                           38
Figure 3.20. Predicting the probabilities for a Phishing
             mail into either Positive or Negative class        39
Figure 3.21. stemming                                           39
Figure 3.22. Predicting the probability of a Phishing email
             into either legitimate and non-legitimate.         40
Figure 3.23. Count Vectorizer                                   41
Figure 3.24. F1 score of logistic regression.                   42

APPENDIX-B
CODE IMPLEMENTATION
UI DESIGN:
<!DOCTYPE html>
{% load static %}
<html lang="en">
<head>
<meta charset="UTF-8">
<title>Title</title>
<link href="https://fonts.googleapis.com/css?family=Russo+One&display=swap"
rel="stylesheet">
<style>
body{
background: url("{% static 'bg3.png' %}");
background-size: cover;
}
.menu table{
width:100%;
text-align:center;
font-family: 'Russo One', sans-serif;
}
h1{
}
.menu table td:hover{
background:;
}
.menu table td{
background:rgb(243, 243, 243);
}
.menu table,.menu table th,.menu table td {
border: ;
border-collapse: collapse;
}
.menu table th,.menu table td {
padding: 15px;
}
.topic h1{
color:black;
padding:2px;
text-align:center;
border-style:none;
height:100px;
width:1330px;
float:left;
}
.mainholder{
position:relative;
top:50px;
left:50px;
z-index:999;

float:left;
}
</style>
</head>
<body>
<div class="background-image">
<div class="topic"><h1 style="color:#ff0e00;margin-top:10px;margin-left:30px;border-style:none;width:1300px;height:56px;border-color:black;background:;">Phishing Email
Detection Using Improved RCNN Model with Multilevel Vectors and Attention
Mechanism</h1></div>
<div class="menu">
<table>
<tr>
<td><a style="color:#010101;text-decoration: none;" href="{% url 'mydetails'
%}">MY DETAILS</a></td>
<td><a style="color:#010101;text-decoration: none;" href="{% url 'userpage'
%}">COMPOSE MAIL</a></td>
<td><a style="color:#010101;text-decoration: none;" href="{% url 'checking' %}">
CHECK PHISHING DETAILS</a></td>
<td><a style="color:#010101;text-decoration: none;" href="{% url 'checking_attack'
%}"> VIEW PHISHING DETAILS</a></td>
<td><a style="color:#010101;text-decoration: none;" href="{% url 'feedback' %}">
FEEDBACK</a></td>
<td><a style="color:#010101;text-decoration: none;" href="{% url 'userlogin'

%}">LOGOUT</a></td>
</tr>
</table>
</div>
</div>

</div>
<div class="marqee">
</div>
<div class="mainholder">
{% block userblock %}
{% endblock %}
</div>
</body>
</html>

USER LOGIN:
<!DOCTYPE html>
<html>
<head>
<title>Phishing Email Detection</title>
<link rel="stylesheet" href="https://stackpath.bootstrapcdn.com/bootstrap/4.1.3/css/bootstrap.min.css"
integrity="sha384-MCw98/SFnGE8fJT3GXwEOngsV7Zt27NXFoaoApmYm81iuXoPkFOJwJ8ERdknLPMO"
crossorigin="anonymous">
<script src="https://ajax.googleapis.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>
<link rel="stylesheet" href="https://use.fontawesome.com/releases/v5.6.1/css/all.css" integrity="sha384-
gfdkjb5BdAXd+lj+gudLWI+BXq4IuLW5IT+brZEZsLFm++aCMlF1V92rMkPaX4PP"
crossorigin="anonymous">
<style>
body,
html {
margin: 0;
padding: 0;
height: 100%;
background: #60a3bc !important;
}
.user_card {
height: 400px;
width: 350px;
margin-top: auto;
margin-bottom: auto;
background: #f39c12;
position: relative;
display: flex;
justify-content: center;
flex-direction: column;
padding: 10px;
box-shadow: 0 4px 8px 0 rgba(0, 0, 0, 0.2), 0 6px 20px 0 rgba(0, 0, 0, 0.19);
-webkit-box-shadow: 0 4px 8px 0 rgba(0, 0, 0, 0.2), 0 6px 20px 0 rgba(0, 0, 0, 0.19);
-moz-box-shadow: 0 4px 8px 0 rgba(0, 0, 0, 0.2), 0 6px 20px 0 rgba(0, 0, 0, 0.19);
border-radius: 5px;

}
.brand_logo_container {
position: absolute;
height: 170px;
width: 170px;
top: -75px;
border-radius: 50%;
background: #60a3bc;
padding: 10px;
text-align: center;
}
.brand_logo {
height: 150px;
width: 150px;
border-radius: 50%;
border: 2px solid white;
}
.form_container {
margin-top: 100px;
}
.login_btn {
width: 100%;
background: #c0392b !important;
color: white !important;
}
.login_btn:focus {
box-shadow: none !important;
outline: 0px !important;
}
.login_container {
padding: 0 2rem;
}
.input-group-text {
background: #c0392b !important;
color: white !important;
border: 0 !important;
border-radius: 0.25rem 0 0 0.25rem !important;
}
.input_user,
.input_pass:focus {
box-shadow: none !important;
outline: 0px !important;

}
.custom-checkbox .custom-control-input:checked~.custom-control-label::before {
background-color: #c0392b !important;
}
</style>
</head>
<body>
<div class="container h-100">
<div class="d-flex justify-content-center h-100">
<div class="user_card">
<div class="d-flex justify-content-center">
<div class="brand_logo_container">
<img src="https://cdn2.iconfinder.com/data/icons/security-safety-volume-2/1000/Phishing_Attack-
512.png" class="brand_logo" alt="Logo">
</div>
</div>
<div class="d-flex justify-content-center form_container">
<form method="POST">
{% csrf_token %}
<div class="input-group mb-3">
<div class="input-group-append">
<span class="input-group-text"><i class="fas fa-user"></i></span>
</div>
<input type="text" name="email" class="form-control input_user" value="" placeholder="Email">
</div>
<div class="input-group mb-3">
<div class="input-group-append">
<span class="input-group-text"><i class="fas fa-key"></i></span>
</div>
<input type="password" name="password" class="form-control input_pass" value=""
placeholder="password">
</div>
<div class="form-group">
<div class="custom-control custom-checkbox">
<input type="checkbox" class="custom-control-input" id="customControlInline">
<label class="custom-control-label" for="customControlInline">Remember me</label>
</div>
</div>
<div class="d-flex justify-content-center mt-3 login_container">
<input type="submit" name="button" value="login" class="btn login_btn">
</div>

</form>
</div>
<div class="mt-4">
<div class="d-flex justify-content-center links">
Don't have an account? <a href="{% url 'userregister' %}" class="ml-2">Sign Up</a>
</div>
Admin - Site <a href="{% url 'login_page' %}" class="ml-2">Admin Only</a>
</div>
</div>
</div>
</div>
</div>
</div>
<h2 style="color:black;margin-top:-100px;text-align:center;">{{a}}</h2>
</body>
</html>

FEEDBACK
{% extends 'users/design.html' %}
{% load static %}
{% block userblock %}
<link href="https://fonts.googleapis.com/css?family=Russo+One&display=swap" rel="stylesheet">
<style>
.feedback{
position: absolute;
top:120px;
left:130px;
padding:10px;
width:500px;
}
.feedback table{
width:30em;
text-align:center;
border-collapse:collapse;
border-spacing:1px;
background:;
}
.feedback table tr th{

color:;
}
.feedback table tr th{
background:
padding:10px;
}
.feedback table tr td{
background:#ff0e00;
padding:10px;
}
.updatedetails table tr:hover td{
background:r);
}
.fimage{
border-style:solid;
border-width:1px;
height:370px;
width:390px;
margin-top:40px;
margin-left:740px;
background: url("{% static 'feedback.jpg' %}");
background-size: 100% 100%;
}
</style>
<div class="feedback">
<table>
<form method="post">
{% csrf_token %}
<tr>
<td style="color:black">FEEDBACK</td>
<td><textarea name="feedback" rows="4" cols="50"> </textarea></td>
</tr>
<tr>
<td style="text-align:center;" colspan="2"><input type="submit" name="submit" value="SUBMIT"
style="background:white;color:black;padding: 10px;
border-radius: 10px;"></td> </tr>
</form>
</table>
</div>
<div class="fimage"></div>
{% endblock %}

CHECKING ATTACK
<link href="https://fonts.googleapis.com/css?family=Russo+One&display=swap" rel="stylesheet">
<style>
.viewfeedback{
position: absolute;
top: 50px;
left: -27px;
padding: 5px;
height: 300px;
width: 1275px;
float: left;
overflow: scroll;
}
.viewfeedback table{
width:40em;
text-align:center;
border:2;
border-spacing:1px;
background:;
}
.viewfeedback table tr th{

background:rgb(0,139,139);

padding:5px;
}
.viewfeedback table tr td{
background:#ff0e00;
padding:5px;
}
.viewfeedback table tr:hover td{
background:rgba();
}
.any{
border-color:rgba(199,21,133,0.8);
border-width:1px;
height:380px;
width:360px;
margin-top:50px;
margin-left:830px;
background: url("{% static '20.jpg' %}");
float:left;
}
</style>
<div class="viewfeedback">
<form method="post">
{% csrf_token %}
<table border="2">
<tr>
<th style="color:">User Name</th>
<th style="color:">Website</th>
<th style="color:">Attack Details</th>
</tr>

{% for a in obj %}
<tr>
<td style="color:white">{{a.usid.userid}}</td>
<td style="color:white">{{a.website}}</td>
<td style="color:white">{{a.atk}}</td>
</tr>
{% endfor %}
<tr></tr>
</table>
</form>
</div>
<div class="any"></div>

VIEW MAIL
<link href="https://fonts.googleapis.com/css?family=Russo+One&display=swap" rel="stylesheet">
<style>
body{
}
.index{
border-style:none;
height:50px;
width:300px;
background:blue;
margin-left:450px;
text-align:center;
margin-top:-30px;
}
.mailtable{
position: absolute;
margin-top:50px;
left:120px;
padding:5px;
height:350px;
width:750px;
overflow:scroll;
float:left;
}

.mailtable table{
width:50em;
text-align:center;
border-spacing:1px;
background:;
}
.mailtable table tr th{

padding:5px;
}
.mailtable table tr td{
background:#ff0e00;
padding:5px;
}
.mailtable table tr:hover td{
background:rgba(0,00.5);
}
.viewimage{
border-style:solid;
border-width:1px;
height:350px;
width:400px;
margin-top:-400px;
margin-left:900px;
background: url("{% static 'gif.gif' %}");
}
.compose{
border-style:none;

border-width:1px;
height:254px;
width:100px;
margin-top:-300px;
margin-left:0px;
background: rgb(243, 243, 243);
}
</style>
</head>
<body>
<div class="mailtable">
<table>
<tr>
<th>MAIL ID</th>
<th>CHAT</th>
<th>DELETE</th>
</tr>
{% for o in form %}
<tr>
<td>{{o.to}}</td>
<td>{{o.chat}}</td>
<td ><a href="{% url 'deleteobj' o.id %}" style="text-decoration:none;color:black">Delete</a></td>
</tr>

{% endfor %}
</table>
</div>
<div class="index"> <h3 style="color:white;padding-top:15px;">INBOX MESSAGE</h3></div>

<button type="button" style="background:RED;margin-top:400px;margin-left:600px;"><a href="{% url
'userpage' %}" style="color:white;text-decoration:none">BACK</a></button>
<div class="viewimage"></div>
<div class="compose">
<a href="{% url 'userpage' %}" style="text-decoration:none;color:black;"> <p
style="color:yellow;text-decoration:none;margin-left:20px;">COMPOSE </p></a><br>
<a href="{% url 'viewmailpage' %}" style="text-decoration:none;color:black;"><p
style="color:black;text-decoration:none;margin-top:50px;margin-left:20px;"> INBOX </p></a><br>
<a href="{% url 'spampage' %}" style="color:black;text-decoration:none;"><p
style="color:black;text-decoration:none;margin-top:50px;margin-left:20px;"> PHISHING MAIL
</p></a><br>
</div>
{% endblock %}

ANALYSIS PAGE
{% extends 'admins/admin_design.html' %}
{% block adminblock %}
<link
href="https://fonts.googleapis.com/css?family=Russo+One&display=swap"
rel="stylesheet">
<style>
.category{
position: absolute;
margin-top:50px;
left:px;
padding:5px;
height:350px;
width:1150px;
overflow:scroll;
float:left;
}
.category table{
width:70em;
text-align:center;
border-spacing:1px;
background:;
}
.category table tr th{

padding:5px;
}
.category table tr td{
background:#ff0e00;
padding:5px;
}

</style>
<div class="category">
<table>
<tr>
<th>SENDER MAIL</th>
<th>TO MAIL</th>
<th>SUBJECT</th>
<th>CHAT</th>
<th>CATEGORY</th>
<th>DELETE</th>
</tr>
{% for o in obj %}
<tr>
<td>{{o.sendermail}}</td>
<td>{{o.to}}</td>
<td>{{o.subject}}</td>
<td>{{o.chat}}</td>
<td>{{o.category}}</td>
<td ><a href="{% url 'analysisdelete' o.id %}" style="text-decoration:none;color:black">Delete</a></td>
</tr>
{% endfor %}
</table>
</div>
<div class="sideimage"></div>
{%endblock%}

USER REGISTER
<link href="//netdna.bootstrapcdn.com/bootstrap/3.0.3/css/bootstrap.min.css" rel="stylesheet" id="bootstrap-css">
<script src="//netdna.bootstrapcdn.com/bootstrap/3.0.3/js/bootstrap.min.js"></script>
<script src="//cdnjs.cloudflare.com/ajax/libs/jquery/3.2.1/jquery.min.js"></script>

<body>
<div class="container h-100">
<div class="d-flex justify-content-center h-100">
<div class="user_card">
<div class="d-flex justify-content-center">
<div class="brand_logo_container">
<img src="https://cdn2.iconfinder.com/data/icons/security-safety-volume-
2/1000/Phishing_Attack-512.png" class="brand_logo" alt="Logo">
</div>
</div>
<div class="d-flex justify-content-center form_container">
<form method="POST">
{% csrf_token %}
<div class="input-group mb-3">
<div class="input-group-append">
<span class="input-group-text"><i class="fas fa-user"></i></span>
</div>
<input type="text" name="username" class="form-control input_user" value=""
placeholder="Admin Id">
</div>
<div class="input-group mb-3">
<div class="input-group-append">
<span class="input-group-text"><i class="fas fa-key"></i></span>
</div>
<input type="password" name="password" class="form-control input_pass" value=""
placeholder="password">
</div>
<div class="form-group">
<div class="custom-control custom-checkbox">
<input type="checkbox" class="custom-control-input" id="customControlInline">
<label class="custom-control-label" for="customControlInline">Remember me</label>
</div>
</div>
<div class="d-flex justify-content-center mt-3 login_container">

<input type="submit" name="button" value="login" class="btn login_btn">
</div>
</form>
</div>
<div class="mt-4">
</div>
</div>
</div>
</div>
</div>
</body>
</html>

REFERENCES
 Y. Fang, C. Zhang, C. Huang, L. Liu, and Y. Yang, "Phishing Email Detection
Using Improved RCNN Model With Multilevel Vectors and Attention
Mechanism," in IEEE Access, vol. 7, pp. 56329-56340, 2019, doi:
10.1109/ACCESS.2019.2913705.
 Vinayakumar R, Barathi Ganesh HB, Anand Kumar M, Soman KP,
Prabaharan Poornachandran, and A. Verma. "DeepAnti-PhishNet: Applying deep
neural networks for phishing email detection." In Proc. 1st AntiPhishing Shared
Pilot 4th ACM Int. Workshop Secur. Privacy Anal. (IWSPA), pp. 1-11. Tempe,
AZ, USA, 2018.
 Alauthman, Mohammad. "Botnet Spam E-Mail Detection Using Deep
Recurrent Neural Network." International Journal 8, no. 5 (2020).
 Almomani, Ammar, Brij B. Gupta, Samer Atawneh, Andrew Meulenberg, and
Eman Almomani. "A survey of phishing email filtering techniques." IEEE
Communications Surveys & Tutorials 15, no. 4 (2013): 2070-2090.
 Gavves, E., Fernando, B., Snoek, C. G., Smeulders, A. W., and Tuytelaars, T.
(2015). Local alignments for fine-grained categorization. International Journal of
Computer Vision, 111(2):191–212.
 P. Somervuo, A. Härmä and S. Fagerlund, “Parametric Representations of Bird
Sounds for Automatic Species Recognition”, IEEE Trans. Audio, Speech, Lang.
Process., Vol. 14, No. 6, pp. 2252–2263, November 2006.
