Phishing Seminar

Name of topic - Phishing

NAME - Aditya Ashok Ghadge


Roll No - 19
Name of guide - Ms. Preeti Joshi
Index
1. Introduction
2. Background and related work
3. Problem statement
4. Motivation and need
5. Literature survey
6. Conclusion
7. Future scope
8. References
Introduction
What is phishing -
Phishing is the act of attempting to acquire information such as
usernames, passwords and credit card details by masquerading as a
trustworthy entity in an electronic communication.

Communications purporting to be from popular social websites, auction
sites, online payment processors or IT administrators are commonly used to
lure the unsuspecting public. Phishing emails may contain links to websites
that are infected with malware.
Background and related work
How phishing works -

Phishing attacks typically rely on social engineering techniques applied to email or other
electronic communication methods. Some methods include direct messages sent over
social networks and SMS text messages.

Phishers can use public sources of information, typically social networks like LinkedIn,
Facebook and Twitter, to gather background information about the victim's personal and
work history, interests and activities. These sources are normally used to uncover information
such as names, job titles and email addresses of potential victims. This information can then
be used to craft a believable email.
The victim then receives a message that appears to have been sent by a known contact or
organization.
Some examples of phishing
Phishing Detection Using Machine Learning Algorithm

DATASET -
URLs of benign websites were collected from www.alexa.com, and URLs
of phishing websites were collected from www.phishtank.com. The data set
consists of 36,711 URLs in total: 17,058 benign URLs and
19,653 phishing URLs. Benign URLs are labelled as “0” and phishing URLs
are labelled as “1”.
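
The labelling scheme can be sketched in a few lines of Python. This is only an illustration: the function names and the placeholder URLs are assumptions, not part of the original work, and the real lists would come from the two sources named above.

```python
BENIGN, PHISHING = 0, 1  # class labels as described in the dataset section

def build_dataset(benign_urls, phishing_urls):
    """Pair every URL with its class label (hypothetical helper)."""
    return [(u, BENIGN) for u in benign_urls] + [(u, PHISHING) for u in phishing_urls]

def phishing_share(dataset):
    """Fraction of the dataset that is labelled as phishing."""
    return sum(label for _, label in dataset) / len(dataset)

# Placeholder URLs only; they stand in for the collected benign and phishing lists.
dataset = build_dataset(
    ["http://benign.example/a", "http://benign.example/b"],
    ["http://phish.example/login"],
)
```

With the reported counts, 17,058 + 19,653 = 36,711 URLs, so roughly 53.5% of the data set is phishing and the two classes are close to balanced.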
Feature Extraction

1. Presence of IP address in URL

2. Presence of @ symbol in URL

3. Number of dots in Hostname

4. HTTPS token in URL

5. Presence of Unicode in URL

6. Length of Host name
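
The six features above could be computed from a raw URL roughly as follows. This is a minimal sketch under stated assumptions: the feature names are hypothetical, only IPv4 addresses are detected, "HTTPS token" is taken to mean the string "https" appearing inside the host name, and "Unicode" is taken to mean any non-ASCII character. The original work may define these features differently.

```python
import re
from urllib.parse import urlsplit

# Matches a dotted-quad IPv4 host such as 192.168.0.1 (IPv6 is not handled).
IPV4_HOST = re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")

def extract_features(url):
    """Return the six URL features as a dict of integers (names are hypothetical)."""
    host = urlsplit(url).hostname or ""
    return {
        "has_ip": int(bool(IPV4_HOST.match(host))),             # 1. IP address as host
        "has_at": int("@" in url),                              # 2. @ symbol anywhere in URL
        "dots_in_host": host.count("."),                        # 3. number of dots in host name
        "https_token": int("https" in host),                    # 4. "https" inside the host name (assumed meaning)
        "has_unicode": int(any(ord(ch) > 127 for ch in url)),   # 5. any non-ASCII character
        "host_length": len(host),                               # 6. length of host name
    }
```

For example, `extract_features("http://192.168.0.1/login")` flags the IP address and reports three dots in the host name.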


Machine Learning Algorithm

1. Decision tree algorithm - The tree begins by choosing the best splitter from the
available attributes, which becomes the root of the tree. The algorithm continues to
build the tree until it reaches leaf nodes. The decision tree creates a training model
that is used to predict the target value or class. In the tree representation, each
internal node corresponds to an attribute and each leaf node corresponds to a class
label. The Gini index and information gain are used to choose these decision nodes.
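
To make the two splitting criteria concrete, the sketch below computes Gini impurity and entropy for a list of class labels and scores a candidate split by how much it reduces impurity. The function names are illustrative assumptions; this is not the code used in the original work.

```python
from collections import Counter
from math import log2

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def entropy(labels):
    """Shannon entropy of the class distribution, in bits."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def split_score(parent, left, right, impurity):
    """Impurity of the parent minus the size-weighted impurity of its children.

    With impurity=entropy this is the information gain of the split.
    """
    n = len(parent)
    weighted = (len(left) / n) * impurity(left) + (len(right) / n) * impurity(right)
    return impurity(parent) - weighted
```

The attribute whose split gives the highest score is chosen as the splitter for that node.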
2. Random forest algorithm - The random forest algorithm creates a forest
from a number of decision trees. A larger number of trees generally gives
higher detection accuracy. Each tree is created using the bootstrap method,
that is, from a random sample drawn with replacement from the training data.
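
The bootstrap step, and one common way to combine the trees' outputs, can be sketched as below. The helper names are hypothetical, and majority voting is an assumption here: the text above does not say how the individual tree predictions are combined.

```python
import random
from collections import Counter

def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement: one bootstrap sample for one tree."""
    return [rng.choice(rows) for _ in rows]

def majority_vote(predictions):
    """Return the class label predicted by the most trees (assumed combination rule)."""
    return Counter(predictions).most_common(1)[0][0]

rng = random.Random(0)          # fixed seed so the sketch is repeatable
rows = list(range(10))          # stand-in for ten training rows
sample = bootstrap_sample(rows, rng)
```

Because sampling is done with replacement, some rows appear more than once in a sample and others not at all, which is what makes the trees differ from one another.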

3. Support vector machine algorithm - In the support vector machine algorithm,
each data item is plotted as a point in n-dimensional space, and the algorithm
constructs a separating boundary between the two classes. This boundary is
known as a hyperplane.
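
Once a linear hyperplane has been learned, classifying a point amounts to checking which side of it the point falls on. The sketch below shows only that decision step, with made-up weights and bias; finding the hyperplane itself (the training step) is not shown, and the function names are assumptions.

```python
def decision_value(weights, bias, features):
    """Signed score w . x + b; its sign tells which side of the hyperplane x is on."""
    return sum(w * x for w, x in zip(weights, features)) + bias

def classify(weights, bias, features):
    """Return 1 (phishing) on the non-negative side of the hyperplane, else 0 (benign)."""
    return 1 if decision_value(weights, bias, features) >= 0 else 0
```

For instance, with the hypothetical weights `[1.0, -1.0]` and bias `0.0`, the point `[2.0, 1.0]` is classified as 1 and the point `[1.0, 2.0]` as 0.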
Implementation and result

The scikit-learn library has been used to import the machine learning algorithms.

The dataset is divided into training and testing sets in 50:50, 70:30 and 90:10
ratios. Each classifier is trained on the training set, and the testing set
is used to evaluate its performance. Performance has been evaluated by
calculating each classifier's accuracy score, false negative rate
and false positive rate.
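
The split and the three evaluation measures can be written out in plain Python as follows. This is a stand-in for illustration, not the scikit-learn code used in the work: the function names are assumptions, and phishing (label 1) is treated as the positive class.

```python
import random

def split_dataset(samples, train_fraction, seed=0):
    """Shuffle a copy of the samples and cut it into training and testing parts."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def evaluate(y_true, y_pred):
    """Accuracy, false positive rate and false negative rate for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # benign flagged as phishing
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # phishing missed
    return {
        "accuracy": (tp + tn) / len(pairs),
        "false_positive_rate": fp / (fp + tn) if fp + tn else 0.0,
        "false_negative_rate": fn / (fn + tp) if fn + tp else 0.0,
    }
```

Calling `split_dataset(samples, 0.5)`, `split_dataset(samples, 0.7)` and `split_dataset(samples, 0.9)` gives the three ratios described above.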
Result -
Motivation and need
Motivations of Phishing Criminals-

Phishers, the people who phish others (i.e., victims), have reasons for
doing so. They are all criminals and con artists, each pretending to be something
they are not in order to trick people into revealing sensitive information or into
running a Trojan horse program. They are broken people with poor morals
looking to gain something they could not otherwise get with honesty, integrity,
or hard work. But they have their reasons and motivations.
Literature survey
Conclusion
This paper aims to improve the detection of phishing websites using
machine learning technology. We achieved 97.14% detection accuracy using
the random forest algorithm, which also had the lowest false positive rate. The
results also show that classifiers perform better when more data is used for
training. In future, a hybrid technology will be implemented to detect phishing
websites more accurately, combining the random forest algorithm with a
blacklist method.
References
1. A. K. Jain and B. B. Gupta, “A novel approach to protect against phishing attacks at client side
using auto-updated white-list,” EURASIP Journal on Information Security, vol. 2016, article 9,
11 pages, 2016.
2. G. A. Montazer and S. Yarmohammadi, “Detection of phishing attacks in Iranian e-banking
using a fuzzy-rough hybrid system,” Applied Soft Computing, vol. 35, pp. 482–492, 2015.
3. B. Gu, V. S. Sheng, Z. Wang, D. Ho, S. Osman, and S. Li, “Incremental learning for ν-Support
Vector Regression,” Neural Networks, vol. 67, pp. 140–150, 2015.
4. R. Mohammad, F. Thabtah, and L. McCluskey, “Phishing websites dataset,” 2015. Available:
https://archive.ics.uci.edu/ml/datasets/Phishing+Websites. Accessed January 2016.
Thank You!
