Phishing Seminar
Phishing attacks typically rely on social engineering techniques applied to email or other
electronic communication methods. Other channels include direct messages sent over
social networks and SMS text messages.
Phishers can use public sources of information, typically social networks such as LinkedIn,
Facebook and Twitter, to gather background information about the victim's personal and work
history, interests and activities. These sources are normally used to uncover information
such as the names, job titles and email addresses of potential victims. This information can
then be used to craft a believable email.
The victim receives a message that appears to have been sent by a known contact or
organization.
Some examples used for phishing
Phishing Detection Using Machine Learning Algorithm
DATASET -
URLs of benign websites were collected from www.alexa.com, and URLs of
phishing websites were collected from www.phishtank.com. The data set
consists of 36,711 URLs in total: 17,058 benign URLs and 19,653 phishing
URLs. Benign URLs are labelled “0” and phishing URLs are labelled “1”.
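The labelling convention above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name build_labelled_dataset and the placeholder URLs are hypothetical and are not taken from the original data set.

```python
from collections import Counter

BENIGN, PHISHING = 0, 1  # label convention described in the text


def build_labelled_dataset(benign_urls, phishing_urls):
    """Pair every URL with its class label: 0 for benign, 1 for phishing."""
    rows = [(url, BENIGN) for url in benign_urls]
    rows += [(url, PHISHING) for url in phishing_urls]
    return rows


# Hypothetical placeholder URLs, for illustration only.
benign = ["https://example.org/", "https://example.com/about"]
phishing = ["http://login-example.invalid/verify"]

dataset = build_labelled_dataset(benign, phishing)

# Count how many rows carry each label to check the class balance.
counts = Counter(label for _, label in dataset)
```

Applied to the full data set, the same count would be expected to give 17,058 rows with label 0 and 19,653 rows with label 1.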
Feature Extraction
1. Decision tree algorithm - The tree begins by choosing the best splitter from the
available attributes; this attribute becomes the root of the tree. The algorithm then
continues to build the tree until it reaches the leaf nodes. The decision tree creates a
training model that is used to predict the target value or class. In the tree representation,
each internal node corresponds to an attribute and each leaf node corresponds to a class
label. The Gini index and information gain are the methods used to choose these decision
nodes.
2. Random forest algorithm - The random forest algorithm creates a forest
made up of a number of decision trees. A higher number of trees generally
gives higher detection accuracy. Each tree is built using the bootstrap
method, i.e. on a sample drawn with replacement from the training data.
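The building blocks named in the two items above can be sketched as follows. This is a minimal illustration using the textbook definitions of the Gini index, entropy-based information gain, bootstrap sampling and majority voting; the function names are hypothetical, and it is not the implementation used in the seminar.

```python
import math
import random
from collections import Counter


def gini(labels):
    """Gini index of a list of class labels (0 means the node is pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(parent, left, right):
    """Entropy reduction obtained by splitting parent into left and right."""
    n = len(parent)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - weighted


def bootstrap_sample(rows, rng):
    """Draw len(rows) items with replacement, as done for each forest tree."""
    return [rng.choice(rows) for _ in rows]


def majority_vote(predictions):
    """Combine the per-tree predictions into one forest prediction."""
    return Counter(predictions).most_common(1)[0][0]


# Two benign (0) and two phishing (1) samples, split perfectly by an attribute.
parent = [0, 0, 1, 1]
gain = information_gain(parent, [0, 0], [1, 1])
sample = bootstrap_sample(list(range(10)), random.Random(0))
vote = majority_vote([1, 0, 1])
```

A splitter with a higher information gain (or a lower weighted Gini index) separates the two classes better, which is why it is preferred when choosing a decision node.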
Phishers, the people who phish others (i.e., victims), have reasons for
doing so. They are all criminals and con artists, each pretending to be
something they are not in order to trick people into revealing sensitive
information or into running a Trojan horse program. They are broken people
with poor morals looking to gain something they could not otherwise get with
honesty, integrity, or hard work. But they have their reasons and motivations.
Literature survey
Conclusion
This paper aims to enhance the detection of phishing websites using
machine learning technology. We achieved 97.14% detection accuracy using
the random forest algorithm, with the lowest false positive rate. The results
also show that classifiers perform better when more data is used as training
data. In future, a hybrid technology that combines the random forest algorithm
of machine learning technology with a blacklist method will be implemented
to detect phishing websites more accurately.
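For reference, the two metrics quoted above can be computed from true and predicted labels as in the sketch below. It uses the standard definitions of accuracy and false positive rate with the 0/1 labelling from the data set; the function names and the sample labels are made up for illustration and do not reproduce the reported results.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn), treating label 1 (phishing) as positive."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def accuracy(tp, fp, tn, fn):
    """Share of all URLs that were classified correctly."""
    total = tp + fp + tn + fn
    return (tp + tn) / total if total else 0.0


def false_positive_rate(fp, tn):
    """Share of benign URLs that were wrongly flagged as phishing."""
    return fp / (fp + tn) if (fp + tn) else 0.0


# Made-up labels: three benign URLs followed by two phishing URLs.
y_true = [0, 0, 0, 1, 1]
y_pred = [0, 1, 0, 1, 1]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
acc = accuracy(tp, fp, tn, fn)
fpr = false_positive_rate(fp, tn)
```

A low false positive rate matters here because every false positive is a legitimate website that would be blocked or flagged for the user.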
Thank You!