Phishing Attacks Surge During COVID-19, Targeting Individuals and Organizations Globally. Cybercriminals Use Social Engineering To Trick Users Into Sharing Sensitive Data, Emphasizing The Need Fo

Assessing Machine Learning Tools for Web Page Phishing
Detection: A Performance Evaluation

&
Modernizing Phishing Defense: A Groundbreaking Ensemble
Machine Learning Approach
Presented By: Bakkireddygari Sai Sravanthi

Student Id: 112121037
Table of Contents
Introduction
Ensemble Model Architecture
PhishNet Overview
Methodology Overview
Data Collection and Pre-Processing
Feature Extraction
Results
Rule Extraction
PhishNet Implementation
Conclusion and Future Work
Introduction
• Phishing attacks surged during COVID-19, targeting individuals and organizations globally.
• Cybercriminals use social engineering to trick users into sharing sensitive data, which underscores the need for strong security measures.
• This study introduces an ensemble ML approach integrated into a browser extension, PhishNet, to combat phishing threats.
Ensemble Model Architecture
Random Forest Classifier (RFC) combined with:
￭ Artificial Neural Network (ANN)
￭ k-Nearest Neighbors (KNN)
￭ Decision Tree (C4.5)
The ensemble model combines the predictions of diverse classifiers to improve phishing detection accuracy and robustness.
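The slides do not state how the Random Forest is combined with each partner classifier. As a minimal sketch, assuming soft voting (averaging each model's estimated phishing probability), the combination step could look like the following. The function name `soft_vote`, the 0.5 threshold, and the example probabilities are illustrative assumptions, not values from the study.

```python
from statistics import mean


def soft_vote(probabilities, threshold=0.5):
    """Average per-model phishing probabilities and apply a threshold.

    `probabilities` holds one P(phishing) estimate per base classifier.
    Soft voting is an assumption here; the slides do not name the scheme.
    """
    if not probabilities:
        raise ValueError("need at least one model probability")
    score = mean(probabilities)
    label = "phishing" if score >= threshold else "legitimate"
    return label, score


# Hypothetical outputs for one URL from a random forest and a k-NN model.
rfc_prob, knn_prob = 0.80, 0.60
label, score = soft_vote([rfc_prob, knn_prob])
print(label, round(score, 2))
```

Averaging lets a confident model outweigh an uncertain one, which is one common reason to prefer soft voting over a simple majority vote when only two classifiers are paired.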
PhishNet Overview
PhishNet is a browser extension for Google Chrome designed to detect phishing websites.
It analyzes webpage characteristics in real time and alerts users if a phishing attempt is detected, enhancing web security.
Methodology Overview
Data Collection: Obtained 1,000 phishing URLs from PhishTank and 400 legitimate Internet banking URLs.
Feature Extraction: Extracted 14 features, including IP address, SSL security, and URL characteristics.
Model Building and Training: Trained SVM, Random Forest, and k-NN models using Python's scikit-learn library.
Methodology Overview
Model Assessment: Evaluated model performance using accuracy, true positive rate, and true negative rate.
Rule Extraction: Extracted decision rules from the best-performing model (Random Forest).
PhishNet Implementation: Integrated the extracted rules into a Google Chrome extension using web technologies.
Data Collection and Pre-Processing
The collected dataset comprised 1,000 phishing URLs and 400 legitimate Internet banking URLs.
The training dataset, sourced from the UCI Machine Learning Repository, comprises 11,055 instances and 30 features:
⚬ Phishing URLs: 55.69%
⚬ Legitimate URLs: 44.31%
Feature Extraction
⚬ Presence of IP address in the URL
⚬ SSL security availability
⚬ Number of dots in the URL
⚬ Length of the URL
⚬ Presence of "@" symbol in the URL
⚬ Subdomains
⚬ Domain registration length
⚬ Redirects
⚬ Website popularity
⚬ Website age
⚬ Unusual characters
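Several of the listed features can be read directly from the URL string. The sketch below is one possible way to compute them with Python's standard library; it is not the study's extraction code. The function name `extract_url_features`, the dictionary keys, and the two-label registered-domain assumption for counting subdomains are all illustrative. Features such as domain registration length, website age, and popularity need external lookups and are left out.

```python
import ipaddress
from urllib.parse import urlsplit


def extract_url_features(url):
    """Compute lexical phishing features that need only the URL string."""
    parts = urlsplit(url)
    host = parts.hostname or ""

    # An IP literal in place of a domain name is a classic phishing signal.
    try:
        ipaddress.ip_address(host)
        has_ip = True
    except ValueError:
        has_ip = False

    labels = host.split(".") if host and not has_ip else []
    return {
        "has_ip_address": has_ip,
        # Scheme check only; a rough proxy for SSL availability.
        "uses_https": parts.scheme == "https",
        "dot_count": url.count("."),
        "url_length": len(url),
        "has_at_symbol": "@" in url,
        # Naive assumption: the registered domain is the last two labels.
        "subdomain_count": max(len(labels) - 2, 0),
    }


suspicious = extract_url_features("http://192.168.0.1/login@secure.example.com")
ordinary = extract_url_features("https://login.secure.example.com/account")
print(suspicious)
print(ordinary)
```

The subdomain count is deliberately simple and would miscount hosts under multi-part suffixes such as `co.uk`; a production version would consult a public suffix list.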
Results
Ensemble Model Performance:
⚬ RFC + ANN achieved an F1-score of 0.975 and an accuracy of 97.16%.
⚬ RFC + KNN achieved the highest accuracy, 97.33%, with an F1-score of 0.976.
⚬ RFC + C4.5 achieved an F1-score of 0.976 and an accuracy of 96.36%.
Model Building and Training
Random Forest:
⚬ Achieved an accuracy of 98.35%.
⚬ Achieved a true positive rate of 100% and a true negative rate of 90.48%.
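For reference, the reported metrics all derive from the four confusion-matrix counts. The sketch below shows the standard formulas. The slides do not give the actual confusion matrix, so the counts used here (100 phishing and 21 legitimate test URLs, with 2 false positives) are hypothetical values chosen only because they are consistent with the reported Random Forest rates.

```python
def classification_metrics(tp, fn, tn, fp):
    """Derive standard binary-classification metrics from confusion counts.

    Phishing is treated as the positive class.
    """
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "true_positive_rate": tp / (tp + fn),
        "true_negative_rate": tn / (tn + fp),
        "f1_score": 2 * tp / (2 * tp + fp + fn),
    }


# Hypothetical counts (not from the slides), consistent with the reported rates.
metrics = classification_metrics(tp=100, fn=0, tn=19, fp=2)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
# accuracy: 0.9835
# true_positive_rate: 1.0000
# true_negative_rate: 0.9048
# f1_score: 0.9901
```

With a test set this skewed toward phishing URLs, accuracy is dominated by the true positive rate, which is why the true negative rate is worth reporting separately.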
Rule Extraction
Decision rules extracted from the trained Random Forest's decision trees highlight the key features that signal phishing behavior.
These rules drive PhishNet's detection system, enabling immediate identification of potential phishing attempts.
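To illustrate how extracted rules can drive detection without running a model in the browser, here is a minimal sketch of a rule engine over a feature dictionary. The rules, thresholds, and feature names below are hypothetical; the slides do not list the rules actually extracted. The extension itself is built with web technologies, so Python is used here only for illustration.

```python
# Hypothetical rules: (description, predicate over a feature dict).
# The study's actual extracted rules are not given in the slides.
RULES = [
    ("IP address used as host", lambda f: f["has_ip_address"]),
    ('"@" symbol in URL', lambda f: f["has_at_symbol"]),
    (
        "no HTTPS and more than 4 dots",
        lambda f: not f["uses_https"] and f["dot_count"] > 4,
    ),
]


def matched_rules(features):
    """Return the description of every rule that fires for these features."""
    return [description for description, predicate in RULES if predicate(features)]


def is_phishing(features):
    """Flag the page if at least one rule fires."""
    return bool(matched_rules(features))


page = {
    "has_ip_address": True,
    "has_at_symbol": False,
    "uses_https": False,
    "dot_count": 5,
}
print(is_phishing(page), matched_rules(page))
```

Returning the matched rule descriptions, not just a verdict, would let an extension tell the user why a page was flagged.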
PhishNet Implementation
Screenshots or diagrams illustrating PhishNet's intuitive interface and its seamless integration into the Google Chrome browser.
Highlight the user-friendly nature of PhishNet and its proactive role in safeguarding users against phishing attacks.
Screenshot: PhishNet analyzes a page.
Screenshot: PhishNet detects a phishing site.
Conclusion and Future Work
The study's findings highlight the efficacy of the proposed ensemble model and its integration into the PhishNet browser extension.
Future research may include evaluating performance on diverse datasets, refining feature extraction techniques, and extending PhishNet's capabilities.