RSI Project AI For Cybersecurity


Project: ML-based attack detection

Objective
The aim is to understand and analyze network behavior under normal operating conditions and during specific
types of cyber attacks. The project involves capturing network traffic during a normal usage scenario and then
during three distinct attack scenarios: a scanning attack, a Denial-of-Service (DoS) attack, and a
Man-in-the-Middle (MITM) attack.

1 Task 1: Simulation of Attacks


1.1 Task 1.1: Attack execution
You will set up a network comprising two virtual machines (VMs) or physical machines, one serving as the attacker
and the other as the victim. Write Python scripts utilizing the Scapy library to execute the following three attacks:
Xmas Scan (Scanning Attack), UDP Flood (Denial-of-Service Attack), and DNS Cache Poisoning (Man-in-the-Middle
Attack).

• Xmas Scan (Scanning Attack): Collect network traffic using Wireshark or tcpdump on the victim machine
for the duration necessary to complete two or three scans. This method helps in understanding the traffic
pattern generated by scanning attacks.
• DNS Cache Poisoning (Man-in-The-Middle Attack): Similarly, the collection period should cover two
or three instances of the attack to capture a comprehensive dataset of the DNS manipulation efforts.
• UDP Flood (Denial-of-Service Attack): Given the high volume of traffic generated by this attack,
consider a shorter collection period of 10 to 15 minutes. This timeframe is sufficient to understand the
intensity and impact of the DoS attack without overwhelming storage with excessive data.
• Normal traffic: It is crucial to collect normal network traffic to serve as a baseline. To do this, launch
Wireshark while using your machine normally and collect traffic for 30 to 60 minutes (a scripted alternative is
sketched below).
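
If you prefer to automate the capture rather than drive Wireshark or tcpdump by hand, Scapy can also sniff traffic
and write it to a PCAP file. The following is only a minimal sketch; the interface name, capture duration, and
output file name are placeholders to adapt to your setup (packet capture typically requires root privileges).

# Minimal capture sketch with Scapy, offered as an alternative to Wireshark/tcpdump.
# "eth0", the 60-minute duration, and the output file name are placeholders.
from scapy.all import sniff, wrpcap

packets = sniff(iface="eth0", timeout=60 * 60)  # capture packets for up to 60 minutes
wrpcap("normal_traffic.pcap", packets)          # write the capture to a PCAP file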

Execute the attacks one at a time; for each attack, capture the network traffic on the victim machine and save it
as a PCAP file. This structured approach allows for a detailed analysis of each attack's characteristics and effects
on the network, compared to normal traffic patterns. Figure 1 illustrates this process.
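
As an illustration of the attack scripts, below is a minimal Scapy sketch of the Xmas scan, to be run on the
attacker machine. The victim address and port range are placeholders for your lab setup; the same structure
(craft packets in a loop and send them) carries over to the UDP flood and DNS cache poisoning scripts.

# Minimal Xmas scan sketch with Scapy (run on the attacker machine).
# The victim IP address and the port range are placeholders to adapt.
from scapy.all import IP, TCP, sr1, conf

conf.verb = 0                      # silence Scapy's per-packet output
VICTIM = "192.168.56.101"          # placeholder victim address
PORTS = range(20, 1025)            # ports to probe

for port in PORTS:
    # Xmas scan: FIN, PSH and URG flags set ("FPU"); closed ports reply with RST.
    reply = sr1(IP(dst=VICTIM) / TCP(dport=port, flags="FPU"), timeout=1)
    if reply is None:
        print(f"Port {port}: open|filtered (no response)")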

1.2 Task 1.2: Network flow and feature extraction


Use one of two traffic exporters, CICFlowMeter or Tranalyzer, to convert the PCAP files into CSV files, each
containing a rich set of flow features (a batch-export sketch follows the list of outcomes below). This stage is
crucial for:

• Extracting network traffic flows from the collected data.


• Calculating features that uniquely characterize each network flow.

The outcomes of this phase are:
1. Three Python scripts, one for each specified attack.

2. Four CSV files capturing network traffic: One for the normal traffic, one for each of the three attacks.
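
As referenced above, one way to batch-convert the captures is to call the exporter's command-line interface from
Python. The sketch below assumes a CICFlowMeter build whose wrapper (here called cfm) takes an input PCAP and an
output directory; the wrapper name, argument order, and file names are assumptions to check against the
documentation of the exporter you actually install.

# Sketch of batch flow export by invoking the exporter's CLI from Python.
# The "cfm" command name and its <input pcap> <output dir> argument order are
# assumptions; adapt them to the CICFlowMeter or Tranalyzer build you use.
import subprocess
from pathlib import Path

PCAPS = ["normal.pcap", "xmas_scan.pcap", "udp_flood.pcap", "dns_poisoning.pcap"]
OUT_DIR = Path("csv_out")
OUT_DIR.mkdir(exist_ok=True)

for pcap in PCAPS:
    # One CSV of flow features is expected per input PCAP.
    subprocess.run(["cfm", pcap, str(OUT_DIR)], check=True)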

Figure 1: Project workflow — normal and attack traffic is captured as PCAP, passed through the traffic exporter
for feature calculation, labeled, merged into the target dataset, and used to train and evaluate a decision tree.

2 Task 2: ML-based Detection and Classification


Write a Python program that performs the following tasks:

1. For each CSV file, add a new column named "Label". For the attack captures, assign the name of the
corresponding attack to this column; for the capture containing normal traffic, assign the value "Normal".
Then merge the four CSV files into a single file named "dataset.csv".
2. Use the dataset (CSV file) to train a decision tree classifier with the Scikit-Learn library. The decision
tree should be capable of distinguishing between normal traffic and the various types of attacks.

3. Evaluate the performance of the decision tree using the following two metrics:
• Detection rate (Recall)
• False positive rate
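
A minimal end-to-end sketch of these three steps, using pandas and Scikit-Learn, is given below. The CSV file
names, the label strings, the train/test split ratio, and the handling of missing or infinite feature values are
assumptions to adapt to your own captures.

# Minimal sketch of Task 2: label, merge, train a decision tree, evaluate.
# File names, label strings, and the 70/30 split are placeholders.
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# 1. Label each capture and merge everything into dataset.csv.
captures = {
    "normal.csv": "Normal",
    "xmas_scan.csv": "XmasScan",
    "udp_flood.csv": "UDPFlood",
    "dns_poisoning.csv": "DNSPoisoning",
}
frames = []
for path, label in captures.items():
    df = pd.read_csv(path)
    df["Label"] = label
    frames.append(df)
dataset = pd.concat(frames, ignore_index=True)
dataset.to_csv("dataset.csv", index=False)

# 2. Train a decision tree on the numeric flow features only.
X = dataset.drop(columns=["Label"]).select_dtypes(include="number")
X = X.replace([np.inf, -np.inf], np.nan).fillna(0)  # flow exporters may emit Inf/NaN
y = dataset["Label"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)
clf = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)
y_pred = pd.Series(clf.predict(X_test), index=y_test.index)

# 3. Detection rate (recall) and false positive rate, treating the problem as
#    "attack vs. normal" at the flow level.
attack_mask = y_test != "Normal"
detection_rate = (y_pred[attack_mask] != "Normal").mean()        # attack flows flagged as attacks
false_positive_rate = (y_pred[~attack_mask] != "Normal").mean()  # normal flows flagged as attacks
print(f"Detection rate (recall): {detection_rate:.3f}")
print(f"False positive rate:     {false_positive_rate:.3f}")

Here the two metrics are computed on a binary attack-versus-normal view of the predictions; if you report
per-attack recall instead, state the definition you adopt in your report.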

3 Deliverables
Please submit the following deliverables, adhering to the specified format requirements:

1. Three Python scripts for conducting the attacks.


2. The merged dataset file, named dataset.csv.
3. A Python program for training and evaluating the decision tree classifier.

4. A detailed report, not exceeding 5 pages, in PDF format. The report should cover the process, methodology,
results, and conclusions of your work.

Important: the Python scripts and the dataset file dataset.csv should be packaged together in a single .zip
archive for submission. If the archive exceeds the email attachment limit, please share it via a private Google
Drive link and ensure access is granted.
