DeepShield: Intrusion Detection System
PROJECT WORK
Submitted by
Sangavi S – 722821106128
Sanjay K – 722821106129
Theja S – 722821106168
Varnika S – 722821106170
BATCH
2021 – 2025
Under the Guidance of
Mr. M.Vivek Kumar,
ME.,(Ph.D).,
Department of Electronics and Communication Engineering
BONAFIDE CERTIFICATE
Certified that this project report titled “DeepShield: Intrusion Detection System” is
the bonafide work of
Sangavi S 722821106128
Sanjay K 722821106129
Theja S 722821106168
Varnika S 722821106170
………………………………… …………………………………
SIGNATURE SIGNATURE
Submitted for the End Semester practical examination – Project using Design
Thinking viva-voce held on _ _ _ _ _ _ _ _ _ _ _ _
……………………… ……………………..
(Internal Examiner) (External Examiner)
TABLE OF CONTENTS
ABSTRACT
LIST OF FIGURES
1 INTRODUCTION
2 LITERATURE SURVEY
2.1 EXISTING PRODUCT
2.2 PROBLEM STATEMENT
3 PROPOSED SOLUTION
3.1 OVERVIEW
3.2 BLOCK DIAGRAM
4 HARDWARE DESCRIPTION
4.1 OVERVIEW
5 SOFTWARE DESCRIPTION
5.1 SOFTWARE SPECIFICATION
5.2 CODING STRUCTURE
6 RESULT & IMPLEMENTATION
7 CONCLUSION & FUTURE SCOPE
7.1 CONCLUSION
LIST OF FIGURES
FIGURE No. TITLE
Figure 1 Bidirectional LSTM Architecture
Figure 2 Histogram of Output
Figure 3 Plot for Loss
Figure 4 Plot for Accuracy
Figure 5 User Interface
Figure 6 Result of Classified Output
CHAPTER 1
INTRODUCTION
A significant volume of sensitive user data is susceptible to a diverse array of
internal and external security breaches. With technological advancements,
cyber-attacks have evolved alongside increasingly sophisticated algorithms. The
primary targets of such attacks are systems that either process or store critical
data, as well as services reliant on these systems. Consequently, there arises a
necessity for a cutting-edge Intrusion Detection System (IDS) capable of
identifying malicious cyber-attacks posing security risks. An IDS automatically
identifies and categorizes attacks, security policy violations, and intrusions
across both network-level and host-level infrastructures.
The evolving landscape of cyber threats underscores the need for substantial
refinement and adaptation, often involving the integration of Machine Learning
(ML) to enhance IDS performance. ML, a subset of Artificial Intelligence (AI),
enables computers to learn without explicit programming. ML systems make
predictions by learning from existing data, aiming to develop efficient
algorithms that process input data and generate predictions through statistical
analysis. ML algorithms typically fall into two primary categories: Supervised
Learning and Unsupervised Learning.
Modern IDS systems must harness the capabilities of AI through ML to achieve
optimal performance in accurately predicting and classifying various types of
attacks. The effectiveness of these ML models is closely tied to the datasets
used for training. Overlooking or concealing biases within the data or
algorithms can lead to biased predictions, thereby undermining the performance
of AI applications.
In recent years, the Internet of Things (IoT) has grown exponentially
worldwide. Projections suggest that the number of interconnected IoT devices
could soar to 125 billion by 2030. The integration of
IoT devices with diverse technologies, services, and protocols has significantly
complicated IoT network management. Consequently, the internet is left
susceptible to severe cyber-attacks and threats, posing risks to consumers
utilizing such devices. Common attacks on IoT systems include Distributed
Denial of Service (DDoS), Denial of Service (DoS), ransomware, and botnet
attacks.
be used to carry out various malicious activities, such as distributed denial-of-
service (DDoS) attacks, spam campaigns, and data theft.
Cyber attackers deploy botnets to amplify the scale and impact of their attacks,
harnessing the collective computing power of the infected devices to overwhelm
target systems or networks. Botnets enable attackers to execute coordinated and
widespread assaults, often exploiting vulnerabilities in software or systems to
infiltrate and co-opt new devices into the network.
Neptune:
A Neptune DoS attack floods a target system with excessive traffic to render it
inaccessible. Orchestrated by a malicious actor, it exploits vulnerabilities in
network infrastructure, often leveraging a botnet. Its aim is to disrupt services,
causing downtime and potential financial or reputational damage. Defence
entails robust security measures like traffic filtering and intrusion detection to
mitigate its impact effectively.
Smurf:
Teardrop:
Satan:
CHAPTER 2
LITERATURE SURVEY
2.1 Existing Project
NSL-KDD Dataset:
This report uses the NSL-KDD dataset to train and evaluate DoS network
intrusion detection systems. Unlike its predecessor, the
KDD dataset, NSL-KDD addresses several issues, ensuring a more balanced
and reliable classifier training and testing environment. Notably, NSL-KDD
eliminates redundant records in the training set and removes duplicates from the
test set, preventing classifier bias. Additionally, the dataset's composition
adjusts the number of records from each difficulty level group, enhancing
representativeness. The training set comprises 21 different attacks categorized
into DOS, U2R, R2L, and Probe types, out of the 37 attacks present in the test
set. It provides an overview of the major attacks within each category. Overall,
the NSL-KDD dataset offers a refined framework for robust network intrusion
detection research and development.
NSL-KDD Dataset sampling:
This section describes how data was selected for training and validating a
Denial of Service (DoS) Intrusion Detection System (IDS) model using the
NSL-KDD dataset. Of the available options (the complete dataset and subsets
containing 20% and 10% of it), the 20% subset was used for training. For
validation, the entire NSL-KDD testing dataset was used. The distribution of
samples for both the training and testing sets is detailed in Table 3. This
approach gives a representative yet manageable dataset for training and
evaluating the DoS IDS model.
This section outlines the crucial steps of data pre-processing and normalization
for training classifiers. It emphasizes the conversion of symbolic (textual)
features into numeric representations, using the NSL-KDD dataset as an
example, where features like network protocols and service types are coded into
numerical values. Following this, min-max normalization is employed to
normalize the resultant feature vectors, ensuring that all features fall within the
range [0,1] to prevent biases during training caused by extremely large or small
values. This process is applied both to the training and test data, ensuring
consistency and reliability in model validation.
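The two pre-processing steps described above can be sketched as follows. This is a minimal illustration in plain Python, not the code used in the project; the helper names encode_symbolic, fit_min_max and min_max_normalize and the sample records are hypothetical. The sketch assumes that the minimum and maximum are taken from the training data and reused for the test data, which is one common way to keep the two sets consistent.

```python
def encode_symbolic(values):
    """Map each distinct symbolic value to an integer code (order of first appearance)."""
    codes = {}
    for v in values:
        codes.setdefault(v, len(codes))
    return [codes[v] for v in values], codes


def fit_min_max(column):
    """Return the (min, max) of a training column."""
    return min(column), max(column)


def min_max_normalize(column, lo, hi):
    """Scale values with x' = (x - lo) / (hi - lo) and clip the result to [0, 1]."""
    if hi == lo:  # constant feature: map everything to 0.0
        return [0.0 for _ in column]
    return [min(1.0, max(0.0, (x - lo) / (hi - lo))) for x in column]


# Hypothetical sample: protocol type (symbolic) and source bytes (numeric)
protocols = ["tcp", "udp", "tcp", "icmp"]
src_bytes = [0, 250, 1000, 500]

protocol_codes, mapping = encode_symbolic(protocols)
lo, hi = fit_min_max(src_bytes)
scaled = min_max_normalize(src_bytes, lo, hi)

print(mapping)         # {'tcp': 0, 'udp': 1, 'icmp': 2}
print(protocol_codes)  # [0, 1, 0, 2]
print(scaled)          # [0.0, 0.25, 1.0, 0.5]
```

Clipping keeps test values that fall outside the training range inside [0,1]; whether to clip or to leave such values unbounded is a design choice.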
Feature Selection:
The KNN algorithm relies on the assumption that similar instances are close to
each other in the feature space. In the context of DDoS detection, this
assumption may not hold true due to the dynamic and evolving nature of
network traffic patterns, leading to low accuracy in classification. Additionally,
KNN suffers from computational inefficiency when dealing with large-scale
datasets, which exacerbates the processing overhead and reduces detection
speed.
Furthermore, the lack of feature selection and dimensionality reduction
techniques in KNN can further compound the curse of dimensionality,
diminishing its ability to discriminate between normal and attack traffic
effectively. These limitations underscore the challenges faced in achieving
accurate DDoS detection using traditional KNN methods.
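For reference, the nearest-neighbour rule discussed above can be sketched in a few lines. This is an illustrative plain-Python version with hypothetical toy data, not the baseline implementation evaluated in the project. Every prediction scans the whole training set, which is the source of the computational cost noted above.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training sample: O(n * d) per query
    distances = [
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    ]
    distances.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical two-feature traffic records, already scaled to [0, 1]
train_X = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.15), (0.9, 0.8), (0.8, 0.9), (0.85, 0.95)]
train_y = ["normal", "normal", "normal", "attack", "attack", "attack"]

print(knn_predict(train_X, train_y, (0.2, 0.2)))  # normal
print(knn_predict(train_X, train_y, (0.9, 0.9)))  # attack
```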
CHAPTER 3
PROPOSED SOLUTION
3.1 Overview:
This setup enables the model to assimilate information from both past and
future time steps, thereby capturing the intricate temporal dependencies and
long-range contextual cues embedded in network traffic patterns.
During the training phase, the model learns to distinguish between normal
network behaviour and the anomalous activity indicative of DDoS attacks.
The model is made more robust by incorporating attention mechanisms. These
mechanisms act like a spotlight, allowing the model to home in on the most
pertinent features within the input sequence.
This heightened focus enables more precise detection of attack patterns amidst
the large volume of network data. In our experiments, the bidirectional LSTM
achieved higher accuracy than the conventional KNN-based detection method.
Its ability to identify DDoS attacks accurately makes it a strong candidate for
improving network security against evolving cyber threats.
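The "spotlight" idea can be made concrete with a small sketch. The example below is a generic dot-product attention pooling over a sequence of hidden states, written in plain Python. It is an assumption about the kind of mechanism meant here, since the report does not specify the attention variant, and the names softmax, attention_pool, hidden_states and query are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Weight each time step by its dot-product score against `query`.

    Returns the attention weights and the weighted sum (context vector).
    """
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    context = [
        sum(w * h[j] for w, h in zip(weights, hidden_states)) for j in range(dim)
    ]
    return weights, context


# Three time steps with two-dimensional hidden states (toy values)
hidden_states = [[0.1, 0.0], [2.0, 1.0], [0.2, 0.1]]
query = [1.0, 1.0]

weights, context = attention_pool(hidden_states, query)
print([round(w, 3) for w in weights])
print([round(c, 3) for c in context])
```

In this toy input the second time step has the largest score, so it receives the largest weight and dominates the context vector.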
3.2 Block Diagram:
The process begins with the input KDD Dataset, providing raw data for network
intrusion detection. This data undergoes preprocessing and feature selection,
involving tasks such as cleaning, normalization, and identifying relevant
features. Subsequently, a Bidirectional LSTM model is trained on the processed
data, enabling sequence analysis and prediction of network intrusions with the
test dataset.
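A bidirectional LSTM classifier of the kind described above could be defined in Keras roughly as follows. This is a hedged sketch, not the project's training script: the layer width (64), the treatment of each 37-feature record as a sequence of 37 time steps with one value each, and the function name build_bilstm are all assumptions made for illustration.

```python
import numpy as np
from tensorflow.keras import Input
from tensorflow.keras.layers import LSTM, Bidirectional, Dense
from tensorflow.keras.models import Sequential


def build_bilstm(timesteps=37, features=1, num_classes=6):
    """Build a small bidirectional LSTM classifier (sizes are illustrative assumptions)."""
    model = Sequential([
        Input(shape=(timesteps, features)),
        # Forward and backward LSTMs read the sequence in both directions;
        # their final outputs are concatenated (2 * 64 values).
        Bidirectional(LSTM(64)),
        Dense(num_classes, activation="softmax"),
    ])
    model.compile(loss="categorical_crossentropy", optimizer="adam",
                  metrics=["accuracy"])
    return model


model = build_bilstm()
# Reshape flat records of shape (n, 37) into sequences of shape (n, 37, 1)
X = np.random.rand(4, 37).astype("float32").reshape(4, 37, 1)
probabilities = model.predict(X, verbose=0)
print(probabilities.shape)  # (4, 6)
```

Each row of the output holds one probability per class, so the predicted class is the index of the largest value in the row.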
Data Collection:
instruments, and obtaining the required data through observation,
experimentation, surveys, interviews, or automated systems.
Data Preprocessing:
Feature selection:
decision-making. It separates the dependent labels into roughly equal
proportions and feeds them to the model.
Validating the Bi-directional LSTM model with test set data is a crucial step in
assessing its generalization performance and reliability in real-world scenarios.
By evaluating the model on unseen data, we can verify its ability to accurately
detect DDoS attacks and distinguish them from normal network traffic. During
validation, the model's predictions are compared against ground truth labels
from the test set, allowing for the calculation of performance metrics such as
accuracy, precision, recall, and F1-score. These metrics provide insights into the
model's ability to correctly classify instances of DDoS attacks while minimizing
false positives and false negatives. Additionally, validation helps identify
potential overfitting or underfitting issues, ensuring that the model maintains
robustness and generalizability across different datasets.
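The metrics named above follow directly from counts of true positives (TP), false positives (FP) and false negatives (FN). The sketch below computes them for one class treated as positive. The function name classification_metrics and the label lists are hypothetical; the numbers are not results from this project.

```python
def classification_metrics(y_true, y_pred, positive):
    """Return accuracy, precision, recall and F1 for one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, precision, recall, f1


# Hypothetical ground truth and predictions
y_true = ["neptune", "normal", "neptune", "smurf", "normal", "neptune"]
y_pred = ["neptune", "neptune", "neptune", "smurf", "normal", "normal"]

accuracy, precision, recall, f1 = classification_metrics(y_true, y_pred, positive="neptune")
print(f"{accuracy:.3f} {precision:.3f} {recall:.3f} {f1:.3f}")  # 0.667 0.667 0.667 0.667
```

Here TP = 2, FP = 1 and FN = 1 for the neptune class, so precision and recall are both 2/3.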
CHAPTER 4
HARDWARE DESCRIPTION
The 12th Gen Intel Core i7-1255U is a processor designed for laptops and other
mobile computing devices.
5. Cores and Threads: While the exact specifications may vary, typical
Intel Core i7 processors feature multiple cores and support hyper-threading,
allowing for efficient multitasking and parallel processing.
card in laptops and other compact devices. However, the specifics of the
integrated graphics depend on the particular model.
Overall, the Intel Core i7-1255U offers a blend of performance and power
efficiency suitable for a wide range of computing tasks, particularly in mobile
devices where space and energy constraints are significant considerations.
OS Build: 22631.3447
5. Memory: The GPU likely comes with its dedicated video memory
(VRAM), which helps improve performance by reducing reliance on system
RAM. However, the amount of VRAM may vary depending on the specific
laptop configuration.
6. Power Consumption: As a mobile GPU, the R7 M265 is designed to
balance performance with power efficiency. It typically has a lower power
consumption compared to high-end desktop graphics cards, allowing for longer
battery life in laptops.
CHAPTER 5
SOFTWARE DESCRIPTION
The Python language had a humble beginning in the late 1980s, when the Dutch
programmer Guido van Rossum started working on it as a hobby project at
Centrum Wiskunde & Informatica (CWI). It was intended as a successor to the
ABC language, with better exception handling and the ability to interface with
the Amoeba operating system. It first appeared in 1991. Python 2.0 was released
in 2000 and Python 3.0 in 2008. The language was named after the British
television comedy show Monty Python's Flying Circus, which was one of Guido's
favourite television programmes. Python is now used in a wide range of
applications, and this project relies on several of its libraries.
5.1.2 Libraries:
Numpy:
NumPy is a Python library used for working with arrays. It also has functions
for linear algebra, Fourier transforms, and matrices. NumPy was created in 2005
by Travis Oliphant. It is an open-source project and is free to use. NumPy
stands for Numerical Python.
Pandas:
Pandas is a Python library used for working with data sets. It has functions for
analyzing, cleaning, exploring, and manipulating data. The name "Pandas" refers
to both "Panel Data" and "Python Data Analysis". The library was created by
Wes McKinney in 2008.
Scikit-Learn:
Scikit-Learn, also known as sklearn, is a Python library for implementing
machine learning models and statistical modelling. Through scikit-learn, we can
implement various machine learning models for regression, classification, and
clustering, along with statistical tools for analyzing these models.
Matplotlib:
Matplotlib is a comprehensive library for creating static, animated, and
interactive visualizations in Python. It can produce publication-quality plots
and interactive figures that zoom, pan, and update, and it allows the visual
style and layout to be customized.
Tkinter:
Tkinter is an open source, portable graphical user interface (GUI) library
designed for use in Python scripts. Tkinter relies on the Tk library, the GUI
library used by Tcl/Tk and Perl, which is in turn implemented in C. Therefore,
Tkinter can be said to be implemented using multiple layers.
TensorFlow:
TensorFlow is an open-source machine learning library developed by Google.
TensorFlow is used to build and train deep learning models as it facilitates the
creation of computational graphs and efficient execution on various hardware
platforms.
Click:
Click is a Python package for creating beautiful command line interfaces in a
composable way with as little code as necessary. It's the “Command Line
Interface Creation Kit”. It's highly configurable but comes with sensible
defaults out of the box.
Seaborn:
Seaborn is a Python data visualization library based on matplotlib. It provides a
high-level interface for drawing attractive and informative statistical graphics.
Pickle:
Pickle in Python is primarily used in serializing and deserializing a Python
object structure. In other words, it's the process of converting a Python object
into a byte stream to store it in a file/database, maintain program state across
sessions, or transport data over the network.
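A minimal round trip with the standard pickle module looks like this. The dictionary being saved is a hypothetical example, and the sketch serializes to an in-memory byte string rather than to a file.

```python
import pickle

# Hypothetical object to persist: a small label mapping
label_map = {"normal.": 0, "neptune.": 1, "smurf.": 2}

payload = pickle.dumps(label_map)  # Python object -> bytes
restored = pickle.loads(payload)   # bytes -> Python object

print(type(payload).__name__)  # bytes
print(restored == label_map)   # True
```

Unpickling can execute arbitrary code, so pickle.loads should only be used on data from a trusted source.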
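import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential
from tensorflow.keras.utils import to_categorical

# Load the KDD records (comma-separated, no header row)
df = pd.read_csv('KDD.txt', sep=",", header=None)
print("")
print(df)
print("")
# Drop rows with missing values
df = df.dropna(how="any")
print(df)
print("")
print(df.info())
print("")

# Data Visualization
# Histogram of output
plt.figure(figsize=(10, 8))
plt.title("Histogram of Output")
plt.hist(df[41], rwidth=0.9)
plt.show()

# Encode the class label (column 41) as an integer
df[41] = df[41].map({"teardrop.": 5, "satan.": 4, "portsweep.": 3,
                     "smurf.": 2, "neptune.": 1, "normal.": 0})
# Rows with any other label map to NaN and are dropped
df = df.dropna(subset=[41])
df[41] = df[41].astype(int)
print(df)
print(df[41].value_counts())

# Columns 4..40 are the 37 numeric features; column 41 is the label
X = df.iloc[:, 4:41].values
y = df.iloc[:, 41].values

# Assumption: the split step is missing from the listing. An 80/20
# split is used here so that X_train, y_train and y_test are defined.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)
print("Xtrain value")
print(X_train)
print("ytrain value")
print(y_train)

# One-hot encode the six classes
y_train = to_categorical(y_train, num_classes=6)
y_test = to_categorical(y_test, num_classes=6)

# Fully connected classifier: 37 inputs -> 64 ReLU units -> 6-way softmax
model = Sequential()
model.add(Dense(64, input_dim=37, activation='relu'))
model.add(Dense(6, activation='softmax'))
model.compile(loss="categorical_crossentropy", optimizer='adam',
              metrics=['accuracy'])
history = model.fit(X_train, y_train, epochs=10)
model.save("bilstm_training.h5")
print(history.history.keys())

# Plot for loss
plt.figure(figsize=(20, 10))
plt.plot(history.history['loss'], label='Training loss')
plt.ylabel('Loss', fontsize=16)
plt.legend(loc='upper right')
plt.show()

# Plot for accuracy
plt.figure(figsize=(20, 10))
plt.plot(history.history['accuracy'], label='Training accuracy')
plt.ylabel('Accuracy', fontsize=16)
plt.legend(loc='upper right')
plt.show()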
Testing Code:
import numpy as np
import pandas as pd
import tkinter as tk
from tkinter import filedialog, ttk
from tkinter import messagebox as mbox
from PIL import Image, ImageTk
from tensorflow.keras.models import load_model

# Class indices follow the label mapping used in the training code
CLASS_NAMES = {0: "Normal", 1: "neptune", 2: "smurf",
               3: "portsweep", 4: "satan", 5: "teardrop"}

# Load the trained model from disk
model = load_model('bilstm_training.h5')
selected_file = None  # path chosen in browseFiles()


def browseFiles():
    """Open the file explorer window and remember the chosen file."""
    global selected_file
    filename = filedialog.askopenfilename(
        initialdir="/",
        title="Select a CSV File",
        filetypes=(("text files", "*.txt*"), ("all files", "*.*")))
    if filename:
        selected_file = filename
        # Change label contents
        label_file_explorer.configure(text="File Opened: " + filename)


def start():
    """Classify the first record of the selected file and show the result."""
    if not selected_file:
        mbox.showwarning("Attack Detection", "Please select a request file first")
        return
    print("Process Started")
    dataset = pd.read_csv(selected_file, sep=",", header=None)
    print(dataset)
    print(dataset.info())
    # Same 37 feature columns as in training
    X = dataset.iloc[:, 4:41].values
    ypred = model.predict(X)
    y_pred = np.argmax(ypred, axis=1)
    print(y_pred)
    # Only the first record in the file is reported
    name = CLASS_NAMES[int(y_pred[0])]
    print(name)
    message = "Result for the request: The Input Request is " + name
    label_file_explorer.configure(text=message)
    # Show the result in a separate window
    app = tk.Toplevel(window)
    app.title("Attack Detection")
    ttk.Label(app, text=message).grid(column=0, row=0, padx=20, pady=30)


def add_hover(button):
    """Highlight a button while the pointer is over it."""
    button.bind("<Enter>", lambda e: button.config(
        background='OrangeRed3', foreground="white"))
    button.bind("<Leave>", lambda e: button.config(
        background='SystemButtonFace', foreground='black'))


if __name__ == '__main__':
    window = tk.Tk()
    # Set window title, size and background color
    window.title('Application')
    window.geometry("700x400")
    window.config(background="white")

    # Create a photoimage object of the image in the path
    image1 = Image.open("bg.jpg")
    test = ImageTk.PhotoImage(image1)
    label1 = tk.Label(window, image=test)
    label1.image = test  # keep a reference so the image is not garbage-collected
    # Position image
    label1.place(x=0, y=0)

    # Create a File Explorer label
    label_file_explorer = tk.Label(window,
                                   text="Please give Input Request",
                                   width=100, height=4,
                                   fg="blue")
    button_explore = tk.Button(window,
                               text="Browse Request Files",
                               command=browseFiles, height=5)
    button_exit = tk.Button(window,
                            text="exit",
                            command=window.destroy, height=5, width=10)
    button_start = tk.Button(window,
                             text="Start Analyzing Request",
                             command=start, height=5)
    for button in (button_explore, button_start, button_exit):
        add_hover(button)

    # Grid method is chosen for placing the widgets at respective
    # positions in a table-like structure by specifying rows and columns
    label_file_explorer.grid(column=1, row=1, padx=1, pady=5)
    button_explore.grid(column=1, row=2, padx=5, pady=5)
    button_exit.grid(column=1, row=3, padx=5, pady=5)
    button_start.grid(column=1, row=4, padx=5, pady=5)

    # Let the window wait for any events
    window.mainloop()
CHAPTER 6
RESULT AND IMPLEMENTATION
The accuracy graph (Figure 4) for the bidirectional LSTM (Bi-LSTM) model shows
the performance of the model over the training epochs. As training progresses,
the accuracy of the model on the training data is evaluated and plotted against
the epoch number. In the early epochs, accuracy improves significantly as the
model adjusts its parameters to better fit the training data. As training
continues, the rate of improvement typically slows down, and the accuracy
curve may start to plateau. Where validation accuracy is also plotted, it helps
assess the generalization ability of the model and detect overfitting. Overall,
the accuracy graph provides valuable insights into the learning progress and
performance of the Bi-LSTM model, guiding decisions on model architecture,
hyperparameters, and training strategies to achieve the desired level of
accuracy and generalization.
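One way to act on a flattening curve is to stop training once accuracy stops improving. The sketch below is a simplified patience rule in plain Python with hypothetical per-epoch accuracies. It is an illustration and not part of the project's training code; the function name stopping_epoch and the thresholds are assumptions. Keras offers a comparable EarlyStopping callback.

```python
def stopping_epoch(accuracies, patience=2, min_delta=0.001):
    """Return the 1-based epoch at which training would stop, or None if it never plateaus."""
    best = float("-inf")
    epochs_without_gain = 0
    for epoch, acc in enumerate(accuracies, start=1):
        if acc > best + min_delta:
            # Meaningful improvement: remember it and reset the counter
            best = acc
            epochs_without_gain = 0
        else:
            epochs_without_gain += 1
            if epochs_without_gain >= patience:
                return epoch
    return None


# Hypothetical validation accuracies: fast early gains, then a plateau
val_accuracy = [0.70, 0.85, 0.92, 0.95, 0.9505, 0.9502, 0.9506]
print(stopping_epoch(val_accuracy))  # 6
```

With a patience of 2, training stops at epoch 6 because epochs 5 and 6 both fail to beat the best accuracy (0.95) by more than min_delta.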
other relevant files, depending on the system's purpose. On the other hand, the
"Start Analyzing Request" button starts the analysis once the necessary files
or requests have been submitted.
Fig 6: Result of Classified Output
By visually presenting the detected DDoS types, the picture enables users to
quickly understand the nature and characteristics of the attacks detected by the
system. Additionally, the categorization of DDoS types facilitates informed
decision-making and targeted mitigation strategies, allowing network
administrators and security analysts to prioritize response efforts based on the
specific threat vectors identified. Overall, the picture serves as a valuable
reference tool for interpreting and responding to detected DDoS attacks
effectively.
CHAPTER 7
CONCLUSION AND FUTURE SCOPE
7.1 CONCLUSION
In conclusion, our study demonstrates that utilizing Bi-directional LSTM for
DDoS attack detection outperforms the traditional K-Nearest Neighbors (KNN)
approach in terms of accuracy. Through comprehensive experimentation and
evaluation, we have observed that the Bi-directional LSTM model exhibits
superior performance in accurately identifying and mitigating DDoS attacks. By
leveraging the temporal dependencies inherent in network traffic data, Bi-
directional LSTM effectively captures nuanced patterns and temporal
relationships, enhancing its ability to detect subtle deviations indicative of
attacks. In contrast, the KNN approach, while straightforward, may struggle to
adapt to the dynamic and evolving nature of DDoS attack patterns, leading to
lower accuracy in classification. Our findings underscore the effectiveness of
deep learning techniques, specifically Bi-directional LSTM, in enhancing the
accuracy and reliability of DDoS attack detection systems, paving the way for
more robust and resilient network security solutions.
Anomaly Detection and Zero-day Attacks: Extend the scope of the project to
encompass anomaly detection techniques for detecting zero-day DDoS attacks
or previously unseen attack patterns.