
Algerian Journal of Renewable Energy and Sustainable Development Volume: 2 | Number: 2 | 2020| December

Design of Machine Learning Approach for Denial of Service Attack Detection
ABID Dhiya Eddine1*, GHAZLI Abdelkader2
1&2 Department of Mathematics and Computer Sciences, Faculty of Exact Sciences, Tahri Mohamed University,
Bechar, Algeria
* Corresponding author; Email: abid.dhiyaeddine@gmail.com

Article Info

Article history:
Received , 2020
Revised , 2020
Accepted , 2020

Keywords:
DoS attack
Machine learning
Random Forest
Cybersecurity
Threat detection

ABSTRACT

This study investigates the application of machine learning techniques for detecting Denial of Service (DoS)
attacks in network traffic data. We evaluate the performance of multiple classification models, including Logistic
Regression, Support Vector Machine (SVM), Random Forest, Decision Tree, and k-Nearest Neighbors (KNN), in
accurately distinguishing between normal and malicious network activity. Our results demonstrate that ensemble
methods like Random Forest, along with robust classifiers such as SVM, Decision Tree, and KNN, achieve high
precision, recall, F1-Score, and accuracy in identifying DoS attacks. Notably, Random Forest attains perfect scores
across all metrics, highlighting its effectiveness in mitigating cyber threats. These findings emphasize the
significance of leveraging advanced machine learning techniques for bolstering cybersecurity defenses and
safeguarding critical network infrastructures against evolving cyber threats like DoS attacks.

I. Introduction
In today's digitally interconnected world, the internet serves as the backbone of countless services and systems,
facilitating communication, commerce, and collaboration on an unprecedented scale. However, this very
interconnectedness also exposes these systems to a myriad of threats, with Denial of Service (DoS) attacks standing
out as one of the most prevalent and disruptive forms of cyber assault.
Denial of Service attacks, characterized by their intent to overwhelm targeted systems or networks with a flood of
traffic, can incapacitate essential services, disrupt operations, and inflict severe financial losses. Traditional
methods of mitigating such attacks often fall short in the face of evolving tactics employed by attackers,
necessitating innovative approaches to swiftly detect and counter these threats.
Enter machine learning and deep learning techniques. Leveraging the power of artificial intelligence, these
methodologies offer a promising avenue for enhancing the detection and mitigation of DoS attacks. By harnessing
vast datasets and sophisticated algorithms, machine learning and deep learning models can discern patterns,
anomalies, and subtle indicators of malicious activity amidst the deluge of network traffic.
In this article, we examine how machine learning techniques can be applied to the detection of Denial of Service
attacks, comparing five classifiers on a labeled network traffic dataset.

II. Background
In the following section, key terms central to the detection of Denial of Service attacks using artificial
intelligence techniques will be succinctly defined.

Algerian Journal of Renewable Energy and Sustainable Development x(x) 2020: xxx-xxx, doi: 10.46657/ajresd.2020.x.x.x

II.1. Denial of service

The denial of service (DoS) attack is one of the most common and easiest forms of attack on the Internet. This type
of attack does not snoop on systems or try to obtain data; instead, it aims to stop some of the services running on
the targeted systems [1].

II.2. Machine learning

Machine Learning is a subset of artificial intelligence that enables systems to learn from data and improve
performance on a specific task without being explicitly programmed [2]. In the context of DoS attack detection,
machine learning algorithms can analyze network traffic patterns to identify anomalies indicative of an ongoing
attack.

II.3. Deep learning

Deep Learning is a subset of machine learning that employs artificial neural networks with multiple layers to
extract intricate patterns and features from large datasets [3]. Deep learning techniques, such as convolutional
neural networks (CNNs) and recurrent neural networks (RNNs), are increasingly utilized in cybersecurity for their
ability to detect complex patterns and anomalies in network traffic data.

II.4. Anomaly detection

Anomaly Detection refers to the identification of patterns or data points that deviate significantly from normal
behavior within a dataset [4]. In the context of DoS attack detection, anomaly detection techniques are employed
to identify unusual patterns or activities in network traffic that may indicate a potential attack.
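
As a minimal sketch of this idea, the following Python snippet flags measurement windows whose packet count deviates strongly from a baseline of normal traffic, using a z-score. The traffic values, the function name find_anomalies, and the threshold of 3 standard deviations are illustrative assumptions and are not taken from this study.

```python
from statistics import mean, stdev


def find_anomalies(baseline, observed, threshold=3.0):
    """Return the indices of observed values whose z-score against the
    baseline exceeds the threshold (hypothetical helper for illustration)."""
    mu = mean(baseline)
    sigma = stdev(baseline)  # sample standard deviation of normal traffic
    if sigma == 0:
        # Constant baseline: any value that differs from it is a deviation.
        return [i for i, x in enumerate(observed) if x != mu]
    return [i for i, x in enumerate(observed) if abs(x - mu) / sigma > threshold]


# Hypothetical packets-per-second counts recorded during normal operation.
baseline = [98, 102, 101, 99, 100, 103, 97, 100]
# Hypothetical new measurements; the third window contains a traffic flood.
observed = [101, 99, 950, 102]

anomalous_windows = find_anomalies(baseline, observed)
print(anomalous_windows)  # [2]
```

A fixed statistical threshold like this is only a baseline; the supervised classifiers evaluated later in this paper learn the decision boundary from labeled traffic instead.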

III. Related works


In this section, we review relevant literature and prior research that explores the application of artificial
intelligence techniques in detecting Denial of Service attacks.
The paper [5] introduced a new algorithm that uses machine learning to distinguish normal service requests from
denial of service attacks by analyzing the packets sent by the client to the server. The authors report that 99% of
all selected cases were classified correctly.
The paper [6] studies the detection of distributed DoS (DDoS) attacks in a cloud environment, using the CICIDS
2017 dataset and a multiple regression algorithm to build a machine learning model that predicts DDoS and Bot
attacks.
The paper [7] implemented several machine learning models and linked them with a DDoS detection system, with
the goal of improving the accuracy of DDoS detection on the CICDDoS 2019 dataset. The authors reached an
accuracy of 99.99% with the random forest algorithm.
The paper [8] evaluates the power of fast machine learning methods for identifying denial of service attacks. The
authors used the WEKA tool, the CICIDS2017 dataset, and several techniques: REP tree (REPT), random tree (RT),
random forest (RF), decision stump (DS), and J48. In their experiments, J48 took less testing time and gave the
strongest results, with accuracies of 99.51% and 99.96% using 4 and 8 features, respectively.
The paper [9] aims to protect systems and devices from man-in-the-middle (MTM) and DoS attacks using machine
learning. The authors implemented four ML algorithms: random forest (RF), eXtreme gradient boosting (XGBoost),
gradient boosting (GB), and decision tree (DT). They evaluated their work using precision, accuracy, recall, and
F1-score, and obtained more than 99% on all evaluation metrics.
The paper [10] tried to identify the relevant attack parameters from a simple network management protocol
dataset. The chosen parameters were passed to two models: the first is linear regression, with an accuracy of 99.7%
and 3.3% of errors; in the second, the authors applied the gradient descent algorithm to the linear regression, and
the errors were reduced by 3%.
The paper [11] proposed a mathematical model for distributed denial-of-service attacks using machine learning
algorithms such as Logistic Regression and Naive Bayes, with the CAIDA 2007 dataset on the WEKA data mining
platform.
The paper [12] proposed a machine learning-based framework to detect distributed DoS (DDoS) and DoS attacks.
The authors used a dataset containing a large volume of network traffic, and they combined principal component
analysis (PCA) and singular value decomposition (SVD) to manage the high number of features in the dataset.
They evaluated their model using accuracy, recall, and F1 score, and obtained 100% accuracy using the random
forest algorithm.
The article [13] created a novel technique to detect denial of service (DoS) attacks that can reduce the feature
space, the calculation time, and model overfitting. The technique uses support vector machine (SVM) as the
training algorithm and the CICDDoS2019 dataset. Their model reached an accuracy of 99.95%.
The paper [14] presented research of a quantitative nature. The authors detected DDoS attacks on the
CIC-DDoS2019 dataset by applying the Logistic Regression, Decision Tree, Random Forest, AdaBoost, Gradient
Boost, KNN, and Naive Bayes classification algorithms. Their best classification results were obtained with the
AdaBoost and Gradient Boost algorithms; Logistic Regression, KNN, and Naive Bayes also gave good results, but
Decision Tree and Random Forest did not reach satisfactory results.

IV. Methodology
In this section, we detail the methodology employed to prepare and train the dataset for detecting Denial of
Service (DoS) attacks using artificial intelligence techniques. A robust and well-curated dataset is crucial for
training accurate and reliable machine learning models capable of identifying malicious network activity amidst
legitimate traffic. We describe the steps involved in collecting, preprocessing, labeling, and augmenting the dataset
to enhance its effectiveness in training DoS detection algorithms. The following figure summarizes the aim of this
research:

Figure 1: Graphical description for our work

The user sends an HTTP request that carries several features. We extract the 28 features compatible with our
dataset and pass them to our trained model, which classifies the type of request it belongs to.
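
The flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the feature names pktcount, bytecount, and dur, the helper functions, and the ThresholdModel stand-in are hypothetical, and a real deployment would pass all extracted features to the trained classifier instead.

```python
def to_feature_vector(request_stats, feature_names):
    """Order the extracted statistics in the column order the model expects.
    Raises KeyError if a required feature was not extracted."""
    return [float(request_stats[name]) for name in feature_names]


def classify_request(model, request_stats, feature_names):
    """Return 'malicious' or 'benign' for one request (1 = malicious, 0 = benign)."""
    label = model.predict([to_feature_vector(request_stats, feature_names)])[0]
    return "malicious" if label == 1 else "benign"


class ThresholdModel:
    """Hypothetical stand-in for a trained model: flags very high packet counts."""

    def predict(self, rows):
        return [1 if row[0] > 10_000 else 0 for row in rows]


# Hypothetical subset of the features; the real model uses the full feature set.
FEATURES = ["pktcount", "bytecount", "dur"]

flood = {"pktcount": 45000, "bytecount": 48000000, "dur": 100}
normal = {"pktcount": 120, "bytecount": 90000, "dur": 30}

flood_verdict = classify_request(ThresholdModel(), flood, FEATURES)
normal_verdict = classify_request(ThresholdModel(), normal, FEATURES)
print(flood_verdict, normal_verdict)  # malicious benign
```

Any object with a predict method that accepts a list of feature rows, such as a fitted scikit-learn classifier, could replace ThresholdModel here.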

IV.1. Dataset description

Our dataset, called the "DDoS SDN dataset", contains 104345 rows and 23 columns. It has a single target variable
called label, which takes only the values 1 (malicious) and 0 (benign). Our task is to classify whether the traffic is
normal or not using machine learning algorithms. The following figure shows the percentage of benign and
malicious requests in our dataset.

Figure 2: The percentage of benign and malicious requests in the dataset

Our dataset contains several features, such as source and destination IP addresses, packet count, byte count,
duration, and protocol. The next figures show, for some of these features, the number of requests and the number
of malicious requests for each feature value.

Figure 3: Number of requests from different IP addresses

Figure 4: Number of requests from different protocols
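
The counts behind such figures can be computed along the following lines. This is a minimal sketch on a handful of hypothetical (protocol, label) rows, not on the actual dataset; the function names are illustrative.

```python
from collections import Counter

# Hypothetical rows: (protocol, label), with 1 = malicious and 0 = benign.
rows = [
    ("TCP", 0), ("UDP", 1), ("ICMP", 0), ("TCP", 1),
    ("UDP", 0), ("TCP", 0), ("ICMP", 1), ("UDP", 1),
]


def class_percentages(rows):
    """Percentage of rows per label value."""
    counts = Counter(label for _, label in rows)
    total = sum(counts.values())
    return {label: 100.0 * n / total for label, n in counts.items()}


def requests_per_protocol(rows):
    """For each protocol: (total requests, malicious requests)."""
    totals = Counter(protocol for protocol, _ in rows)
    malicious = Counter(protocol for protocol, label in rows if label == 1)
    return {protocol: (totals[protocol], malicious[protocol]) for protocol in totals}


percentages = class_percentages(rows)
per_protocol = requests_per_protocol(rows)
print(percentages)
print(per_protocol)
```

The same grouping applies to any other categorical feature, for example the source IP address.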


IV.2. Data cleaning and preprocessing

Data cleaning and preprocessing are essential steps in preparing the raw dataset for training machine learning
models for Denial of Service (DoS) attack detection. This phase involves several key tasks aimed at enhancing the
quality and usability of the data [15], [16]:

1) Handling Missing Values.
2) Removing Duplicates.
3) Standardization and Normalization.
4) Encoding Categorical Variables.
5) Handling Outliers.
6) Feature Scaling.
7) Feature Engineering.
8) Data Transformation.

In our case, we dropped all rows containing null values, which occur in particular in the columns rx_kbps and
tot_kbps. We then dropped the source and destination address columns because they do not have any impact, and
finally we encoded the request protocol column as a categorical variable.
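
These three steps can be sketched with pandas as follows. The tiny sample frame is hypothetical, and the column names src, dst, Protocol, and pktcount are assumptions; only rx_kbps, tot_kbps, and label are named in the text. One-hot encoding is one possible reading of "categorizing" the protocol column; integer label encoding would be another.

```python
import pandas as pd

# Hypothetical sample standing in for the real dataset.
df = pd.DataFrame({
    "src": ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"],
    "dst": ["10.0.0.8", "10.0.0.8", "10.0.0.8", "10.0.0.8"],
    "Protocol": ["TCP", "UDP", "ICMP", "UDP"],
    "pktcount": [4500, 126000, 300, 98000],
    "rx_kbps": [12.0, None, 3.0, 0.0],
    "tot_kbps": [24.0, None, 3.0, 410.0],
    "label": [0, 1, 0, 1],
})

# 1) Drop every row that contains a null value.
clean = df.dropna()

# 2) Drop the source and destination address columns.
clean = clean.drop(columns=["src", "dst"])

# 3) Encode the protocol as one indicator column per protocol value.
clean = pd.get_dummies(clean, columns=["Protocol"])

# Separate the features from the target variable.
X = clean.drop(columns=["label"])
y = clean["label"]
print(X.columns.tolist())
```

Dropping rows is the simplest way to handle missing values; it is reasonable when only a small share of rows is affected.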

Careful cleaning and preprocessing reduce inconsistencies and noise in the dataset, laying the foundation for
robust and accurate DoS attack detection models. The resulting clean dataset is then ready for further analysis and
model training.

IV.3. Model training

In this section, we describe the process of training machine learning models for Denial of Service (DoS) attack
detection using the preprocessed dataset. We split the dataset into training (70%) and testing sets (30%) and train
multiple classification models, including Logistic Regression, Support Vector Machine (SVM), k-Nearest
Neighbors (KNN), Decision Tree, and Random Forest. Each model's performance is evaluated using appropriate
metrics to assess its effectiveness in detecting DoS attacks.
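
A minimal sketch of this training setup with scikit-learn is given below. It runs on synthetic data from make_classification rather than on the actual dataset, and it assumes default hyperparameters, a fixed random seed, and a stratified split; none of these choices are stated in the text.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the preprocessed features X and binary labels y.
X, y = make_classification(
    n_samples=1000, n_features=20, n_informative=8, random_state=42
)

# 70% training, 30% testing; stratification keeps the class ratio in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=42
)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Support Vector Machine": SVC(),
    "k-Nearest Neighbors": KNeighborsClassifier(),
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Random Forest": RandomForestClassifier(random_state=42),
}

# Fit each model on the training part and score it on the held-out test part.
test_accuracy = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    test_accuracy[name] = model.score(X_test, y_test)

for name, accuracy in test_accuracy.items():
    print(f"{name}: {accuracy:.3f}")
```

Distance-based and margin-based models such as KNN and SVM are sensitive to feature scale, so on real traffic features a scaling step before fitting would usually be worth testing.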

IV.4. Performance Evaluation

After training each model, we evaluate its performance on the held-out testing set using appropriate evaluation
metrics such as accuracy, precision, recall, and F1-score. These metrics provide insights into each model's ability
to correctly classify instances of normal and malicious network traffic. The evaluation results with all algorithms
and metrics are summarized in the following table:

                                                       Macro AVG
Model                     TP    FP    FN    TN    Precision  Recall  F1-score  Accuracy

Logistic Regression       0.51  0.10  0.13  0.26  0.75       0.76    0.75      0.77
Support Vector Machine    0.59  0.02  0.01  0.38  0.97       0.96    0.97      0.97
Random Forest             0.61  0     0     0.39  1          1       1         1
Decision Tree             0.60  0.01  0     0.39  0.98       0.98    0.98      0.98
K-Nearest Neighbors       0.60  0.01  0.01  0.38  0.98       0.98    0.98      0.98
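
To make the relation between the columns explicit, the sketch below recomputes macro-averaged precision, recall, F1-score, and accuracy from the four confusion-matrix proportions of a row. The function name macro_metrics is illustrative, and the code assumes that each class is predicted at least once so that no denominator is zero. For the Logistic Regression row it reproduces the reported values; for other rows the result can differ in the last digit because the tabulated proportions are already rounded to two decimals.

```python
def macro_metrics(tp, fp, fn, tn):
    """Macro-averaged precision, recall, and F1-score, plus accuracy, computed
    from confusion-matrix counts or proportions of a binary classifier."""
    # Per-class precision and recall for the positive class.
    p_pos = tp / (tp + fp)
    r_pos = tp / (tp + fn)
    # For the negative class the roles of the cells are swapped.
    p_neg = tn / (tn + fn)
    r_neg = tn / (tn + fp)
    # F1 is the harmonic mean of precision and recall, computed per class.
    f1_pos = 2 * p_pos * r_pos / (p_pos + r_pos)
    f1_neg = 2 * p_neg * r_neg / (p_neg + r_neg)
    return {
        "precision": (p_pos + p_neg) / 2,
        "recall": (r_pos + r_neg) / 2,
        "f1": (f1_pos + f1_neg) / 2,
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


# Logistic Regression row of the table.
lr = {k: round(v, 2) for k, v in macro_metrics(0.51, 0.10, 0.13, 0.26).items()}
print(lr)  # {'precision': 0.75, 'recall': 0.76, 'f1': 0.75, 'accuracy': 0.77}
```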


IV.5. Discussion of results

In the results provided, we have the performance metrics for five different classification models: Logistic
Regression (LR), Support Vector Machine (SVM), Random Forest (RF), Decision Tree (DT), and k-Nearest
Neighbors (KNN). For each model, the table reports the proportions of true positives (TP), false positives (FP),
false negatives (FN), and true negatives (TN) among the evaluated instances (the four values in each row sum
to 1), together with macro-averaged Precision, Recall, and F1-Score, and the Accuracy.

1) Logistic Regression (LR)

Logistic Regression achieves moderate performance with a macro-averaged precision of 0.75, indicating that,
averaged over the two classes, 75% of the instances assigned to a class actually belong to it. The recall score of
0.76 means that, on average, 76% of the instances of each class are correctly identified. The F1-Score, which is the
harmonic mean of precision and recall, is also 0.75. The model's accuracy is 0.77, indicating that it correctly
classifies 77% of all instances.

2) Support Vector Machine (SVM)

SVM achieves high performance with precision and F1-Score of 0.97 and recall of 0.96, indicating that it performs
very well in distinguishing between positive and negative instances. The high accuracy score of 0.97
demonstrates the model's overall effectiveness in correctly classifying instances.

3) Random Forest (RF)

Random Forest demonstrates outstanding performance, achieving perfect precision, recall, F1-Score, and accuracy
scores of 1. This indicates that the model makes no false positive or false negative predictions and correctly
classifies all instances in the testing set.

4) Decision Tree (DT)

Decision Tree performs slightly better than SVM, with precision, recall, F1-Score, and accuracy scores of 0.98,
indicating its effectiveness in classifying instances.

5) k-Nearest Neighbors (KNN)

KNN achieves excellent performance with precision, recall, F1-Score, and accuracy scores of 0.98, indicating its
robustness in classifying instances accurately.

As an overall analysis, we can say that:

- SVM, Random Forest, Decision Tree, and KNN outperform Logistic Regression in terms of precision, recall,
F1-Score, and accuracy.
- Random Forest stands out as the best-performing model, achieving perfect scores across all metrics.
- SVM, Decision Tree, and KNN also demonstrate high performance, with scores between 0.96 and 0.98 for
precision, recall, F1-Score, and accuracy.
- These results suggest that ensemble methods like Random Forest and robust classifiers like SVM, Decision
Tree, and KNN are well-suited for DoS attack detection tasks, offering high accuracy and reliability in
identifying malicious network traffic.

V. Conclusion
In conclusion, our study explores the efficacy of various machine learning models in detecting Denial of Service
(DoS) attacks based on network traffic data. Through rigorous experimentation and analysis, we have
demonstrated that ensemble methods such as Random Forest, as well as robust classifiers like Support Vector
Machine (SVM), Decision Tree, and k-Nearest Neighbors (KNN), exhibit exceptional performance in accurately
classifying instances of normal and malicious network activity. These models achieve high precision, recall, F1-
Score, and accuracy, with Random Forest notably achieving perfect scores across all metrics. Our findings
underscore the importance of leveraging advanced machine learning techniques for enhancing the resilience of
network infrastructures against DoS attacks, thereby safeguarding critical digital assets and infrastructure from
disruptive cyber threats.

Acknowledgements
We would like to express our sincere gratitude to our teachers Dr. GHAZLI Abdelkader and Dr. Bouida Ahmed
for their invaluable support, guidance, and contributions to this research endeavor. Their expertise, feedback, and
assistance have been instrumental in shaping the direction and outcomes of this study. We also acknowledge the
collective efforts of our colleagues and peers who have offered insights, encouragement, and constructive criticism
throughout the course of this research. Finally, we extend our heartfelt appreciation to our families and loved ones
for their unwavering support and understanding during the completion of this work.

References
[1] C. Easttom, Computer security fundamentals, Fourth edition. Indianapolis, Indiana: Pearson Education,
Inc., 2020.
[2] A. Géron, Hands-on machine learning with Scikit-Learn, Keras, and TensorFlow: concepts, tools, and
techniques to build intelligent systems, Second edition. Beijing [China]; Sebastopol, CA: O’Reilly Media,
Inc, 2019.
[3] I. Goodfellow, Y. Bengio, and A. Courville, Deep learning, in Adaptive computation and machine learning.
Cambridge, Massachusetts: The MIT Press, 2016.
[4] V. Chandola, A. Banerjee, and V. Kumar, "Anomaly detection: A survey", ACM Comput. Surv., vol. 41,
no. 3, pp. 1-58, Jul. 2009, doi: 10.1145/1541880.1541882.
[5] M. M. Rasheed, A. K. Faieq, and A. A. Hashim, "Development of a new system to detect denial of service
attack using machine learning classification", Indones. J. Electr. Eng. Comput. Sci., vol. 23, no. 2, p. 1068,
Aug. 2021, doi: 10.11591/ijeecs.v23.i2.pp1068-1072.
[6] S. Sambangi and L. Gondi, "A Machine Learning Approach for DDoS (Distributed Denial of Service)
Attack Detection Using Multiple Linear Regression", in The 14th International Conference on
Interdisciplinarity in Engineering (INTER-ENG 2020), MDPI, Dec. 2020, p. 51, doi:
10.3390/proceedings2020063051.
[7] E. S. Alghoson and O. Abbass, "Detecting Distributed Denial of Service Attacks using Machine Learning
Models", Int. J. Adv. Comput. Sci. Appl., vol. 12, no. 12, 2021, doi: 10.14569/IJACSA.2021.0121277.
[8] M. I. Kareem and M. N. Jasim, "Fast and accurate classifying model for denial-of-service attacks by using
machine learning", Bull. Electr. Eng. Inform., vol. 11, no. 3, pp. 1742-1751, Jun. 2022, doi:
10.11591/eei.v11i3.3688.
[9] S. A. M. Al-Juboori, F. Hazzaa, Z. S. Jabbar, S. Salih, and H. M. Gheni, "Man-in-the-middle and denial of
service attacks detection using machine learning algorithms", Bull. Electr. Eng. Inform., vol. 12, no. 1,
pp. 418-426, Feb. 2023, doi: 10.11591/eei.v12i1.4555.
[10] G. Rajakumaran, N. Venkataraman, and R. R. Mukkamala, "Denial of Service Attack Prediction Using
Gradient Descent Algorithm", SN Comput. Sci., vol. 1, no. 1, p. 45, Jan. 2020, doi:
10.1007/s42979-019-0043-7.
[11] K. Kumari and M. Mrunalini, "Detecting Denial of Service attacks using machine learning algorithms", J.
Big Data, vol. 9, no. 1, p. 56, Dec. 2022, doi: 10.1186/s40537-022-00616-0.
[12] F. Rustam, M. Mushtaq, A. Hamza, M. Farooq, A. Jurcut, and I. Ashraf, "Denial of Service Attack
Classification Using Machine Learning with Multi-Features", Electronics, vol. 11, no. 22, p. 3817, Nov.
2022, doi: 10.3390/electronics11223817.
[13] M. Aljanabi, R. Altaie, S. Talib, A. Hussien Ali, M. A. Mohammed, and T. Sutikno, "Distributed denial of
service attack defense system-based auto machine learning algorithm", Bull. Electr. Eng. Inform., vol. 12,
no. 1, pp. 544-551, Feb. 2023, doi: 10.11591/eei.v12i1.4537.
[14] K. B. Dasari and N. Devarakonda, "Detection of DDoS Attacks Using Machine Learning Classification
Algorithms", Int. J. Comput. Netw. Inf. Secur., vol. 14, no. 6, pp. 89-97, Dec. 2022, doi:
10.5815/ijcnis.2022.06.07.
[15] A. Zheng and A. Casari, Feature engineering for machine learning: principles and techniques for data
scientists. Beijing Boston Farnham Sebastopol Tokyo: O’Reilly, 2018.
[16] F. Provost and T. Fawcett, Data science for business: what you need to know about data mining and
data-analytic thinking, 1. ed., 2. release. Beijing Köln: O’Reilly, 2013.
