
Multimedia Tools and Applications (2022) 81:4185–4211

https://doi.org/10.1007/s11042-021-11740-z

Predictive machine learning-based integrated approach for DDoS detection and prevention
Solomon Damena Kebede1 · Basant Tiwari2 · Vivek Tiwari3 · Kamlesh Chandravanshi4

Received: 16 June 2021 / Revised: 22 September 2021 / Accepted: 8 November 2021 / Published online: 3 December 2021
© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2021

Abstract
A Distributed Denial of Service (DDoS) attack is a serious threat to the Internet and can inflict severe losses on systems, companies, and national security. An invader can disseminate DDoS attacks easily, yet such attacks are significantly harder to recognize and forestall. In recent years, many IT companies have been hit by DDoS attacks. The primary concern of this work is therefore to detect and prevent DDoS attacks. To fulfill this objective, data mining techniques such as JRip, J48, and k-NN have been employed for DDoS attack detection. These algorithms are implemented and thoroughly evaluated individually to validate their performance in this domain. The presented work has been evaluated using the recent CICIDS2017 dataset, which characterizes different DDoS attacks, viz. brute force SSH, brute force FTP, Heartbleed, infiltration, botnet TCP, UDP, and HTTP, together with the port scan attack. Further, a prevention method is applied to block the malicious nodes participating in any of these attacks. The proposed DDoS prevention works in a proactive mode to defend against all these attack types and is evaluated on various parameters such as Throughput, Packet Delivery Ratio (PDR), End-to-End Delay, and Normal Routing Load (NRL). The study shows that the proposed technique outperforms the standard AODV routing protocol.

Keywords CICIDS2017 · DDoS Attack · Machine learning · Classification algorithm · DDoS Detection · DDoS Prevention

* Vivek Tiwari
  viveknitbpl@gmail.com

1 Department of Information Technology, Hawassa University Institute of Technology, Hawassa, Ethiopia
2 Department of Computer Science, Hawassa University Institute of Technology, Hawassa, Ethiopia
3 Department of CSE, Dr. S P Mukherjee IIIT-NR, Raipur, India
4 Department of Information Technology, LNCT, Bhopal, India


1 Introduction

Nowadays, securing information under any circumstance is a critical prerequisite for private and government sectors. Various elements compromise the security of data; one of them is the attacker. An attacker's motivation may arise from different intentions, the most attractive being system vulnerabilities and valuable information [9, 25, 29]. The number of attacks over networks and other media has increased dramatically in recent years. Hence, a well-organized intrusion detection technique, acting as a security layer to capture and remove malicious, suspicious, and abnormal events, is mandatory. Thus, the Intrusion Detection System (IDS) has been introduced as a safety component to identify several attacks [24, 26]. Among the online attacks that cause the most wrecking impact and hamper IT security is Denial of Service (DoS) [20]. It has placed a striking burden on security specialists to bring out substantial safeguard measures. These attacks can be accomplished in different ways with an assortment of tools and code, and DoS attacks are considered among the major and toughest threats in the network. A newer, more advanced variant of DoS is the Distributed Denial of Service (DDoS) attack, which is more remarkable and overpowering owing to its distributive nature [15]. None of the existing detection and prevention mechanisms for DDoS attacks fulfills the IT world's needs, which is why these attacks still afflict some of the giant IT companies. DDoS attacks are a rather direct and influential technique that exhausts all types of network resources. They are initiated effortlessly, but preventing and tracking them is difficult. A large number of such attacks can break or block the healthy flow of a network, and so DDoS attacks have become dangerous threats to networked systems. DDoS attacks are classified into four categories: the first three are bandwidth consumption attacks, asset reduction attacks, and infrastructure attacks. The fourth is the zero-day attack, which is difficult to detect and may further damage the payload and other assets. Moreover, there is no known patch or fix available at the time of a zero-day exploit [14], so an efficient security framework that can reduce its impact is needed.
An Intrusion Detection System (IDS) is a system that screens network traffic flow for distrustful movement and issues alarms when such activity is found [18]. An IDS utilizes two fundamental detection methodologies: signature/rule-based detection and anomaly-based detection. A signature-based detection procedure compares incoming data against previously captured attack profiles stored in the IDS database, but it is able to identify only already-known attacks [5]. Data mining is characterized as the strategy of mining or extracting information from a massive volume of data; it is used to extract knowledge and related patterns from large datasets for predictive analysis [11, 28, 33]. The tremendous volume of existing and newly appearing network data has influenced researchers to use data mining technologies to investigate attacks [16, 17, 21, 23, 32]. Currently, various researchers continue to develop effective IDS systems for DDoS using machine learning algorithms [1, 3, 6, 8, 12, 19, 22, 30], since data mining methods are suitable for extensive mining of different patterns from a network traffic flow. This supports differentiating incoming packets to predict effectively whether a received packet is valid or an attack packet [8]. Vaseer, G. et al. presented new techniques to defend against a set of active attacks, like DoS, probe, vampire, and U2R, in MANETs. The


authors offered behavior-based and distributed trust-based analysis under the AODV routing protocol in NS-2 [31]. J. Batra and C. R. Krishna used the feed-forward back-propagation method as a classifier to defend MANETs against DDoS attacks under the AODV protocol; the authors claimed improvement in PDR and throughput and a significant reduction in delay [7]. N. Singh et al. presented a rate-limiting scheme to protect the network against DDoS attacks, which immediately eliminates a node from the network once it is found to be an attacker node [27]. This paper proposes a method that not only detects flooding types of DDoS but also protects the network against agents that generate such attacks. In this study, a data mining-based methodology (classification) is proposed to detect attack packets and then to protect the network by blocking the agent. The proposed system follows signature-based detection and has been evaluated using the publicly available CICIDS2017 dataset, which contains both DDoS attack and benign records. Using this dataset, the proposed detection system learns the nature of incoming packets and stores the signatures of attack packets. With proper learning, the system can detect a vulnerable node in real time. Furthermore, it also maintains a profile of each attacker node and keeps it in a grey list. The proposed prevention system takes the appropriate decision by iteratively inspecting the patterns of greylisted nodes. Hence, the proposed work behaves like a hybrid detection and prevention system, using signature-based detection as well as profile-based prevention.
The organization of this paper is as follows: a brief explanation of the DDoS attack is given in Section 1. Section 2 covers the materials and methods that are used in Section 3 to explain the details of the proposed methodology. Sections 4 and 5 emphasize the results and the discussion of the experiments conducted on the proposed work, respectively. Finally, Section 6 presents the conclusion.

2 Material and method

This part focuses on research methods, study design, and a strategy of data collection,
management, and analysis. The evaluation of the proposed work is done independently in
two phases: phase one for DDoS detection evaluation and phase two for DDoS prevention
evaluation.

2.1 Dataset

Many genuine datasets exist for DDoS, IDS, and related studies and are widely accessible, among them the FIFA World Cup 1998 dataset, KDD99 and NSL-KDD, DARPA 2009, CAIDA, DEFCON, UNSW-NB15, and CICIDS2017. Most of these datasets are outdated [11]. This paper uses the recent IDS dataset CICIDS2017, developed by the Canadian Institute for Cybersecurity (CIC). It is one of the distinctive datasets that incorporate modern attacks [2].

2.1.1 Description of the dataset

The CICIDS2017 dataset [13] is recent and was created to overcome gaps in earlier datasets. A common issue with outdated datasets is that they fail to capture attack signatures, since intrusion attack types evolve continuously and become more sophisticated. Other problems found in some datasets are the lack of features and metadata; some of them do not contain a sufficient variety of known attacks. Inappropriate training on outdated patterns leads to poor detection, whereas a benchmark dataset is needed for complete and healthy learning. Such a benchmark dataset must be recent and accurately labeled for training supervised data mining techniques. It must also be freely accessible and contain genuine network traffic with a wide range of attacks so that an effective IDS/IPS system can be prepared [2]. The CICIDS2017 dataset fulfills all these criteria and is therefore considered in this study.
CICIDS2017 contains the latest attack types, such as DoS, DDoS, brute force SSH, brute force FTP, Heartbleed, infiltration, and botnet [4, 13], along with the port scan attack. These attacks are included in the dataset under the generalized name of DDoS attack. The dataset contains 225,745 records and 79 attributes, including the label/class attribute. The label attribute has two levels, BENIGN and DDoS: BENIGN has 97,718 records, and DDoS has 128,027 records.

2.2 Dataset optimization

Most of the available datasets contain unwanted elements (missing, redundant, or infinity
values) that should be removed or transformed. The steps of preprocessing to obtain a suit-
able dataset are as follows:

2.2.1 Dataset cleaning

The neatness of the data is directly related to the quality of the research output, so the first effort is to clean up the CICIDS dataset. Initially there were 79 attributes; after removing the redundant attribute 'Fwd_Header_Length', which appeared twice in the list, the total number of attributes became 78. Furthermore, 2,633 redundant records out of 225,745 were dropped, leaving 223,112 records in the dataset. Two attributes have missing values:

• 'Total Length of Bwd packet', with four missing values, and
• 'Flow Bytes/s', with one missing value.

The 'ReplaceMissingValues' filter of WEKA substitutes all missing data with the mode (for nominal attributes) or the mean (for numeric attributes) computed from the training data. By default, the class attribute is skipped.
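As a minimal illustration (not the authors' WEKA pipeline), the three cleaning steps (dropping the duplicated attribute, removing redundant records, and mean-imputing missing numeric values, mirroring ReplaceMissingValues) can be sketched in Python on a toy table; the helper name `clean` and the toy values are hypothetical:

```python
from statistics import mean

def clean(rows, header):
    # 1. Drop a duplicated attribute (keep the first occurrence of each name).
    keep, seen = [], set()
    for i, name in enumerate(header):
        if name not in seen:
            seen.add(name)
            keep.append(i)
    header = [header[i] for i in keep]
    rows = [[r[i] for i in keep] for r in rows]
    # 2. Drop exact duplicate records.
    uniq, seen_rows = [], set()
    for r in rows:
        key = tuple(r)
        if key not in seen_rows:
            seen_rows.add(key)
            uniq.append(r)
    rows = uniq
    # 3. Replace missing numeric values (None) with the column mean,
    #    as WEKA's ReplaceMissingValues does for numeric attributes.
    for c in range(len(header)):
        vals = [r[c] for r in rows if r[c] is not None]
        if vals and all(isinstance(v, (int, float)) for v in vals):
            m = mean(vals)
            for r in rows:
                if r[c] is None:
                    r[c] = m
    return header, rows
```

On the real dataset these steps reduce 79 attributes to 78 and 225,745 records to 223,112.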

2.2.2 Features selection

The proposed work uses a filtering approach with a ranking algorithm to reduce the number of attributes, i.e., to remove irrelevant attributes. The performance of a classifier highly depends on the discrimination power of its attributes: suitable features enhance performance, whereas unsuitable ones increase computational and model building cost. The filtering-based selection method assigned a zero rank to ten features, which were therefore considered irrelevant and eliminated; together with the duplicate attribute already removed during cleaning, this leaves a total of 68 attributes used in the proposed work. The filtering approach serves as the feature selection mechanism for this work, and among the many available filtering algorithms [2], Information Gain and Gain Ratio are used for attribute selection.

• Information gain

Information gain (IG) quantifies the amount of information a feature provides about the class; it relates features to their corresponding class distribution. Concretely, it gauges the expected decrease in entropy. Two kinds of entropies are computed using frequency tables:

1. Entropy for a single attribute

i.e., the entropy of the target/parent:

E(A) = −Σ_{x=1}^{n} p_x log₂ p_x (1)

Where p_x indicates the proportion of instances belonging to class x (x = 1, …, n).

2. Entropy for two attributes

The entropy for every partition induced by the feature is computed [22] and then summed with its corresponding weight to give the overall entropy:

E(A, X) = Σ_{c∈X} P(c) E(c) (2)

Where A denotes the parent class, X the splitting attribute, P(c) the proportion of instances with attribute value c, and E(c) the entropy of that partition. The entropy of Eq. (2) is subtracted from the entropy of Eq. (1), and the result is the information gain:

Gain(A, X) = E(A) − E(A, X)

Where E(A) denotes the parent entropy, i.e., the entropy of the class attribute, and E(A, X) denotes the conditional entropy of Eq. (2).
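A small Python sketch of Eqs. (1) and (2) operating on label lists; the function names are ours, not from the paper:

```python
from collections import Counter
from math import log2

def entropy(labels):
    # Eq. (1): E(A) = -sum_x p_x * log2(p_x) over class proportions.
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def info_gain(feature_values, labels):
    # Gain(A, X) = E(A) - sum_c P(c) * E(labels | X = c), Eqs. (1)-(2).
    n = len(labels)
    cond = 0.0
    for v in set(feature_values):
        subset = [l for f, l in zip(feature_values, labels) if f == v]
        cond += (len(subset) / n) * entropy(subset)
    return entropy(labels) - cond
```

A perfectly predictive feature yields a gain equal to the parent entropy, while an uninformative feature yields a gain of zero.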

• Gain ratio

Gain Ratio (GR) is an adjustment of the information gain that decreases its bias. It takes the number and sizes of subdivisions into account when picking an attribute: the information gain is corrected by considering the intrinsic information of a split. The split information value denotes the potential information generated by splitting the training dataset S into p partitions using the attribute A.


Fig. 1  The intersection of attribute evaluation

SplitInfo_A(S) = −Σ_{j=1}^{p} (|S_j| / |S|) log₂(|S_j| / |S|) (3)

The gain ratio is defined as:

GainRatio(A) = Gain(A) / SplitInfo_A(S) (4)

After applying the feature selection methods with the ranking algorithm, the dataset attributes are ranked. Taking the intersection of the two rankings, the irrelevant features are removed from the dataset: in both algorithms, the same ten attributes are ranked 0 (zero), as shown in Fig. 1.
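Eqs. (3) and (4), together with the intersection of zero-ranked attributes, can likewise be sketched in Python (helper names are ours, and the example scores are hypothetical):

```python
from math import log2

def split_info(feature_values):
    # Eq. (3): SplitInfo_A(S) = -sum_j (|S_j|/|S|) * log2(|S_j|/|S|).
    n = len(feature_values)
    counts = {}
    for v in feature_values:
        counts[v] = counts.get(v, 0) + 1
    return -sum((c / n) * log2(c / n) for c in counts.values())

def gain_ratio(gain, feature_values):
    # Eq. (4): GainRatio(A) = Gain(A) / SplitInfo_A(S).
    si = split_info(feature_values)
    return gain / si if si > 0 else 0.0

def zero_ranked(ig_scores, gr_scores):
    # Intersection of attributes ranked 0 by both criteria (cf. Fig. 1).
    return sorted(a for a in ig_scores
                  if ig_scores[a] == 0 and gr_scores.get(a) == 0)
```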
Dataset details before and after preprocessing are shown in Table 1.

2.3 Model evaluation and selection for DDoS attack detection

This section describes the evaluation of the model and selection criteria.

Table 1  Dataset description before and after preprocessing

Status   No. of records   No. of attributes   Size (MB)   DDoS      Benign
Before   225,745          79                  73.5        128,027   97,718
After    223,112          68                  64.1        128,016   95,096


2.3.1 Metrics for evaluating classifier performance

Four terms need to be understood, as they are used in computing the various assessment measures:

True positives (TP): positive tuples correctly classified as positive by the algorithm.
True negatives (TN): negative tuples correctly classified as negative by the algorithm.
False positives (FP): negative tuples wrongly classified as positive.
False negatives (FN): positive tuples wrongly classified as negative.

The best classifier is selected on the basis of the following metrics:

Accuracy: the fraction of tuples the classifier labels correctly:

Accuracy = (TP + TN) / (P + N) (5)

where P is the total number of positive tuples in the dataset, N is the total number of negative tuples, and P + N is the total number of tuples in the dataset.

Error rate: the number of incorrect predictions divided by the total number of tuples in the dataset; the best error rate is 0.0 and the worst is 1.0:

ErrorRate = (FP + FN) / (P + N) (6)

Sensitivity: the proportion of positive tuples that are accurately recognized, also called the true positive rate:

Sensitivity = TP / P (7)

Specificity: the proportion of negative tuples that are correctly recognized, also called the true negative rate:

Specificity = TN / N (8)

Precision: the proportion of tuples labeled as positive that are really positive, also called the proportion of correctness:

Precision = TP / (TP + FP) (9)
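The five metrics of Eqs. (5) to (9) can be computed directly from the raw confusion counts; a minimal sketch (function name ours):

```python
def metrics(tp, tn, fp, fn):
    # P and N are the actual positive and negative tuple counts.
    p, n = tp + fn, tn + fp
    total = p + n
    return {
        "accuracy":    (tp + tn) / total,  # Eq. (5)
        "error_rate":  (fp + fn) / total,  # Eq. (6)
        "sensitivity": tp / p,             # Eq. (7), true positive rate
        "specificity": tn / n,             # Eq. (8), true negative rate
        "precision":   tp / (tp + fp),     # Eq. (9)
    }
```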


2.4 Cross‑validation

The k-fold cross-validation technique is used for validation. The whole dataset is divided into k disjoint parts of nearly equal size. Then k cycles are conducted: in each cycle, one part becomes the validation set while the remaining parts form the training set, and training and validation proceed up to the k-th cycle. In this technique, the accuracy of the classifier is calculated as:

Accuracy = (sum of correct classifications over the k rounds) / (total number of tuples in the original dataset)

This study uses 10-fold cross-validation for evaluating accuracy, since it gives low bias and variance.
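The k-fold procedure described above can be sketched in pure Python; `train_and_predict` stands in for any classifier (WEKA handles this internally in the paper's experiments), and the function name is ours:

```python
def kfold_accuracy(data, labels, train_and_predict, k=10):
    # Split indices into k near-equal disjoint folds; each fold serves once
    # as the validation set while the remaining k-1 folds train the model.
    n = len(data)
    folds = [list(range(i, n, k)) for i in range(k)]
    correct = 0
    for fold in folds:
        hold = set(fold)
        train_x = [data[i] for i in range(n) if i not in hold]
        train_y = [labels[i] for i in range(n) if i not in hold]
        preds = train_and_predict(train_x, train_y, [data[i] for i in fold])
        correct += sum(p == labels[i] for p, i in zip(preds, fold))
    # Accuracy = correct classifications over all k rounds / total tuples.
    return correct / n
```

For example, plugging in a majority-class baseline classifier exercises the full train/validate cycle without any external library.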

2.5 Evaluation metrics for DDoS attack prevention phase

The evaluation metrics for the prevention of DDoS attacks differ from those of the detection phase. Several metrics are considered, such as Throughput, Packet Delivery Ratio (PDR), and End-to-End (E2E) delay, described as follows:

Throughput: the amount of information transferred successfully from the sender to the receiver in a given timeframe, measured in bits per second (bps). It is a significant indicator of the performance and quality of a network connection; a high proportion of unsuccessful message deliveries eventually leads to lower throughput and degraded network performance.

Average Throughput = (Np × PacketSize) / Seconds (10)

where Np is the number of packets that reached their destination successfully.

Packet Delivery Ratio (PDR): the proportion of packets successfully delivered to the receiver out of the packets sent by the sender. It shows how effectively a protocol transfers packets from sender to receiver.

PDR = (Rp / Sp) × 100 (11)

where Rp is the total number of packets received successfully by the destination, and Sp is the total number of packets sent by the sender.

End-to-End (E2E) Delay: the time interval between the sender and the receiver when they exchange data:

E2ED = TR − TS

where TR is the packet receiving time at the destination, and TS is the sending time at the sender.
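Eqs. (10), (11), and the E2E delay formula translate directly into code; a sketch with our own helper names:

```python
def avg_throughput_bps(received_packets, packet_size_bits, seconds):
    # Eq. (10): Np * PacketSize over the measurement window, in bits/s.
    return received_packets * packet_size_bits / seconds

def pdr_percent(received, sent):
    # Eq. (11): Rp / Sp * 100.
    return received / sent * 100.0

def e2e_delay(recv_time, send_time):
    # E2ED = TR - TS.
    return recv_time - send_time
```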


3 Proposed algorithm

The proposed algorithm works in two phases, the DDoS attack detection phase and the DDoS attack prevention phase, which are described in the subsequent subsections.

3.1 DDoS attack detection phase methodology

First, the underlying dataset is preprocessed and then fed to the classification algorithms, namely J48, k-NN, and JRip. These classification algorithms individually build their working models. The dataset, the experimental environment (such as minimum and maximum memory allocation), and the testing option are the same for all algorithms. Each algorithm is evaluated on the performance metrics discussed in Section 2, and a comparison is made based on the outcome of the evaluation. Finally, the best classifier is selected for the classification of DDoS attack and benign packets. The proposed methodology has the following steps:
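As an illustration of evaluating candidate classifiers under identical settings: the paper uses WEKA's J48, k-NN, and JRip, whereas the tiny hand-rolled k-NN below is only our stand-in to show the idea, and all names are ours:

```python
from collections import Counter

def knn_predict(train_x, train_y, test_x, k=1):
    # Minimal k-NN on numeric feature vectors (squared Euclidean distance).
    preds = []
    for q in test_x:
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(x, q)), y)
            for x, y in zip(train_x, train_y)
        )
        votes = Counter(y for _, y in dists[:k])
        preds.append(votes.most_common(1)[0][0])
    return preds

def accuracy(preds, truth):
    # Fraction of test tuples classified correctly.
    return sum(p == t for p, t in zip(preds, truth)) / len(truth)
```

Running each candidate through the same `accuracy` evaluation on the same folds is what makes the comparison in this phase fair.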

3.2 DDoS attack prevention methodology

This phase handles a DDoS attack once one is detected in the network. It uses two parameters, called Waiting Time (WT) and Trial. Additionally, it maintains two lists, a Blacklist and a Greylist, to store the status of sender nodes. The blacklist stores untrustworthy nodes that generate DDoS attacks and must be excluded from the network. The greylist stores suspected nodes; it is implemented as a vector with two elements, node_id and the node's number of trials. Similarly, the blacklist is implemented as a single-element (node_id) vector. If the sender is a new node and tries to launch a DDoS attack, it is untrustworthy, so it is marked in the greylist with the minimum threshold WT. If, on the contrary, the sender node is already in the greylist and attempts a DDoS attack, the algorithm checks its number of trials (i.e., how many times the node has generated the attack). Such a node is moved to the blacklist once the trial count exceeds a threshold; otherwise, the waiting time is dynamically increased and the node remains in the greylist while the trial count is less than or equal to the threshold. The Waiting Time (WT) is the period during which a node is blocked from transferring data. The proposed algorithm sets the initial waiting time to 10 s. A node becomes eligible for retransmission once the given waiting time is over; however, if that node tries to send a packet more than ten times before the waiting time expires, it is shifted to the blacklist. The waiting time is incremented by multiplying the previous waiting time by the trial number. For example, if a sender is in the grey list and tries to send another packet for the 7th time (i.e., Trial = 7), the waiting time is calculated as follows:

Initial waiting time WT = 10 s
1st trial: WT = WT × no. of trials, i.e., 10 × 1 = 10 s
2nd trial: WT = WT × no. of trials, i.e., 10 × 2 = 20 s
3rd trial: WT = WT × no. of trials, i.e., 20 × 3 = 60 s
⋮
7th trial: WT = WT × no. of trials, i.e., 7200 × 7 = 50,400 s
With reference to the above example, once the node is listed in the grey list, it has to wait 50,400 s (14 h) after the 7th trial. Assume the attacker waits some time (say 30,000 s) and then attempts an 8th time: the waiting time increases to 30,000 × 8 = 240,000 s. This waiting time is a kind of punishment for the underlying node and protects the network from attacks. Once a node is registered in the blacklist, it is not allowed to generate any traffic; as a result, its packets are dropped rather than forwarded into the network.
The flowchart depicted in Fig. 2 presents the working of the proposed detection and prevention algorithm. Initially, the sender node's id is checked against the blacklist to avoid overwhelming the machine. If it is blacklisted, the incoming packet is ignored; otherwise, it goes to the classification module, which examines whether the packet is legitimate by assigning a class label. If it is legitimate, the incoming packet is accepted; otherwise, the prevention technique takes care of the incoming packet by looking at the sender node's status in the grey list.
It is worth noting that DDoS attacks naturally send hundreds of thousands of packets per second (a DDoS flooding attack), and the proposed WT method copes well with such a situation. The proposed prevention mechanism double-checks security: a node is either rejected within a second (if blacklisted) or examined later by the classifier to check whether its signature is normal or abnormal.
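The greylist/blacklist bookkeeping can be sketched as follows. This is our reading of the scheme, not the authors' NS-2 code: the waiting time is multiplied by the trial number on each repeated attack, and a node is blacklisted after more than ten trials. One simplification to note: the paper's 8th-trial example rebases WT on the attacker's elapsed wait (30,000 s), whereas this sketch always multiplies the stored WT.

```python
GREY_MAX_TRIALS = 10   # beyond this, the node is blacklisted
INITIAL_WT = 10        # seconds

class Prevention:
    def __init__(self):
        self.blacklist = set()   # node_id of excluded nodes
        self.greylist = {}       # node_id -> (trials, waiting_time)

    def on_attack_packet(self, node_id):
        """Called when the classifier labels a packet from node_id as DDoS.
        Returns the node's new waiting time in seconds, or None if the
        node is (now) blacklisted and the packet is silently dropped."""
        if node_id in self.blacklist:
            return None
        trials, wt = self.greylist.get(node_id, (0, INITIAL_WT))
        trials += 1
        if trials > GREY_MAX_TRIALS:
            self.blacklist.add(node_id)
            self.greylist.pop(node_id, None)
            return None
        wt = wt * trials   # WT grows with every repeated trial
        self.greylist[node_id] = (trials, wt)
        return wt
```

Tracing node 3 through seven trials reproduces the 10, 20, 60, …, 50,400 s sequence from the worked example above.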


Fig. 2  Proposed algorithm flowchart

4 Results

This section presents the results and further discussion of the detection and prevention methods on the CICIDS2017 dataset.


4.1 Experimental results on DDoS attack detection

The detection phase runs from dataset preprocessing through to the experimental performance evaluation of each individual algorithm.

4.1.1 Attribute selection

Feature selection, or attribute evaluation, is part of dataset preprocessing: irrelevant attributes are removed, making the dataset more precise and lightweight. InfoGainAttributeEval and GainRatioAttributeEval with the Ranker algorithm are used to select the best features; the output of both algorithms is shown in Figs. 3 and 4.
Ten attributes are ranked '0', i.e., selected as irrelevant, in both cases (attribute numbers 32, 33, 34, 50, 56, 57, 58, 59, 60, and 61). Attributes ranked '0' have no influence on the performance of attack detection.

4.1.2 The classification method evaluation

This section discusses the experimental results of each classification algorithm on the preprocessed dataset. Performance is measured using the same test option discussed in Section 2, i.e., k-fold cross-validation with k = 10.
1. JRip performance evaluation (Table 2)
2. J48 performance evaluation (Table 3)
3. k-NN performance evaluation (Table 4)

Fig. 3  The information gain evaluation with ranker algorithm


Fig. 4  The gain ratio evaluation with ranker algorithm

Table 2  Confusion Matrix of JRip on preprocessed data


Table 3  Confusion matrix of J48 on preprocessed data

4.1.3 Performance comparison

The summary of the evaluation of the individual algorithms on the unprocessed and preprocessed datasets is shown in Figs. 5 and 6 and Table 5. A detailed discussion of these results is included in Section 5.1.

4.2 Experimental results on DDoS attack prevention

The NS-2 simulator has been used to create an environment that mimics the attack and implements the DDoS attack prevention method. Two scenarios, the DDoS attack scenario and the proposed scenario, have been implemented under the AODV (Ad-hoc On-Demand Distance Vector) routing protocol. It is worth noting that the existing DDoS attack scenario under NS-2 has been updated as per the proposed prevention methodology. The two scenarios are then compared using the NRL, Throughput, PDR, and average E2E delay performance metrics to show network performance. The following subsections describe the simulation environment and the result analysis.


Table 4  Confusion matrix of k-NN on preprocessed data

Fig. 5  Algorithm’s performance on the unprocessed dataset

4.2.1 Simulation environment

The proposed prevention algorithm is simulated in NS-2 on the Ubuntu 16.04 LTS operating system. The simulation environment is shown in Table 6.


Fig. 6  Algorithm performance on preprocessed dataset

Table 5  Time-based evaluation of the individual classification algorithms

Algorithm   Model Building Time (s)   Model Evaluation Time (min)
JRip        148.75                    15.55
J48         43.12                     6.71
k-NN        0.16                      101.764

Table 6  Simulation environment

Simulation Parameter      Values
Channel Type              Wireless Channel
Propagation Model         Two Ray Ground
Traffic Type              CBR
CBR Packet Size           1000
MAC Type                  802.11
Interface Queue Type      DropTail/PriQueue
Link Layer Type           LL
Antenna Model             OmniAntenna
Max Packets in Queue      50
Number of Mobile Nodes    50
Routing Protocol          AODV
Window Size               800 × 800
Simulation Time (Sec)     100

Some parameters are mandatory in any wireless communication simulation, e.g., the wireless channel with MAC type 802.11. The number of nodes can be varied; the proposed work uses 50 nodes, which is feasible for a small ad hoc network. The routing protocol is AODV, because it offers shortest-route-based communication. The queue is DropTail, a queue management technique in NS-2 that stores incoming packets in order of arrival. The propagation model is the two-ray-ground technique, because mobile devices communicate through a ground-based technique that does not require sky or satellite-based communication.

4.2.2 Result analysis

With reference to Fig. 7, DDoS attacks are initiated at the 82nd second (on the simulation-time x-axis), and their effect grows up to 60 % (on the y-axis) as the simulation time reaches 100 s. This shows that network performance starts degrading heavily as the number of attacks increases.
Figures 8, 9, and 10 depict the performance against other parameters, namely Normal Routing Load, Throughput, and PDR, for the DDoS attack and proposed scenarios. The violet line represents performance under the DDoS attack scenario,
Fig. 7  Network performance with DDoS attack and time

Fig. 8  Performance against normal routing load analysis


while the green line depicts performance under the proposed scenario. The x-axis in all graphs shows the simulation time up to 100 s, while the y-axis shows the results of the respective performance metric over that time. A detailed discussion of these figures is included in Section 5.2.
Figure 8 shows the Normal Routing Load (NRL) results in both scenarios. As the figure illustrates, the NRL in the proposed scenario is lower than in the attack scenario. Note also that the attacks begin at the 82nd second of simulation time, after which the NRL increases drastically in the DDoS scenario while the performance of the proposed scenario remains intact.
Figure 9 illustrates the throughput analysis in both scenarios. It clearly shows that the proposed scenario performs much better than the DDoS attack scenario and achieves higher throughput. In Fig. 10, it can be seen that the Packet Delivery Ratio of the proposed work is notably higher than that of the normal AODV routing protocol.
Figure 11 illustrates the average end-to-end delay between the sender node and the receiver node when transferring data in both scenarios.
Table 7 illustrates a sample attack analysis for each node that initiates an attack packet and is listed in the grey list.

5 Discussion

This section discusses the experimental results of DDoS detection and prevention phases,
including various challenges.

5.1 Result discussion on DDoS attack detection

The DDoS detection method employed the CICIDS2017 dataset for the experiments, as it contains the most recent network attacks. Detection accuracy, error rate, sensitivity, specificity, precision, model building time, and model evaluation time are the performance metrics for the three classifiers, JRip, J48, and k-NN, which achieved 99.99 %, 99.986 %, and 99.97 % accuracy, respectively. The number of misclassifications in JRip, J48, and k-NN
Fig. 9  Performance against throughput analysis


Fig. 10  Performance against packet delivery ratio analysis

classifiers were 22, 31, and 66 instances, respectively. The model building (training) time has also been considered as a comparison measure in this study, where the JRip, J48, and k-NN classifiers took 148.75, 43.12, and 0.16 s, respectively. With reference to these experimental results, JRip offers the highest accuracy (99.99 %) but requires 148.75 s of training time, while k-NN and J48 offer 99.97 % and 99.986 % accuracy with 0.16 s and 43.12 s of model building time, respectively.
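These metrics all follow directly from the binary (attack/benign) confusion matrix; a minimal sketch, with illustrative counts rather than the actual CICIDS2017 results, is:

```python
# Computing the reported performance metrics from a binary confusion matrix.
# The counts below are illustrative, NOT the actual CICIDS2017 results.

def metrics(tp, tn, fp, fn):
    """Standard confusion-matrix metrics for a binary (attack/benign) classifier."""
    total = tp + tn + fp + fn
    return {
        "accuracy":    (tp + tn) / total,
        "error_rate":  (fp + fn) / total,
        "sensitivity": tp / (tp + fn),   # recall / true-positive rate
        "specificity": tn / (tn + fp),   # true-negative rate
        "precision":   tp / (tp + fp),
    }

m = metrics(tp=980, tn=1000, fp=10, fn=10)   # illustrative counts
print(m["accuracy"], m["error_rate"])        # 0.99 0.01
```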
In a predictive machine learning-based attack detection system, detection accuracy and model building time are both important aspects, because long (re)training may delay the handling of data packets in transit. In this view, the proposed system focuses not only on higher detection accuracy but also on minimal model building time. The experiment reveals that the JRip algorithm achieved 99.99 % accuracy with a model building time of 148.75 s, while the J48 algorithm achieved 99.986 % detection accuracy with a model building time of only 43.12 s, a reduction to less than one-third of JRip's model building time. Thus, this work identifies J48 as the best-performing attack classifier, offering a significant reduction in model building time without sacrificing much detection accuracy. These statistics encouraged us to choose J48 as the classifier for DDoS attack detection: it offers better detection accuracy and lower evaluation time than k-NN, and better model building and evaluation times than JRip. One of the challenges faced during this work was the scarcity of comparable state-of-the-art work, since there

Fig. 11  Performance against average end-to-end packet delivery delay (y-axis: average end-to-end delay in ms; series: DDoS Attack vs. Proposed Prevention)


Table 7  Sample grey-listed and black-listed nodes with waiting timestamp

Grey, Blacklist and Block Analysis
Node status             No. of trials
node 3 in Grey List     1–10
node 3 in Black List    (Time = 13.000000, Block The Node = 3)
node 18 in Grey List    1–10
node 18 in Black List   (Time = 20.900000, Block The Node = 18)


Table 8  Comparative analysis of the proposed DDoS detection with state-of-the-art work

                      Accuracy (%) on CICIDS2017 dataset         Accuracy (%) on
Algorithm             [1]      [22]     [4]      Proposed        NSL-KDD dataset [10]
JRip                  NA       NA       NA       99.99           95.24
J48 (Decision Tree)   NA       95.94    NA       99.986          94.81
k-NN                  95       93.01    90.6     99.97           96.08

NA: Not Available

were very few prior works using the CICIDS2017 dataset, it being the most recent one. Finally, it is also noticed that the k-NN algorithm performed identically on both versions of the dataset (unprocessed and preprocessed), whereas this was not the case for the JRip and J48 methods. It is also worth noting that J48 gives better accuracy with the preprocessed dataset. Table 8 shows the comparative analysis of the proposed DDoS detection with other existing work.
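The accuracy-versus-training-time comparison above can be reproduced in outline as follows. This sketch uses scikit-learn stand-ins (DecisionTreeClassifier for J48, KNeighborsClassifier for k-NN; RIPPER/JRip has no stock scikit-learn equivalent) on synthetic data, so it illustrates only the measurement procedure, not the reported numbers.

```python
# Measuring model building time and detection accuracy for J48/k-NN analogues.
# Synthetic data and scikit-learn stand-ins; NOT the paper's Weka experiments.
import time
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=5000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

for name, clf in [("J48-like tree", DecisionTreeClassifier(random_state=0)),
                  ("k-NN (k=1)", KNeighborsClassifier(n_neighbors=1))]:
    t0 = time.perf_counter()
    clf.fit(X_tr, y_tr)                 # model building (training) time
    build = time.perf_counter() - t0
    acc = clf.score(X_te, y_te)         # detection accuracy on the test split
    print(f"{name}: accuracy={acc:.4f}, build_time={build:.3f}s")
```

The same trade-off logic applies in Weka: the classifier chosen is the one with the best accuracy per unit of training time, which is how J48 was selected here.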

5.2 Result discussion on DDoS attack prevention

Concerning the experimental results in Fig. 8, the proposed prevention technique is advantageous over the DDoS attack scenario with respect to the Normal Routing Load measure. The proposed prevention scenario continues without extra routing load during the attack trial (at the 82nd second) because the attack is handled smartly, while the attack scenario's routing load and overhead drastically increase, by up to 14 %. This shows that the proposed prevention technique performs better than the attack scenario. Figure 9 shows the throughput results for both scenarios; the proposed prevention scenario performs better than the attack scenario, scoring an average throughput of 1533.88 Kbps compared to 870.61 Kbps in the attack scenario. The figure also shows that throughput degrades progressively in the presence of an attack. Figure 10 shows comparative PDR results for both scenarios, where the proposed prevention scenario delivers packets from source to destination more reliably; that is, fewer dropped packets are reported in the proposed scenario than in the attack scenario. PDR in the proposed technique is recorded at 90.77 %, compared to 83.29 % in the attack scenario. During the experiment, the proposed scenario sent 7918 packets to the destination, of which only 731 were dropped; these drops may be due to congestion, collision, or other technical faults in the network, but not due to an attack. Concurrently, the attack scenario sent 5990 packets toward the destination, of which 1001 were dropped because of the DDoS attack.
Figure 11 illustrates the comparative average end-to-end delay for both scenarios. The proposed technique recorded a 0.48 ms average end-to-end delay to reach the destination, while the attack scenario recorded 0.52 ms, i.e., 0.04 ms more than the proposed prevention scenario. A comprehensive summary of both scenarios is shown in Table 9.


Table 9  Comprehensive summary analysis

Overall summary                DDoS attack scenario    Proposed prevention scenario
Number of packets sent         5990                    7918
Number of packets received     4989                    7187
Packet Delivery Ratio (%)      83.29                   90.77
Average throughput (Kbps)      870.61                  1533.88
Normal Routing Load (%)        24.03                   0.99
Average e-e delay (ms)         0.52                    0.48
No. of dropped packets         1001                    731

Table 10  Comparative analysis of the proposed DDoS prevention technique with state-of-the-art works

Parameter                        [31]       [7]        [27]      DDoS attack scenario   Proposed prevention scenario
Packet Delivery Ratio (%)        89.76      86.173     48.75     83.29                  90.77
Throughput (Kbps)                1400       834.45     705.28    870.61                 1533.88
Average End-to-End Delay (ms)    0.51       0.61       NA        0.52                   0.48
Normal Routing Load              1.1        NA         NA        0.24                   0.99

NA: Not Available

Table 9 shows the results of all performance metrics under both scenarios, where PDR, throughput, NRL, and E2E delay are all better in the proposed prevention scenario. It is concluded that the proposed DDoS prevention technique mitigates the attack, prolongs the active route in the network, and provides better network performance for reliable data delivery.
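For reference, the Table 9 metrics follow from the raw packet counts as sketched below; the sent/received counts are those reported for the proposed prevention scenario, while the simulation duration, packet size, and routing-packet count are assumed placeholders, not values from the paper.

```python
# Deriving the Table 9 metrics from raw packet counts.
sent, received = 7918, 7187                    # proposed-scenario counts (Table 9)
dropped = sent - received                      # 731 packets
pdr = 100.0 * received / sent                  # Packet Delivery Ratio (%)

sim_time_s = 100.0                             # ASSUMED simulation length
packet_bits = 512 * 8                          # ASSUMED data packet size
throughput_kbps = received * packet_bits / sim_time_s / 1000.0

routing_packets = 78                           # ASSUMED control-packet count
nrl = 100.0 * routing_packets / received       # Normal Routing Load (%)

print(f"PDR = {pdr:.2f} %, dropped = {dropped}")   # PDR = 90.77 %, dropped = 731
```

With the real trace's duration, packet size, and control-packet count, the same formulas reproduce the throughput and NRL figures reported above.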
Table 10 shows the comparative analysis of the proposed DDoS prevention technique with
other similar existing work.

5.3 Space and time complexity analysis for proposed work

The time and space complexity of the proposed detection and prevention technique has been analyzed separately for route discovery and for the security technique. The proposed solution shows O(n²) time complexity for route discovery, since it uses the AODV routing protocol, which creates routes by broadcasting. Similarly, the space complexity for storing node information is O(n). Furthermore, the time and space complexity of the security technique is O(n), since it uses a distributed approach among the nodes: of the n nodes in the network, one is designated as the source, one as the destination, and one as the attacker, so (n − 3) nodes participate in security checking. Hence, the security algorithm consumes O(n) time and O(n) space.


6 Conclusions

Currently, DDoS attacks are critically damaging to the web and can bring extraordinary losses to organizations and governments. Advances in emerging technologies such as distributed computing, IoT, and AI give attackers opportunities to launch DDoS attacks with minimal effort, making such attacks difficult to detect and anticipate, and DDoS attacks have repeatedly been launched against many giant IT-based companies. The latest dataset in the IDS area, the CICIDS2017 dataset, has been used to model and detect DDoS attacks; it was prepared to overcome issues of existing intrusion datasets such as anonymity and outdated attack types. The primary concern of this paper is the detection and prevention of DDoS attacks. DDoS detection started with cleaning the dataset. The dataset's dimensionality was then reduced, because it contained a large number of features (79). This was achieved by selecting features with the Information Gain and Gain Ratio attribute evaluation methods combined with a ranking algorithm, which significantly reduced the number of features. After preprocessing, the cleaned dataset was given to classification algorithms popular in IDS, namely the JRip, J48, and k-NN classifiers. These algorithms were evaluated independently and the results recorded. Based on this evaluation, J48 was selected as the appropriate classifier because it achieved 99.986 % accuracy in detecting DDoS attacks.
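The feature-ranking step described above can be sketched as follows, using scikit-learn's mutual_info_classif as a stand-in for Weka's Information Gain attribute evaluator; the data here is synthetic, not CICIDS2017.

```python
# Information-gain-style feature ranking and reduction (illustrative stand-in
# for Weka's InfoGainAttributeEval + Ranker on the 79-feature CICIDS2017 data).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import mutual_info_classif

X, y = make_classification(n_samples=1000, n_features=10, n_informative=3,
                           random_state=0)
scores = mutual_info_classif(X, y, random_state=0)
ranking = np.argsort(scores)[::-1]   # features ranked by estimated information gain
top_k = ranking[:5]                  # keep only the highest-ranked features
X_reduced = X[:, top_k]
print("top features:", top_k, "reduced shape:", X_reduced.shape)
```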
The proposed DDoS prevention algorithm provides a proactive technique to secure the network against attack. Black and grey lists are maintained to store the status of nodes that are blacklisted or suspicious. Packets generated by a blacklisted node are not forwarded in the network, and packets from greylisted nodes are blocked from forwarding for a calculated waiting time, after which each node either becomes blacklisted or is treated as genuine, depending on its behavior. In other words, the proposed algorithm gives a suspicious node the chance to change its behavior and act as a genuine node. The proposed prevention algorithm is implemented in NS-2 on top of the AODV routing protocol, and its performance is compared with the attack scenario. It is found that PDR, throughput, end-to-end delay, and NRL are all better than in the attack scenario.

Authors’ contributions Not applicable.

Data availability Public datasets have been used.

Code availability No.

Declarations
Conflicts of interest/Competing interests There is not any conflict of interest among authors.

Ethics approval Not applicable.

Consent to participate Not applicable.

Consent for publication Not applicable.


References
1. Aamir M, Mustafa S, Zaidi A (2019) Clustering based semi-supervised machine learning for DDoS
attack classification. J King Saud Univ - Comput Inf Sci 33(4):436–446
2. Abdulhammed R, Musafer H, Alessa A, Faezipour M, Abuzneid A (2019) Features dimensionality
reduction approaches for machine learning based network. Electronics 8(3):322
3. Ahmed N, Hussain I, Yousaf Z (2019) Analysis and detection of DDoS attacks targetting virtual-
ized servers. International Journal of Computer Science and Network Security 19(1):128–133
4. Akram B, Gaviro JC (2019) CICIDS2017 dataset: Performance improvements and validation as a
robust intrusion detection system testbed. no. April, pp 0–13
5. Alzahrani S, Hong L (2018) Generation of DDoS attack dataset for effective IDS development and
evaluation. J Inf Secur 09(04):225–241
6. Ammar H, Yilmaz Y (2018) Real-time detection and mitigation of DDoS attacks in intelligent
transportation systems. IEEE, pp 157–163
7. Batra J, Krishna CR (2019) DDoS attack detection and prevention using AODV routing mechanism and FFBP neural network in a MANET. Int J Recent Technol Eng 8(2)
8. Bista S, Chitrakar R (2017) DDoS attack detection using heuristics clustering algorithm and naïve
bayes classification. J Inf Secur 9:33–44
9. Dejene D, Tiwari B, Tiwari V (2020) TD²SecIoT: Temporal, data-driven and dynamic network
layer based security architecture for industrial IoT. International Journal of Interactive Multimedia
& Artificial Intelligence 6(4)
10. Garg T, Khurana SS (2014) Comparison of classification techniques for intrusion detection dataset
using WEKA. IEEE Int Conf Recent Adv Innov Eng
11. Gupta PK, Tyagi V, Singh SK (2017) Introduction to predictive computing. Predictive computing and information security. Springer, Singapore. https://doi.org/10.1007/978-981-10-5107-4_1
12. Wang H, Cao Z, Hong B (2020) A network intrusion detection system based on convolutional neural network. J Intell Fuzzy Syst 38(6):7623–7637
13. Intrusion Detection Evaluation Dataset (CIC-IDS2017) (2017) https://www.unb.ca/cic/datasets/ids-2017.html. Accessed 31 June 2020
14. Kanimozhi V, Jacob TP (2019) Artificial intelligence based network intrusion detection with hyper-parameter optimization tuning on the realistic cyber dataset CSE-CIC-IDS2018 using cloud computing. ICT Express
15. Liu Z et al (2018) The efficiency comparison between DDoS and DoS attack. 2018 IEEE 9th Int
Conf Inf Technol Med Educ, pp 1050–1054
16. Maccari L, Passerini A (2019) A Big Data and machine learning approach for network monitoring and security. Security and Privacy 2(1):e53
17. Mohammed SS et al (2018) A new machine learning-based collaborative DDoS mitigation mecha-
nism in software-defined network. Int Conf Wirel Mob Comput Netw Commun 2018-Oct, pp 1–8
18. Nema A, Tiwari B, Tiwari V (2016) Improving accuracy for intrusion detection through layered
approach using support vector machine with feature reduction. In Proceedings of the ACM Sympo-
sium on Women in Research, pp 26-31
19. Patil NV, Krishna R, Kumar CK (2020) Apache spark based real-time DDoS detection system. J
Intell Fuzzy Syst, IOS Press 38(5):6527–6535
20. Roempluk T, Surinta O (2019) A machine learning approach for detecting distributed denial of service attacks. 2019 Joint Int Conf on Digital Arts, Media and Technology with ECTI Northern Section Conf on Electrical, Electronics, Computer and Telecommunications Engineering (ECTI DAMT-NCON), pp 146–149
21. Shah S (2019) A comprehensive survey of machine learning-based network intrusion detection.
Smart Intell Comput Appl. Springer, Singapore, pp 345–356
22. Sallam AA, Kabir MN, Alginahi YM, Jamal A, Thamer KE (2020) IDS for improving DDoS attack recognition based on attack profiles and network traffic features. 16th IEEE Int Colloq Signal Process Appl, pp 255–260
23. Salloum SA, Alshurideh M, Elnagar A, Shaalan K (2020) Machine learning and deep learning techniques for cybersecurity: A review. Proc Int Conf on Artificial Intelligence and Computer Vision (AICV2020), pp 50–57
24. Sharafaldin I, Lashkari AH, Ghorbani AA (2018) Toward generating a new intrusion detection
dataset and intrusion traffic characterization. Proc of the 4th Int Conf Inf Syst Secur Priv (ICISSP
no. Cic, pp 108–116
25. Sharma K, Gupta BB (2018) Taxonomy of Distributed Denial of Service (DDoS) attacks and
defense mechanisms in present era of smartphone devices. Int J E-Services Mob Appl 10(2):58–74


26. Shrivastava A, Sondhi J, Khan S (2017) An implementation of intrusion detection system using
machine learning classification technique. Int Res J Eng Appl Sci 5(2):14–17
27. Singh N, Dumka A, Sharma R (2018) A novel technique to defend DDoS attack in MANET. J Comput Eng Inf Technol 7:5. https://doi.org/10.4172/2324-9307.1000214
28. Singh M, Kant U, Gupta PK, Srivastava VM (2019) Cloud-based predictive intelligence and its secu-
rity model. Predictive intelligence using big data and the Internet of things. IGI Global, pp 128–143
29. Tandon R, Gupta P (2021) A novel pseudonym assignment and encryption scheme for preserving the privacy of military vehicles. Def Sci J 71(2):192–199. https://doi.org/10.14429/dsj.71.15534
30. Roopak M, Tian GY, Chambers J (2020) An intrusion detection system against DDoS attacks in IoT networks. IEEE, pp 562–567
31. Vaseer G, Ghai G, Patheja PS (2017) A novel intrusion detection algorithm: An AODV routing pro-
tocol case study. In 2017 IEEE International Symposium on Nanoelectronic and Information Systems
(iNIS). IEEE, pp 111-116
32. Xie J, Yu FR, Huang T, Xie R, Liu J, Wang C (2018) A survey of machine learning techniques applied to software defined networking (SDN): Research issues and challenges. IEEE Commun Surv Tutor 21(1):393–430
33. Yadav S, Tiwari V, Tiwari B (2016) Privacy preserving data mining with abridge time using vertical
partition decision tree. In Proceedings of the ACM Symposium on Women in Research, pp 158-164

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and
institutional affiliations.
