
International Journal of Computational Intelligence Systems (2023) 16:177

https://doi.org/10.1007/s44196-023-00355-x

RESEARCH ARTICLE

A Comparative Study of Using Boosting-Based Machine Learning Algorithms for IoT Network Intrusion Detection

Mohamed Saied1 · Shawkat Guirguis1 · Magda Madbouly1

Received: 13 July 2023 / Accepted: 22 October 2023


© The Author(s) 2023

Abstract
The Internet-of-Things (IoT) environment has revolutionized the quality of living standards by enabling seamless connectivity and automation. However, the widespread adoption of IoT has also brought forth significant security challenges for manufacturers and consumers alike. Detecting network intrusions in IoT networks using machine learning techniques shows promising potential. However, selecting an appropriate machine learning algorithm for intrusion detection poses a considerable challenge. Improper algorithm selection can lead to reduced detection accuracy, increased risk of network infection, and compromised network security. This article provides a comparative evaluation of six state-of-the-art boosting-based algorithms for detecting intrusions in IoT. The methodology involves benchmarking the performance of the selected boosting-based algorithms in multi-class classification. The evaluation includes a comprehensive classification performance analysis covering accuracy, precision, detection rate, and F1 score, as well as a temporal performance analysis covering training and testing times.

Keywords Internet-of-Things · Machine learning · Cyber security · Intrusion detection · Extreme boosting · Light boosting ·
Categorical boosting · Supervised learning

Shawkat Guirguis and Magda Madbouly contributed equally to this work.

Mohamed Saied: igsr.msaied@alexu.edu.eg
Shawkat Guirguis: shawkat_g@alexu.edu.eg
Magda Madbouly: mmadbouly@alexu.edu.eg

1 Department of Information Technology, Institute of Graduate Studies and Research, Alexandria University, Alexandria 21526, Egypt

1 Introduction

The rapid proliferation of Internet-of-Things (IoT) devices has revolutionized various industries, enabling seamless connectivity, data exchange, and automation. However, the widespread adoption of IoT technology has also brought forth new security challenges, particularly in the realm of network intrusion detection. IoT includes an extensive number of resource-limited and heterogeneous devices [1]. As a result, each layer of the three-tier IoT environment represents a potential attack surface and suffers from a variety of security threats.

The threat landscape changes when moving from conventional networks to IoT-based networks. This shift introduces unique challenges and expands the attack surface, making IoT networks more susceptible to various security threats. With the growing number of machines and smart devices connected to the network, the vulnerabilities of IoT security gradually increase. IoT networks typically consist of a vast number of interconnected devices with diverse hardware, operating systems, and communication protocols. This scale and heterogeneity make it challenging to implement consistent security measures across all devices, leading to potential vulnerabilities [42].

The limited resources of IoT devices compromise the capability of installing protective security solutions. IoT devices often have limited computational power, memory, and energy resources. This limitation makes it difficult to deploy resource-intensive security solutions, such as robust encryption or intrusion detection systems, on all IoT devices. Attackers can exploit these resource constraints to launch attacks and compromise the devices. Moreover, many IoT devices lack proper mechanisms for software updates and


patches, which adds difficulty to an already challenging environment [2]. This situation makes it difficult to address known vulnerabilities and apply security fixes promptly, leaving devices exposed to known attacks.

IoT standards and protocols are still evolving [3], resulting in a lack of uniform security practices across different IoT devices and ecosystems. Inconsistent security implementations can create vulnerabilities, as attackers can exploit weak links in the network. This is especially true considering that IoT devices are often deployed in physically exposed and uncontrolled environments, such as industrial settings or public infrastructure [4]. This physical exposure increases the risk of physical tampering, unauthorized access, and device compromise.

Securing IoT is the only way to support its spread; the alternative is its decline. Detecting and mitigating intrusions in IoT networks is of paramount importance to safeguard sensitive data, ensure privacy, and maintain the integrity of IoT systems. Network intrusion detection plays a crucial role in providing real-time protection for the IoT environment. It is used to monitor network traffic and distinguish between normal and abnormal network behaviors. Traditional network intrusion detection systems (NIDS) may not be well-suited to address the unique characteristics and challenges presented by IoT networks [5]. The scale, heterogeneity, and resource constraints of IoT devices necessitate innovative approaches to effectively detect and respond to network intrusions.

Incorporating Machine Learning (ML) into the defense architecture has shown promise in this domain. It contributes to achieving higher detection accuracy, in addition to the capability of detecting zero-day infections [6]. Boosting is an ensemble modeling technique that improves the predictive accuracy of ML algorithms by combining weak or base learning models into a strong predictive model [7]. Its core idea is to iteratively train the base models and then combine their predictions to improve the accuracy of the overall ensemble model.

The objective of this paper is to conduct a comparative study on the effectiveness of boosting-based machine learning algorithms for IoT network intrusion detection. The study covers multiple boosting-based models, i.e., Adaptive Boosting (ADB), Gradient Descent Boosting (GDB), Extreme Gradient Boosting (XGB), Categorical Boosting (CAB), Hist Gradient Boosting (HGB), and Light Gradient Boosting (LGB). It aims to evaluate their efficacy within the context of IoT network security and identify the most suitable algorithm for accurate and efficient intrusion detection. The findings of this comparative study will contribute to the existing body of knowledge on IoT network security and intrusion detection. By identifying the most effective boosting-based machine learning algorithm for IoT network intrusion detection, this research can guide the development of robust and efficient intrusion detection systems tailored to the unique characteristics and constraints of IoT environments. Furthermore, the insights gained from this study can inform the design of proactive security measures to mitigate the risks associated with IoT network intrusions.

The main contributions of this paper are fourfold:

1. Examining the literature on using boosting-based ML algorithms in IoT network intrusion detection.
2. Conducting an exploratory data analysis (EDA) of the N-BaIoT data set [8] to analyze and summarize its main characteristics and features.
3. Investigating the potential of boosting-based methods for detecting IoT botnet attacks through an experimental performance evaluation of six boosting-based ML algorithms: ADB, GDB, XGB, CAB, HGB, and LGB.
4. Benchmarking the six models through a computational analysis to gain more insight into how lightweight they are for an IoT environment.

The remaining sections of this paper are structured as follows. Section 2 surveys the related work. Section 3 presents a background for the boosting-based ML algorithms. Section 4 demonstrates the evaluation scheme. It describes the data set used, the data set preprocessing, and the evaluation metrics. Section 5 introduces the experimental results for model performance evaluation. Section 6 concludes this work and provides possible future research directions.

2 Related Work

A number of studies have addressed the detection of network intrusions in IoT environments. This section investigates papers applying boosting-based ML algorithms for detecting intrusions in IoT environments. A quantitative systematic review approach is followed to select relevant studies. An extensive search was conducted using scientific electronic search engines on scientific databases including IEEE Xplore, Science Direct, Scopus, and Research Gate. The search is limited to publications written in English and published in scientific journals, conferences, or theses. All combinations of “machine learning”, “boosting”, “intrusion detection” and “IoT” were used in the title, abstract, and keywords over the period from 2017 to 2023. The focus was only on work published during that period because the trigger for this research field was the botnet malware (Mirai) reported in 2016. The US Computer Emergency Readiness Team (US-CERT) reported a botnet

malware that had disrupted the services of a major US Internet provider. It caused a disruption of multiple major websites via a series of massive distributed denial of service (DDoS) attacks. It spread quickly and infected thousands of endpoints. The ML umbrella covers several learning techniques. Boosting algorithms have been around for years, yet only recently have they become mainstream in the ML community. This section surveys and discusses the boosting-based literature on network intrusion detection in IoT environments.

Kumar et al. [9] used a two-step process, consisting of a detection step and an analysis step, for identifying peer-to-peer (P2P) bots. For the classification step, tenfold cross-validation is used on Random Forest (RF), Decision Tree (DT) and XGB. Their approach achieved a detection rate of 99.88%. They trained the model for P2P botnet detection using traffic from three botnets, namely Waledac, Vinchuca, and Zeus.

Liu et al. [10] studied eleven ML algorithms for detecting intrusions in Contiki-NG-Based IoT Networks. They reported that XGB achieved the best performance with 97% accuracy using the NSL–KDD data set.

Alqahtani et al. [11] proposed the Fisher score for reducing the number of features for IoT botnet attack detection. Their approach used a genetic-based extreme gradient boosting (GXGB) model. The Fisher score allowed them to select only three out of 115 data features of the N-BaIoT data set [8]. Their approach achieved an accuracy of 99.96%. Dash et al. [12] proposed a multi-class Adaptive Boost model (ADB) for predicting the anomaly type. They used the IoT security data set from DS2OS [13] for the model evaluation. This data set covers eight types of anomalies: data probing, denial of service (DoS), malicious control, malicious operation, scan, spying and wrong setup. They reported an anomaly detection accuracy of 95%.

Krishna et al. [14] proposed a hybrid approach based on ML and feature selection. The NSL–KDD [8] and NBaIoT data sets [14] are used, with feature extraction applied using recursive feature elimination (RFE). They reported an accuracy of 99.98%. They compared it with a GDB classifier, which achieved an accuracy of 99.30%. Hazman et al. [15] proposed an approach for intrusion detection in IoT-based smart environments with ensemble learning, called IDS–SIoEL. Their approach uses ADB and combines different feature selection techniques, namely Boruta, mutual information, and correlation. They evaluated their approach on the IoT-23, BoT–IoT [16], and Edge-IIoT data sets. They reported a detection accuracy of 99.90%.

Ashraf et al. [17] proposed a federated intrusion detection system for blockchain enabled IoT healthcare applications. Their approach is based on using lightweight artificial neural networks (ANN) in a federated learning way. In addition, it uses blockchain technology to provide a distributed ledger for aggregating the local weights and then broadcasting the updated global weights after averaging. They compared ANN and XGB models on the BoT–IoT data set. The results show that ANN performed better, with 99.99 compared with 98.96 for XGB.

Khan et al. [18] proposed a proactive interpretable prediction model to detect different types of security attacks using the log data generated by heating, ventilation, and air conditioning (HVAC) attacks. Several ML algorithms were used, such as DT, RF, GDB, ADB, LGB, XGB, and CAB. They reported that the XGB classifier produced the best result with 99.98% accuracy. Their study was performed using the Elnour et al. [19] HVAC systems data set.

Alissa et al. [20] proposed a DT, an XGB model, and a logistic regression (LR) model. They used the UNSW-NB15 data set and applied a feature correlation technique that resulted in discarding nine features. They reported that DT performed best with 94% test accuracy, slightly higher than XGB, while LR achieved the worst accuracy.

Al-Haija et al. [21] proposed an ensemble learning model for botnet attack detection in IoT. Their approach applies voting-based probability to ensemble three ML classifiers, i.e., ADB, a random under-sampling boosting model (RUS), and a bagged model. The individual performance of the selected classifiers was 97.30, 97.70, and 96.20, respectively, while the performance of the proposed ensemble model was 99.60%.

Garg et al. [22] compared the performance of boosting techniques with non-boosting ensemble-based techniques. They identified two types of attacks: IoT attacks and DDoS attacks, as binary class and multiclass output, respectively. Three data sets were used for evaluation: BoT–IoT, IoT-23 and CIC–DDoS-2019. Two boosting methods were used, i.e., XGB and LGB. LGB achieved the best performance with an accuracy of 94.79%.

Bhoi et al. [23] proposed an LGB-based model for anomaly detection in IoT environments. They used Gravitational Search-based optimization (GSO) for optimizing LGB hyper-parameters and compared it with particle swarm optimization (PSO). They used a simulated IoT sensors data set, called the IoT data set, that is cited in [24]. They reported an optimal accuracy of 100%.

Awotunde et al. [25] proposed a boosting-based model for intrusion detection in industrial Internet-of-Things (IIoT) networks. They investigated the detection performance of various ensemble classifiers, such as XGB, Bagging, extra trees (ET), RF, and ADB. They utilized the Telemetry data of the TON_IoT data sets. The results indicated that XGB showed the highest accuracy in detecting and classifying IIoT attacks. Rani et al. [26] compared several algorithms for intrusion detection in IoT environments, i.e., LR, RF, XGB, and LGB. They utilized the DS2OS data set [27] and reported that XGB and LGB achieved an almost equal accuracy of 99.92%.


Table 1 presents a comparative analysis of the previous related work in tabular form. It lists the surveyed papers that adopt boosting-based approaches for detecting IoT network intrusions, together with their characteristics. The papers are ordered chronologically and characterized in terms of the objective, the employed boosting algorithm(s), the evaluation data set, the number of classes, the number of features, and the reported accuracy.

3 ML Boosting-Based Algorithms

In the context of IoT intrusion detection, ML techniques play a crucial role in enhancing security measures. One such technique is boosting, which leverages the concept of ensemble supervised learning to strengthen the detection capabilities. By combining several simple models, called weak learners or base estimators [7], into a strong model, boosting effectively reduces bias and variance in the prediction process [28]. It aggregates all predictions from its constituent learners in a sequential manner. In this way, each learner reduces the error of its predecessor to update the residual error. This section introduces six different boosting-based algorithms. These algorithms build upon the principles of ensemble learning, empowering ML models to achieve higher prediction accuracy in detecting and mitigating intrusions in IoT systems.

The general architecture of the boosting techniques consists of the following steps:

1. Initialize the training data set and assign equal weights to each training instance.
2. Train a base learner on the weighted data set.
3. Adjust the weights of misclassified instances to give them higher importance.
4. Repeat steps 2 and 3 for a specified number of iterations (or until a stopping criterion is met).
5. Combine the predictions of all weak learners using a weighted voting or averaging scheme to obtain the final prediction.

3.1 Adaptive Boosting

The Adaptive Boosting algorithm (AdaBoost or ADB) of Freund and Schapire was the first practical boosting algorithm [29]. The algorithm begins by fitting a classifier on the original data set, and then fits additional copies of the classifier on the same data set. It assigns higher weights to the incorrectly classified instances and lower weights to the correctly classified instances to focus more on difficult cases. The exact process repeats until the best possible result is achieved and the algorithm has used all the instances in the data. It is implemented in the scikit-learn class sklearn.ensemble.AdaBoostClassifier [30].

The mathematical architecture of the ADB model training can be summarized as follows [31]:

1. Initialize the weights of the training samples:

$$w_i = 1/N, \tag{1}$$

where $N$ is the number of training samples.

2. For each boosting iteration $t = 1$ to $T$:

• Train a weak learner on the training data with weights $w_i$.
• Compute the weak learner’s error rate:

$$\epsilon_t = \sum_i w_i \, \delta(y_i \neq h_t(x_i)), \tag{2}$$

where $h_t(x_i)$ is the weak learner’s prediction for sample $x_i$ and $\delta(\cdot)$ is an indicator (Kronecker-delta-style) function that equals 1 for a misclassified sample and 0 otherwise.
• Compute the weak learner’s weight in the ensemble:

$$\alpha_t = 0.5 \ln\left((1 - \epsilon_t)/\epsilon_t\right). \tag{3}$$

• Update the sample weights:

$$w_i = w_i \exp\left(\alpha_t \, \delta(y_i \neq h_t(x_i))\right). \tag{4}$$

• Normalize the sample weights:

$$w_i = w_i / \sum_i w_i. \tag{5}$$

3. Output the final boosted model: $H(x) = \mathrm{sign}\left(\sum_t \alpha_t \, h_t(x)\right)$.

ADB is known for its ability to handle complex data sets and achieve high accuracy. It focuses on misclassified samples, giving them higher weights in subsequent iterations, leading to improved performance. It is resistant to over-fitting and can work well with weak classifiers. However, it can be sensitive to noisy data and outliers, which can negatively impact its performance. It may struggle with data sets that have imbalanced class distributions.

3.2 Gradient Descent Boosting

Gradient descent boosting (GDB) is an extension of the boosting technique in which the process of additively generating weak models is formalized as a gradient descent algorithm. The final prediction is a weighted sum of all of the tree predictions. All its weak learners are decision trees. The idea behind it is to take a weak hypothesis or weak learning algorithm and make a series of tweaks to it that improve the strength of the hypothesis/learner.
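The additive, gradient-descent view of GDB described in this subsection can be illustrated with a minimal Python sketch. It fits one-split regression stumps to the residuals of a squared-error loss and sums their shrunken predictions. This is an illustration under stated assumptions, not the implementation evaluated in this paper: the helper names (fit_stump, gradient_boost, predict, mse), the toy data, the learning rate of 0.5, and the choice of squared-error loss are all hypothetical.

```python
# Illustrative sketch only: gradient boosting with one-split regression stumps
# and squared-error loss. All names and the toy data are hypothetical.

def fit_stump(xs, residuals):
    """Return (threshold, left_value, right_value) minimizing squared error."""
    best = None
    for thr in sorted(set(xs))[:-1]:          # candidate split points
        left = [r for x, r in zip(xs, residuals) if x <= thr]
        right = [r for x, r in zip(xs, residuals) if x > thr]
        lv, rv = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, thr, lv, rv)
    return best[1:]


def gradient_boost(xs, ys, n_rounds=20, eta=0.5):
    """Build F_T(x) = F_0 + eta * sum_t h_t(x) by fitting stumps to residuals."""
    f0 = sum(ys) / len(ys)                    # constant minimizing squared error
    preds = [f0] * len(xs)
    stumps = []
    for _ in range(n_rounds):
        # For L = 0.5 * (y - F)^2 the negative gradient is the residual y - F.
        residuals = [y - p for y, p in zip(ys, preds)]
        thr, lv, rv = fit_stump(xs, residuals)
        stumps.append((thr, lv, rv))
        preds = [p + eta * (lv if x <= thr else rv) for x, p in zip(xs, preds)]
    return f0, stumps


def predict(model, x, eta=0.5):
    """Evaluate the boosted model at x (eta must match the training value)."""
    f0, stumps = model
    return f0 + sum(eta * (lv if x <= thr else rv) for thr, lv, rv in stumps)


def mse(model, xs, ys, eta=0.5):
    """Mean squared error of the boosted model on (xs, ys)."""
    return sum((y - predict(model, x, eta)) ** 2 for x, y in zip(xs, ys)) / len(xs)


xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
ys = [0.1, 0.3, 0.2, 1.1, 1.3, 2.9, 3.1, 3.0]
baseline = gradient_boost(xs, ys, n_rounds=0)   # constant model F_0 only
boosted = gradient_boost(xs, ys, n_rounds=20)
print(mse(baseline, xs, ys), mse(boosted, xs, ys))
```

With squared error, each round can only lower the training error, which is why the boosted model fits the toy data more closely than the constant baseline. Other losses would change only the residual computation.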

Table 1 Comparative analysis for the related work

| Boosting algorithm | Author | Ref | Year | Data set | Objective | No of classes | No of features | Accuracy (%) |
|---|---|---|---|---|---|---|---|---|
| XGB | Kumar | [9] | 2019 | Synthetic | Peer-to-Peer Botnet Detection | 2 | 18 | 99.88 |
| XGB | Liu | [10] | 2020 | NSL-KDD | NIDS for Contiki-NG-Based IoT Networks | 8 | 8 | 97.00 |
| GXGB | Alqahtani | [11] | 2020 | N-BaIoT | Botnet attack detection in IoT | 3 | 3 | 99.96 |
| ADB | Dash | [12] | 2020 | DS2OS | Anomaly detection in IoT | 8 | 13 | 95.00 |
| ADB | Krishna | [14] | 2021 | NSL-KDD and NBaIoT | Attack detection in IoT | 4 | 41 | 99.30 |
| EL ADB | Hazman | [15] | 2022 | IoT-23, BoT-IoT, Edge-IIoT | NIDS for Smart cities IoT | 2 | 30 | 99.90 |
| XGB | Ashraf | [17] | 2022 | CIC-IDS2018, N-BaIoT, KDD Cup 99 | NIDS for Blockchain enabled IoT Healthcare Applications | 2 | 10 | 98.96 |
| XGB | Khan | [18] | 2022 | Elnour et al. HVAC data set | Attack detection for HVAC | 2 | 24 | 99.98 |
| XGB/DT | Alissa | [20] | 2022 | UNSW-NB15 | Botnet attack detection in IoT | 2 | 40 | 94.00 |
| ADB / RUS / ELBA | Al-Haija | [21] | 2022 | N-BaIoT | Botnet attack detection in IoT | 3 | 10 | 97.30 / 97.70 / 99.60 |
| XGB / LGB | Garg | [22] | 2022 | BoT-IoT, IoT-23, CIC-DDoS-19 | Attacks Identification: IoT attacks and DDoS attacks | 2 | 35 | 94.49 / 94.79 |
| LGB / PSO-LGB / GSA-LGB | Bhoi | [23] | 2022 | IoT data set | Identification of Malicious Access in IoT Network | 2 | 13 | 99.99 / 100.0 / 100.0 |
| XGB | Awotunde | [25] | 2023 | ToN-IoT | NIDS for IIoT | 7 | 17 | 99.73 |
| XGB / LGB | Rani | [26] | 2023 | DS2OS | NIDS for Smart home IoT | 8 | 13 | 99.92 |

Boosting is based on the idea of Probably Approximately Correct (PAC) learning. Gradient boosting classifiers are the AdaBoost method combined with weighted minimization, after which the classifiers and weighted inputs are recalculated. The objective of gradient boosting classifiers is to minimize the loss. It is implemented in the scikit-learn class sklearn.ensemble.GradientBoostingClassifier [30].

The mathematical architecture of the GDB model training can be summarized as follows [32]:

1. Initialize the model’s predictions:

$$F_0(x) = \arg\min_c \sum_i L(y_i, c), \tag{6}$$

where $L$ is the loss function, and $c$ is the predicted value.

2. For each boosting iteration $t = 1$ to $T$:

• Compute the negative gradient of the loss function:

$$r_{it} = -\left[\frac{\partial L(y_i, F(x_i))}{\partial F(x_i)}\right]_{F(x_i)=F_{t-1}(x_i)}. \tag{7}$$

• Fit a weak learner to the negative gradient:

$$h_t(x) = \arg\min_h \sum_i L(r_{it}, h(x_i)). \tag{8}$$

• Update the model’s predictions:

$$F_t(x) = F_{t-1}(x) + \eta \, h_t(x), \tag{9}$$

where $\eta$ is the learning rate.

3. Output the final boosted model: $H(x) = F_T(x)$.

GDB builds models sequentially, minimizing the loss function by gradient descent, resulting in improved performance. However, it can be computationally expensive and may require careful tuning of hyper-parameters. It is more prone to over-fitting compared to other algorithms.

3.3 Extreme Gradient Boosting

Extreme Gradient Boosting (XGB) is an improved version of the gradient boosting machine (GBM) algorithm. It implements machine learning algorithms under the Gradient Boosting framework. Its working procedure is the same as GBM, except that XGB implements parallel pre-processing at the node level, which makes it generally over ten times faster than GBM [33]. XGB also includes a variety of regularization techniques that reduce over-fitting and improve overall performance. The mathematical architecture of the XGB model training can be summarized as follows [34]:

1. Initialize the model’s predictions:

$$F_0(x) = \arg\min_c \sum_i L(y_i, c), \tag{10}$$

where $L$ is the loss function.

2. For each boosting iteration $t = 1$ to $T$:

• Compute the negative gradient of the loss function:

$$g_{it} = -\left[\frac{\partial L(y_i, F(x_i))}{\partial F(x_i)}\right]_{F(x_i)=F_{t-1}(x_i)}. \tag{11}$$

• Compute the second derivative approximation of the loss function:

$$h_{it} = \left[\frac{\partial^2 L(y_i, F(x_i))}{\partial F(x_i)^2}\right]_{F(x_i)=F_{t-1}(x_i)}. \tag{12}$$

• Fit a weak learner to the negative gradient and second derivative:

$$h_t(x) = \arg\min_h \sum_i \left[g_{it} \, h(x_i) + 0.5 \, h_{it} \, h(x_i)^2\right] + \Omega(h), \tag{13}$$

where $\Omega(h)$ is the regularization term.

• Update the model’s predictions:

$$F_t(x) = F_{t-1}(x) + \eta \, h_t(x), \tag{14}$$

where $\eta$ is the learning rate.

3. Output the final boosted model: $H(x) = F_T(x)$.

XGB excels in both speed and performance. It supports parallel processing and has a comprehensive set of hyper-parameters for fine-tuning. However, it is sensitive to hyper-parameter settings. Selecting the optimal combination of hyper-parameters can be time-consuming and computationally expensive. Additionally, the interpretability of XGB models can be challenging due to their complexity.

3.4 Light Gradient Boosting

The Light Gradient Boosting algorithm (LGB) is an ensemble learning method. It is an implementation of Gradient Boosted Decision Trees (GBDT), similar to random forest [35]. It combines multiple decision trees to obtain a better prediction and uses boosting to eliminate the residual error. LGB is able to handle huge amounts of data with ease. It does not perform well with a small number of data points. The trees in LGB

have a leafwise growth, rather than a levelwise growth. After the first split, the next split is done only on the leaf node that has the higher delta loss. To speed up the training process, LGB uses a histogram-based method for selecting the best split. Observing the high training time requirement for gradient boosting decision trees (GBD), Ke et al. [28] proposed two novel techniques to overcome the challenge, based on Gradient-based One-Side Sampling (GOSS) and Exclusive Feature Bundling (EFB). This new implementation was named LGB, and it improved training and inference time of GBD by 20%.

The mathematical architecture of the LGB model training can be summarized as follows [36]:

1. Initialize the model’s predictions:

$$F_0(x) = \arg\min_c \sum_i L(y_i, c), \tag{15}$$

where $L$ is the loss function.

2. For each boosting iteration $t = 1$ to $T$:

• Compute the negative gradient of the loss function:

$$g_{it} = -\left[\frac{\partial L(y_i, F(x_i))}{\partial F(x_i)}\right]_{F(x_i)=F_{t-1}(x_i)}. \tag{16}$$

• Compute the second derivative approximation of the loss function:

$$h_{it} = \left[\frac{\partial^2 L(y_i, F(x_i))}{\partial F(x_i)^2}\right]_{F(x_i)=F_{t-1}(x_i)}. \tag{17}$$

• Grow a tree with a leafwise approach, selecting the best split based on the gradients and second derivatives.
• Update the model’s predictions:

$$F_t(x) = F_{t-1}(x) + \eta \, h_t(x), \tag{18}$$

where $\eta$ is the learning rate.

3. Output the final boosted model: $H(x) = F_T(x)$.

LGB utilizes a leafwise tree growth strategy and gradient-based optimization, resulting in faster training times and lower memory usage. However, it may not perform well when dealing with smaller data sets. It is more sensitive to over-fitting and may require careful regularization. The interpretability of LGB models can be challenging due to their complex nature.

3.5 Categorical Boosting

Categorical Boosting (also known as CatBoost or CAB) is an algorithm for gradient boosting on decision trees [37]. It is a special version of GBDT. It solves problems with ordered features while also supporting categorical features. It shuffles the data randomly, and the mean is calculated for every object only on its historical data. It constructs feature combinations in a greedy way. It incorporates ordered boosting, a permutation-driven alternative to conventional gradient boosting. Such permutations decrease the variance of the final model’s predictions compared to the general boosting algorithm [38].

The mathematical architecture of the CAB model training can be summarized as follows [39]:

1. Initialize the model’s predictions:

$$F_0(x) = \arg\min_c \sum_i L(y_i, c), \tag{19}$$

where $L$ is the loss function.

2. For each boosting iteration $t = 1$ to $T$:

• Compute the pseudo-residuals:

$$r_{it} = -\left[\frac{\partial L(y_i, F(x_i))}{\partial F(x_i)}\right]_{F(x_i)=F_{t-1}(x_i)}. \tag{20}$$

• Fit a weak learner to the pseudo-residuals and the categorical features.
• Update the model’s predictions:

$$F_t(x) = F_{t-1}(x) + \eta \, h_t(x). \tag{21}$$

3. Output the final boosted model: $H(x) = F_T(x)$.

CAB provides good accuracy and robustness to noisy data. It also offers built-in handling of missing values. However, it can be computationally expensive and slower compared to other boosting algorithms, especially with large data sets. It may require more computational resources during training. Tuning its hyper-parameters can be challenging due to the increased complexity.

3.6 Hist Gradient Boosting

Histogram-based Gradient Boosting Classification Tree (HGB) is much faster than the Gradient Boosting Classifier for big data sets. Its implementation is inspired by LGB. During training, based on the potential gain, the tree grower learns at each split point whether samples with missing values should go to the left or right child. When predicting, samples with missing values are assigned to the

left or right child accordingly. If no missing values were encountered for a given feature during training, then samples with missing values are mapped to whichever child has the most samples. It is implemented in the scikit-learn class sklearn.ensemble.HistGradientBoostingClassifier [30]. The mathematical architecture of the HGB model training can be summarized as follows [40]:

1. Initialize the model’s predictions:

$$F_0(x) = \arg\min_c \sum_i L(y_i, c), \tag{22}$$

where $L$ is the loss function.

2. For each boosting iteration $t = 1$ to $T$:

• Compute the negative gradient of the loss function:

$$g_{it} = -\left[\frac{\partial L(y_i, F(x_i))}{\partial F(x_i)}\right]_{F(x_i)=F_{t-1}(x_i)}. \tag{23}$$

• Compute the second derivative approximation of the loss function:

$$h_{it} = \left[\frac{\partial^2 L(y_i, F(x_i))}{\partial F(x_i)^2}\right]_{F(x_i)=F_{t-1}(x_i)}. \tag{24}$$

• Construct a histogram of the feature values and their corresponding gradients and second derivatives.
• Find the best split points in the histogram using a greedy algorithm.
• Compute the leaf values for the histogram bins.
• Update the model’s predictions:

$$F_t(x) = F_{t-1}(x) + \eta \, h_t(x), \tag{25}$$

where $\eta$ is the learning rate.

3. Output the final boosted model: $H(x) = F_T(x)$.
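The histogram steps of the HGB procedure can be sketched in a few lines of Python. The example below is a simplified illustration, not scikit-learn's implementation: it bins a single feature into equal-width bins, accumulates per-bin sums of negative gradients and second derivatives, and scans the bin boundaries greedily for the split with the largest gain. The function names, the equal-width binning, the regularization constant lam, and the gain formula (the second-order gain commonly used with gradient-boosted trees) are assumptions introduced here.

```python
# Illustrative sketch only: histogram-based split search for one feature.
# Names, equal-width binning, and the gain formula are assumptions.

def build_histogram(values, grads, hess, n_bins):
    """Accumulate gradient and second-derivative sums per equal-width bin."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0          # avoid zero width for constant features
    g_sum, h_sum = [0.0] * n_bins, [0.0] * n_bins
    for v, g, h in zip(values, grads, hess):
        b = min(int((v - lo) / width), n_bins - 1)
        g_sum[b] += g
        h_sum[b] += h
    return g_sum, h_sum, lo, width


def score(g, h, lam):
    """Second-order quality of a node with gradient sum g and hessian sum h."""
    return g * g / (h + lam)


def best_split(g_sum, h_sum, lam=1.0):
    """Greedy left-to-right scan over bin boundaries; returns (gain, bin_index)."""
    g_tot, h_tot = sum(g_sum), sum(h_sum)
    best_gain, best_bin = 0.0, None
    gl = hl = 0.0
    for b in range(len(g_sum) - 1):            # candidate split after bin b
        gl += g_sum[b]
        hl += h_sum[b]
        gain = (score(gl, hl, lam) + score(g_tot - gl, h_tot - hl, lam)
                - score(g_tot, h_tot, lam))
        if gain > best_gain:
            best_gain, best_bin = gain, b
    return best_gain, best_bin


def leaf_value(g, h, lam=1.0):
    # g is a sum of negative gradients (as in Eq. 23), so the step is +g/(h+lam).
    return g / (h + lam)


# Toy data: squared-error loss with F_{t-1}(x) = 0, so the negative gradient
# equals the target and the second derivative is 1 for every sample.
values = [0.1, 0.4, 0.5, 0.9, 2.1, 2.4, 2.8, 3.0]
targets = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0]
grads, hess = targets[:], [1.0] * len(values)

g_sum, h_sum, lo, width = build_histogram(values, grads, hess, n_bins=4)
gain, split_bin = best_split(g_sum, h_sum)
threshold = lo + (split_bin + 1) * width       # upper edge of the chosen bin
left = leaf_value(sum(g_sum[: split_bin + 1]), sum(h_sum[: split_bin + 1]))
right = leaf_value(sum(g_sum[split_bin + 1:]), sum(h_sum[split_bin + 1:]))
print(threshold, left, right)
```

Because the scan touches one entry per bin rather than one per sample, the cost of evaluating candidate splits depends on the number of bins, which is the intuition behind the speed advantage on large data sets.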



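Table 2 Data set attributes information

| Stream aggregation designation | Stream aggregation description | Weight | Mean | Variance/standard deviation | Magnitude | Radius | Covariance | Correlation coefficient | Tot. traffic characteristics | Count time frame | Tot. features |
|---|---|---|---|---|---|---|---|---|---|---|---|
| H | Host source IP | X | X | Var | | | | | 3 | 5 | 15 |
| MI | Host source IP + MAC | X | X | Var | | | | | 3 | 5 | 15 |
| HH | Host to Host channel (Source IP to destination IP) | X | X | Std | X | X | X | X | 7 | 5 | 35 |
| HH_Jit | Host to Host channel jitter | X | X | Var | | | | | 3 | 5 | 15 |
| HpHp | Host port to Host port channel (IP+Socket) | X | X | Std | X | X | X | X | 7 | 5 | 35 |
| Total | | | | | | | | | 23 | | 115 |

4 Evaluation Scheme

This section presents the evaluation environment. It describes the data set used, the applied data preprocessing, and the evaluation metrics used for performance evaluation.

4.1 Data Set Description

This section describes the data set used in the experimental framework. The N-BaIoT [41] data set is selected for training and evaluation purposes, as it is widely accepted as a benchmark sequential data set. It contains realistic network traffic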


and a variety of attack traffic. It was suggested by Meidan et al. [8], who gathered the traffic of nine commercially available IoT devices authentically infected by Mirai and Bashlite malware. The devices were two smart doorbells, one smart thermostat, one smart baby monitor, four security cameras and one webcam. Traffic was captured when the devices were in normal execution and after infection with malware. The traffic was captured through a network sniffing utility into the raw network traffic packet capture format (PCAP). This can be achieved using port mirroring. Five features are extracted from the network traffic, as abstracted in Table 2. Three or more statistical measures are computed for each of these five features for data aggregation, resulting in a total of 23 features. These 23 distinct features are computed over five separate time windows (100 milliseconds (ms); 500 ms; 1.5 seconds (s); 10 s; and 1 minute), as demonstrated in Fig. 1, resulting in a total of 115 features. Using time windows makes this data set appropriate for stateful IDS. Naveed et al. [41] organized this data set in an easier file structure and made it available on Kaggle.

Fig. 1 Feature extraction

The data set contains instances of network traffic data divided into three categories: normal traffic (Benign data), Bashlite infected traffic, and Mirai infected traffic. Each data instance consists of 115 features represented by 23 different traffic characteristics in 5 different time frames. Table 2 presents an abstracted demonstration of the data set attributes information. The attacks executed by botnets include: scan attacks that can discover vulnerable devices; flooding that makes use of SYN, ACK, UDP and TCP flooding; and combo attacks to open connections and send junk data.

Our study uses Meidan’s data set in the format organized by Naveed. Figure 2 shows the data exploration for the data set, collected by three labeled types, i.e., benign, Mirai and Gafgyt. Figure 3 shows the individual distribution of the 10 malware classes in the data set, in addition to the benign traffic.

4.2 Data Set Preprocessing

Data preprocessing is the process of preparing the data set for analysis. It is an essential step in ML, as it helps to ensure that the data is appropriate and correct for feeding into the model. As demonstrated during data set exploration in Sect. 4.1, the data set is imbalanced and spread over many files based on the attack type, as shown in Fig. 3.

The data set files are integrated into three main categories, i.e., Benign, Mirai, and Gafgyt. The Benign category contains all normal traffic records, represented in light green in Fig. 3. The Mirai category includes all Mirai related attacks, i.e., Mirai_Ack, Mirai_Scan, Mirai_Syn, Mirai_Udp, Mirai_Udpplain, represented in blue in Fig. 3. The data set file of this category is called “All_Mirai”. The third category is Gafgyt and
123
it includes all Gafgyt related attacks, i.e., Gafgyt_Combo, Gafgyt_Junk, Gafgyt_Scan, Gafgyt_Tcp, and Gafgyt_Udp, represented in red in Fig. 3. The data set file of this category is called “All_Gafgyt”.

Fig. 2 Data set exploration

Fig. 3 Distribution of packets for each class

To deal with the data set imbalance, two data sets are created for ternary classification. A data set that contains the three categories, i.e., Benign, Mirai, and Gafgyt, is created. Each category contains 555,932 rows: the Benign category contains 555,932 rows, and the Mirai and Gafgyt categories are created by shuffling “All_Mirai” and “All_Gafgyt”, respectively, and selecting only 555,932 rows from each of them. The overall size of the ternary data set is 1,667,796 records. Because the implementation uses Python, each step of the data set preprocessing creates an index column. To avoid bias from such columns, they are removed from the data set.

4.3 Evaluation Metrics

Performing a comprehensive performance evaluation requires addressing several metrics. Accuracy alone is not sufficient for an imbalanced data set. The confusion matrix is used to visualize the performance of an ML technique. It describes the performance of a classification model on a set of test data and allows easy identification of confusion between classes. The classification is evaluated through four indicators as follows: True positives (TP): packets are predicted as malicious, and their ground truth is malicious. False positives (FP): packets are predicted as malicious, while their ground truth is benign. True negatives (TN): packets are predicted as benign, and their ground truth is benign. False negatives (FN): packets are predicted as benign, while their ground truth is malicious.

A successful detection requires correct identification of attacks while minimizing the number of false alarms. Four metrics are widely used for evaluating ML models, i.e., accuracy, precision, recall, and F1 score. These four measures are defined as follows:

$$\mathrm{Accuracy} \equiv \frac{TP + TN}{TP + TN + FP + FN}, \tag{26}$$

$$\mathrm{Precision} \equiv \frac{TP}{TP + FP}, \tag{27}$$

$$\mathrm{Recall\ (Detection\ Rate)} \equiv \frac{TP}{TP + FN}, \tag{28}$$

$$\mathrm{F1\ Score} \equiv \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}. \tag{29}$$

These measures range from 0 to 1, and higher values correspond to better classification performance, so the goal is to maximize all of them. For a fair comparative evaluation, two additional performance measures are considered. The first is the training time, defined as the time consumed during the model training phase. The second is the testing time, defined as the time consumed during the testing phase.
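The four indicators and Eqs. (26) to (29) can be sketched for the ternary case as follows. The sketch assumes a one-vs-rest reading of TP, FP, TN, and FN for each class, which the article does not spell out, and the confusion matrix values are invented for illustration only. The function names are hypothetical.

```python
from typing import Dict, List


def one_vs_rest_counts(cm: List[List[int]], k: int) -> Dict[str, int]:
    """TP/FP/FN/TN for class k; cm[i][j] counts true class i predicted as class j."""
    total = sum(sum(row) for row in cm)
    tp = cm[k][k]
    fp = sum(row[k] for row in cm) - tp   # predicted k, true class differs
    fn = sum(cm[k]) - tp                  # true k, predicted as another class
    tn = total - tp - fp - fn
    return {"TP": tp, "FP": fp, "FN": fn, "TN": tn}


def class_metrics(cm: List[List[int]], k: int) -> Dict[str, float]:
    """Per-class accuracy, precision, recall and F1 following Eqs. (26)-(29)."""
    c = one_vs_rest_counts(cm, k)
    tp, fp, fn, tn = c["TP"], c["FP"], c["FN"], c["TN"]
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def overall_accuracy(cm: List[List[int]]) -> float:
    """Fraction of all samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total


# Hypothetical confusion matrix (rows: true class, columns: predicted class).
labels = ["Benign", "Mirai", "Gafgyt"]
cm = [[90, 5, 5],
      [2, 95, 3],
      [1, 4, 95]]

for k, name in enumerate(labels):
    print(name, class_metrics(cm, k))
print("overall accuracy:", overall_accuracy(cm))
```

Reporting the metrics per class, as in Table 3, makes it visible when a model performs well overall but poorly on one category.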

Table 3 Evaluation results for ternary classification

Evaluation metric        Class   ADB       GDB       XGB       CAB       HGB       LGB
Accuracy                 -       0.952566  0.999889  0.999400  0.999994  0.999994  0.999988
Precision                Benign  0.999028  0.999730  0.999289  0.999991  0.999991  0.999991
Precision                Mirai   0.998830  0.999946  0.999136  0.999991  0.999991  0.999973
Precision                Gafgyt  0.998929  0.999838  0.999213  1.000000  1.000000  1.000000
Detection Rate           Benign  0.876440  0.999937  0.999085  1.000000  1.000000  0.999973
Detection Rate           Mirai   0.999309  0.999739  0.999157  0.999991  0.999991  0.999991
Detection Rate           Gafgyt  0.933850  0.999838  0.999121  0.999990  0.999991  0.999991
F1 Score                 Benign  0.999979  1.000000  0.999828  0.999995  0.999995  0.999986
F1 Score                 Mirai   0.859125  0.999981  0.999909  0.999991  0.999991  0.999982
F1 Score                 Gafgyt  0.924216  0.999990  0.999869  0.999995  0.999995  0.999995
Training Time (Seconds)  -       1645.29   10125.00  2157.26   1804.02   166.53    170.94
Testing Time (Seconds)   -       9.54      3.47      2.16      1.37      6.27      6.27
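The training and testing times in Table 3 are wall-clock measurements of the fit and predict phases. A minimal sketch of how such timings might be collected is given below. It assumes only that the model exposes `fit` and `predict` methods, as scikit-learn classifiers do; `MajorityClassifier` and `timed_fit_predict` are hypothetical stand-ins for illustration, not the authors' code, and the toy data are invented.

```python
import time
from collections import Counter
from typing import Any, List, Sequence, Tuple


class MajorityClassifier:
    """Hypothetical stand-in model: always predicts the most frequent training label."""

    def fit(self, X: Sequence[Any], y: Sequence[Any]) -> "MajorityClassifier":
        self.label_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X: Sequence[Any]) -> List[Any]:
        return [self.label_] * len(X)


def timed_fit_predict(model: Any, X_train: Sequence[Any], y_train: Sequence[Any],
                      X_test: Sequence[Any]) -> Tuple[List[Any], float, float]:
    """Return (predictions, training seconds, testing seconds)."""
    start = time.perf_counter()
    model.fit(X_train, y_train)
    train_seconds = time.perf_counter() - start

    start = time.perf_counter()
    predictions = list(model.predict(X_test))
    test_seconds = time.perf_counter() - start
    return predictions, train_seconds, test_seconds


# Toy data, invented for illustration.
X_train = [[0.1], [0.2], [0.9], [0.8], [0.3]]
y_train = ["Benign", "Benign", "Mirai", "Mirai", "Benign"]
X_test = [[0.15], [0.85]]

preds, train_s, test_s = timed_fit_predict(MajorityClassifier(), X_train, y_train, X_test)
print(preds, f"train={train_s:.6f}s", f"test={test_s:.6f}s")
```

Using the values in Table 3, the training-time ratios are 10125.00 / 2157.26 ≈ 4.7 for GDB relative to XGB and 1804.02 / 166.53 ≈ 10.8 for CAB relative to HGB.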

5 Experimental Results

The experiments are conducted in the Colab interactive notebook environment. To provide an evidence-based evaluation, the project and the data sets are uploaded and shared on GitHub. The evaluation covers six boosting-based ML algorithms, i.e., ADB, GDB, XGB, CAB, HGB, and LGB, which are evaluated for ternary classification. The ternary data set described in Sect. 4.2 is used for model evaluation. The six algorithms are fitted on this data set, and the performance evaluation metrics identified in Sect. 4.3 are calculated and documented for ternary classification in Table 3.

The empirical evaluation results show significant potential for boosting-based ML algorithms in detecting network intrusions in IoT environments. For ternary classification, the CAB and HGB algorithms both achieve the highest accuracy of 99.9994%. Figure 4 shows the confusion matrix of ternary classification using the HGB and CAB algorithms. The Adaptive Boosting algorithm was originally developed for binary classification. Each of its trees is just a decision stump, i.e., a node and two leaves, which is why it can be seen as a forest of stumps. This explains its relatively low accuracy of 95.2566% in ternary classification.

Fig. 4 Confusion matrix for HGB and CAB ternary classifiers

Some boosting algorithms might be computationally intensive and resource-demanding, which could hinder their practicality in resource-constrained IoT environments. Figure 5 illustrates the temporal performance of the six algorithms. The results show that training the GDB model took around five times as long as training the XGB model. This is because GDB does not support multi-threading, unlike XGB, which is an implementation of GDB that supports multi-threading.

Enhancing the intrusion detection rate of the model can lead to an improvement in the real-time detection performance of the entire IoT intrusion detection system. Table 3 shows the experimental results of ternary classification. Although HGB and CAB achieved the highest detection accuracy, HGB consumed less time for training: CAB consumed around eleven times the training time of HGB. This is an empirical result of the two techniques used in LGB, i.e., GOSS and EFB. As HGB is inspired by the LGB design, it achieves a similarly small training time compared with the other algorithms. Besides achieving the highest detection accuracy, CAB consumed the least testing time of 1.37 s, which reflects its strength in IoT intrusion detection and its real-time feasibility.

To ensure the robustness and reliability of our findings, we conducted cross-validation as an essential step in our research methodology. Cross-validation is a widely accepted technique used to assess the generalization performance of a predictive model. In our study, we employed k-fold cross-validation, where the data set was divided into k equally

Fig. 5 Temporal performance for boosting-based algorithms in ternary classification

Table 4 Evaluation results for fivefold cross-validation

Evaluation metric  Fold    ADB       GDB       XGB       CAB       HGB       LGB
Accuracy           Fold 1  0.951208  0.997661  0.999442  0.999988  0.999997  0.999997
Accuracy           Fold 2  0.951654  0.997739  0.999559  0.999988  0.999997  0.999991
Accuracy           Fold 3  0.951666  0.997634  0.999523  1.000000  0.999997  1.000000
Accuracy           Fold 4  0.951741  0.997781  0.999556  0.999985  0.999982  0.999979
Accuracy           Fold 5  0.952479  0.997682  0.999559  0.999988  0.999988  0.999985
Accuracy           Mean    0.951750  0.997699  0.999528  0.999989  0.999992  0.999990
Precision          Fold 1  0.957753  0.997662  0.999442  0.999988  0.957353  0.999997
Precision          Fold 2  0.957680  0.997740  0.999559  0.999988  0.999997  0.999991
Precision          Fold 3  0.957694  0.997635  0.999523  1.000000  0.999997  1.000000
Precision          Fold 4  0.957735  0.997782  0.999556  0.999985  0.999982  0.999979
Precision          Fold 5  0.958304  0.997683  0.999559  0.999988  0.999988  0.999985
Precision          Mean    0.957753  0.997700  0.999528  0.999989  0.999992  0.999990
Detection Rate     Fold 1  0.951207  0.997661  0.999442  0.999988  0.957353  0.999997
Detection Rate     Fold 2  0.951654  0.997739  0.999559  0.999988  0.999997  0.999991
Detection Rate     Fold 3  0.951667  0.997634  0.999523  1.000000  0.999997  1.000000
Detection Rate     Fold 4  0.951741  0.997781  0.999556  0.999985  0.999982  0.999979
Detection Rate     Fold 5  0.952479  0.997682  0.999559  0.999988  0.999988  0.999985
Detection Rate     Mean    0.951750  0.997699  0.999528  0.999989  0.999992  0.999990
F1 Score           Fold 1  0.950979  0.997661  0.999442  0.999988  0.957353  0.999997
F1 Score           Fold 2  0.951427  0.997739  0.999559  0.999988  0.999997  0.999991
F1 Score           Fold 3  0.951450  0.997634  0.999523  1.000000  0.999997  1.000000
F1 Score           Fold 4  0.951515  0.997781  0.999556  0.999985  0.999982  0.999979
F1 Score           Fold 5  0.952258  0.997682  0.999559  0.999988  0.999988  0.999985
F1 Score           Mean    0.951526  0.997699  0.999528  0.999989  0.999992  0.999990
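The fold construction and averaging behind Table 4 can be sketched as follows. The sketch splits sample indices into k contiguous folds without shuffling or stratification; whether the authors shuffled or stratified the data is not stated, so this is an assumption. The helper names are hypothetical, and the per-fold HGB accuracies are copied from Table 4.

```python
from typing import List, Tuple


def k_fold_indices(n_samples: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Return k (train_indices, validation_indices) pairs over range(n_samples)."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    base, extra = divmod(n_samples, k)
    folds: List[List[int]] = []
    start = 0
    for i in range(k):
        size = base + (1 if i < extra else 0)   # spread any remainder over the first folds
        folds.append(list(range(start, start + size)))
        start += size
    splits = []
    for i in range(k):
        validation = folds[i]                   # each fold is held out exactly once
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, validation))
    return splits


def mean(values: List[float]) -> float:
    return sum(values) / len(values)


splits = k_fold_indices(n_samples=10, k=5)
for train, validation in splits:
    print("train:", train, "validation:", validation)

# Per-fold HGB accuracy taken from Table 4; the mean reproduces the reported 0.999992.
hgb_fold_accuracy = [0.999997, 0.999997, 0.999997, 0.999982, 0.999988]
print(round(mean(hgb_fold_accuracy), 6))
```

In practice a library routine such as scikit-learn's `KFold` could replace `k_fold_indices`; the hand-written version is used here only to keep the example free of dependencies.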

sized folds. During each iteration, one fold was held out as a validation set, while the model was trained on the remaining k-1 folds. This process was repeated k times, with each fold serving as the validation set once. By averaging the performance metrics across all iterations, we obtained a comprehensive evaluation of the model’s effectiveness and its ability to generalize to unseen data. Cross-validation allowed us to mitigate the risk of overfitting, as it provided a more objective assessment of our model’s performance. The use of this rigorous technique enhances the credibility of our results and strengthens the validity of our conclusions.

Table 4 shows the results of the fivefold cross-validation. The results show that HGB outperforms the other algorithms with an average detection accuracy of 0.999992.

Figure 6, generated by the code, shows the learning curve of the HGB model. The x-axis represents the number of training examples used, and the y-axis represents the performance of the model.

Fig. 6 Learning curve for HGB

The learning curve is composed of two lines: the training error and the cross-validation error. The training error represents how well the model performs on the training data as the number of training examples increases. The cross-validation error, on the other hand, represents the model’s performance on the validation data during cross-validation. The curves are plotted with the mean errors; the variability during cross-validation is shown by the shaded areas, which represent one standard deviation above and below the mean over all cross-validation runs.

As the number of training examples increases, both the training error and the cross-validation error should improve. The gap between the two lines indicates the model’s generalization ability. The gap is not large, which means that the model is not overfitting the training data and is able to generalize well to unseen data.

6 Conclusion and Future Work

This paper presented an experimental evaluation of boosting-based ML algorithms for detecting network intrusions in IoT. Six boosting-based algorithms were implemented and tested using the well-known standard data set N-BaIoT for benchmarking. The results demonstrated the significant potential of the boosting-based ML algorithms. The best performance was achieved using the HGB algorithm in ternary classification. Fivefold cross-validation was conducted to confirm the experimental results, and it showed that HGB outperforms the other algorithms with a detection accuracy of 0.999992.

Boosting-based algorithms can sometimes lack interpretability, making it difficult to understand how an intrusion detection decision was made. A potential direction for future research is employing explainable artificial intelligence (XAI) with boosting-based algorithms in the context of intrusion detection in IoT to provide transparency and interpretability of intrusion detection and classification.

This study presented an empirical evaluation of boosting-based algorithms with the N-BaIoT data set. Further research is required to evaluate the performance of the boosting-based algorithms with other IoT data sets.

Acknowledgements Authors would like to acknowledge the help and support provided by the department of Information Technology in the Institute of Graduate Studies and Research, Alexandria University.

Funding Open access funding provided by The Science, Technology & Innovation Funding Authority (STDF) in cooperation with The Egyptian Knowledge Bank (EKB). Funds or other support was received.

Availability of Data and Materials The data set of Meidan et al. that supports the findings of this study is available in the Kaggle repository with the identifier [https://doi.org/10.1109/MPRV.2018.03367731]. Further data and associated software underlying this article are available in Kaggle at https://www.kaggle.com/MohamedSaiedEssa/BoostingBasedIoTNIDS and in GitHub at https://github.com/MohamedSaiedEssa/BoostingBasedIoTNIDS.

Declarations

Conflict of Interest The authors declare no competing interests.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
References

1. Imteaj, A., Thakker, U., Wang, S., Li, J., Amini, M.H.: A survey on federated learning for resource-constrained IoT devices. IEEE Internet Things J. 9(1), 1–24 (2021)
2. Almiani, M., Abughazleh, A., Al-rahayfeh, A., Atiewi, S., Razaque, A.: Deep recurrent neural network for IoT intrusion detection system. Simul. Model. Pract. Theory (2019). https://doi.org/10.1016/j.simpat.2019.102031
3. Guillemin, P., Berens, F., Carugi, M., Arndt, M., Ladid, L., Percivall, G., De Lathouwer, B., Liang, S., Bröring, A., Thubert, P.: Internet of things standardisation-status, requirements, initiatives and organisations. In: Internet of Things, pp. 259–276. River Publishers (2022)
4. Sathyadevan, S., Achuthan, K., Doss, R., Pan, L.: Protean authentication scheme - a time-bound dynamic keygen authentication technique for IoT edge nodes in outdoor deployments. IEEE Access 7, 92419–92435 (2019). https://doi.org/10.1109/ACCESS.2019.2927818
5. Radoglou-grammatikis, P.I., Sarigiannidis, P.G.: An anomaly based intrusion detection system for the smart grid based on cart decision tree. In: 2018 Global Information Infrastructure and Networking Symposium (GIIS), 1–5 (2018)
6. Abri, F., Siami-Namini, S., Khanghah, M.A., Soltani, F.M., Namin, A.S.: Can machine/deep learning classifiers detect zero-day malware with high accuracy? In: 2019 IEEE International Conference on Big Data (Big Data), pp. 3252–3259 (2019). IEEE
7. Giraud-Carrier, C.: Combining Base-Learners Into Ensembles, pp. 169–188. Springer, Cham (2022). https://doi.org/10.1007/978-3-030-67024-5_9
8. Meidan, Y., Bohadana, M., Mathov, Y.M., Shabtai, Y., Breitenbacher, A., Elovici, D.: Yuval: N-baiot-network-based detection of IoT botnet attacks using deep autoencoders. IEEE Pervasive Comput. 17(3), 12–22 (2018). https://doi.org/10.1109/MPRV.2018.03367731
9. Kumar, A., Kumar, N., B, A.H., Shukla, S.K.: Peerclear: Peer-to-peer bot-net detection. International Symposium on Cyber Security Cryptography and Machine Learning, 279–295 (2019). https://doi.org/10.1007/978-3-030-20951-3_24
10. Liu, J., Kantarci, B., Adams, C.: Machine learning-driven intrusion detection for contiking-based IoT networks exposed to nsl-kdd dataset. In: Proceedings of the 2nd ACM workshop on wireless security and machine learning, 25–30 (2020)
11. Alqahtani, M., Mathkour, H., Ismail, M.M.: IoT botnet attack detection based on optimized extreme gradient boosting and feature selection. Sensors (2020)
12. Dash, P.B., Rao, K.S.: Anomaly detection in IoT network by using multi-class adaptive boosting classifier. Int. J. Inf. Secur. 9(3), 164–171 (2020)
13. Pahl, M. O., & Aubet, F. X.: Ds2os traffic traces, IoT traffic traces gathered in a the ds2os IoT environment. Int J Info Sec (IJIS) (2018)
14. Krishna, E.S.P., Thangavelu, A.: Attack detection in IoT devices using hybrid metaheuristic lion optimization algorithm and firefly optimization algorithm. Int. J. Syst. Assur. Eng. Manag. 9(3), 164–171 (2021). https://doi.org/10.1007/s13198-021-01150-7
15. Hazman, C., Guezzaz, A., Benkirane, S., Azrour, M.: lids-sioel: intrusion detection framework for IoT-based smart environments security using ensemble learning. Cluster Comput. (2022). https://doi.org/10.1007/s10586-022-03810-0
16. Koroniotis, N., Moustafa, N., Benjamin, T.: Towards the development of realistic botnet dataset in the internet of things for network forensic analytics: Bot-IoT dataset. Future Gener. Comput. Syst. 100, 779–796 (2019)
17. Ashraf, E., Areed, N.F.F., Salem, H., Abdelhay, E.H., Farouk, A.: Fidchain: federated intrusion detection system for blockchain-enabled IoT healthcare applications. Healthcare 10, 279–295 (2022). https://doi.org/10.3390/healthcare10061110
18. Khan, I.U., Aslam, N., Alshedayed, R., Alfrayan, D., Alessa, N.A.R.A., Safwan, A.A.: A proactive attack detection for heating, ventilation, and air conditioning (hvac) system using explainable extreme gradient boosting model (xgboost). Sensors 22(23), 9235 (2022)
19. Elnour, M., Meskin, N., Khan, K., Jain, R.: Application of data-driven attack detection framework for secure operation in smart buildings. Sustain. Cities Soc. 69, 102816 (2021). https://doi.org/10.1016/j.scs.2021.102816
20. Alissa, K., Alyas, T., Zafar, K., Abbas, Q., Tabassum, N., Sakib, S.: Botnet attack detection in IoT using machine learning. Comput. Intell. Neurosci. 2022, 4515642–4515642 (2022)
21. Al-haija, Q.A., Al-Dala’ien, M.: Elba-iot: an ensemble learning model for botnet attack detection in IoT networks. Sens. Actuat. Netw. (2022). https://doi.org/10.3390/jsan11010018
22. Garg, S., Kumar, V., Payyavula, S.R.: Identification of internet of things (IoT) attacks using gradient boosting: a cross dataset approach. Telematique 21(1), 6982–7012 (2022)
23. B, G.B., Naik, B., Oram, E., Vimal, S.: Gravitational search optimized light gradient boosting machine for identification of malicious access in IoT network. Int. Conf. Comput. Intell. Pattern Recogn. 1, 570–579 (2022). https://doi.org/10.3390/jsan11010018
24. Aubet, F.-X.: Machine learning-based adaptive anomaly detection in smart spaces. (Doctoral dissertation, PhD thesis). (2018)
25. Awotunde, J.B., Folorunso, S.O., Imoize, A.L., Odunuga, J.O., Lee, C.-C., Li, C.-T., Do, D.-T.: An ensemble tree-based model for intrusion detection in industrial internet of things networks. Appl. Sci. 13(4), 2479 (2023)
26. Rani, D., Gill, N.S., Gulia, P., Arena, F., Pau, G.: Design of an intrusion detection model for IoT-enabled smart home. IEEE Access (2023)
27. Pahl, M. O., Aubet, F. X.: All eyes on you: Distributed Multi-Dimensional IoT microservice anomaly detection. In: 2018 14th International Conference on Network and Service Management (CNSM). (pp. 72–80). IEEE
28. Bentéjac, C., Csörgő, A., Martínez, G.: A comparative analysis of gradient boosting algorithms. Springer, Netherlands 54(3), 1937–1967 (2021)
29. Freund, Y., Schapire, R.E., Avenue, P.: A short introduction to boosting. J. Jpn. Soc. Artif. Intell. 14(5), 771–780 (1999)
30. Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., Duchesnay, E.: Scikit-learn: machine learning in python. J. Mach. Learn. Res. 12, 2825–2830 (2011)
31. Freund, Y., Schapire, R.E.: A decision-theoretic generalization of on-line learning and an application to boosting. J. Comput. Syst. Sci. 55(1), 119–139 (1997)
32. Friedman, J.H.: Greedy function approximation: a gradient boosting machine. Ann. Stat. 29, 1189–1232 (2001)
33. Chen, T., He, T.: xgboost: extreme gradient boosting. R Packag. 0.4-2 1(4), 0–3 (2017)
34. Chen, T., Guestrin, C.: Xgboost: A scalable tree boosting system. In: Proceedings of the 22nd Acm Sigkdd International Conference on Knowledge Discovery and Data Mining, pp. 785–794 (2016)
35. Pythongeeks: Xgboost introduction. https://pythongeeks.org/xgboost-introduction/ (2022)
36. Ke, G., Meng, Q., Finley, T., Wang, T., Chen, W., Ma, W., Ye, Q., Liu, T.-Y.: Lightgbm: A highly efficient gradient boosting decision tree. Adv. Neural Inf. Process. Syst. 30, 3149–3157 (2017)
37. Ibrahim, A.A., Ridwan, R.L., Muhammed, M.M., Abdulaziz, R.O., Saheed, G.A.: Comparison of the catboost classifier with other machine learning methods. Int. J. Adv. Comput. Sci. Appl. 11(11), 738–748 (2020)
38. Prokhorenkova, L., Gusev, G., Vorobev, A., Dorogush, A.V., Gulin, A.: Catboost: unbiased boosting with categorical features. Adv. Neural. Inf. Process. Syst. 4, 1–11 (2018)
39. Guo, C., Berkhahn, F.: Entity embeddings of categorical variables. arXiv preprint arXiv:1604.06737 (2016)
40. Guryanov, A.: Histogram-based algorithm for building gradient boosting ensembles of piecewise linear decision trees. In: Aalst, W.M.P., Batagelj, V., Ignatov, D.I., Khachay, M., Kuskova, V., Kutuzov, A., Kuznetsov, S.O., Lomazova, I.A., Loukachevitch, N., Napoli, A., Pardalos, P.M., Pelillo, M., Savchenko, A.V., Tutubalina, E. (eds.) Analysis of Images, Social Networks and Texts, pp. 39–50. Springer, Cham (2019)
41. Naveed, K., Wu, H., Abusaq, A.: Dytokinesis: A cytokinesis-inspired anomaly detection technique for IoT devices. In: 2020 IEEE 45th Conference on Local Computer Networks (LCN), pp. 373–376 (2020)
42. Saied, M., Guirguis, S., Madbouly, M.: Review of artificial intelligence for enhancing intrusion detection in the internet of things. Engineering Applications of Artificial Intelligence 127, 107231 (2024)