
Computer Communications 176 (2021) 146–154

Contents lists available at ScienceDirect

Computer Communications
journal homepage: www.elsevier.com/locate/comcom

Internet of Things attack detection using hybrid Deep Learning Model


Amiya Kumar Sahu a ,∗, Suraj Sharma a , M. Tanveer b , Rohit Raja c
a
International Institute of Information Technology Bhubaneswar, India
b
Indian Institute of Technology Indore, India
c
Guru Ghasidas Vishwavidyalaya (Central University), Chattishgarh, India

ARTICLE INFO

Keywords: Internet of Things; Deep Learning; Attack detection; CNN; LSTM

ABSTRACT

The Internet of Things (IoT) has become a very popular area of research due to its large-scale implementation and challenges. However, security is the key concern while witnessing the rapid growth in its size and applications. It is a tedious task to individually put security mechanisms in each IoT device and update it as per newer threats. Moreover, machine learning models can best utilize the colossal amount of data generated by IoT devices. Therefore, many Deep Learning (DL) based mechanisms have been proposed to detect attacks in IoT. However, the existing security mechanisms addressed limited attacks, and they used limited and outdated datasets for evaluations. This paper presents a novel security framework and an attack detection mechanism using a Deep Learning model to fill in the gap, which will efficiently detect malicious devices. The proposed mechanism uses a Convolution Neural Network (CNN) to extract the accurate feature representation of data and further classifies those by Long Short-Term Memory (LSTM) Model. The dataset used in the experimental evaluation is from twenty Raspberry Pi infected IoT devices. The accuracy of the empirical study for attack detection is 96 percent. In addition, it is observed that the proposed model outperformed various recently proposed DL-based attack detection mechanisms.

1. Introduction

Internet of Things (IoT) devices are becoming an essential entity in recent days, and they will have a profound economic and social impact on people’s lives. Nonetheless, security has become the primary concern in it. Kaspersky [1] registered a notable surge in the number of malware for IoT devices from 3219 to 121588 samples in the year 2018. McAfee also recorded a large number of data breaches and cyberattacks in 2018 [1]. Cybercriminals targeted a significant number of attacks on IoT devices due to the massive number of vulnerabilities present in IoT devices. Additionally, the nodes involved in IoT generally possess limited resources, which provides fertile ground for cyber attackers. Besides that, rapidly growing IoT networks, comprising heterogeneous devices with dynamic behavior, have further extended the security challenges to the next level [2].

There have been many extensive security mechanisms proposed addressing IoT security, primarily through traditional cryptographic constructs [3]. However, the existing cryptographic solutions on individual IoT devices are insufficient to satisfy the whole spectrum of IoT security because of the dynamic nature of attacks and IoT networks as well [4]. There are various characteristics of devices in an IoT network, such as device–device close proximity communication, inter-connectivity, massive deployment, heterogeneity, self-organization and self-healing characteristics, low-power and low-cost communication, the requirement of ultra-reliable and low-latency communication, and dynamic changes in the network, which make security provision a most formidable task in IoT. It has also increased the landscape of threats for the attackers [5].

The number of devices connected to the Internet, including cameras, embedded machines, sensors, and many more that make up the IoT, continues to grow at a rapid pace. A forecast from International Data Corporation (IDC) projected that there would be 41.6 billion connected IoT objects generating 79.4 zettabytes (ZB) of data in 2025 [6]. The research community in IoT is now viewing the scope in the colossal amount of real-time data generated by IoT devices, and it has proposed various Machine Learning (ML) and Deep Learning (DL) mechanisms for IoT security by harnessing knowledge from the data [7,8]. Moreover, DL-based security mechanisms can learn heterogeneous features from unstructured data by themselves, and thus these are heterogeneity tolerant. These can also be used to detect the novel mutated attacks from their earlier forms; therefore, the security mechanism does not require a patch on IoT devices from time to time [9]. Nonetheless, the existing DL-based security mechanisms considered a limited number of attacks and outdated datasets for their evaluations. Many proposed schemes are specific to a particular attack, such as Mirai botnet detection [10] or ransomware detection [11]. Besides that, the existing security mechanism used emulated datasets and mostly self-generated

∗ Corresponding author.
E-mail addresses: c117002@iiit-bh.ac.in (A.K. Sahu), suraj@iiit-bh.ac.in (S. Sharma), mtanveer@iiti.ac.in (M. Tanveer), rohit.raja@ggu.ac.in (R. Raja).

https://doi.org/10.1016/j.comcom.2021.05.024
Received 3 December 2020; Received in revised form 19 April 2021; Accepted 25 May 2021
Available online 1 June 2021
0140-3664/© 2021 Elsevier B.V. All rights reserved.

them in their control environment. The number of samples used to train their model is also limited, hence suffering from underfitting. Therefore, this paper proposes a novel security framework and an IoT attack detection mechanism using a hybrid Deep Learning model to fill in the research gap and efficiently detect malicious devices affected by nine various attacks. The proposed model was evaluated on a standard dataset from the Stratosphere lab published in 2020. The dataset was obtained from twenty infected Raspberry Pi and three benign IoT devices. The proposed mechanism uses a Convolution Neural Network (CNN) [12] to extract the high-level feature representation of data and further classifies those by Long Short Term Memory (LSTM) Model [13].

The data packets in IoT network traffic are contiguous and obtained from continuous-time intervals. The series of data packets contains the information of the sequence. This chain-like nature reveals that recurrent neural networks (RNN) are intimately related to sequences and lists. They are the natural architecture of the neural network to use for such data series. Specifically, LSTM, a variant of RNN, is an apt model to obtain the essential high-level sequential characteristics of network data and classify various IoT attacks. Before feeding data to the LSTM classifier, a CNN model is used for harnessing the true representative of IoT network data; these are good for extracting the high-level features with fewer learning-parameters requirements. The CNN is used in the proposed model to achieve self-feature learning capability.

The following are the primary contributions of this paper:

• A novel framework for IoT attack detection is proposed.
• The proposed model is structured as a hybrid Deep Learning Model Architecture using Convolution Neural Network and Long Short-Term Memory models.
• Dynamic Analysis of the IoT attacks from more than 30 million traffic flows taken from dataset published in 2020.
• Comparative analysis of our model with other similar contemporary research works.

The rest of the paper is organized as follows: Section 2 enumerates the attacks considered in the proposed security mechanism. Then, Section 3 delineates the related works and identifies the research gap. Section 4 presents the proposed security framework and DL-based attack detection mechanism. Section 5 evaluates the proposed scheme empirically and provides security analysis considering other contemporary similar work. Lastly, Section 6 provides the conclusion, followed by references.

2. Attacks considered in proposed scheme

Resource constraint IoT devices are an easy target for cyber attackers, and many attackers usually influence these. Furthermore, malicious IoT objects may participate in other extensive attacks. This section enumerates the attacks considered for the proposed classification model. The proposed study is done on the dynamic analysis of the attacks, i.e., malicious binary files are executed on devices and monitored from the network traffic.

The experimented network traffic dataset is obtained from twenty-three IoT devices. Twenty of those are Raspberry Pi devices infected with various malicious codes, and three are real IoT devices providing benign network traffic. The benign network samples were acquired by capturing the network traffic of three distinct IoT devices: a Somfy smart door lock, a Philips HUE smart LED lamp, and an Amazon Echo home-based intelligent personal assistant. These devices are generic benign IoT devices. On the contrary, malicious network traffic is collected from twenty Raspberry Pi, each running with a specific malware that uses various protocols and performs distinct operations.

The following are the eight categories of significant attacks considered in the proposed IoT attack detection:

Command and Control (C&C): [14] It implies that the malicious IoT objects were connected to a Command and Control Server. The suspicious server may download some malicious binaries onto the devices or sniff the information passing through the devices.

Distributed Denial of Service (DDoS): [15] It refers to the participation of IoT devices and other malicious bots to overload service requests to a computing system so that the computing system is unable to process. The attacks are usually targeted at a particular IP address.

File Download: [14] It shows that a suspicious file is downloaded for malicious activity. Its usual size is in the range of 3 KB to 5 KB. This suspicious activity occurs with a high correlation with a Command & Control (C&C) server and a dubious port or IP address.

Heart Beat: [16] It refers to packets, usually of 1 byte, sent by the C&C server to keep track of the infected IoT devices. The tracking activity can be sensed by identifying connections with a response of less than 1 byte. A remote C&C server usually does the tracking activity. The server sends the heartbeat packets through earlier known malicious ports and destination IP address with similar periodic connections.

Part Of A Horizontal Port Scan: [17] It is a horizontal port scan to gather information and perform further extensive attacks. The scanned connection usually shares a single port for transmitting inquiry bytes for multiple destination IP addresses.

Mirai: [18] It is the famous attack that was used to disrupt the Dyn server. It identifies vulnerabilities in IoT devices attempting more than sixty common default usernames and passwords at the outset. Then, it logs into those devices to further infect them with the malicious Mirai code. It turns networked IoT devices into remotely controlled bots that can be used as part of large-scale botnet DDoS attacks.

Torii: [19] It possesses a rich set of features for exfiltration of sensitive information and modular architecture capable of retrieving information. It can execute commands and executables through multiple layers of encrypted communications. It can also infect many types of devices with various architectures, such as x86, x64, MIPS, PowerPC, ARM, and many more.

Okiru: [20] It is a malware that is similar to Mirai. Its configuration is encrypted in two parts, and the attack via Telnet is much more severe as it uses a list of over 100 credentials.

3. Literature review

Deep Learning (DL) is evolving as an alternative security solution for IoT-enabled applications [21]. The recent developments in Deep Learning technology have attracted many researchers to work on IoT security using DL models. The following are the recent DL-based research endeavors undertaken for IoT security.

Resource constraint IoT devices are much more easily compromised in comparison to a Desktop computer, which has sufficient resources. Y. Meidan et al. [10] presented network-based anomaly detection in the Internet of Things (N-BaIoT). They collected data from nine botnet infected devices. They used an autoencoder (AE), a deep learning model, to detect malicious network traffic from malware-infected IoT devices. Nevertheless, the model has addressed only two botnets, particularly the BASHLITE and Mirai botnet. In addition to that, the model took (174 + 212 =) 386 ms to detect the malicious botnet. Besides that, the model was compared with three traditional machine learning models. The comparison of the model with other DL models should estimate reasonable accuracy in comparison to others.

DL with various capabilities, for example, high-level feature extraction, self-learning, feature compression capability, and ideal hidden pattern discovery, provides an edge over other traditional Machine Learning (ML) models. The feature extraction mechanism in DL could help discriminate the attacks obtained from a slight mutation of earlier known attacks. A. A. Diro et al. [22] proposed a DL model using Stochastic Gradient Descent (SGD) as the backpropagation optimization algorithm, that is, a stochastic approximation of the gradient descent optimization. The proposed model detects attacks in social IoT


Table 1
Deep learning mechanisms in IoT security.

Attack columns in the original table: DoS, CC, DDoS, FD, HB, M, O, T, POHP, R.

Ref. | Objective | Model | Attacks considered | Limitations
Y. Meidan et al. [10] | IoT botnet detection | Deep autoencoder | ✓ | Only evaluated on Mirai and BASHLITE botnets; compared with three ML models
A. A. Diro et al. [22] | Attack detection | SGD | ✓ ✓ ✓ ✓ | Limited attacks; limited data samples
B. Roy et al. [23] | Intrusion detection | LSTM and BRNN | ✓ ✓ ✓ ✓ ✓ | Single dataset; no model comparison
Y. Zhou et al. [24] | Intrusion detection | DFNN | ✓ ✓ ✓ ✓ | Limited data samples; limited attacks; significant increase in time
A. Dawoud et al. [25] | Intrusion detection | RBM | ✓ ✓ ✓ ✓ | Old dataset; limited attacks
H. HaddadPajouh et al. [26] | Malware detection | LSTM and BNN | ✓ | Emulated dataset; limited data samples
S. Homayoun et al. [11] | Ransomware detection | LSTM and CNN | ✓ | Ransomware detection only; emulated dataset
A. Azmoodeh et al. [27] | Malware detection | CNN | ✓ | Self-manufactured dataset; limited dataset samples
O. Brun et al. [28] | DDoS detection | DRNN | ✓ | Limited network attacks; no comparison is provided for similar models

Notes: DoS: Denial of Service; CC: Command & Control; DDoS: Distributed Denial of Service; FD: File Download; HB: Heart Beat; M: Mirai; O: Okiru; T: Torii; POHP: Part Of A Horizontal Port Scan; R: Ransomware; ✓: Supported by security mechanism.

networks. However, the model does not provide a comparison with other traditional ML models. Additionally, the model has considered a limited dataset with limited attacks such as port scanning, downloading malicious files onto a device, buffer overflow, and DoS.

Internet technology, the new generation of mobile communication technology, and automotive intelligence have revolutionized travel and daily commutes. Simultaneously, smart vehicles are also becoming a target due to various technical loopholes resulting in a range of security issues. Authors Fei Li et al. [29] have proposed a DL-based intrusion detection using autoencoder and recurrent neural network. They used autoencoder to learn high-dimensional complex data structure and extract the required features. Then, they used RNN to identify the influence of fake vehicle RPM data on the speed of a smart vehicle.

Authors Y. Tan et al. [30] proposed another interesting Deep Feature Embedding Learning framework to detect an Internet intrusion in an IoT environment. They experimented with UNSW-NB15 and NSL-KDD datasets. They proposed a three-layer system to generate embedding features for small datasets. Various feature generation models followed by classification-based ensemble methods have been proposed. Deep learning models with multilayer processing architecture are showing better performance as compared to the shallow or traditional classification models [31]. R. Katuwal et al. [32] proposed a deep Random Vector Functional Link (RVFL) network. They integrate the deep learning networks with sparse pre-trained Random Vector Functional Link. It used a sparse-autoencoder to learn the hidden layer parameters of RVFL. It can be obtained by training a deep network only once rather than training several models independently as in traditional ensembles.

The authors B. Roy et al. [23] proposed an IoT attack detection mechanism using a Bi-directional Long Short-Term Memory, a variant of the Recurrent Neural Network (RNN) model. The model provides an accuracy of 95%. However, the model is trained with only 5451 test samples with a single dataset. Besides that, it does not offer any comparison with other contemporary models.

The authors Y. Zhou et al. [24] proposed an architecture called Deep Feature Embedding Learning (DFEL) to reduce the data dimensions by taking the edge of deep learning and transfer learning. The model used NSL-KDD and UNSW-NB15 datasets. The two datasets are randomly split, and 80% of the data was used to fit DFEL to obtain the pre-trained model. The remaining 20% of the data were randomly split into 70% and 30% as training and testing data for classifiers, respectively. Then, the rest 20% of the data was transferred to latent attributes using the DFEL. Furthermore, the obtained embedding features were split into 70% and 30% for embedding training and embedding test, respectively. The experimental evaluation of the model’s accuracy on a recent dataset is comparatively less.

A. Dawoud et al. [25] proposed a framework for IoT by integrating IoT with Software Defined Network and presented a DL-based intrusion detection mechanism in it. The security model used Restricted Boltzmann Machines (RBM). The model worked better than traditional ML models, such as Kernel Density Estimation and Support Vector Machine, using the KDD99 dataset. However, the dataset considered in the evaluation was acquired in 1999, and it is outdated.

H. HaddadPajouh et al. [26] further explored the idea of using Recurrent Neural Network (RNN) in IoT malware detection. They collected malware samples from 32-bit ARM-based processor. They built their dataset by extracting the OpCodes and tested three different versions of the LSTM model [13,33]. They trained their model using only 281 malware and 270 benign programs, and received an accuracy of 98%. It can be observed that the dataset is small and emulated. Therefore, the model should be tested under the recently available large datasets.

S. Homayoun et al. [11] present a DL-based Deep Ransomware Threat Hunting and Intelligence System (DRTHIS) to discriminate ransomware from benign software and identify their families. The authors


Fig. 1. Proposed network model.

used two DL techniques, Long Short-Term Memory (LSTM) [13] and Convolutional Neural Network (CNN), for classification. They experimented with over 220 Locky, 220 Cerber, and 220 TeslaCrypt ransomware samples, and 219 benign samples for training. They obtained the True Positive Rate (TPR) of 97%. However, the accuracy measure is not provided for the evaluation. Moreover, the study is only limited to ransomware and its variants.

IoT also gained much attention from military usage. Various IoT devices are helping in military cooperation and defensive applications. The devices and data involved in their communication are very crucial from a security point of view. Authors A. Azmoodeh et al. [27] addressed malware detection on the Internet of Battlefield Things (IoBT). The model is based on a class-wise selection of Op-Codes sequence as a feature for the classification task. Further, they created a graph of selected features for each sample and used a deep Eigenspace learning approach for malware classification. They obtained an accuracy of 98% to detect junk code insertion attacks. Nonetheless, the dataset has limited samples and is self-manufactured.

Authors O. Brun et al. [28] proposed a security mechanism for the online detection of network attacks in IoT networks. They captured the online traffic transmitting through the network gateway and extracted the data from the PCAP file. They trained the data samples using a standard RNN composed of statistically identical cells where the number of cells is enormous, and each cell receives inhibitory spike trains from external cells. However, the study considered a self-generated dataset and worked with a limited number of attacks such as sleep deprivation attacks, UDP flood, broadcast attack, and TCP SYN. Besides that, the proposed method is not compared with any other security mechanisms.

Table 1 shows the security objective, mechanisms, attacks, and limitations of the DL-based security works. It can be observed that the delineated works are addressing a limited number of attacks such as ransomware detection or Mirai botnet detection. Also, the security mechanisms used emulated datasets and mostly self-generated them in their control environment. The number of samples used to train their models is also limited, and the datasets are outdated. In this regard, there is a need for a security mechanism addressing a more significant set of attacks with a recently available updated dataset. Therefore, our work presents a security framework and proposes an embedded Deep Learning mechanism using Convolution Neural Network and Long Short Term Memory to detect the IoT attacks in a typical network. The dataset used in the proposed model is a standard dataset from the Stratosphere lab published in 2020. It is obtained from real infected IoT devices.

4. Proposed scheme

This section presents the proposed security mechanism to detect attacks in a typical IoT network and safeguards. At the outset, the threat model with considered attacks in a standard IoT network is enumerated. Subsequently, a proposed network model and a novel security framework are presented as a security mechanism to counter IoT attacks. Subsequent sub-sections explain the sub-components of the framework, such as CNN and LSTM, for the detection of the considered attacks.

Threat Model: Fig. 1 shows a typical IoT network where IoT-enabled devices are networked and later connected to the Internet for accessing various analytical services. However, the devices are exposed to many potential attackers who could sniff through the internet connection and gather crucial information. Later, they could also take leverage of resource constraint IoT devices and launch various attacks. The attacks could be of multiple types, for example, command & control of the devices, distributed denial of services, secretly downloading malicious files, Mirai, Okiru, Torii, and horizontal port scanning. The explanation of these considered attacks is given in Section 2. Present-day intruders are enabled with various advanced network information gathering and analysis tools. Therefore, there is a high probability that a potential intruder can identify old or isolated IoT devices and target them for further intrusion. Once the attacker gets control over an IoT device, it may launch further extended attacks to control other devices in the network.

Network Model: Fig. 1 shows the proposed security framework for a typical IoT network. The IoT devices are isolated from the fully connected computer network to distinguish the network characteristics better. It is proposed to secure IoT devices in a separate subnet for better detection of malicious IoTs. As shown in Fig. 1, there will be additional devices equipped with the trained CNN model at every access point; each taps the network traffic, learns features by processing it, and then sends the features to the upper-layer network device. The upper layer, typically a gateway, will be equipped with the trained LSTM classifier. The extracted features from the CNN model will be provided to the trained LSTM model for classification. If the feature is classified as malicious, then the corresponding identity and address of the IoT device will be notified through the system monitor so that further preventive action can be taken to quarantine the malicious activities and devices.
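The data flow of the network model (feature extraction at the access point, classification at the gateway, notification of the system monitor) can be sketched as follows. This is only an illustrative sketch and not the authors' implementation: the names `AccessPointTap`, `Gateway`, `SystemMonitor`, `toy_extractor`, and `toy_classifier` are hypothetical, and the trained CNN and LSTM models are replaced by trivial placeholder functions.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Sequence, Tuple

Features = List[float]


@dataclass
class SystemMonitor:
    """Collects alerts so that malicious devices can be quarantined later."""
    alerts: List[Tuple[str, str, str]] = field(default_factory=list)

    def notify(self, device_id: str, address: str, label: str) -> None:
        self.alerts.append((device_id, address, label))


@dataclass
class Gateway:
    """Upper-layer device holding the (stand-in) trained classifier."""
    classify: Callable[[Features], str]
    monitor: SystemMonitor

    def receive(self, device_id: str, address: str, features: Features) -> str:
        label = self.classify(features)
        if label != "Benign":
            # Report the identity and address of the suspicious device.
            self.monitor.notify(device_id, address, label)
        return label


@dataclass
class AccessPointTap:
    """Device at an access point holding the (stand-in) feature extractor."""
    extract: Callable[[Sequence[float]], Features]
    gateway: Gateway

    def on_flow(self, device_id: str, address: str, raw_flow: Sequence[float]) -> str:
        # Features are computed locally, then forwarded to the gateway.
        return self.gateway.receive(device_id, address, self.extract(raw_flow))


# Placeholder models (assumptions): a real deployment would load trained networks.
def toy_extractor(raw_flow: Sequence[float]) -> Features:
    return [sum(raw_flow) / len(raw_flow)]


def toy_classifier(features: Features) -> str:
    return "DDoS" if features[0] > 0.5 else "Benign"
```

A tap is wired to a gateway with `AccessPointTap(toy_extractor, Gateway(toy_classifier, SystemMonitor()))`; adding a subnet then only means adding another tap that points at the same gateway.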


4.1. Proposed security mechanism

Fig. 2 depicts the proposed security framework. It comprises two stages. First, a Deep Learning based Convolutional Neural Network (CNN) model learns features from the IoT network traffic and inputs them into another DL-based Long Short Term Memory (LSTM) model. Second, the LSTM classifies the learned features and identifies the malicious IoT devices, if any. The detailed proposed architecture and functionalities of the CNN and LSTM frameworks are explained in the following Sections 4.1.1 and 4.1.2, respectively.

Fig. 2. Proposed CNN–LSTM framework for IoT attack detection.

4.1.1. CNN model

The CNN model used at the initial stage of classification acquires the essential features from the network traffic. Fig. 3 shows the proposed architecture of the CNN model.

The model is a Deep Neural network with four convolutions and a fully connected dense layer. A 6 × 6 feature (thirty-six dimensions) forms the input layer. The subsequent convolution layer is formed by applying eighty 2 × 2 kernels with a stride equal to one. Therefore, the number of learning parameters, including bias, will be 80 × (2 × 2 + 1) = 400. The second convolution layer applies forty 2 × 2 kernels with stride equal to one, and the obtained number of learning parameters is 200. Similarly, the third layer with twenty kernels of 3 × 3 results in 200 parameters to be learned. The fourth convolution layer requires 150 learning parameters. The last fully connected dense layer, before output, is the aimed feature of dimension 100, and it would require 27100 learning parameters for the model. The total number of parameters needed for the model is 400 + 200 + 200 + 150 + 27100 = 28050.

The model used the adaptive moment estimation (Adam) optimization algorithm with parameters $\beta_1 = 0.9$, $\beta_2 = 0.999$, $\epsilon = 10^{-8}$ and $\eta = 0.001$, in the following equations:

$$m_t = \beta_1 m_{t-1} + (1 - \beta_1)\,\nabla w_t$$
$$v_t = \beta_2 v_{t-1} + (1 - \beta_2)\,(\nabla w_t)^2$$
$$\hat{m}_t = \frac{m_t}{1 - \beta_1^t}, \qquad \hat{v}_t = \frac{v_t}{1 - \beta_2^t}$$
$$w_{t+1} = w_t - \frac{\eta}{\sqrt{\hat{v}_t} + \epsilon}\,\hat{m}_t$$

The model used the Leaky Rectified Linear Unit, i.e., $LeakyReLU(x) = \max(\alpha x, x)$, as the activation function for the hidden layers, where $\alpha = 0.01$. The output activation function used in the model is softmax, and it is defined as:

$$Softmax = S(\vec{z})_i = \frac{e^{z_i}}{\sum_{j=1}^{K} e^{z_j}}$$

The softmax function inputs a vector $\vec{z}$ of $K$ real numbers and normalizes it into a probability distribution consisting of $K$ probabilities. The loss function adopted for minimization is Cross Entropy, i.e., $CrossEntropy(L, S) = -\sum_i L_i \log S_i$, where $L_i$ is the one-hot representation of the target label and $S_i$ is the output state vector from the output layer using the softmax function.

4.1.2. LSTM model

LSTM is a variant of the Recurrent Neural Network (RNN) model, and Fig. 4 shows the single cell of the LSTM model used for the classifier.

The unit cell consists of three gates, the input gate ($i_t$), the forget gate ($f_t$), and the output gate ($o_t$), to control the learning of feature information. $W_i$, $W_f$ and $W_o$ are the recurrent weights, and $U_i$, $U_f$ and $U_o$ are the input weight parameters to be learned. The following are the equations to be evaluated for the mentioned gates:

$$o'_t = W_o h_{t-1} + U_o x_t + b_o, \qquad o_t = \sigma(o'_t)$$
$$i'_t = W_i h_{t-1} + U_i x_t + b_i, \qquad i_t = \sigma(i'_t)$$
$$f'_t = W_f h_{t-1} + U_f x_t + b_f, \qquad f_t = \sigma(f'_t)$$
$$\sigma(x) = \frac{1}{1 + e^{-x}}$$

The features obtained from the CNN model at time $t$, $x_t$, form the input vector to the LSTM cell. Then, a temporary state, $\tilde{s}_t$, is calculated using the input, $x_t$, and the previous selectively written cell state $h_{t-1}$; it is given by $\tilde{s}_t = \tanh(W h_{t-1} + U x_t + b)$. Next, the cell output at time $t$, $\{s_t, h_t\}$, is obtained using the following equations:

$$s_t = f_t \odot s_{t-1} + i_t \odot \tilde{s}_t$$
$$h_t = o_t \odot \tanh(s_t)$$
$$\tanh(x) = \frac{e^{2x} - 1}{e^{2x} + 1}$$

5. Experimental evaluation

The proposed scheme is implemented using TensorFlow, an end-to-end open-source platform for Machine Learning [34], in the Google Colab environment. Section 5.1 explains the datasets and their pre-processing. Subsequently, Sections 5.2 and 5.3 provide the performance measures and result analysis, respectively.

5.1. Datasets and pre-processing

Dataset: The dataset used in this experimental evaluation is IoT-23. It contains device traffic captured from three benign IoT devices and twenty malware-infected Raspberry Pi. The IoT network traffic was obtained from the Stratosphere Laboratory at the Czech Technical University, Czech Republic [35]. The benign scenarios were obtained by capturing the network traffic of three distinct IoT devices: a Somfy smart door lock, an Amazon Echo home-based intelligent personal assistant, and a Philips HUE smart LED lamp. It is essential to mention that these three IoT devices are real hardware and not simulated. Besides that, the dataset contains more than 325 million traffic flows data published in 2020.

The captured .pcap file from the network node is first analyzed manually. Then, the suspicious flows are identified, and proper labels are assigned. Next, a labeled file in comma-separated value (.csv) format is produced by the expert. A Python script processes the manually labeled .csv files to add labels to each flow. Another Python script processed other log files and compared them with the pattern of manually labeled .csv files; furthermore, it assigned a label to the appropriate flow of the processed files.

Data Pre-processing: Initially, the obtained labeled files from the Stratosphere Laboratory are converted to Comma Separated Value (.csv) format. Features such as the IP address are expanded to their sub-components as integer features. Subsequently, the categorical features are encoded to numerically ordered features, and numeric data


are normalized to range between 0 and 1. Lastly, the target labeled features used in our supervised learning are converted to One-Hot encoding. The target encoding scheme will have nine sub-classes, for eight malicious attack labels and one benign: C&C, DDoS, FileDownload, HeartBeat, Mirai, Okiru, Torii, PartOfAHorizontalPortScan, and Benign.

Fig. 3. Proposed CNN Model for feature learning.

Fig. 4. LSTM model for classification.

5.2. Performance measures

The proposed security scheme is evaluated using the following performance characteristics. True Positive (TP) is positively labeled samples correctly classified as positive, whereas True Negative (TN) is negative samples correctly predicted as negative by the model. False Positive (FP) is defined as negatively labeled samples misclassified as the positive class, and False Negative (FN) is positively labeled samples misclassified as the negative class by the model. Table 2 provides the tabular representation of the confusion matrix used for further performance measure calculation.

Table 2
Confusion matrix.
                        Predicted positive class    Predicted negative class
Actual positive class   True Positive (TP)          False Negative (FN)
Actual negative class   False Positive (FP)         True Negative (TN)

The following performance metrics, obtained from the confusion matrix, are used for evaluation of the proposed model:

False Positive Rate (FPR) [36]: the ratio of the number of misclassified negative samples to the total number of negative samples. It is given by:
$$FPR = \frac{FP}{FP + TN}$$

False Negative Rate (FNR) [36]: the ratio of the number of misclassified positive samples to the number of positive samples. It is evaluated as:
$$FNR = \frac{FN}{TP + FN}$$

Accuracy: the overall effectiveness of the classification model, i.e., the ratio of the number of correctly classified samples to the total number of samples present in a given test dataset. It is evaluated as:
$$Accuracy = \frac{TP + TN}{TP + FP + TN + FN}$$

F-Measure: the harmonic mean of the precision and recall, i.e.,
$$F\text{-}Measure = 2 \times \frac{Precision \times Recall}{Precision + Recall}$$

5.3. Result and security analysis

The proposed DL-based security model is trained and tested with network samples taken from a large set of attacked IoT devices. Fig. 1 depicts the proposed network model. It can be clearly observed that each subnet is added with a CNN model to harness the true representative of network features. Whenever we scale up the network by adding additional IoT devices, we also need to accompany the connected network switch with an additional CNN module. The increase in the IoT traffic in the network will be handled by its corresponding CNN module. Furthermore, the extracted features from the CNN are directed to the LSTM classifier, where little computational overhead is required to process attack detection.

Table 3 shows the classification accuracy of only the CNN model using the IoT-23 dataset. The initial model does not perform well in classifying attacks such as C&C, FileDownload, HeartBeat, and PartOfAHorizontalPortScan. However, it works substantially well in identifying the Benign features with an accuracy of 94% and FNR equal to 4%. The resultant FNR of attacks such as C&C, HeartBeat, and PartOfAHorizontalPortScan is high. It signifies that the ratio of the number of misclassified positive samples to the number of positive samples is high. It is highly insecure and dreadful to classify actual malicious IoT devices as benign.
Recall: it refers to the ratio of predicted positives and total true posi- The classification inaccuracy FNR for various attacks in the CNN
𝑇𝑃 model motivates us to two-stage model using both CNN and LSTM. The
tives. It is given by: 𝑅𝑒𝑐𝑎𝑙𝑙 =
𝑇𝑃 + 𝐹𝑁 embedded features obtained from the fully connected CNN-layer are
Precision: it infers the ratio of positively predicted instances out of total
𝑇𝑃 stored and provided to LSTM for further classifications. Table 4 presents
positively predicted. It is calculated as: 𝑃 𝑟𝑒𝑐𝑖𝑠𝑖𝑜𝑛 = the performance measures for the hybrid model. It can be observed that
𝐹𝑃 + 𝑇𝑃
Specificity: it refers to the proportion of real negative instances that the model performs pretty well in identifying malicious devices with an
have been predicted negative. It is evaluated as: 𝑆𝑝𝑒𝑐𝑖𝑓 𝑖𝑐𝑖𝑡𝑦 = accuracy of 96% and F-measures equal to 96% as well. The FNR rate,
𝑇𝑁
2%, signifies its miss-classifications for malicious devices. The security
𝐹𝑃 + 𝑇𝑁
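To make the evaluation chain concrete, the following minimal sketch turns softmax outputs into predicted classes, tallies the confusion-matrix counts of Table 2, and derives the measures of Section 5.2. It is an illustration under stated assumptions, not the authors' implementation: the logits, the two-class setup (0 = Malicious, 1 = Benign), and all function names are hypothetical.

```python
import math


def softmax(logits):
    """Softmax S(z)_i = exp(z_i) / sum_j exp(z_j), shifted by max(z) for stability."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def one_hot(index, num_classes):
    """One-hot target vector for a class index."""
    return [1 if i == index else 0 for i in range(num_classes)]


def confusion_counts(targets, predictions, positive):
    """Tally TP, FP, TN, FN with one class treated as positive (Table 2 layout)."""
    tp = fp = tn = fn = 0
    for actual, predicted in zip(targets, predictions):
        if actual == positive:
            if predicted == positive:
                tp += 1
            else:
                fn += 1
        elif predicted == positive:
            fp += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def measures(tp, fp, tn, fn):
    """Measures of Section 5.2; assumes every denominator is non-zero."""
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    return {
        "recall": recall,
        "precision": precision,
        "specificity": tn / (tn + fp),
        "fpr": fp / (fp + tn),
        "fnr": fn / (tp + fn),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "f_measure": 2 * precision * recall / (precision + recall),
    }


# Hypothetical data: six samples, two classes (0 = Malicious, 1 = Benign).
TARGET_VECTORS = [one_hot(c, 2) for c in [0, 0, 0, 1, 1, 1]]
TARGETS = [vec.index(1) for vec in TARGET_VECTORS]  # decode one-hot labels
LOGITS = [[2.0, 0.1], [1.5, 0.3], [0.2, 1.1], [0.4, 2.2], [0.1, 1.9], [1.2, 0.8]]

PREDICTIONS = []
for row in LOGITS:
    probs = softmax(row)
    PREDICTIONS.append(probs.index(max(probs)))  # argmax over class probabilities

COUNTS = confusion_counts(TARGETS, PREDICTIONS, positive=0)
RESULT = measures(*COUNTS)
```

With these made-up inputs one malicious sample is missed and one benign sample is flagged, so recall and FNR sum to one, as the definitions require. A production version would need to guard against empty classes, which make some denominators zero.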


Table 3
Performance measures in CNN.
Attack Recall Precision Specificity FPR FNR F-Measure Accuracy
CC 0.8369 0.6525 0.6371 0.3628 0.1630 0.7333 0.7268
FileDownload 0.7386 0.6989 0.7227 0.2772 0.2613 0.7182 0.7301
HeartBeat 0.6458 0.7560 0.7727 0.2272 0.3541 0.6966 0.7065
PartOfAHorizontalPortScan 0.7422 0.6728 0.6902 0.3097 0.2577 0.7058 0.7142
Torii 0.9042 0.8585 0.8426 0.1573 0.0957 0.8808 0.8743
Okiru 0.8723 0.8817 0.8888 0.1111 0.1276 0.8770 0.8808
Mirai 0.9200 0.9019 0.8969 0.1030 0.0800 0.9108 0.9086
DDoS 0.9270 0.9081 0.8941 0.1058 0.0729 0.9175 0.9116
Benign 0.9587 0.9300 0.9270 0.0729 0.0412 0.9441 0.9430

Table 4
Performance measures in CNN–LSTM.
Classes Recall Precision Specificity FPR FNR F-Measure Accuracy
Malicious 0.9706 0.9594 0.9526 0.0473 0.0293 0.9650 0.9623
Benign 0.9578 0.9469 0.9446 0.0553 0.0421 0.9523 0.9513
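
As a quick consistency check on Table 4, the F-measure and the complementary error rates of the Malicious class can be recomputed from its tabulated recall, precision, and specificity. The arithmetic below uses only values from the table and the definitions of Section 5.2; the recomputed rates differ from the tabulated 0.0293 and 0.0473 only in the last digit, which is consistent with truncation of the published figures.

```latex
\begin{align*}
F\text{-}Measure &= 2 \times \frac{0.9594 \times 0.9706}{0.9594 + 0.9706}
                  = \frac{1.8624}{1.9300} \approx 0.9650 \\
FNR &= 1 - Recall = 1 - 0.9706 = 0.0294 \\
FPR &= 1 - Specificity = 1 - 0.9526 = 0.0474
\end{align*}
```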

Table 5
Comparative study of the proposed model.
Ref.    Security mechanism    Attack considered (DoS, CC, DDoS, FD, HB, M, O, T, POHP)    Accuracy
B. Roy et al. [23] LSTM and BRNN      72%
H. HaddadPajouh et al. [26] LSTM and BNN  84%
A. Azmoodeh et al. [27] CNN  88%
Proposed work CNN and LSTM          96%

Notes: DoS: Denial of Service; CC: Command & Control; DDoS: Distributed Denial of Service; FD: File Download; HB: Heart Beat; M: Mirai; O: Okiru; T: Torii; POHP: Part Of A Horizontal Port Scan; : Supported by security mechanism.

The security model's recall of 97% shows that it satisfactorily identifies malicious devices as malicious.

Our proposed model identifies malicious network traffic with an F-Measure of 96%. Once the traffic is identified, it can be further investigated and classified with the CNN-based sub-classification network.

Figs. 5 and 6 depict the loss trend and the accuracy of the proposed model, respectively. Fig. 5 shows that the training and validation curves follow the same trend, and therefore the model does not suffer from over-fitting. The proposed model also does not suffer from under-fitting. Fig. 6 shows that the accuracy is achieved by the 110th epoch, at which point the model reaches 96% accuracy.

Fig. 5. Loss trend in the proposed model.

Table 5 compares the proposed security mechanism with three related works. The proposed model outperforms the other models in attack detection accuracy. The reported accuracy is achieved when tested with the standard IoT-23 dataset. The model also considers a comparatively larger attack set, i.e., nine various attacks, for malicious IoT object detection. It can be observed from Table 5 that the models proposed by H. HaddadPajouh et al. [26] and A. Azmoodeh et al. [27] achieved accuracies of 84% and 88%, respectively. Moreover, they only consider attacks from a Command and Control server. Another work, by B. Roy et al. [23], considered five attack categories but achieves an accuracy of only 72%, whereas our model identifies the attack vectors with an accuracy of 96%.

6. Conclusion

IoT security has become one of the primary concerns in present-day technology. Many cryptographic constructs and security protocols have been proposed to secure IoT devices and their networks, but implementing and maintaining security for thousands of devices is tedious and challenging. Also, adversaries are likely to come up with similar but newer threats, for which security patches are essential. Instead of providing individual cryptographic security, an underlying threat discovery mechanism over the whole network is cheaper and more efficient. Therefore, many research endeavors have used DL to detect attacks and secure IoT networks. Nevertheless, the existing works are trained and evaluated with a small number of attack samples; additionally, their datasets are emulated and outdated. Therefore, this paper proposes an IoT attack detection mechanism using a Deep Learning-based classifier. It is modeled using a recently published standard dataset with more than thirty-five million samples. The classifier's initial stage, a CNN, efficiently learns the IoT features with 28050 learning parameters; the features are then classified by an LSTM-based classifier. The proposed model is empirically evaluated and found to perform quite well in identifying malicious devices, with an accuracy of 96%. The proposed model classifies malicious devices efficiently, outperforming other related models. The security model is also easily scalable by adding a CNN module whenever an additional IoT sub-network is added. Even so, attack detection by the LSTM classifier requires little computational overhead. In addition, the model does not suffer from
underfitting or overfitting problems. Therefore, it would be useful in securing a typical IoT network from the eight considered categories of attacks and their novel mutated versions.

Fig. 6. Accuracy in the proposed model.

CRediT authorship contribution statement

Amiya Kumar Sahu: Conceptualization, Methodology, Software, Data curation, Writing - original draft. Suraj Sharma: Visualization, Investigation, Supervision, Resources. M. Tanveer: Software, Validation, Resources. Rohit Raja: Writing - review & editing.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

[1] M.A. Amanullah, R.A.A. Habeeb, F.H. Nasaruddin, A. Gani, E. Ahmed, A.S.M. Nainar, N.M. Akim, M. Imran, Deep learning and big data technologies for IoT security, Comput. Commun. 151 (2020) 495–517, https://doi.org/10.1016/j.comcom.2020.01.016.
[2] A. Sahu, S. Sharma, D. Puthal, A. Pandey, R. Shit, Secure authentication protocol for IoT architecture, in: 2017 International Conference on Information Technology, ICIT, 2017, pp. 220–224, http://dx.doi.org/10.1109/ICIT.2017.21.
[3] A. Sahu, S. Sharma, S.S. Tripathi, K.N. Singh, A study of authentication protocols in internet of things, in: 2019 International Conference on Information Technology, ICIT, 2019, pp. 217–221, http://dx.doi.org/10.1109/ICIT48102.2019.00045.
[4] J. Yoon, Deep-learning approach to attack handling of IoT devices using IoT-enabled network services, Internet of Things 11 (2020) 100241, https://doi.org/10.1016/j.iot.2020.100241.
[5] F. Hussain, R. Hussain, S.A. Hassan, E. Hossain, Machine learning in IoT security: Current solutions and future challenges, IEEE Commun. Surv. Tutor. 22 (3) (2020) 1686–1721, http://dx.doi.org/10.1109/COMST.2020.2986444.
[6] The Growth in Connected IoT Devices Is Expected to Generate 79.4ZB of Data in 2025, According to a New IDC Forecast, International Data Corporation, https://nptel.ac.in/courses/106/106/106106184/. released: 18.06.2019, accessed: 15.02.2020.
[7] M.A. Al-Garadi, A. Mohamed, A.K. Al-Ali, X. Du, I. Ali, M. Guizani, A survey of machine and deep learning methods for internet of things (IoT) security, IEEE Commun. Surv. Tutor. 22 (3) (2020) 1646–1685, http://dx.doi.org/10.1109/COMST.2020.2988293.
[8] F. Jauro, H. Chiroma, A.Y. Gital, M. Almutairi, S.M. Abdulhamid, J.H. Abawajy, Deep learning architectures in emerging cloud computing architectures: Recent development, challenges and next research trend, Appl. Soft Comput. 96 (2020) 106582, https://doi.org/10.1016/j.asoc.2020.106582.
[9] A. Thakkar, R. Lohiya, A review on machine learning and deep learning perspectives of IDS for IoT: recent updates, security issues, and challenges, Arch. Comput. Methods Eng. (2020) https://doi.org/10.1007/s11831-020-09496-0.
[10] Y. Meidan, M. Bohadana, Y. Mathov, Y. Mirsky, A. Shabtai, D. Breitenbacher, Y. Elovici, N-baIoT—Network-based detection of IoT botnet attacks using deep autoencoders, IEEE Pervasive Comput. 17 (3) (2018) 12–22, http://dx.doi.org/10.1109/MPRV.2018.03367731.
[11] S. Homayoun, A. Dehghantanha, M. Ahmadzadeh, S. Hashemi, R. Khayami, K.-K.R. Choo, D.E. Newton, DRTHIS: Deep ransomware threat hunting and intelligence system at the fog layer, Future Gener. Comput. Syst. 90 (2019) 94–104, https://doi.org/10.1016/j.future.2018.07.045.
[12] K. Muhammad, T. Hussain, M. Tanveer, G. Sannino, V.H.C. de Albuquerque, Cost-effective video summarization using deep CNN with hierarchical weighted fusion for IoT surveillance networks, IEEE Internet Things J. 7 (5) (2020) 4455–4463, http://dx.doi.org/10.1109/JIOT.2019.2950469.
[13] K. Greff, R.K. Srivastava, J. Koutník, B.R. Steunebrink, J. Schmidhuber, LSTM: A search space odyssey, IEEE Trans. Neural Netw. Learn. Syst. 28 (10) (2017) 2222–2232, http://dx.doi.org/10.1109/TNNLS.2016.2582924.
[14] Cyber espionage through botnets, Secur. J. 33 (2020) 43–62, https://doi.org/10.1057/s41284-019-00194-6.
[15] M.M. Salim, S. Rathore, J.H. Park, Distributed denial of service attacks and its defenses in IoT: a survey, J. Supercomput. (2020) 5320–5363, https://doi.org/10.1007/s11227-019-02945-z.
[16] S. Kyatam, A. Alhayajneh, T. Hayajneh, Heartbleed attacks implementation and vulnerability, in: 2017 IEEE Long Island Systems, Applications and Technology Conference, LISAT, 2017, pp. 1–6, http://dx.doi.org/10.1109/LISAT.2017.8001980.
[17] M. Bhuyan, D.K. Bhattacharyya, J.K. Kalita, Surveying port scans and their detection methodologies, Comput. J. 54 (10) (2011) 1565–1581, http://dx.doi.org/10.1093/comjnl/bxr035.
[18] X. Zhang, O. Upton, N.L. Beebe, K.-K.R. Choo, IoT botnet forensics: A comprehensive digital forensic case study on mirai botnet servers, Forensic Sci. Int.: Digit. Investig. 32 (2020) 300926, https://doi.org/10.1016/j.fsidi.2020.300926.
[19] J. Kroustek, V. Iliushin, A. Shirokova, J. Neduchal, M. Hron, Torii botnet - not another mirai variant, 2020, https://blog.avast.com/new-torii-botnet-threat-research. (Accessed 23 March 2020).
[20] R. Joven, D. Maciejak, IoT Botnet: more targets in okiru's cross-hairs, 2020, https://www.fortinet.com/blog/threat-research/iot-botnet-more-targets-in-okirus-cross-hairs. (Accessed 20 April 2020).
[21] M.M. Khapra, Introduction to Deep Learning, NPTEL, IIT Madras, India, https://nptel.ac.in/courses/106/106/106106184/. (Accessed 15 February 2020).
[22] A.A. Diro, N. Chilamkurti, Distributed attack detection scheme using deep learning approach for internet of things, Future Gener. Comput. Syst. 82 (2018) 761–768, https://doi.org/10.1016/j.future.2017.08.043.
[23] B. Roy, H. Cheung, A deep learning approach for intrusion detection in internet of things using bi-directional long short-term memory recurrent neural network, in: 2018 28th International Telecommunication Networks and Applications Conference, ITNAC, 2018, pp. 1–6, https://doi.org/10.1109/ATNAC.2018.8615294.
[24] Y. Zhou, M. Han, L. Liu, J.S. He, Y. Wang, Deep learning approach for cyberattack detection, in: IEEE INFOCOM 2018 - IEEE Conference on Computer Communications Workshops, INFOCOM WKSHPS, 2018, pp. 262–267, https://doi.org/10.1109/INFCOMW.2018.8407032.
[25] A. Dawoud, S. Shahristani, C. Raun, Deep learning and software-defined networks: Towards secure IoT architecture, Internet of Things 3–4 (2018) 82–89, http://dx.doi.org/10.1016/j.iot.2018.09.003.
[26] H. HaddadPajouh, A. Dehghantanha, R. Khayami, K.-K.R. Choo, A deep recurrent neural network based approach for internet of things malware threat hunting, Future Gener. Comput. Syst. 85 (2018) 88–96, http://dx.doi.org/10.1016/j.future.2018.03.007.
[27] A. Azmoodeh, A. Dehghantanha, K.R. Choo, Robust malware detection for internet of (battlefield) things devices using deep eigenspace learning, IEEE Trans. Sustain. Comput. 4 (1) (2019) 88–95, http://dx.doi.org/10.1109/TSUSC.2018.2809665.
[28] O. Brun, Y. Yin, E. Gelenbe, Deep learning with dense random neural network for detecting attacks against IoT-connected home environments, in: The 15th International Conference on Mobile Systems and Pervasive Computing (MobiSPC 2018) / The 13th International Conference on Future Networks and Communications (FNC-2018) / Affiliated Workshops, Procedia Comput. Sci. 134 (2018) 458–463, https://doi.org/10.1016/j.procs.2018.07.183.
[29] F. Li, J. Zhang, E. Szczerbicki, J. Song, R. Li, R. Diao, Deep learning-based intrusion system for vehicular ad hoc networks, Comput. Mater. Continua 65 (1) (2020) 653–681, http://dx.doi.org/10.32604/cmc.2020.011264.
[30] Y. Tan, L. Tan, X. Xiang, H. Tang, J. Qin, W. Pan, Automatic detection of aortic dissection based on morphology and deep learning, Comput. Mater. Continua 62 (3) (2020) 1201–1215, http://dx.doi.org/10.32604/cmc.2020.07127.
[31] M.A. Ganaie, M. Hu, M. Tanveer, P.N. Suganthan, Ensemble deep learning: A review, 2021, arXiv:2104.02395.
[32] R. Katuwal, P.N. Suganthan, M. Tanveer, Random vector functional link neural network based ensemble deep learning, 2019, arXiv:1907.00350.
[33] S. Hochreiter, J. Schmidhuber, Long short-term memory, Neural Comput. 9 (8) (1997) 1735–1780, https://doi.org/10.1162/neco.1997.9.8.1735.
[34] A. Géron, Hands-on Machine Learning with Scikit-Learn, Keras, and TensorFlow, second ed., O'Reilly Media, 2019, pp. 1–819.
[35] A. Parmisano, S. Garcia, M.J. Erquiaga, A labeled dataset with malicious and benign IoT network traffic, URL https://www.stratosphereips.org/datasets-iot23.
[36] Y. Xin, L. Kong, Z. Liu, Y. Chen, Y. Li, H. Zhu, M. Gao, H. Hou, C. Wang, Machine learning and deep learning methods for cybersecurity, IEEE Access 6 (2018) 35365–35381, http://dx.doi.org/10.1109/ACCESS.2018.2836950.