
See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/333925714

Intrusion Detection Systems with Deep Learning: A Systematic Mapping Study

Conference Paper · April 2019


DOI: 10.1109/EBBT.2019.8742081


All content following this page was uploaded by Gozde Karatas on 03 October 2019.



Intrusion Detection Systems with Deep Learning: A Systematic Mapping Study

Sinem Osken, Ecem Nur Yildirim
Department of Mathematics and Computer Sciences
İstanbul Kültür University
İstanbul, TURKEY

Gozde Karatas, Levent Cuhaci
Department of Mathematics and Computer Sciences
İstanbul Kültür University
İstanbul, TURKEY
{g.karatas, l.cuhaci}@iku.edu.tr

Abstract: This paper presents a systematic mapping study of publications on Intrusion Detection Systems with Deep Learning. Using the systematic mapping method, 6088 papers were examined to evaluate the publications on deep learning, which has been used increasingly in Intrusion Detection Systems. The goal of our study is to determine which deep learning algorithms are used most often, which criteria are taken into account when selecting the preferred deep learning algorithm, and which topics are researched most in intrusion detection with deep learning models. Scientific studies published in the last 10 years were searched in the IEEE Xplore, ACM Digital Library, Science Direct, Scopus and Wiley databases.

Keywords: deep learning, intrusion detection, systematic mapping study

I. INTRODUCTION
The rapid development of network and internet technologies leads to new network applications, and the need to keep existing applications and systems compatible with these technologies makes institutions think about their security. If important personal or institutional data is obtained by others, the institutions suffer material and non-material damage. Therefore, technological measures have been taken to provide information security.

Attacks on the internet have driven the development of defense systems. Systems such as firewalls and anti-virus software have been developed to protect the local network. However, firewalls and anti-virus software have weaknesses caused by their architectures, they are not always updated, and new attacks can be created. For this reason, Intrusion Detection Systems (IDS) and Intrusion Prevention Systems (IPS) have been developed.

Intrusion detection systems are designed to ensure data security by detecting attacks that may come from the internet or the local network, may be composed of various packets and data, and may damage systems in the network. Their main purpose is to detect attacks and, if necessary, to prevent them. Intrusion Detection Systems can also obtain detailed information about the most common attacks, attack types and attackers. The remainder of this study gives information about Intrusion Detection Systems, Deep Learning and Systematic Mapping Study, followed by the research method, data extraction and results.

II. FUNDAMENTAL INFORMATION
This section provides information about Intrusion Detection Systems, Deep Learning and Systematic Mapping Study.

A. Intrusion Detection Systems
An intrusion is any action aimed at compromising the integrity, confidentiality, accessibility or reliability of an electronic data resource or network; Intrusion Detection Systems are defined by the task of detecting such actions. The earliest IDS concept was outlined in 1980 by James Anderson in an article named "Computer Security Threat Monitoring and Surveillance". Later, in 1986, Dorothy E. Denning and Peter G. Neumann published a model of an IDS that formed the basis for many systems in use today. This model, named the Intrusion Detection Expert System (IDES), used statistics for anomaly detection; a new version called Next-Generation IDES (NIDES) followed in 1993. Intrusion detection systems are software designed to detect attacks that, despite all measures and precautions taken, enter computer systems from the Internet or the local network, capture information and cause damage to systems in the network [1]-[3]. They are one of the security components aimed at responding to attacks and can be considered an alarm mechanism [1], [2]. The methods and approaches used in intrusion detection systems have been developed for classifying data or for creating rule tables. A platform for the intrusion detection system is created with the data generated by these methods. Many approaches have been applied in intrusion detection systems, such as artificial neural networks, data mining, artificial immune systems, rule-based methods, threshold values, state transition diagrams, support vector machines, genetic algorithms and fuzzy logic.

Intrusion Detection Systems can be classified according to their detection approach and the environment in which they detect attacks. By detection approach, they are divided into Signature Based IDS and Anomaly Detection Based IDS [1]. By environment, they are divided into Host Based IDS and Network Based IDS [1].
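The two detection approaches named in Section II.A can be illustrated with a minimal sketch. It is not taken from any of the surveyed systems: the signature strings, the function names and the three-standard-deviation threshold are hypothetical choices made only for illustration. The signature-based check looks for known attack patterns, while the anomaly-based check uses simple statistics and a threshold value to flag behavior that deviates from a normal baseline.

```python
from statistics import mean, stdev
from typing import Sequence

# Hypothetical signature list: substrings assumed to mark known attack patterns.
SIGNATURES = ("' OR 1=1", "../../", "<script>")


def signature_alert(payload: str) -> bool:
    """Signature-based detection: flag a payload that contains a known pattern."""
    return any(sig in payload for sig in SIGNATURES)


def anomaly_alert(baseline: Sequence[float], observed: float, k: float = 3.0) -> bool:
    """Anomaly-based detection: flag a value far from the normal baseline.

    The value is flagged when it lies more than k sample standard
    deviations away from the baseline mean (k = 3.0 is an assumed default).
    """
    mu = mean(baseline)
    sigma = stdev(baseline)
    return abs(observed - mu) > k * sigma


if __name__ == "__main__":
    # Requests per second observed during normal operation (made-up numbers).
    normal_rates = [98.0, 102.0, 100.0, 101.0, 99.0]

    print(signature_alert("GET /index.php?id=1' OR 1=1"))  # True
    print(anomaly_alert(normal_rates, 100.5))              # False
    print(anomaly_alert(normal_rates, 450.0))              # True
```

The signature check can only recognize patterns that are already listed, whereas the statistical check can flag previously unseen behavior at the cost of false alarms when normal traffic changes.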

978-1-7281-1013-4/19/$31.00 ©2019 IEEE


B. Deep Learning
Deep learning, a prominent trend in recent years, is a machine learning approach. It consists of multiple layers, many of which can run processes simultaneously, and high-level features are produced from low-level ones. It therefore forms its own features without needing human effort. Deep learning had its first big impact on the scientific world in 2012 [4]-[11].

Artificial Neural Networks (ANN), one of the architectures of deep learning, model in a computer environment how the neurons of the human brain work. Their aim is to create intelligent systems that work independently from humans. An Artificial Neural Network consists of three kinds of layers: Input Layer, Hidden Layer and Output Layer.

Convolutional Neural Network (CNN), another deep learning method, is a special type of ANN. It is also the most frequently used deep learning architecture. CNN is a significant model for problems involving images [4]-[11]. On some image classification tasks, CNN algorithms can classify images better than humans can. However, CNN has operated successfully not only in image processing but also in recommendation systems and other areas. In 2015, with AtomNet, it became the first deep neural network to be used in drug design. In a study to determine sleep quality, the CNN algorithm yielded the best result.

Recurrent Neural Network (RNN), another deep learning method, is a class of artificial neural network that has a memory and forms a loop between its units [4]-[11]. The main idea of this algorithm is to correctly classify information that arrives in a certain order. RNN algorithms have been used in studies on voice recognition. You et al. used an RNN model that classified text messages in prisons as safe or unsafe. To solve the vanishing gradient problem (long-term dependencies), computer scientists Sepp Hochreiter and Juergen Schmidhuber developed Long Short-Term Memory (LSTM), a specific type of RNN, in 1997 [4]-[11]. In the LSTM architecture there are four layers connected in a special way instead of a single neural layer.

Restricted Boltzmann Machines (RBM) are trained with a fast learning algorithm developed by Geoffrey Hinton. RBMs are two-layered neural networks. An RBM is a generative stochastic artificial neural network that can learn a probability distribution over a set of inputs. RBMs are very successful in dimensionality reduction, classification and feature learning.

Deep Belief Networks (DBN), also put forward by Geoffrey Hinton, can be considered a composition of RBMs. A DBN is a generative deep neural network in which the layers are linked to one another, and it is a multi-layered graphical model. DBNs have many uses in areas such as electroencephalography and drug discovery.

Autoencoders (AE), also known as Diabolo networks, are a specific kind of neural network that copies the values at the input layer to the output layer. They are used for unsupervised learning, in which labels are not given explicitly while training on the data set. An AE produces its own labels while training on the data. The aim of an autoencoder is typically to learn a code for a data set in order to reduce its dimensionality.

C. Systematic Mapping Study
Systematic mapping studies, or scoping studies, are designed to provide a general view of a research subject. They encompass a literature review to find out which subjects are dealt with in the literature. Systematic maps are mainly about forming a research plan. Through a systematic mapping study, the articles to be investigated in the literature review are determined. A step-by-step approach is crucial in such a study:
• Stating the mapping questions
• Searching the databases and determining relevant publications
• Selecting publications in line with predetermined criteria
• Following the review of the contents of the publications, determining the dates and types of the publications to be used.

Through a good literature review, it is possible to gain knowledge about all studies. The more complete the list formed after the publication scan, the better the results the research may yield. A systematic literature review contributes to finding further research topics that might be pursued in the future, filtering out previously unsuccessful methods and obtaining knowledge about methods that can be used.

III. RESEARCH METHOD
The stages of the investigation are Planning, Propulsion (conducting the review), Reporting and Data Extraction, respectively. These steps are described in the following subsections.

A. Planning
In this stage, the research questions, search strategies, inclusion and exclusion criteria, and the ways of retrieving and synthesizing data are set out. The research questions below were determined in order to identify primary studies on Intrusion Detection Systems:

RQ1: What are the commonly used research techniques (theory, survey, test, experiment, examination etc.)?
RQ2: Which of the electronic databases includes more publications relevant to intrusion detection systems with deep learning?
RQ3: Which journals and conferences contain more publications about intrusion detection systems?
RQ4: What is the distribution of publications by year?
RQ5: What is the distribution of studies by publication type?

A semi-automatic search was carried out to reach resources using the keywords below:
o “Deep Learning” and “Intrusion Detection”
o Deep learning with intrusion detection
o Deep learning-based intrusion detection
o “Deep Learning” “Intrusion Detection”

The databases used to find publications are ACM Digital Library, IEEE Xplore, Science Direct and Wiley. Because some publications are included in more than one database, only one copy of each is chosen and used. For this investigation, literature published between 2009 and 2019 was reviewed. Publications that are not relevant to the Intrusion Detection subject are excluded. The Inclusion Criteria (IC) and Exclusion Criteria (EC) are identified in Table 1 below:

Table 1: Inclusion-Exclusion Criteria

Inclusion Criteria (IC)
IC1: Primary studies about Intrusion Detection Systems.
IC2: Secondary studies about Intrusion Detection Systems.
IC3: Studies that address Intrusion Detection and Machine Learning together.

Exclusion Criteria (EC)
EC1: Search results that are not related to Intrusion Detection Systems.
EC2: Literature that is not published in English.
EC3: Publications that contain only summarized information.
EC4: Brief notes, panels and poster summaries.

B. Propulsion
Several searches were carried out in the electronic databases, and the IC and EC were used to identify related studies. After the exclusion criteria were applied, 50 confirmed publications were used for this investigation.

C. Reporting
Based on the research questions, all the reporting needed to answer each question was done and the results were evaluated.

D. Data Extraction
In this phase, the numbers of studies retrieved from each data source via the search strings were: IEEE 158, Scopus 301, ACM 3506, Wiley 1674 and ScienceDirect 1449. Subsequently, these studies were examined and eliminated according to the inclusion-exclusion criteria. First, studies were screened by keywords; then the whole text was read and the final number of papers was determined. In addition, only one copy of each duplicated study was kept and the others were eliminated. After elimination, the works were reduced to 87 related papers, and the research questions were applied to these papers.

IV. RESULTS
The distribution of publications by type is shown in Figure 1. Academic Article is the most published type.

Figure 1: Distribution of publications by types

Figure 2 shows the distribution of the analyzed publications by year. The number of publications has increased since 2009. Most publications about deep learning were produced in 2016.

Figure 2: Distribution of publications by years

Looking at the study context examined in Figure 3, publications of the Experiment type make up 59%, followed by Examination with 31% and Theory with 11%.

Figure 3: Distribution of publications by study context

The preferred deep learning algorithms in the examined publications are shown in Figure 4. Within all deep learning algorithms,
DNN is the most preferred one. In particular, the Others category includes all machine learning algorithms.

Figure 4: Distribution of Deep Learning Algorithms

In this study, 87 publications were taken into consideration. Figure 5 shows the distribution of the examined publications by electronic database. According to these figures, the most relevant publications are in the IEEE Xplore database.

Figure 5: Distribution of publications by databases

In this study, 22 journal articles and 65 conference papers were studied. The majority of the publications used in the review are conference papers. The names of the conferences with more than one publication are given in Table 2. IEEE Access is the journal with the most publications, with 4; every other journal has one.

Table 2: Most Published Conferences

Conference Name                                                            Number of Publications
2017 IEEE National Aerospace and Electronics Conf.                         2
2017 Int. Conf. on Adv. in Comp., Communi. and Informatics                 2
2018 Int. Telecommunication Net. and App. Conf.                            2
2018 Int. Cong. on Big Data, Deep Learning and Fighting Cyber Terrorism    2
Int. Conf. on Bio-inspired Inf. and Communi. Tech.                         2

V. CONCLUSION
Five electronic databases were used in this study; 6088 publications were found, but only 87 of them were used [12]. The suggestions at the end of the research are as follows: book chapters on intrusion detection systems should be developed, and different types of study areas that are suitable for implementing deep learning algorithms should be determined. Also, more work should be done on the algorithms used for sequential data.

In future work, as a last stage, we will try to identify the problems of the deep learning algorithms that were not included or investigated in this study.

REFERENCES
[1] G. Karatas and O. K. Sahingoz, “Neural network based intrusion detection systems with different training functions,” in 2018 6th International Symposium on Digital Forensic and Security (ISDFS). IEEE, 2018, pp. 1–6.
[2] G. Karatas, “Genetic algorithm for intrusion detection system,” in 2016 24th Signal Processing and Communication Application Conference (SIU). IEEE, 2016, pp. 1341–1344.
[3] H. Om and T. Hazra, “Statistical techniques in anomaly intrusion detection system,” International Journal of Advances in Engineering & Technology, vol. 5, no. 1, pp. 387–398, 2012.
[4] G. Zhao, C. Zhang, and L. Zheng, “Intrusion detection using deep belief network and probabilistic neural network,” in Computational Science and Engineering (CSE) and Embedded and Ubiquitous Computing (EUC), 2017 IEEE International Conference on, vol. 1. IEEE, 2017, pp. 639–642.
[5] T. Shibahara, T. Yagi, M. Akiyama, D. Chiba, and T. Yada, “Efficient dynamic malware analysis based on network behavior using deep learning,” in Global Communications Conference (GLOBECOM), 2016 IEEE. IEEE, 2016, pp. 1–7.
[6] J. Yan, D. Jin, C. W. Lee, and P. Liu, “A comparative study of offline deep learning based network intrusion detection,” in 2018 Tenth International Conference on Ubiquitous and Future Networks (ICUFN). IEEE, 2018, pp. 299–304.
[7] M.-J. Kang and J.-W. Kang, “A novel intrusion detection method using deep neural network for in-vehicle network security,” in Vehicular Technology Conference (VTC Spring), 2016 IEEE 83rd. IEEE, 2016, pp. 1–5.
[8] S. Potluri and C. Diedrich, “Accelerated deep neural networks for enhanced intrusion detection system,” in Emerging Technologies and Factory Automation (ETFA), 2016 IEEE 21st International Conference on. IEEE, 2016, pp. 1–8.
[9] O. Kaynar, A. G. Yüksek, Y. Görmez, and Y. E. Işık, “Intrusion detection with autoencoder based deep learning machine,” in Signal Processing and Communications Applications Conference (SIU), 2017 25th. IEEE, 2017, pp. 1–4.
[10] K. Alrawashdeh and C. Purdy, “Toward an online anomaly intrusion detection system based on deep learning,” in Machine Learning and Applications (ICMLA), 2016 15th IEEE International Conference on. IEEE, 2016, pp. 195–200.
[11] S. Ding and G. Wang, “Research on intrusion detection technology based on deep learning,” in Computer and Communications (ICCC), 2017 3rd IEEE International Conference on. IEEE, 2017, pp. 1474–1478.
[12] https://www.kisa.link/LDwM
