An Optimization Method For Intrusion Detection Classification Model Based On Deep Belief Network
Received April 28, 2019, accepted June 24, 2019, date of publication July 1, 2019, date of current version July 18, 2019.
Digital Object Identifier 10.1109/ACCESS.2019.2925828
ABSTRACT The rapid development and popularization of the network have brought many problems to
network security. Intrusion detection technology is often used as an effective security technology to protect
the network. The deep belief network (DBN), as a classic model of deep learning, has good classification
performance and is often used in the field of intrusion detection. However, the network structure of DBN
is generally set through practical experience. For the optimization problem of the DBN-based intrusion
detection classification model (DBN-IDS), this paper proposes a new joint optimization algorithm to
optimize the DBN’s network structure. First, we design a particle swarm optimization (PSO) based on the
adaptive inertia weight and learning factor. Second, we use the fish swarm behavior of cluster, foraging,
and other behaviors to optimize the PSO to find the initial optimization solution. Then, based on the initial
optimization solution, we use the genetic operators with self-adjusting crossover probability and mutation
probability to optimize the PSO to search the global optimization solution. Finally, the global optimization
solution constructed by the above-mentioned joint optimization algorithm is used as the network structure
of the intrusion detection classification model. The experimental results show that compared with other
DBN-IDS optimization algorithms, our algorithm shortens the average detection time by at least 24.69%
on the premise of increasing the average training time by 6.9%; compared with the tested classification
algorithms, our DBN-IDS improves the average classification accuracy by at least 1.3% and up to 14.80%
in the five-category classification, which is proved to be an efficient DBN-IDS optimization method.
INDEX TERMS Intrusion detection, deep belief network, particle swarm optimization, artificial fish swarm
algorithm, genetic algorithm.
VOLUME 7, 2019 This work is licensed under a Creative Commons Attribution 3.0 License. For more information, see http://creativecommons.org/licenses/by/3.0/ 87593
P. Wei et al.: Optimization Method for Intrusion Detection Classification Model Based on DBN
classification model can be easily reduced. When the DBN network structure is uncertain, it is difficult to optimize the parameters of the DBN model directly, and the computational cost of directly optimizing the DBN network structure is lower. Therefore, our algorithm adjusts the DBN model by optimizing the DBN network structure.
Few published studies address the DBN network structure optimization problem. In practice, researchers in the fields of image recognition and speech recognition usually set the network structure empirically [4]. At present, the methods that can be used for neural network structure or parameter optimization problems are mainly evolutionary computation [5], a swarm-oriented random search technique that simulates the biological evolution process in nature. However, researchers mainly apply this method to optimizing the structure and parameters of single-hidden-layer neural networks, and there is a lack of research on optimizing multi-hidden-layer neural network classification models.
The DBN model is often used as the intrusion detection classification model. In order to optimize the multi-hidden-layer neural network in the DBN model, we propose an algorithm that joins artificial fish swarm algorithm optimization PSO (AFSA-PSO) with genetic algorithm optimization PSO (GA-PSO). It is used to construct and optimize the DBN-IDS model, mainly by optimizing the basic network structure of the DBN. It is worth mentioning that the weight and threshold parameters in this paper are mainly generated automatically from the basic network structure of the DBN according to the DBN adaptive adjustment mechanism [6]. The contributions of this paper are summarized as follows.
• We propose a DBN-IDS model optimization method for optimizing the basic network structure of the DBN classification model with multiple hidden layers. First, the AFSA is used to optimize the PSO algorithm to obtain the initial optimal particles. Then, the initial optimal particles are used as the initial particle swarm of the GA optimization PSO algorithm. While maintaining the classification accuracy, the global optimal particle is finally obtained and used as the basic network structure of the DBN classification model.
• We study the performance of the DBN classification model optimized by our method in 2-category classification and multiclass classification. Moreover, we study the performance of other DBN model optimization algorithms and of some machine learning methods, such as Support Vector Machine (SVM), Random Forest and Naive Bayes, in 2-category classification and multiclass classification based on the benchmark NSL-KDD dataset.
• We compare the performance of the DBN classification model optimized by our method with other DBN model optimization algorithms and some machine learning methods in both 2-category classification and multiclass classification. The experimental results illustrate that our method is very suitable for intrusion detection: its performance is superior to that of other DBN model optimization algorithms and machine learning methods on the NSL-KDD dataset in both binary and multiclass classification. It improves the accuracy of intrusion detection and provides a new research method for intrusion detection.
The rest of the paper is organized as follows. In Section II, we review the related research in the field of intrusion detection, focusing on the application of deep learning methods in intrusion detection and the application of intelligent algorithms in optimizing intrusion detection classification models based on neural networks. In Section III, we propose a construction and optimization method for DBN-IDS. In Section IV, the superiority of our method is demonstrated by comparing and discussing the experimental results of this algorithm and other classification algorithms. Finally, the conclusions are discussed in Section V.

II. RELATED WORK
In prior studies, many machine learning methods have been used in intrusion detection. These methods are mainly divided into unsupervised learning, semi-supervised learning and supervised learning. Supervised learning methods, such as Support Vector Machine (SVM) and KNN algorithms, and unsupervised learning methods, such as clustering algorithms, have been widely used in intrusion detection. Related references have given comparative analyses of supervised and unsupervised learning techniques applied to intrusion detection. Literature [7] compares the classification performance of different machine learning methods, such as decision trees, SVM, and Bayesian methods, on the NSL-KDD datasets.
In recent years, deep learning, a branch of machine learning, has become increasingly popular and has been applied to intrusion detection; studies have reported that deep learning surpasses traditional methods. Many neural network classification models have also been applied to intrusion detection, such as Deep Neural Networks (DNN) and DBN, which have advantages in processing and learning the temporal or spatial characteristics of data. In a neural network, each hidden-layer node computes a weighted sum of the outputs of the nodes in the previous layer, and the result is passed to the next layer through a nonlinear transformation. The last layer of the network represents the learned representation of the input data. Literature [8] applies a DBN classifier to the intrusion detection model. Based on the KDD CUP99 dataset, it is found that DBN performs better than SVM and artificial neural network (ANN). Literature [9] proposes an intrusion detection model based on DBN and probabilistic neural network (PNN). The DBN model is used to reduce the dimension of the intrusion data, and PNN is used to classify the reduced data. The experimental results show that the classification performance of DBN-PNN is better than that of traditional PNN. Literature [10] proposes an intrusion detection optimization model based on deep stacking autoencoder (SAE). Literature [11] applies the semi-supervised learning method based on random weighted
\[ x_{id}^{t+1} = x_{id}^{t} + vstep_{id}^{t+1} \tag{8} \]
where t is the number of iterations; i = 1, 2, ···, m (m denotes the number of all particles); d = 1, 2, ···, n (n denotes the total number of dimensions of the particle); $rand_{i1}$ and $rand_{i2}$ are random numbers between 0 and 1; $pbest_{id}^{t}$ denotes the d-th dimensional optimal position of the historical extremum of the i-th particle in the t-th iteration; $gbest_{d}^{t}$ denotes the d-th dimensional optimal position of the swarm optimal particle in the t-th iteration; c1 and c2 denote the learning factors; $x_{id}^{t}$ denotes the d-th dimensional position of the i-th particle in the t-th iteration; $vstep_{id}^{t}$ denotes the d-th dimensional moving speed of the i-th particle in the t-th iteration.
Literature [17] studied four kinds of PSO inertia weights w, such as linear, concave and convex function reduction, and found that in multi-peak function optimization, the convex function reduction inertia weight method has the fastest convergence rate and the best effect. Therefore, the inertia weight in this paper takes the value shown in formula (9).
\[ w_i = (w_{max} - w_{min}) \cdot \left(1 - \frac{iter}{maxiter}\right)^{3} + w_{min} \tag{9} \]
where $w_{max}$ and $w_{min}$ denote the maximum and minimum values of the inertia weight, respectively; iter denotes the number of iterations; maxiter denotes the maximum number of iterations.
c1 embodies the ability of particles to learn from themselves, and c2 embodies the ability of particles to learn from the swarm. The decrease of c1 and the increase of c2 are beneficial to the initial exploration of particles and the cognitive ability of the swarm in the later stage. Therefore, the learning factors c1 and c2 are assigned as in formulas (10) and (11), respectively.
\[ c_1 = c_3 - c_4 \cdot \left(\frac{iter}{maxiter}\right) \tag{10} \]
\[ c_2 = c_4 + c_3 \cdot \left(\frac{iter}{maxiter}\right) \tag{11} \]
where c3 and c4 are constants, and c3 > c4.

B. AFSA OPTIMIZATION PSO ALGORITHM
The artificial fish swarm algorithm is an optimization strategy based on animal autonomy [18]. The basic idea is that the place where the most fish gather is also the richest in food. Artificial fish are constructed according to this characteristic; they simulate the cluster, rear-end and foraging behaviors of a fish swarm [19] and achieve optimization by moving through virtual waters.
In this paper, the fish swarm idea is used to optimize the PSO to find the initial optimal particles. The cluster, rear-end and foraging behaviors in the AFSA are used to improve the moving mode of the PSO algorithm. The basic flow of the particle swarm algorithm based on the fish swarm idea is given in Algorithm 1.
The elements of $x_i^{t}$ and $x_i^{t+1}$ are integers ranging from smin to smax. The fitness is solved by formula (1).

Algorithm 1 AFSA Optimization PSO Algorithm
Input: Field of view: Visual
       Moving step: Step
       Number of trials of foraging behavior: trytimes
       Maximum fitness: maxfit
       Number of iterations: iter = 0
       The maximum number of iterations of AFSA-PSO: fmaxiter
Output: Initial optimization solution set temppbest
Step1: Parameter initialization
1. Randomly initialize the particle swarm swarm according to the n value.
2. Solve the swarm fitness fswarm.
3. Initialize pbest to swarm.
4. Initialize gbest to the optimal fitness particle in the swarm.
5. Solve the gbest fitness fgbest.
6. Initialize the extremum record matrix tempbest_i of the i-th particle pbest and its fitness matrix ftempbest_i.
Step2: End condition judgment
7. if the current iteration number iter ≤ fmaxiter or fgbest ≤ maxfit
       Jump to Step3.
   else
       Jump to Step7.
   endif
Step3: Particles perform cluster behavior
8. Detect the neighborhood particle center x_j.
9. if Fitness_i > Fitness_j
       Go to the center position x_j as in formula (12), and skip to Step2.
   else
       Skip to Step4.
   endif
Step4: Particles perform foraging behavior
10. while the number of trials has not reached trytimes, do
        A particle x_k is randomly selected within its Visual, and x_k is as in formula (13).
        if Fitness_i > Fitness_k
            Go further in this direction, as in formula (14), and skip to Step2.
        else
            Re-randomly select the state x_k.
        endif
        if x_k satisfies the forward condition
            break;
        endif
    end while
Step5: Particles perform random behavior
11. Randomly select a new particle in the field of view.
12. Move toward the particle as a default behavior, as in formula (15).
Step6: Update and record particle swarm related properties
Algorithm 1 (Continued.) AFSA Optimization PSO Algorithm
13. Update the particle inertia weight and the learning factors c1 and c2 as in formulas (9), (10), and (11), and update the particle swarm's velocity and position, pbest, gbest, fpbest, and fgbest.
14. Record each particle's pbest to temppbest, let iter = iter+1, and jump to Step2.
Step7: Get global optimization solution
15. Use temppbest as an initial optimization solution set

The PSO is optimized by the cluster and foraging behaviors of the fish swarm. The cluster behavior moves the particles closer to the center of the neighboring particles, which has better fitness and no crowding [20]. This enriches the movement modes of the particle swarm and helps to overcome the problem that the particle swarm easily falls into a local optimal solution. The original movement behavior of the particle swarm is similar to the rear-end behavior of the fish swarm. Since the historical extreme value of each single particle is recorded and the implementation is simple, the cluster behavior can find a better optimization solution than the rear-end behavior.
Although the AFSA optimization PSO (AFSA-PSO) can find the initial optimization particles, it is difficult for it to achieve further optimization. Starting from the initial optimization particles, this paper introduces the GA optimization PSO (GA-PSO) algorithm to find the global optimization particles.
\[ x_i^{t+1} = x_i^{t} + (x_j - x_i^{t}) \cdot rand \tag{12} \]
where the function rand generates a random number ranging from 0 to 1.
\[ x_k = x_i^{t} + 2 \cdot (rand - 0.5) \cdot Visual \tag{13} \]
where Visual denotes the fish's field of vision.
\[ x_i^{t+1} = x_i^{t} + 2 \cdot (rand - 0.5) \cdot Step \cdot (x_k - x_i^{t}) \tag{14} \]
where Step denotes the step size of the fish.
\[ x_i^{t+1} = x_i^{t} + 2 \cdot (rand - 0.5) \cdot Step \tag{15} \]

C. GA OPTIMIZATION PSO ALGORITHM
The genetic algorithm is a randomized search method that finds the optimal solution by simulating the natural evolution process. The crossover and mutation operators of the genetic algorithm have a great effect on the PSO algorithm: they increase the diversity of the particle swarm, produce a swarm representing a new solution set, and overcome the problem that the particle swarm easily falls into a local optimum [21].
In our method, in order to avoid premature convergence of the particle swarm, the particle swarm algorithm crosses multiple pairs of particles at each iteration, using the roulette algorithm based on the fitness value to select the pairs of parent particles to be crossed [22]. Multiple pairs of next-generation particles are produced, and the superior next-generation particles replace the parent particles and historical extrema in the swarm.
The crossover operation mainly exchanges the dimensions of the first parent particle between 1 and $\lfloor L/2 \rfloor$ with the dimensions of the second parent particle between $L - \lfloor L/2 \rfloor + 1$ and L, where L denotes the total particle dimension. The number of cross-operating particles per iteration is shown in formula (16).
\[ P_i = \left\lfloor swarmsize - \left(\frac{iter}{maxiter}\right)^{2} \cdot (swarmsize - 2) \right\rfloor \tag{16} \]
where swarmsize denotes the number of all particles.
According to the characteristics of the particle swarm, the particles are more random and diverse at first; in the later stage, the swarm gradually converges and the similarity between particles increases. The cross-operation is therefore more effective in the early stage than in the later stage. Accordingly, as the number of iterations increases, the number of cross-operation particles per iteration should decrease, and the crossover probability should decrease as well, which helps to reduce the computational cost. The crossover probability is updated as in formula (17).
\[ p_{cross} = (pc_{max} - pc_{min}) \cdot \left(1 - \frac{iter}{maxiter}\right) + pc_{min} \tag{17} \]
where $pc_{min}$ and $pc_{max}$ denote the minimum and maximum crossover probabilities, respectively.
Although the crossover operation explores the search space well at the beginning, it has little effect after the particle swarm has converged in the late stage [19]. The mutation operation is just the opposite: in the initial stage of the search its effect is not obvious, but in the later stage its search effect is better. Therefore, the mutation operation is introduced to further improve the spatial search ability, and the mutation probability increases as the number of iterations increases. This helps to find better particles. The mutation probability is updated as in formula (18); the selection probability of the d-th position in the particle is shown in formula (19).
\[ p_{mutation} = (pm_{max} - pm_{min}) \cdot \frac{iter}{maxiter} + pm_{min} \tag{18} \]
\[ CR_d = \begin{cases} \dfrac{(gbest_d - \theta)^{2}}{1 - \theta}, & \text{if } \theta < gbest_d \le 1 \\[2ex] \dfrac{(\theta - gbest_d)^{2}}{\theta}, & \text{if } 0 \le gbest_d \le \theta \end{cases} \tag{19} \]
where $pm_{min}$ and $pm_{max}$ denote the minimum and maximum mutation probabilities; $gbest_d$ denotes the d-th dimensional position of the global optimal particle; θ denotes the selection threshold. When the value of the d-th dimensional position of gbest is larger than the threshold θ, the larger the value of $gbest_d$, the larger $CR_d$. When the value of the d-th dimensional position of gbest is smaller than the threshold θ, the smaller the value of $gbest_d$, the larger $CR_d$. Then a random number r ranging from 0 to 1 is generated. If r > $CR_d$, perform the corresponding mutation operation, as in formula (20),
otherwise $gbest_d$ is assigned to the d-th dimensional position of the new particle.
\[ child_d = \begin{cases} \theta + (1 - \theta) \cdot \dfrac{gbest_d}{\theta}, & \text{if } 0 \le gbest_d \le \theta \\[2ex] \theta - \dfrac{(1 - gbest_d)}{1 - \theta} \cdot \theta, & \text{if } \theta < gbest_d \le 1 \end{cases} \tag{20} \]
where, if the value of $gbest_d$ is less than or equal to the selection threshold θ, the value of $child_d$ is mapped to a real number ranging from θ to 1; otherwise the value of $child_d$ is mapped to a real number ranging from 0 to θ, which is beneficial to changing the selection properties of the feature.
The basic flow of our GA-PSO algorithm is shown in Algorithm 2. It can find a better global optimization solution based on the initial solution obtained by Algorithm 1.

Algorithm 2 GA Optimization PSO Algorithm
Input: Particle swarm: swarm2
       Maximum fitness: maxfit
       Number of iterations: iter2 = 0
       Minimum and maximum values of crossover probability: pcmin, pcmax
       Minimum and maximum values of mutation probability: pmmin, pmmax
       The maximum number of iterations of GA-PSO: gmaxiter
Output: Global optimization extreme value gbest
Step1: Parameter initialization
1. The characteristics of the intrusion detection data set are assigned to different real numbers ranging from 0 to 1, which constitute the particle position vector.
2. Randomly initialize the particle swarm position and velocity, and solve the fitness.
3. Initialize the historical extremum pbest of each particle in swarm2, the optimal solution gbest of swarm2, and their fitness values fswarm2, fpbest, fgbest.
Step2: End condition judgment
4. if iter2 > gmaxiter or fgbest > maxfit is satisfied
       Execute Step6.
   else
       Execute Step3.
   endif
Step3: The specific implementation steps
5. The inertia weight w and the learning factors c1 and c2 are updated as in formulas (9), (10), and (11), and the crossover and mutation probabilities are updated as in formulas (17) and (18), respectively.
6. Perform cross-operations on particle pairs in swarm2.
7. Perform random out-of-order operations on pbest.
8. Perform mutation operations on gbest to obtain respective descendant particles.
9. Replace the parent particles with the better descendant particles.
Step4: Update the extreme values of particles
10. Update the corresponding pbest, gbest and their fitness separately.
Step5: Update particle speed and position
11. Update the speed and position of the particle swarm, let iter2 = iter2 + 1, and then skip to Step2.
Step6: Get global optimization solution
12. Use gbest as a global optimization solution

D. AFSA-GA-PSO ALGORITHM APPLIED TO OPTIMIZE THE DBN-IDS MODEL
In summary, this paper proposes a hybrid particle swarm optimization algorithm based on AFSA and GA (AFSA-GA-PSO) and applies it to optimize the DBN-IDS model. The algorithm first optimizes the particles by using the AFSA optimization PSO algorithm, and then uses the GA optimization PSO algorithm to search for the global optimization solution starting from the initial optimal particles. The particle swarm size is n2, and the number of non-zero elements in a particle is consistent with the number of hidden layers. In our algorithm, a particle represents a network structure with length hidden layers, and the i-th particle is $x_i = [num(h_{i1}), num(h_{i2}), \cdots, num(h_{i\,length})]$. The particle velocity is $vstep_i = [vs_{i1}, vs_{i2}, \cdots, vs_{i\,length}]$, and its length depends on the dimensions of the initial optimal particle. The basic flow of the algorithm is as follows:
In Algorithm 3, the purpose of performing the random out-of-order operation on pbest is to randomly disturb the order of all particle historical extremum positions in order to find better individual particle extrema, which helps to prevent the individual extrema from falling too quickly into a local optimal solution.
The AFSA optimization PSO algorithm mimics the flight process of a swarm of birds, and uses the rear-end, foraging and cluster behaviors of the AFSA to enrich and optimize the flight behavior of the swarm, which is beneficial to finding a better initial optimization solution. The initial optimization solution is used as the initial solution of the GA optimization PSO algorithm. The genetic operators in the GA optimization PSO algorithm can accelerate the search process of the particle swarm through a reasonable genetic search mechanism.
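The iteration-dependent schedules of formulas (9)-(11) and (16)-(18), and the per-dimension mutation of formulas (19) and (20), can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, the cubic and quadratic exponents in formulas (9) and (16) and the placement of the squares in formula (19) follow one reading of the typeset formulas, the defaults c3 = 1.5 and c4 = 1.0 are assumed values, and θ is assumed to lie strictly between 0 and 1.

```python
import math


def inertia_weight(it, max_iter, w_max=0.9, w_min=0.4):
    """Formula (9): inertia weight decaying from w_max to w_min."""
    return (w_max - w_min) * (1 - it / max_iter) ** 3 + w_min


def learning_factors(it, max_iter, c3=1.5, c4=1.0):
    """Formulas (10), (11): c1 decreases and c2 increases with it.

    The defaults for c3 and c4 are assumptions; the text only requires c3 > c4.
    """
    frac = it / max_iter
    return c3 - c4 * frac, c4 + c3 * frac


def crossover_count(it, max_iter, swarm_size):
    """Formula (16): number of particles taking part in crossover."""
    return math.floor(swarm_size - (it / max_iter) ** 2 * (swarm_size - 2))


def crossover_probability(it, max_iter, pc_max, pc_min):
    """Formula (17): linear decrease from pc_max to pc_min."""
    return (pc_max - pc_min) * (1 - it / max_iter) + pc_min


def mutation_probability(it, max_iter, pm_max, pm_min):
    """Formula (18): linear increase from pm_min to pm_max."""
    return (pm_max - pm_min) * (it / max_iter) + pm_min


def selection_probability(gbest_d, theta):
    """Formula (19): grows as gbest_d moves away from the threshold theta."""
    if gbest_d > theta:
        return (gbest_d - theta) ** 2 / (1 - theta)
    return (theta - gbest_d) ** 2 / theta


def mutate_dimension(gbest_d, theta, r):
    """Formula (20), applied when the random draw r exceeds CR_d.

    r is a uniform random number in [0, 1] supplied by the caller.
    """
    if r <= selection_probability(gbest_d, theta):
        return gbest_d  # no mutation: keep the value of gbest_d
    if gbest_d <= theta:
        return theta + (1 - theta) * gbest_d / theta  # mapped into [theta, 1]
    return theta - (1 - gbest_d) / (1 - theta) * theta  # mapped into [0, theta]


if __name__ == "__main__":
    for it in (0, 20, 40):
        c1, c2 = learning_factors(it, 40)
        print(it, round(inertia_weight(it, 40), 3), c1, c2,
              crossover_count(it, 40, swarm_size=20))
```

With these definitions, a value on one side of θ is reflected to the other side when mutation fires, which matches the description that follows formula (20).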

IV. EXPERIMENT RESULTS AND DISCUSSION
In this research, the experiment is performed on a Lenovo Tianyi 510 personal computer with an Intel(R) Core(TM) i5-8400 CPU @ 2.80 GHz and 8 GB of memory. This chapter is divided into four parts. First, Part A introduces the NSL-KDD dataset used in the experiment. Second, Part B introduces the data preprocessing process. Third, Part C introduces the experimental parameter settings. Last, Part D designs the experiments and discusses the results.
The method of Section III is used to construct and optimize the intrusion detection classification model. By optimizing the basic network structure of DBN, the DBN classification model is constructed and applied to IDS, and good experimental results are obtained. Two experiments have been designed to study the performance of the DBN-IDS model in 2-category classification (normal, abnormal) and multiclass classification. In order to compare with other DBN model optimization algorithms and machine learning methods, a comparative experiment is designed. In the 2-category classification experiment, we compare our algorithm with four DBN-IDS model optimization algorithms, namely CMPSO [23], PSO [24], and our two branching algorithms, AFSA optimization PSO (AFSA-PSO) and GA optimization PSO (GA-PSO), and with machine learning methods such as Naive Bayes, Random Forest, and SVM.
In the same way, we analyze the multi-classification of the DBN-IDS model based on the NSL-KDD dataset. By contrast, we study the performance of five DBN-IDS model optimization algorithms, Naive Bayes, Random Forest, and Support Vector Machine methods in the

A. DATASET DESCRIPTION
The NSL-KDD data set is based on KDD CUP99, the standard data set of the intrusion detection domain [25]. It solves the problems of redundant features and duplicate records of the traditional KDD CUP99 data set. It has been widely used in the field of network intrusion detection. Each network connection in the NSL-KDD data set is marked as normal or abnormal. The abnormal types are divided into Probe (scan and probe), DoS (denial of service attack), U2R (illegal access to local superuser), and R2L (unauthorized remote access), including a total of 39 types of attacks. The NSL-KDD sample data set selected is shown in Table 2, and its data characteristics are described in Table 3.
In Table 2, KDDTrain+ is the training data set, KDDTest+ and KDDTest-21 are test data sets, and KDDTest-21 is a subset of KDDTest+ that is more difficult to classify. In Table 3, feature dimensions (2), (3), (4), (7), (12), (21), (22) are discrete features, and the rest are continuous features. The test set lacks some of the attack types in the training set, which helps to better simulate the real environment of intrusion detection.

TABLE 4. Five optimization algorithms common parameter settings.
B. DATA PREPROCESSING
The input layer neurons of the deep belief network require floating point inputs ranging from 0 to 1. The data are preprocessed in two stages, as follows:
1) NUMERICALIZATION
Using the code mapping method, each symbolic feature is mapped to a numeric code vector. For example, the protocol_type feature is mapped as tcp = [1, 0, 0], udp = [0, 1, 0], icmp = [0, 0, 1]. Likewise, the symbolic feature "service" with 53 different values and the "flag" feature with 10 different values are mapped in the same way. Finally, the 41 features are digitized into 104 features.
2) NORMALIZATION
In order to eliminate the dimensional impact of each feature, the training set and test set must be normalized. According to the data transformation formula (21), each value in the data set obtained in the first stage is normalized to the range 0 to 1.
\[ Y = \frac{Y_{original} - Y_{min}}{Y_{max} - Y_{min}} \tag{21} \]
where $Y_{original}$ is the original value of the y feature, $Y_{min}$ is the minimum value of the y feature, and $Y_{max}$ is the maximum value of the y feature.

TABLE 5. Optimal network structure of DBN model obtained by five optimization algorithms.
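The two preprocessing stages of Part B, numericalization and normalization by formula (21), can be sketched as follows. This is a minimal illustration with hypothetical helper names and toy records rather than the pipeline used in the experiments; mapping a constant column to 0.0 is an assumption, since formula (21) is undefined when the maximum equals the minimum.

```python
def one_hot(value, categories):
    """Numericalization: map a symbolic value to a 0/1 indicator vector."""
    return [1 if value == c else 0 for c in categories]


def min_max_normalize(column):
    """Normalization, formula (21): rescale one feature column to [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Assumption: a constant feature carries no information, map it to 0.0.
        return [0.0 for _ in column]
    return [(y - lo) / (hi - lo) for y in column]


PROTOCOLS = ["tcp", "udp", "icmp"]

# Toy records (hypothetical values): (protocol_type, duration, src_bytes)
records = [("tcp", 0, 200), ("udp", 5, 100), ("icmp", 10, 300)]

# Stage 1: expand the symbolic column, keep the numeric columns as they are.
encoded = [one_hot(p, PROTOCOLS) + [d, b] for p, d, b in records]

# Stage 2: normalize every column independently.
columns = [min_max_normalize(list(col)) for col in zip(*encoded)]
normalized = [list(row) for row in zip(*columns)]

# Dimension check from the text: 38 numeric features plus 3 + 53 + 10 codes.
expanded_dim = (41 - 3) + 3 + 53 + 10

if __name__ == "__main__":
    for row in normalized:
        print(row)
    print(expanded_dim)
```

The dimension check reproduces the count in the text: the three symbolic features are replaced by 3 + 53 + 10 indicator columns, so 41 features become 104.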
C. PARAMETER SETTINGS
The common parameters of the five optimization algorithms for optimizing the basic network structure are shown in Table 4. The non-common parameters of our algorithm are set as follows: inertia weight wmax = 0.9, wmin = 0.4; learning factor c1 = 1.5, c2 = 1.0; number of trials of foraging behavior Trytimes = 5; field of view Visual = 150; moving step Step = 5; number of iterations fmaxiter = 60, gmaxiter = 40. Setting fmaxiter slightly larger than gmaxiter is more conducive to optimizing the initial solution and does not affect the search for global optimization solutions. Moreover, we set the number of iterations for the other four optimization algorithms to 100.

D. EXPERIMENT RESULTS AND DISCUSSION
1) BINARY CLASSIFICATION
In our experiment, we have mapped the 41-dimensional features into 104-dimensional features, thus the DBN-IDS model has 104 input nodes and 2 output nodes in the binary (2-category) classification experiments. The number of epochs is set to 2 for the KDDTrain+ dataset.
The DBN's basic network structure optimized by the five optimization algorithms in the experiment is shown in Table 5. From the table, we find that, compared with the other optimization algorithms, the optimal network structure of the DBN model obtained by AFSA-GA-PSO and its branch AFSA-PSO algorithm has only four layers, and the total number of hidden layer nodes is the least, which is beneficial to improving detection speed in intrusion detection.
The confusion matrix of our algorithm applied to DBN-IDS on the testing set KDDTest+ in the 2-category classification experiments is shown in Table 6. As can be seen from Table 6, TN is 9763, TP is 9402, FN is 1915, and FP is 1773, which means that our algorithm achieves a classification accuracy of 83.86% on the KDDTest+ dataset.

TABLE 6. Confusion matrix of 2-category classification on KDDTest+.
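The reported accuracy can be checked directly from the four confusion-matrix counts quoted from Table 6. The sketch below uses the standard definition of accuracy; the detection rate and false alarm rate are added only as common companion metrics with their usual definitions, are not values reported in this passage, and the function names are hypothetical.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all records that are classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def detection_rate(tp, fn):
    """Fraction of abnormal records that are detected (recall)."""
    return tp / (tp + fn)


def false_alarm_rate(fp, tn):
    """Fraction of normal records that are flagged as abnormal."""
    return fp / (fp + tn)


# Counts quoted in the text for the 2-category experiment on KDDTest+.
TN, TP, FN, FP = 9763, 9402, 1915, 1773

acc = accuracy(TP, TN, FP, FN)

if __name__ == "__main__":
    print(f"accuracy = {acc:.2%}")  # accuracy = 83.86%
    print(f"detection rate = {detection_rate(TP, FN):.2%}")
    print(f"false alarm rate = {false_alarm_rate(FP, TN):.2%}")
```

The computed accuracy, (9402 + 9763) / 22853, agrees with the 83.86% stated above.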
Literature [24] has shown the results obtained by Naive Bayes, Random Forest, and Support Vector Machine. Table 7 shows the comparison of the average classification accuracy of the DBN-IDS model optimized by the optimization algorithms and that of other classification algorithms. After calculation, it is found that, compared with other classification algorithms, our algorithm improves the average classification accuracy by at least 1.95% and by up to 15.18%.
We can find that, compared with other classification algorithms, the average classification accuracy of our algorithm is significantly higher, as shown in Fig. 3. The DBN model network structure obtained by our algorithm is superior to those of other optimization algorithms and has a lower fitness value, which corresponds to a higher average classification accuracy. Among the optimization algorithms, the classification accuracy obtained by the AFSA-PSO algorithm is at a high level, and that obtained by the GA-PSO algorithm is at a medium level. By combining the above two algorithms to obtain our algorithm, it is found in the experimental test that the classification accuracy of DBN-IDS can be effectively improved. This shows that, within our algorithm, the AFSA-PSO algorithm itself has superior performance but has certain limitations and cannot find a better optimization solution, while the performance of the GA-PSO algorithm is average. However, in the case that the pre-algorithm obtains

2) MULTICLASS CLASSIFICATION
In the 5-category classification experiments, we have mapped the 41-dimensional features into 104-dimensional features, thus the DBN-IDS model has 104 input nodes and 5 output nodes. Table 8 shows the optimal network structure of DBN optimized by the five optimization algorithms in the experiment. The experimental results are similar to those of the 2-category classification and differ only in the number of hidden layer nodes per layer.
In order to compare the performance of different DBN-IDS model optimization algorithms on the benchmark dataset for the multiclass classification experiments, five DBN-IDS optimization algorithms are used to train models on the training set (using 10-fold cross-validation) [26]. We then apply the models to the testing set. The results are described in Table 9. It shows the average training time and the average detection time of the DBN-IDS optimized by our algorithm. After calculation, it is found that, compared with other DBN-IDS model optimization algorithms, our algorithm increases the average training time by only 6.9%, but shortens the average detection time by 24.69%.

TABLE 9. Average training time and average detection time obtained by five optimization algorithms.

We can find that, on the different data sets, the training time of our algorithm is higher than that of other algorithms by a small percentage, while its detection time is lower than that of other algorithms by a larger percentage, as shown in Fig. 5 and Fig. 6. This shows that although the time complexity of our algorithm has increased slightly, it can effectively find a better DBN-IDS model. Among the algorithms, the AFSA-PSO algorithm has the longest training time, but its detection time is longer only than that of our algorithm; the training time and detection time obtained by GA optimization PSO are at a medium level among the optimization algorithms. Our algorithm combines the above two algorithms; although this comes at the cost of increased training time, it is beneficial to obtaining the optimal DBN-IDS with a shorter detection time.
In order to compare the performance of different classification algorithms, based on the above experiments, we apply classic machine learning algorithms, such as Naive Bayes, Random Forest, and Support Vector Machine, to train classification models on the training set. We then apply the models to the testing set. The results are described in Table 10. It shows the average accuracy of the classification algorithms. Compared with the 2-category classification, the accuracy of the classification algorithms declines in the 5-category classification. After calculation, it is found that, compared with other classification algorithms, our algorithm improves the average classification accuracy by at least 1.30% and by up to 14.80%.

TABLE 10. Average classification accuracy of all the classification algorithms in 5-category classification.

We can find that the experimental results are similar to those of the 2-category classification, as shown in Fig. 7. The only difference is that the overall classification accuracy of the classification algorithms in the 5-category experiment is lower than in the 2-category classification experiment.
The confusion matrix of the AFSA-GA-PSO applied to the DBN-IDS model on the test set KDDTest+ in the 5-category classification experiments is shown in Table 11. The experiment shows that the accuracy of our model is 82.36% for KDDTest+ and 66.25% for KDDTest-21, which is better than those obtained using Naive Bayes, Random Forest, Support
Vector Machine, and the other classification models. The detection rate and false positive rate of the different attack types are shown in Table 12.

TABLE 12. Multi-class evaluation matrix.

V. CONCLUSION
DBN is a deep learning model widely used in speech recognition, image recognition, and other fields. Its good classification performance makes it suitable for intrusion detection. This paper proposes a DBN-IDS model construction and optimization method. The PSO algorithm based on fish swarm behavior is used to find the initial optimization solution, which is then used as the initial particle swarm of the GA-optimized PSO algorithm. Finally, the DBN-IDS model is optimized by our algorithm. The experimental results show that the DBN-IDS model optimized by our algorithm has higher detection speed and detection accuracy. However, the proposed method optimizes the DBN network structure only within a limited range of hidden layers. If the maximum number of hidden layers in the DBN is set too large, the training time will have a greater impact on the fitness, which is not conducive to finding a well-functioning DBN-IDS model. The next step is to explore a more appropriate fitness function that removes the excessive impact of training time on fitness. Then, based on the DBN network structure optimized by our algorithm, the weights or threshold parameters of the network can be optimized to further improve the DBN-IDS model.

REFERENCES
[1] M. Fatahi, M. Shahsavari, M. Ahmadi, A. Ahmadi, P. Boulet, and P. Devienne, “Rate-coded DBN: An online strategy for spike-based deep belief networks,” Biologically Inspired Cogn. Archit., vol. 24, pp. 59–69, Apr. 2014.
[2] L. Yu, R. Zhou, T. Ling, and R. Chen, “A DBN-based resampling SVM ensemble learning paradigm for credit classification with imbalanced data,” Appl. Soft Comput., vol. 69, pp. 192–202, Aug. 2018.
[3] H. G. Han, W. Lu, Y. Hou, and J. F. Qiao, “An adaptive-PSO-based self-organizing RBF neural network,” IEEE Trans. Neural Netw. Learn. Syst., vol. 29, no. 1, pp. 104–117, Jan. 2018.
[4] F. C. Chen and M. R. Jahanshahi, “NB-CNN: Deep learning-based crack detection using convolutional neural network and Naïve Bayes data fusion,” IEEE Trans. Ind. Electron., vol. 65, no. 5, pp. 4392–4400, May 2018.
[5] J. Tian, M. Li, F. Chen, and N. Feng, “Learning subspace-based RBFNN using coevolutionary algorithm for complex classification tasks,” IEEE Trans. Neural Netw. Learn. Syst., vol. 27, no. 1, pp. 47–61, Jan. 2016.
[6] D. Papamartzivanos, F. G. Mármol, and G. Kambourakis, “Introducing deep learning self-adaptive misuse network intrusion detection systems,” IEEE Access, vol. 7, pp. 13546–13560, 2019.
[7] D. K. Bhattacharyya and J. K. Kalita, Network Anomaly Detection: A Machine Learning Perspective, vol. 45. Boca Raton, FL, USA: CRC Press, 2013, pp. 455–463.
[8] N. Gao, L. Gao, Q. Gao, and H. Wang, “An intrusion detection model based on deep belief networks,” in Proc. 2nd Int. Conf. Adv. Cloud Big Data, Nov. 2014, pp. 247–252.
[9] G. Zhao, C. Zhang, and L. Zheng, “Intrusion detection using deep belief network and probabilistic neural network,” in Proc. IEEE Int. Conf. Comput. Sci. Eng., Jul. 2017, pp. 639–642.
[10] Y. W. Zhan, “The applications of deep learning on traffic identification,” in Proc. Blackhat, 2015, pp. 1411–1420.
[11] R. A. R. Ashfaq, X.-Z. Wang, J. Z. Huang, H. Abbas, and Y.-L. He, “Fuzziness based semi-supervised learning approach for intrusion detection system,” Inf. Sci., vol. 378, pp. 484–497, Feb. 2017.
[12] P. Q. Bao and F. M. Yang, “Intrusion detection based on KCPA and SVM,” Comput. Appl. Softw., vol. 2, no. 23, pp. 125–127, 2006.
[13] Z. Lin, G. Chen, W. Guo, and Y. Liu, “PSO-BPNN-based prediction of network security situation,” in Proc. 3rd Int. Conf. Innov. Comput. Inf. Control, Jun. 2008, p. 37.
[14] M. Al-Qatf, Y. Lasheng, M. Al-Habib, and K. Al-Sabahi, “Deep learning approach combining sparse autoencoder with SVM for network intrusion detection,” IEEE Access, vol. 6, pp. 52843–52856, 2018.
[15] Z. W. Wei and Y. Bin, “Research on optimal strategy of multi-stage intrusion detection game in WSNs,” J. Electron. Inf. Technol., vol. 40, no. 1, pp. 63–71, 2018.
[16] W. Wang, Y. Sheng, J. Wang, X. Zeng, X. Ye, Y. Huang, and M. Zhu, “HAST-IDS: Learning hierarchical spatial-temporal features using deep neural networks to improve intrusion detection,” IEEE Access, vol. 6, pp. 1792–1806, 2018.
[17] Y. Liu, X. F. Tian, and Z. H. Zhan, “Research on inertia weight control method based on particle swarm optimization algorithm,” J. Nanjing Univ. Natural Sci. Ed., vol. 47, no. 4, pp. 364–371, 2011.
[18] G. Zhou, Y. Li, Y.-X. He, X. Wang, and M. Yu, “Artificial fish swarm based power allocation algorithm for MIMO-OFDM relay underwater acoustic communication,” IET Commun., vol. 12, no. 9, pp. 1079–1085, May 2018.
[19] L. W. Zhan, C. Z. Yi, K. X. Ying, Y. Y. Hai, X. An, and L. You, “Approach to WTA in air combat using IAFSA-IHS algorithm,” J. Syst. Eng. Electron., vol. 29, no. 3, pp. 519–529, Jun. 2018.
[20] Z. Qiang, W. Q. Hui, and L. W. Hong, “VNE-AFS: Network virtualization mapping algorithm based on artificial fish stocks,” Trans. Commun., vol. 1, pp. 170–177, Aug. 2012.
[21] Y. Song, F. Wang, and X. Chen, “An improved genetic algorithm for numerical function optimization,” Appl. Intell., vol. 49, no. 5, pp. 1880–1902, Dec. 2018.
[22] D. Gong, J. Sun, and Z. Miao, “A set-based genetic algorithm for interval many-objective optimization problems,” IEEE Trans. Evol. Comput., vol. 22, no. 1, pp. 47–60, Feb. 2018.
[23] H. B. Nguyen, X. Bing, P. Andreae, and M. Zhang, “Particle Swarm Optimisation with genetic operators for feature selection,” in Proc. IEEE Congr. Evol. Comput. (CEC), Jun. 2017, pp. 286–293.
[24] H.-M. Feng, “Self-generation RBFNs using evolutional PSO learning,” Neurocomputing, vol. 70, nos. 1–3, pp. 241–251, Dec. 2006.
[25] Bay and Hettich. (2018). The UCI KDD Archive, Department of Information and Computer Science. [Online]. Available: http://kdd.ics.uci.edu
[26] (2016). Weka 3-Data Mining With Open Source Machine Learning Software in Java. [Online]. Available: http://www.cs.waikato.ac.nz/ml/weka/

PENG WEI was born in 1994. He is currently pursuing the master’s degree with the National Digital Switching System Engineering and Technology Research Center, Zhengzhou, China. His research interest includes new-generation information communication networks.

YUFENG LI was born in 1976. He is currently a Professor with the School of Computer Engineering and Science, Shanghai University. His current research interests include broadband information networks and high-speed router core technology.
ZHEN ZHANG was born in 1985. He is currently a Lecturer with the National Digital Switching System Engineering Technology Research Center. His current research interests include network measurement and network management.

TAO HU received the B.E. degree from Xi’an Jiaotong University. He is currently pursuing the Ph.D. degree in network cyber security with the National Digital Switching System Engineering and Technological Research Center, Zhengzhou, China. His research interests include software-defined networking, control plane, and network security.

ZIYONG LI was born in 1995. He received the B.E. degree from the Harbin Institute of Technology, in 2017. He is currently pursuing the master’s degree with the National Digital Switching System Engineering and Technology Research Center, Zhengzhou, China. His research interests include new-generation information communication networks, software-defined networking, and network security.

DIYANG LIU was born in 1995. He received the B.E. degree from the Beijing Institute of Technology, in 2018. He is currently pursuing the master’s degree with the National Digital Switching System Engineering and Technology Research Center, Zhengzhou, China. His research interests include robustness of complex networks, software-defined networking, and network security.