Journal of Experimental & Theoretical Artificial Intelligence


This article was downloaded by: [Biblioteca Universidad Complutense de Madrid]

On: 24 September 2012, At: 08:15


Publisher: Taylor & Francis
Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered
office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK

Journal of Experimental & Theoretical Artificial Intelligence
Publication details, including instructions for authors and
subscription information:
http://www.tandfonline.com/loi/teta20

An intrusion detection approach based on multiple rough classifiers integration
Lin Feng (a)
(a) College of Computer Science, Sichuan Normal University, 8-6#, No. 1819, 2nd Segment of ChengLong Avenue, Chengdu 610101, China

Version of record first published: 07 Apr 2011.

To cite this article: Lin Feng (2011): An intrusion detection approach based on multiple rough
classifiers integration, Journal of Experimental & Theoretical Artificial Intelligence, 23:2, 223-231

To link to this article: http://dx.doi.org/10.1080/0952813X.2010.545998

Full terms and conditions of use: http://www.tandfonline.com/page/terms-and-conditions

This article may be used for research, teaching, and private study purposes. Any
substantial or systematic reproduction, redistribution, reselling, loan, sub-licensing,
systematic supply, or distribution in any form to anyone is expressly forbidden.

The publisher does not give any warranty express or implied or make any representation
that the contents will be complete or accurate or up to date. The accuracy of any
instructions, formulae, and drug doses should be independently verified with primary
sources. The publisher shall not be liable for any loss, actions, claims, proceedings,
demand, or costs or damages whatsoever or howsoever caused arising directly or
indirectly in connection with or arising out of the use of this material.
Journal of Experimental & Theoretical Artificial Intelligence
Vol. 23, No. 2, June 2011, 223–231

An intrusion detection approach based on multiple rough classifiers integration
Lin Feng*
College of Computer Science, Sichuan Normal University, 8-6#, No. 1819, 2nd Segment of ChengLong Avenue, Chengdu 610101, China
(Received 28 June 2010; final version received 19 September 2010)

Intrusion detection has been one of the most active research topics in network security in recent years. Intrusion detection data sets are high-dimensional, and a single classifier classifies data sets with many classes poorly. To address both problems, a novel approach, termed intrusion detection based on multiple rough classifiers integration, is proposed. First, several training data sets are generated from the intrusion detection data by random sampling. A subset of attributes is then selected by combining rough sets with a quantum genetic algorithm. Next, a rough classifier is trained on each reduced data set, which yields a group of rough classifiers. Finally, the classification result for intrusion data is obtained by the absolute majority voting strategy. The experimental results illustrate the effectiveness of the method.
Keywords: multiple rough classifiers; quantum genetic algorithm; absolute majority voting strategy; rough set; intrusion detection techniques

1. Introduction
With the rapid development of computer and Internet technology, network security is becoming increasingly important. There are different ways to implement security so that computers are not damaged. As an active-defence technique, intrusion detection has attracted extensive attention from researchers, and new intrusion detection methods emerge constantly. For example, based on the relationship between antibody concentration and pathogen intrusion intensity, Li (2005) proposed an immunity-based model for network security risk estimation. Combining a fuzzy inference system with statistics, Yan, Jiang, and Wu (2005) designed and developed the antibody formation and detection components of an immune-based network intrusion detection system. Lee and Heinbuch (2001) proposed a network intrusion detection method using artificial neural network techniques. Cai, Guan, Shao, Peng, and Sun (2003) proposed an anomaly detection method based on rough set theory, which is used to monitor processes for abnormal behaviour. Rough sets can remove useless information from incomplete, inaccurate data sets by attribute reduction (Simon, Miroslav, and Mirko 1995; Skowron and Stepaniuk 2005). Therefore, they have a unique advantage in dealing with high-dimensional intrusion detection data.

*Email: scfengyc@126.com
ISSN 0952–813X print/ISSN 1362–3079 online
© 2011 Taylor & Francis
DOI: 10.1080/0952813X.2010.545998
http://www.informaworld.com
In general, calculating all attribute reductions of a high-dimensional intrusion detection data set is an NP-hard problem. As a global optimisation algorithm, the genetic algorithm (GA) is applied to NP-hard problems more and more frequently. In practical applications, however, GA has drawbacks such as requiring many iterations, a tendency to become trapped in local extrema and premature convergence. In recent years, the quantum genetic algorithm (QGA) has been regarded as an effective tool for addressing these problems (Zhang, Li, Jin, and Hu 2004). By combining quantum computing concepts with the genetic algorithm, QGA achieves good robustness and global search ability. QGA has been successfully applied in the field of artificial intelligence (Han and Kim 2002).


On the other hand, a single rough classifier classifies intrusion detection data with many classes poorly, so the integration of multiple rough classifiers is an important research topic. In this article, a novel intrusion detection approach, termed intrusion detection based on multiple rough classifiers integration (IDMRCI), is proposed. First, a minimum attribute reduction is obtained by combining rough sets and QGA. Then, a rough classifier is trained on each of the reduced data sets. Finally, the classification result for intrusion data is obtained by the absolute majority voting strategy. A simulation experiment is also reported.

2. Intrusion detection framework based on IDMRCI


The intrusion detection framework based on IDMRCI is shown in Figure 1. First, data are collected from the host and the network by data collectors. Next, the data are filtered and stored in the database. Then, redundant attributes of the data sets are removed by attribute reduction, and rough classifiers are built from the rough decision rules obtained through value reduction. Finally, absolute majority voting is applied to decide the final classification result. If intrusion data are identified, the system sends a message to the alerter.

[Figure 1 (flowchart): network/host data -> data collector -> data sets -> attributes reduction -> rough classifiers 1, 2, ..., n, each outputting a class label -> absolute majority voting model -> final result output -> alerter.]

Figure 1. Intrusion detection framework based on IDMRCI.

3. An intrusion detection approach IDMRCI


IDMRCI consists of two steps: first, multiple rough classifiers are created; second, the intrusion detection data are classified by these classifiers.

3.1. Attributes reduction and establishment of multiple rough classifiers


One fundamental aspect of rough set theory for knowledge acquisition is the search for particular subsets of condition attributes that provide the same quality of classification as the full attribute set. Such subsets are called reductions. Many attribute reduction methods have been proposed. Because finding a minimum reduction is NP-hard, QGA is increasingly combined with rough sets as a heuristic for attribute reduction.

3.1.1. Quantum chromosomes


In QGA, the minimum unit of information is the qubit (quantum bit). A qubit may be in state $|0\rangle$, in state $|1\rangle$, or in any linear superposition of the two, defined as

$$|\psi\rangle = \alpha|0\rangle + \beta|1\rangle \qquad (1)$$

where $|0\rangle$ and $|1\rangle$ denote the two basis states, i.e. the spin-down and spin-up states, respectively, and $\alpha$ and $\beta$ are complex numbers giving the probability amplitudes of $|0\rangle$ and $|1\rangle$, respectively. Here

$$|\alpha|^2 + |\beta|^2 = 1 \qquad (2)$$

Therefore, a system with $m$ qubits can be described as

$$\left[\begin{array}{c|c|c|c} \alpha_1 & \alpha_2 & \cdots & \alpha_m \\ \beta_1 & \beta_2 & \cdots & \beta_m \end{array}\right] \qquad (3)$$

where $|\alpha_i|^2 + |\beta_i|^2 = 1$ $(i = 1, 2, \ldots, m)$.
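As a minimal sketch of the representation in formulas (1)-(3), a chromosome can be stored as a list of amplitude pairs. The helper names `is_normalised` and `uniform_chromosome` are hypothetical and do not come from the article; the uniform initial value $1/\sqrt{2}$ is the one the article uses later in Algorithm 1.

```python
import math

def is_normalised(alpha, beta, tol=1e-12):
    """Check the constraint |alpha|^2 + |beta|^2 = 1 of formula (2)."""
    return abs(abs(alpha) ** 2 + abs(beta) ** 2 - 1) < tol

def uniform_chromosome(m):
    """Build an m-qubit chromosome (formula (3)) with equal amplitudes."""
    amp = 1 / math.sqrt(2)
    return [(amp, amp)] * m

chromosome = uniform_chromosome(5)
# Probability of observing '0' for each qubit is |alpha_i|^2.
probabilities_of_zero = [abs(alpha) ** 2 for alpha, _ in chromosome]
```

With equal amplitudes, every qubit is observed as '0' or '1' with probability 0.5, so the search starts without any bias towards particular attributes. Complex amplitudes such as `(0.6, 0.8j)` also satisfy the constraint.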

3.1.2. Quantum mutation


In QGA, chromosome mutation is implemented by rotating each qubit with a quantum rotation gate. At the same time, information from the best individual is used to accelerate convergence. The update is usually defined as

$$\begin{bmatrix} \alpha_i^{t+1} \\ \beta_i^{t+1} \end{bmatrix} = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} \alpha_i^{t} \\ \beta_i^{t} \end{bmatrix} \qquad (4)$$

where

$$\begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$$

is called a quantum rotation gate and $\theta$ is called the rotation angle, with $\theta = s(\alpha_i, \beta_i)\,\Delta\theta_i$. Here $s(\alpha_i, \beta_i)$ and $\Delta\theta_i$ are the direction and step length of the rotation, respectively; $\Delta\theta_i$ affects the convergence speed, and $t$ is the current generation. The $i$th qubit of the chromosome at generation $t$ is $(\alpha_i^t, \beta_i^t)^{\mathrm{T}}$.

Huang, Xu, and Yu (2009) point out the drawbacks of the rotation gate strategies used in existing approaches and develop a new strategy, which is given in Table 1. Here $x_i$ is the $i$th bit of the current chromosome and $b_i$ is the $i$th bit of the current best chromosome. The fitness values of the current individual and the best individual are denoted by $f(x_i)$ and $f(b_i)$, respectively.

Table 1. A new regulating strategy of quantum rotation gate.

                                                  $s(\alpha_i, \beta_i)$
$x_i$  $b_i$  $f(x_i) \ge f(b_i)$  $\Delta\theta_i$  $\alpha_i\beta_i > 0$  $\alpha_i\beta_i < 0$  $\alpha_i = 0$  $\beta_i = 0$
0      0      False                0.01              1                      1                      1               0
0      0      True                 0.01              1                      1                      1               0
0      1      False                0.01              1                      1                      0               1
0      1      True                 0.01              1                      1                      1               0
1      0      False                0.01              1                      1                      1               0
1      0      True                 0.01              1                      1                      0               1
1      1      False                0.01              1                      1                      0               1
1      1      True                 0.01              1                      1                      0               1
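The rotation update of formula (4) can be sketched as follows. This is an illustration under stated assumptions, not the article's implementation: amplitudes are taken to be real, the function name `rotate_qubit` is hypothetical, and the direction and step length are passed in directly rather than looked up in Table 1.

```python
import math

def rotate_qubit(alpha, beta, direction, step):
    """Rotate one qubit by theta = direction * step (formula (4)).

    `direction` plays the role of the rotation direction and `step`
    the role of the step length; both are supplied by the caller here.
    """
    theta = direction * step
    c, s = math.cos(theta), math.sin(theta)
    return c * alpha - s * beta, s * alpha + c * beta

alpha = beta = 1 / math.sqrt(2)
# Illustrative values only: a positive direction with step length 0.01.
new_alpha, new_beta = rotate_qubit(alpha, beta, direction=1, step=0.01)
```

Because the gate is a rotation, the normalisation $|\alpha|^2 + |\beta|^2 = 1$ is preserved. In this example a positive angle moves probability mass from observing '0' towards observing '1'.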

3.1.3. Quantum crossover


In this article, we use the all-interference crossover strategy (Yu and Fan 2009) to overcome premature convergence. All chromosomes in the population participate in the crossover operation. For example, suppose the population size is 3 and the chromosome length is 5. The operation is described in Table 2. The crossover recombines genes along the diagonals of the table. Each capital letter denotes a new chromosome produced by the crossover and the number in parentheses is the gene position; for example, the first row reads A(1)–C(2)–B(3)–A(4)–C(5).
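The diagonal pattern of Table 2 can be sketched in a few lines. The sketch assumes one particular reading of the table, which the article does not spell out: rows are the original chromosomes, and new chromosome k takes gene j from original chromosome (k + j) mod n. The function names are hypothetical.

```python
def all_interference_crossover(population):
    """Recombine genes along the diagonals of the population matrix.

    Assumed reading of Table 2: offspring k takes its j-th gene from
    parent (k + j) mod n, so every parent contributes to every offspring.
    """
    n = len(population)
    m = len(population[0])
    return [[population[(k + j) % n][j] for j in range(m)] for k in range(n)]

def assignment_table(n, m):
    """Label each (parent, gene) cell with the offspring that receives it."""
    letters = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    return [[letters[(r - j) % n] for j in range(m)] for r in range(n)]

# Three parents of length five, with genes tagged by parent and position.
population = [[f"c{r + 1}g{j + 1}" for j in range(5)] for r in range(3)]
offspring = all_interference_crossover(population)
table = assignment_table(3, 5)
```

Under this reading, `assignment_table(3, 5)` reproduces the letter layout of Table 2, and offspring A collects gene 1 of chromosome 1, gene 2 of chromosome 2, gene 3 of chromosome 3, gene 4 of chromosome 1 and gene 5 of chromosome 2.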

3.1.4. The fitness function


Table 2. All crossover interferences.

1    A    C    B    A    C
2    B    A    C    B    A
3    C    B    A    C    B

Source: Yu and Fan (2009).

Let $S = (U, C \cup D, V, f)$ be a decision table (Li, Ruan, Geert, Song, and Xu 2007), where $C$ is the conditional attribute set and $D$ is the decision attribute. For each individual $p_j$ $(j = 1, 2, \ldots, n)$ in the population $P = \{p_1, p_2, \ldots, p_n\}$, the fitness function $f(p_j)$ is defined as

$$f(p_j) = 1 - \frac{|p_j|}{|C|} \times \frac{1}{e^{(\gamma_C(D) - \gamma_{p_j}(D))}} \qquad (5)$$

where $\gamma_C(D)$ is the approximation classification quality of $C$ with respect to $D$ in $S$, calculated as

$$\gamma_C(D) = \frac{\left|\bigcup_{X_i \in U/D} \underline{C}(X_i)\right|}{|U|}. \qquad (6)$$

Similarly, the approximation classification quality of the conditional attribute subset $p_j$ with respect to $D$ is denoted by $\gamma_{p_j}(D)$.
Theorem 1: Let $S = (U, C \cup D, V, f)$ be a decision table. Then $\gamma_C(D) \ge \gamma_{p_j}(D)$.
Proof: Since $p_j \subseteq C$, the conclusion follows from the definitions of rough set theory.
Theorem 2: Let $S = (U, C \cup D, V, f)$ be a decision table. Then $f(p_j) \ge 0$.
Proof: By Theorem 1, $e^{(\gamma_C(D) - \gamma_{p_j}(D))} \ge 1$. Hence

$$0 < \frac{|p_j|}{|C|} \times \frac{1}{e^{(\gamma_C(D) - \gamma_{p_j}(D))}} \le 1.$$

Thus $f(p_j) \ge 0$.
Theorem 3: Let $S = (U, C \cup D, V, f)$ be a decision table. If $f(p_j)$ is used as the fitness function, then the minimum attribute reduction can be obtained in theory.
Proof: Let $p_j$ be an attribute reduction in $S$. By the definition of attribute reduction, $\gamma_C(D) = \gamma_{p_j}(D)$. So

$$f(p_j) = 1 - \frac{|p_j|}{|C|} \times \frac{1}{e^{(\gamma_C(D) - \gamma_{p_j}(D))}} = 1 - \frac{|p_j|}{|C|}.$$

Therefore, among attribute reductions, $f(p_j)$ attains its maximum value if and only if $|p_j|$ is minimal, i.e. $p_j$ is the minimum attribute reduction.
Theorem 3 shows that maximising $f(p_j)$ over attribute reductions selects the smallest conditional attribute set that preserves the approximation classification quality of $C$ with respect to $D$.
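Formulas (5) and (6) can be evaluated on a small example. The following is a sketch only: the decision table `rows` is invented toy data, the function names are hypothetical, and the classification quality is computed as the fraction of objects whose equivalence class under the chosen attributes has a single decision value, which is the usual positive-region reading of formula (6).

```python
import math
from collections import defaultdict

def gamma(rows, attrs, decision):
    """Approximation classification quality of `attrs` w.r.t. `decision`."""
    blocks = defaultdict(list)
    for row in rows:
        blocks[tuple(row[a] for a in attrs)].append(row[decision])
    # An equivalence class lies in a lower approximation iff it is consistent.
    consistent = sum(len(ds) for ds in blocks.values() if len(set(ds)) == 1)
    return consistent / len(rows)

def fitness(rows, subset, all_attrs, decision):
    """Fitness of an attribute subset following formula (5)."""
    g_c = gamma(rows, all_attrs, decision)
    g_p = gamma(rows, subset, decision)
    return 1 - (len(subset) / len(all_attrs)) / math.exp(g_c - g_p)

# Toy decision table (hypothetical): attribute 'a' alone determines 'd'.
rows = [
    {"a": 0, "b": 0, "c": 0, "d": 0},
    {"a": 0, "b": 1, "c": 0, "d": 0},
    {"a": 1, "b": 0, "c": 1, "d": 1},
    {"a": 1, "b": 1, "c": 1, "d": 1},
]
conditions = ["a", "b", "c"]
f_small = fitness(rows, ["a"], conditions, "d")
f_large = fitness(rows, ["a", "b"], conditions, "d")
```

Both subsets preserve the classification quality of the full attribute set, so the exponential factor equals 1 and the fitness reduces to $1 - |p_j|/|C|$ as in the proof of Theorem 3. The smaller subset therefore receives the larger fitness value.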
Next, based on the above discussion, we can develop a new attribute reduction method.

3.1.5. Attributes reduction algorithm


Algorithm 1: Attribute reduction algorithm.
Input: A decision table $S$, population size $n$, qubit length $m$ and the maximum number of iterations maxgen;
Output: An attribute reduction in $S$.
Step 1: Initialisation. Let $P = \{p_1, p_2, \ldots, p_n\}$ be a population of size $n$, where $p_j$ $(j = 1, 2, \ldots, n)$ is the $j$th individual of the population, denoted as

$$p_j = \left[\begin{array}{c|c|c|c} \alpha_{j1} & \alpha_{j2} & \cdots & \alpha_{jm} \\ \beta_{j1} & \beta_{j2} & \cdots & \beta_{jm} \end{array}\right].$$

Set $\alpha_{ji} = \beta_{ji} = 1/\sqrt{2}$ $(i = 1, 2, \ldots, m)$, so that all states are superposed with the same probability at the start of the search. Set the generation counter $g$ to 0.
Step 2: Observation. Construct the observation state $R = \{a_1, a_2, \ldots, a_n\}$ from the probability amplitudes of $P$, where each $a_j$ $(j = 1, 2, \ldots, n)$ is a binary string of length $m$ (i.e. $a_j = b_1 b_2 \ldots b_m$) giving the observation of individual $p_j$, and each $b_k$ $(k = 1, 2, \ldots, m)$ is '0' or '1'. The observation is generated probabilistically as follows: for the probability amplitude $(\alpha_i^t, \beta_i^t)^{\mathrm{T}}$ $(i = 1, 2, \ldots, n \times m)$ of each qubit in $P$, a random number $r$ in the range [0, 1] is generated. If $r < |\alpha_i^t|^2$, the corresponding observation value $b$ is '0'; otherwise, it is '1'.
Step 3: Evaluate each individual of the population with the fitness function $f(p_j)$ of formula (5).
Step 4: Select individuals for the next generation so that individuals with better fitness values are retained with higher probability.
Step 5: Update each chromosome using the quantum rotation gate and the all-interference crossover.
Step 6: Set $g = g + 1$. If $g$ reaches maxgen, output an attribute reduction; otherwise, go to Step 2.
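Steps 1 and 2 of Algorithm 1 can be sketched as below. This is a minimal illustration with hypothetical function names; it covers only initialisation and observation, not selection or the chromosome update, and it passes the random number generator explicitly so that runs are repeatable.

```python
import math
import random

def init_population(n, m):
    """Step 1: n individuals of m qubits, all amplitudes 1/sqrt(2)."""
    amp = 1 / math.sqrt(2)
    return [[(amp, amp) for _ in range(m)] for _ in range(n)]

def observe(population, rng):
    """Step 2: collapse each qubit to '0' with probability |alpha|^2, else '1'."""
    return [
        "".join("0" if rng.random() < abs(alpha) ** 2 else "1"
                for alpha, _ in individual)
        for individual in population
    ]

rng = random.Random(0)
population = init_population(n=4, m=6)
observations = observe(population, rng)  # four binary strings of length six
```

Each observed string can then be scored with the fitness function of formula (5). A natural reading, which the article does not state explicitly, is that bit k of the string marks whether the k-th conditional attribute is kept in the candidate subset.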

3.2. Classification process


For data to be identified, the classification process is given by Algorithm 2.
Algorithm 2: Classification process for data to be identified.
Input: Intrusion detection data $x$ to be identified;
Output: Classification result of $x$.
Step 1: Input $x$ to rough classifier 1, rough classifier 2, ..., rough classifier $n$.
Step 2: Calculate the output of rough classifier $i$ for $i = 1$ to $n$.
Step 3: Obtain the final classification result of $x$ by the absolute majority voting strategy.
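Step 3 can be sketched as follows. The article does not define the voting rule in detail, so this sketch assumes the common meaning of absolute majority: a class label wins only if more than half of the classifiers output it, and otherwise no decision is returned. The function name and the `None` fallback are assumptions.

```python
from collections import Counter

def absolute_majority_vote(labels):
    """Return the label output by more than half of the classifiers, else None."""
    if not labels:
        return None
    label, count = Counter(labels).most_common(1)[0]
    return label if count > len(labels) / 2 else None

# Outputs of three rough classifiers for one record (illustrative values).
decided = absolute_majority_vote(["DoS", "DoS", "Probe"])
undecided = absolute_majority_vote(["DoS", "Probe", "Normal"])
```

With three classifiers, as used in the experiments, two agreeing outputs are enough to reach an absolute majority. How a record without a majority should be handled is left open here.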
Table 3. The experimental data sets.

Class     Training data sets    Testing data sets
DoS       3500(3)               3000
Probe     3000(3)               2000
U2R       1500(3)               1000
R2L       1500(3)               1000
Normal    3500(3)               3000

4. Experimental testing and analysis


We use the DARPA1999 intrusion detection data set. The data set contains 7 weeks of training data and 2 weeks of testing data, with more than 300 instances of 38 different attacks from four attack categories (i.e. DoS, Probe, R2L and U2R).
In this article, the experimental data sets are selected from DARPA1999 as follows. First, three groups of training data sets are generated by simple random sampling without replacement from the 7 weeks of training data. Then, three separate rough classifiers are built through data preprocessing, attribute reduction and value reduction. Finally, the testing data sets are drawn by simple random sampling without replacement from the 2 weeks of testing data. The selected data sets are given in Table 3.
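The sampling step can be sketched with the Python standard library. The sketch assumes that each group is drawn without replacement within itself while different groups may share records; the article does not say whether the three groups are disjoint. The function name and the stand-in records are hypothetical.

```python
import random

def draw_training_groups(records, group_size, n_groups, seed=0):
    """Draw n_groups samples, each by simple random sampling without replacement."""
    rng = random.Random(seed)
    return [rng.sample(records, group_size) for _ in range(n_groups)]

# Stand-in for the records of one class; real records would be feature vectors.
records = list(range(100))
groups = draw_training_groups(records, group_size=10, n_groups=3)
```

`random.sample` never repeats an element within one draw, which matches sampling without replacement. One rough classifier would then be trained on each group.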

4.1. Experiment 1
The purpose of Experiment 1 is to compare the classification performance of the single rough classifier method with that of the multiple rough classifiers method. The evaluation criteria are the true positive (TP) rate and the false positive (FP) rate. In the experiment, we use the Nguyen improved greedy algorithm to discretise continuous-valued attributes and general attribute value reduction to generate decision rules. The minority priority matching strategy is adopted for the testing data. In QGA, the initial population size and the number of generations are 100 and 1500, respectively. The experimental results are given in Table 4.

Table 4. The results of Experiment 1.

          Multiple rough classifiers    Single rough classifier
Class     TP (%)    FP (%)              TP (%)    FP (%)
DoS       96.3      21.3                95.6      22.4
Probe     95.7      17.2                95.5      23.1
U2R       93.4      21.6                92.3      28.0
R2L       92.7      29.4                91.9      33.2
Normal    97.9      20.3                96.5      23.8

4.2. Experiment 2
The purpose of Experiment 2 is to compare the performance of QGA with that of GA when multiple rough classifiers are used. Relative runtime is used to evaluate the two algorithms. The parameter values of QGA are the same as in Experiment 1. In GA, the population size, the single-point crossover probability and the basic mutation probability are 100, 0.5 and 0.0005, respectively. The experiments were run on a computer with a 2.2 GHz Intel processor, 2 GB of memory and Windows XP. The results are given in Table 5.

Table 5. The results of Experiment 2.

                     Training data set 1    Training data set 2    Training data set 3
QGA training time    1:53                   1:47                   2:14
GA training time     2:26                   2:38                   2:51

The results of Experiment 1 show that the multiple rough classifiers method achieves a higher TP rate and a lower FP rate than the single rough classifier method for every class in Table 4. Consequently, multiple rough classifiers perform better on complex intrusion detection data classification problems. In Experiment 2, QGA has a shorter training time than GA on all three training data sets. So, IDMRCI could meet the real-time demand of an intrusion detection system.

5. Conclusions
In this article, we proposed an attribute reduction algorithm that combines rough sets and QGA, and developed a novel intrusion detection framework, IDMRCI. The experimental results show that the multiple rough classifiers method has a higher TP rate and a lower FP rate than the single rough classifier method. The approach could meet the accuracy and real-time demands of an intrusion detection system, especially for complex intrusion detection data classification problems with many classes.

Acknowledgements
This study is supported by the Scientific Research Fund of Sichuan Provincial Education Department under Grant No. 09ZC079 and by the Key Research Foundation of Sichuan Normal University.

References

Cai, Z.M., Guan, X.H., Shao, P., Peng, Q.K., and Sun, G.J. (2003), ‘A New Approach to Intrusion
Detection Based on Rough Set Theory’, Chinese Journal of Computers, 26, 361–366.
Han, K.H., and Kim, J.H. (2002), ‘Quantum-Inspired Evolutionary Algorithm for a Class of
Combinatorial Optimization’, IEEE Transactions on Evolutionary Computation, 6, 580–593.
Huang, L.M., Xu, Y., and Yu, R.Q. (2009), ‘Improved Quantum Genetic Algorithm and its
Application’, Computer Engineering and Design, 30, 1987–1990.
Lee, S.C., and Heinbuch, D.V. (2001), ‘Training a Neural-Network Based Intrusion Detector to
Recognize Novel Attacks’, IEEE Transactions on Systems, Man, and Cybernetics – Part A:
Systems and Humans, 31, 294–299.
Li, T. (2005), ‘Risk Detection about Network Security Based on Immune System’, Science in China
Series E: Information Sciences, 8, 798–816.
Li, T.R., Ruan, D., Geert, W., Song, J., and Xu, Y. (2007), ‘A Rough Set Based Characteristic
Relation Approach for Dynamic Attribute Generalization in Data Mining’, Knowledge-Based
Systems, 20, 485–494.
Simon, P., Miroslav, K., and Mirko, D. (1995), ‘A Rough Set Approach to Reasoning under
Uncertainty’, Journal of Experimental and Theoretical Artificial Intelligence, 7, 175–193.
Skowron, A., and Stepaniuk, J. (2005), ‘Hierarchical Modeling in Searching for Complex Patterns:
Constrained Sums of Information Systems’, Journal of Experimental and Theoretical Artificial
Intelligence, 17, 83–102.
Yan, Q., Jiang, Y., and Wu, J.P. (2005), ‘Antibody Generation and Antigen Detection
Component in Immune-Based Network Intrusion Detection System’, Chinese Journal of
Computers, 28, 1601–1607.
Yu, H.Y., and Fan, J.L. (2009), ‘Generalized Fuzzy Entropy Threshold Based on Quantum Genetic
Parameter Optimization’, Pattern Recognition and Artificial Intelligence, 22, 305–311.
Zhang, G.X., Li, N., Jin, W.D., and Hu, L.Z. (2004), ‘A Novel Quantum Genetic Algorithm and its
Application’, Acta Electronica Sinica, 32, 476–479.
