2018 Computers and Security Journal Paper

Accepted Manuscript
Intrusion detection system for wireless mesh network using multiple support vector
machine classifiers with genetic-algorithm-based feature selection
R. Vijayanand, D. Devaraj, B. Kannapiran
PII: S0167-4048(18)30376-6
DOI: 10.1016/j.cose.2018.04.010
Reference: COSE 1332
To appear in: Computers & Security
Received date: 22 November 2017

Revised date: 11 April 2018
Accepted date: 13 April 2018
Please cite this article as: R. Vijayanand, D. Devaraj, B. Kannapiran, Intrusion detection system for
wireless mesh network using multiple support vector machine classifiers with genetic-algorithm-based
feature selection, Computers & Security (2018), doi: 10.1016/j.cose.2018.04.010
This is a PDF file of an unedited manuscript that has been accepted for publication. As a service
to our customers we are providing this early version of the manuscript. The manuscript will undergo
copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please
note that during the production process errors may be discovered which could affect the content, and
all legal disclaimers that apply to the journal pertain.
Intrusion detection system for wireless mesh network using multiple support vector
machine classifiers with genetic-algorithm-based feature selection
R. Vijayanand a,*, D. Devaraj b, B. Kannapiran c

a Department of Computer Science and Engineering, b Department of Electrical and Electronics
Engineering, c Department of Instrumentation and Control Engineering,
Kalasalingam University, Srivilliputur, Tamilnadu, India
a,* rkvijayanand@gmail.com, b deva230@yahoo.com, c kannapiran79@gmail.com
* Corresponding author. Tel. +919025541417.
Abstract
Security is a prime challenge in wireless mesh networks. The mesh nodes act as the backbone
of a network and confront a wide variety of attacks. An intrusion detection system
provides security against these attacks by monitoring the data traffic in real time. A support
vector machine for intrusion detection in wireless mesh networks is proposed in this paper.
The redundant and irrelevant variables in the monitored data affect the accuracy of attack
detection by the system. Hence, feature selection techniques are essential to improve the
performance of the system. In this paper, a novel intrusion detection system with
genetic-algorithm-based feature selection and multiple support vector machine classifiers for
wireless mesh networks is proposed. The proposed system selects the informative features of each
category of attacks rather than the features common to all the attacks. The proposed system is
evaluated using intrusion datasets generated by simulating a wireless mesh network in
Network Simulator 3, with packet delivery ratio, delay, etc. as the parameters.
The experimental results demonstrate that the proposed system exhibits a high
accuracy of attack detection and is suitable for intrusion detection in wireless mesh networks.
Keywords
Wireless mesh network, intrusion detection system, GA based feature selection, SVM
classifier.
1. Introduction
The wireless mesh network (WMN) has been a notable communication technology in recent
years; it adopts multi-hop forwarding techniques for high-speed data communication
with minimum data loss. It is emerging as a highly developed communication technology and
is suitable for various cyber–physical applications such as smart grids, military
applications and the internet of things [1]. The multi-hop nature of a WMN makes it vulnerable to
various attacks such as flooding, blackhole and greyhole. Flooding is a denial-of-service
attack in which the malicious node frequently transmits bulk amounts of HELLO packets, data
and undesirable messages. It causes congestion at the receiver buffer and
communication channels of the WMN, which increases the packet dropping ratio and blocks
the communication. Blackhole is a major man-in-the-middle (MITM) attack on WMNs that
interrupts communication by compromising an active node, which then drops the packets of all
the nodes in the network. The greyhole attack also belongs to the MITM category; it maliciously
drops the packets of only the selected nodes. An intrusion detection system (IDS) is an important
security mechanism that protects the system from these attacks by analysing the network
traffic. Numerous methods have been adopted for the implementation of IDSs in WMNs [2, 3].
Various machine learning algorithms such as support vector machine (SVM) [4] and
artificial neural network (ANN) [5] are used as classifiers of IDSs in WMNs. They detect
attacks by analysing the parameters of the network traffic data. The traffic data is noisy:
certain input features increase the accuracy of intrusion detection, whereas certain
other features decrease it. Hence, the selection of informative features to
serve as the input to the learning algorithms is essential for an IDS. The selection of features
is challenging, and the feature selection problem is NP-hard [6].
The feature selection methods are categorized as either filter or wrapper methods. In a
filter method, the non-informative features are removed from the input set by determining the
relations between the input variables and the corresponding output [7]. However, in a wrapper
method, the informative features are selected by evaluating the fitness of the features, using
learning algorithms such as Bayesian classifier and SVM in an iterative manner [8, 9]. Feature
selection using evolutionary computation (EC) techniques such as genetic algorithm (GA),
differential evolution (DE) and particle swarm optimization (PSO) belongs to the category of
wrapper methods. Recently, filter and wrapper methods have been combined for selecting
optimal subsets of features; such combinations are termed hybrid feature-selection techniques.
In [10], a hybrid technique based on mutual information (MI) and GA is used, in which MI selects the
semi-informative features by eliminating the non-informative features and GA selects the
informative features from those semi-informative features. The major drawback of this
method is the usage of a filter method as the initial technique, which is likely to eliminate a
few informative features. In certain cases, multiple EC techniques are combined for obtaining
good classification results: e.g., GA and PSO [11], and ant colony
optimization (ACO) and bee colony optimization (BCO) [12].
Genetic algorithm is a population-based search algorithm that evolves a population of
individuals using the three genetic operators, namely, selection, crossover and mutation, to
obtain the optimal solution [13]. It has undergone numerous enhancements from the early
version for obtaining optimized results in different applications. For example, the population is
represented in real-coded format in [14], and the crossover operation is enhanced in [15].
Apart from the above, several enhancements were carried out on GA for the selection of
informative features from datasets. Machine learning algorithms such as fuzzy logic, learning
automata, SVM and multilayer perceptron (MLP) have been used as classifiers in such
applications. The performance of SVM is higher than those of numerous available machine
learning algorithms, and SVM is widely used as a classifier in a number of complex applications.
Most of the available feature selection techniques select the common informative
features (CIF) of all the classes from the sample space. In the previous works [16, 17], a
feature subset is selected as the common informative features for all types of attacks. The
drawback of using CIF is that the classifier exhibits a high false positive rate. Another
drawback is that it results in a sub-optimal subset of informative features. Hence, it is
necessary to identify the informative features of each class for improving the performance of
the classifier.
The local feature selection for improving the classifier performance can be implemented
with either an instance-based or a model-based learning method. In the instance-based method,
the weights of each feature are adjusted to achieve the maximal margin, whereas, in the
model-based approach, an approximate model is designed for learning purposes [18]. Both
instance- and model-based methods have their own drawbacks, such as high computational
intensity and coarse function approximation, respectively [19]. In certain cases [20], the two
learning methods are combined to gain the advantages of both.
In this paper, a model-based local feature selection for each category of attacks is
proposed for IDS development. The performance of the proposed system is evaluated by
using an intrusion dataset generated from a WMN simulated in the Network Simulator 3 (NS3)
tool and by using standard intrusion datasets. The experimental results demonstrate that
the proposed system with the feature selection technique is substantially more efficient than a
conventional system with common feature selection techniques.
2. Proposed methodology for development of IDS
The proposed IDS is based on GA-based feature selection and the SVM classifier. The SVM
classifier exhibits a high attack detection ratio and is suitable for multiple attack detection in
WMNs [21]. A separate classifier is assigned to each attack category and is trained with the
informative features of that attack's data selected by the proposed feature selection technique.
The classifiers are arranged in linear order as shown in Fig. 1, and each classifier is placed in
the order of severity of the attack. The output of each classifier is either “belongs to the attack
category” or “does not belong to the attack category”, except for the last classifier, whose
negative output is the “new class” category rather than the “does not belong to” category. If a
classifier classifies the data as the “belongs to” category, it reports to the user for further
processing; otherwise, the input data is forwarded to the next classifier for determining the
attack category. This process is repeated till the category of the input data is determined.
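The linear arrangement of classifiers described above can be sketched as a simple cascade. This is a minimal illustration only: the stage names, their order and the threshold rules are hypothetical stand-ins for trained binary SVMs, and the two-feature sample (packet delivery ratio, delay) is an assumption made for the example.

```python
from typing import Callable, List, Sequence, Tuple

# Each stage pairs a category name with a predicate that stands in for a trained
# binary SVM answering "does this sample belong to my category?".
Stage = Tuple[str, Callable[[Sequence[float]], bool]]


def classify_cascade(stages: List[Stage], sample: Sequence[float]) -> str:
    """Pass the sample through the stages in order and stop at the first match."""
    for name, belongs in stages:
        if belongs(sample):
            return name      # "belongs to": reported for further processing
    return "new class"       # negative output of the last classifier


# Hypothetical rules on (packet delivery ratio, delay); not taken from the paper.
stages: List[Stage] = [
    ("blackhole", lambda s: s[0] < 0.2),
    ("flooding", lambda s: s[1] > 5.0),
    ("normal", lambda s: s[0] > 0.9),
]

print(classify_cascade(stages, [0.1, 1.0]))
print(classify_cascade(stages, [0.5, 1.0]))
```

A sample that no stage claims falls through to the “new class” outcome, which mirrors the behaviour of the last classifier in the chain.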
Fig. 1. Proposed Multiple SVM classifiers
The steps involved in the proposed IDS development are shown in Fig. 2.
Fig. 2. Proposed IDS with GA-based feature selection
In this work, GA-based wrapper method is used to select the informative features of each
category of attacks for the SVM classifier.
The selection of local informative features from a multiclass dataset is a challenging task.
The informative features of classes containing large amounts of data dominate the selection
process and can suppress the informative features of classes having less data. To solve this
problem, the actual output of the multiclass training data is normalized into a binary output. This
is carried out by considering the output of the class of interest as “1” and the output of all
other classes as “0” for identifying the informative features of that particular class. After the
normalization operation, the GA-based algorithm is applied for determining the informative
features of each attack. After the selection of the local informative features of each attack, the
SVM classifier is trained with those informative features and is evaluated by using the testing
dataset. The details of GA-based local feature selection and its implementation in the SVM-based
IDS are provided in the following sections.
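The normalization of the multiclass output into a binary output can be illustrated with a short sketch. The label strings and the helper name `binarize_labels` are assumptions made for this example.

```python
def binarize_labels(labels, target_class):
    """Map the class of interest to 1 and every other class to 0."""
    return [1 if label == target_class else 0 for label in labels]


# Hypothetical multiclass outputs of five training samples.
labels = ["normal", "jamming", "blackhole", "jamming", "greyhole"]

# One binary target vector per class, each used to select that class's features.
jamming_targets = binarize_labels(labels, "jamming")
per_class_targets = {c: binarize_labels(labels, c) for c in sorted(set(labels))}

print(jamming_targets)
```

Each class therefore gets its own binary view of the same training data, so a class with few samples is no longer outvoted by the larger classes during feature selection.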
3. GA-based feature selection technique
The feature selection problem is a highly complex NP-hard problem. It can be stated as
follows: a subset of “m” informative features is selected from the “n” available features,
where m < n. The informative features are expected to require less computation effort and
exhibit high accuracy. In this section, the optimal feature selection using genetic algorithm is
described in detail.
Genetic algorithm [22] is an adaptive heuristic global search algorithm inspired by the
evolutionary process of biological organisms. It optimizes the solution by searching the
solution space globally. It uses selection, recombination and mutation operations for
determining the solutions to a problem. In a GA-based feature selection problem, the initial
random population of chromosomes is generated based on the features specified as input. The
features are represented as genes on a chromosome, in which “1” represents the availability
of the feature and “0” represents the non-availability of the feature. Each chromosome in the
population exhibits different characteristics and is considered as a unique individual.
Once the chromosomes are generated, the fitness value of each chromosome is calculated.
The fitness value indicates the level of suitability of the chromosome in problem solving. The
chromosomes with large fitness values are selected as the parents for the next generation. The
top two chromosomes of the population are selected through the elitism method, and the
remaining parent chromosomes are selected with the aid of selection techniques. Roulette-wheel,
tournament-based and rank-based selection are a few of the widely used selection techniques
of GA. In this work, the tournament-based selection method is adopted. After selecting the
parent chromosomes for the next generation, they undergo crossover and mutation operations.
In a crossover operation, a few blocks of the chromosomes are randomly swapped, and in
the mutation operation, certain bits of the chromosomes are flipped from 1 to 0 or 0 to 1;
these operations are applied according to the crossover and mutation rates, respectively. The
crossover operator helps to generate the new offspring in each generation, whereas the mutation
operator helps to prevent the search from getting stuck at a local optimum. This process is
repeated till the optimal solution is obtained. The features of the optimal solution are
identified as the most informative features.
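The loop described above (elitism for the top two chromosomes, tournament selection, crossover and bit-flip mutation) can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: single-point crossover is used as a simple stand-in for block swapping, the rates and sizes are arbitrary, and `toy_fitness` with its `TARGET` mask is a hypothetical placeholder for the classifier-based fitness.

```python
import random


def tournament(pop, fits, rng, k=3):
    """Return the fittest of k randomly drawn individuals."""
    contenders = rng.sample(range(len(pop)), k)
    return pop[max(contenders, key=lambda i: fits[i])]


def crossover(a, b, rng):
    """Single-point crossover: swap the tails of the two parents."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:], b[:point] + a[point:]


def mutate(chrom, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - g if rng.random() < rate else g for g in chrom]


def run_ga(fitness, n_features, pop_size=20, generations=30,
           cx_rate=0.8, mut_rate=0.05, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    history = []                                   # best fitness per generation
    for _ in range(generations):
        fits = [fitness(c) for c in pop]
        ranked = sorted(range(pop_size), key=lambda i: fits[i], reverse=True)
        history.append(fits[ranked[0]])
        next_pop = [pop[ranked[0]][:], pop[ranked[1]][:]]   # elitism: keep top two
        while len(next_pop) < pop_size:
            p1, p2 = tournament(pop, fits, rng), tournament(pop, fits, rng)
            if rng.random() < cx_rate:
                c1, c2 = crossover(p1, p2, rng)
            else:
                c1, c2 = p1[:], p2[:]
            next_pop += [mutate(c1, mut_rate, rng), mutate(c2, mut_rate, rng)]
        pop = next_pop[:pop_size]
    fits = [fitness(c) for c in pop]
    best = max(range(pop_size), key=lambda i: fits[i])
    history.append(fits[best])
    return pop[best], history


# Toy fitness (an assumption): agreement with a hidden "informative" mask.
TARGET = [1, 1, 0, 0, 1, 0, 0, 0, 1, 1]


def toy_fitness(chrom):
    return sum(int(g == t) for g, t in zip(chrom, TARGET)) / len(TARGET)


best, history = run_ga(toy_fitness, n_features=len(TARGET))
print(best, history[-1])
```

Because the two elite chromosomes are copied unchanged, the best fitness recorded in `history` cannot decrease from one generation to the next.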
In IDS, the network traffic features are used as input variables, and candidate subsets of these
features are represented as individuals of the GA population. The GA-based feature selection is
a wrapper approach, so a classifier or decision-making algorithm is used to evaluate the fitness
value of each chromosome. The accuracy of the classifier in the detection of attacks is used as
the objective of the GA-based feature selection technique used in this study. The selected input
variables are provided as the input to the SVM classifier for the fitness evaluation. The
individual with a large fitness value is passed to the next generation as a parent, and the
process repeats till the optimal features are selected.
4. SVM Classifier
The SVM classifier, a supervised learning algorithm, is designed using the fundamental
concept of classifying data with a hyperplane or line. The architecture of an SVM classifier is
shown in Fig. 3. A hyperplane is linear in nature and is mathematically expressed as
$f(x) = w^{T}x + b = 0$
where w is the weight vector, x is the input data and b is the bias value. Several hyperplanes
are available between the classes, and it is necessary to select the most effective classification
hyperplane to obtain a highly accurate classifier. The most effective hyperplane in an SVM
classifier is the hyperplane exhibiting the largest margin with the support vectors. In
numerous classification problems, the global margin (2/||w||) is used as the standard to obtain
the classifier with the maximum accuracy.
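A small numeric sketch of the hyperplane expression and the margin 2/||w|| follows. The weight vector, bias and sample points are hypothetical values chosen so that the arithmetic is easy to check by hand.

```python
import math


def decision_value(w, x, b):
    """f(x) = w^T x + b; its sign gives the side of the hyperplane."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def margin_width(w):
    """Global margin 2 / ||w|| of a linear SVM with weight vector w."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))


# Hypothetical weights and bias: ||w|| = 5, so the margin is 2 / 5.
w, b = [3.0, 4.0], -5.0

print(decision_value(w, [1.0, 1.0], b))   # positive side of the hyperplane
print(decision_value(w, [0.0, 0.0], b))   # negative side of the hyperplane
print(margin_width(w))
```

A smaller ||w|| gives a wider margin, which is why maximizing the margin amounts to minimizing the norm of the weight vector.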

However, in real-world applications, a curve, rather than a linear hyperplane, is often required
to classify the data. Therefore, kernel functions are used to obtain a suitable hyperplane for
those applications. The input data is initially mapped into the feature space using the kernel
function and is then separated by a linear hyperplane [23]. Linear, polynomial and
quadratic functions as well as the radial basis function (RBF) are a few of the kernel functions
widely used to map input data
to a feature space, and it is necessary to use a suitable function for achieving high accuracy in
classification problems. The linear function exhibits a low classification rate and is not
suitable for most of the complex classification applications. In this work, the RBF kernel
function [24] is used to determine the maximum margin of the SVM classifier, and it exhibits an
adequate convergence rate.
Fig. 3. SVM Architecture
The RBF kernel uses an exponentially decaying function for computing the margin by
determining the support vectors using equation (1):
$f(x, y) = \exp\left(-\frac{\|x - y\|^{2}}{2\sigma^{2}}\right)$ (1)
where x and y are two data points, $\|x - y\|^{2}$ is the squared Euclidean distance and σ is the
influence distance of a single training datum. The neighbourhood data points with the maximal
values are selected as the support vectors, and the function decays in all the directions from
each such vector for obtaining the suitable hyperplane. The selected hyperplane with the maximal
margin efficiently classifies the input data into the corresponding classes.
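Equation (1) can be evaluated directly, as in the minimal sketch below. The function name `rbf_kernel` and the sample points are assumptions made for illustration.

```python
import math


def rbf_kernel(x, y, sigma):
    """Equation (1): exp(-||x - y||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


# Identical points give the maximal value; the value decays with distance.
same = rbf_kernel([1.0, 2.0], [1.0, 2.0], sigma=1.0)
near = rbf_kernel([0.0, 0.0], [1.0, 1.0], sigma=1.0)
far = rbf_kernel([0.0, 0.0], [3.0, 3.0], sigma=1.0)

print(same, near, far)
```

A larger σ makes the decay slower, so each training datum influences a wider neighbourhood.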
Multiple SVMs are used for the multiclass problems, in which each classifier is assigned to
an individual class. In this arrangement, the distance of the test sample to the hyperplane is
used to determine the confidence value of each classifier, and the output of the classifier with
the maximum confidence value is selected as the final output. It can be represented as
$\text{Output class} = \arg\max_{i = 1, 2, \ldots, n} f(x_i)$
where “n” is the number of classifiers and $f(x_i)$ is the confidence value of the ith classifier.
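The selection of the output class by maximum confidence can be sketched in a few lines. The confidence values below are hypothetical signed distances to each classifier's hyperplane, not results from the paper.

```python
def select_class(confidences):
    """Return the class whose classifier reports the largest confidence value."""
    return max(confidences, key=confidences.get)


# Hypothetical confidence values of three per-class classifiers for one sample.
confidences = {"normal": -0.8, "jamming": 1.3, "blackhole": 0.2}

print(select_class(confidences))
```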
5. Results and Discussion
Fig. 4. Wireless mesh network simulated in NS3 during attack

To demonstrate the efficiency of the proposed feature selection technique, an intrusion
dataset has been generated from a WMN environment simulated in NS3. The WMN
simulation diagram is shown in Fig. 4. The simulated network has 30 nodes, of which one
node acts as the base station, and all the other nodes transmit data to that base station in mesh
fashion. In this simulation, the AODV routing protocol is used to establish WMN communication
in NS3 [25].
After generating the dataset, the informative features of each category of attacks are
determined by GA, and multiple SVM classifiers are developed in Matlab 2014a. The various
steps involved in the generation of the dataset and the development of the IDS are presented
below. In addition, the standard intrusion datasets ADFA-LD and CICIDS2017 are
used to validate the proposed system.
5.1 Dataset generation

In general, a dataset contains the parameters collected from the working environment and is
used to design the system for any application [26]. The dataset is divided into training data
for constructing the system and testing data for validating that system. In this work, the
dataset contains normal, jamming, data flooding, hello flooding, blackhole and greyhole data,
which are generated by the following procedure:
1. Each datum has a time frame of 2 min, and the input variables of that datum are
determined by processing the fundamental network features such as sending time,
receiving time and number of packets received.
2. The normal and attack classes for the data are assigned based on the packet delivery ratio
of the network.
3. If the packets are received after the threshold time, or if the number of packets received is
significantly higher than the number sent, the datum is considered as an attack.
4. The proposed system is designed for highly sensitive networks such as a smart grid.
Thus, the normal and distinctive attacks are simulated separately based on the expert
knowledge on the attack in the network.
Table 1. Features of Generated Dataset
S. No  Features                           S. No  Features
1      Packet delivery ratio              11     Difference in receiving time
2      Delay                              12     Flow id
3      Throughput                         13     Jitter_total
4      Received bytes                     14     Times forwarded
5      Received packets                   15     Packets dropped
6      Packet size                        16     Source port
7      Time                               17     Destination port
8      Total transmitted packets          18     Transmission rate
9      Total received packets             19     Missed flows
10     Difference in transmission time    20     Number of connection
Three datasets of message size 512 bits, 1024 bits and a combination of both the sizes are
generated by following the above procedures. Each dataset has 20 input features, which are
provided in Table 1, and an output that contains the category of the data.
5.2 Feature selection

Feature selection helps to reduce the computational cost, space complexity, overfitting, etc.
and to increase the accuracy of the IDS. The objective of the work is to improve the
performance of the IDS classifier by selecting the informative features for each category of
attacks. The informative features of each attack class are selected through GA by using the
SVM classifier. After the informative features are selected, they are evaluated by using the
testing data.
Table 2. Informative features of each attack category on combined dataset
Attacks         Number of genes selected  Informative features selected
Normal          14                        1,2,3,4,6,7,8,12,13,14,15,17,18,20
Jamming         10                        1,2,3,6,8,9,10,11,15,18
Data flooding   10                        5,7,9,11,12,14,16,18,19,20
Hello flooding  14                        3,4,5,6,7,8,10,11,13,14,15,16,17,19
Black hole      13                        1,3,4,5,9,10,11,13,14,15,18,19,20
Grey hole       14                        1,3,4,5,6,8,9,10,11,12,15,17,18,20
Table 2 shows the informative features selected for each category of attacks in the combined
dataset. It clearly reveals that the different types of attacks have different informative
features.
For better understanding, the step-by-step procedure for feature selection for the jamming
attack is provided below:
1. First, the output of jamming data in the dataset is labelled as “1” and that of all the other
classes as “0”.
2. The initial population of size N × F is randomly generated as shown in Table 1, in
which “1” represents the selection of a feature and “0” represents the non-selection of
a feature. Here, “N” is the number of individuals of a generation and “F” is the total number
of features. A sample individual randomly generated is shown in Table 3.
Table 3. Sample Individual
S. No  1  2  3  4  5  6  7  8  9  10  11  12  13  14  15  16  17  18  19  20
Genes  1  1  0  0  1  0  0  0  1  1   0   1   1   0   1   1   0   1   1   1
The table shows that the selected features in the individual are 1, 2, 5, 9, 10, 12, 13, 15, 16,
18, 19 and 20, and the non-selected features are 3, 4, 6, 7, 8, 11, 14 and 17.
3. Then, the SVM classifier for jamming attack detection is trained with the training data
restricted to the selected features and is tested with the testing data.
4. The accuracy of the classifier is used as the objective function. Hence, the
classification accuracy of the SVM classifier on the testing data is considered as the
fitness value of the individual. Similarly, the fitness of each individual is calculated, and
the results are given in Table 1.
5. The two individuals with the maximum fitness values are selected as parents for the
next generation through the elitism method, and the other parents are selected by tournament
selection.
6. After the parents are selected, they undergo crossover and mutation.
7. The process is repeated till the last generation or the stopping criterion is reached. The
features of the individual with the maximum fitness in the last generation are selected as
the informative features for the detection of the jamming attack.
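Steps 2 to 4 of the procedure above can be sketched as follows, using the sample individual of Table 3. The helper names are assumptions, and `train_and_score` is a hypothetical callable that would wrap the SVM training and testing; here it is replaced by a dummy scorer so that the sketch runs on its own.

```python
# Sample individual from Table 3 (genes for features 1 to 20).
genes = [1, 1, 0, 0, 1, 0, 0, 0, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 1]


def selected_features(chromosome):
    """1-based indices of the features whose gene is 1."""
    return [i for i, g in enumerate(chromosome, start=1) if g == 1]


def project(rows, chromosome):
    """Keep only the selected columns of each data row."""
    keep = [i for i, g in enumerate(chromosome) if g == 1]
    return [[row[i] for i in keep] for row in rows]


def fitness(chromosome, train_and_score, X_train, y_train, X_test, y_test):
    """Wrapper fitness: classifier accuracy on the testing data.

    `train_and_score` is a hypothetical callable that trains a classifier on the
    projected training data and returns its accuracy on the projected test data.
    """
    if not any(chromosome):
        return 0.0               # no feature selected: nothing to train on
    return train_and_score(project(X_train, chromosome), y_train,
                           project(X_test, chromosome), y_test)


def dummy_scorer(X_tr, y_tr, X_te, y_te):
    """Stand-in scorer for illustration only: fraction of the 20 features kept."""
    return len(X_te[0]) / 20


row = list(range(1, 21))         # one row whose values equal the feature numbers
print(selected_features(genes))
print(fitness(genes, dummy_scorer, [row], [1], [row], [1]))
```

In an actual wrapper, `train_and_score` would train the jamming classifier on the binary-labelled training data and return its test accuracy, which the GA then uses to rank the individuals.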
5.3 Performance evaluation
In this work, SVM classifiers are used to select the local informative features for the IDS
in the WMN. The radial basis function is used as the kernel function of the SVM classifiers. The
classifiers function in a hierarchical order for determining the category of the input data.
After the category is determined, the normal traffic data is permitted for further processing,
and the attacks are prevented by quarantining the attack data. The generated datasets are
equally divided into training and testing sets for the evaluation process. Each classifier is
trained separately for improving the performance of the proposed system and reducing the
training time. The characteristics of an effective intrusion detection system with regard to the
security aspects are high accuracy, low false positive rate and low false negative rate. The
accuracy, false positive rate (FPR), false negative rate (FNR), sensitivity, specificity and
precision are provided in the tables for analysing the security of the proposed system. The
proposed system is analysed using data of different sizes and data collected from
different networks, for identifying the suitability of the system in multiple environments. The
accuracy, FPR and FNR of the classifier on the datasets of message sizes 512 bits, 1024 bits
and a combination of the two datasets by using the proposed feature selection technique are
presented in Tables 4, 5 and 6, respectively. The validation of the proposed system with other
standard datasets is essential to understand the performance of the system in different
environments [27]. Thus, the standard intrusion datasets ADFA-LD and CICIDS2017 are
used for the evaluation process, and the results are presented in Tables 7 and 8, respectively.
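The six measures named above can be computed from the confusion-matrix counts of one binary classifier. The sketch below uses the standard textbook definitions, which the text does not spell out, so they are an assumption here; the counts are hypothetical.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard measures from true/false positive and negative counts.

    Assumes every denominator is non-zero (each class occurs at least once).
    """
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "fpr": fp / (fp + tn),           # false positive rate
        "fnr": fn / (fn + tp),           # false negative rate
        "sensitivity": tp / (tp + fn),   # equals 1 - FNR
        "specificity": tn / (tn + fp),   # equals 1 - FPR
        "precision": tp / (tp + fp),
    }


# Hypothetical counts for one attack-vs-rest classifier.
m = binary_metrics(tp=40, fp=5, tn=50, fn=5)
for name, value in m.items():
    print(name, round(value, 4))
```

Under these definitions sensitivity and FNR sum to one, as do specificity and FPR, which offers a quick consistency check on any reported row of results.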
Table 4. Experimental results of GA-SVM-based feature selection on 512-bit dataset
Attacks         Testing data size  Accuracy  FPR    FNR    Sensitivity  Specificity  Precision
Normal          99                 0.9589    0.061  0.027  0.9729       0.9389       0.9574
Jamming         26                 0.9494    0.027  0.127  0.8723       0.9734       0.8542
Data flooding   14                 0.9968    0.003  0      1            0.9966       0.9565
Hello flooding  15                 0.9905    0.007  0.05   0.9524       0.9932       0.9091
Black hole      11                 0.9968    0.003  0      1            0.9967       0.9375
Grey hole       33                 1         0      0      1            1            1
Table 5. Experimental results of GA-SVM-based feature selection on 1024-bit dataset
Attacks         Testing data size  Accuracy  FPR    FNR    Sensitivity  Specificity  Precision
Normal          188                0.9430    0.031  0.097  0.9032       0.9688       0.9492
Jamming         48                 0.9051    0.06   0.28   0.7200       0.9398       0.6923
Data flooding   23                 0.9684    0.007  0.24   0.7647       0.9929       0.9286
Hello flooding  22                 0.9430    0.034  0.29   0.7143       0.9653       0.6667
Black hole      16                 0.9557    0      0.39   0.6111       1            1
Grey hole       19                 0.9241    0      0.29   0.7333       1            1
Table 6. Experimental results of GA-SVM-based feature selection on combined dataset
Attacks         Testing data size  Accuracy  FPR    FNR    Sensitivity  Specificity  Precision
Normal          247                0.9281    0.051  0.089  0.9105       0.9491       0.9551
Jamming         74                 0.9238    0.041  0.25   0.7468       0.9594       0.7867
Data flooding   37                 0.9937    0.005  0.028  0.9722       0.9954       0.9459
Hello flooding  37                 0.9471    0.009  0.38   0.6182       0.9904       0.8947
Black hole      27                 0.9852    0.002  0.19   0.8125       0.9977       0.9629
Grey hole       52                 0.9640    0.004  0.23   0.7656       0.9951       0.9608
Table 7. Experimental results of GA-SVM-based feature selection on ADFA-LD dataset
Attacks         Testing data size  Accuracy  FPR    FNR    Sensitivity  Specificity  Precision
Normal          1640               0.9133    0.143  0      1            0.8563       0.8209
Reconnaissance  339                0.6759    0.34   0.14   0.86         0.6559       0.1550
DoS             64                 0.8216    0.176  0.44   0.436        0.8241       0.0305
Fuzzers         457                0.9220    0.005  0.47   0.498        0.9429       0.3003
Backdoor        9                  0.7334    0.27   0      1            0.7334       0.0037
Generic         685                0.9913    0      0.018  0.98         0.999        0.9988
Worms           2                  0.9530    0.04   0      1            0.9525       0.0104
Table 8. Experimental results of GA-SVM-based feature selection on CICIDS2017 dataset
Attacks         Testing data size  Accuracy  FPR    FNR    Sensitivity  Specificity  Precision
Benign          793                0.9767    0.020  0.027  0.9723       0.9795       0.9673
DoS             377                1.00      0      0      1            1            1
Portscan        283                0.999     0.001  0      1            0.9988       0.9982
Web Attack      22                 0.9985    0      0.014  0.88         1            1
Bot             27                 0.9985    0      0      1            1            1
FTP-Patator     160                0.9916    0      0.096  0.9039       1            1
SSH-Patator     202                0.9930    0.016  0.033  0.9668       0.9988       0.9944
From Tables 4–8, it is observed that the accuracy and FPR of most of the classifiers are
adequate. Although the FNR of the attack classes is higher, the FNR value is adequate for
the normal class, and hence the developed system can be used for intrusion detection. The
sensitivity and precision for the jamming attack in the generated dataset are low, similar to
those for the blackhole attack.
Next, the overall performance of the proposed technique is compared with that of the SVM-based
IDS developed using a common informative feature selection technique, and the results are
presented in Table 9.
Table 9. Validation of proposed feature selection with common informative feature
selection using various intrusion datasets
                                                               Common informative feature  Proposed local informative feature
                                                               selection with multiclass   selection with multiple SVM
                                                               SVM
Dataset                 Dataset generation  Testing data size  Accuracy  FPR     FNR       Accuracy  FPR     FNR
WMN dataset (512 bit)   In this paper       158                0.7848    0.0832  0.0902    0.8354    0.044   0.26
WMN dataset (1024 bit)  In this paper       316                0.9114    0.0443  0.0443    0.9556    0.0169  0.034
WMN dataset (combined)  In this paper       474                0.8266    0.10    0.113     0.957     0.019   0.19
ADFA-LD dataset         [28]                3196               0.8495    0.0675  0.083     0.9695    0.0125  0.153
CICIDS2017 dataset      [29]                4030               0.9985    0.0009  0.0009    0.9939    0.0032  0.19
Table 9 reveals that the proposed system with local informative features exhibits a
significantly higher accuracy than the SVM with global informative features on all the
datasets except CICIDS2017.
The training and testing times of the classifier are used to determine the suitability of the
classifier for IDS development. The training generally requires significantly more time than
the testing of a large amount of data [30]. Hence, the training time of the classifier
becomes more important than the testing time. To reduce the development time of the IDS, the
training of the classifiers can be conducted in parallel. Therefore, the maximum training time
of any classifier is selected as the training time of intrusion detection.
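The effect of training the classifiers in parallel can be checked with a few lines of arithmetic on the training times reported in Table 10. The time unit is not stated in the table, and the comparison with a one-after-another total is an illustrative assumption.

```python
# Training times per classifier on the combined dataset, as listed in Table 10.
training_times = {
    "Normal": 0.6084,
    "Jamming": 0.5618,
    "Data flooding": 0.5928,
    "Hello flooding": 0.4836,
    "Blackhole": 0.4836,
    "Wormhole": 0.5616,
}

# Parallel training finishes when the slowest classifier finishes.
parallel_time = max(training_times.values())
# For comparison: total time if the classifiers were trained one after another.
sequential_time = sum(training_times.values())

print(parallel_time, round(sequential_time, 4))
```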
Table 10. Training and testing time of proposed system on combined dataset
Attack category  Training time  Testing time
Normal           0.6084         0.0022
Jamming          0.5618         0.0025
Data flooding    0.5928         0.0025
Hello flooding   0.4836         0.0026
Blackhole        0.4836         0.0022
Wormhole         0.5616         0.0019
In Table 10, the training and testing times of each classifier on the combined dataset are
presented. It demonstrates that the SVM classifiers consume minimal time for training and
testing and are therefore suitable for on-line application.
5.4 Comparison with mutual-information-based feature selection

Fig. 5. Comparison of GA-based feature selection with MI-based and without feature
selection
Next, the performance of the proposed system is compared with that of MI and all the
features, with accuracy of attack detection, computational complexity and communication
overhead as the metrics. The SVM classifier uses RBF kernel functions during the training
process. The result of the comparison on the combined dataset is shown in Fig. 5. The figure
shows that GA exhibits better detection ratio than MI and SVM with all the features for all
the attacks except data flooding. The comparative security analysis and computational
complexity of the proposed system with the available techniques are presented in Tables
11 and 12, respectively.
Table 11. Performance analysis of proposed system
Metrics   IDS techniques  512-bit dataset  1024-bit dataset  Combined dataset  ADFA-LD dataset  CICIDS2017 dataset
Accuracy  GA+SVM          0.7848           0.9114            0.957             0.9695           0.9985
          MI+SVM          0.7353           0.8295            0.9326            0.9728           0.9895
          SVM             0.7320           0.7631            0.9330            0.9702           0.9895
FPR       GA+SVM          0.0832           0.0443            0.032             0.0675           0.0009
          MI+SVM          0.068            0.119             0.0298            0.0133           0.0041
          SVM             0.099            0.075             0.0415            0.013            0.0041
FNR       GA+SVM          0.0902           0.0443            0.011             0.083            0.0009
          MI+SVM          0.65             0.532             0.28              0.230            0.185
          SVM             0.66             0.653             0.231             0.385            0.185
Table 12. Computational Complexity analysis of proposed system with MI and SVM
IDS Techniques                                                  Time complexity
Intrusion Detection with GA-based feature selection (proposed)  O(L × S), L - Length of the individual, S - Population size
Intrusion Detection with MI-based feature selection [31]        O(N log N), N - features selected for evaluation
• The result in Table 11 shows that the proposed GA-based feature selection exhibits
higher accuracy and low false positive and false negative rates, and outperforms the
MI-based feature selection and the SVM with all the features.

• Table 12 shows that the computational complexity of GA is significantly higher than
that of MI; however, the offline training with a high-speed server reduces the
computational burden of the proposed system.
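
The three metrics reported in Table 11 can be illustrated with a minimal Python sketch. It assumes the standard one-vs-rest confusion-matrix definitions of accuracy, false positive rate (FPR) and false negative rate (FNR); the `ConfusionCounts` container, the function names and the counts are hypothetical and are not taken from the experiments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """One-vs-rest confusion counts for a single attack class (hypothetical container)."""
    tp: int  # attack records correctly flagged as this attack
    tn: int  # other records correctly left unflagged
    fp: int  # other records wrongly flagged as this attack
    fn: int  # attack records that were missed


def accuracy(c: ConfusionCounts) -> float:
    # Fraction of all records that were classified correctly.
    return (c.tp + c.tn) / (c.tp + c.tn + c.fp + c.fn)


def false_positive_rate(c: ConfusionCounts) -> float:
    # Fraction of non-attack records that raised a false alarm.
    return c.fp / (c.fp + c.tn)


def false_negative_rate(c: ConfusionCounts) -> float:
    # Fraction of attack records that went undetected.
    return c.fn / (c.fn + c.tp)


# Made-up counts for illustration only; not data from the paper.
counts = ConfusionCounts(tp=90, tn=880, fp=20, fn=10)

print(f"accuracy = {accuracy(counts):.4f}")
print(f"FPR      = {false_positive_rate(counts):.4f}")
print(f"FNR      = {false_negative_rate(counts):.4f}")
```

A high accuracy can coexist with a high FNR when attack records are rare, which is one reason to read the three rows of Table 11 together rather than accuracy alone.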
The communication overhead of the proposed system is evaluated by the size of the support
vectors generated by the SVM classifier. In SVM, only the support vectors are exchanged
among the nodes; however, in other machine learning algorithms such as neural network and
naïve Bayesian algorithms and certain clustering techniques, the entire dataset needs to be
transferred [32]. Table 13 presents the communication complexity analysis on the combined
dataset.
Table 13. Communication overhead analysis of proposed system with MI and SVM on combined dataset
IDS Techniques                                                 Normal    Jamming   Data      Hello     Blackhole  Wormhole
                                                                                   Flooding  Flooding
Intrusion detection using GA + support vector machines         151 × 13  192 × 14  62 × 12   158 × 12  130 × 11   85 × 14
Intrusion detection using MI + support vector machines         131 × 20  167 × 20  78 × 20   91 × 20   85 × 20    86 × 20
Intrusion detection using support vector machines with all     145 × 20  173 × 20  81 × 20   87 × 20   86 × 20    86 × 20
features
The size of the support vectors in Table 13 reveals that the proposed system has less
communication overhead than MI-based feature selection and the SVM without feature
selection. Thus, from the security and performance analysis, it is observed that the
proposed system with GA-based informative feature selection exhibits a higher accuracy of
attack detection and requires less time for training the system than existing techniques,
and is suitable for intrusion detection in WMNs.
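
The comparison drawn from Table 13 can be reproduced with a short Python sketch. The entries are copied from Table 13; counting overhead as (number of support vectors) × (features per vector), that is, one scalar value per feature, is an assumption made here for illustration, and the names `TABLE_13`, `overhead` and `totals` are hypothetical.

```python
# (support vectors, features per vector) for each class, copied from Table 13.
CLASSES = ["Normal", "Jamming", "Data flooding", "Hello flooding", "Blackhole", "Wormhole"]

TABLE_13 = {
    "GA+SVM": [(151, 13), (192, 14), (62, 12), (158, 12), (130, 11), (85, 14)],
    "MI+SVM": [(131, 20), (167, 20), (78, 20), (91, 20), (85, 20), (86, 20)],
    "SVM (all features)": [(145, 20), (173, 20), (81, 20), (87, 20), (86, 20), (86, 20)],
}


def overhead(entries):
    """Scalar values exchanged per class: support vectors times features (assumed measure)."""
    return {cls: n_vectors * n_features
            for cls, (n_vectors, n_features) in zip(CLASSES, entries)}


# Per-class overhead and the total over all six classes for each technique.
per_class = {tech: overhead(entries) for tech, entries in TABLE_13.items()}
totals = {tech: sum(values.values()) for tech, values in per_class.items()}

for tech in TABLE_13:
    print(tech, per_class[tech], "total:", totals[tech])
```

Under this counting, the GA-based rows sum to 9911 values, against 12760 for MI-based selection and 13160 for the SVM with all features. Hello flooding is the one class in which the GA entry (1896) is larger than the other two (1820 and 1740), because more support vectors are retained for that class.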
6. Conclusion
In this paper, the local informative features of each category of attacks are selected using
GA and SVM classifiers for developing an IDS for a WMN. The use of the informative features
of each category of attacks yields a higher accuracy of detection than the use of the common
informative features, for particular attacks. The performance of the proposed feature
selection algorithm is analysed by comparing it with MI-based feature selection techniques
using generated intrusion datasets and standard datasets, namely, ADFA-LD and CICIDS2017.
The comparison of the results demonstrates that the proposed system using GA-based feature
selection exhibits higher accuracy, less computational complexity and less communication
overhead, and is suitable for providing security to WMNs.
References
[1] J. Kim, D. Kim, K. Lim, Y. Ko, S. Lee, Improving the Reliability of IEEE 802.11s Based
Wireless Mesh Networks for Smart Grid Systems, J. Commun. Netw. 14(6) (2012) 629–639.
[2] X. Wang, P. Yi, Security framework for wireless communications in smart distribution
grid, IEEE Trans. Smart Grid 2(4) (2011) 809–818.
[3] H. Nguyen, G. Scalosub, R. Zheng, On quality of monitoring for multichannel wireless
infrastructure networks, IEEE Trans. Mob. Comput. 13(3) (2014) 664–677.
[4] C. Cortes, V. Vapnik, Support-vector networks, Machine Learning 20(3) (1995) 273–297.
[5] W.S. McCulloch, W. Pitts, A logical calculus of the ideas immanent in nervous
activity, Bulletin of Mathematical Biophysics 5(4) (1943) 115–133.
[6] M. Montazeri, H.R. Naji, M. Montazeri, A. Faraahi, A novel memetic feature selection
algorithm, in: Information and knowledge technology, 2013 International conference on,
IEEE, 2013, pp. 295–300.
[7] B. Zhang, J.F. Liu, X.L. Tang, Scale-based local feature selection for scene text
recognition, International Journal of Advanced Research in Artificial Intelligence 3(4)
(2014) 18–23.
[8] L. Jie, S. Po, Naive bayesian classifier based on genetic simulated annealing algorithm,
Procedia Engineering 23 (2011) 504–509.
[9] C. Huang, C. Wang, A GA based feature selection and parameter optimization for support
vector machines, Expert Syst. Appl. 31 (2006) 231–240.
[10] J. Huang, Y. Cai, X. Yu, A hybrid genetic algorithm for feature selection wrapper based
on mutual information, Pattern Recogn. Lett. 28(13) (2007) 1825–1844.
[11] P. Ghamisi, J.A. Benediktsson, Feature selection based on hybridization of genetic
algorithm and particle swarm optimization, IEEE Trans. Geosci. Remote. 12(2) (2015)
309–313.
[12] P. Shunmugapriya, S. Kanmani, A hybrid algorithm using ant and bee colony
optimization for feature selection and classification, Swarm and Evolutionary Comput. 36
(2017) 27–36.
[13] D.E. Goldberg, Genetic algorithms in Search, Optimization, and Machine Learning,
Addison-Wesley, Reading, 1989.
[14] D. Devaraj, Improved genetic algorithm for multi-objective reactive power dispatch
problem, Int. Trans. Electrical Energy Syst. 17(6) (2007) 569–581.
[15] S. Durairaj, D. Devaraj, P.S. Kannan, Voltage stability constrained reactive power
planning using improved genetic algorithm, Int. J. Water and Energy 1 (2006) 56–64.
[16] L. Wang, Feature selection with kernel class separability, IEEE Trans. Pattern Anal.
Machine Intelligence 30(9) (2008) 1534–1546.
[17] R.N. Khushaba, A. Al-Ani, A. Al-Jumaily, Feature subset selection using differential
evolution and a statistical repair mechanism, Expert Syst. Appl. 38(9) (2011) 515–526.
[18] Z. Liu, W. Hsiao, B.L. Cantarel, E.F. Drabek, C.F. Liggett, Sparse distance-based
learning for simultaneous multiclass classification and feature selection of metagenomic
data, Bioinformatics 27(23) (2011) 3242–3249.
[19] K. Driessens, S. Dzeroski, Combining model-based and instance-based learning for first
order regression, 2005 International conference on, ICML, 2005, pp. 193–200.
[20] Z. Liu, H. Bensmail, M. Tan, Efficient feature selection and multiclass classification
with integrated instance and model based learning, Evolutionary Bioinformatics 8 (2012)
197–205.
[21] Y. Zhang, L. Wang, W. Sun, R.C. Green II, M. Alam, Distributed intrusion detection
system in a multi-layer network architecture of smart grids, IEEE Trans. Smart Grid
2(4) (2011) 796–808.
[22] J.H. Holland, Adaptation in natural and artificial systems, MIT press, 1975.
[23] S.K. Biswas, M.M.A. Mia, Image reconstruction using multi layer perceptron and
support vector machine classifier and study of classification accuracy, Int. J. Scientific &
Technology Research 4(2) (2015) 226–231.
[24] A. Rahimi, B. Recht, Random features for large-scale kernel machines, in: Neural
information processing systems, 2007 International conference on, ACM, 2007,
pp. 1177–1184.
[25] D. Kim, A. Lee, Y. Cho, C.K. Toh, I. Lee, An efficient on-demand routing approach
with directional flooding for wireless mesh networks, J. Commun. Netw. 12(1) (2010) 67–73.
[26] D. Devaraj, B. Yegnanarayana, K. Ramar, Radial basis function networks for fast
contingency ranking, Electrical Power and Energy Syst. 24 (2002) 387–395.
[27] B. Mukherjee, L.T. Heberlein, K.N. Levitt, Network intrusion detection, IEEE Netw.
8(3) (1994) 26–41.
[28] G. Creech, J. Hu, Generation of a new IDS test dataset: time to retire the KDD
collection, in: Wireless Communications and Networking, 2013 International Conference on,
IEEE, 2013, pp. 4487–4492.
[29] I. Sharafaldin, A. Habibi Lashkari, A.A. Ghorbani, Toward generating a new intrusion
detection dataset and intrusion traffic characterization, in: Information Systems Security and
Privacy, 4th International Conference on, ICISSP, 2018.
[30] X. Gan, J. Duanmu, J. Wang, W. Cong, Anomaly intrusion detection based on PLS feature
extraction and core vector machine, Knowl. Based Systems 40 (2013) 1–6.
[31] H.H. Soliman, N.A. Hikal, N.A. Sakr, A comparative performance evaluation of intrusion
detection techniques for hierarchical wireless sensor networks, Egypt. Inform. J. 13 (2012)
225–238.
[32] D. Evans, A computationally efficient estimator for mutual information, Proc. The Royal
Soc. 464 (2008) 1203–1215.
R. Vijayanand is a Research Scholar in the Department of Computer Science and
Engineering, Kalasalingam University, Krishnankoil, India. He received his B.E. in Computer
Science and Engineering from PTR College of Engineering and Technology, Madurai, and his
M.Tech. in Computer Science and Engineering from Kalasalingam University. Currently, he is
pursuing his doctoral degree in the field of security in smart meter data transmission.

Dr. D. Devaraj is a Senior Professor in the Department of Electrical and Electronics
Engineering, Kalasalingam University, Tamilnadu, India. He completed his B.E. in
Electrical & Electronics Engineering and his M.E. in Power System Engineering in 1992 and
1994, respectively, from Thiagarajar College of Engineering, Madurai. He obtained his Ph.D.
degree from IIT Madras in 2001. He has guided more than 18 Ph.D. scholars. His
research interests include power system security, voltage stability, smart grid and
evolutionary algorithms.

Dr. B. Kannapiran is an Associate Professor in the Department of Instrumentation and
Control Engineering, Kalasalingam University, Tamilnadu, India. He received his Ph.D.
degree in Information and Communication Engineering from Anna University, Chennai, in
2013. He received his M.E. degree in Applied Electronics from Madurai Kamaraj
University in 2002 and his B.E. degree in Instrumentation and Control
Engineering from Madurai Kamaraj University in 2001. His research interests
include soft computing, fault diagnosis, biomedical instrumentation and wireless networks.
