
C4.5 based Sequential Attack Detection and Identification Model

Radhika Kumar, Anjali Sardana, R. C. Joshi
Information Security Laboratory
Department of Electronics and Computer Engineering
Indian Institute of Technology, Roorkee – 247667
{Anjlsfec, radhsdec, rcjosfec}@iitr.ernet.in
Introduction

CERT Statistics
 The Internet was designed for openness and functionality
 Failures can be accidental or intentional
 Examples:
  Denial of Service (DoS)
  Distributed Denial of Service (DDoS)
  Domain Name System attack
  IP Spoofing
  Sequence Number Hijacking

Figure: The number of total vulnerabilities catalogued from 1995 to 2006 (rising towards roughly 8,000 per year).
Figure: The number of Internet security incidents reported from 1988 to 2003 (labelled values include 82,094 and a peak of 153,140).
Service Denied to Legitimate Users

 Packets are dropped due to queue overflow
 Buffer overflow at the victim

Figure: Packets dropped under a DDoS attack. Legitimate and attack packets from edge routers in the stub domain converge through the transit domain onto the bottleneck link to the victim.
Motivation

Existing approaches to defend against attacks:
 Before the attack
  Prevention
 During the attack
  Detection and Characterization
 After the attack
  Response and Mitigation

All of these approaches suffer from various constraints.
Sequential Multi-Level Classification Model

 The objective is to find the natural hierarchy in the network traffic and to exploit the generic and differentiating characteristics of different attacks to build a more secure environment.
 A differential approach is used to detect one kind of attack at a time from the network traffic.
 A sequential model with different binary classifiers at each level, categorizing attacks in a step-by-step manner, is used.
 Rules are also generated at different levels of abstraction.
 The KDD99 dataset is used for evaluation.

Figure: Sequential binary-tree structure. Node 1 separates Class 1 and passes the remainder to Node 2; Node 2 separates Class 2 and passes the remainder to Node 3; Node 3 separates Class 3 from Class 4.
Mathematical Model

Traffic Feature Distribution

$X = \{n_i,\ i = 1, 2, 3, \ldots, N\}$

where $X$ is a random process in which flow $i$ occurs $n_i$ times.

Flow Id (i) | Number of Packets (n_i)
1           | n_1
2           | n_2
3           | n_3
:           | :
N           | n_N

$i$ is defined by one of the following traffic features in the packet header, or a combination of them:
 Source IP address
 Destination IP address
 Source Port
 Destination Port
 Layer 3 Protocol Type
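A minimal sketch of building this distribution from captured packets is given below. The packet-record field names and the choice of source IP address as the flow key are illustrative assumptions, not details from the paper.

```python
# Build X = {n_i} by counting packets per flow id, keying flows on one
# of the header fields listed above. Field names are illustrative.
from collections import Counter

def feature_distribution(packets, key=lambda p: p["src_ip"]):
    """Count n_i, the number of packets observed for each flow id i."""
    return Counter(key(p) for p in packets)

# Example: distribution over source IP addresses.
packets = [
    {"src_ip": "10.0.0.1", "dst_ip": "10.0.0.9", "proto": "tcp"},
    {"src_ip": "10.0.0.1", "dst_ip": "10.0.0.9", "proto": "tcp"},
    {"src_ip": "10.0.0.2", "dst_ip": "10.0.0.9", "proto": "udp"},
]
print(feature_distribution(packets))  # Counter({'10.0.0.1': 2, '10.0.0.2': 1})
```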
Mathematical Model: Basis of C4.5

1. Traffic Feature Measurement: Entropy

$H(X) = -\sum_{i=1}^{N} p_i \log_2 p_i$, where $p_i = \frac{n_i}{S}$ and $S = \sum_{i=1}^{N} n_i$

Entropy is a measure of the dispersal or concentration of a distribution. Its range is $(0,\ \log_2 N)$: $H(X) = 0$ if all observations are the same, and $H(X) = \log_2 N$ if $n_1 = n_2 = \ldots = n_N$.

2. Sampling

$\{X(t),\ t = j\Delta,\ j \in n\}$

 Δ is a constant time window
 n is the set of positive integers
 X(t) represents the number of packet arrivals for a flow in $\{t - \Delta,\ t\}$

Window | Flow 1  | Flow 2  | Flow 3  | ... | Flow N  | Entropy
Δ      | X(Δ,1)  | X(Δ,2)  | X(Δ,3)  | ... | X(Δ,N)  | H(Δ)
2Δ     | X(2Δ,1) | X(2Δ,2) | X(2Δ,3) | ... | X(2Δ,N) | H(2Δ)
3Δ     | X(3Δ,1) | X(3Δ,2) | X(3Δ,3) | ... | X(3Δ,N) | H(3Δ)
:      | :       | :       | :       |     | :       | :
nΔ     | X(nΔ,1) | X(nΔ,2) | X(nΔ,3) | ... | X(nΔ,N) | H(nΔ)

3. Traffic Feature Selection
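A minimal sketch of the per-window entropy computation, following the formula above. The function name and input format (a mapping from flow id to packet count, e.g. the Counter built earlier) are assumptions.

```python
# H(X) = -sum(p_i * log2(p_i)) with p_i = n_i / S for one time window.
import math

def window_entropy(counts):
    """Entropy in bits of a dict/Counter mapping flow id -> packet count."""
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total)
                for n in counts.values() if n > 0)

# All packets in one flow -> H(X) = 0; packets spread evenly over N
# flows -> H(X) = log2(N), matching the stated range.
print(window_entropy({"a": 10}))          # 0.0
print(window_entropy({"a": 5, "b": 5}))   # 1.0 = log2(2)
```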
C4.5: The Classification Algorithm

 The sequential nature of the proposed multi-level architecture needs binary classification at each level.
 C4.5 gives the highest overall accuracy as a single-level classifier compared to other single-level classifiers (Tavallaee, 2009; the classifiers were tested on the KDD99 dataset).
 C4.5 uses the concept of entropy to measure the impurity of data items, where $RF(C_j, S)$ is the relative frequency of class $C_j$ in the sample $S$:

$I(S) = -\sum_{j=1}^{x} RF(C_j, S) \log RF(C_j, S)$

 Information gain of a candidate test B that partitions S into subsets $S_1, \ldots, S_t$:

$G(S, B) = I(S) - \sum_{i=1}^{t} \frac{|S_i|}{|S|} I(S_i)$

 Split information, the denominator of the gain ratio:

$P(S, B) = -\sum_{i=1}^{t} \frac{|S_i|}{|S|} \log \frac{|S_i|}{|S|}$

 The test B that maximizes the gain ratio $G(S, B) / P(S, B)$ is then chosen as the current partitioning attribute. A sketch of this computation follows below.
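A minimal sketch of the gain-ratio criterion for a candidate categorical test, under the formulas above. The function and variable names are illustrative, not from the paper or from Quinlan's implementation.

```python
# Gain ratio G(S,B) / P(S,B) for the partition of a labelled sample S
# induced by a candidate categorical feature B.
import math
from collections import Counter

def entropy(labels):
    """I(S): impurity of a list of class labels, in bits."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def gain_ratio(labels, feature_values):
    total = len(labels)
    # Group labels by the value the candidate test assigns to each record.
    partitions = {}
    for value, label in zip(feature_values, labels):
        partitions.setdefault(value, []).append(label)
    # G(S,B) = I(S) - sum(|Si|/|S| * I(Si))
    gain = entropy(labels) - sum(
        len(part) / total * entropy(part) for part in partitions.values())
    # P(S,B) = -sum(|Si|/|S| * log2(|Si|/|S|))  (split information)
    split_info = -sum((len(part) / total) * math.log2(len(part) / total)
                      for part in partitions.values())
    return gain / split_info if split_info > 0 else 0.0

# Example: the 'flag' feature separates the classes perfectly here,
# so it scores a higher gain ratio than 'proto'.
labels = ["dos", "dos", "normal", "normal", "dos", "normal"]
proto  = ["tcp", "tcp", "udp", "udp", "tcp", "tcp"]
flag   = ["S0", "S0", "SF", "SF", "S0", "SF"]
print(gain_ratio(labels, proto), gain_ratio(labels, flag))
```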
KDD’99 Dataset

 KDD attacks fall into four main categories:
  DoS: Denial-of-Service attack
  Probe: Probing attack
  U2R: User-to-Root attack
  R2L: Remote-to-Local attack
 The KDD’99 dataset has 41 features, classified into three groups:
  Basic features
  Traffic features: time-based and host-based traffic features
  Content features
Sequential Classification

 Some observations:
  The DoS attack instances in the training data outnumber the combined Probe, U2R and R2L attacks, so DoS is the most common type of attack.
  The DoS attack is by nature characterized by time-based traffic features.
  The Probe attack is defined by host-based features.
  U2R and R2L attacks are detected by studying the content features of the data.
  Finally, they are all attacks, so they must have some common characteristics that distinguish them from normal traffic.

 The cascade therefore has four stages (a code sketch follows after this slide):
  First stage: separation of attack data from normal traffic on the basis of characteristics common to all attack traffic.
  Second stage: separation of the most common attack, the DoS attack, from the other three kinds using time-based features.
  Third stage: separation of Probe attacks from the other two kinds using host-based traffic features.
  Fourth stage: separation of U2R and R2L attacks using content features.

 Snapshot of the Level 4 classifier (trained on U2R and R2L attack data)
  Training results:
   Correctly Classified Instances: 98.9813%
   Incorrectly Classified Instances: 1.0187%
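A minimal sketch of the four-stage cascade under stated assumptions: scikit-learn's DecisionTreeClassifier stands in for C4.5 (the slides report Weka results), and the column slices for the time-based, host-based and content feature groups are placeholders for the corresponding KDD99 feature indices.

```python
# Four-level sequential classifier: each level is a binary tree trained
# only on the traffic that reaches it, using one feature group.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

TIME, HOST, CONTENT = slice(22, 31), slice(31, 41), slice(9, 22)  # assumed indices

def train_cascade(X, y):
    """X: numpy feature matrix; y: numpy array of labels
    'normal', 'dos', 'probe', 'u2r', 'r2l'."""
    m1 = DecisionTreeClassifier().fit(X, y != "normal")              # level 1
    atk = y != "normal"
    m2 = DecisionTreeClassifier().fit(X[atk][:, TIME], y[atk] == "dos")       # level 2
    rest = atk & (y != "dos")
    m3 = DecisionTreeClassifier().fit(X[rest][:, HOST], y[rest] == "probe")   # level 3
    last = rest & (y != "probe")
    m4 = DecisionTreeClassifier().fit(X[last][:, CONTENT], y[last] == "u2r")  # level 4
    return m1, m2, m3, m4

def classify(x, models):
    """Route one connection record through the cascade, top to bottom."""
    m1, m2, m3, m4 = models
    x = x.reshape(1, -1)
    if not m1.predict(x)[0]:
        return "normal"
    if m2.predict(x[:, TIME])[0]:
        return "dos"
    if m3.predict(x[:, HOST])[0]:
        return "probe"
    return "u2r" if m4.predict(x[:, CONTENT])[0] else "r2l"
```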
Evaluation Matrix

Actual Class | Classified as Normal | Classified as Attack
Normal       | True Negative (TN)   | False Positive (FP)
Attack       | False Negative (FN)  | True Positive (TP)

 Precision: the proportion of predicted positives/negatives that are actually positive/negative
  True alarm ratio: TP / (TP + FP)
  False alarm ratio: FP / (FP + TP)
 Recall: the proportion of actual positives/negatives that are predicted positive/negative
  Sensitivity, detection rate, alarm rate: TP / (TP + FN)
  False positive rate, false alarm rate: FP / (FP + TN)
  False negative rate: FN / (FN + TP)

These metrics are computed in the sketch below.
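A small helper computing the slide's metrics from the four confusion-matrix cells; the function name is illustrative. It is checked here against the Level 1 testing results reported on the next slide.

```python
# Metrics from the confusion matrix defined above.
def metrics(tn, fp, fn, tp):
    return {
        "accuracy":            (tp + tn) / (tp + tn + fp + fn),
        "true_alarm_ratio":    tp / (tp + fp),   # precision on attacks
        "false_alarm_ratio":   fp / (fp + tp),
        "detection_rate":      tp / (tp + fn),   # recall / sensitivity
        "false_positive_rate": fp / (fp + tn),
        "false_negative_rate": fn / (fn + tp),
    }

# Level 1 confusion matrix from the testing results:
print(metrics(tn=60253, fp=340, fn=22684, tp=227752))
# accuracy comes out to about 0.9260, matching the reported 92.5975%.
```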
Testing Results

 Level 1 (all data): attack vs. normal

Actual Class | Classified as Normal | Classified as Attack
Normal       | 60253                | 340
Attack       | 22684                | 227752

Correctly Classified Instances: 92.5975%

 Level 2 (attack traffic): DoS attack vs. other attacks

Actual Class  | Normal | Dos Attack | Other Attacks
Normal        | 0      | 83         | 257
Dos Attack    | 0      | 222524     | 795
Other Attacks | 0      | 435        | 3998

Correctly Classified Instances: 99.3117%

 Level 3 (other attacks): Probe attack vs. others

Actual Class  | Normal | Dos | Probe | Others
Normal        | 0      | 0   | 253   | 7
Dos Attack    | 0      | 0   | 358   | 471
Probe Attack  | 0      | 0   | 3086  | 0
Other Attacks | 0      | 0   | 347   | 527

Correctly Classified Instances: 71.5587%

 Level 4: U2R vs. R2L

Actual Class | Normal | Dos | Probe | U2R | R2L
Normal       | 0      | 0   | 0     | 1   | 6
Dos          | 0      | 0   | 0     | 0   | 471
Probe        | 0      | 0   | 0     | 0   | 0
U2R          | 0      | 0   | 0     | 9   | 8
R2L          | 0      | 0   | 0     | 2   | 508

Correctly Classified Instances: 51.4428%
Improvements in Training Dataset

 KDD99 10% training dataset and testing dataset distribution:

Class  | Training Set | Testing Set
Normal | 19.69%       | 19.48%
Probe  | 0.83%        | 1.34%
Dos    | 79.24%       | 73.90%
U2R    | 0.01%        | 0.07%
R2L    | 0.23%        | 5.20%

 The 10% KDD99 training dataset has a huge number of similar records for the DoS attack and normal traffic compared to the Probe, U2R and R2L attacks.
 The Level 1 classifier therefore gets biased towards the normal class.
 Testing result: high false negative rate of 9.95%.

 Improvements (a duplication sketch follows below):
  New dataset: the U2R, R2L and Probe data was duplicated 5 times.
  The Level 1 classifier was trained using this new dataset.
  Testing results of the Level 1 classifier on the earlier test dataset:
   The attack detection rate increased from 90.942% to 92.2515%.
   The accuracy percentage increased from 92.5975% to 93.5974%.
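A minimal sketch of the minority-class duplication, assuming the KDD99 training records sit in a pandas DataFrame whose 'label' column has already been mapped to the five categories used here. "Duplicated 5 times" is read as each minority record appearing 5 times in total; adjust `copies` if the intent was 5 extra copies.

```python
# Oversample the rare attack classes by plain record duplication.
import pandas as pd

def duplicate_minority(df, classes=("probe", "u2r", "r2l"), copies=5):
    """Repeat every record of the listed classes so each appears `copies` times."""
    minority = df[df["label"].isin(classes)]
    # Original rows plus (copies - 1) extra repetitions of the minority rows.
    return pd.concat([df] + [minority] * (copies - 1), ignore_index=True)
```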


Results after Improvements in Training Data

 Confusion matrix of the Level 1 classifier after data duplication:

Actual Class | Classified as Normal | Classified as Attack
Normal       | 60099                | 494
Attack       | 19405                | 231031

Correctly Classified Instances: 93.5974%

 Misuse and anomaly detection rates of the Level 1 classifier before and after data duplication:

True Positives                                               | Known Attacks      | New Attacks
In test dataset                                              | 220,525            | 29,911
Detected by Level 1 classifier (trained on original dataset) | 219,827 (99.6835%) | 7,905 (26.4618%)
Detected by Level 1 classifier (trained on new dataset)      | 220,525 (99.9832%) | 10,543 (35.2479%)

 The data duplication improved the misuse and anomaly detection rates from 99.6835% and 26.4618% to 99.9832% and 35.2479%, respectively.
Descriptive Modeling

 The advantage of the multi-level sequential approach is that we get small and easily interpretable trees.
 Rules can be derived from these decision trees at different levels of abstraction.
 These rules are in terms of the 41 features of the KDD dataset.
 E.g., a rule derived from the second classifier (expressed as code below):

If (% of connections to different services for the same host over the last 1000 connections < 0.1 and
    % of connections to different hosts for the same service over the past 1000 connections < 0.01 and
    number of connections to the same host in the past two seconds > 2)
=> DoS Attack
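The same rule written as a predicate. The feature names below are the conventional KDD99 names that appear to match the rule's descriptions (note the standard host-based KDD99 features are computed over the last 100 connections, while the slide says 1000), so treat the mapping as an assumption.

```python
# Rule from the level-2 tree as a predicate over one connection record,
# assuming conventional KDD99 feature names (an assumption, see above):
#   dst_host_diff_srv_rate     - % of connections to different services,
#                                same destination host
#   dst_host_srv_diff_host_rate - % of connections to different hosts,
#                                same service
#   count                      - connections to the same host in the
#                                past two seconds
def looks_like_dos(conn):
    return (conn["dst_host_diff_srv_rate"] < 0.1
            and conn["dst_host_srv_diff_host_rate"] < 0.01
            and conn["count"] > 2)
```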
Conclusion

 The model has a low false alarm ratio of 0.15%.
 Individual attack detection rates of 99.644% for DoS and 100% for Probe are achievable.
 The percentage accuracy for classification between U2R and R2L is as high as 98.1024%.
 The new dataset gives better results:
  Misuse detection rate of 99.9832% and anomaly detection rate of 35.2479%
 The generated trees are small, and rules are easy to derive from them at different levels of abstraction.
References
[1] S. Axelsson, "The Base-Rate Fallacy and the Difficulty of Intrusion Detection," ACM Transactions on Information and System Security, 2000.
[2] V. Corey et al., "Network Forensics Analysis," IEEE Internet Computing, vol. 6, no. 6, 2002, pp. 60–66.
[3] R. J. Henery, "Classification," in Machine Learning, Neural and Statistical Classification, D. Michie, D. J. Spiegelhalter, and C. C. Taylor (Eds.), Ellis Horwood, New York, 1994.
[4] E. Bloedorn, L. Talbot, C. Skorupka, A. Christiansen, W. Hill, and J. Tivel, "Data Mining Applied to Intrusion Detection: MITRE Experiences," in Proc. IEEE International Conference on Data Mining, 2001.
[5] Y. Ma, D. Choi, and S. Ata, Eds., Application of Data Mining to Network Intrusion Detection: Classifier Selection Model, ser. Lecture Notes in Computer Science, vol. 5297, Springer-Verlag, Berlin Heidelberg, Germany, 2008.
[6] M. Tavallaee, E. Bagheri, W. Lu, and A. A. Ghorbani, "A Detailed Analysis of the KDD CUP 99 Data Set," in Proc. IEEE Symposium CISDA'09, 2009.
[7] J. R. Quinlan, C4.5: Programs for Machine Learning, Morgan Kaufmann, San Mateo, California, 1993.
[8] Weka – Data Mining Machine Learning Software. [Online]. Available: http://www.cs.waikato.ac.nz/ml/weka/
[9] KDD Cup 1999 Data. [Online]. Available: http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html
[10] M. Sabhnani and G. Serpen, "Why Machine Learning Algorithms Fail in Misuse Detection on KDD Intrusion Detection Dataset," Intelligent Data Analysis, vol. 6, June 2004.
[11] K. Kendall, "A Database of Computer Attacks for the Evaluation of Intrusion Detection Systems," M.Eng. Thesis, Massachusetts Institute of Technology, Massachusetts, United States.
Thank You
