NCCS Technical Report v2


PERIODIC TECHNICAL REPORT

CRC LAB, BUIC

DR NAJAM UL ISLAM

BAHRIA UNIVERSITY ISLAMABAD

Date: 5th July, 2022

NATIONAL CENTER FOR CYBER SECURITY

A Project of Higher Education Commission and Planning Commission of Pakistan
Please read the instructions before filling in the form.
1. Please do not alter the layout of the application form. Information must be filled in the spaces
provided, under set format.
2. Guidance notes in various fields should not be deleted.
3. Required information should be duly filled in the specified fields.
4. Required heads/fields which are not relevant to the project should be marked N/A (Not
Applicable) or left blank and should not be deleted.
5. Specifications, justifications, purposes must be provided against each item in the Equipment
list.

CRC LAB, BUIC – A Partner Lab of National Center for Cyber Security, Pakistan 2
List of Abbreviations and Acronyms

NCCS National Centre for Cyber Security

NSC National Steering Committee

SIAB Scientific & Industrial Advisory Board

PI Principal Investigator

Co-PI Co-Investigator

SME Small and Medium-Enterprises

RBM Result Based Monitoring

List of Abbreviations and Acronyms Used by PI in the Proposal

(Please add abbreviations and acronyms in the table below, if any)

For further information, please contact:


National Center for Cyber Security, Air University, Sector E-9, Islamabad

Email: rmeo@nccs.pk, secretariat.nccs@gmail.com

Table of Contents
PART A - PROJECT REPORT

1. PROJECTS DETAILS
1.1. Project(s) Summary
1.1.1. Executive Summary
1.1.1. Input Parameters of the Lab
1.1.2. Detail of Equipment (as per KPI)
1.1.3. Detail of HR (complete HR, including PI, Co-PI and everyone else)
1.1.4. Output Parameters of the Lab
1.2. Expenditure Plan
1.3. Lab Visit Evaluation Summary

PART B – PERIODIC REPORT (TECHNICAL DETAILS)

2. Domain Name 1:
2.1. Milestone/Deliverable 1:
2.1.1. Description of Milestone
2.1.2. Literature Review (if relevant)
2.1.3. Methodology
2.1.4. Implementation, Analysis, theoretical and/or analytical models and results
2.1.4.1. Theoretical and/or analytical models or architectural Implementation
2.1.4.2. Implementation Details
2.1.4.3. Results (Actual and Perceived)
2.1.4.4. Analysis and Discussion
2.1.4.5. Testing
2.1.4.6. Additional Features Activities added
2.2. Milestone/Deliverable 2:
2.3. Milestone/Deliverable 3:

3. Domain Name 2:
3.1. Milestone/Deliverable 1:
3.1.1. Description of Milestone
3.1.2. Literature Review (if relevant)
3.1.3. Methodology
3.1.4. Implementation, Analysis, theoretical and/or analytical models and results
3.2. Milestone/Deliverable 2:
3.3. Milestone/Deliverable 3:

4. Domain Name 3:
4.1. Milestone/Deliverable 1:
4.1.1. Description of Milestone
4.1.2. Literature Review (if relevant)
4.1.3. Methodology
4.1.4. Implementation, Analysis, theoretical and/or analytical models and results
4.2. Milestone/Deliverable 2:
4.3. Milestone/Deliverable 3:

5. OUTCOMES
5.1. Labs Progress on KPIs
5.1.1. Publication Details (as per KPI)
5.1.2. Research Proposals (at least 2 per year)
5.1.3. International Collaboration (at least 1 per year)
5.1.4. Industrial Collaboration (at least 1 per year)
5.1.5. Dissemination and exploitation of results
5.1.6. Project Sustainability and its Impacts

6. Conclusion & LESSON LEARNT

7. FUTURE WORK and PLAN

8. Risk, issues and challenges

Structure of the Periodic Report

The periodic technical report must be submitted by the project PIs within two weeks following the end of each
reporting period.

The periodic technical report consists of two parts:

Part A of the periodic technical report contains the cover page, publishable summary and the answers to the
questionnaire covering issues related to the project implementations and its impact in the context of key
performance indicators and the milestones/deliverables committed in the NCCS PC-1 Document.

Part B of the periodic technical report is the narrative part that includes explanations of the work carried out by the
beneficiaries during the reporting period. Note: Part B should not exceed the ten (10) page limit (excluding Title page,
TOC, List of figures, Bibliography & Appendices).

Part A and Part B both need to be submitted as a PDF document following the template provided in this
document.

PART A - PROJECT REPORT

1. PROJECTS DETAILS

1.1. Project(s) Summary

1.1.1. Executive Summary


Please provide summary of the work performed from the beginning of the project(s) till date and results achieved (in chronological order)

Domain/Project Title (e.g. P1) | Time/Duration | Summary of Implementation done | Perceived Results | Actual Result
P1 | 0 – 6th months | | |
P1 | 7th – 12th months | | |
P1 | 13th – 18th months | | |
P1 | 19th – 24th months | | |
P1 | 25th – 30th months | | |
P1 | 31st – 36th months | | |
P1 | 37th – 42nd months | | |

1.1.1. Input Parameters of the Lab
Ser. | Input: Equipment Budget | Input: HR Budget

1.1.2. Detail of Equipment (as per KPI)

Ser. | Equipment Detail: Nomenclature | Equipment Detail: Life | Equipment Detail: Status | Project Name | Remarks

1.1.3. Detail of HR (complete HR, including PI, Co-PI and everyone else)

Ser. | Personnel Detail with start and end date of employment (students give details of degree with start and end dates) | Permanent Employment Status (Organization and appointment) | Qualification (PhD, MS, BS, Certification) | Project for which employed | Working Detail (Total hours / week and timings for each day of week)* | Remuneration Paid per month | Details of Tasks Assigned | Contribution towards overall objective of the Lab
Sub-columns of Working Detail: Working Schedule | Working Hour
Sub-columns of Details of Tasks Assigned: Task Assigned | Due Date | Status

Technical Problem Problem’s Perceived Resources Required Resources Released Impact
Problem likely Impact on Solution mitigation plan
cause(s) the project (if any)
Equipment Trained Other Equipment HR Other
HR

* Permanently employed personnel and students (not employed) must provide written consent, from the employer and the
educational institution respectively, for their employment at the lab, stating hours per week and timings.

1.1.4. Output Parameters of the Lab

Ser. | Output | Target as per KPI | Outcome | Impact
 | Journal Publications | | | Linkage to the project, deliverables, products, PC-1 objectives
 | Conference Publications | | | Networking with top tier researchers & Cyber Security experts, in addition to above
 | Non PSDP Research Fund from External Sources | 15 million (after 3 years completion) | | Self-Sustainability
 | Industrial Project Funding | 7.5 million (after 3 years completion) | | Self-Sustainability & Commercial Viability
 | Startups | 1 | | Solution to local problems, commercial viability
 | Trained Professional | 3 trainings per year (10 paid attendees per training) | | Skilled workforce & Knowledge workers
 | Indigenous capacity building | | |
 | Products / deliverables | | | Sponsored user, utility, user feedback, delivery date, potential users

1.2. Expenditure Plan

Domain | Total Releases | Total Expenditure on Equipment | Total Expenditure on HR | Remaining Amount with the lab | Utilization Plan up till next quarter
1 | | | | |
2 | | | | |
3 | | | | |

1.3. Lab Visit Evaluation Summary

Ser. Project Ongoing POC Completed Equipment Revenue Remarks


Completed Project Completed Product/ working Generated
Prototype condition

PART B – PERIODIC REPORT (TECHNICAL DETAILS)

(The information provided in this section will only be available to NCCS Secretariat, members of the NCCS NSC, and the NCCS Scientific and Industrial
Advisory Board)
Lab Project (s) Titles:

PI Name:
Domain: Duration Employment Milestone Deliverable [% Status Impact as
Start/End Completed] per PC-1
Date
Co PI name: 1-
0-6 Months 2-
7-12 Months 3-
12-18 Months
19-24 Months
24-30 Months
30-36 Months
37-42 Months
Co PI name: 1-
2-
0-6 Months 3-
7-12 Months
12-18 Months
19-24 Months
24-30 Months
30-36 Months
37-42 Months
Co PI name: 0-6 Months 1-
7-12 Months 2-
12-18 Months 3-
19-24 Months
24-30 Months
30-36 Months
37-42 Months

2. DOMAIN 1: DETECTION ENGINE

Sr. No. | Project Title/Domain | Prior / Existing Work prior to this quarter | Market Requirement | Project Progress (reporting period) | Status

2.1. Milestone/Deliverable 1:

2.1.1. Description of Milestone

During the last decade, attackers have compromised large numbers of victim systems to launch massive Distributed Denial of Service (DDoS) attacks against banking services, corporate websites, e-commerce businesses, and similar targets. Such attacks can cause enormous financial losses and deny service to authorised users. Many solutions have been proposed to counter DDoS attacks, but no ideal solution has been found to date. Most existing solutions were validated with simulation-based experiments, but the trend has now shifted to publicly available realistic datasets. In this research study, we therefore provide a comprehensive review of current datasets and propose a novel taxonomy for the classification of DDoS attacks. We also generated a new dataset, "CRCDDoS2022", designed to overcome the shortcomings of existing datasets. With the new dataset, we provide a new family classification and detection approach based on a set of network flow features. Lastly, we report the most important feature sets for detecting different kinds of DDoS attacks, along with their weights.

Cyber-attacks are currently a critical problem for internet-based interconnected systems, and DDoS attacks have emerged as a significant threat to Internet services [1]. In a DDoS attack, the attacker produces a huge amount of traffic and exhausts the resources of the victim systems. The attack is normally started by a single attacker who exploits and takes control of multiple devices called "zombies". The owners of these zombies do not know that their devices are compromised and being used for an attack. The attacker typically conducts a sweep to identify devices eligible to become zombies, such as a device with an open port, and then uses the zombie devices to launch the attack. Detection is difficult because the number of zombie devices can reach hundreds, thousands, or even millions [2]. Different techniques have been presented for the prevention of DDoS attacks, yet these attacks remain a significant threat to network security.

Existing solutions use anomaly-based and signature-based techniques for intrusion detection [3,4]. Signature-based Intrusion Detection Systems (IDS) and Intrusion Prevention Systems (IPS) play the most active roles in defending against cyberattacks but are mostly ineffective against zero-day and DDoS attacks. A signature-based system is easy to implement, but it is limited to known signatures. Current research shows that anomaly-based detection approaches are effective in intrusion detection, and they have received considerable attention from the research community in recent years. Anomaly-based approaches can use Machine Learning (ML) or deep learning (DL), both subsets of Artificial Intelligence (AI), to distinguish between benign and abnormal traffic. Telecom vendors are now focusing more on anomaly-based IDS solutions because of advances in computing power and the effectiveness of anomaly approaches in identifying cyber-attacks. According to [5], Palo Alto Networks released the first anomaly-based IDS in June 2020. However, the performance of an anomaly-based detection approach depends on useful training datasets: the more accurately the model learns, the more kinds of network attacks can be detected.
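
As a rough illustration of the anomaly-based idea described above, the sketch below learns a baseline from benign packet rates and flags measurements that deviate strongly from it. It is a simple statistical stand-in, not the ML/DL models discussed in this report; the function names, the sample rates, and the 3-sigma threshold are assumptions made for illustration.

```python
import statistics

def fit_baseline(benign_rates):
    """Learn the mean and standard deviation of packets/second from benign traffic."""
    mean = statistics.mean(benign_rates)
    stdev = statistics.pstdev(benign_rates)
    return mean, stdev

def is_anomalous(rate, baseline, threshold=3.0):
    """Flag a rate whose z-score against the baseline exceeds the threshold."""
    mean, stdev = baseline
    if stdev == 0:
        # Degenerate baseline: anything different from the constant rate is unusual.
        return rate != mean
    return abs(rate - mean) / stdev > threshold

# Hypothetical packets-per-second samples observed during normal operation.
benign = [120, 130, 125, 118, 140, 135, 128, 122]
baseline = fit_baseline(benign)

print(is_anomalous(131, baseline))   # ordinary load
print(is_anomalous(9500, baseline))  # flood-like burst
```

Unlike a signature, this check needs no prior knowledge of the attack, but its quality depends entirely on how representative the benign baseline is, which is the dataset dependency noted above.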

2.1.2. Literature Review (if relevant)

The DARPA dataset [1], proposed in 2000, consists of three datasets. The first contains a DDoS attack run by a novice attacker. The second, LLDOS 2.0.2, contains a DDoS attack run by a stealthy attacker. The third is the Windows NT attack dataset, which includes NT auditing of one day of traffic. Researchers extract features that serve as flags in DDoS attack detection by studying application-layer attack tools. A flow correlation coefficient has been defined that helps in the detection of flash mob DDoS attacks. The principle behind the flow correlation coefficient is that the flow standard deviation of attack traffic is lower than that of legitimate traffic. CAIDA [2] is a DDoS attack dataset proposed in 2007 that contains traffic traces. Details of the attack and the response to it are also available in pcap format. The traces are anonymized by removing payloads from the packets. Such traces are exceedingly difficult to gather because of IP spoofing, the stateless nature of IP routing, and similar factors.
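
To make the flow-statistics idea from this review concrete, the sketch below computes the standard deviation of a flow's packet counts per time window and the correlation between two flows. The reviewed work's exact definition of the flow correlation coefficient is not given in this report, so plain Pearson correlation is assumed here, and the packet counts are made-up values.

```python
import math

def std_dev(series):
    """Population standard deviation of packet counts per time window."""
    mean = sum(series) / len(series)
    return math.sqrt(sum((x - mean) ** 2 for x in series) / len(series))

def correlation(a, b):
    """Pearson correlation of two equal-length packet-count series."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm = math.sqrt(sum((x - mean_a) ** 2 for x in a) *
                     sum((y - mean_b) ** 2 for y in b))
    return cov / norm if norm else 0.0

# Hypothetical packets per window: two bot flows driven by the same tool,
# and two flows from independent human users.
bot_a = [50, 52, 51, 50, 52, 51]
bot_b = [49, 51, 50, 49, 51, 50]
user_a = [5, 40, 2, 18, 0, 33]
user_b = [22, 1, 30, 4, 27, 9]

print("bot std:", std_dev(bot_a), "user std:", std_dev(user_a))
print("bot correlation:", correlation(bot_a, bot_b))
print("user correlation:", correlation(user_a, user_b))
```

In this toy data the bot flows are steady and move together, while the user flows are bursty and unrelated, which is the contrast the coefficient is meant to expose.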

The BOUN dataset [3] is intended for evaluating network-based intrusion detection techniques. Traffic is captured by mirroring a campus router, and the recorded traffic is converted to a .csv file using Wireshark. The dataset includes two attack scenarios; in both, randomly generated spoofed destination IP addresses are used for flooding. In the TCP flood attack, port 80 is used as the destination port.

A realistic dataset has also been developed using Spirent Communications' state-of-the-art emulator, CyberFlood CF20, which fulfils the needs of the network topology. CyberFlood CF20 [4] is a user-friendly testing platform that generates realistic attack scenarios for testing IDS performance and scalability. In a DDoS attack, hundreds of zombies attack servers to consume network bandwidth. CF20 is also used to simplify network configuration and create the network topology needed to develop efficient datasets. CF20 also replaces other DDoS simulation tools such as LOIC and HOIC, because these tools create only limited attack vectors. A realistic dataset for intrusion detection must provide a set of prominent features and be paired with an efficient machine learning algorithm for detection.

CICIDS2017 [5], proposed in 2017, is a very comprehensive dataset for intrusion detection. Some datasets generate various kinds of attack traffic using a network topology with benign background traffic, and use CICFlowMeter to analyse the generated traffic. Some researchers simulated the traffic of 25 users with five different protocols to generate realistic datasets for intrusion detection.

2.1.3. Methodology

This section presents some taxonomies of DDoS attacks. Mirkovic and Reiher [6] discussed DDoS attack taxonomies and defence mechanisms. DDoS attacks are categorised by automation, vulnerability, source address validity, attack rate dynamics, victim, impact on the victim, and so on. Under the automation category, the attacker searches for vulnerabilities on a machine either manually or automatically. Bhardwaj et al. [7] proposed a taxonomy of DDoS attacks in cloud computing, with the categories degree of automation, attack impact, attack rate, and vulnerability. Another study identifies the same DDoS attacks and additionally identifies real-time response, throughput response time, request, and zero-day attacks. Masdari and Jalali [8] performed a detailed analysis of DDoS attacks in cloud computing. They identify the major types of DDoS attacks from the vulnerabilities that lead to them and then classify those attacks by cloud component, such as virtual machines and hypervisors. The most common types of cloud DDoS attacks are bandwidth attacks, connectivity attacks, resource exhaustion, and physical and data disruption. This research shows that DDoS attacks on the cloud are more severe because more resources are available. Singh et al. [9] concentrated on the HTTP-GET flood DDoS attack and categorised attacks as high-rate or low-rate.

Limitations

Existing datasets have several limitations. They cover only the application layer for DDoS attacks, and only one tool is used to generate the attacks. They do not give accurate results, and real-time implementation is not possible with them. The tools used in these datasets provide no prediction, and most of them return false positives. Detection results are provided for only a single trained model. The available datasets cannot perform efficiently in real-time scenarios, and no mechanism is provided to integrate them with an intrusion detection system for real-time detection and prediction.

2.1.4. Implementation, Analysis, theoretical and/or analytical models and results

We implemented three networks, named Attack-network, Client Network, and Victim-Network. The victim network includes an Ubuntu server, a capturing server, a Windows server, a firewall, a router, a network switch, and so on. The attacker network includes bots that generate DDoS attacks using tools such as Zero-day DDoS, Apache and Windows, LOIC, HOIC, Slowloris, DDoSIM, HULK, Goldeneye, Bonesi, Mirai Botnet and Tor Hammers. The attacker network generates traffic that is captured by the capture server placed in the victim network to generate the dataset.

After the dataset is generated and before training, Principal Component Analysis (PCA) is applied to check whether the dataset separates into two classes. PCA is a dimensionality reduction technique in machine learning, and here it is also used to visualize the data before training. The reason for using PCA is that algorithms fail to perform efficiently on high-dimensional data without feature reduction. The figure above shows the data visualized in two dimensions. A decision tree is also used to extract the best features. In addition, each feature is evaluated separately to measure the accuracy achievable on the dataset with that feature alone. After checking the accuracy of individual features, we create subsets of features to test dataset accuracy. The methodology proposed in this report is to combine the best features, those that give above 80% accuracy, and then generate the dataset.
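
The per-feature screening step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the lab's implementation: a one-feature threshold rule stands in for the classifier, accuracy is measured on the same toy rows used to pick the threshold, and the feature names and values are hypothetical.

```python
def stump_accuracy(values, labels):
    """Best accuracy of a one-feature threshold rule, in either direction."""
    best = 0.0
    for threshold in sorted(set(values)):
        predicted = [1 if v >= threshold else 0 for v in values]
        acc = sum(p == y for p, y in zip(predicted, labels)) / len(labels)
        # 1 - acc is the accuracy of the flipped rule (attack if below threshold).
        best = max(best, acc, 1.0 - acc)
    return best

def select_features(rows, labels, names, min_accuracy=0.8):
    """Keep only the features whose individual accuracy is above min_accuracy."""
    selected = {}
    for index, name in enumerate(names):
        column = [row[index] for row in rows]
        acc = stump_accuracy(column, labels)
        if acc > min_accuracy:
            selected[name] = acc
    return selected

# Hypothetical flow features: packets/second, mean packet length, destination port.
names = ["pkts_per_sec", "avg_pkt_len", "dst_port"]
rows = [
    (12, 540, 443), (20, 610, 80), (15, 480, 443), (18, 700, 80),      # benign
    (900, 60, 80), (1200, 64, 443), (850, 58, 80), (1100, 62, 443),    # attack
]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

selected = select_features(rows, labels, names)
print(selected)
```

In this toy data the destination port carries no class information and is dropped, while the two traffic-shape features pass the 80% bar. A real evaluation would score each feature on held-out data rather than on the training rows.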

2.1.4.1. Theoretical and/or analytical models or architectural Implementation


2.1.4.2. Implementation Details

We implemented three networks, named Attack-network, Client Network, and Victim-Network. The victim network includes an Ubuntu server, a capturing server, a Windows server, a firewall, a router, and a network switch, among other things. The attacker network includes bots that generate DDoS attacks using tools such as Zero-day DDoS, Apache and Windows, LOIC, HOIC, Slowloris, DDoSIM, HULK, Goldeneye, Bonesi, Mirai Botnet, and Tor Hammers. The attacker network generates traffic that is captured by the capture server placed on the victim network to generate the dataset.
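
Captured packets are usually turned into flow records before a flow-feature dataset can be built. The sketch below shows one common way to do that, grouping packets by 5-tuple and computing a few per-flow statistics. It is an assumption about the general technique, not a description of the lab's capture pipeline; the packet fields, feature names, and addresses are hypothetical.

```python
from collections import defaultdict

def aggregate_flows(packets):
    """Group packets into flows by 5-tuple and compute simple per-flow features."""
    flows = defaultdict(list)
    for pkt in packets:
        key = (pkt["src"], pkt["dst"], pkt["sport"], pkt["dport"], pkt["proto"])
        flows[key].append(pkt)

    features = {}
    for key, pkts in flows.items():
        times = [p["ts"] for p in pkts]
        total_bytes = sum(p["length"] for p in pkts)
        features[key] = {
            "packets": len(pkts),
            "bytes": total_bytes,
            "duration": max(times) - min(times),
            "mean_length": total_bytes / len(pkts),
        }
    return features

# Hypothetical captured packets (documentation-range IP addresses).
packets = [
    {"ts": 0.00, "src": "192.0.2.10", "dst": "198.51.100.5", "sport": 40000,
     "dport": 80, "proto": "TCP", "length": 60},
    {"ts": 0.01, "src": "192.0.2.10", "dst": "198.51.100.5", "sport": 40000,
     "dport": 80, "proto": "TCP", "length": 60},
    {"ts": 0.02, "src": "192.0.2.10", "dst": "198.51.100.5", "sport": 40000,
     "dport": 80, "proto": "TCP", "length": 60},
    {"ts": 0.50, "src": "192.0.2.20", "dst": "198.51.100.5", "sport": 51000,
     "dport": 80, "proto": "TCP", "length": 520},
]

flow_features = aggregate_flows(packets)
for key, feats in flow_features.items():
    print(key, feats)
```

Each resulting record, together with a benign or attack label, would form one row of a flow-based dataset.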


2.1.4.3. Results (Actual and Perceived)

The dataset presented in this report covers three types of DDoS attacks: volume-based attacks, protocol-based attacks, and application-layer attacks. Among volume-based attacks, it covers UDP Flood, ICMP Flood, and Spoofed-Packet Flood attacks. Among protocol-based attacks, it covers SYN Flood, Ping-of-Death, Smurf DDoS, and Fragmented Packet attacks. The dataset also covers application-layer attacks, including Slowloris, Zero-day DDoS, and Apache and Windows attacks, among others. The DDoS attack tools used for this dataset are LOIC, HOIC, Slowloris, DDoSIM, HULK, Goldeneye, Bonesi, Mirai Botnet, and Tor Hammers. Slowloris is used to send authorized traffic to the server through GET and POST requests. The main advantage of using Slowloris is that it sends partial packets instead of corrupted packets, and traditional intrusion detection systems cannot detect this type of attack efficiently. HULK is used to generate unique traffic that can bypass the cache server. LOIC is used to send customized TCP, UDP, and HTTP requests. One of the main reasons for using LOIC is that it can hide identity and lets us control zombie network computers. HOIC is used to launch DDoS attacks over the HTTP protocol and can attack up to 256 websites at once.


2.1.4.4. Analysis and Discussion

The results on this dataset indicate that the Support Vector Machine (SVM) classifier, combined with the proposed feature reduction, achieves an accuracy of 0.983 (98.3%) with 46 selected features. The logistic regression classifier with feature reduction achieves 0.987 (98.7%), and the linear classifier with feature reduction achieves up to 0.97 (97%). The decision tree with feature reduction achieves an accuracy of 1 (100%). A comparison of all classifiers therefore shows that the decision tree gives the best results when combined with feature reduction. Without feature reduction, the classifiers show lower accuracy.
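
The accuracy figures above are fractions of correctly classified samples, so 0.983 corresponds to 98.3%. The sketch below shows how such a comparison can be computed and reported in both forms. The labels, predictions, and classifier names are placeholders and do not reproduce the reported results.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

# Placeholder ground truth (1 = attack, 0 = benign) and classifier outputs.
y_true = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
predictions = {
    "classifier_a": [1, 1, 1, 0, 0, 0, 1, 0, 1, 0],
    "classifier_b": [1, 1, 0, 0, 0, 0, 1, 0, 1, 1],
}

scores = {name: accuracy(y_true, pred) for name, pred in predictions.items()}
best = max(scores, key=scores.get)

for name, score in scores.items():
    # Show the fraction and the equivalent percentage side by side.
    print(f"{name}: {score:.3f} ({score:.1%})")
print("best:", best)
```

A perfect score such as the decision tree's 1 is worth confirming on data that was not used for training or feature selection.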

The figure above shows the accuracy of multiple approaches. The red line shows the accuracy of ExtraTreeClassifier, which is the lowest of all. The green line shows the accuracy of subsets of features. Individual feature accuracy is also shown in the graph; some features show accuracy above 80%. This individual feature accuracy measurement helped us propose an innovative approach, called crcApproach, in which only those features that give 80% accuracy are selected and combined with the best classifier and feature reduction technique. With 46 selected features, crcApproach achieves an accuracy of 0.98 (98%).

2.1.4.5. Conclusion / Future work

This report presents a new dataset that covers 11 types of DDoS attacks for evaluating IDS algorithms and systems. We reviewed the existing datasets used to evaluate IDS algorithms and found limitations such as offline-only detection. The proposed dataset addresses the weaknesses of the existing datasets. One of its main advantages is that it supports real-time analysis. The proposed approach is effective because it selects the best features and uses feature reduction techniques to reduce computation. Most DDoS-generating tools anonymize the traffic and remove extra information from the packet. Almost 8 tools are used in this dataset to generate DDoS attack traffic. Four classifiers are compared, and the best-performing classifier is used along with feature reduction techniques to provide efficient results.

2.2. Milestone/Deliverable 2:

2.3. Milestone/Deliverable 3:

3. DOMAIN NAME 2: PREPROCESSOR


Sr. No. | Project Title/Domain | Prior / Existing Work prior to this quarter | Market Requirement | Project Progress (reporting period) | Status

3.1. Milestone/Deliverable 1:
3.1.1. Description of Milestone

Signatures are the core of any IDS/IPS. In the previous module we converted Snort and Suricata rules into a form compatible with our IDS. Now, as an extension of this module and to make Netspection more robust and advanced, YARA rules are planned to be migrated to detect the network footprints of malware and antiviruses. YARA rules form a large repository of known malware signatures, which can be used to detect the malicious activities of any malware over the network. For now, a YARA rule conversion methodology is proposed, to be implemented in future.

3.1.2. Literature Review (if relevant)

3.1.3. Methodology
The proposed methodology contains the following set of activities, in the order given below.

Malware has become one of the most severe cyber risks in recent years. Malware is any program that performs harmful acts, such as information theft or espionage. Kaspersky Labs (2017) defines malware as "a sort of computer program designed to infect a genuine user's machine and harm it in numerous ways". Anti-virus scanners cannot keep up with the rising diversity of malware, resulting in millions of hosts being infected. According to Kaspersky Labs (2016), 6,563,145 distinct hosts were targeted in 2015, with 4,000,000 malware items identified. Juniper Research (2016) projected that the worldwide cost of data breaches would reach $2.1 trillion by 2019.

YARA rules are a way of identifying malware (or other files) by creating rules that look for certain characteristics. YARA was originally developed by Victor Alvarez of VirusTotal and is mainly used in malware research and detection. It was developed to describe patterns that identify strains or entire families of malware.

Syntax

Each rule must start with the word rule, followed by the name or identifier. The identifier can contain any alphanumeric

character and the underscore character, but the first character is not allowed to be a digit. There is a list of YARA keywords

that are not allowed to be used as an identifier because they have a predefined meaning.

Condition

Rules are composed of several sections. The condition section is the only one that is required. This section specifies when

the rule result is true for the object (file) that is under investigation. It contains a Boolean expression that determines the

result. Conditions are by design Boolean expressions and can contain all the usual logical and relational operators. You can

also include another rule as part of your conditions.

Strings

To give the condition section a meaning, you also need a strings section. The strings section is where you define the strings that will be looked for in the file. Let’s look at a simple example.

rule vendor
{
    strings:
        $text_string1 = "Vendor name" wide
        $text_string2 = "Alias name" wide

    condition:
        $text_string1 or $text_string2
}

The rule shown above is named vendor and looks for the strings “Vendor name” and “Alias name”. If either of those strings

is found, then the result of the rule is true.
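The matching idea behind the vendor rule can be sketched in a few lines of Python. This is only a toy illustration and not the YARA engine; the helper names wide and vendor_rule_matches are hypothetical. It assumes that the wide modifier means each character is stored as two bytes, which for plain ASCII text is the same as UTF-16 little-endian encoding.

```python
# Toy evaluation of the `vendor` rule: it mimics the matching idea only.

def wide(text: str) -> bytes:
    """Encode text as two bytes per character, as the `wide` modifier expects."""
    return text.encode("utf-16-le")


def vendor_rule_matches(data: bytes) -> bool:
    """Return True if either wide string of the rule occurs in the data."""
    text_string1 = wide("Vendor name")
    text_string2 = wide("Alias name")
    # condition: $text_string1 or $text_string2
    return text_string1 in data or text_string2 in data


sample = b"\x00\x01" + wide("Alias name") + b"\xff"
print(vendor_rule_matches(sample))         # the wide string is present
print(vendor_rule_matches(b"Alias name"))  # plain ASCII only, so no wide match
```

Because the rule uses wide without ascii, a buffer that contains the string only in plain ASCII form does not satisfy the condition.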

There are several types of strings you can look for:

1. Hexadecimal, in combination with wild-cards, jumps, and alternatives.

2. Text strings, with modifiers: nocase, fullword, wide, and ascii.

3. Regular expressions, with the same modifiers as text strings.

There are many more advanced conditions you can use, but they are outside the scope of this report. More information can be found in the YARA documentation.

Memory analysis is used to detect malicious activity in many prominent cases. Signature matching is used to identify malicious content or code within memory. [1] proposed a mechanism for examining files as well as physical memory to identify malicious activity efficiently. In addition to the efficient identification of malware activity, it also focuses on creating new signatures for efficiently searching for malware in physical memory.

Ransomware is a sub-category of malware that attacks the data on target systems, blocking access to the user's data by encrypting the files in order to extort payment. [2] uses static and dynamic analysis to detect WannaCry ransomware intrusion. The features extracted from this analysis are used by Intrusion Detection Systems, with signature rules created by examining the WannaCry file.

Due to the exponential increase in internet traffic, intrusion detection in networks is emerging as a major challenge for network administrators. To minimize the risk of network intrusion, [3] presents a decision-tree-based mechanism for identifying network attacks in intrusion detection systems. This research uses a new dataset, the Kyoto 2006+ dataset, in which network traffic is classified as normal (legitimate traffic), attack (known intrusion), or unknown attack [3]. The J48 decision tree algorithm is used to classify the network traffic. The mechanism was trained and tested on a set of network traffic for the creation of IDS rules.

Network monitoring and network security are considered a major area of work, given the sensitive information travelling over computer networks. Intrusion Detection Systems are widely used to secure network traffic by allowing legitimate traffic and blocking unintended traffic. [4] presents a mechanism for automatic signature generation based on a hashing scheme. Malicious content is processed by the designed tools to create hash-based signatures, which are then populated in the IDS rule file. Snort rules follow a set syntax. The mechanism receives a malicious file and other IDS rule parameters as input and creates a fixed-length output, which is placed in the content section of the rule file so that the rule produces the desired output for the corresponding malware file.

A custom binary signature based on the source code of the original malware is used, and a custom database is created. ClamAV signatures are converted to YARA format automatically with a Python script. YARA is used to create rules that detect strings, instruction sequences, regular expressions, byte patterns, and so on. Today, various IDS, IPS (intrusion prevention system) and antivirus solutions use YARA rules to detect or prevent malware; its popularity comes from its simple and efficient way of writing rules. As soon as malware is found on a device, ClamAV mitigates it using YARA rules. Figure 2.6 shows a simple example of what YARA rule syntax looks like.

rule silent_banker : banker
{
    meta:
        description = "This is just an example"
        threat_level = 3
        in_the_wild = true

    strings:
        $a = {6A 40 68 00 30 00 00 6A 14 8D 91}
        $b = {8D 4D B0 2B C1 83 C0 27 99 6A 4E 59 F7 F9}

    condition:
        $a or $b
}

YARA is the industry standard for searching for patterns in malware samples. Malware analysts depend largely on YARA rules to identify specific threats, for example by scanning unknown samples for the pattern that characterizes a particular malware strain. YARIX, a more efficient methodology, is introduced by [5]. YARIX uses an inverted n-gram index that maps fixed-length byte sequences (n-grams) to the list of files containing them. To make queries over a corpus more efficient, YARIX optimizes the YARA search by translating the YARA rule into an index lookup that retrieves a set of potential candidate files for the rule. Because of the storage requirements that arise when indexing binary files, YARIX compresses its on-disk footprint with variable-byte delta encoding. The candidate set is then scanned with YARA to obtain the actual set of matching files. The index footprint is kept small by the compression techniques used, including a grouping-based compression scheme. As reported, the YARA search is sped up by five orders of magnitude, while the inverted n-gram index requires only 74% of the accumulated storage space of all indexed files.
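The two ideas described for YARIX, an inverted n-gram index and variable-byte delta encoding of its posting lists, can be sketched as follows. This is a simplified illustration written for this report and not the YARIX implementation: the n-gram length N = 4, the function names and the in-memory dictionary are all assumptions, and the grouping-based compression is not modelled.

```python
from collections import defaultdict

N = 4  # n-gram length; an assumed value for illustration


def ngrams(data: bytes, n: int = N) -> set:
    """All distinct byte n-grams of the data."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}


def build_index(files: list) -> dict:
    """Map every n-gram to the sorted list of file IDs that contain it."""
    index = defaultdict(list)
    for file_id, data in enumerate(files):
        for gram in ngrams(data):
            index[gram].append(file_id)  # file IDs arrive in increasing order
    return index


def varbyte_delta_encode(postings: list) -> bytes:
    """Store gaps between sorted IDs, 7 bits per byte; high bit set = more bytes follow."""
    out, prev = bytearray(), 0
    for file_id in postings:
        gap, prev = file_id - prev, file_id
        while gap >= 0x80:
            out.append((gap & 0x7F) | 0x80)
            gap >>= 7
        out.append(gap)
    return bytes(out)


def varbyte_delta_decode(blob: bytes) -> list:
    """Inverse of varbyte_delta_encode."""
    postings, current, value, shift = [], 0, 0, 0
    for byte in blob:
        value |= (byte & 0x7F) << shift
        if byte & 0x80:
            shift += 7
        else:
            current += value
            postings.append(current)
            value, shift = 0, 0
    return postings


def candidates(index: dict, pattern: bytes) -> set:
    """Files containing every n-gram of the pattern; YARA must still confirm them."""
    result = None
    for gram in ngrams(pattern):
        ids = set(index.get(gram, []))
        result = ids if result is None else result & ids
    return result or set()


corpus = [b"hello malware world", b"benign content", b"another malware sample"]
print(sorted(candidates(build_index(corpus), b"malware")))
print(varbyte_delta_encode([0, 300]).hex())
```

The candidate set can contain false positives, because the n-grams of a pattern may occur in a file without being adjacent; this is why the candidates are scanned with YARA afterwards.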

Technological advances are accompanied by many information security concerns: security, privacy, and integrity. Malware is one of the security problems that threaten computer systems. Ransomware is a kind of malicious software that threatens to publish the victim's data or block further access to it unless a ransom is paid. [2] investigates the WannaCry ransomware and detects it through static and dynamic analysis. Features of the malware are extracted from the analysis, and detection is performed using those features. The intrusion detection technique used in this study is YARA rule-based detection, which establishes a set of rules containing unique strings decoded from the WannaCry file. The proposed approach uses YARAGUI, a malware analysis tool that compares the rules against a chosen directory. The custom-written rules contain important strings that are compared with the directory; if a string is found in a file in the directory, the presence of the malware is confirmed.

[6] proposes a new approach to malware detection that produces static YARA signatures based on n-gram analysis. The proposed approach uses a genetic algorithm (GA) together with Artificial Intelligence (AI) methods to create YARA rules. The application of a GA to generating YARA rules is considered the main contribution of the work.

3.1.4. Implementation, Analysis, theoretical and/or analytical models and results

The proposed methodology consists of the following activities, in the order given. The figure below shows the framework, which:

a. Takes in YARA rules from the YARA rule repository.

b. Creates IDS (Snort) rules corresponding to the YARA rules.

c. Adds network features and parameters to the YARA ruleset for network monitoring.

d. Adds the automatically created rules to the IDS rule database.

Figure 1: Architecture

A high-level design of the proposed signature conversion framework is shown in Figure 1. The framework takes file-based YARA rules and performs two major tasks:

1. Conversion of YARA signatures to IDS-compatible signature options.

2. Addition of network features to formulate IDS rules.

Formulating IDS signatures from YARA rules comprises a series of steps: extracting YARA rules from the rule database, parsing them to extract the signature strings along with their conditions, and converting the extracted signatures to an IDS-compatible rule option. Network parameters are also an integral part of an IDS signature, so network parameters, together with the protocol and action, are added to the rule option to formulate a complete IDS rule.
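The steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the framework itself, whose implementation is still pending. The function names yara_strings and to_snort_rules are hypothetical, only plain text strings and hex strings without wildcards are handled, the condition is assumed to be a simple "or" of the strings (so one Snort rule is emitted per string), and the network header defaults (tcp, any addresses and ports) and the starting sid are placeholder values.

```python
import re


def yara_strings(rule_text: str) -> dict:
    """Extract `$name = "text"` and `$name = { hex }` definitions from a YARA rule."""
    found = {}
    pattern = r'(\$\w+)\s*=\s*(?:"([^"]*)"|\{([^}]*)\})'
    for name, text, hexa in re.findall(pattern, rule_text):
        if hexa:
            # Wildcards, jumps and alternatives are skipped in this sketch.
            if re.fullmatch(r"[0-9A-Fa-f\s]+", hexa):
                found[name] = bytes.fromhex("".join(hexa.split()))
        else:
            found[name] = text.encode("ascii")
    return found


def to_snort_rules(rule_name: str, strings: dict, proto: str = "tcp",
                   src: str = "any", dst: str = "any",
                   first_sid: int = 1000001) -> list:
    """Wrap each extracted string in a Snort rule: header plus content option."""
    rules = []
    for offset, (name, raw) in enumerate(sorted(strings.items())):
        hex_bytes = " ".join(f"{b:02X}" for b in raw)  # Snort hex notation: |6A 40|
        options = (f'msg:"{rule_name} {name.lstrip("$")}"; '
                   f'content:"|{hex_bytes}|"; sid:{first_sid + offset}; rev:1;')
        rules.append(f"alert {proto} {src} any -> {dst} any ({options})")
    return rules


example = '''
rule silent_banker : banker
{
    strings:
        $a = {6A 40 68 00 30 00 00 6A 14 8D 91}
        $b = "UVODFRYS"
    condition:
        $a or $b
}
'''
for line in to_snort_rules("silent_banker", yara_strings(example)):
    print(line)
```

All bytes are written in hex notation so that characters with a special meaning inside a Snort content option never need escaping. A condition that requires several strings at once would instead map to several content options in a single rule.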

The detailed flow diagram for the conversion of YARA signatures to IDS-compatible signatures is shown in the figure below.

Conversion of YARA signature to IDS compatible rule options

So far, the literature has been reviewed and the proposed methodology presented above. Implementation, deployment, and testing are still pending and are planned as future tasks.

3.1.4.1. Analysis and Discussion

As attacks become more sophisticated, detecting them also becomes more complicated. An IDS alone cannot detect such attacks efficiently. A solution is needed that can correlate multiple attack patterns and generate an event from them. In future we plan to work on SIEM solutions. A SIEM is capable of detecting modern attack patterns and can even detect techniques and procedures.

3.2. Milestone/Deliverable 2:

3.3. Milestone/Deliverable 3:

4. DOMAIN NAME 3: EBPF

Sr. No. | Project Title/Domain | Prior / Existing Work prior to this quarter | Market Requirement | Project Progress (reporting period) | Status

3.

4.

4.1. Milestone/Deliverable 1:

4.1.1. Description of Milestone

The development of new technologies has opened new horizons for monitoring and analyzing network traffic. Modern solutions such as the Extended Berkeley Packet Filter (eBPF) differ clearly from conventional techniques and enable more customized and more efficient filtering. These technologies can significantly increase or decrease system performance, because such frameworks operate entirely in the lowest layer of the operating system, the kernel. Network-based Intrusion Detection/Prevention Systems (IDPS) such as Snort and Bro passively monitor network traffic obtained from network Terminal Access Points. Most IDPS are signature based. On large networks, the drop rate increases due to limitations in IDPS packet capture and processing. High throughput results in overheads, and IDPS buffers start to drop packets, which can pose serious threats to the network. IDPS are mostly attacked with volumetric DDoS, which pushes the network traffic volume beyond the reception and processing capacity of the IDPS and causes it to drop packets due to buffer overflows. To overcome this threat, the proposed solution, iKern, uses eBPF and Virtual Network Functions (VNF) to examine and filter packets at kernel level before forwarding them to user space.

Extended Berkeley Packet Filter (eBPF)

eBPF is a technology that makes the Linux kernel programmable by injecting fragments of code at different locations in the kernel[1][2]. eBPF is used to safely and efficiently extend the capabilities of the kernel without changing kernel source code or loading kernel modules. An eBPF program can be injected at runtime and is then verified to make sure that it cannot crash and cannot get caught in an infinite loop[9]. However, this type of verification is only possible for programs that are not Turing-complete. eBPF programs therefore cannot contain loops of arbitrary length; every loop must have a maximum iteration count. Backward jumps in the code are not allowed in general. This means that eBPF can only be used to implement algorithms that do not require Turing-completeness.
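The rule on backward jumps can be illustrated with a toy verifier. This is a deliberately simplified sketch and not the kernel's verifier, which performs many more checks; the instruction format (tuples such as ("jmp", offset)) and the instruction limit used as the default are assumptions made for illustration.

```python
def verify(program: list, max_insns: int = 4096) -> bool:
    """Accept a toy program only if it is short, ends with exit, and never jumps backward.

    Each instruction is a tuple: ("op",) for a plain instruction, ("jmp", offset)
    for a relative jump (target = index + 1 + offset), and ("exit",) to terminate.
    """
    if not program or len(program) > max_insns:
        return False
    if program[-1] != ("exit",):
        return False
    for index, insn in enumerate(program):
        if insn[0] == "jmp":
            offset = insn[1]
            if offset < 0:  # backward jump: a potential unbounded loop
                return False
            if index + 1 + offset >= len(program):  # jump outside the program
                return False
    return True


print(verify([("op",), ("jmp", 0), ("exit",)]))   # forward jump only
print(verify([("op",), ("jmp", -2), ("exit",)]))  # jumps back to the first instruction
```

With only forward jumps and a bounded program length, every execution path reaches the final exit instruction after a bounded number of steps, which is what makes termination checkable.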

eBPF programs are written in C and then compiled to eBPF bytecode. Once injected into the kernel, the bytecode first undergoes verification and is then compiled to native code. eBPF is particularly suitable for packet processing. When a packet reaches a network interface, specific actions can be performed on it, such as dropping the packet. This is essential for programs such as firewalls or tcpdump, which records packets according to certain filters. For example, if packets coming from port 80 should be recorded, tcpdump compiles a BPF program that encodes this filter and loads it into the kernel. The kernel then drops all packets that do not match the filter, and only the matching ones are passed to tcpdump. The alternative is that tcpdump receives all packets and filters them itself. The disadvantage is that each packet has to be passed individually from the kernel to tcpdump, which involves duplicating the whole packet in memory along with other computation steps. This is why passing packets between programs and the kernel should be avoided where possible: performance suffers. eBPF helps overcome this problem[3]. Since eBPF bytecode is compiled to native code, it should in general run at a speed comparable to other kernel code.

eBPF programs mostly use specific, safe data structures. This can impose a performance penalty, because checking the bounds of an array requires extra work every time it is accessed. An alternative to eBPF is to use kernel modules. Kernel modules, however, have the drawback that they cannot be checked for stability and have to be compiled for a specific kernel version. Apart from this, kernel module development is not straightforward, and at times it is not possible to extend specific functionality in the kernel with a module without changing the kernel itself. Recompiling the whole kernel is very hard[2].

• A & B is the code for execution
• C is the userspace code of BPF
• D is the BPF bytecode to be inserted into the kernel
• The verifier first verifies the bytecode
• BPF applies the filtering conditions to incoming traffic and stores data in Maps (G)

1.2 – Volumetric DDoS attacks used to overload IDS

1.2.1 UDP Flood Attack

A UDP flood is any DDoS attack that floods a target with User Datagram Protocol (UDP) packets[4]. The goal of the attack is to flood random ports on a remote host. This causes the host to repeatedly check for an application listening at each port and, when no application is found, to reply with an ICMP ‘Destination Unreachable’ packet. This process can eventually make the host inaccessible [3].

1.2.2 ICMP Flood Attack

The principle is similar to that of the UDP flood attack. An ICMP flood overwhelms the target resource with ICMP Echo Request packets, sending them as fast as possible without waiting for replies [5]. This attack can consume both outgoing and incoming bandwidth and completely block the path. The victim's servers will try to respond with ICMP Echo Reply packets, which slows down the whole system.

1.2.3 SYN Flood Attack

This attack exploits a known weakness in the TCP connection sequence, the three-way handshake. A SYN request to start a TCP connection with a host must be answered by a SYN-ACK response from that host and then confirmed by an ACK response from the requester. In this attack scenario, the requester sends a large number of SYN requests but either does not acknowledge the host's SYN-ACK responses or sends the SYN requests from a spoofed IP address. Either way, the host system keeps waiting for the acknowledgements, new connections cannot be made, and the result is denial of service and system slowdown [3].
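The half-open connections that a SYN flood leaves behind can be tracked with a small amount of state per source. The following sketch is an illustration only and not the detection logic of the proposed system; the class name SynTracker, the flag representation (a set of flag names) and the threshold value are assumptions.

```python
from collections import defaultdict


class SynTracker:
    """Count handshakes that a source has started but not yet completed."""

    def __init__(self, threshold: int = 100):
        self.threshold = threshold
        self.half_open = defaultdict(set)  # source IP -> {(src_port, dst_port), ...}

    def on_packet(self, src_ip: str, src_port: int, dst_port: int, flags: set) -> bool:
        """Process one client-to-server TCP packet; return True if src_ip looks like a flooder."""
        key = (src_port, dst_port)
        if flags == {"SYN"}:
            self.half_open[src_ip].add(key)      # handshake started
        elif "ACK" in flags:
            self.half_open[src_ip].discard(key)  # handshake completed
        return len(self.half_open[src_ip]) > self.threshold


tracker = SynTracker(threshold=3)
for port in range(1000, 1004):
    suspicious = tracker.on_packet("192.0.2.7", port, 80, {"SYN"})
print(suspicious)
```

Counting per source address is only effective against a non-spoofed flood. When the SYN requests carry spoofed addresses, as described above, the count would have to be kept per destination instead.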

1.2.4 Ping of Death

A ping of death (“POD”) attack involves the attacker sending multiple malformed or malicious pings to a computer. The maximum length of an IP packet, including the header, is 65,535 bytes. However, the Data Link Layer usually imposes a limit on the maximum frame size, for example 1500 bytes over an Ethernet network [5]. A large IP packet is therefore split into multiple smaller IP packets, known as fragments, and the receiving host reassembles these fragments into the complete packet. In a Ping of Death scenario, the fragments are crafted so that the receiving host ends up with a packet larger than 65,535 bytes after reassembly. This can overflow the memory buffer allotted for the packet, leading to denial of service.
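The arithmetic behind the oversized packet can be made concrete. The IPv4 Fragment Offset field is 13 bits wide and counts in units of 8 bytes, so the last fragment can be placed at an offset of up to 8191 x 8 = 65,528 bytes; any fragment payload that extends past byte 65,535 produces an illegal packet on reassembly. The function names below are hypothetical and the 20-byte header length is the usual IPv4 header without options.

```python
MAX_IP_PACKET = 65535        # limit of the 16-bit Total Length field
MAX_FRAGMENT_OFFSET = 8191   # limit of the 13-bit Fragment Offset field (8-byte units)


def reassembled_size(fragment_offset: int, fragment_payload_len: int,
                     header_len: int = 20) -> int:
    """Size of the packet once the last fragment is placed at its offset."""
    return header_len + fragment_offset * 8 + fragment_payload_len


def is_ping_of_death(fragment_offset: int, fragment_payload_len: int,
                     header_len: int = 20) -> bool:
    """True if the last fragment would push the reassembled packet past the limit."""
    return reassembled_size(fragment_offset, fragment_payload_len, header_len) > MAX_IP_PACKET


# Last fragment at offset 8189 (65,512 bytes) carrying 1480 bytes of payload:
print(reassembled_size(8189, 1480))   # 20 + 65512 + 1480 = 67012
print(is_ping_of_death(8189, 1480))
print(is_ping_of_death(185, 1480))    # an ordinary second fragment on Ethernet
```

A filter that evaluates this check on each fragment can reject the malformed fragment before reassembly is attempted.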

1.2.5 Fragmentation Buffer Overload Attack

In this type of attack, the attacker sends many fragments of packets that are never completed. All these fragments are stored in the IDS buffer and stay there until the remaining parts arrive. When the buffer is full, the oldest fragment is deleted. If the attacker manages to fill the IDS buffer before the target host's fragment timeout expires, the attacker can then send the fragments that complete the packet [4]. The target host will reassemble the packet, but the IDS will not.

1.2.6 Computational Power Exhaustion Attack

The attacker sends an enormous number of packets into the target network without a definite purpose. These packets are crafted so that they require a large amount of computational power to process. An IDS that is not powerful or fast enough to process them may skip some of the resource-intensive packets altogether[6].

1.3 – Volumetric DDoS attack tools

1.3.1 HULK

HULK stands for HTTP Unbearable Load King. It is a flood attack tool aimed at web servers and was created solely for research purposes[7]. It can bypass the cache engine and is capable of generating unique and obfuscated traffic. It creates a huge amount of traffic at the web server.

1.3.2 SLOWLORIS

Slowloris sends legitimate HTTP traffic straight to the server. It does not affect other services and ports of the target network. The attack aims to keep as many connections to the server open as possible[7]. It does so by sending partial requests and holding the connections open as long as possible. While the server keeps these false connections open, its connection pool fills up, and as a result it denies requests from legitimate connections[8].

1.3.3 LOIC

LOIC stands for Low Orbit Ion Cannon. It is a free and popular tool for DDoS attacks and is quite easy to use. It sends UDP, TCP, and HTTP requests to the server and can perform the attack based on the IP address or URL of the server[8]. Within seconds, the website slows down and then stops responding to genuine requests. LOIC does not hide the attacker's IP address. Using a proxy server does not help, because the proxy server then becomes the target and stops working.

1.3.4 RUDY

RUDY stands for R-U-Dead-Yet. This tool attacks through a long form-field submission using the POST method. It has an interactive console menu[7]. The forms to use for the POST-based DDoS attack can be selected from the URL, and the form fields for data submission are identified. Data with a long content length is then injected into the form at a very slow rate.

1.4 – Objectives

The objectives of the proposed research are:

a) To detect and mitigate volumetric DDoS flood attacks in the Linux kernel.

b) To analyze large traffic floods within the kernel at 1 Gbps throughput and integrate traffic balancing to reduce CPU overheads.

c) To create a Virtual Network Function using eBPF to enable programmability of the Linux kernel from user space.

d) To create eBPF maps within the kernel to store the drop rules generated from bytecode.
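The role of a map that stores drop rules, as named in objective (d), can be modelled in user space with ordinary dictionaries. The sketch below is only a conceptual model written in Python; it is not eBPF code and does not use any eBPF API. The class name DropRuleMap, the per-source packet counter and the fixed packet limit per time window are assumptions chosen for illustration.

```python
from collections import defaultdict


class DropRuleMap:
    """Per-source packet counters plus a set of sources whose packets are dropped."""

    def __init__(self, max_packets_per_window: int):
        self.limit = max_packets_per_window
        self.counters = defaultdict(int)  # source IP -> packets seen in current window
        self.drop_rules = set()           # source IPs with an active drop rule

    def on_packet(self, src_ip: str) -> str:
        """Return the verdict for one packet and update the stored state."""
        if src_ip in self.drop_rules:
            return "DROP"
        self.counters[src_ip] += 1
        if self.counters[src_ip] > self.limit:
            self.drop_rules.add(src_ip)   # generate a drop rule for this source
            return "DROP"
        return "PASS"

    def new_window(self) -> None:
        """Reset the counters; intended to be called once per time window."""
        self.counters.clear()


rules = DropRuleMap(max_packets_per_window=3)
print([rules.on_packet("198.51.100.9") for _ in range(5)])
print(rules.on_packet("198.51.100.10"))
```

Once a drop rule exists, the verdict for that source is a single lookup, which is the property that makes such a table attractive for filtering before packets reach user space.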

4.1.2. Literature Review (if relevant)

Various attack detection and prevention mechanisms have been proposed based on user space and kernel space detection models. Signature-based and anomaly-based intrusion detection systems detect malicious traffic using detection engines that operate in the user space of an operating system. Kernel space detection models require the creation of Virtual Network Functions (VNFs) and the injection of bytecode from user space to make the kernel programmable for executing customized functions using eBPF. In this literature review, attack detection models based on user space Intrusion Detection Systems and kernel space attack detection based on Virtual Network Functions are discussed. Existing work in the domain can be divided into two categories, Kernel Space Detection Model (KDM) and User Space Detection Model (UDM), as shown in Figure 2.1.

Kernel Space Attack Detection Model (KDM)

In the research conducted by Sebastiano Miano[9], the Polycube framework is proposed. It is a software framework whose main aim is to bring Network Functions Virtualization (NFV) to in-kernel packet processing applications. It enables a wide range of customization and flexibility. Polycube helps create complex and arbitrary network function chains, where each function includes an efficient in-kernel data plane and a flexible user space plane. The network functions of Polycube, known as cubes, can be generated dynamically and then injected into the kernel networking stack using the AF-PACKET socket, without requiring customized kernels or specific kernel modules. This simplifies introspection and debugging. The injected cubes use eBPF functions that identify and then drop DDoS attack traffic. The Polycube framework is shown in Figure 2.2.

Marcos A. M. Vieira [10] describes fast packet processing with eBPF and XDP. The paper discusses the BPF and eBPF machines and gives an overview of the eBPF system provided by the Linux kernel, the currently available hooks, and some results of recent research. It focuses on filtering network traffic on the basis of the TCP protocol. The author developed a program that can be injected into the Linux kernel and exploits eBPF to increase network monitoring performance by defining filters using the TCP protocol. It uses eBPF to drop packets that do not contain a TCP segment and mitigates the TCP reset attack. The program is presented from two perspectives: the higher-level C code and the eBPF instructions generated by the compilation process. The program is designed to be loaded directly into the XDP hook, which is why the input parameter of the function must be of a structure type. The bytes of the packet being processed are delimited by the data and data end pointers, which must be used throughout the program to access the packet. Using these pointers, the headers can be parsed with the standard header files provided by Linux.
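The parsing pattern described here, checking every access against the end of the packet before reading a header, can be modelled in Python. This is not the program from [10] and not eBPF code: integer offsets stand in for the data and data end pointers, the function name xdp_filter is hypothetical, only Ethernet with IPv4 is handled, and dropping non-IPv4 frames is an assumption of this sketch.

```python
import struct

XDP_DROP, XDP_PASS = 1, 2        # same numeric values as the kernel's XDP action codes
ETH_HLEN = 14                    # Ethernet header length in bytes
ETH_P_IP = 0x0800                # EtherType for IPv4
IPPROTO_TCP = 6                  # IP protocol number for TCP


def xdp_filter(packet: bytes) -> int:
    """Pass packets that carry TCP over IPv4; drop everything else."""
    data, data_end = 0, len(packet)   # offsets standing in for the two pointers

    if data + ETH_HLEN > data_end:    # bounds check before touching the header
        return XDP_DROP
    (eth_type,) = struct.unpack_from("!H", packet, data + 12)
    if eth_type != ETH_P_IP:
        return XDP_DROP

    ip = data + ETH_HLEN
    if ip + 20 > data_end:            # minimum IPv4 header must fit in the packet
        return XDP_DROP
    protocol = packet[ip + 9]         # Protocol field of the IPv4 header
    return XDP_PASS if protocol == IPPROTO_TCP else XDP_DROP


ethernet = bytes(12) + struct.pack("!H", ETH_P_IP)
ipv4 = bytearray(20)
ipv4[0], ipv4[9] = 0x45, IPPROTO_TCP
print(xdp_filter(ethernet + bytes(ipv4) + bytes(20)))
```

In a real XDP program the verifier rejects any packet access that is not preceded by such a comparison against the end pointer, which is why the checks appear before each header is read.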

The research conducted by Luca et al. [11] showcases how eBPF can be used to trace and monitor the behavior of software, as well as network traffic, with the aim of identifying stegomalware. To demonstrate the efficacy of the idea, they evaluated the use of eBPF for gathering data in two use cases. The first shows how eBPF can be used to track specific system calls while an attack based on a conspiring applications scheme is ongoing. In the second, an eBPF program was developed to evaluate the behavior of the Flow Label field when it is used to implement a covert channel in bulk IPv6 traffic.

User Space Attack Detection Model (UDM)

Josy Elsa Varghese[12] proposes an IDS framework for DDoS attacks in an SDN environment. The suggested approach presents a DDoS detection framework that uses a single statistical parameter within the SDN architecture, built on the DPDK framework, with the attack detection algorithm developed in user space. The framework addresses the contradictory relationship between the SDN architecture and DDoS attacks, as well as the drawbacks of IDS in high-speed networks. In addition, the detection algorithm gives a quick prediction of attacks with good detection performance. The experimental results show that the framework achieves a successful trade-off between efficiency and detection effect in a high-speed network. The CICDDoS2019 dataset is used to generate attacks for testing the framework. The proposed solution performed at 720 Mbps throughput with 96.59% accuracy.

In the research conducted by Sumit Badotra [13], a DDoS detection system is implemented with the help of the SNORT Intrusion Detection System (IDS) in the Opendaylight (ODL) and Open Networking Operating System (ONOS) controllers. To analyze the behavior of the implemented DDoS detection tool, various scenarios with varying numbers of hosts, switches, and generated data traffic are used. Traffic is generated with penetration tools such as hping3 and nping, while the Mininet emulation tool is used to vary the number of switches and hosts. The DDoS detection tool was finally evaluated on the basis of the number of packets dropped, the number of packets received, and the efficiency of CPU utilization. Figure 2.5 shows the experimental setup of the proposed system.

It has been observed that Kernel Space Attack Detection Models (KDM) clearly outperformed User Space Attack Detection Models (UDM) in terms of volumetric attack detection accuracy, packet reception rate, packet drop rate, and CPU utilization. UDM systems, however, showed better accuracy on average-sized networks, whereas their CPU overhead and packet drop rate increased on larger networks. Detection engines operating in user space required more processing power than Virtual Network Functions (VNF) running inside kernel space. In the kernel space approach, packets received by the network card are copied to the kernel and the attack is detected before the flow is forwarded to other processes serving userland applications, which reduces CPU overhead. A comparison of related work based on threat detection accuracy, data acquisition method, and achieved network throughput is shown in Table 2.1.

Title                    Year  VNF   DAQ       KDM/UDM  Mbps  CPU%  Accu%
eBPF-Based VNF..         J,21  ebpf  afpacket  KDM      420   100   98.76
Processing with eBPF..   J,20  ebpf  libpcap   KDM      178   100   97.42
Kernel-level trac..      J,21  ebpf  libpcap   KDM      120   95    98.2
IDS for DDoS..           J,21  ebpf  dpdk      UDM      570   97    96.89
SNORT based DDoS..       J,21  -     afpacket  UDM      280   100   97.9
eBPF for non-intr..      C,21  ebpf  libpcap   KDM      84    100   98.86
IDS for..                C,20  -     libpcap   UDM      85    100   97
DDoS attacks..           J,20  -     libpcap   UDM      140   100   99.85

Methodology

The overall goal of the research is to provide an innovative network monitoring technique that adopts recent technologies such as eBPF and PF_RING to detect and mitigate Volumetric and Multi Vector flood attacks at high throughput. The proposed solution must be as versatile and flexible as possible, allowing the creation of networking probes that dynamically adapt to user needs by changing the filtering program at runtime and exporting the requested metrics. Traditional approaches to monitoring network traffic rely on IDPS systems. A more optimized approach to using IDPS is proposed, in which initial traffic monitoring and filtering take place at kernel level by exploiting the eBPF system along with a kernel ring buffer for acquiring network traffic on multiple cores.

In order to detect and mitigate Volumetric and Multi Vector flood attacks, traffic monitoring and filtering are implemented at Linux kernel level to avoid application overheads and the CPU utilization of IDPS signature matching algorithms. This is performed using modern techniques, the Linux eBPF system and PF_RING, which together form a hybrid framework for monitoring network traffic over high-speed networks. The solution is intended for large networks where the data rate reaches around 1 Gbps. Detection and mitigation of flood attacks are performed at kernel level before the packets are sent to user space applications such as IDPS[8][22]. Detecting attacks on high-speed networks and large data flows is a challenging task. It requires an efficient Data Acquisition Module that receives packets without dropping them. If packets are dropped due to CPU overheads or buffer overflows, the network is put at great risk of resource exhaustion and data loss. The first phase of the methodology is to implement a Data Acquisition Module with load balancing capability, running instances on multiple cores that share the load of incoming traffic.

The proposed system comprises separate modules, as shown in Figure 3.1. The Data Acquisition Module (DM) is designed to capture packets over large network flows. Volumetric and multi-vector floods are generated to test the capability of the DM. A multi-core implementation of the DM, the Streamed Data Acquisition Module (SDM), applies load-balancing techniques. Incoming traffic is inspected by an in-kernel detection module built from eBPF actions and eBPF bytecode injected from user space, which together create a Virtual Network Function (VNF).

Volumetric Multi Vector Attack Generation

To evaluate the performance of the proposed system, the CIC-DDoS2019 [13] data set is used along with the attack generation tools most commonly used by attackers to launch floods. The data set consists of benign traffic and up-to-date common volumetric DDoS attacks in the form of pcaps, and resembles real-world data. The included floods are UDP, TCP, PortMap, NetBIOS, LDAP, MSSQL, SYN, DNS, NTP and SNMP. In addition to this data set, some of the most common and effective tools for launching volumetric and multi-vector attacks are also tested, such as HOIC, RUDY, HULK and LOIC [11]. These tools can launch attacks with large-volume flows and exhaust the resources of the target system. They were used to test the data acquisition and attack detection capability of the proposed system under high-rate attacks.

iKern filtering and detection module

iKern is loaded inside the Linux kernel to detect volumetric and multi-vector floods using the iKern detection algorithm and iKern drop rules. iKern is directly connected to the Data Acquisition Module (DM) and receives the network packets sniffed by the PF_RING socket. The iKern architecture consists of multiple modules operating within the Linux kernel. NAPI copies packets from the NIC to the circular buffer. Incoming packets are inspected within the iKern Engine by the eBPF Virtual Network Function (VNF), which is created by injecting eBPF bytecode from user space.

The iKern detection algorithm and drop rules are stored in eBPF maps. Every time new eBPF bytecode is inserted into the Linux kernel, the verifier examines it for compatibility and for any other issue that could compromise the state of kernel space. Once verified, the bytecode is loaded together with its eBPF maps; it includes the iKern algorithm and the iKern drop rules written for detecting the attacks. Incoming traffic received at the ring buffer is matched against the eBPF map attributes; on any malicious activity, an iKern drop rule is triggered to drop those packets and send an alert to user space. Traffic forwarded from the VNF to the ring-buffer-aware libpcap has therefore already been filtered of volumetric and multi-vector floods. The figure shows the internal architecture of iKern detection along with the HDAQ module.

iKern filtering and detection module

The iKern detection algorithm checks whether traffic entering the system is malicious or normal. When it senses a volumetric or multi-vector attack, iKern specifies the type and priority of the attack (high priority or low priority) and sends alerts containing the attacker's address, the port number and the attack type. A high-volume flood is detected if the following equation holds, and it is flagged as a high-priority attack.
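The detection equation itself is not reproduced in this text, and the sketch below does not attempt to restore it. It only illustrates, in user-space Python, the general shape of a rate-based check that produces alerts with the fields named above (attacker address, port, attack type, priority). The thresholds `HIGH_PPS` and `LOW_PPS`, the function name `classify_window` and the fixed one-second window are all hypothetical choices made for illustration, not values taken from iKern.

```python
from collections import Counter

# Hypothetical thresholds in packets per second; the report does not state them.
HIGH_PPS = 10_000
LOW_PPS = 1_000


def classify_window(packets, window_seconds=1.0):
    """Classify the traffic observed in one time window.

    packets: iterable of (src_ip, dst_port, protocol) tuples seen in the window.
    Returns a list of alert dictionaries for sources whose rate crosses a threshold.
    """
    counts = Counter()
    for src_ip, dst_port, protocol in packets:
        counts[(src_ip, dst_port, protocol)] += 1

    alerts = []
    for (src_ip, dst_port, protocol), n in counts.items():
        rate = n / window_seconds
        if rate >= HIGH_PPS:
            priority = "high"
        elif rate >= LOW_PPS:
            priority = "low"
        else:
            continue  # below both thresholds: treated as normal traffic
        alerts.append({
            "src": src_ip,
            "port": dst_port,
            "attack": f"{protocol} flood",
            "priority": priority,
            "pps": rate,
        })
    return alerts
```

In an in-kernel version the per-source counters would live in an eBPF map rather than a Python `Counter`; the sketch only shows the decision logic.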

1.1.1.1. Results

For the initial tests, generated traffic is filtered using the iKern filter mechanisms supported by eBPF. Traffic filters are based on IP addresses, protocols and ports; for example, a UDP filter is applied to filter all UDP packets received at the NIC. The screenshots below show the different filtering mechanisms. The initial phase of testing consists of evaluating the iKern DM and comparing it with the default libraries. All tests use the same parameters for the tested libraries and the iKern DM. Socket clustering, which receives packets and balances the load across multiple instances, is provided by integrating the PF_RING socket into the iKern DM.
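To make the idea of balancing load across multiple capture instances concrete, the following Python sketch assigns each packet to an instance by hashing its flow tuple, so that all packets of one flow (in both directions) reach the same instance. This is an illustration of the general technique only: the function name `pick_instance` and the use of CRC32 are assumptions, and the sketch does not claim to reproduce the hash that PF_RING clustering actually applies.

```python
import zlib


def pick_instance(src_ip, dst_ip, src_port, dst_port, protocol, n_instances):
    """Map a flow to one of n_instances capture instances.

    The two endpoints are sorted first, so that A->B and B->A packets of the
    same flow hash to the same instance (a symmetric flow hash).
    """
    first, second = sorted([(src_ip, src_port), (dst_ip, dst_port)])
    key = f"{first[0]}:{first[1]}|{second[0]}:{second[1]}|{protocol}".encode()
    return zlib.crc32(key) % n_instances


# Both directions of one UDP flow land on the same instance.
forward = pick_instance("10.0.0.1", "10.0.0.2", 40000, 53, "UDP", 4)
reverse = pick_instance("10.0.0.2", "10.0.0.1", 53, 40000, "UDP", 4)
```

Keeping a flow on one instance matters for detection, because per-flow counters then never need to be shared between cores.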

1.1.1.2. Conclusion & Future Work

iKern uses a PF_RING socket to capture network packets at a high data rate, and iKern filtering processes the specified traffic in the kernel before packets are sent to user space. Detected threats and malicious packets are dropped using eBPF filters. The iKern filter code is injected into the Linux kernel as eBPF bytecode, and eBPF maps store the actions to apply to detected IP addresses or protocols.

In the future, this research can be extended to detect volumetric attacks on larger network flows with throughput above 1 Gbps. The multi-core Streamed Data Acquisition Module (SDM) is scalable by using 10 Gbps multi-queue network interface cards. The research can also be extended to other cyber attacks, whose detection can be implemented in kernel space using eBPF and Virtual Network Functions. Signature matching performed by IDPS detection engines in user space can be moved into kernel space using a VNF, decreasing CPU overhead and resource utilization.

1.2. Milestone/Deliverable 2:

1.3. Milestone/Deliverable 3:

5. DOMAIN NAME 4: HARDWARE

Sr. Project Title/ Prior / Existing Work prior to Market Requirement Project Progress Status
No. Domain this quarter (reporting period)

5.

6.

6.1. Milestone/Deliverable 1:

6.1.1. Description of Milestone

In this report, a complete hardware-based Netspection IDS is designed to achieve high performance, throughput, and data rate. In a hybrid IDS there is a communication delay between software and hardware, so to achieve a high-speed NIDS all the modules shown in figure 3 (i.e., packet capturing, packet decoding, packet preprocessing, and the pattern matching engine) need to be implemented on the FPGA. In terms of hardware implementation, the packet capturing module captures incoming internet traffic (as packets) using an FPGA board, while the packet decoder splits captured packets into packet header and payload. The preprocessing module organises the split packets for the event detection engine. Finally, the event detection module performs the most computationally intensive parts; among these, pattern matching is a complex and time-consuming process. So far, these modules have been implemented on the Spartan-6 (SP605) evaluation kit, and the prototype is now being ported to a high-performance board, the NetFPGA SUME, to achieve higher speed.

7.1.1. Literature Review (if relevant)

The exponential increase in malicious activities over the internet causes security threats. Several software-based applications and hardware-based communication devices are commonly available to protect networks against security threats and attacks. Hardware-based solutions are preferred because they offer stronger security provisions and maximise the throughput of the communication devices. Communication devices therefore require scalable hardware architectures for network security to protect against threats and attacks. Scalability is also important because of the ever-increasing number of attack types. As a result, many researchers [1-49] have created scalable hardware architectures to implement intrusion detection engines or accelerators.

The term scalability refers to the range of capabilities for processing the computations involved in intrusion detection engines. An intrusion is an unauthorised entry into a network. An intrusion detection system (IDS) is a software application or communication device capable of monitoring communication devices or incoming network traffic for malicious activities [50]. IDSs are of two types: (1) network intrusion detection systems (NIDS) and (2) host intrusion detection systems (HIDS). The former monitors the incoming traffic from the internet source, while the latter monitors the operating system files [51]. The focus of this review is on existing scalable architectures for the NIDS environment.

Network intrusion detection systems contain packet capturing, packet decoding, packet preprocessing, and event detection (engine) modules [52]. In terms of hardware implementations, the packet capturing module captures the incoming internet packets using a network interface card (NIC), while the packet decoder module splits the captured packet into a packet header and payload. The packet header can be analysed further to obtain the five-tuple: (1) source IP, (2) destination IP, (3) source port, (4) destination port, and (5) protocol. The preprocessing module organises the incoming packets for the event detection module. Finally, the event detection module is in charge of the most computationally intensive part, pattern matching, which is required for the development of NIDS [1]-[49].

3.1.1. Methodology

Pattern matching is the process of comparing a set of incoming characters with the elements of the stored patterns in a database [53]. Broadly speaking, there are two types of pattern matching: (1) string matching (SM) and (2) regular expression (RegEx) matching. SM matches a set of fixed strings against a stream of received characters, whereas RegEx patterns describe regular languages built using character classes over a fixed alphabet [54]. The choice of pattern matching algorithm or technique is determined by the needs of the target application, such as incoming Ethernet traffic in network security [55], protomata comparison in computational biology [56], and data mining in artificial intelligence [57]. This study looks at how different pattern-matching algorithms can be used to make networks safer.

[Figure: Pattern Matcher taxonomy. (i) Hardwired-based pattern matching: utilizes logic cells; Aho-Corasick algorithm. (ii) Memory-based pattern matching: utilizes BRAMs of the FPGA; Bit-Split algorithm.]

Aho-Corasick (AC) Algorithm

The AC algorithm is a string-search algorithm developed by Alfred V. Aho and Margaret J. Corasick. It is a dictionary-matching algorithm that detects elements of a finite set of strings within the input text. The algorithm's time complexity is $O(n + m + z)$, where $n$ is the length of the strings, $m$ is the length of the input text, and $z$ is the total number of outputs [12].

The AC algorithm builds a finite-state machine that resembles a trie with additional links between the internal nodes. These additional links permit fast transitions to other branches of the trie that share the longest common prefix when a match fails, so the automaton can move between nodes without backtracking. When the signature set is known in advance, the automaton can be constructed once off-line and then reused. In that scenario, the run time is proportional to the length of the input text plus the number of matched outputs. All signatures are merged into a single deterministic finite automaton (DFA), so that processing time is independent of the size of the signature set. The AC algorithm comprises three functions: the goto function, the failure function, and the output function. Fig. 1 shows the finite state machine for the signature set app, apple, aim, cap, and cat. Goto transitions are shown by solid lines, while failure transitions are shown by dotted lines. The input character is consumed when the DFA follows a goto transition edge. If no valid goto transition is found and the current node is not the root, the DFA follows the failure pointer without consuming the character. If no failure pointer is defined, it refers to the root state by default. If the current node is the root and no goto transition is valid, the input character is discarded. Whenever the DFA reaches an output node, it generates an output signal.
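The three functions described above can be sketched in a few lines of Python. This is a minimal software illustration of the textbook algorithm using the signature set from Fig. 1, not the hardware implementation used in the project; the names `build_automaton` and `search` are chosen here for illustration.

```python
from collections import deque


def build_automaton(patterns):
    """Build the goto, failure and output functions for a set of patterns."""
    goto = [{}]      # goto[state][char] -> next state (the trie edges)
    fail = [0]       # fail[state] -> state to fall back to on a mismatch
    output = [[]]    # output[state] -> patterns recognised at this state

    # Goto function: insert every pattern into the trie.
    for pattern in patterns:
        state = 0
        for ch in pattern:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                output.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        output[state].append(pattern)

    # Failure function: breadth-first over the trie, depth-1 states fail to root.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Output function: inherit patterns that end at the failure state.
            output[nxt] += output[fail[nxt]]
            queue.append(nxt)
    return goto, fail, output


def search(text, automaton):
    """Return (start_index, pattern) for every occurrence found in text."""
    goto, fail, output = automaton
    state, hits = 0, []
    for i, ch in enumerate(text):
        # Follow failure pointers without consuming the character.
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for pattern in output[state]:
            hits.append((i - len(pattern) + 1, pattern))
    return hits


automaton = build_automaton(["app", "apple", "aim", "cap", "cat"])
matches = search("a cat ate an apple", automaton)
```

The automaton is built once and then reused for every input, which corresponds to the off-line construction described above.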

Memory-Based Bit-Split Algorithm

A String-Matching Engine. At a high level, our algorithm works by breaking the set of strings down into groups and building a

small state machine for each group. Each state machine is in charge of recognising a subset of the strings from the rule set.

The first concern is that building a state machine from any general regular expression can, in the worst case, require an

exponential number of states. We get around this problem by exploiting the fact that we are not matching general regular

expressions but rather a proper and well-defined subset of them for which we can apply the Aho-Corasick algorithm [Aho

and Corasick 1975]. The other problem is that if we are not careful, we will need to support 256 possible out-edges (one for

each possible byte) on each and every node on the state machine. This results in a huge data structure that can neither be

stored nor traversed efficiently. We solve this problem by bit-splitting the state machines into many smaller state machines,

which each match only one bit (or a small number of bits) of the input at a time (in parallel). Our architecture is built

hierarchically around the way that the sets of strings are broken down. The full device is at the highest level. Each device

holds the entire set of strings that are to be searched, and each cycle the device reads in a character from an incoming

packet and computes the set of matches. Matches can be reported either after every byte, or can be accumulated and

reported on a per-packet basis. For the purposes of this paper, we will focus on a single device. Inside each device is a set

of rule modules. The left side of Figure 1 shows how the rule modules interact with one another. Each rule module acts as a

large state machine, which reads in bytes and outputs string match results. The rule modules are all structurally equivalent,

being configured only through the loading of their tables, and each module holds a subset of the rule database. As a packet

flows through the system, each byte of the packet is broadcast to all of the rule modules, and each module checks the

stream for an occurrence of a rule in its rule set. Because throughput, not latency, is the primary concern of our design, the

broadcast has limited overhead because it can be deeply pipelined, if necessary. The full set of rules is partitioned between

the rule modules. The way this partitioning is done has an impact on the total number of states required in the machine and

will, hence, have an impact on the total amount of space required for an efficient implementation. Finding an efficient

partition is discussed in Section 3. When a match is found in one or more of the rule modules, that match is reported to the

interface of the device so that the intrusion detection system can take the appropriate actions. It is what happens inside

each rule module that gives our approach both high efficiency and throughput. Each rule module is made up of a set of tiles.

The right hand side of Figure 1 shows the structure of each and every tile in our design. When working together, tiles are responsible for the actual implementation of a state machine that really recognises a string in the input. If we just generated a state machine
Core Concept of Bit-Split:

Each node of the AC state machine has 256 possible outgoing edges, which makes the state machine large and its memory requirement high (of the order of kilobytes). The data therefore needs to be compressed before it is stored, and this is done with the Bit-Split technique. The Bit-Split algorithm reduces the number of branches by dividing the state machine into smaller state machines. For example, an 8-bit input character requires 2^8 = 256 branches per state in the AC algorithm. If the input character is split into two 4-bit parts (8/2 = 4), then each split FSM (S-FSM) needs only 2^4 = 16 branches to match its part of the input character, so with two FSMs a total of 16 + 16 = 32 branches is needed. The complete bit-split memory-based architecture is shown in the figure.
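The branch-count arithmetic above generalises to other split widths. The short Python helper below (the name `edges_per_state` is hypothetical) reproduces the 256 versus 32 figure and shows what finer splits would give. It counts outgoing edges per state only; the number of states in each split machine can differ from the original machine and is not modelled here.

```python
def edges_per_state(char_bits=8, n_machines=1):
    """Total outgoing edges per state when a char_bits-wide input symbol
    is split evenly across n_machines smaller state machines."""
    if char_bits % n_machines:
        raise ValueError("char_bits must divide evenly across the machines")
    bits_each = char_bits // n_machines
    return n_machines * (2 ** bits_each)


unsplit = edges_per_state(8, 1)      # one machine reading whole bytes
two_nibbles = edges_per_state(8, 2)  # two machines reading 4 bits each
four_pairs = edges_per_state(8, 4)   # four machines reading 2 bits each
```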

Figure 7: Complete bit split Memory based architecture

Memory Based AC

The memory-based implementation of AC uses the BRAMs of the FPGA to hold the transition tables and leaves the slices unused. The advantage of this technique is that whenever the signature/rule set needs to be updated, only the contents of the memory have to be replaced. The disadvantage of the memory-based AC algorithm is the elevated memory demand for storing the DFA's transition table.

The algorithm used for memory-based AC is the bit-split algorithm. Its approach is to take the AC state machine and divide it into multiple state machines, referred to as bit-state machines, which work independently on the input. The division of the AC state machine is done on the basis of the individual bits of the input. When a string is matched against the signature set, the output logic is tied to the respective states of these machines.
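As a software illustration of this division, the sketch below builds a byte-wide AC DFA, splits it into two 4-bit machines (low nibble and high nibble), and reports a signature only when both machines agree. It follows the general bit-split idea of keeping, for every split state, the set of signatures that could match on those bits alone; the function names and the choice of two nibble machines are assumptions made for the example, and the project's actual table layout in BRAM is not shown.

```python
from collections import deque


def build_ac_dfa(patterns):
    """Full Aho-Corasick DFA over bytes: delta[state][byte], out[state]."""
    goto, out = [{}], [set()]
    for idx, pattern in enumerate(patterns):
        state = 0
        for byte in pattern.encode():
            if byte not in goto[state]:
                goto.append({})
                out.append(set())
                goto[state][byte] = len(goto) - 1
            state = goto[state][byte]
        out[state].add(idx)
    fail = [0] * len(goto)
    delta = [[0] * 256 for _ in goto]
    queue = deque()
    for byte, nxt in goto[0].items():
        delta[0][byte] = nxt
        queue.append(nxt)
    while queue:  # breadth-first, so shallower states are complete first
        state = queue.popleft()
        out[state] |= out[fail[state]]
        for byte in range(256):
            if byte in goto[state]:
                nxt = goto[state][byte]
                fail[nxt] = delta[fail[state]][byte]
                delta[state][byte] = nxt
                queue.append(nxt)
            else:
                delta[state][byte] = delta[fail[state]][byte]
    return delta, out


def build_split(delta, out, shift):
    """Split machine that only sees 4 bits of each byte (selected by shift).
    Each split state is a set of DFA states; pmv holds its partial matches."""
    start = frozenset([0])
    ids, trans, pmv = {start: 0}, [], []
    queue = deque([start])
    while queue:
        states = queue.popleft()
        row = []
        for value in range(16):
            target = frozenset(delta[s][b] for s in states
                               for b in range(256) if (b >> shift) & 0xF == value)
            if target not in ids:
                ids[target] = len(ids)
                queue.append(target)
            row.append(ids[target])
        trans.append(row)
        pmv.append(set().union(*(out[s] for s in states)))
    return trans, pmv


def find_matches(text, patterns):
    delta, out = build_ac_dfa(patterns)
    lo_trans, lo_pmv = build_split(delta, out, 0)
    hi_trans, hi_pmv = build_split(delta, out, 4)
    lo = hi = 0
    hits = []
    for i, byte in enumerate(text.encode()):
        lo = lo_trans[lo][byte & 0xF]   # machine fed the low nibble
        hi = hi_trans[hi][byte >> 4]    # machine fed the high nibble
        for k in sorted(lo_pmv[lo] & hi_pmv[hi]):  # both must agree
            hits.append((i - len(patterns[k]) + 1, patterns[k]))
    return hits


hits = find_matches("a cat ate an apple", ["app", "apple", "aim", "cap", "cat"])
```

Each split machine has only 16 outgoing edges per state, which is what makes the transition tables small enough for block RAM; the price is the extra AND of the partial match sets at the output.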

An application based on C++ has been developed for the memory-based AC. It takes a text file containing a rule/signature set as input and creates the memory initialization files.

• Uses the BRAMs of the FPGA.

• Transition tables are implemented using BRAMs.

• Pattern matching of incoming data is done through the transition tables.

• The main advantage of bit-split is that whenever the signature/rule set needs to be updated, only the contents of the memory need to be replaced.

• The Auto Memory Files Generator generates the memory files for any number of input rules and updates them in the BRAMs.
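The report does not state the format of the generated memory files or show the C++ generator. Purely as an illustration of the last step, the Python sketch below serialises a transition table into the text layout of a Xilinx COE memory initialization file; the choice of COE, the function name `to_coe` and the one-word-per-entry layout are assumptions, not a description of the project's tool.

```python
def to_coe(table, width_bits):
    """Serialise a transition table (list of rows of next-state numbers)
    as COE text, one memory word per table entry, in row-major order."""
    digits = (width_bits + 3) // 4  # hex digits needed per word
    words = []
    for row in table:
        for value in row:
            if not 0 <= value < 2 ** width_bits:
                raise ValueError(f"value {value} does not fit in {width_bits} bits")
            words.append(format(value, f"0{digits}X"))
    return ("memory_initialization_radix=16;\n"
            "memory_initialization_vector=\n"
            + ",\n".join(words) + ";\n")


# Toy table: 2 states x 2 input values, 8-bit words.
coe_text = to_coe([[0, 1], [2, 15]], width_bits=8)
```

Because only this file changes when the rule set changes, the FPGA logic does not have to be resynthesised for a signature update.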

Figure 2: Memory Based AC

Implementation on FPGA

Before implementing the complete pattern detection engine on the FPGA, the individual IPs were tested on the Xilinx Zynq-7000 ZC702 evaluation kit, which carries the XC7Z020-CLG484-1 device. This evaluation kit provides both a PS (Processing System) and PL (Programmable Logic) on a single chip.

Figure 3: Xilinx Zynq-7000 ZC702

Based on the Verilog-HDL produced by the auto-HDL-generator, a project was created in Xilinx Vivado Design Suite. Packet capturing, decoding and preprocessing of incoming Ethernet packets are carried out on the PS, which finally passes the payload data to the PL by writing it to the BRAM of the FPGA. The BRAM controller module reads the payload byte by byte from the BRAM and sends it to the pattern matching module, implemented in the PL using the auto-generated Verilog-HDL code. Alerts are reported back when incoming payload data contains any signature considered in the design. After the IPs had been tested individually on the evaluation kit mentioned above, the whole pattern engine was implemented and tested on the Spartan-6 SP605 FPGA kit, because this device provides PCIe.

Figure 4: Spartan 6 - SP605 Evaluation kit

The PCIe IP Controller core exchanges data with the user logic through a standard Application FIFO, which is supplied by the PCIe IP. On the other end, the PCIe IP communicates with the host PC.

2.5 Accelerating IDS using high end FPGA boards

Key Features:

• Xilinx Virtex-7 XC7V690T FFG1761-3

• SFP+ interface supporting up to 80 Gbps

• PCIe Gen3 x8 (8 lanes, 8 Gbps/lane)

• Two 4GB DDR3 SODIMM (MT8KTF51264HZ-1G9E1)

Figure 5: High End NetFPGA Sume Evaluation Board

5.1.2. Implementation, Analysis, theoretical and/or analytical models and results

COMPLETE HARDWARE-BASED IMPLEMENTATION OF NETSPECTION.

We performed pattern matching on hardware, while the host PC handled software tasks such as packet capture, decoding, and preprocessing. Here, we discuss the implementation of the parts that have now been moved off the host PC.

Capturing packets

Since packets were previously captured by the host PC itself, our main task was to acquire data on the Spartan SP605 instead of on the host PC, to increase the performance of the system. The Spartan SP605 has a data acquisition capacity of 1 GB. The capturing procedure on the FPGA includes modules for handling Ethernet frames as well as IP, UDP, and ARP, and the components for constructing a complete UDP/IP stack. The packet capturing module on the FPGA has the following submodules to capture data: eth_mac_1g_fifo, eth_axis_rx, eth_axis_txs, and udp_payload_fifo.

Decoding of Packets

The next task was to decode the captured packets. The packet decoder module divides the data into source IP, destination IP, length, checksum, and payload data. The packet decoding module has been implemented on the Spartan SP605 prototype.
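The field split performed by the decoder can be illustrated in software. The Python sketch below decodes an Ethernet II frame carrying IPv4/UDP into the same five outputs. It is a behavioural model only, not the FPGA module: the function name `decode_udp_frame` is hypothetical, and it assumes that "length" and "checksum" refer to the UDP header fields, which the report does not specify.

```python
import socket
import struct


def decode_udp_frame(frame: bytes):
    """Split an Ethernet II / IPv4 / UDP frame into header fields and payload."""
    ethertype = struct.unpack("!H", frame[12:14])[0]
    if ethertype != 0x0800:
        raise ValueError("not an IPv4 frame")
    ip_header_len = (frame[14] & 0x0F) * 4   # IHL field, in bytes
    if frame[14 + 9] != 17:                  # IPv4 protocol field, 17 = UDP
        raise ValueError("not a UDP packet")
    src_ip = socket.inet_ntoa(frame[26:30])
    dst_ip = socket.inet_ntoa(frame[30:34])
    udp = 14 + ip_header_len                 # offset of the UDP header
    src_port, dst_port, length, checksum = struct.unpack("!HHHH", frame[udp:udp + 8])
    payload = frame[udp + 8:udp + length]    # UDP length includes its 8-byte header
    return {
        "src_ip": src_ip,
        "dst_ip": dst_ip,
        "length": length,
        "checksum": checksum,
        "payload": payload,
    }


# Hand-built test frame: 14-byte Ethernet header, 20-byte IPv4 header, UDP, "hello".
eth = b"\x00" * 12 + b"\x08\x00"
ip = (bytes([0x45, 0]) + struct.pack("!H", 20 + 8 + 5) + b"\x00" * 4
      + bytes([64, 17]) + b"\x00\x00"
      + socket.inet_aton("10.0.0.1") + socket.inet_aton("10.0.0.2"))
udp_segment = struct.pack("!HHHH", 1234, 53, 8 + 5, 0xBEEF) + b"hello"
decoded = decode_udp_frame(eth + ip + udp_segment)
```

Only the payload bytes need to reach the pattern matching engine; the header fields are what an alert would carry.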


Design Inclusion

All the modules have now been designed and implemented as discussed above. The next step is their integration; this section explains how the modules are integrated and how they work together.

As can be seen in the block design, there are three blocks: packet capturing, decoding, and an SME engine. Previously there were two blocks, one on the FPGA and one running on the host PC. Now the two blocks that ran on the host, packet capturing and decoding, are integrated with the SME on the FPGA. The packet decoder provides source IP, destination IP, length, checksum, and payload data. The controller reads data from the packet decoder and passes it to the FIFO module, which then feeds the input characters to the SME engine.

PCIe

In our project, we interfaced the FPGA board to a host PC through PCIe. PCIe is a high-throughput protocol available on most modern motherboards as well as on some embedded boards. PCI Express provides an end-to-end solution for data transport between an FPGA and a host running Linux. The PCIe IP Controller core exchanges data with the user logic through a standard Application FIFO, which is supplied by the PCIe IP; on the other side, the PCIe IP communicates with the host PC. The above system was tested on the SP605 development board with an HP Core i3 system. The figure below shows an FPGA board (SP605) installed in the host system.


PCIe for NetFPGA Sume

Furthermore, we interfaced the NetFPGA board to a host PC through PCIe, in the same way as for the SP605: the PCIe IP Controller core exchanges data with the user logic through a standard Application FIFO supplied by the PCIe IP, while on the other side the PCIe IP communicates with the host PC, which only shows the results. The above system was tested on the SP605 development board with an HP Core i3 system. In future work, the complete hardware implementation of the IDS prototype will be shifted to the high-performance board. The figure below shows a NetFPGA SUME board installed in the host system.

PCIe IP Synthesis

After successful simulation, the IP is synthesized. The table below shows the resource utilization of the IP. For validation of the IP on the FPGA we use PCIe Tree.

Logic Utilization          Used    Available    Utilization
Number of slice LUTs       6995    433200       1.61%
Number of BRAM/FIFO        12      1470         0.81%

Table 1: Device Utilization Summary

Application FIFO Synthesis

After successful simulation, the FIFO IP is synthesized and implemented on the FPGA. The table below shows the device utilization summary from the synthesis report.

Logic Utilization            Used    Available    Utilization
Number of slice registers    48      866400       0.011%
Number of slice LUTs         47      433200       0.011%
Number of BRAM/FIFO          1       1470         0.03%

Table 2: Device Utilization Summary

For testing and validating the design, we generated a pcap file with known content in the payload. The figure below shows the pcap file.

Figure 6: Pcap File View in Wireshark

The payload shown in fig. 10 above contains GUID=2E, and this payload has ID 06. The proposed system detected that ID and data; the results are shown in fig. 11 below, captured using ChipScope debugging in ISE Design Suite.

Figure 7: ID Detection and Data Detection of content of Signature Set

The time taken by the software-based detection engine is shown in fig 12. The figure shows that the software took 1 μs to 2 μs, measured from the arrival of a packet in the detection engine until its ID is detected. On the hardware side, the detection engine is tested on the SP605 development board.

Figure 8: Time utilized by FPGA (sp605) based detection engine

The hardware-based detection engine performs better than the software-based one, as can be seen in fig 13, owing to the parallelism of the FPGA. The system runs at 125 MHz, so one clock cycle takes 8 ns. The SP605 FPGA took about 27 clock cycles from packet arrival to packet ID detection, i.e. 27 × 8 ns = 216 ns or 0.216 μs. Compared with the 1 μs to 2 μs taken in software, the hardware-based detection engine is therefore roughly 5 to 9 times faster.
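The latency figure follows directly from the clock frequency and the cycle count, and the comparison with the software timings can be checked with a few lines of Python. The helper name `detection_latency_ns` is chosen here for illustration; the inputs are the values quoted in the text.

```python
def detection_latency_ns(clock_hz, cycles):
    """Latency in nanoseconds for a given number of clock cycles."""
    return cycles * 1e9 / clock_hz


hardware_ns = detection_latency_ns(125e6, 27)   # 27 cycles at 125 MHz

# Software detection was reported as 1 us to 2 us (1000 ns to 2000 ns).
speedup_low = 1000 / hardware_ns
speedup_high = 2000 / hardware_ns
```

With these inputs the hardware latency is 216 ns, and the speed-up over software lies between about 4.6 and 9.3 depending on which end of the software range is used.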

NET FPGA SUME:

Our next task was to implement the Netspection IDS on the NetFPGA SUME, which is a platform for high-performance and high-density networking design. The NetFPGA-SUME is an advanced board featuring one of the largest and most complex FPGAs produced, a Xilinx Virtex-7 690T supporting thirty 13.1 Gb/s GTH transceivers. Four SFP+ 10 Gb/s ports, five independent high-speed memory banks built from 500 MHz QDRII+ and 1866 MT/s DDR3 SODIMM devices, and an eight-lane third-generation PCIe interface offer high throughput and can sustain a large number of high-speed data streams to the FPGA fabric and memory devices. The board also presents twenty transceivers in total on the FMC and QTH expansion connectors and the SATA ports. The NetFPGA-SUME's main mission is to give students, researchers and developers a state-of-the-art platform for networking, whether for learning the fundamentals or for creating new hardware and software applications. The board supports simultaneous wire-speed processing on the four 10 Gb/s Ethernet ports, and it can manipulate and process data on-board or stream it over the x8 Gen3 PCIe interface and the expansion interfaces.

Figure 9: NET FPGA- SUME Board

Features

• Xilinx Virtex-7 XC7V690T FFG1761-3

• Four SFP+ interfaces (4 RocketIO GTH transceivers) supporting 10 Gbps

• PCI-E Gen3 x8 (8 Gbps/lane)

• QTH Connector (8 RocketIO GTH transceivers)

• Two SATA-III ports

• One HPC FMC Connector (10 RocketIO GTH transceivers)

• Three x36 72Mbits QDR II SRAM (CY7C25652KV18-500BZC)

• Two 4GB DDR3 SODIMM (MT8KTF51264HZ-1G9E1)

• MicroUSB Connector for JTAG programming and debugging (shared with UART interface)

• Two 512Mbits Micron StrataFlash (PC28F512G18A)

• Xilinx CPLD XC2C512 for FPGA configuration

• User LEDs and Push Buttons

Figure 10: NET FPGA-SUME block Diagram

Packet Capturing and decoding on NETFPGA SUME

Packet capturing and decoding are implemented on the NetFPGA SUME rather than on the host PC. Integration of all three modules is in progress.

Pattern Matching on NETFPGA SUME

Because the Spartan SP605 has limited resources for the pattern matching algorithm, the design there was limited to at most 512 rules. On the NetFPGA SUME more resources are available, so the hardware implementation can match strings for up to seven thousand rules. The SME engine in this case is designed for the larger rule set and uses at most 90% of the resources. Implementation and testing of pattern matching on hardware for the NetFPGA SUME has been completed.

1.3.1.1. Analysis and Discussion

This project proposed a signature-based Netspection hybrid network intrusion detection system that aims at a robust system with high throughput. The Bit-Split algorithm has been implemented in both memory-based and hardwired variants to make maximum use of the FPGA resources and to enhance the performance of the system. Packet capturing, decoding and pattern matching are implemented on the Spartan SP605 FPGA to make the Netspection IDS more efficient and robust. All three modules have been interfaced to PCIe; through the PCIe interface, output alerts are displayed on the host PC and stored in a database for further use. We concluded that the integrated design and all IPs are working properly. All the results are included above in the report.

Future work includes a complete end-to-end implementation of the Netspection IDS (packet capturing, packet decoding, preprocessing, and pattern matching) on the high-performance NetFPGA SUME hardware development board. At this stage, the packet capturing, decoding, and pattern detection engines are implemented on the Spartan SP605. Next, all three modules will be integrated and tested as a prototype, and the prototype will then be ported to the NetFPGA SUME evaluation board. Finally, through the PCIe interface, the results will be displayed on the host PC and also stored in the database.

1.4. Milestone/Deliverable 2:

1.5. Milestone/Deliverable 3:

6. OUTCOMES

2.1. Labs Progress on KPIs


Please state your progress on the following yearly KPIs, as mentioned in the PC-1 document, and provide their mapping to the Lab project domains:

1. At least 1 SCI-indexed journal paper OR 1 CORE-ranked A/B conference paper
2. At least 1 formal international collaboration with meaningful interactions and targets beneficial to both NCCS and the collaborator
3. At least 2 research proposals on cyber security related projects to be submitted to ICT R&D Fund, HEC, PSF, or other national/international funding agencies
4. Dissemination activities carried out (quarterly events, open houses, industry interactions and linkage, etc.)
5. 1 training program (5 days) carried out in the Lab's area of focus
6. Agreements with Industry (if any)
7. MoUs signed (if any)
8. Commercialization efforts/Start-ups (if any)

5.1.1. Publication Details (as per KPI)

Ser. | Type (Journal/Conference) | Title of the Conference/Journal paper / Title / HEC – HJRS Cat / ranking | Complete Reference as per IEEE Format, including URL/ISSN/ESSN | Detail of Authors: Name (details of all authors in the order as in the paper) | Detail of Authors: Permanent Employment / Role Outside Lab (for student give complete details) | Detail of Authors: Employment, Role and Task at Lab and the project assigned (with dates) | Detail of Authors: Role / contribution in the Publication | Deliverable / end product / PC-1 Objective / related KPI linkage to the Paper
 | Journal 1 | | | Author 1 | | | |
 | | | | Author 2 | | | |
 | | | | Author 3 | | | |
 | Conference 1 | | | | | | |

5.1.2. Research Proposals (at least 2 per year)

Ser. | Research Proposal Title | Date of Submission | Present Status | Funding Detail: Funding Agency | Funding Detail: Amount | Deliverable Link with KPI

5.1.3. International Collaboration (at least 1 per year)

Ser. | Collaborator | Objective/Targets | Present Status | Deliverable Link with KPI

5.1.4. Industrial Collaboration (at least 1 per year)

Ser. | Collaborator: National | Collaborator: International | Industrial Partner | MOU Signed (if any) | Milestone/Deliverable | Current Status | Deliverable link with KPI

5.1.4.1. Detail of Partner/ Organization

Title and details | Funding (if any) | URL/Contact Details | Purpose of Partnership | Activities Detail: Task | Activities Detail: Start Date | Activities Detail: End Date | Activities Detail: Status | Relevant Official Correspondence (may be attached) | Deliverables Link with KPI as per PC-1

5.1.4.2. Detail of User Organization


Title and details | Funding (if any) | URL/Contact Details | Task Assigned by User / product / deliverables | Timeline: Start Date | Timeline: End Date | Timeline: Status | Relevant Official Correspondence (may be attached) | User Feedback

5.1.5. Dissemination and exploitation of results

5.1.5.1. Dissemination and Communication Activity (Marketing Strategy)

Ser. | Subject/Topic & Date | Dissemination Activity Detail: Seminar | Workshop | Conference | Open House | Duration | Total Attendees and Target Audience: Gen Pub | Industry | Investor | Policy Maker | Customer | Others | Link with KPI

5.1.5.2. Details of Training Programs

Ser. | Subject/Topic & Date | Dur. | Target Audience | Training Detail: Seminar | Workshop | Courses | Certifications | Attendees Details (Name, Contact, Employment): Paid (at least 5) | Non Paid | Revenue Generated | Link with KPI

5.1.5.3. Visits

Ser. | Visit Details: Local | Visit Details: Foreign | Objective | Members | Expenditures | Detail of Schedule: Location | Duration | Timings | Deliverable link with KPI

5.1.6. Project Sustainability and its Impacts

Ser. | Types of Fund | Product | Revenue Generated | Target as per KPI | Impact
 | Industrial Project Funding | | | |
 | Non PSDP Research Funds From External Sources | | | |
 | Startups | | | |
 | Paid Trainings | | | |
 | Services | | | |

5.1.6.1. Start-ups

Ser. | Local Problems Identified | Commercial Viability | Partner Name (if any) | Fund Required | Fund Released along with Detail | Employ Details (5 Paid Employee): Name & Contact Detail | Status | Remuneration | Solution Provision Status / Submitted Proposals / Software Tool / Prototype | Survival Rate (at least 6 months) | Revenue Generated | Deliverable link with KPI

5.1.6.2. Intellectual property rights resulting from the project (s)


Type of IP Rights Product Application Details IPR Status

6. CONCLUSION & LESSONS LEARNT

Free textbox

7. FUTURE WORK AND PLAN

Explain the work plan for the future/ remaining duration of the project as per PC-1

Free textbox

8. RISK, ISSUES AND CHALLENGES

Explain the issues and challenges faced and how they are going to impact the outcome and timelines of the project. Provide the mitigation plan.

Free textbox

HISTORY OF CHANGES
VERSION DISSEMINATION DATE CHANGE
1.0 version
