
A Network Traffic Abnormal Detection Method: Sketch-Based Profile Evolution

Junkai Yi 1 , Shuo Zhang 2,* , Lingling Tan 2 and Yongbo Tian 2


1 Key Laboratory of Modern Measurement and Control Technology, Ministry of Education,
School of Automation, Beijing Information Science & Technology University, Beijing 100096, China;
yijk@bistu.edu.cn
2 School of Automation, Beijing Information Science & Technology University, Beijing 100096, China;
tanlingling@bistu.edu.cn (L.T.); 2021020439@bistu.edu.cn (Y.T.)
* Correspondence: 2021020372@bistu.edu.cn

Abstract

Network anomaly detection faces unique challenges from dynamic traffic, including
large data volume, few attributes, and the influence of human factors, which make it
difficult to identify typical behavioral characteristics. To address this, we propose
Sketch-based Profile Evolution (SPE) for detecting network traffic anomalies. First,
the Traffic Graph (TG) of the network terminal is generated using Sketch to locate
abnormal data flows. Next, a Convolutional Neural Network and a Long Short-Term
Memory network (CNN-LSTM) are used to build traffic behavior profiles, which are then
continuously updated using Evolution to detect changes in behavior patterns in
real-time data streams. SPE can process raw traffic datasets directly and can
continuously detect anomalies in constantly updated data streams. In experiments on
real network traffic datasets, the SPE algorithm was found to be far more efficient and
accurate than PCA and Basic Evolution for outlier detection. Note that the value of the
parameter ϕ can affect the results of anomaly detection.

Keywords: network traffic; traffic graph; abnormal detection; sketch; evolution


1. Introduction

With the rapid growth in data volume and the widespread use of network
applications, network attacks have become an increasingly serious security problem,
and detecting anomalies in network traffic is now a critical task. Abnormal network
traffic refers to the phenomenon in which the current state of network traffic deviates
from its normal state. It is most often caused by malicious network attacks [1], such as
denial-of-service attacks [2], port scanning [3], password brute-forcing, and remote
control, as well as by network configuration errors and other exceptions. Therefore,
network traffic anomaly detection [4–9] is a necessary function for maintaining
security in cyberspace.
Anomaly detection of network traffic derives from outlier detection [10]. The
purpose of outlier detection is to identify objects that differ significantly from
most data objects in the dataset. Therefore, outlier detection can be applied to the
analysis of network behavior to detect anomalies generated in the network.
Previous methods for detecting network traffic anomalies relied too heavily on
manual feature selection, lacked adaptability, and could not directly process the original
network traffic. In addition, when faced with massive high-dimensional traffic data,
they could not handle dynamic data flows, and they struggled to extract key features
effectively and to meet the real-time requirements of the system.
To solve the above problems, Sketch Evolution is proposed for detecting abnormal
network traffic. First, the historical network traffic is analyzed to build a normal
behavioral model (Profile) of the network traffic; then the deviation of the current
network traffic from this normal behavioral model is calculated to decide whether the
traffic is abnormal.
Specifically, we propose Sketch-based Profile Evolution (SPE) for network traffic
anomaly detection. The SPE algorithm consists of three main parts. The first part
analyzes the historical network traffic, that is, it generates the Traffic Graph (TG)
based on Sketch. The second part builds the normal behavior model (Profile) of network
traffic, using CNN and LSTM to extract spatial and temporal behavior characteristics
from which profiles are generated. The third part calculates the deviation between new
network traffic and the normal behavior model: the newly obtained network data are
compared with the Profile to determine whether abnormal behavior exists. Profiles are
not fixed; they evolve as new data streams arrive.
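The idea of comparing new traffic with a Profile and letting the Profile evolve can be illustrated with a minimal sketch. It is not the SPE algorithm itself: the Profile here is a plain feature vector rather than a CNN-LSTM model, and the Euclidean distance, the `threshold`, the smoothing factor `alpha`, and the choice to skip the update for abnormal samples are all assumptions made for illustration.

```python
import math


def deviation(profile, sample):
    """Euclidean distance between a profile vector and a new feature vector."""
    return math.sqrt(sum((p - s) ** 2 for p, s in zip(profile, sample)))


def detect_and_evolve(profile, sample, threshold, alpha=0.1):
    """Return (is_abnormal, updated_profile) for one new traffic sample.

    Assumption: traffic judged abnormal is not merged into the profile,
    so that an attack cannot drag the notion of "normal" towards itself.
    """
    if deviation(profile, sample) > threshold:
        return True, profile
    # Normal traffic: move the profile a small step towards the new sample.
    evolved = [(1 - alpha) * p + alpha * s for p, s in zip(profile, sample)]
    return False, evolved


if __name__ == "__main__":
    profile = [0.0, 0.0]
    for sample in ([2.0, 4.0], [30.0, 40.0]):
        abnormal, profile = detect_and_evolve(profile, sample, threshold=10.0, alpha=0.5)
        print(sample, "abnormal" if abnormal else "normal", profile)
```

An exponential moving average is only one possible update rule; it is used here because it keeps the example short.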
The rest of this paper is organized as follows. Section 2 presents related work
in the field of abnormal network traffic detection. Section 3 introduces the detection
model: the generation of the TG based on Sketch (that is, the process of structuring
the network traffic), the modeling method of the Profile, and the transition
calculation process. Section 4 analyzes the experimental results, including the
Profile analysis experiment and the network traffic evolution analysis experiment.
Section 5 presents the conclusions and future work.
In addition, the technique proposed in this paper has been adopted through a
commercial cooperation with Beijing Esafenet Development Technology Co., Ltd.
(Beijing, China). It has been applied in Esafenet’s DLP (Data Leakage Prevention)
system to efficiently detect traffic with abnormal features in a continuous and dynamic
network environment and to report it to the DLP system in time for defensive measures.

2. Related Works

Anomaly detection is an important task in wireless network data evaluation and
management; it helps to improve the intelligent management of the network and to
optimize the allocation of network resources. Network anomaly detection methods can be
divided into four categories: statistics-based, time series-based, sketch-based, and
machine learning-based. The statistics-based and time series-based methods are
unsupervised and do not require labeled data, while the sketch-based and machine
learning-based methods require the detection system to train the detection model on
labeled data, which is very time-consuming but can achieve higher accuracy. In
addition, the sketch-based method requires little memory in a fast and complex network
environment, so it can store the characteristic information of network traffic in real
time. Therefore, this paper applies the Sketch-based method to network traffic anomaly
detection.
Statistics-based methods are well suited for anomaly detection. Wang et al. [11]
proposed a network anomaly detection method (PCSS) that combines principal
component analysis (PCA) and a single-stage headless face detection (SSH) algorithm.
It addresses three problems of existing detection methods: they cannot learn the
spatio-temporal characteristics of the data, their classification accuracy is low, and
their detection time and accuracy are easily affected by redundant data in the sample.
Patil et al. [12] proposed an abnormal network traffic detection framework that uses
PCA mainly for feature extraction and dimensionality reduction and a bidirectional
generative adversarial network (BiGAN) model to detect abnormal network traffic.
Ibrahim et al. [13] proposed a comprehensive entropy-based method for network traffic
anomaly classification that guards against deception of entropy-based detection
through a novel protection mechanism. It analyzes changes in different entropy types
and monitors the number of distinct elements in the feature distribution as a unique
detection metric to implement this protection mechanism. Then, based on multivariate
analysis of the entropy changes of multiple features and the aggregation of complex
feature combinations, an entropy-based anomaly classification rule is proposed,
extending the entropy-based anomaly detection method. Ren et al. [14] proposed an
anomaly detection method based on dynamic Markov models. This method segments the
sequence data using sliding windows. Within each sliding window, the state of the data
is defined by its value, and a high-order Markov model is built with an order chosen to
balance the length of the memory property against the ability to follow the trend of
the sequence. In addition, an anomaly replacement strategy is proposed to prevent
detected anomalies from affecting model building and to maintain the continuity of
anomaly detection.
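The two quantities that the entropy-based approach monitors, the entropy of a feature distribution and the number of distinct elements in it, can be computed as in the following minimal sketch. It uses Shannon entropy only (the cited work considers several entropy types), and the function names and sample port lists are illustrative assumptions, not taken from [13].

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy, in bits, of the empirical distribution of `values`."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def feature_summary(values):
    """Entropy together with the number of distinct elements of one feature."""
    return {"entropy": shannon_entropy(values), "distinct": len(set(values))}


if __name__ == "__main__":
    # Hypothetical destination ports seen in one time window.
    normal_ports = [80, 80, 80, 80, 80, 80, 443, 443]
    scan_ports = list(range(1, 9))  # a port scan touches many ports once each
    print(feature_summary(normal_ports))
    print(feature_summary(scan_ports))
```

A port scan spreads traffic evenly over many destination ports, so both the entropy and the distinct count rise compared with normal traffic concentrated on a few ports.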
Time series-based methods include autoregressive moving average (ARMA)
regression, the empirical mode decomposition (EMD) transform, the wavelet transform
[15,16], and instantaneous frequency analysis, among others. These methods are
suitable for network traffic data processing: they meet the quantification requirements
of network traffic anomaly detection and make flexible use of signal processing
techniques. Yu et al. [17] proposed a traffic anomaly detection algorithm for wireless
sensor networks (WSNs) based on an improved autoregressive integrated moving average
(ARIMA) model; they improved the traditional time series ARIMA model to monitor and
predict traffic in WSNs and to judge traffic anomalies. Yang et al. [18] proposed a
threshold model based on the Fractional Autoregressive Integrated Moving Average
(FARIMA) to describe SCN traffic and detect anomalies. Cao et al. [19] proposed
MPTCP-EMD, a network traffic anomaly detection model for Multipath TCP (MPTCP)
networks. The model combines multi-scale detection and digital signal processing
theory to realize anomaly detection based on the self-similarity of MPTCP network
traffic. It uses the empirical mode decomposition (EMD) method to decompose MPTCP
traffic data and reconstructs the effective signal by removing high-frequency noise and
residual trend terms. Using a sliding window, the model compares the changes in the
Hurst exponent of the MPTCP network under different attack conditions to decide
whether an anomaly exists.
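The general pattern behind these time series methods (predict the next value from recent history, then flag values that deviate too far from the prediction) can be shown with a deliberately simplified sketch. It substitutes a sliding-window mean for ARIMA, FARIMA, or EMD, which are considerably more involved; the `window` size, the factor `k`, and the decision to keep flagged values out of the window are assumptions for illustration.

```python
import statistics
from collections import deque


def sliding_window_anomalies(series, window=5, k=3.0):
    """Return indices whose value is more than k standard deviations
    away from the mean of the preceding `window` normal values."""
    history = deque(maxlen=window)
    flagged = []
    for i, x in enumerate(series):
        if len(history) == window:
            mean = statistics.fmean(history)
            std = statistics.pstdev(history)
            # The small floor avoids a zero threshold on constant traffic.
            if abs(x - mean) > k * max(std, 1e-9):
                flagged.append(i)
                continue  # keep the outlier out of the reference window
        history.append(x)
    return flagged


if __name__ == "__main__":
    # Hypothetical packets-per-second samples with one burst.
    traffic = [10, 11, 9, 10, 10, 100, 10, 11]
    print(sliding_window_anomalies(traffic))
```

Skipping flagged values when updating the window keeps a single burst from inflating the baseline and hiding later anomalies.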
The Sketch is a compact summary data structure that is widely used in network
traffic anomaly detection and can process a large amount of data in a short time.
Ippoliti et al. [20] proposed and developed a dynamic method for enhanced network flow
anomaly detection. They delineate the network state during the creation of the data
flow, enabling general-purpose threat detection, and describe an efficient flow
augmentation method based on a count-min sketch that provides per-flow, per-node, and
per-network statistics in parallel with flow record generation. Tong et al. [21] first
proposed a general architecture on FPGA to accelerate Sketch and deployed it for two
widely used Sketches: Count-min Sketch and K-ary Sketch. They then proposed online
sketch-based algorithms for two key network anomaly detection tasks, Heavy-Hitter
Detection and Heavy-Change Detection, and used the proposed general Sketch
architecture to accelerate these online algorithms.
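To make the Count-min Sketch concrete, the following is a minimal software sketch of the data structure with a simple heavy-hitter query. It is a generic textbook version, not the FPGA architecture of [21] or the flow augmentation of [20]; the `width` and `depth` defaults, the use of salted BLAKE2b as the hash family, and the `heavy_hitters` helper are assumptions chosen for illustration.

```python
import hashlib


class CountMinSketch:
    """Fixed-size frequency summary: estimates may overcount, never undercount."""

    def __init__(self, width=256, depth=4):
        self.width = width
        self.depth = depth  # one counter row per hash function (at most 256 rows here)
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, row, key):
        # A different salt per row gives each row its own hash function.
        digest = hashlib.blake2b(key.encode(), salt=bytes([row]), digest_size=8).digest()
        return int.from_bytes(digest, "big") % self.width

    def add(self, key, count=1):
        for row in range(self.depth):
            self.table[row][self._index(row, key)] += count

    def estimate(self, key):
        # The minimum over rows is the counter least inflated by collisions.
        return min(self.table[row][self._index(row, key)] for row in range(self.depth))


def heavy_hitters(sketch, candidates, threshold):
    """Return the candidate keys whose estimated count reaches `threshold`."""
    return [key for key in candidates if sketch.estimate(key) >= threshold]


if __name__ == "__main__":
    cms = CountMinSketch()
    # Hypothetical source addresses observed in a traffic window.
    for _ in range(50):
        cms.add("10.0.0.1")
    for _ in range(3):
        cms.add("10.0.0.2")
    print(cms.estimate("10.0.0.1"), cms.estimate("10.0.0.2"))
    print(heavy_hitters(cms, ["10.0.0.1", "10.0.0.2"], threshold=50))
```

Memory use depends only on `width * depth`, not on the number of distinct flows, which is why sketches suit high-speed traffic. The price is that hash collisions can inflate an estimate, so a heavy-hitter query may return false positives but never misses a true heavy hitter.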
Machine learning methods mainly include classification, clustering, pattern
recognition, neural networks, and decision trees. Machine learning-based methods can
process large volumes of network traffic data quickly and accurately through
self-learning. Pu et al. [22] proposed an unsupervised anomaly detection method that
combines Sub-Space Clustering (SSC) and One-Class Support Vector Machine (OCSVM) and
can detect attacks without any prior knowledge. Baek et al. [23] established a new
attribute that can efficiently identify anomalous events using clustering; it makes it
possible to construct label information for individual data points, called estimated
samples, while preserving the local neighborhood information of the connections’
features by using the Laplacian eigenmap technique. Jain et al. [24] proposed two
techniques, Error Rate Based Concept Drift Detection and Data Distribution Based
Concept Drift Detection, and investigated their effects. In addition, they combined
sliding window-based data collection and drift analysis with K-Means Clustering to
reduce the data size and improve the training datasets. They used a Support Vector
Machine (SVM) classifier for anomaly detection, and retraining of the model was
initiated based solely on statistical tests. Hwang et al. [25] proposed D-PACK, an
abnormal traffic detection method that consists of a Convolutional Neural Network (CNN)
and an unsupervised deep learning model (e.g., an Autoencoder) and can automatically
analyze traffic patterns and filter abnormal traffic. The CNN module can automatically
extract features from the original network data. Garg et al. [26] proposed a robust
anomaly detection technique, the Fuzzified Cuckoo-based Clustering Technique
(F-CBCT), which is divided into two stages: the hybridization of Cuckoo Search
Optimization and K-means clustering. The advantages and disadvantages of some of the
above references are listed in Table 1.

Table 1. Comparison of Existing Recent Network Anomaly Detection Algorithms.

| Algorithm | ML/EL Method | Advantages | Disadvantages |
|---|---|---|---|
| PCSS [11] | PCA, SSH | Superior to other detection models in detection speed and accuracy | It cannot meet the detection requirements of new network abnormal traffic and has poor scalability. It may even be incorrectly classified as a training dataset. |
| Framework [12] | PCA, BiGAN | Further enhances the performance of the BiGAN model | Cannot improve the feature engineering by auto-generation of meaningful derived features or find ways to interpret the anomaly score |
| Entropy-based [13] | Entropy | More feasible for practical implementation and general use | Unsupervised machine learning, with no training required; needs further research |
| ARIMA-based [17] | ARIMA | Has better anomaly detection accuracy | Needs to reduce the false alarm rate |
| MPTCP-EMD [19] | EMD | Improves the robustness of MPTCP transport systems | EMD decomposition will produce modal aliasing |
| Online Adaptive [20] | SVM | Maintains high accuracy without the need for offline training | Cannot adaptively tune itself to meet performance goals and constraints |
| Unsupervised Clustering-Based [22] | SSC, OCSVM | Performs better than some of the existing techniques | Does not develop an effective feature selection method or implement the parallelization of the algorithm |
| Clustering-based label estimation [23] | Naive Bayes, Adaboosting, SVM, RF | Improves the quality of estimated labels | There may be errors in the clustering-based label estimation process |
| D-PACK [25] | CNN, Autoencoder | Consumes much less flow pre-processing time and detection time; speeds up the detection | The proper configuration cannot be automatically calculated. Unable to balance acceptable training time against high classification performance. The DL-based classification approach is highly susceptible to the data poisoning attack. |
| SMOTE [27] | RF classifier | Great performance for intrusion detection in VANETs | The fast mobility of nodes in VANETs and the dynamic changes in network topology pose challenges for intrusion detection |
| Catboost in NIDS [28] | Catboost | High and more robust performance with low cost in time | Lack of trust and reliability between Fog nodes |
| SPE | CNN, LSTM | Can directly process the original network traffic dataset during anomaly detection and can detect anomalies in dynamic data streams | This study is only an experiment on a small-scale dataset and has not been applied to a practical environment on a large scale |