
NETWORK TRAFFIC ANALYSIS OF BACKGROUND AND FOREGROUND
Bui Tien Duc1,*, Vuong Xuan Chi1, Nguyen Van Thanh1,
Nguyen Tran Ai Duy2, Do Thanh Thai3,4, Quang Tran Minh3,4

1 Department of Information Systems, Faculty of Information Technology, Nguyen Tat Thanh University (NTTU),
300A, Nguyen Tat Thanh Street, Ward 13, District 4, HCMC, Vietnam
2 Faculty of Foreign Languages, Ho Chi Minh City Open University, 97 Vo Van Tan Street, Ward 6, District 3,
HCMC, Vietnam
3 Department of Information Systems, Faculty of Computer Science and Engineering, Ho Chi Minh City
University of Technology (HCMUT), 268 Ly Thuong Kiet, District 10, Ho Chi Minh City, Vietnam
4 Vietnam National University Ho Chi Minh City (VNU-HCM), Linh Trung Ward, Thu Duc District, Ho Chi Minh
City, Vietnam

Email: {ducbt@ntt.edu.vn, vxchi@ntt.edu.vn, nvanthanh@ntt.edu.vn, duy.nta@ou.edu.vn,
dothanhthai161@gmail.com, quangtran@hcmut.edu.vn}

Received xx xx xx
Revised xx xx xx; Accepted xx xx xx

Abstract: This paper separates background (BG) and foreground (FG) network traffic using
statistical information on the distinct frequency of transmission control protocol (TCP) traffic
flows. BG traffic is generated silently by running applications without user awareness, while FG
traffic is generated intentionally by users for different purposes, for example through web browsers
and other applications. By statistically analyzing the distinct frequency of TCP traffic flows, this
research shows that the 6th to the 24th packets of a TCP session are sufficient to classify the
session as FG or BG traffic. The main contribution is early classification of a TCP session from its
first packets, based on statistical information on the distinct frequency of TCP traffic flows. If
the frequency of the latent attributes of the FG traffic exceeds that of the BG traffic, our early
classification model assigns the “FG” tag to that TCP session, and vice versa. Our study analyzes
packets (Packet = Segment + IP) in the Transport layer and builds our TCP Early Classification
Model from the frequency of the latent attributes of each TCP session. The model provides effective,
universal early classification of TCP sessions, achievable from the 6th packet to the 24th packet.
We regard this as a contribution to the science of computer networks that can serve as a foundation
for further studies.

Keywords: classification, data analysis, prediction


Abbreviations: DDoS (Distributed Denial of Service), TCP session (Transmission Control
Protocol session)

________
*Corresponding author.
Email address: ducbt@ntt.edu.vn
https://doi.org/10.25073/2588-1140/vnunst.xxxx
1. Introduction

Background traffic is computer network traffic generated by applications running silently
inside the system without the user's knowledge. Foreground traffic is computer network traffic
generated when a user interacts with the system through applications. For example, when the
operating system updates itself or an application contacts a server on its own to fetch
information, background traffic is generated; when users read news with online tools or exchange
information by chat, foreground traffic is generated.

In fields such as system optimization, access control, network security, and DDoS attack
identification, computer network traffic needs to be classified early, from the first packets,
while the background or foreground traffic is still being generated and has not yet ended.
Classifying traffic as background or foreground from the first packets can substantially improve
service quality and support network security systems, data security, and user privacy. This is the
motivation and the goal of this work: to classify early whether data traffic is created by the user
or by software applications running inside the system.

Early classification of computer network traffic as background or foreground from the first
packets is a large problem with many challenges, and it is the focus of this project. Such
classification has positive implications for the field of computer networks: the results of this
project can be used directly to improve service quality and support network security.

Figure 1 describes how background and foreground traffic are generated.

Figure 1. Background and Foreground traffic

2. Related work

Foreground traffic (FG) [1] is traffic generated by people while they use computing devices, for
example using Facebook to exchange information, messaging via email, and browsing websites to read
news. Background traffic (BG) [1] is traffic generated by applications running in the background of
the system, for example operating system updates, software that connects to a server on its own to
upgrade to a new version, and background applications that exchange information with each other.

Many studies have addressed the classification of foreground and background traffic with a variety
of methods. The volume of background traffic generated is usually larger than that of foreground
traffic, which can easily lead to a "bottleneck" on the computer network. The study in [2] evaluated
background traffic performance through simulation and analysis of the long-term impact of network
parameters, and proposed a new direction for reducing the "bottleneck" created by background
traffic. However, [2] did not resolve the issue of occupied system resources. The study in [3]
optimized resources according to whether the screen is on or off, distinguishing the percentage
B.T.Đ et al. / VNU Journal of Science: Natural Sciences and Technology, Vol. xx, No. x (202x) xx-xx

of packets belonging to background and foreground traffic when the screen is turned off or on;
power consumption was reduced based on an analysis of smartphone traffic from twenty users over
five months. It is difficult to classify network traffic from port numbers and IP addresses alone.
By using a support vector machine to analyze the randomness of foreground traffic, [4]
significantly reduced computational costs. Sharing files in a peer-to-peer network model generates
a large amount of traffic that competes with other traffic, and [5] analyzed delay to address this
issue. Machine learning and statistical probability tools have also been introduced into the field
of computer network classification [6], [7] to solve classification problems when analyzing the
large amounts of data collected during experiments. Some network traffic is highly "unusual": it
changes constantly over time, which makes it difficult to collect before it is used for analysis.
In [8], unusual flows were detected when traffic was analyzed over a "short duration". The
"collision" of background traffic with foreground traffic [9] is another finding in network traffic
analysis. When the data used for the analysis is too small, the results are often skewed; network
traffic analysis using correlation information [10] was developed to solve part of this problem.
Currently, statistics [11] and data mining [12] are actively used as the main analytical aids in
network traffic classification. In general, however, the above methods are quite expensive, with
costs starting from O(n log n).

3. Proposed approaches

3.1. Goal

The early classification system for computer network traffic takes the first packets of a
network transaction (TCP session), and the analysis system extracts hidden attributes that serve as
"raw materials" for building network traffic analysis models. The output of the network traffic
analysis model is a clear answer to the question "Is this network session background traffic or
foreground traffic?"

Figure 2 describes the following situation. When users use computing devices such as laptops,
personal computers, and mobile phones to surf the web, exchange information with each other, or
work with utilities, a mass of foreground traffic is generated immediately. It comes and goes
continuously as packets. Foreground traffic is always generated during user interaction with the
device. Meanwhile, a series of other applications run silently underneath the system. These
applications check the server for new versions on their own, exchange information, or send the
user's information back to the server without the user ever knowing. They keep running silently,
and they also generate masses of background traffic.

Figure 2. Network Traffic Analysis System

The network traffic analysis system in Figure 2 relies on the characteristic properties observed
in network transactions (TCP sessions) to calculate and extract the hidden properties that only
foreground traffic or only background traffic has. From these hidden properties, the system
analyzes the traffic and classifies it as foreground or background traffic.

3.2. Procedure
3.2.1. Data collection
The open-source software Wireshark version 2.2.2 was employed to collect TCP session packets
exchanged between computers in the Transport layer, as illustrated in Figure 2.
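
As an illustration of the data collection step, the following minimal Python sketch groups captured packets into TCP sessions. It assumes that each captured packet has already been exported into a record with the hypothetical fields src_ip, src_port, dst_ip, dst_port and length; these field names and the grouping by endpoint pair are assumptions made for this example and are not taken from the study.

```python
from collections import defaultdict


def session_key(src_ip, src_port, dst_ip, dst_port):
    """Return a direction-independent key for a TCP connection.

    Both directions of the same connection map to the same key because
    the two endpoints are stored in sorted order.
    """
    a = (src_ip, src_port)
    b = (dst_ip, dst_port)
    return (a, b) if a <= b else (b, a)


def group_into_sessions(packets):
    """Group packet records by TCP session, preserving capture order."""
    sessions = defaultdict(list)
    for pkt in packets:
        key = session_key(pkt["src_ip"], pkt["src_port"],
                          pkt["dst_ip"], pkt["dst_port"])
        sessions[key].append(pkt)
    return dict(sessions)


# Hypothetical packet records (illustrative values only, not study data).
packets = [
    {"src_ip": "10.0.0.2", "src_port": 50000,
     "dst_ip": "93.184.216.34", "dst_port": 443, "length": 74},
    {"src_ip": "93.184.216.34", "src_port": 443,
     "dst_ip": "10.0.0.2", "dst_port": 50000, "length": 74},
    {"src_ip": "10.0.0.2", "src_port": 50001,
     "dst_ip": "93.184.216.34", "dst_port": 80, "length": 66},
]
sessions = group_into_sessions(packets)
```

Here the first two records belong to the same session because they share the same pair of endpoints in opposite directions, while the third record opens a second session.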

3.2.2. Generating Background Traffic and Foreground Traffic of TCP Session
Generating Background traffic: All applications silently run in the background. For
example, Facebook (smartphone) and Firefox automatically update to the new version,
BkvPlus automatically communicates with the server, etc.
Generating Foreground traffic: Users intentionally run these software utilities.
Typical usage of the applications includes web browsing, opening the Facebook app, opening
YouTube to watch movies, etc.

3.2.3. Method Implementation


Step 1: Extracting latent attributes of Background traffic and Foreground traffic of
each TCP session.
Step 2: Assessing the impact of latent attributes on our new classification model.
Step 3: Assigning the “BG” or “FG” tag to the TCP session, based on the frequency
of the latent attributes of BG or FG traffic flows.
Step 4: Assigning the tags to the TCP session, from the 6th packet to the 24th packet,
depending on the length of the TCP session.
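
One possible reading of the packet range used in these steps is sketched below in Python: the packets from the 6th to the 24th (counted from 1) of a session are selected, and shorter sessions contribute fewer packets. The function names and the handling of sessions with fewer than six packets are assumptions for illustration; the paper does not specify them.

```python
FIRST_PACKET = 6   # 1-based index of the first packet used, as stated in the paper
LAST_PACKET = 24   # 1-based index of the last packet used, as stated in the paper


def classification_window(session_packets):
    """Return packets 6..24 (1-based, inclusive) of a session.

    Sessions shorter than 24 packets yield a shorter window, and sessions
    shorter than 6 packets yield an empty one.
    """
    return session_packets[FIRST_PACKET - 1:LAST_PACKET]


def can_classify(session_packets):
    """Assumption: a session is classifiable once its 6th packet has arrived."""
    return len(session_packets) >= FIRST_PACKET


# Packets are represented here only by their 1-based position in the session.
long_session = list(range(1, 31))    # 30 packets
short_session = list(range(1, 11))   # 10 packets

long_window = classification_window(long_session)    # packets 6..24
short_window = classification_window(short_session)  # packets 6..10
```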

3.2.4. Our Model of TCP early traffic classification
The frequency of the latent attributes of the BG traffic and that of the FG traffic of the same
TCP session are compared. In our new model of TCP early traffic classification, the “BG” tag
is assigned to that TCP session if the frequency of the latent attributes of the BG traffic is
higher than that of the FG traffic, and vice versa.
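
The comparison rule described in this subsection can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the attribute names are hypothetical placeholders, the sets of BG-typical and FG-typical attributes are assumed to be known in advance, and the "UNDECIDED" result for equal frequencies is an assumption, since the paper does not describe the tie case.

```python
from collections import Counter


def tag_session(observed_attributes, bg_attributes, fg_attributes):
    """Tag a TCP session by comparing BG and FG attribute frequencies.

    observed_attributes: attribute labels extracted from the session's packets.
    bg_attributes / fg_attributes: labels assumed to be typical of each class.
    """
    counts = Counter(observed_attributes)
    bg_freq = sum(n for attr, n in counts.items() if attr in bg_attributes)
    fg_freq = sum(n for attr, n in counts.items() if attr in fg_attributes)
    if bg_freq > fg_freq:
        return "BG"
    if fg_freq > bg_freq:
        return "FG"
    return "UNDECIDED"  # assumption: the tie case is not specified in the paper


# Hypothetical attribute labels, for illustration only.
BG_ATTRS = {"periodic_interval", "small_payload"}
FG_ATTRS = {"user_burst", "large_payload"}

observed = ["periodic_interval", "small_payload", "periodic_interval", "user_burst"]
tag = tag_session(observed, BG_ATTRS, FG_ATTRS)  # BG frequency 3, FG frequency 1
```

In this example the BG-typical attributes occur three times and the FG-typical attributes once, so the session receives the “BG” tag.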

Figure 3 describes our model of TCP early traffic classification, which classifies TCP sessions
as background traffic or foreground traffic based on the statistical information of distinct
frequency of TCP traffic flows.

Figure 3. Our Model of TCP early traffic classification

3.2.5. Results
The result of our study classifies TCP sessions as foreground traffic or background
traffic, as in Figure 4.

Figure 4 describes the classification of foreground or background traffic based on the
6th to 24th packets.

Figure 4. Classification of background (BG) and foreground (FG) traffic

3.2.6. Statistics of nearly 700 TCP sessions extracted from real data

Figure 5 illustrates the latent attributes of background traffic and foreground traffic from
the nearly 700 TCP sessions in this study.

Figure 5. Latent attributes used for TCP session classification.

3.3. Results
Our new TCP early classification model classifies TCP sessions as foreground or
background traffic from the 6th to the 24th packet, as follows:

Figure 6 shows the results of our study separating background traffic and foreground
traffic from the first packets.

Figure 6. Classification results of the system


3.4. Comparison

Figure 7 shows a comparison of the percentage of errors of our new TCP early classification
model and other algorithms: Decision Tree, Naïve Bayes, ANN, and SVM.

Figure 7. Comparison with other algorithms
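
For readers who want to reproduce this kind of comparison, the Python sketch below computes a percentage of errors, assuming it means the share of TCP sessions whose predicted tag differs from the true tag. That definition and the sample labels are assumptions for illustration only; they are not the values behind Figure 7.

```python
def error_percentage(predicted, actual):
    """Return the percentage of sessions whose predicted tag is wrong."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("at least one session is required")
    wrong = sum(p != a for p, a in zip(predicted, actual))
    return 100.0 * wrong / len(actual)


# Illustrative labels only (not results from the study).
predicted_tags = ["BG", "FG", "FG", "BG"]
true_tags = ["BG", "FG", "BG", "BG"]
err = error_percentage(predicted_tags, true_tags)  # one of four wrong: 25.0
```

The same function can be applied to the tags produced by each algorithm on the same set of sessions, so that the percentages are directly comparable.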

4. Results and Discussion

This study has demonstrated that our new TCP early classification model requires a
minimum number of packets, from the 6th to the 24th, to classify traffic as background or
foreground. Further, this new model contributes to the science of computer networks by
addressing the early classification of computer network traffic. While other studies often use
very large numbers of known packets or many network transactions (TCP sessions) for analysis,
this study works in the Transport layer and hence provides a more effective and universal method
of early traffic classification. Additionally, this new model improves the quality of Internet
service and the prevention of network attacks. It also lays a foundation for the design and
construction of Software Defined Networking systems as well as for system optimization. The
results of this study can be used in future related studies.

Acknowledgement:
We would like to express our deep gratitude to the VNU-HCM Ho Chi Minh City University of
Technology (HCMUT) for the support of time and facilities. We would like to thank Associate
Professor Quang Tran Minh and PhD Candidate Do Thanh Thai for their valuable contributions to
this study.

5. References
[1] Q. T. Minh, H. Koto, T. Kitahara, L. Chen, S. I. Arakawa, S. Ano, et al., "Separation of Background and
Foreground Traffic Based on Periodicity Analysis," 2015 IEEE Global Communications Conference
(GLOBECOM), 2015, pp. 1-7.
[2] Z. Kenesi, Z. Szabo, Z. Belicza, and S. Molnár, "On the effect of the background traffic on TCP's
throughput," 10th IEEE Symposium on Computers and Communications (ISCC'05), 2005, pp. 631-636.
[3] J. Huang, F. Qian, Z. M. Mao, S. Sen, and O. Spatscheck, "Screen-off traffic characterization and
optimization in 3G/4G networks," Proceedings of the 2012 ACM Conference on Internet Measurement
Conference, 2012, pp. 357-364.
[4] M. Suzuki, M. Watari, S. Ano, and M. Tsuru, "Traffic classification on mobile core network considering
regularity of background traffic," 2015 IEEE International Workshop Technical Committee on
Communications Quality and Reliability (CQR), 2015, pp. 1-6.
[5] M. Arumaithurai, X. Fu, and K. Ramakrishnan, "NF-TCP: a network friendly TCP variant for background
delay-insensitive applications," International Conference on Research in Networking, 2011, pp. 342-355.
[6] K. V. Vishwanath and A. Vahdat, "Evaluating distributed systems: Does background traffic matter?,"
USENIX Annual Technical Conference, 2008, pp. 227-240.
[7] T. T. Nguyen and G. Armitage, "A survey of techniques for internet traffic classification using machine
learning," IEEE Communications Surveys & Tutorials, vol. 10, no. 4, 2008, pp. 56-76.
[8] F. Silveira, C. Diot, N. Taft, and R. Govindan, "ASTUTE: Detecting a different class of traffic anomalies,"
ACM SIGCOMM Computer Communication Review, vol. 40, no. 4, 2010, pp. 267-278.
[9] G. Nychis and D. R. Licata, "The impact of background Network traffic on foreground network traffic," The
Proceeding of the IEEE Global Telecommunications Conference, GLOBECOM, 2001, pp. 1-16.
[10] J. Zhang, Y. Xiang, Y. Wang, W. Zhou, Y. Xiang, and Y. Guan, "Network traffic classification using
correlation information," IEEE Transactions on Parallel and Distributed Systems, vol. 24, no. 1, 2013, pp.
104-117.
[11] J. Zhang, Y. Xiang, W. Zhou, and Y. Wang, "Unsupervised traffic classification using flow statistical
properties and IP packet payload," Journal of Computer and System Sciences, vol. 79, no. 5, 2013, pp. 573-
585.
[12] J. Zhang, C. Chen, Y. Xiang, W. Zhou, and A. V. Vasilakos, "An effective network traffic classification
method with unknown flow detection," IEEE Transactions on Network and Service Management, vol. 10,
no. 2, 2013, pp. 133-147.
[13] G. G. Sena, and P. Belzarena, "Early traffic classification using support vector machines," The Proceedings of
the 5th International Latin American Networking Conference, ACM, 2009, pp. 60-66.
