LTE traffic analysis and application behavior characterization

Conference Paper · June 2014


DOI: 10.1109/EuCNC.2014.6882660

Gianluca Foddis (2), Rosario G. Garroppo (1), Stefano Giordano (1), Gregorio Procissi (1), Simone Roma (1), Simone Topazzi (2)
(1) Dept. of Information Engineering, University of Pisa, Italy. Email: <first.last>@iet.unipi.it, simone.roma@ing.unipi.it
(2) Telecom Italia, Torino, Italy. Email: <first.last>@telecomitalia.it

Abstract: The deployment of LTE and the explosion of the smartphone and tablet market increase the requirements of mobile connectivity, together with a change in users' expectations in terms of bandwidth, access speed, reliability and QoS management. In this new network scenario, traffic characterization and monitoring are of paramount relevance in order to prevent possible pitfalls during the deployment of new services. Hence, the paper presents the traffic analysis of a deployed eNodeB in a commercial network. The analysis is aimed at detecting traffic features at call and frame level, also accounting for the handset types.

I. INTRODUCTION

Long Term Evolution (LTE), the latest deployed cellular network technology, delivers high-speed data services for mobile devices with advertised bandwidth matching and even exceeding home broadband network speeds. In particular, LTE provides higher throughput (e.g. up to 300 Mbps in DL and 150 Mbps in UL exploiting advanced 4x4 MIMO techniques, and promising up to 1 Gbps with LTE-A) and lower latency than its 3G predecessors (i.e., UMTS, HSPA+). The explosion in the consumer market of smartphones and tablets, together with the pervasive usage of social networks and video on-demand, has poured millions of new mobile users into the net, so that mobile Internet traffic is expected to exceed the traffic generated by computers in the coming years. Given this new scenario, LTE technology needs extensive studies aimed at experimentally understanding how network resources are utilized by real users in a deployed commercial network setting. To this aim, the paper presents the analysis of traffic data acquired at one eNodeB of an Italian mobile operator. Traffic was treated according to security procedures and properly anonymized to respect customers' privacy. The goal is to evaluate the characteristics of traffic at call level and frame level. In particular, at call level, the analysis accounts for the duration of calls and the inter-arrival times of successive calls, whereas at the frame level the focus is on the applications and on the volume of traffic they generate.

The rest of the paper is organized as follows. Section II provides a survey of previous research concerning traffic measurements in mobile networks, while section III describes the measurement scenario. Sections IV and V report and discuss the main results of the analysis at the call and frame levels, respectively. Finally, section VI concludes the paper and sketches possible extensions of this work.

II. RELATED WORKS

Since the beginning of the mobile data era there has been great interest in the characterization and measurement of mobile traffic. The different studies in this area can be classified into terminal-based and network-based studies. Terminal-based studies are aimed at characterizing applications and user behavior by acquiring data on terminals (as examples, see [1] and [2]), whereas the network-based ones attempt to evaluate network performance and usage by measurement sessions carried out through equipment installed in the network. Hence, in this latter case the user has no information about the underlying monitoring process.

In [3] the authors conducted a detailed measurement analysis of network resource usage and subscriber behavior by using a large scale data set collected inside a 3G cellular data network. They studied the behavior of mobile subscribers in terms of the traffic they generate, their mobility and their activity, and found a significant variation of network usage among subscribers. Recently, in [4], the authors presented an in-depth study of the interactions among applications, network transport protocol, and the radio layer in the LTE system. They highlighted that LTE has significantly shorter state promotion delays and lower RTTs than those of 3G networks, and pointed out various inefficiencies in TCP over LTE.

Unlike the aforementioned papers, we deal with traffic generated by new smartphones and handset devices, whose computing power and connectivity are comparable to those of commodity PCs and fixed networks, respectively. We do not consider the detailed analysis of TCP interaction with other protocols of the LTE system, as presented in [4], since our study is aimed at evaluating the traffic characteristics, in terms of device behavior, duration and inter-arrival times of successive calls (or sessions), and application traffic volume, as observed at the LTE eNodeB.

III. MEASUREMENT SCENARIO

The monitored eNodeB operates in the 1800 MHz band and is located in a business area of Turin. Three different measurement sessions have been carried out, each lasting one week, in the spring, the summer and the fall of 2013, respectively. The three sessions show mostly the same qualitative results. From now on, we refer to these as week1, week2 and week3. Due to the higher amount of data, in sections IV and V all figures refer to week2 and week3.
A Tektronix K18 GbE probe, connected to a Tektronix NSA server, was used to perform the analysis, capturing all the traffic flowing through the S1 interface, that is both Control Plane (S1-MME) and User Plane (S1-U). The K18 GbE probe has 4x SFP connectors supporting optical and electrical interfaces, even in mixed configuration. The reference scenario is reported in Figure 1.

Fig. 1. Operational Scenario

We acquired data at both the call and the frame level. To protect user privacy, no payload data is considered except for HTTP headers, and no personal information is used in this study. At the call level we filtered the data to obtain only the calls with Type field equal to LTE originating/terminating call and Involved Interface equal to S1-MME/S1-U or S1-MME/S1-U/X2. This way, we focused the analysis on calls containing User Plane (UP) traffic. At the frame level, we analyzed the data to retrieve diverse information, such as: i) throughput, evaluated by means of a moving average of the observed traffic over a time window of 1 s; ii) the packet traffic of each active user, in order to evaluate its traffic volume as well as the used application/service; and iii) the packet traffic exchanged during a particular session or service.

IV. MEASUREMENT RESULTS AT THE CALL LEVEL

The correlation of the traffic data acquired at both Control Plane (CP) and User Plane (UP) levels provides information on single calls in the eNodeB. In particular, the measurement equipment provides a row for each classified call. To detect a call and to bind all its messages to it, the equipment follows pre-configured patterns, based on 3GPP specifications, in the messages observed in the CP. Different frame patterns in the CP are assumed to correspond to different call types.

For any call, the measurement equipment allows drilling down into all individual messages associated with it. By taking advantage of this feature, we were able to reconstruct the most important message exchange patterns. In particular, we chose to analyze only the LTE originated/terminated calls, involving the interfaces S1-MME/S1-U or S1-MME/S1-U/X2, and classified as "Normal Status". The Normal Status indicates that the call was closed properly.

Both LTE originating calls and LTE terminating calls are detected by observing the Non-Access Stratum (NAS) Service Request procedure. In particular, after the attachment procedures, the User Equipment (UE) remains attached to the Evolved Packet System (EPS) and the default bearer is maintained active as a logical connection between the UE and the EPC. In this condition, the UE is in RRC-IDLE/ECM-IDLE state (RRC stands for Radio Resource Control, whereas ECM stands for EPS Connection Management). In this state, three different cases can trigger the NAS Service Request procedure: i) the UE has data to send to the PDN; ii) the MME needs to send NAS signaling to the UE (e.g. the Detach command); iii) the SGW has data coming from the PGW/PDN and directed to the UE.

In ECM-IDLE mode, when the UE has data to transmit, the UE triggers the NAS procedure by transmitting a NAS Service Request message, which quickly resumes a logical channel on the radio interface. Then, the UE transmits a Service Request message (on the NAS layer) to the MME. At the eNodeB, the NAS PDU is piggybacked by the S1AP transport function and forwarded to the MME using an S1AP Initial UE Message. Upon receiving the Service Request message, the MME initiates the Bearer setup procedure with the SGW. By observing this message exchange, the measurement equipment is able to detect an LTE originating call.

When data is directed to the UE, the SGW activates a NAS Service Request procedure for transmitting such data to the UE. In particular, the SGW creates a DL Data Notification message and forwards it to the MME. Since the UE is in idle state, its location is known to the MME on a per Tracking Area (TA) basis. Thus, the MME has to page all eNodeBs within the TA. The eNodeB receives the S1AP Paging message and constructs the RRC Paging message. The UE wakes up at every Paging occasion and, if the Paging is for the PS (Packet Switched) domain, the UE NAS layer triggers a Service Request procedure. In this case, the measurement equipment detects that the first message triggering the procedure was a Paging message, and thus establishes that the new call is an LTE terminating call. The two procedures, which make it possible to detect LTE originating and LTE terminating calls, are used to establish the starting time of an observed call, T_s.

In both cases, the call is closed when the UE re-enters the idle state and the RRC bearer on the Uu interface is removed. The observation of the UE Context Release messages on the S1-MME interface makes it possible to establish the termination time, T_t, of a call. Then the duration of a call, C_D, can be easily computed as C_D = T_t - T_s.

Table I summarizes the results for the call analysis.

TABLE I. NUMBER OF OBSERVED CALLS FOR EACH CONSIDERED DEVICE

Measurement Session   SmartphoneA     TabletA         SmartphoneB     Other
Week1                 1725 (67.43%)   465 (17.59%)    167 (6.52%)     197 (7.70%)
Week2                 3320 (50.54%)   2619 (39.86%)   401 (6.1%)      229 (3.48%)
Week3                 4364 (58.19%)   498 (6.64%)     1482 (19.76%)   1155 (15.41%)
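
The call detection rules and the duration formula described above can be illustrated with a minimal Python sketch. It assumes a simplified, hypothetical input (a list of timestamped control-plane message labels per call); the label strings, the `Call` structure and the choice of the last UE Context Release message as termination time are illustrative assumptions, not the format or logic of the measurement equipment.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical message labels; the real probe output format is not given in the paper.
PAGING = "S1AP_PAGING"
SERVICE_REQUEST = "NAS_SERVICE_REQUEST"
CONTEXT_RELEASE = "S1AP_UE_CONTEXT_RELEASE"


@dataclass
class Call:
    direction: str   # "originating" or "terminating"
    t_start: float   # T_s, in seconds
    t_end: float     # T_t, in seconds

    @property
    def duration(self) -> float:
        # C_D = T_t - T_s
        return self.t_end - self.t_start


def build_call(messages: List[Tuple[float, str]]) -> Call:
    """Classify one call from its (timestamp, label) control-plane messages."""
    ordered = sorted(messages)
    first_time, first_label = ordered[0]
    if first_label == PAGING:
        # The procedure was triggered by the network: LTE terminating call.
        direction = "terminating"
    elif first_label == SERVICE_REQUEST:
        # The procedure was triggered by the UE: LTE originating call.
        direction = "originating"
    else:
        raise ValueError(f"unexpected first message: {first_label}")
    releases = [t for t, label in ordered if label == CONTEXT_RELEASE]
    if not releases:
        raise ValueError("no UE Context Release observed; call not closed")
    # Assumption: the last release message marks the termination time T_t.
    return Call(direction, first_time, releases[-1])
```

Calls for which no release message is observed raise an error here, which loosely mirrors the restriction of the analysis to calls in "Normal Status".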
Fig. 2. Device Distribution (percentage of SmartB, TabletA, SmartA and Other devices for Week2 and Week3)

Fig. 3. Call Duration Times, C_D: SmartphoneA vs TabletA vs SmartphoneB (empirical CDF versus Duration [sec])

In particular, we observed that most of the traffic is produced by three kinds of handset. We will refer to these devices as smartphoneA, smartphoneB and tabletA, where A and B indicate two different vendors. For detecting the kind of device, the TAC (Type Allocation Code, formerly Type Approval Code) has been used. Depending on the device, the TAC consists of the first 6 or 8 digits of the IMEI. It is worth noting that we never considered the whole IMEI, because of privacy constraints. In the Table, Other indicates other devices (e.g. dongles) or refers to calls without TAC notification.

During the three sessions, the overall number of calls constantly grows, indicating the increase of usage of LTE devices: smartphoneA stably represents the most used device, followed by smartphoneB and tabletA. From week1 to week3, smartphoneA calls have increased by four times, smartphoneB shows the largest growth, passing from 150 calls to more than 1400 calls, and tabletA, after an unexpected growth, returns to the values of week1.

In order to obtain the device distribution, we associated on a daily basis each IP address with each kind of device, exploring the "User Agent" field in the HTTP message. Due to privacy concerns, we did not collect any IMEI or IMSI and could not follow the correct approach, based on the association between device and IMEI. Subscriber identification based on IP addresses introduces some approximations because NAT in cellular networks uses timers to disconnect connections that are idle for too long [6]. Nevertheless, LTE handsets can be assumed to be always-on devices, and NAT timers are typically large enough to ensure that no changes occur during one day.

Figure 2 shows the distribution of the considered classes of devices for week2 and week3, where at least 6000 calls have been observed. Undoubtedly, smartphoneA is the most widespread device, followed by smartphoneB. We note that during week2 the number of users with tabletA is twice that of week3. This observation explains the great number of tabletA calls in the second session. As in Table I, Other indicates both dongles and IP addresses that cannot be mapped to any device. In particular, we also associated with this class the frames generated by operating systems, like Windows NT 6.1 (Windows 7), which do not report any device identifier in the User Agent.

Figure 3 shows the empirical CDF of the observed C_D for each considered device type. We limit the CDF to 1000 seconds for the sake of simplicity; however, this range includes more than 99% of all values. Regardless of the handset type, each call experiences a minimum duration of 60 seconds, unless a handover occurred earlier. This is due to the Inactivity Timer parameter set to 60 s, which forces the device to release the RRC connection after 60 s of inactivity. Furthermore, the CDFs exhibit a wide range of values: about 90% of calls, both for smartphoneA and tabletA, end in less than 200 seconds, and a small number of calls last up to 6000 s (not shown).

On the contrary, when we analyze the inter-arrival times among successive calls (Figure 4), we observe some differences among the handset types. In particular, the data associated with smartphoneA (figure 4(a)) show a large number of calls with inter-arrival times of about 600 s. This behavior can be associated with some typical smartphoneA applications that periodically request services from the network. Similarly, tabletA (figure 4(b)) generates many calls with inter-arrival times of about 300 s, even if this behavior is much less pronounced than that of smartphoneA. For the sake of simplicity, we limit the observation to 1000 seconds. It is worth noticing that gaps longer than 1000 s are not very important: as we will emphasize later (section V), the network load is mainly produced by workers during working hours, therefore long inter-call times occur during the night, when no user activity is detected.

These results confirm the analysis carried out with 3G devices and summarized in [5]. Open operating systems enable mobile phone software developers to design various programs similar to the ones running on desktops. Thus, almost every application has its mobile-oriented version on smartphones. For real-time web services, frequent or periodical heartbeat packets work as keep-alive packets to maintain the connection. Three interaction technologies are mainly used: Pull/Polling, Long-polling and Push. Some proprietary applications on smartphoneA are based on push technology: a server pushes the new content periodically, every 10-15 minutes, hence explaining the observed inter-arrival time of 600 s. We may infer that push notification services for tabletA applications access the network every 5 minutes. Apparently, no particular behavior can be deduced for smartphoneB (figure 4(c)), which does not show prevailing inter-arrival time values.
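
As a rough illustration of how statistics like those in Figures 3 and 4 could be derived, the following Python sketch computes inter-arrival gaps, an empirical CDF, and the most populated histogram bin of the gaps. The function names, the 10 s bin width and the per-device list of call start times are assumptions made for this example; the paper does not state the binning or tooling actually used.

```python
from collections import Counter
from typing import Iterable, List, Optional, Tuple


def inter_arrival_gaps(start_times: Iterable[float]) -> List[float]:
    """Gaps (in seconds) between successive call start times."""
    ordered = sorted(start_times)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def empirical_cdf(samples: Iterable[float]) -> List[Tuple[float, float]]:
    """Sorted samples paired with their cumulative fraction rank/n."""
    ordered = sorted(samples)
    n = len(ordered)
    return [(value, (rank + 1) / n) for rank, value in enumerate(ordered)]


def dominant_gap(gaps: Iterable[float],
                 bin_width: float = 10.0,
                 max_gap: float = 1000.0) -> Optional[float]:
    """Centre of the most populated histogram bin among gaps <= max_gap."""
    bins = Counter(int(g // bin_width) for g in gaps if 0 <= g <= max_gap)
    if not bins:
        return None
    best_bin = max(bins, key=bins.get)
    return (best_bin + 0.5) * bin_width
```

A pronounced peak returned by `dominant_gap` near 600 s or 300 s would correspond to the periodic behavior attributed above to push-based applications, while the 1000 s cut-off mirrors the observation window used in the text.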
Fig. 4. Call Interarrival Times: SmartphoneA vs TabletA vs SmartphoneB (panels: (a) SmartphoneA, (b) TabletA, (c) SmartphoneB; CDF and histogram versus Gap [sec])

Fig. 5. Throughput: Whole Week vs One Day (panels: (a) Week, (b) One Day)

V. MEASUREMENT RESULTS AT THE FRAME LEVEL

A. Throughput Analysis

The first analysis focuses on the UP throughput observed during one of the considered weeks. The results reported in Figure 5(a) show that in general the network load is quite low. This is mainly due to the early stage of LTE services. Furthermore, we observed that during the week-end (Saturday and Sunday) the traffic is almost zero. This observation can be explained taking into account that the monitored eNodeB is located in a business area of Turin. Hence, we can deduce that the main part of the traffic is generated by workers during the week. Moreover, the results show that, typically, the maximum throughput is reached in the interval 12:00-16:00 (i.e. the lunch break and the early afternoon), and then it progressively decreases. This characteristic reinforces the idea that LTE devices are not yet widespread outside business environments, although some bandwidth-consuming applications, such as video streaming, are used during work breaks.

The next step is to analyze the day-by-day throughput and to focus on the most popular applications. As an example, we propose the throughput profile of one day, shown in Figure 5(b). The figure points out that the largest traffic volume is produced by one user who established an HTTP video streaming session. Indeed, during this session, the average throughput is about 21 Mbit/s, more than 20% of the available resources on the LTE radio interface. Other applications, such as Facebook or email, generate very little load. It is worth noticing that user 2 executes a Speedtest. Speedtest has become the most popular and immediate testing tool for network performance. It is largely used in LTE, as evidence of the higher user expectations.

B. Application Analysis

The next step was to investigate the frequency of occurrence of the most common applications, namely Facebook, Google, Apple, YouTube and Mail. Moreover, we considered the connections towards Akamai servers. Akamai is one of the world's largest distributed-computing platforms and counts Facebook, Twitter, Yahoo and others among its customers.

We analyzed the observed DNS queries and responses. In particular, we checked whether the DNS hostname contained patterns correlated to one of the aforementioned applications. For example, strings such as icloud or itunes belong to Apple applications, whereas fbstatic or fbexternal belong to the Facebook application. The results are summarized in Table II, which reports the percentage of occurrences for each considered application during each day of week2.

The results show a low percentage of YouTube (with a peak of 9% on Saturday) with respect to the other popular applications and a high percentage of non-classified occurrences ("Other" in the Table), mostly referring to generic Web browsing with a small component of applications such as Twitter, Whatsapp or Instagram. It is interesting to note a large percentage of gaming during Monday and Tuesday, about 32 and 35% respectively. During the week, the percentage of Facebook and Apple occurrences is quite stable, in the range 7-24% for the former and around 20% for the latter. Mail is in the range 10-20%, with a peak at 26% on Friday.

A further analysis is aimed at evaluating the traffic volume and the session duration for each application. In particular, we detect the frames exchanged for each connection using the canonical 5-tuple: protocol, IP source address, IP destination address, layer 4 source port, layer 4 destination port. Then, we detected the packets belonging to each connection, and computed the number of exchanged bytes as well as the duration of connections. By using the information from DNS queries, we resolved IP destination addresses to the associated hostnames and mapped each observed connection to a specific service.

We defined and analyzed the sessions of the main applications observed in the measurements. For ease of presentation, in the following we report the results for Facebook and Mail only. First of all, we need to define one Facebook session. Mainly, we are interested in the impact of Facebook usage from a network perspective. We generated several traces in a controlled fashion to define the signature for the beginning of a session. Taking into account our results and previous studies ([7], [8]), we find that approximately 6-7 TCP connections are opened when the user activates the Facebook App. We assume that the session is closed when all Facebook connections are closed, or when the user leaves the cell. The statistics on the session duration and the traffic volume exchanged during the sessions are summarized in the scatter plot (Figure 6).
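
The per-connection accounting and DNS-based labeling described above could be sketched in Python as follows. This is a simplified illustration under stated assumptions: the `Packet` record, the `dns_map` dictionary (IP address to hostname, as learned from observed DNS responses) and the bidirectional flow key are hypothetical choices made here, and only the four hostname patterns quoted in the text are taken from the paper.

```python
from typing import Dict, NamedTuple, Tuple


class Packet(NamedTuple):
    ts: float        # capture timestamp, seconds
    proto: str       # e.g. "TCP" or "UDP"
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    length: int      # bytes


# Only these four substrings are quoted in the paper; a real list would be longer.
PATTERNS: Dict[str, Tuple[str, ...]] = {
    "Apple": ("icloud", "itunes"),
    "Facebook": ("fbstatic", "fbexternal"),
}


def classify_hostname(hostname: str) -> str:
    """Map a DNS hostname to an application label by substring matching."""
    name = hostname.lower()
    for app, needles in PATTERNS.items():
        if any(needle in name for needle in needles):
            return app
    return "Other"


def aggregate_flows(packets, dns_map: Dict[str, str]) -> dict:
    """Group packets by 5-tuple (both directions merged) and label each flow."""
    flows: dict = {}
    for p in packets:
        # Sort the two endpoints so both directions share one key (assumption).
        a, b = sorted([(p.src_ip, p.src_port), (p.dst_ip, p.dst_port)])
        key = (p.proto, a, b)
        flow = flows.get(key)
        if flow is None:
            host = dns_map.get(p.dst_ip) or dns_map.get(p.src_ip) or ""
            flow = flows[key] = {"bytes": 0, "first": p.ts, "last": p.ts,
                                 "app": classify_hostname(host)}
        flow["bytes"] += p.length
        flow["first"] = min(flow["first"], p.ts)
        flow["last"] = max(flow["last"], p.ts)
    return flows
```

The connection duration is then `flow["last"] - flow["first"]`. Merging both directions under one key is a design choice of this sketch, so that the byte count reflects the data exchanged in the whole connection.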
TABLE II. STATISTICS OF THE NETWORK SCENARIOS

Day         Facebook   Apple    Akamai   Google   Mail     YouTube   Other (Web browsing, Twitter, Whatsapp, etc.)
Sunday      24.34%     21.55%   10.65%   7.05%    9.50%    0.61%     35.04%
Monday      8.27%      5.56%    1.03%    16.19%   8.47%    1.64%     52.87%
Tuesday     7.95%      22.08%   9.04%    8.08%    10.34%   0.51%     47.27%
Wednesday   7.63%      11.51%   1.1%     9.42%    10.17%   2.72%     41.24%
Thursday    14.69%     14.97%   2.15%    9.89%    12.08%   3.21%     38.57%
Friday      17.08%     21.93%   6.95%    7.42%    26.89%   0.93%     19.93%
Saturday    18.75%     18.24%   8.73%    5.96%    7.01%    9.24%     35.28%

Fig. 6. Call Analysis: Scatter Plots (panels: (a) Facebook, (b) Mail; Time [Minutes] versus Data [Kbyte])

Each point of the scatter plot represents an observed session: the abscissa represents the data exchanged during the session (in KBytes), while the ordinate represents its duration.

The results for Facebook sessions show a high variability: the duration of sessions varies from a few seconds to more than 800 minutes, and data download or upload varies from a few bytes to less than 80000 KBytes. For better visualization, we limit the scatter plot to 600 minutes (10 hours) and 10000 KBytes, including more than 92% of the samples (figure 6(a)).

Apparently, there is no correlation between duration and data exchanged. Very long sessions do not involve huge traffic exchange and, conversely, short sessions can produce a large amount of data. These observations stimulate the need to learn more about the behavior of these widely used applications, to discover their coexistence with typical timers inside mobile networks (e.g. Inactivity Timer).

For Mail connections, we heuristically assumed that all packets belonging to successive connections separated by less than 300 s belong to the same session. We associate a name-resolved TCP connection to "Mail" when one of these strings is recognized: mail, posta (the Italian word for mail), pop, and hot (when it appears together with microsoft). We chose these patterns after scanning our traces, and used them to classify all the observed DNS queries. Sessions with duration shorter than 70 minutes and traffic volume less than 600 KBytes (Figure 6(b)) account for about 99% of all samples. Again, most of them last less than 50 minutes and generate less than 300 KBytes of traffic. Unexpectedly, the scatter plot shows a great number of samples with data exchanged between a few KBytes and 300 KBytes, when the session duration is a few minutes or in the range of 35-40 minutes. This behavior is under investigation. A deeper analysis reveals additional interesting points, such as:

• many connections need no more than 30 seconds to update the mail box;

• when the duration increases, the mail box is updated more times, but a larger amount of data is not necessarily downloaded;

• some calls expire in a very short time, but generate many KBytes of network traffic. This is probably due to the download of large attachments.

VI. CONCLUSION AND FUTURE WORKS

This paper reports on traffic measurements carried out at an LTE eNodeB of Telecom Italia located in a business area in Turin in summer/fall 2013. The analysis of the acquired traffic data, carried out at call and frame level, highlights some interesting features. In particular, at call level, the results show that the utilization of push technology implies the presence of a regular pattern in the inter-arrival times between successive calls and in the call duration. This phenomenon impacts the traffic in the control plane and the energy consumption of the user device. A deeper analysis of these aspects is left as future work. At the frame level, the analysis shows that the observed eNodeB, located in a business area, is mainly used during the break times of the working days of the week. On the basis of DNS classification, a picture of the most widespread applications is provided.

REFERENCES

[1] Falaki, H., Mahajan, R., Kandula, S., Lymberopoulos, D., Govindan, R., and Estrin, D. (2010, June). Diversity in smartphone usage. In Proceedings of the 8th international conference on Mobile systems, applications, and services (pp. 179-194). ACM.
[2] Xu, Q., Erman, J., Gerber, A., Mao, Z., Pang, J., and Venkataraman, S. (2011, November). Identifying diverse usage behaviors of smartphone apps. In Proceedings of the 2011 ACM SIGCOMM conference on Internet measurement conference (pp. 329-344). ACM.
[3] Paul, U., Subramanian, A. P., Buddhikot, M. M., and Das, S. R. (2011, April). Understanding traffic dynamics in cellular data networks. In INFOCOM, 2011 Proceedings IEEE (pp. 882-890). IEEE.
[4] Huang, J., Qian, F., Guo, Y., Zhou, Y., Xu, Q., Mao, Z. M., and Spatscheck, O. (2013, August). An in-depth study of LTE: Effect of network protocol and application behavior on performance. In Proceedings of the ACM SIGCOMM 2013 conference on SIGCOMM (pp. 363-374). ACM.
[5] http://www.huawei.com/en/static/hw-001545.pdf
[6] Haverinen, H., Siren, J., and Eronen, P. (2007, April). Energy consumption of always-on applications in WCDMA networks. In Vehicular Technology Conference, 2007. VTC2007-Spring. IEEE 65th (pp. 964-968). IEEE.
[7] Schneider, F., Feldmann, A., Krishnamurthy, B., and Willinger, W. (2009, November). Understanding online social network usage from a network perspective. In Proceedings of the 9th ACM SIGCOMM conference on Internet measurement conference (pp. 35-48). ACM.
[8] Wongyai, W., and Charoenwatana, L. (2012, May). Examining the network traffic of Facebook homepage retrieval: An end user perspective. In Computer Science and Software Engineering (JCSSE), 2012 International Joint Conference on (pp. 77-81). IEEE.
