On Load Balancing For A Virtual and Dist
Approaches        | Architecture | MME Type  | LB Policy    | Mapping Alg. | Mapping ID        | Eval. Method
DMME [7]          | 2-Tier       | Stateless | N/A          | N/A          | N/A               | Analy. Model
Yusuke et al. [8] | 3-Tier       | Stateless | RR           | DHT          | Group of IMSI     | Prototype
Gopika et al. [9] | 3-Tier       | Stateless | RR           | DHT          | eNodeB UE S1AP ID | Prototype
SCALE [10]        | 2-Tier       | Stateful  | Least Loaded | CH           | GUTI              | Prototype
vMME [11]         | 3-Tier       | Stateless | RR           | N/A          | N/A               | Analy. Model
Yousaf et al. [12]| 2-Tier       | Stateful  | RR           | N/A          | N/A               | Analy. Model
Varudu [13]       | 2-Tier       | Stateful  | RD/RR        | DHT          | eNodeB UE S1AP ID | Prototype
RD = Random; RR = Round Robin; DHT = Distributed Hash Table; CH = Consistent Hashing
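As a concrete illustration of the three load balancing policies compared in this section, the following Python sketch implements uniform random (RD), round-robin (RR), and weighted round-robin (WRR) server selection. It is a minimal sketch for illustration only, not the Open5GCore load balancer; note also that production WRR schedulers such as the one in the Linux Virtual Server interleave the weighted picks smoothly, whereas this naive version emits a server's picks back-to-back within each cycle.

```python
import random
from itertools import cycle


class RandomBalancer:
    """Uniform RD policy: each request goes to a uniformly random back-end."""

    def __init__(self, servers):
        self.servers = list(servers)

    def pick(self):
        return random.choice(self.servers)


class RoundRobinBalancer:
    """RR policy: requests are handed to the back-ends in circular order,
    irrespective of their capacities or locations."""

    def __init__(self, servers):
        self._next = cycle(servers)

    def pick(self):
        return next(self._next)


class WeightedRoundRobinBalancer:
    """WRR policy: a server with weight w appears w times per cycle, so it
    receives a share of requests proportional to its weight."""

    def __init__(self, weighted_servers):
        # weighted_servers: iterable of (server, weight) pairs
        expanded = [s for s, w in weighted_servers for _ in range(w)]
        self._next = cycle(expanded)

    def pick(self):
        return next(self._next)
```

For instance, `WeightedRoundRobinBalancer([("MME1", 1), ("MME4", 8)])` sends eight times as many requests to the fourth MME as to the first, matching the 8:1 weight ratio described for Scenario 1 below (the full weight assignment is given in Fig. 4).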
of requests being handled by each back-end server is significant. However, when the load balancer receives a large number of requests, the RD algorithm will be able to evenly distribute the requests. The RD algorithm is easy to implement, but it only performs well for clusters of servers with a similar configuration in terms of CPU, memory, etc. A uniform RD algorithm is considered in this paper.

2) RR Algorithm: Similar to the RD load balancing algorithm, the RR load balancing algorithm is one of the easiest load balancing algorithms to implement. The RR algorithm distributes incoming requests to the back-end servers in a round-robin manner. Like the uniform RD algorithm, the RR algorithm works irrespective of the resource capabilities and locations of the back-end servers. It is sufficient for clusters consisting of servers with identical specifications. The RR algorithm implemented in this paper is similar to the RR scheduling algorithm used in the Linux Virtual Server [17].

3) WRR Algorithm: In contrast to the RD and RR algorithms, the WRR algorithm assigns each back-end server a value called "weight" (w), which is proportional to the actual resource capabilities (e.g., CPU and RAM) of the server or to the speed of the link from the LB to the server. In this way, the incoming requests are distributed in proportion to the weight of each server. For example, a server with a weight of two will receive twice as many requests as one with a weight of one. The way WRR is used in this paper is inspired by how it is employed for load balancing in cloud web services [18]. In our current implementation, the weight is set to a fixed value according to two factors: the number of CPUs allocated to the vMME servers, and the latency of the links between the LB and the vMME servers. The weight assigned in this vMME system should not be confused with the weight factor used in the traditional approach. In fact, the weight in the vMME system can be recomputed adaptively based on the load of each vMME instance or other factors.

IV. EXPERIMENTAL EVALUATION

This section first describes our experiment setup, focusing on two different scenarios: link latency based load balancing (Scenario 1) and CPU capacity based load balancing (Scenario 2). Then, results from our experiments are provided.

Fig. 4. Detailed configuration of experimental scenarios: (a) Scenario 1; (b) Scenario 2.

A. Experiment Setup

We use the Open5GCore platform, a standards-compliant implementation of the mobile packet core system for 4G and 5G, to conduct our experiments. Kernel-based Virtual Machine (KVM) [19] is used as the virtualization base. In the Open5GCore, the MME is implemented according to the two-tier approach, and the load balancer randomly distributes the user requests. The Benchmarking Tool (BT) VM provides the emulation of the Long Term Evolution (LTE) radio access network, including emulated UEs connecting to emulated eNodeBs. We consider two types of scenarios; the settings for both are detailed in Table II. The attachment delay measures the time from when the UE sends an attachment request to the network until it receives back the attachment accept
[Bar graphs of average attachment delay [ms] versus request rate [requests/s] for the RD, RR, and WRR schemes.]
Fig. 5. Comparison of the attachment delay of the RD, RR, and WRR schemes in Scenario 1.
Fig. 6. Comparison of the attachment delay of the RD, RR, and WRR schemes in Scenario 2.
and a default bearer (i.e., the data plane) for the UE is established. In order to have a more realistic scenario, we take into account the heterogeneity of the signaling load by considering a mix of three different signaling events: attachment, detachment, and handover. It has been shown in [20] and [21] that the number of attachment requests is similar to the number of detachment requests, and that the number of S1-based handover requests is often a third to a half of the number of attachment requests. Based on this observation of the signaling usage pattern in the LTE network, we use the following ratio in all our experiment scenarios: 40% attachment, 40% detachment, and 20% handover. All experiments start by pre-attaching 100 UEs to the system; the request pattern is then executed under different request rates until all pre-configured UEs finish performing their operations (i.e., generating attachment, detachment, and handover requests). Each time requests are generated, we measure the attachment delay and the number of successfully attached UEs. Finally, we compute the weighted average of the attachment delay for each UE at each request rate.

1) Scenario 1: In this scenario, we compare the performance of the three load balancing schemes when the link latencies from the LB to the MME instances differ while the resource capacities, i.e., the CPU and memory capacities, of the MME instances are the same. The detailed configuration of the experimental Scenario 1 is shown in Fig. 4 a). The link latency is emulated using the Netem traffic control tool in Linux [22]. The weight (w) of each MME in the WRR scheme is assigned according to the link latency, as shown in Fig. 4 a). The number of UEs used in this scenario is 10000. We run the experiment with different overall request rates: 10, 30, 50, 80, 100, 150, 180, 200, 250, and 300 requests/s.

2) Scenario 2: In this scenario, we let the cluster of MME instances be heterogeneous by allocating MMEs with different resources. In particular, the MME instances are allocated the same amount of memory but different numbers of CPUs, as shown in Fig. 4 b). In addition, the link latency from the LB to each MME is kept the same. The weight (w) of each MME instance in the WRR load balancing scheme is assigned according to its CPU capacity. To make an overload situation possible, the total number of UEs configured in this scenario is 20000. We run the experiment with different overall request rates: 50, 80, 100, 150, 200, 250, 300, 350, 400, and 450 requests/s.

B. Experimental Results

In the following, we present the results obtained from the two experiment scenarios. The attachment delay and the CPU utilization of each MME are measured to compare the three load balancing schemes.

1) Scenario 1: The experimental results for this scenario are shown in Fig. 5 and Fig. 7. The bar graphs in Fig. 5 show the mean attachment delay for the three load balancing schemes in Scenario 1; the error bars in the graphs illustrate the 95% confidence intervals. As observed from Fig. 5, the attachment delay increases as the request rate increases for all three load balancing schemes. We observe that when the overall request rate is less than 200 requests/s, which is equivalent to 80 attachment requests/s, the attachment delay in the WRR scheme is up to 38% lower than that in the RD and the RR schemes. The reason is that in the WRR scheme, according to the weight assignment, eight times more requests are handled by the fourth MME, which has the lowest link latency to the LB, than by the first MME, which has the highest link latency to the LB. The requests are distributed evenly in RR, or almost evenly in RD, among the four MMEs. Therefore, the number of requests traversing the long-delay links is higher than in the WRR scheme, leading to a higher attachment delay. When the
[Fig. 7: CPU utilization [%] of MME1-MME4 versus request rate [requests/s] (10-300 requests/s), one panel per load balancing scheme, in Scenario 1.]
[Fig. 8: CPU utilization [%] of MME1-MME4 versus request rate [requests/s] (50-450 requests/s), one panel per load balancing scheme, in Scenario 2.]
overall rate is about 250 requests/s or more, the WRR scheme performs worse. Due to its large weight, the fourth MME has to process a large number of UEs at a high rate, which exceeds its capacity: its queue fills up quickly and its CPU utilization goes up, thus resulting in a higher attachment delay than that of the RD and the RR schemes.

By observing the CPU utilization of each MME in Fig. 7, it is clear that when the request rate increases, the CPU utilization of each MME also increases for all three schemes. The CPU utilizations of the four MMEs in the RD and the RR schemes are almost the same for each request rate, whereas the CPU utilization of the four MMEs in the WRR scheme is proportional to the weight assigned to each MME (as shown in Fig. 4 a)). Since the fourth MME is assigned the highest weight, it has the largest CPU utilization.

2) Scenario 2: The experimental results for this scenario are shown in Fig. 6 and Fig. 8. The bar graphs in Fig. 6 show the mean attachment delay under the three load balancing schemes in Scenario 2; the error bars in the graphs illustrate the 95% confidence intervals. We observe that when the request rate increases, the attachment delay also increases in all three schemes. When the overall rate is less than 300 requests/s, which is equivalent to 120 attachment requests/s, the attachment delay in the three schemes is almost the same. However, when the overall rate is higher, the WRR starts outperforming the other two schemes. For example, at the overall rate of 450 requests/s, which corresponds to 180 attachment requests/s, the attachment delay for the WRR scheme is about 40% lower than that for the RD scheme. The reason is that the number of requests that needs to be handled by MME1, which has the lowest CPU resources, in the RD and the RR schemes is higher than its maximum capacity, leading to a saturation of CPU resources. Consequently, the attachment delay goes up. However, since we know the amount of CPU resources allocated to each MME and assign the weights accordingly, the number of requests handled by each MME will be proportional to its assigned weight. Therefore, contrary to the RD and RR schemes, no overload happens, and the MMEs are able to process more requests than in these schemes.

As seen in Fig. 8, the CPU utilization of each MME also increases when the request rate increases. In the RD and the RR schemes, the distribution of the CPU utilization of each MME corresponds to their allocated capacity. When the overall rate increases and becomes higher than 350 requests/s, we can see that the CPU utilization of MME1 goes up to approximately 100%. That explains the high attachment delay in the RD and the RR schemes shown in Fig. 6. In contrast, the CPU utilization
of each MME in the WRR scheme is almost the same for each request rate and is far from the saturation point even at an overall rate of 450 requests/s.

V. CONCLUSION AND FUTURE WORKS

In this paper, we have presented a preliminary study which attempts to address the scalability of the MME in 4G/5G mobile core networks. We discussed the two main vMME architectures: a two-tier and a three-tier architecture. Our initial experiment results for the two-tier architecture have illustrated the benefits of having multiple vMME instances in reducing the control plane latency when the system is heavily loaded. However, most of the existing approaches pay little attention to the load balancing function while implementing such a new vMME architecture. The two currently available load balancing schemes, the RD and the RR schemes, are not sufficient since they do not take into account the resource capacities and the locations of the MMEs. This is illustrated in the paper by comparing the results of these two algorithms with those of our implemented WRR algorithm, which offers a significantly reduced attachment delay and better resource utilization. More specifically, the WRR gives almost a 38% reduction in attachment delay in the link-based load balancing scenario when the request rate is low, and almost a 40% reduction in attachment delay in the CPU capacity based load balancing scenario when the request rate is high. However, the current implementation of the WRR algorithm statically assigns the weights of the vMME instances. In addition, the two scenarios are currently considered separately, and there are situations where the WRR does not perform better than the RD and RR schemes. For example, there is no significant difference among the three schemes in the second scenario when the request rate is low. Therefore, in our future work, we intend to enhance our WRR algorithm by letting the weights be dynamically assigned on the basis of information fed back from the vMMEs about their actual loads as well as their links to the LB. We will also take the differences between signaling events into account when making the balancing decision. In addition, several scaling strategies, such as proactive/reactive and hybrid scaling strategies, are being investigated in conjunction with the load balancing function in order to create a scalable and efficient vMME solution.

ACKNOWLEDGMENT

The work was carried out in the High Quality Networked Services in a Mobile World (HITS) project, funded by the Knowledge Foundation of Sweden.

REFERENCES

[1] V.-G. Nguyen, A. Brunstrom, K.-J. Grinnemo, and J. Taheri, "5G mobile networks: Requirements, enabling technologies, and research activities," in Comprehensive Guide to 5G Security, M. Liyanage, I. Ahmad, A. B. Abro, A. Gurtov, and M. Ylianttila, Eds. John Wiley & Sons Ltd., 2018, ch. 2, pp. 31–57.
[2] ——, "SDN/NFV-based mobile packet core network architectures: A survey," IEEE Communications Surveys & Tutorials, vol. 19, no. 3, pp. 1567–1602, 2017.
[3] 3GPP, "3GPP TS 23.501: Technical specification group services and system aspects; system architecture for the 5G system; stage 2 (release 15)," 2017.
[4] ——, "3GPP TS 23.401: Technical specification group services and system aspects; general packet radio service (GPRS) enhancements for evolved universal terrestrial radio access network (E-UTRAN) access (release 10)," 2011.
[5] Nokia Siemens Networks, "White paper: Signaling is growing 50% faster than data traffic," http://goo.gl/aUKANS, 2012, [Online; accessed 6-June-2017].
[6] A. S. Rajan, S. Gobriel, C. Maciocco, K. B. Ramia, S. Kapury, A. Singhy, J. Ermanz, V. Gopalakrishnanz, and R. Janaz, "Understanding the bottlenecks in virtualizing cellular core network functions," in The 21st IEEE LANMAN. IEEE, 2015, pp. 1–6.
[7] X. An, F. Pianese, I. Widjaja, and U. Günay Acer, "DMME: A distributed LTE mobility management entity," Bell Labs Technical Journal, vol. 17, no. 2, pp. 97–120, 2012.
[8] Y. Takano, A. Khan, M. Tamura, S. Iwashina, and T. Shimizu, "Virtualization-based scaling methods for stateful cellular network nodes using elastic core architecture," in Proc. of the 6th IEEE CloudCom. IEEE, 2014, pp. 204–209.
[9] G. Premsankar, K. Ahokas, and S. Luukkainen, "Design and implementation of a distributed mobility management entity on OpenStack," in Proc. of the IEEE CloudCom. IEEE, 2015, pp. 487–490.
[10] A. Banerjee, R. Mahindra, K. Sundaresan, S. Kasera, K. Van der Merwe, and S. Rangarajan, "Scaling the LTE control-plane for future mobile access," in Proc. of the 11th ACM CoNEXT. ACM, 2015.
[11] J. Prados-Garzon, J. J. Ramos-Munoz, P. Ameigeiras, P. Andres-Maldonado, and J. M. Lopez-Soler, "Modeling and dimensioning of a virtualized MME for 5G mobile networks," IEEE Transactions on Vehicular Technology, vol. 66, no. 5, pp. 4383–4395, 2017.
[12] F. Z. Yousaf, P. Loureiro, F. Zdarsky, T. Taleb, and M. Liebsch, "Cost analysis of initial deployment strategies for virtualized mobile core network functions," IEEE Communications Magazine, vol. 53, no. 12, pp. 60–66, 2015.
[13] R. Varudu, "Design and implementation of a flexible MME load balancing solution for carrier grade virtual mobile core networks," Master's thesis, The Technical University of Berlin, Berlin, 2016.
[14] D. King, M. Liebsch, P. Willis, and J. dong Ryoo, "Virtualisation of mobile core network use case," Working Draft, IETF Secretariat, Internet-Draft, January 2015. [Online]. Available: http://www.ietf.org/internet-drafts/draft-king-vnfpool-mobile-use-case-02.txt
[15] Fraunhofer FOKUS Research Institute, http://www.open5gcore.org/, [Online; last accessed 18-March-2018].
[16] K. Suzuki, K. Kunitomo, T. Morita, and T. Uchiyama, "Technology supporting core network (EPC) accommodating LTE," NTT DOCOMO Technical Journal, vol. 13, no. 1, pp. 33–38, 2011.
[17] Linux Virtual Server, http://kb.linuxvirtualserver.org/wiki/Round-Robin Scheduling, [Online; last accessed 18-March-2018].
[18] W. Wang and G. Casale, "Evaluating weighted round robin load balancing for cloud web services," in Proc. of the 16th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC). IEEE, 2014, pp. 393–400.
[19] KVM Virtualization, https://www.linux-kvm.org/, [Online; last accessed 18-March-2018].
[20] B. Hirschman, P. Mehta, K. B. Ramia, A. S. Rajan, E. Dylag, A. Singh, and M. Mcdonald, "High-performance evolved packet core signaling and bearer processing on general-purpose processors," IEEE Network, vol. 29, no. 3, pp. 6–14, 2015.
[21] S. Tabbane, "Core network and transmission dimensioning," http://goo.gl/kYyiKe, [Online; last accessed 18-March-2018].
[22] Linux, https://wiki.linuxfoundation.org/networking/netem, [Online; last accessed 18-March-2018].