
Novel Algorithm to Recover the Lost CDR Information by Control and User Planes Separation in 4G and 5G
2021 IEEE International Conference on Electronics, Computing and Communication Technologies (CONECCT) | 978-1-6654-2849-1/21/$31.00 ©2021 IEEE | DOI: 10.1109/CONECCT52877.2021.9622598

Kopperla Ranjith Kumar
Samsung R&D Institute India-Bangalore Pvt. Ltd
Bengaluru, India
r.kopperla@samsung.com

Katyayani Sesha Kumar Kavuluri
Samsung R&D Institute India-Bangalore Pvt. Ltd
Bengaluru, India
ksk.kavuluri@samsung.com

Debabrata Das
Networking and Communication Research Lab
International Institute of Information Technology-Bangalore
Bengaluru, India
ddas@iiitb.ac.in

Abstract: In 3GPP networks, the Charging Data Record (CDR) plays a very important role since it maintains the data usage information used for billing the subscriber. In the Evolved Packet Core (EPC) network, CDRs can be generated by both the Serving Gateway (SGW) and the Packet Data Network Gateway (PGW), where both Control Plane (CP) messages, which relate to the creation of Packet Data Network (PDN) sessions, and User Plane (UP) messages, which carry the actual data (Internet data from the device or voice/video calls), are transferred. With the Control and User Plane Separation (CUPS) architecture, the CP messages are handled by the CP node (GW-C) and the UP messages are handled by the UP node (GW-U). A similar architectural enhancement of CP and UP separation is continued in 5G as well. In a CUPS system, charging information is calculated at the UP node based on actual usage and conveyed to the CP node in Sx messages using the Packet Forwarding Control Protocol (PFCP) over UDP. Loss of this CDR information between the CP and UP nodes due to any network issue directly impacts the revenue and reputation of the operator. In this paper we discuss some of the scenarios that can result in CDR information loss between the CP and UP nodes (observed in commercially deployed networks) and propose a novel algorithm to recover it. The simulation results show that with the proposed idea more than 95% of the missing information can be recovered with a minimal increase in the network load.

I. INTRODUCTION

CUPS is an architectural enhancement which separates the CP and UP nodes in the 4G as well as the 5G core network. This enhancement mainly targets separating the UP activities by introducing a separate node which can be deployed nearer to the RAN (Radio Access Network) side to reduce the latency for the end user. In the charging architecture of CUPS shown in Fig. 1, the GW-USER PLANE node (UP node of the 4G system) is connected to the external PDN network in one direction and to the GW-CONTROL PLANE node (CP node of the 4G system) in the other direction. The charging systems represented in Fig. 1 are the Online Charging System (OCS) and the Offline Charging System (OFCS).

Fig. 1. Charging Architecture in CUPS environment

The sequence of events that occur while generating the CDR in a CUPS environment is explained below:
• The data traffic for all users travels from the RAN to the UP node towards the PDN and vice versa. Based on the amount of data used, the UP node calculates the usage information related to that user session.
• The UP node sends the usage information to the CP node in PFCP messages (Session Modification/Delete Response or Session Report Request).
• The CP node generates the CDR entry based on the received information and sends it towards the charging systems.
• The charging systems perform billing of the subscriber based on the CDR received.
• Loss of CDR information can be seen in CUPS, because the UP node calculates the charging information and the CP node generates the CDR.

As shown in Fig. 2, the sequence of events that occur while generating the CDR in a traditional EPC environment is explained below:
• The PGW calculates the usage information based on the user plane messages (data traffic of both uplink and downlink) related to the user session.
• The PGW generates the CDR based on the usage and sends it towards the charging systems (OCS/OFCS).


Authorized licensed use limited to: INDIAN INSTITUTE OF TECHNOLOGY KHARAGPUR. Downloaded on December 18,2022 at 05:04:23 UTC from IEEE Xplore. Restrictions apply.
• The charging systems perform billing of the subscriber based on the CDR received.
• Loss of charging information is not seen in EPC, because the PGW itself calculates the charging information and generates the CDR.

Fig. 2. Charging Architecture in traditional LTE [7]

Fig. 3 represents the CUPS of the 4G core network. The CP functionality interfaces such as S5/S8C, S2aC and S2bC terminate at the PGW-C, and the UP functionality interfaces such as S1U and S5/S8U terminate at the UP nodes. These CP and UP nodes communicate using the PFCP protocol as shown in Fig. 4. The newly introduced PFCP protocol for the Sx reference point runs over UDP.

Fig. 3. EPC with Control Plane and User Plane Separation [1]

Fig. 4. Sx Interface between CP and UP Nodes [1]

The reliability of delivery of the UDP messages between the CP and UP nodes can, to some extent, be ensured by retransmission using the existing configured T1 timer for N1 times. In this paper we discuss the scenarios where the usage information for generating the CDRs can be lost during IP network fluctuations, along with a new idea on how to recover the usage information with minimal impact on the network.

The paper is structured as follows. Section II provides brief information about the literature referred. Section III discusses how the CDR information is calculated and transferred from the UP to the CP node, along with the introduction of the problem statement. Section IV introduces our novel procedure, which helps the CP/UP node recover the lost CDR information. The respective sections provide call flows and flow diagrams for better understanding. Section V provides an analytical model to calculate the number of sessions that get impacted. Section VI provides the simulation method and results with the existing architecture and with the proposed novel idea.

II. LITERATURE SURVEY

In [1], 3GPP TS 29.244 Rel. 16 Sec 5.18 discusses the Enhanced PFCP Association Release (EPFAR) procedures, which address the loss of usage information when any node wants to perform the PFCP Association Release. The UP node sends a Session Report Request message for all the sessions having non-zero usage information using the EPFAR procedure.
The patent in [4], US9331913B2 - Methods and devices for packet data network gateway suspension for accurate charging in an evolved packet system, discusses a method that can be implemented in the first node for charging of a mobile device with the first node in a communications network.
The patent in [5], US9621444B2 - Method and device for handling dropped data Packets, discusses a method in the charging node for handling dropped packets in communication networks.
In [6], the authors present a survey of UDP packet loss characteristics, which helps in understanding the packet loss characteristics in the Internet.
None of the above references addresses the loss of charging information between the CP and UP nodes or how to recover it.

III. CDR CHALLENGES INTRODUCED DUE TO CUPS

The CUPS architecture newly introduced the communication between the CP and UP nodes using PFCP sessions. PFCP session messages are broadly divided into three sets of procedures. These procedures are explained in section 6.3 “PFCP Session Related Procedures” of [1].

The first procedure is the Sx session creation, where the Session Establishment Request and Response messages are exchanged between the CP and UP nodes.
• This procedure happens once in the lifetime of the session association between the CP and UP nodes.
• The loss of the response to the create request message does not cause any loss of usage information.

The next set of procedures is the Sx session update, where Session Modification Request and Response messages and Session Report Request and Response messages are exchanged between the CP and UP nodes.
• These procedures can happen multiple times during the lifetime of the session.
• Loss of these session modification messages can also result in usage information loss, if usage information is shared during the procedure.

The last procedure is the Sx session deletion, where the Session Delete Request and Response messages or the Session Report Request and Response messages (with the terminate trigger set) are exchanged between the CP and UP nodes.
• These procedures happen once in the lifetime of the PFCP session, and only at its end.
• The UP node shares all the usage information in the Session Delete Response or Session Report Request messages.
• After the messages are exchanged, the UP node deletes the PFCP session context.
• Loss of these messages directly impacts the charging information of the subscriber.

For session establishment procedures, if the request or response message is not received, the session creation procedure itself is impacted and hence no loss of charging information is seen. But for session modification and termination procedures, if the response or the request message is lost, the charging information is more likely to be lost. This situation can result in revenue loss to the operators along with inaccurate billing for the end user.

Coming to the background of the problem statement, the CUPS architecture introduced a new Sx interface between the CP and UP nodes which runs over UDP. But this architecture enhancement did not address how to recover the messages that are lost on the Sx interface for any reason even after the retransmissions are completed. IP fluctuation is one of the major reasons why a PFCP message can be lost between the CP and UP nodes. IP fluctuations at the operator site cannot be controlled. Even though the fluctuations might not occur on a daily basis, when they do occur they last from tens of seconds to a couple of minutes. This kind of IP fluctuation can be frequently seen in commercial deployments, and when it occurs a huge amount of Sx packet loss can be seen.

In general, the total time configured for sending a PFCP request message and waiting for its response (including all PFCP retransmissions) is approximately between 4 and 8 seconds. This retransmission timer is configured based on the timers used across all the EPC nodes for a single transaction.

Below are a couple of scenarios where Sx packet loss can result in loss of charging information.

Fig. 5. Loss of Session Deletion Request message due to Network fluctuation

A. Scenario-1 (Fig. 5 explanation) of information loss:
The first scenario, shown in Fig. 5, is about the loss of the Session Delete Request message on the Sx interface and how the existing system reacts when this kind of scenario occurs.
• The CP node initiates a Session Delete Request message towards the UP node.
• Due to IP fluctuation, the Session Delete Request message is dropped in the network and does not reach the UP node.
• All the retry messages sent by the CP node are also dropped in the network (consider that the number of retransmissions is configured as 2 and the time to wait for the response is 2 sec).
• Since there is no response from the UP node even after all retries, the CP node marks this delete request internally as a failure.
• The CP node generates the CDR with usage information as 0, which is inaccurate.
• The CP node clears all the details of the PDN session from its internal database or unstructured memory (in case of 5G).
• The CP node responds with a delete response to the originator node.
• The session context entry in the UP node remains stale.

B. Scenario-2 (Fig. 6 explanation) of information loss:
As per the Enhanced PFCP Association Release (EPFAR) procedure defined in 3GPP 29.244 Rel. 16 Sec 5.18 and Sec 6.2.8 [1], when either the CP or the UP node wants to release the PFCP Association between them, the UP node initiates the Session Report Request for all the sessions which contain usage information. This scenario, shown in Fig. 6, is explained below:

• The UP node sends a Session Report Request with the trigger set to the TEBUR (Termination by UP Function Report) flag.
• Due to IP fluctuations, all the Session Report Request messages (including retries) are dropped in the network.
• Since the UP node has not received the response even after all retries, it deletes the context internally.
• The usage information is not received in the CP node and hence the usage information is reported as 0 in the CDRs.

Fig. 6. Loss of Session Report Request message due to Network fluctuation

C. Scenario-3 (Fig. 7 explanation):
The ideal successful behavior for both of the above scenarios, when no IP fluctuation or Sx loss is observed, is shown in Fig. 7 and described below:
• The CP node sends a Session Delete Request towards the UP node. The UP node calculates the usage information and responds with a Session Deletion Response. After the Session Deletion Response message is received, the CP node generates the CDR and sends it towards the charging systems.
• The UP node sends a Session Report Request with the trigger set to the TEBUR (Termination by UP Function Report) flag. After the Session Report Request is received, the CP node generates the CDR and sends it towards the charging systems.

Fig. 7. Successful Behavior for Scenario-1 and Scenario-2

In real commercial systems, the CP and UP nodes can support up to millions of sessions. With the number of sessions being so large, the number of sessions impacted in the above scenarios can also be in the hundreds of thousands.

The problem described in this paper can be easily noticed at any operator where the CUPS architecture is introduced. Since the above scenarios and Sx loss are not addressed by the specification, the responsibility for handling this issue lies neither with the applications (CP or UP node) nor with the IP side. This behavior can result in huge unidentified revenue loss to the operator.

IV. PROPOSED IDEA FOR RECOVERING CDR INFORMATION

Based on the analysis of use cases from commercially deployed networks and of theoretically generated data, we propose the novel procedure below, which can help the operator recover the lost charging information with minimal enhancements to the existing procedures.

With the existing architecture, when any of the above mentioned response or request messages is lost, the session details are cleaned in either the CP or the UP node with a failure reason similar to “No response from the peer node”. When a response is not received for the Session Delete Request message in the CP node, or for the Session Report Request message (with the termination trigger set) in the UP node, we suggest enhancing the existing procedure in the following way:
• Don’t delete the session information even after all the PFCP retries are completed.
• Create a new extra timer (a longer timer, say between 5 and 10 min) linked with the session in the UDSF or local memory which stores the session information.
• Retry the Session Deletion Request message towards the UP node, or the Session Report Request message towards the CP node, in an exponential fashion until the long configured timer (around 10 min) expires or a response message is received.
• Since the new retry interval grows exponentially, the additional message load on the system due to the new retry messages is low, and the probability of getting a response is very high.

Fig. 8 and Fig. 9 represent the behavior of the system with the proposed idea for the problematic scenarios mentioned above.
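The retry behavior proposed in Section IV can be sketched as follows. This is a minimal illustration and not the authors' implementation: the 2 sec base interval, the doubling factor, the 600 sec window and the function names `backoff_schedule` and `first_successful_retry` are assumptions chosen for the example. The paper only states that retries are sent in an exponential fashion until a long timer (around 10 min) expires or a response is received.

```python
from typing import List, Optional


def backoff_schedule(base_s: float = 2.0,
                     factor: float = 2.0,
                     window_s: float = 600.0) -> List[float]:
    """Return the send times (seconds after the standard PFCP retries
    are exhausted) of the extra retries inside the long timer window.

    All default values are illustrative assumptions.
    """
    offsets: List[float] = []
    delay = base_s
    elapsed = 0.0
    # Keep scheduling retries while the next one still fits in the window.
    while elapsed + delay <= window_s:
        elapsed += delay
        offsets.append(elapsed)
        delay *= factor  # the interval grows exponentially
    return offsets


def first_successful_retry(outage_remaining_s: float,
                           schedule: List[float]) -> Optional[float]:
    """Return the first retry time at or after the end of the outage,
    or None if the outage outlasts the whole schedule (session is lost)."""
    for t in schedule:
        if t >= outage_remaining_s:
            return t
    return None


if __name__ == "__main__":
    schedule = backoff_schedule()
    print(schedule)  # [2.0, 6.0, 14.0, 30.0, 62.0, 126.0, 254.0, 510.0]
    # Outage ends 24 sec after the standard retries failed.
    print(first_successful_retry(24.0, schedule))  # 30.0
```

Under these assumed values only eight extra messages per session are sent over the whole window, which is consistent with the claim that the added load stays low while the usage information is kept until it can be delivered.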

Fig. 8. Recovery of charging Information using the proposed solution for scenario-1.

Fig. 9. Recovery of charging Information using the proposed solution for scenario-2.

V. ANALYTICAL MODEL

The analytical model for the impacted sessions is formulated as below. Let,

FT – fluctuation time, the duration during which the connectivity between the CP and UP nodes has issues. The packets sent during this time are considered to be dropped in the network.

TXPS – number of messages transmitted between the UP and CP nodes per second.

ReTMR – retransmission timer configured in the CP node.

N – maximum number of retransmissions as per the configured N1 value.

TRTMR – total time taken for the initial PFCP message and all its retransmission messages. In Fig. 8 the time taken for sending all 3 (one initial and two retransmissions) Session Deletion Request messages is 6 sec.

The CP/UP node sends a message and waits for ReTMR seconds. After the timer expires, the message is retransmitted if the response is not received. This process repeats until either a response is received or the configured number of retransmissions is completed. Based on this, the total time for the retransmissions to be completed is calculated as below,

$TR_{TMR} = N \times Re_{TMR} + Re_{TMR}$ (1)

Sessions whose first and last retransmissions are sent within the fluctuation time are impacted. The Impacted Sessions (IS) are directly proportional to the difference between the fluctuation time and the total retransmission time, and to the number of messages per second (TXPS),

$IS = (FT - TR_{TMR}) \times TX_{PS}, \quad FT > TR_{TMR}$ (2)

$IS = 0, \quad FT < TR_{TMR}$ (3)

If the fluctuation time is less than TRTMR (in this paper the value is considered as 6 sec), then the Impacted Sessions (IS) will be zero. If the fluctuation time is more than TRTMR (>= 6 sec), then the Impacted Sessions will grow based on the number of subscribers and transactions that the node is handling at that time.

VI. SIMULATION RESULTS AND DISCUSSION

As mentioned in the analytical model, the Impacted Sessions are directly proportional to the fluctuation time (FT) and to the number of messages per second (TXPS) handled at the CP/UP node. The simulation results shown in Table I and Table II represent the impacted sessions seen with the existing architecture and with the proposed solution.

For the simulation, the configuration values are chosen to be similar to commercial field configurations. Let ReTMR be 2 sec and N be 2. With these values TRTMR in this example becomes 6 sec (2 sec × 2 + 2 sec).

In the tables, the number of messages the CP/UP node handles per second (TXPS) is provided in each row (100, 150, 200, up to 500), while the fluctuation times (FT) in seconds are represented in the columns (5 sec, 8 sec, 10 sec and 15 sec). The Impacted Sessions (IS) are represented in each cell based on the TXPS and FT values.

A. Summary of Table I (Existing Architecture):

Table I contains simulation results based on the existing architecture. As explained in equation 3 of Section V, the Impacted Sessions (IS) are zero (0) when the fluctuation time (FT) is 5 sec, which is less than the total time (TRTMR) of 6 sec.

It can be seen from Table I that the Impacted Sessions increase with FT and TXPS. For an FT of 8 sec, the Impacted Sessions (IS) increase from 200 to 1000 as TXPS increases from 100 to 500.

TABLE I. IMPACTED SESSIONS (IS) IN THE EXISTING ARCHITECTURE WITH INCREASE IN MESSAGES AND FLUCTUATION TIME

Existing Behavior. Formula: IS = (FT - TRTMR) * MSG/SEC

MSG PER SEC     Fluctuation time FT (sec)
                5       8       10      15
100             0       200     400     900
150             0       300     600     1350
200             0       400     800     1800
250             0       500     1000    2250
300             0       600     1200    2700
350             0       700     1400    3150
400             0       800     1600    3600
450             0       900     1800    4050
500             0       1000    2000    4500
(cell values: Impacted Sessions)

In real commercial deployments, however, the number of messages per second and the number of subscribers are far larger, which directly increases the count of Impacted Sessions (IS).

B. Summary of Table II (with proposed idea):

TABLE II. IMPACTED SESSIONS (IS) IN THE PROPOSED IDEA WITH INCREASE IN MESSAGES AND FLUCTUATION TIME

Proposed Enhancements. Formula: IS = (FT - TRTMR) * MSG/SEC

MSG PER SEC     Fluctuation time FT (sec)
                5       8       10      15
100             0       0       0       0
150             0       0       0       0
200             0       0       0       0
250             0       0       0       0
300             0       0       0       0
350             0       0       0       0
400             0       0       0       0
450             0       0       0       0
500             0       0       0       0
(cell values: Impacted Sessions)

Table II contains simulation results based on the proposed idea. As per the proposed enhancement, a new long timer (ideally around 10 min) is configured in the CP/UP nodes. The CP/UP nodes therefore perform retransmissions in an exponential manner until the response is received or the new timer expires. Because of these retransmissions, all the sessions impacted by a fluctuation of up to the new timer duration (10 min) can be recovered. Since all the sessions are recovered, the IS becomes zero (0) for all the fluctuation times (FT) considered.

With the new enhancement the IS value becomes zero with a minimal increase in the load of the system. Fig. 10 represents the increase in the load of the system with the proposed idea. The simulation is made for a single iteration with a high fluctuation time (FT) of 30 sec. Based on the observations from the simulations:
• Messages per second (TXPS) are represented on the X-axis, and the load on the system, i.e. the number of messages received, is represented on the Y-axis.
• The graph shows that the increase in the load of the system is minimal and that an Impacted Sessions (IS) count of zero can be achieved even for a fluctuation time (FT) of 30 sec.

Fig. 10. Message Load with and without solution

As per the simulation results represented in Table II, we have identified that more than 99% of the data lost due to IP fluctuations can be recovered. If the fluctuation lasts longer than the configured new timer (which is ideally a large value in commercial networks), then the operator has to identify and correct the exact reasons for the fluctuation.

VII. CONCLUSION

In this paper, we have proposed a novel approach, which can be implemented with very minimal changes, to recover the lost charging information in the CUPS architecture. The same solution can be re-used in any other procedure where the loss of critical information from messages due to packet loss in the network cannot be accepted.

The idea proposed in this paper is based on the existing CUPS architecture and the problem observed in commercial deployments which resulted in the loss of charging information. There could be more scenarios which can cause the loss of CDR information, and these need further study and attention.

REFERENCES

[1] 3GPP TS 29.244 V16.0.0, Interface between the Control Plane and the User Plane Nodes.
[2] 3GPP TS 23.214 V16.0.0, Interface between the Control Plane and the User Plane Nodes.
[3] 3GPP TS 32.295, “Telecommunication management; Charging management; Charging Data Record (CDR) transfer”.
[4] Loudon Lee Campbell, “US9331913B2 - Methods and devices for packet data network gateway suspension for accurate charging in an evolved packet system”.
[5] Di Liu, Feng Guo, Yingjiao He, Yong Yang, Jingzhe Zhang, and Nan Zhang, “US9621444B2 - Method and device for handling dropped data Packets”.
[6] Suk Kim Chin and R. Braun, "A survey of UDP packet loss characteristics," Conference Record of Thirty-Fifth Asilomar Conference on Signals, Systems and Computers (Cat.No.01CH37256), Pacific Grove, CA, USA, 2001, pp. 200-204 vol.1, doi: 10.1109/ACSSC.2001.986905.
[7] Netmanias Technical Document, LTE Charging I: Offline, February 2015.
