Radio Link Failure Troubleshooting

Changsong (Charles) Sun


2021-03-30
v1.0

1 © 2020 Nokia
Introduction
Radio Link Failure basics

• Call drop ratio is one of the most important metrics used to assess network performance, and it directly impacts the Retainability KPI.
‒ Radio Link Failure (RLF) is one of the most frequent causes of call drops
• Whenever a call (voice or data) is cut off before the parties have finished, a Radio Link Failure has occurred
• Call drops can happen for a variety of reasons:
- Coverage issues
- Sleeping cells
- Wrong parameterization
- UE malfunction
- Synchronization issues
- Handover failure
- Transport inaccessibility
- g/eNB reset
- Physical layer problems
- etc…

For internal use Nokia Internal Use
Introduction
High level category

• NSA, UE initiated: RRC scgFailureInfo via LTE
• NSA, gNB initiated: X2AP: SgNB Release Required
• SA, UE initiated: RRC Reestablishment Request
• SA, gNB initiated: rrcRelease

Introduction
NSA UE initiated RLF

• Reasons for UE initiated Radio Link Failure on the 5G side:
- Random Access Procedure Failure
- DL out-of-sync (OOS)
- UL Radio Link Control (RLC) failure – maximum number of retransmissions is reached
- synchReconfigFailure-SCG
- SCG reconfiguration failure* - UE is unable to comply with the configuration included in the RRCReconfiguration message received over SRB3
- SRB3 integrity failure* - indication from SCG lower layers concerning SRB3
• Upon RLF detection, the UE informs the MeNB, which passes the information to the SgNB
• Data is switched to the LTE leg; if the 5G radio link does not recover by itself or via PSCell change, the MeNB initiates SgNB release
- conditions for PSCell change must be fulfilled in order to trigger PSCell change
[Figure: SCG failure reported by the UE to the MeNB and forwarded by the MeNB to the SgNB]

Signaling Flow
NSA UE initiated RLF - PSCell Release

• If PSCell change is not triggered, the SgNB-CU starts a timer: tWaitingRlRecover
- While the timer is running, the data is switched to the LTE leg of the split bearer
• After timer expiry, if the UE has not recovered, the SgNB releases the UE context by sending X2AP: SgNB Release Required with the cause ‘Radio Connection with UE lost’ to the MeNB
• The SgNB Release Required message is the trigger point for the counter update.
[Figure: message sequence between UE, MeNB, SgNB-CU and SgNB-DU. Labels: SCGFailureInformation (LTE SRB); Random Access Procedure; tWaitingRlRecover; SgNB Modification Required; SgNB Modification Request Ack; UE data transmission and UE scheduling stopped, bearer is suspended; F1: GTP-U PDU (F1-U DDDS); PDCP transmission switch to X2-U; X2AP: SgNB Release Required; respective counter update; X2AP: SgNB Release Confirm]
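The release path described on this slide can be sketched as a small state model. This is only an illustration under stated assumptions: the class `RlRecoverySupervisor`, its method names and the 1000 ms timer value used in the usage note are hypothetical and do not come from the gNB software; only the timer name, the switch to X2-U and the release message with its cause are taken from the slide.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class Outcome(Enum):
    RECOVERED = auto()   # radio link came back before the timer expired
    RELEASED = auto()    # timer expired, SgNB released the UE context


@dataclass
class RlRecoverySupervisor:
    """Toy model of the SgNB-CU after SCGFailureInformation (hypothetical class)."""
    t_waiting_rl_recover_ms: int              # assumed value; the slides give no default
    sent: list = field(default_factory=list)  # messages/actions, in order
    release_counter: int = 0                  # stands in for the "respective counter"

    def handle_rlf(self, recovered_after_ms=None):
        # While the timer runs, data goes over the LTE leg of the split bearer.
        self.sent.append("PDCP transmission switch to X2-U")
        if (recovered_after_ms is not None
                and recovered_after_ms < self.t_waiting_rl_recover_ms):
            return Outcome.RECOVERED
        # Timer expired without recovery: release the UE context towards the MeNB.
        self.sent.append("X2AP: SgNB Release Required (Radio Connection with UE lost)")
        self.release_counter += 1  # SgNB Release Required is the counter trigger point
        return Outcome.RELEASED
```

For example, `RlRecoverySupervisor(t_waiting_rl_recover_ms=1000).handle_rlf()` returns `Outcome.RELEASED` and increments the counter, while passing `recovered_after_ms=200` returns `Outcome.RECOVERED` and leaves it untouched.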

Signaling Flow
NSA UE initiated RLF - PSCell Change
• The MeNB triggers PSCell change by sending RRC Connection Reconfiguration with the necessary information for PSCell change when the SgNB receives 5G measurements within SgNB Modification Request
• Successful completion of PSCell change resolves the RLF, therefore the RL recovery timer is stopped
- counter RLF_INITIATED_UE_PSCELL_CHANGE is updated when RRC Reconfiguration Complete is received and the waiting RL recovery timer is stopped, i.e. in case of a successful recovery procedure

[Figure: message sequence between UE, MeNB, SgNB-CU, source SgNB-DU and target SgNB-DU. Labels: SCGFailureInformation; tWaitingRlRecover; X2: SgNB Modification Request (SCGFailureInformation); X2: SgNB Modification Request Acknowledge; Admission Control check in target SgNB-DU; F1AP: UE Context Setup Request; F1AP: UE Context Setup Response; X2: SgNB Modification Required; 5GC000572; RRC Connection Reconfiguration (HO Command); RRC Connection Reconfiguration Complete; X2: SgNB Modification Confirm; contention based/contention free RA procedure; F1AP: UE Context Release Command; F1AP: UE Context Release Complete]

Technical Details
NSA UE initiated RLF- Random Access Procedure Failure (1/2)

• Upon reception of Msg1, the gNB creates a temporary UE context and schedules a grant for Msg2: Random Access Response (RAR)
- The UE receives on the same beam that it used for the PRACH transmission
• If the RAR is not received within a specific time window (NRCELL: raResponseWindow + 3 slots), the UE retransmits the preamble (Msg1) with power ramped up by a value specified in the RRC Connection Reconfiguration message
- Msg1 retransmission is done as long as the maximum number of preamble transmissions is not reached (NRCELL: preambleTransMax)
• If it is reached, a Random Access Problem Indication is sent to UE higher layers and the Random Access Procedure is restarted from scratch
[Figure: PRACH: Msg1 - Preamble transmission; Create Temp. UE Context; PDCCH: RAR Grant (PRBs, MCS); PDSCH: Msg2 RAR (temp. C-RNTI, UL Grant, TA); if RAR not received within (3 slots + raResponseWindow): PRACH: Msg1 - Preamble transmission]
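The Msg1 retransmission rule above can be sketched as a loop. This is a minimal illustration, not UE code: the function name, the initial power of -100 dBm and the 2 dB ramping step are assumed values chosen for the example; only the retry-until-preambleTransMax behaviour and the power ramp-up come from the slide.

```python
def preamble_attempts(rar_received_on_attempt, preamble_trans_max,
                      initial_power_dbm=-100.0, ramp_step_db=2.0):
    """Return (success, list of Msg1 transmit powers in dBm).

    rar_received_on_attempt: 1-based attempt on which a RAR arrives inside
    the response window, or None if no RAR is ever received.
    Power values are illustrative assumptions, not configured defaults.
    """
    powers = []
    for attempt in range(1, preamble_trans_max + 1):
        # Each retransmission is ramped up by one step relative to the previous one.
        powers.append(initial_power_dbm + (attempt - 1) * ramp_step_db)
        if rar_received_on_attempt == attempt:
            return True, powers
    # preambleTransMax reached: Random Access Problem Indication to higher layers.
    return False, powers
```

Calling `preamble_attempts(3, 10)` models a RAR received on the third preamble; `preamble_attempts(None, 4)` models the case where the maximum number of preamble transmissions is reached.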

Technical Details
NSA UE initiated RLF- Random Access Procedure Failure (2/2)
• The UE receives the UL grant in the RAR (Msg2) from the gNB and sends RACH Msg3 containing its C-RNTI
• After transmission of Msg3, the UE starts the MAC Contention Resolution timer (NRCELL: raContentionResolutionTmr)
- The MAC Contention Resolution timer monitors the reception of Contention Resolution
• On successful reception of RACH Msg3, the gNB schedules a Contention Resolution Grant for the UE, addressing it with C-RNTI
• If the CRC is NOK, a Random Access Msg3 retransmission is sent
- The UE restarts the MAC Contention Resolution Timer at each HARQ reTx
- If the maximum number of RA Msg3 transmissions (NRCELLGRP:maxHarqMsg3Tx) has been reached and the CRC is still NOK, the UE sends the preamble again as long as NRCELL: preambleTransMax is not reached; otherwise RLF is declared. FailureType: randomAccessProblem
• If the MAC Contention Resolution timer expires, the UE sends Msg1 again
[Figure: PRACH: Msg1 - Preamble transmission; Create Temp. UE Context; PDCCH: RAR Grant (PRBs, MCS); PDSCH: Msg2 RAR (temp. C-RNTI, UL Grant, TA); PUSCH: Msg3 (CRNTI), MAC Contention Resolution Timer; PUSCH: Msg3 ReTx (CRNTI), Restart Contention Resolution Timer; NRCELL:maxHarqMsg3Tx reached; PRACH: Msg1 - Preamble transmission]

Technical Details
NSA UE initiated RLF- DL out-of-sync (OOS)
• UE monitors "out-of-sync" and "in-sync" indications from layer 1
• Upon receiving NRBTS: N310 (default: 10) consecutive "out-of-sync" indications, the UE starts timer NRBTS: T310 (default: 2000 ms)
• NRBTS: T310 is stopped upon receiving NRBTS: N311 (default: 1) consecutive "in-sync" indications
- as a result the connection is continued without any dedicated signaling exchange
• If NRBTS: T310 expires (no or fewer than N311 "in-sync" indications), radio link failure is detected and SCGFailureInformation is sent by the UE to the MeNB
• FailureType: t310-Expiry
• Note: iPhone also uses this FailureType to indicate that the UE is going to power saving mode and requests to be released.
[Figure, Radio Link Recovery: N310 out-of-sync indications; connection ongoing; T310 started; T310 stopped]
[Figure, Radio Link Failure: N310 out-of-sync indications; T310 started; T310 expired; RLF indication]
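The N310/T310/N311 logic above can be sketched as a small monitor. This is an illustrative model only: the class `T310Monitor` is hypothetical, and as a simplification the timer expiry is evaluated only when a new layer-1 indication arrives (a real UE runs the timer independently). The defaults are the ones quoted on the slide.

```python
class T310Monitor:
    """Toy UE-side radio link monitor (hypothetical class, slide defaults)."""

    def __init__(self, n310=10, t310_ms=2000, n311=1):
        self.n310, self.t310_ms, self.n311 = n310, t310_ms, n311
        self.oos = 0                 # consecutive out-of-sync indications
        self.ins = 0                 # consecutive in-sync indications
        self.t310_started_at = None  # None means T310 is not running
        self.rlf = False

    def indicate(self, now_ms, in_sync):
        """Feed one layer-1 indication; return True once RLF is declared."""
        if self.rlf:
            return True
        if (self.t310_started_at is not None
                and now_ms - self.t310_started_at >= self.t310_ms):
            self.rlf = True          # T310 expired: FailureType t310-Expiry
            return True
        if in_sync:
            self.ins += 1
            self.oos = 0
            if self.t310_started_at is not None and self.ins >= self.n311:
                self.t310_started_at = None   # link recovered, T310 stopped
        else:
            self.oos += 1
            self.ins = 0
            if self.t310_started_at is None and self.oos >= self.n310:
                self.t310_started_at = now_ms  # N310 reached, T310 started
        return False
```

Feeding ten out-of-sync indications starts T310; a single in-sync indication before expiry stops it again, which matches the "Radio Link Recovery" case, while silence or further out-of-sync indications for 2000 ms lead to the "Radio Link Failure" case.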

Technical Details
NSA UE initiated RLF- RLC Failure

• Radio link failure can be triggered by the UE if the maximum number of UL RLC retransmissions is reached
• This number is defined by the configurable parameter below and passed to the UE via the RRCConnectionReconfiguration message
- NRDRB_RLC_AM.dlMaxRetxThreshold / NRDRB_RLC_AM.ulMaxRetxThreshold
• In case of RLC failure the UE reports radio link failure to the MeNB via SCGFailureInformation
• FailureType: rlc-MaxNumRetx
• Note: It could be caused by bad UL RLC data transmission, or bad DL RLC statusPDU transmission.
• Note: It seems Samsung UE also uses this failure type to indicate a UE overheat issue.
[Figure, Radio Link Failure: x RLC retransmissions (x = maxRetxThreshold); RLF indication]

Technical Details
NSA UE initiated RLF- other causes
• SCG change failure - reported if the UE could not complete PSCell change (e.g. the UE could not access the target SgNB due to a sync failure)
- FailureType: synchReconfigFailure-SCG
• SCG reconfiguration failure (5GC000578) - reported if the UE is unable to comply with (part of) the configuration included in the RRCReconfiguration message received over SRB3
- FailureType: scg-reconfigFailure
• SRB3 integrity failure (5GC000578) - reported whenever the integrity check for SRB3 is not passed
- FailureType: srb3-IntegrityFailure
In all three cases the UE initiates the SCG failure information procedure to report the error towards the MeNB, which forwards the message to the SgNB
[Figure: SCG failure reported by the UE to the MeNB and forwarded by the MeNB to the SgNB]

Where to find the RLF related message?
NSA UE initiated RLF handling

• In the NR Emil log or the X2AP Wireshark log, scgFailure is carried by X2AP:SgNBModificationRequest from eNB to gNB.
• In the UE log or LTE Emil, look for the RRC scgFailure message.
• In NR Cplane syslog.
WRN/[cp_ue][114][114] SgnbModificationRequestService.cpp:246
[ueIdCu:12186, gnbDuUeF1APId:16860, menbUeX2APId:2650, gnbDuId:0]
Start UE initiated RadioLinkFailure handling with failureType t310Expiry,
isIntraDu false
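When many such syslog entries need to be counted per failure type, the entry above can be picked apart with a regular expression. This is a sketch written against the single sample line shown on this slide; the function name `parse_ue_rlf` is hypothetical, and other software releases may format the line differently.

```python
import re

# Pattern derived from the one sample entry on the slide (an assumption,
# not a documented log format).
_UE_RLF = re.compile(
    r"ueIdCu:(?P<ue_id_cu>\d+),.*?"
    r"menbUeX2APId:(?P<menb_ue_x2ap_id>\d+),.*?"
    r"Start UE initiated RadioLinkFailure handling "
    r"with failureType (?P<failure_type>\w+)"
)


def parse_ue_rlf(entry):
    """Return ids and failureType from one (possibly wrapped) syslog entry."""
    flat = " ".join(entry.split())  # undo line wrapping inside the entry
    match = _UE_RLF.search(flat)
    if match is None:
        return None
    return {
        "ue_id_cu": int(match.group("ue_id_cu")),
        "menb_ue_x2ap_id": int(match.group("menb_ue_x2ap_id")),
        "failure_type": match.group("failure_type"),
    }
```

Entries that do not contain the "Start UE initiated RadioLinkFailure handling" text return `None`, so the function can be applied to a whole syslog and the non-matching lines filtered out.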

What to do with these RLF failures
NSA UE initiated RLF- different causes
• randomAccessProblem (M55116C00009 RLF_INITIATED_UE_RACH_FAIL)
- Check at which step the RA failed, and on which message. (UE log, ttiTrace, L1/L2 BIP, syslog, Sherpa)
• t310-Expiry (M55116C00008 RLF_INITIATED_UE_T310_EXPIRY)
- Check if the SSB/TRS is transmitted correctly and consistently. (I/Q data, RF log)
• rlc-MaxNumRetx (M55116C00010 RLF_INITIATED_UE_MAX_RLC_RETX)
- Check the PUSCH transmission quality, and the DL statusPdu at gNB. (UE log, ttiTrace, L1/L2 BIP, Sherpa)
• synchReconfigFailure-SCG (M55116C00012 RLF_INITIATED_UE_SCG_CHGE_FAIL)
- Check if the SSB/TRS is being transmitted. (I/Q data, RF log)
• scg-reconfigFailure (M55116C00013 RLF_INITIATED_UE_SCG_RECNF_F)
- Check which IE the UE does not accept. (UE log, Emil, Cplane syslog)
• srb3-IntegrityFailure (M55116C00011 RLF_INITIATED_UE_SRB_INTGRTY_F)
- Check integrity procedure in gNB. (syslog)
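The list above is effectively a lookup table from failure type to counter and first check, and it can be kept as one in an analysis script. The sketch below copies the counter ids, counter names and checks from this slide; the helper names are hypothetical, and the normalisation (ignoring hyphens and case, so that the syslog spelling `t310Expiry` matches `t310-Expiry`) is an assumption based on the one syslog sample in this deck.

```python
# failureType -> (counter id, counter name, first thing to check)
TRIAGE = {
    "randomAccessProblem": ("M55116C00009", "RLF_INITIATED_UE_RACH_FAIL",
                            "Check at which step RA failed, and on which message"),
    "t310-Expiry": ("M55116C00008", "RLF_INITIATED_UE_T310_EXPIRY",
                    "Check if the SSB/TRS is transmitted correctly and consistently"),
    "rlc-MaxNumRetx": ("M55116C00010", "RLF_INITIATED_UE_MAX_RLC_RETX",
                       "Check the PUSCH transmission quality and the DL statusPdu at gNB"),
    "synchReconfigFailure-SCG": ("M55116C00012", "RLF_INITIATED_UE_SCG_CHGE_FAIL",
                                 "Check if the SSB/TRS is being transmitted"),
    "scg-reconfigFailure": ("M55116C00013", "RLF_INITIATED_UE_SCG_RECNF_F",
                            "Check which IE the UE does not accept"),
    "srb3-IntegrityFailure": ("M55116C00011", "RLF_INITIATED_UE_SRB_INTGRTY_F",
                              "Check integrity procedure in gNB"),
}


def _normalise(failure_type):
    # Assumed normalisation: ignore hyphens and letter case.
    return failure_type.replace("-", "").lower()


_BY_NORMALISED = {_normalise(name): row for name, row in TRIAGE.items()}


def triage(failure_type):
    """Return (counter id, counter name, check) or None for unknown types."""
    return _BY_NORMALISED.get(_normalise(failure_type))
```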

Technical Details
NSA SgNB initiated RLF overview

• SgNB detects RLF for the 5G link based on:
- DTX detection for requested DL HARQ feedback on PUCCH/PUSCH
- DTX detection of CSI reports on PUCCH/PUSCH
- Radio Link Control (RLC) failure – maximum number of retransmissions is reached
- PDCP count rollover
- GTP-U Transmission error
• When the SgNB declares an RLF on the PSCell, a control timer is set
• During the waiting time, the data is switched to the LTE leg of the split bearer
• After timer expiry, the SgNB releases the UE context by sending an SgNB initiated SgNB release with the cause ‘Radio Connection with UE lost’ to the MeNB

Technical Details
NSA SgNB initiated RLF – DL HARQ or CSI DTX
• The gNB counts the number of consecutive DTX detections for requested DL HARQ feedback or CSI reports
• If the number of consecutive DTX detections for HARQ or CSI exceeds the threshold, the gNB starts the RLF guard timer NRBTS: tRLFindForDU (default: 300ms)
- Thresholds are configurable parameters: NRCELL.rlpDetDlHarqThreshold (default: 35), NRCELL.rlpDetCsiBsiThreshold (default: 75). TMO TDD: 75, FDD: 10000.
• While the RLF guard timer NRBTS: tRLFindForDU (default: 300ms) is running, HARQ feedback and CSI reports are still monitored to detect possible recovery of the radio link. TMO: 1000ms.
• If the number of consecutively received non-DTX DL HARQ feedbacks exceeds NRCELL.rlpRecDlHarqThreshold (default: 5), or the number of received CSI reports exceeds NRCELL.rlpRecCsiBsiThreshold (default: 2), the radio link is assumed to be recovered
• If the RLF guard timer expires, the SgNB DU informs the gNB CU about the RLF by sending UE Context Modification Required over the F1 link
• Counter: M55116C00014 RLF_INIT_SGNB_DLHARQ_OR_CSI
[Figure, gNB-DU: number of detected DL HARQ DTX > 35 or CSI DTX > 75 starts tRLFindForDU; number of detected DL HARQ feedbacks > 5 or CSI reports > 2 recovers the link]
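The detection and recovery rule for DL HARQ feedback can be sketched as follows. This is an illustrative model rather than the L2 implementation: the class name and state strings are hypothetical, only the HARQ branch is modelled (the CSI branch would follow the same pattern with its own thresholds), and timer expiry is checked only when a new feedback opportunity is processed. Defaults are the ones quoted on the slide.

```python
class DlHarqRlfDetector:
    """Toy gNB-DU detector for DL HARQ DTX (hypothetical class, slide defaults)."""

    def __init__(self, det_threshold=35, rec_threshold=5, guard_ms=300):
        self.det_threshold = det_threshold  # rlpDetDlHarqThreshold
        self.rec_threshold = rec_threshold  # rlpRecDlHarqThreshold
        self.guard_ms = guard_ms            # tRLFindForDU
        self.dtx = 0                        # consecutive DTX detections
        self.non_dtx = 0                    # consecutive non-DTX feedbacks
        self.guard_started_at = None
        self.rlf = False

    def feedback(self, now_ms, is_dtx):
        """Process one HARQ feedback opportunity; return 'OK', 'GUARD' or 'RLF'."""
        if self.rlf:
            return "RLF"
        if (self.guard_started_at is not None
                and now_ms - self.guard_started_at >= self.guard_ms):
            self.rlf = True  # guard timer expired: DU informs CU over F1
            return "RLF"
        if is_dtx:
            self.dtx += 1
            self.non_dtx = 0
            if self.guard_started_at is None and self.dtx > self.det_threshold:
                self.guard_started_at = now_ms  # threshold exceeded, start guard timer
        else:
            self.non_dtx += 1
            self.dtx = 0
            if self.guard_started_at is not None and self.non_dtx > self.rec_threshold:
                self.guard_started_at = None    # radio link assumed recovered
        return "GUARD" if self.guard_started_at is not None else "OK"
```

With the defaults, the 36th consecutive DTX moves the detector to `GUARD`, and six consecutive non-DTX feedbacks inside the guard time bring it back to `OK`.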

Technical Details
NSA SgNB initiated RLF – DL RLC transmission failure
• Radio link control (RLC) failure can be triggered by the SgNB if the maximum number of RLC retransmissions is reached
• This number is defined by the configurable parameter NRDRB_RLC_AM: maxRetxThreshold set in the RLC profile*
- The parameter is used by the transmitting side of each AM RLC entity to limit the number of retransmissions corresponding to an RLC SDU, including its segments
• In case of RLC failure the SgNB-DU informs the SgNB-CU via the F1: UE Context Modification Required message with Cause set to ’RLC Failure’
• Counter: M55116C00015 RLF_INIT_SGNB_MAX_RLC_RETX
[Figure: message sequence between MeNB, SgNB-CU and SgNB-DU. Labels: RLC Failure; tWaitingRlRecover; F1: UE Context Modification Required; GTP-U PDU (F1-U DDDS), Cause: Radio Link Outage; UE data transmission and UE scheduling stopped, bearer is suspended; PDCP transmission switch to X2-U; X2: SgNB Release Required]
[Figure, Radio Link Control Failure: x RLC retransmissions (x = maxRetxThreshold); RLC failure indication]

Technical Details
NSA SgNB initiated RLF – PDCP count rollover, GTP-U Transmission

• Feature 5GC000509 L3 non standalone call introduces 2 additional RLF triggers:
- PDCP count rollover - if it happens, the gNB-CU triggers the SgNB initiated SgNB Release with the ’Count reaches max value’ cause
- GTP-U Transmission Error - if it occurs, the SgNB-DU sends F1AP: UE Context Release Request to the SgNB-CU and starts timerF1ProcGuard
• increments counter M55117C01008 NF1CD_UE_CTXT_REL_REQ_SENT
- Note: there are no dedicated counters for these 2 failures.
[Figure, PDCP count rollover (MeNB, SgNB-CU, SgNB-DU): X2: SgNB Release Required (5GC000475); TimerX2UeProcGuard; X2: SgNB Release Confirm; F1: UE Context Release Command; timerF1ProcGuard; F1: UE Context Release Complete]
[Figure, GTP-U Transmission Error: F1: UE Context Release Request; timerF1ProcGuard; X2: SgNB Release Required; TimerX2UeProcGuard; X2: SgNB Release Confirm; F1: UE Context Modification Required; timerF1ProcGuard; F1: UE Context Release Complete]

Technical Details
Where to find NSA SgNB initiated RLF – Control Plane
In NR Emil:

In gNB Syslog:
INF/[cp_ue][123][123] ActiveProcedureSupervisor.cpp:47 ActiveProcedure set to sgnbRadioLinkFailureProcedure, ueIdCu:505
Then the waiting-for-recovery timer expires:
WRN/[cp_ue][123][123] NsaRadioLinkFailureService.cpp:40 tWaitingRlRecover timeout: [ueIdCu:505, gnbDuUeF1APId:92002,
menbUeX2APId:658, gnbDuId:0]

Technical Details
Where to find NSA SgNB initiated RLF -- DL HARQ or CSI DTX
In L2RT runtime or gNB syslog:
• HARQ DTX detection:
- L2-PS/src/dl/sch/RlfDetAndRecovery.cpp[56] RLF RLF_DL_HARQ: event sent, subcellIndex=0, ue=10319/35293, rlf=RLF_DL_HARQ,
rlfMon=36/0, detThd=35/75, recThd=5

• HARQ RLF recovery:


- L2-PS/src/dl/sch/RlfDetAndRecovery.cpp[56] RLF RLF_DL_HARQ: event sent, subcellIndex=0, ue=10319/35293, rlf=RLF_OFF, rlfMon=42/77,
detThd=35/75, recThd=5

• CSI/BSI DTX detection


- L2-PS/src/dl/sch/RlfDetAndRecovery.cpp[56] RLF RLF_BSI_CSI: event sent, subcellIndex=0, ue=42539/58523, rlf=RLF_BSI_CSI, rlfMon=76/0,
detThd=75/35, recThd=2

• CSI/BSI RLF recovery:


- L2-PS/src/dl/sch/RlfDetAndRecovery.cpp[56] RLF RLF_BSI_CSI: event sent, subcellIndex=0, ue=44379/58523, rlf=RLF_OFF, rlfMon=79/37,
detThd=75/35, recThd=2
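To separate detection events from recovery events in a large L2RT runtime log, the RlfDetAndRecovery lines above can be parsed with a regular expression. This is a sketch built only from the four sample lines on this slide; the function name is hypothetical, the pair values (`ue`, `rlfMon`, `detThd`) are kept as raw strings because the slide does not say what each half means, and treating `rlf=RLF_OFF` as "recovery" follows the slide's own labelling of the samples.

```python
import re

# Pattern derived from the sample lines on the slide (an assumption,
# not a documented log format).
_RLF_EVENT = re.compile(
    r"RLF (?P<monitor>RLF_\w+): event sent, "
    r"subcellIndex=(?P<subcell>\d+), "
    r"ue=(?P<ue>\d+/\d+), "
    r"rlf=(?P<state>RLF_\w+), "
    r"rlfMon=(?P<rlf_mon>\d+/\d+), "
    r"detThd=(?P<det_thd>\d+/\d+), "
    r"recThd=(?P<rec_thd>\d+)"
)


def parse_rlf_event(line):
    """Parse one RlfDetAndRecovery log line (wrapped lines are tolerated)."""
    match = _RLF_EVENT.search(" ".join(line.split()))
    if match is None:
        return None
    event = match.groupdict()
    event["subcell"] = int(event["subcell"])
    event["rec_thd"] = int(event["rec_thd"])
    # rlf=RLF_OFF marks a recovery; any other state marks a detection.
    event["event"] = "recovery" if event["state"] == "RLF_OFF" else "detection"
    return event
```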

In L1/L2 BIP log:


• Check the Ack/Nack info and CSI reports on PUCCH/PUSCH, together with the received power and noise power. Look for a recurring pattern, e.g. failures that always occur on a certain slot.

Technical Details
Where to find NSA SgNB initiated RLF -- DL RLC transmission failure
In L2HI runtime or gNB syslog:
• Maximum DL Retransmission failure detection:
- a7 ASP-1831-Disp_3 <2022-02-21T19:49:48.874538Z> 14B-L2RlcDLQ INF/L2HiDu,
g0r041u08A7ucp1535t0001h04/RlcDlRbBase.cpp:120/Sending BearerErrorIndication due to MaxRlcRetransmission exceeded
- a9 ASP-1831-Disp_3 <2022-02-21T19:49:48.874545Z> 14B-L2RlcDLQ INF/L2HiDu,
g0r041u08A7ucp1535t0001h04/RlcDlRbBase.cpp:124/MaxRlcRetransmission exceeded, SN=59, txNextAck=12, txNext=60 lcId=4
- 06 FCT-1524-2-Cprt <2022-02-21T19:49:48.874851Z> 8F-cp_rt_ue INF/cp_rt/Init.hpp:#59 Init::HandleEvent: StartTask [gnbDuUeF1APId: 16926]
cause MaxRlcRetr

In L1/L2 BIP log or Sherpa:


• Check the DL PDSCH transmission quality and the PUSCH transmission quality. First, figure out whether the issue is in DL or UL.
• Trace the failed RLC SN to understand the entire failure scenario.

KPI Analysis
NSA RLF
1. Check every dedicated counter for each failure case
- UE initiated release = RLF_INITIATED_UE_T310_EXPIRY + RLF_INITIATED_UE_RACH_FAIL + RLF_INITIATED_UE_MAX_RLC_RETX + RLF_INITIATED_UE_SRB_INTGRTY_F + RLF_INITIATED_UE_SCG_CHGE_FAIL + RLF_INITIATED_UE_SCG_RECNF_F
- gNB initiated release = RLF_INIT_SGNB_DLHARQ_OR_CSI + RLF_INIT_SGNB_MAX_RLC_RETX
2. Check total RLFs (release due to UE lost):
- SGNB_REL_SN_REQ_UE_LOST (M55112C00502)
3. All releases minus normal releases
- X2_SGNB_REL_REQUIRED_SENT - SGNB_RELEASE_REQ_UE_INACT - X2_SGNB_REL_REQUIRED_SENT_A2

Usually the results of #1, #2 and #3 should be the same; however, because dedicated counters for GTP-U error and PDCP count rollover are missing, #3 might be larger than #1 and #2.
Feature is still missing in the available SW releases.
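The three views can be computed side by side from one counter export, which makes a mismatch easy to spot. The sketch below uses the counter names from this slide; the function name and the dictionary input format are assumptions, and view #1 is taken as the sum of the UE initiated and gNB initiated dedicated counters.

```python
UE_INITIATED = [
    "RLF_INITIATED_UE_T310_EXPIRY", "RLF_INITIATED_UE_RACH_FAIL",
    "RLF_INITIATED_UE_MAX_RLC_RETX", "RLF_INITIATED_UE_SRB_INTGRTY_F",
    "RLF_INITIATED_UE_SCG_CHGE_FAIL", "RLF_INITIATED_UE_SCG_RECNF_F",
]
GNB_INITIATED = ["RLF_INIT_SGNB_DLHARQ_OR_CSI", "RLF_INIT_SGNB_MAX_RLC_RETX"]


def rlf_views(counters):
    """Return (#1 dedicated counters, #2 UE-lost releases, #3 abnormal releases).

    counters: dict of counter name -> value; missing counters count as 0.
    """
    dedicated = sum(counters.get(name, 0) for name in UE_INITIATED + GNB_INITIATED)
    ue_lost = counters.get("SGNB_REL_SN_REQ_UE_LOST", 0)
    abnormal = (counters.get("X2_SGNB_REL_REQUIRED_SENT", 0)
                - counters.get("SGNB_RELEASE_REQ_UE_INACT", 0)
                - counters.get("X2_SGNB_REL_REQUIRED_SENT_A2", 0))
    return dedicated, ue_lost, abnormal
```

A result where the third value is larger than the first two points at releases without a dedicated counter, such as GTP-U errors or PDCP count rollover.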

Signaling Flow
SA UE initiated RLF- Fallback to RRC Setup

• If the UE detects a Radio Link Failure, it initiates the RRC Reestablishment procedure to continue the RRC connection
• The gNB initiates RRC Setup if a valid UE context is NOT available
- This functionality is controlled by setting rrcReestabTypeSA = 2 (viaRRCSetup)
• When the UE initiates the RRC Reestablishment Request procedure, it starts an RA procedure and the gNB-DU allocates a new C-RNTI and a new gNB-DU UE F1AP ID.
• If the gNB performs the fallback procedure, the gNB-CU allocates a new gNB-CU UE F1AP ID and creates a new UE context, which is identified by those new identifiers. That means the gNB treats the UE as a new UE.
[Figure: message sequence between UE, gNB-DU and gNB-CU. Labels: RLF detection; RRC Reestablishment Request; F1AP: Initial UL RRC Message Transfer*; Admission Control check and/or UE context check; timerRRCGuard; DL RRC Message Transfer (RRC Setup); RRC Setup Complete; UL RRC Message Transfer]

Signaling Flow
SA UE initiated RLF- RRC reestablishment
• Upon detection of Radio Link Failure (RLF) or other failure cases, the UE may now initiate the intra-DU RRC Reestablishment procedure with UE context retrieval. Any cell from the same gNB-DU can be targeted during reestablishment. After that the gNB either continues with retrieval of the UE context derived from its previous connection or, if it does not have a valid UE context (or if any mobility procedure is ongoing), falls back to RRC setup (and eventually aborts the ongoing mobility procedure). The RRC reestablishment procedure from 5GC000732 also applies to VoNR calls.
• The feature is activated with the same parameter as the legacy feature, by setting NRBTS.rrcReestabTypeSA = 1 (sameGNB)
• The feature is applicable only for SA option 2 deployments.
• The new functionality does not require any specific UE capabilities.
[Figure: message sequence between UE, gNB-DU and gNB-CU. Labels: RLF detection; RRC Reestablishment Request; F1AP: Initial UL RRC Message Transfer; Admission Control check; timerRRCGuard; F1AP: DL RRC Message Transfer (RRC Reestablishment); RRC Reestablishment Complete; F1AP: UL RRC Message Transfer]

Technical details
SA UE initiated RLF -- Detection

• The gNB provides the UE with a set of operator configurable parameters via System Information Broadcast, based on which the UE can detect a potential RLF; this is very similar to the NSA case.
• UEs will declare RLF upon detection of:
- DL out-of-sync
- Random access procedure failure
- RLC failure
- mobility from NR failure
- integrity check failure indication from lower layers concerning SRB1 or SRB2
- an RRC connection reconfiguration failure
• After RLF detection, the UE may then initiate an RRC Connection Re-establishment procedure if AS security has been activated and SRB2 and at least one DRB are established
• RRCReestablishmentRequest message: ReestablishmentCause ::= ENUMERATED {reconfigurationFailure, handoverFailure, otherFailure, spare1}
[Figure: UE sends RRCReestablishmentRequest to the gNB]

Technical Details
SA gNB initiated RLF -- Detection

• Same as NSA gNB initiated RLF
• gNB detects RLF for the 5G link based on:
- DTX detection for requested DL HARQ feedback on PUCCH
- DTX detection of CSI reports on PUCCH
- Radio Link Control (RLC) failure – maximum number of retransmissions is reached
- PDCP count rollover
- GTP-U Transmission error
• When the gNB declares an RLF, a control timer is set
• After timer expiry, the gNB releases the UE context

References and acknowledgments

• NEI: Radio Link Failures and UE inactivity handling in 5G NSA, version 1.2, Katarzyna Rybianska
• NEI: Radio Link Failure in 5G SA, version 0.4, Katarzyna Rybianska

Thank You 
